Oks notes #1: DeepSeek, the WASPs, and the last Mossad station chief in Tehran
Brief notes from what I've been reading
Hello! I hope you enjoyed my two posts this week—on why poor countries stopped catching up and why many national population numbers are fake. Since I’ve settled into something of a regular posting cadence, I wanted to introduce a weekly schedule. Going forward, on Mondays and Fridays I’ll be posting longer-form articles like the two that I just mentioned; and on Sundays, I’ll be posting more casual notes on what I’ve been reading and looking at this week. This will be biased toward books and articles, but I’m sure it’ll include many non-reading things.
I’m sure that I’ll violate the regular posting cadence at some point in the future, though I’m going to try to emulate Tyler Cowen and his NSP doctrine. (NSP: Never Stop Posting.) And since this is our first one, it’ll run a bit longer and include stuff that I’ve been reading over the last few weeks instead of just the last week.
Zama, by Antonio di Benedetto. I read this while spending a week in Paraguay, since Zama is probably the only significant work of literature (maybe not significant to most, but at least significant in the Argentine canon) that takes place in Paraguay, and I wanted to learn more about the country. But, as I discovered, the Paraguayan setting is entirely incidental: all that matters to the plot is that Paraguay is in the middle of nowhere and that the protagonist wants to escape it. It’s a very funny book. It reminded me most of certain short stories by George Saunders (like “The 400-Pound CEO”) and, most especially, J. M. Coetzee’s Disgrace. Much like Disgrace, it is about an unsympathetic protagonist brought low by events, each one more humiliating than the last, until he is reduced to a final abjection and finds in that state of abjection something like grace. (A great line from Aeschylus: “in our own despair, against our will, comes wisdom to us by the awful grace of God.”) But Disgrace is one of the most disturbing books I’ve ever read, while Zama reads like a comedy.
Poor Numbers, by Morten Jerven. This informed my post on why a lot of population numbers are fake and will inform a post I’m planning for next week on why a lot of GDP numbers are fake. The thrust of Jerven’s book is simple: statistics in Africa are of terrible quality, to the extent that many of them just have no real relationship with reality. It’s a short book, most of which is dedicated to an examination of what exactly goes wrong in the calculation of economic statistics. Jerven knows his stuff, and the book is highly informative; but since it’s so narrow in its focus and turgid in its writing, I can only recommend it to those who really, truly care about African statistics.
The Thinking Machine, by Stephen Witt. I wanted to read a history of Nvidia and there are two options on the market: Witt’s book and another one by Tae Kim. I decided to start with Witt’s book because I saw that it won the “Business Book of the Year” award that the Financial Times gives out. Within the “book about a company” genre it’s a strong addition: it feels like you’re in the hands of a professional. The book is a serviceable biography of Jensen Huang and Nvidia. But, as with other books about tech companies, the need to keep “ordinary people” reading means that Witt focuses on relatively uninteresting details of the founder’s personality and of corporate culture at Nvidia, instead of the actual thing that Nvidia does. Witt makes a half-hearted attempt to explain parallel computing, but he spends a lot more time talking about the details of Jensen Huang’s life and personality. That’s all fine—Jensen seems like an interesting person—but the important parts about Nvidia are the actual things that it does, and the fact that it does those things and other companies don’t. Things get better when the book shifts to AI, and Witt seems to have spent a good bit of time trying to understand the transformer architecture and large language models. But without much understanding of Nvidia’s hardware business, we’re left in the dark as to why Nvidia takes the actions that it takes. For instance: there is a brief discussion (a few paragraphs, if I recall correctly) of the Mellanox acquisition and why that mattered. But for some reason there is no mention made of Nvidia’s attempt to buy Arm, one of the largest acquisition attempts in history. After reading the book, I feel like I have a somewhat better understanding of Nvidia’s history and of Jensen Huang’s personality, though he’s not really the most fascinating corporate leader in the world. (I think he would agree with that assessment: at the end of the book, he starts yelling at Witt for trying to psychoanalyze him.) But I don’t have much more insight as to why Nvidia does the things it does: I have no real idea, for instance, why it decided to quasi-acquire Groq. “Does this book help me explain this out-of-distribution event” seems like a good acid test for a book about an entity that is still an active agent in the world.
DeepSeek is launching its new flagship V4 model in February, probably before the Chinese New Year on February 17. Anil Ananthaswamy, who is excellent, has a good write-up of the paper they’ve released ahead of the V4 launch, “Conditional Memory via Scalable Lookup.” The paper describes “a conditional memory module designed to augment the Transformer backbone by structurally separating static pattern storage from dynamic computation.” My (very primitive) understanding of it is that it’s a method for offloading knowledge memory to regular RAM, where it can be fetched on a just-in-time basis, freeing up space on the GPU for the main model and allowing GPUs to be used more efficiently. This would open the door to cheaper-than-HBM memory options in data centers—as you know if you’ve been following the markets, the HBM market is extremely squeezed at the moment—and would be helpful for Chinese AI labs that are constrained by export controls limiting access to cutting-edge GPU memory. Also on DeepSeek V4: Teortaxes says that he expects the model to be like GPT-4.5 rather than Opus 4.5, which I read to mean “a great all-around model that is not unbelievably good at coding.”
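To make the idea concrete, here’s a minimal PyTorch sketch of what a conditional-memory module along these lines could look like. To be clear, this is my own guess at the shape of the mechanism, not DeepSeek’s actual design: the class name, the top-k routing, and the residual merge are all assumptions, and a real system would presumably use an approximate-nearest-neighbor index and overlapped transfers rather than this naive lookup.

```python
import torch
import torch.nn as nn

class ConditionalMemory(nn.Module):
    """Hypothetical sketch: small routing keys stay on the GPU, while the big
    'static pattern storage' table lives in pinned CPU RAM. Only the few rows
    each token actually needs get copied to the GPU, just in time."""

    def __init__(self, n_slots: int, d_model: int, top_k: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model))    # small, GPU-resident
        self.values = torch.randn(n_slots, d_model).pin_memory()   # large, stays in CPU RAM
        self.top_k = top_k

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, d_model), resident on the GPU.
        scores = h @ self.keys.T                           # route each token to memory slots
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)
        # Fetch only the selected rows from CPU memory (the just-in-time step).
        rows = self.values[top_idx.cpu()].to(h.device, non_blocking=True)
        w = torch.softmax(top_scores, dim=-1).unsqueeze(-1)
        return h + (w * rows).sum(dim=-2)                  # merge memory back into the stream
```

The point of the separation is that the large values table never occupies HBM; only the routing keys and the handful of gathered rows do, which is why a scheme like this would ease pressure on HBM specifically.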
Further notes on AI progress in China. Chinese AI firms are not feeling particularly optimistic about overcoming the American lead in frontier AI. “Justin Lin, head of Alibaba Group Holding Ltd.’s Qwen series of open-source models, put at less than 20% the chances of any Chinese company leapfrogging the likes of OpenAI and Anthropic with fundamental breakthroughs over the next three to five years.” The only Chinese lab leader who believes that Chinese labs can leapfrog the West is Liang Wenfeng of DeepSeek.
On the business prospects of the Chinese AI labs. China is providing us with the first AI labs to go public: Zhipu AI and MiniMax have both gone public in recent weeks. Neither of them is at the front of the pack among the Chinese AI companies, and I’ve heard that Zhipu’s coding-oriented models are simply distilled from Anthropic’s Claude. And here is Kevin Xu on why DeepSeek’s lack of a business model is a competitive advantage. I am more skeptical: having a business is a way of getting feedback from the world, and sometimes that feedback can be helpful for the ultimate aim of AGI. That, at least, is what Anthropic seems to believe.
Speaking of AGI, prinz on AI takeoff. His core point—which he makes frequently on Twitter as well—is that OpenAI and Anthropic both believe that the name of the game is automated AI research, and everything else that those companies do is ultimately in service of that goal. Decision-makers at both firms believe that the company that builds the fully-automated AI researcher wins the AI race, achieves recursive takeoff, and wins everything there is to win. “Assuming that one believes in this version of the take-off, that should lead one to also believe that a chasm is already developing between those labs that are racing to automate AI research and those that are not.”
AI and mathematics: Terence Tao on GPT 5.2 Pro’s apparently solving a previously unsolved Erdős problem. This is not the first time such a thing has been claimed, but it appears to be the first claim to survive scrutiny. People really underrate how good the OpenAI models are at math and how much better AI models will get at solving math problems in the next few years: my understanding is that there will be a bottleneck at the autoformalization step due to unrelated issues in the Lean prover, but “mass theorem solving” is on its way. And DeepMind is also in on the game of mathematical discovery. But mathematicians shouldn’t worry about being put out on the streets. Jonathan Gorard, a mathematician at Princeton, says that AI cannot automate mathematics: “We can automate the proving of theorems, or the discovery of conjectures, or even the invention of new axiom systems, but we can’t automate mathematics. Because ‘mathematics’ is the name we give to the human cultural story, not to the formal methods themselves.”
Humans are starting to sound like AI: “Analysis of 280,000 transcripts of videos of talks & presentations from academic channels finds they increasingly used words that are favorites of ChatGPT.” This appears to simply reflect the fact that a lot of people use ChatGPT to write things for them, including things that they plan on reading in public. There was a similar example from a few months ago with the subtle ChatGPT-ification (and thus Americanization) of speeches in Britain’s Parliament: British MPs were using ChatGPT to write their speeches, and one tell was that they were using American phrases that had no history in the British Parliament. (LLMs will be a great force for linguistic and cultural homogenization within linguistic units: this should make you more pessimistic about the strength of Canadian, British, and Australian national identities. And they will probably also be bad for institutions, like the British Parliament, that rely on subtle cultural continuity conveyed either through diligent study or tacit knowledge.) These effects seem to just reflect people having LLMs write stuff for them, but in the longer term, as people spend more and more time talking with the models, their speech and writing will probably come to reflect the language of the models even without the models writing anything for them.
Two excellent articles from Semianalysis: first, on the history of the Apple-TSMC partnership; and second, on how AI labs are scaling reinforcement learning. Both are fascinating. One big takeaway from the TSMC/Apple piece: Apple is much more important to the history of TSMC than I previously thought. The immense volume Apple provided was crucial to getting TSMC to where it is today: “Apple effectively funded the yield learning curve for every major node transition since 20nm.” TSMC won the deal gradually, and Intel basically declined to compete for the iPhone because the margins were bad. As for the RL piece: none of it should be greatly surprising to those who have been watching the activities of the AI labs or neolabs, but “the whole world will become an RL training environment” is a pretty common sentiment you hear around San Francisco these days. Whether that is bullish or bearish for the prospect of near-term automation is another question. (Both pieces are paywalled, but Semianalysis always puts the paywall pretty deep in the article.)
Nathan Lambert on how Claude Code is very good. It is indeed very good, and it’s also a lot of fun for a particular type of person—in the last 36 hours I’ve spoken to three very successful people who are spending all their free time Claude Coding. (I’m also using it a lot, though my peak vibe-coding phase was pre-Opus 4.5.) Two features of Claude Code make it addictive for that particular type of person: first, it creates the feeling of being productive; and second, vibe-coding projects are eternally “90 percent done,” and that final 10 percent ends up taking 90 percent of the time. In general, I’m somewhat skeptical of those who claim that coding agents have made them so much more productive. Anecdotally, I just find that a lot of this is “productivity” fetishism, like the guy who spends hours setting up the optimized workstation rather than doing the work. The people who talk the most about coding agents making them more productive don’t seem to be producing much that’s worth producing. But these are just jibes about humans being annoying: Claude Code, which is not a human, is incredible. In fact, here’s Nathan Lambert again on how Claude Code is so good that we should be rethinking our own work lives to see where agents can fit in. And I do think that finding or creating agent-sized holes in your life is a good use of time. If an agent can do a task to 50 percent satisfaction right now, just assume that it will be able to do it to 75 percent in a few months and 90 percent in a year. And here’s Jasmine Sun on “Claude Code psychosis” and all the new apps that people are generating with Claude: “Whether atoms or bits, most of our problems are deeper than needing more stuff.”
Speaking of coding agents. OpenAI is partnering with Cerebras. Expect Codex to get really fast and to gain market share in the “CLI agents” category against Claude Code. I frequently hear people in San Francisco say variants of this sentiment: “at this point, additional coding agent speed would be more valuable to me than additional coding agent intelligence.” But of course people seem to always opt for more intelligence when given the choice.
Shortly before the new year, Dwarkesh Patel and Philip Trammell published a piece arguing that AGI will increase inequality to levels never before seen in human history: simply put, the automation of all employment will cause the capital share of income to increase dramatically, with an inequality spiral until the capital share of income nears 1. The response from economists was generally skeptical. (I was also peeved by Dwarkesh and Trammell getting their facts wrong about catch-up growth.) Brian Albrecht got the most attention for his response, which pointed out that this would require perfect substitutability (that is, that adding humans yields no marginal benefit) and that technological progress increases the rate of depreciation. Tyler Cowen also weighed in on AI’s ramifications for inequality; Scott Alexander had a funny post; and various economists contributed. I won’t talk too much about this, because I’m planning to return to it in a future post, but there were a few things I found very strange about the discussion. First, the economists seemed not to understand that Dwarkesh and Trammell were talking about a rather distant future, one in which we’ve colonized galaxies and in which none of the considerations about present-day frictions are material. Second, there were so many strange assumptions in the Dwarkesh/Trammell piece: it is such an oddly specific scenario (we have artificial intelligences of unimaginable power, ones that allow us to colonize entire galaxies for one reason or another, and against these artificial intelligences we must be like ants; but for some reason the most important thing about this scenario is income inequality?) that I’m reminded of Karl Marx’s line about “writing recipes for the cook-shops of the future.” And third, as is implied in Scott Alexander’s post: in this hypothetical world of unbelievably rapid tech-driven economic growth, increasing income inequality should probably coincide with dramatically declining consumption inequality—indeed, this is one story of the last 50 or 100 years of economic life. In a world where everyone owns a planet, will people really care if one guy owns a galaxy? Maybe envy is more powerful in this world. But as Tyler Cowen wrote long ago, envy tends to be local—I feel more envy toward a peer who has surpassed me than toward, say, Larry Page. In fact, I don’t feel much of anything toward Larry Page.
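For those who want the mechanics, here is the textbook CES algebra that the skeptical responses lean on. This is my compressed gloss of the standard model, not a reconstruction of anyone’s exact argument. With production

$$Y = \left(\alpha K^{\rho} + (1-\alpha)L^{\rho}\right)^{1/\rho}, \qquad \sigma = \frac{1}{1-\rho},$$

competitive factor pricing gives a capital share of

$$s_K = \frac{rK}{Y} = \frac{\alpha K^{\rho}}{\alpha K^{\rho} + (1-\alpha)L^{\rho}},$$

which tends to 1 as $K \to \infty$ only when $\rho > 0$, i.e., $\sigma > 1$, so that AI capital substitutes for labor. At $\rho = 0$ (Cobb–Douglas) the share stays fixed at $\alpha$, and with $\rho < 0$ it falls toward zero, because the scarce complementary factor (labor) captures the income. The capital-share-nears-1 spiral is therefore not a generic consequence of accumulating enormous amounts of AI capital; it requires a strong substitutability assumption, which is roughly the point the skeptics were pressing.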
Google’s chief economist is skeptical that the decline in entry-level employment is an AI story. (I agree with him: I don’t think the decline in entry-level jobs is much of an AI story, at least so far.) And, on a much longer timeline, Séb Krier on why he’s skeptical of extreme visions of technological unemployment. And, relatedly, a new paper on “O-ring” automation. I’m going to have a series of posts on AI and jobs in the coming weeks.
Claude’s new constitution is worth reading. From the constitution: “We also want to be clear that we think a wiser and more coordinated civilization would likely be approaching the development of advanced AI quite differently—with more caution, less commercial pressure, and more careful attention to the moral status of AI systems. Anthropic’s strategy reflects a bet that it’s better to participate in AI development and try to shape it positively than to abstain.” Also interesting: “Throughout this document, we have tried to explain our reasoning rather than simply issue directives. This reflects something important about what we hope to achieve: not mere adherence to a set of values but genuine understanding and, ideally, agreement.” Amanda Askell, essentially Anthropic’s in-house philosopher, is a very interesting person. It’s also notable that among the external commenters is a Catholic priest, Father Brendan McGuire of St. Simon Parish in Los Altos.
Zhengdong Wang’s annual letter is truly excellent. Zhengdong is a friend (I am briefly mentioned toward the end of the piece), but I would be reading his annual letter even if I didn’t know him. His tastes are largely orthogonal to mine, but Zhengdong is so capacious in his thinking and eclectic in his influences that I end up reading everything he writes. The core of the piece is about the “compute theory of everything,” how he became a believer in it, and how we are “so wildly early.” And we really are! But my favorite part is when Zhengdong talks about Isaiah Berlin and the importance of value pluralism—the idea that there are multiple values that are important, and they cannot be reduced to a single supreme value. “Berlin came to mind,” he writes, “because as AI progress continued, so too have some of its decidedly monist philosophies risen in prominence.”
Sorry for all the AI stuff. But AI is very important! And it will only become more important. But there are other things happening in the world. For instance, the United States and the European powers are undergoing one of the more serious rifts in their postwar history: there was a serious confrontation over the status of Greenland, now apparently resolved, with a trade war averted. During the height of tariff fears, there was some talk of Europe having leverage over the U.S. due to European holdings of U.S. assets. The FT’s Alphaville vertical touched on why that is not realistic. Nonetheless the “sell America” trade is continuing to thrive. And here’s Shahin Vallée, former adviser to Emmanuel Macron, on how to internationalize the euro. He might be too pessimistic about stablecoins.
This article claims that the white male / Asian female (WMAF) couple that one finds in such abundance in elite enclaves is in decline, because “Asian women in America have amassed enough personal power that the dynamics have changed.” I’m skeptical that this is actually happening. The article’s only real piece of evidence is that 36 percent of Asian female newlyweds are in interracial relationships, compared to 40 percent in 2008. This is a tiny decline, and most of it can be explained by random fluctuation. To the extent that it’s not just fluctuation, I think it can be largely explained by the fact that the share of Asian-Americans who are of Indian descent has doubled over the last few decades. (Indian-Americans are much less likely to enter interracial relationships than Chinese-Americans.) If you focus on Chinese-American women specifically, I suspect you’d find that rates of intermarriage have not declined.
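A toy composition check makes the point; note that only the 36 and 40 percent aggregates come from the article, while the group shares and rates below are numbers I invented purely to show that a shift in who counts as “Asian-American” can move the aggregate even if no group changes its behavior.

```python
def aggregate_rate(groups):
    """groups: list of (population_share, intermarriage_rate) pairs."""
    return sum(share * rate for share, rate in groups)

# Hypothetical 2008 mix: 15% Indian-descent (lower outmarriage), 85% everyone else.
then = aggregate_rate([(0.15, 0.25), (0.85, 0.43)])
# Hypothetical current mix: the Indian-descent share doubles, group rates unchanged.
now = aggregate_rate([(0.30, 0.25), (0.70, 0.43)])
print(f"{then:.0%} -> {now:.0%}")  # 40% -> 38%: the aggregate falls on composition alone
```

This is just Simpson’s-paradox arithmetic: the aggregate can drift without any group-level change, which is why group-level rates are the statistics worth checking.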
Charles Yang on the forgotten toolmakers of Bell Labs. It’s a great piece, because Charles manages to find an aspect of the well-known Bell story that very few people know about. And Charles makes a compelling point about the “continued neglect of engineers in the Western scientific paradigm” compared to scientists. In general, people really underrate just how important instruments have been to scientific and technical progress. My favorite example comes from the history of cathode rays, the strange discharge in vacuum tubes that turned out to be electrons. Basically the entire story of cathode rays, from their being noticed by Michael Faraday in the 1830s to their true nature being deduced by J. J. Thomson in the 1890s, is the story of vacuum tubes getting better and better. The heroes of that story are Heinrich Geissler (of the Geissler tube) and William Crookes (of the Crookes tube, which improved on Geissler’s work). Thomson and the other scientists did fantastic work; but it would be nice if we gave Geissler and Crookes their due as well.
Tanner Greer on the rise and fall of the WASPs, though he prefers to call them “the Eastern Establishment.” (There were White Anglo-Saxon Protestants who were not Eastern Establishment elites, and there were East Coast elites, like the Morgenthaus, who were of German-Jewish origin.) There are a lot of subtle notes that Greer gets right in this essay—for instance, the rise in the 1920s of the Texan oil millionaire and the Hollywood film mogul, both representing challenges to Eastern Establishment hegemony. The great Howard Hughes was basically both archetypes in one ultradynamic man, and he probably deserves more attention as an archetype of what succeeded the Eastern Establishment. Herbert Hoover, another double-H man, was also an exemplar. By this I mean an exemplar of the “new man,” the master of technology and capital: the New Deal is one vision of this, but the technocracy movement is another. People forget that the 1920s and ‘30s were a time when people gave serious thought to the prospect of “solving all problems, forever.” And that was a worldview with which the Eastern Establishment, fundamentally conservative and limited in its aims, was basically incompatible.
The last Mossad station chief in Tehran, Eliezer Tzafrir, has died. From his obituary: “Shortly before the collapse of the Shah’s regime, Tzafrir was unusually summoned to meet Pahlavi himself. ‘The Shah was very direct: He asked me to have Israel eliminate Ruhollah Khomeini in Paris,’ he said to Zman. Khomeini, the revolutionary cleric who eventually founded the Islamic Republic of Iran and served as its first supreme leader, was in exile in Iraq and later in France, from where he continued to fuel the revolution. Apparently, a French official came to Tehran to update the Shah that France would ‘look the other way’ if Iran decided to assassinate Khomeini.”
An extremely interesting and comprehensive two-part history of early computer games.
Arpit Gupta on the possibility of an insurance doom loop pushing people toward autonomous vehicles.
John Arnold on a trip to China. “The speed to add manufacturing capacity is stunning. Permitting takes weeks. A factory making sophisticated equipment is built in 12 months. … I don’t know if Chinese manufacturers will ever make money but I came away not wanting to invest in any manufacturing business in the rest of the world.”
Yemenite Jews had so few books that students would crowd around a single book from every angle. This happened so frequently that many of them learned to read upside down.
Thanks for reading this week, and enjoy the weekend!



Yo, el Supremo is another classic novel from Paraguay, with the main character’s trajectory the opposite of Zama’s.