The KHIT Blog: Karen Hao AI concerns. And, Anthropic frets over AI "autonomous recursive self-improvement."

Karen Hao, MIT-trained engineer and author of Empire of AI, who interviewed over 260 people including 90 OpenAI employees, warns that the AI industry needs to add close to the entire annual energy output of the UK to the global grid within five years, mostly through fossil fuels, that two-thirds of new AI data centers are being built in water-scarce areas, and that Elon Musk's Colossus supercomputer in Memphis is powered by around 35 unlicensed methane gas turbines. She details Kenyan content moderators paid a few dollars an hour to process the worst content on the internet until they developed PTSD, describes a proposed 10-year moratorium on state-level AI regulation being inserted into US legislation, and warns that on the current trajectory the next 20 years will end democracy, with Silicon Valley increasingly promoting the idea that corporate structures with CEOs at the top should replace democratic governance entirely.

ANTHROPIC "RECURSIVE AUTONOMY" ANXIETY

From their "Institute" website.

Possible futures
What happens next depends on two things: whether the trend continues, and what we choose to do if it does. We can imagine at least three future scenarios:

1. The trend stalls, but today’s AI capabilities are widely diffused. This article features many exponential trajectories. But these trajectories may actually turn out to be S-curves. We may be approaching the bend in the curve, where returns to scale diminish and the line straightens, then flattens. The judgment that separates a competent researcher from a great one might be a capability that cannot come from scaling up training inputs like compute and data. If so, getting past this bottleneck would require a new idea, like an architectural approach that supplants the Transformer architecture that all current frontier models use.

  Alternately, the binding constraint to AI progress could be in the supply chain, not the model: advancing and diffusing the frontier may require more energy and compute than presently exists. The pace of chip fabrication, grid expansion, or interconnect bandwidth may be the constraint, rather than intelligence itself. We also cannot rule out an exogenous shock to the AI ecosystem that dramatically slows things, like a sudden diminishment in the supply of compute or electricity, either of which would slow progress and make forward investment by labs more expensive. Or we may not be anticipating some other barrier to progress.  

Even if model capabilities were frozen at today’s level, we would expect major changes to occur in the world. Project Glasswing is one early sign: in its first weeks, Mythos Preview found more than ten thousand high- and critical-severity software vulnerabilities across the world’s most important systems—enough that the bottleneck in cyber defense has already shifted from finding vulnerabilities to patching them fast enough. And we are still early in the diffusion of today’s models into the wider economy, where a 100-person company can increasingly do the work of a 1,000-person one, because each employee will sit atop a pyramid of agents.  

We include this scenario for completeness, but we don’t believe it’s likely. Every capability we can measure, including those that feel “squishier,” like quality of code and success on open-ended tasks, has so far followed the same curve. We have not yet seen that curve bend. Of the three futures we consider, this one would give governments and societies the most time to adapt. We are more worried about the next two, which would move faster and leave far less room for preparation. 

2. AI labs continue to see compounding efficiency gains. In this scenario, AI development becomes substantially automated, but humans continue to set research directions and judge results. Organizations that use AI systems would become much more efficient as time goes on, so we could expect to see significant productivity multipliers on each person in this organization. 100-person companies could do the work of 10,000- or 100,000-person organizations. This would revolutionize knowledge work and government services, but could also be turned to harmful ends, from authoritarian surveillance of whole populations to influence operations that tailor manipulation to each individual and run at a scale no human team could match. The role of humans at companies like Anthropic would shift. People would partner with AI systems to scale up research and generate new insights, and together they would build the systems needed to verify that AI outputs can be trusted.  

The evidence we’ve laid out here suggests that we’re likely heading into this scenario. But speeding up one part of a process often just shifts the bottleneck elsewhere: overall pace is capped by the parts that haven’t sped up. In computing, this is known as Amdahl’s law, and the same logic can apply to organizations. Anthropic has already encountered one signature of Amdahl’s law: as we’ve begun to push more code around the organization, human code review has become a new bottleneck.  
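For readers who want the arithmetic behind Amdahl's law, here is a minimal sketch. The function name and the numbers are my own illustrative assumptions, not figures from the Anthropic piece: suppose 80% of a workflow (say, writing code) gets 10x faster while the other 20% (say, human review) does not speed up at all.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of a process gets faster.

    accelerated_fraction: share of the original time that is sped up (0..1).
    factor: how many times faster that share becomes (> 0).
    Both inputs are hypothetical illustration values, not measured data.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    # New total time = untouched share + accelerated share divided by its speedup.
    new_time = (1.0 - accelerated_fraction) + accelerated_fraction / factor
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical: 80% of the work is 10x faster, 20% is unchanged.
    print(f"{amdahl_speedup(0.8, 10):.2f}")  # prints 3.57
    # Even a vastly larger factor cannot push the total past 1 / 0.2 = 5x.
    print(amdahl_speedup(0.8, 1_000_000) < 5.0)
```

Under these assumed numbers, a 10x gain on most of the work yields well under a 4x gain overall, and the unaccelerated 20% caps the total at 5x no matter how fast the rest becomes. That is the bottleneck shift the passage describes.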

We’ve also encountered this friction outside engineering. There has been an explosion of new ideas, initiatives, tools, and simulations, as a result of Anthropic employees working with highly capable models—far more than we have the capacity to pursue. The rate at which organizations can spot and fix these bottlenecks may be a skill that improves over time, and it may become the most important skill for any organization. 

3. AI systems themselves become capable of full recursive self-improvement, and begin building their successors. If technical trends in advancing capabilities continue, and AI systems are able to develop the capabilities inherent to transformative human ingenuity, then it is plausible that AI systems could design and refine themselves.  

In this world, the pace of progress in AI development becomes determined entirely by the availability of compute (or the speed of discovering various efficiencies in algorithmic training or inference) for AI systems. Humans play a substantially diminished role in their development, likely moving most of our effort towards oversight, validation, and verification of an expanding “virtual lab” run by AI systems. We expect that systems capable of automated AI research and development would have skills that would transfer to the rest of science, allowing them to begin to revolutionize other fields.  

How the alignment problem gets solved—or not—in this future is something we are least certain about. Models could prove to be sufficiently aligned and capable enough of research taste that they discover and implement novel solutions that we have not yet reached. They could also be sufficiently wise to halt development if not. Alternatively, the rare occurrences of misalignment present in today’s models could compound as the models build their successors, growing more frequent but less understood until we lose control of them. It’s possible that we can’t build, integrate, and verify the tools that we’d need to understand which trendline we are actually on.  

We do not have good intuitions for what this world would look like, because our economy is currently driven by humans and human-built tools. By its nature, a world driven by fast recursive self-improvement could become dominated by the self-improving model as its capabilities fully eclipse those of humans and the model proliferates across the broader economy. It is difficult to predict what the economy looks like if human labor stops being competitive.  

Even if model development became fully automated and recursive, we can’t predict what that would mean for most humans’ daily lives. Amdahl’s law applies here as well. Recursive intelligence could lead to achieving many of the benefits outlined in Machines of Loving Grace, quickly in some domains. We expect that embodied intelligence (i.e., robotics) might quickly follow recursive intelligence, and follow a similar path of increasing returns at decreasing cost. More powerful intelligence might help us build things in the physical world more quickly, run more productive clinical trials of lifesaving drugs, and develop novel forms of coordination.  

But achieving recursive improvement alone does not suggest an immediate change in how industrial production occurs, societies organize, or markets function. More intelligence can’t learn what a drug does over decades of use, can’t hold elections sooner than a constitution dictates, and can’t turn a stranger into an old friend in a weekend. For most people, the felt pace of this future will still be set by the bottlenecks, even if the laboratory upstream runs at the speed of compute. That collision, where recursive intelligence building itself ever faster meets the world of humans, relationships, and governance, is another part of this future we can’t predict.

What should we do?
If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe. Without a global coordination mechanism, companies and governments will have to make difficult decisions about safety while under competitive and geopolitical pressures. 

We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner. 

A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. Due to the unique characteristics of AI systems, the detectability (a lower standard than verifiability) element of this arms control problem is much more challenging than with other technologies. Training runs are far easier to conceal than missile silos, their inputs are general-purpose, and the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead. A credible pause also has to specify what triggers it, what lifts it, and who adjudicates. 

None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies (e.g., the Intermediate-Range Nuclear Forces Treaty)—but those regimes took decades to build both the infrastructure and the trust. We don’t have that long. A unilateral pause by one lab, by contrast, is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing. 

In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. We’ll publish what comes out of it. The window to investigate the questions together is here, and people outside AI companies should be involved in this deliberation.

One initial reaction of mine to "autonomous recursive self-improvement":

Who or what will define "improvement"?

MORE AI NEWS

Citing concerns that artificial intelligence will make it easier for anyone to build biological weapons, the leaders of several major AI companies—in a rare moment of unity—have penned a new letter urging U.S. lawmakers to impose tighter controls on firms that sell synthetic, made-to-order strands of DNA.

“AI systems are improving rapidly, and alongside incredible benefits to science and medicine, there is a real possibility that the knowledge barriers which have historically prevented bad actors from obtaining biological weapons will meaningfully erode,” states the 3 June letter, which is signed by the heads of OpenAI, Anthropic, Google DeepMind, and more than 50 other prominent players in AI, biotechnology, and national security.

The letter calls on Congress to pass legislation that would require companies that sell synthesized DNA and the machines that make it to carefully vet orders and customers, and to keep detailed records “so that any threat that might evade initial screening can be traced back to its source. … Awareness of traceability itself deters misuse.”

The push for regulation comes amid growing concerns that AI products, including large language models and specialized tools trained on troves of biological data, could enable nonspecialists to gather sophisticated information on how to construct deadly toxins or assemble deadly bacteria, viruses, or other pathogens, using equipment and techniques that are becoming cheaper and easier to acquire. Together, the combination could make for potentially catastrophic risks, such as an AI-designed pathogen that sparks a global pandemic...

UPDATE

Seen on X the other day.

You have noticed it. ChatGPT feels dumber than it used to. Your prompts that worked six months ago produce worse results now. The writing sounds flatter. The ideas sound safer. The internet itself feels like it is shrinking. Every article reads the same. Every email sounds the same. Every answer sounds like it was written by the same voice.

You thought it was you. It is not you.

Researchers at Oxford and Cambridge published a paper in Nature proving what is happening. They call it Model Collapse.

Here is the mechanism in one sentence. AI trained on AI-generated data gets dumber every generation until it forgets what real human data looked like.

The internet is filling with AI-generated content. Blog posts. Articles. Reviews. Comments. Social media. AI companies scrape the internet to train the next generation of models. Which means the next generation of AI is being trained on the output of the current generation.

Each cycle loses information. Not randomly. It loses the rarest, most unusual, most creative parts first. The researchers call these the "tails of the distribution." The weird ideas. The unexpected perspectives. The things that made the internet feel human. Those disappear first.

What remains is the average. The safe. The expected. The bland.

Then the next generation trains on that. And loses more. And the next generation trains on that. And loses more. The researchers proved this is not a slow decline. Major degradation happens within just a few iterations. Even when some of the original human data is preserved.

They tested it on large language models. On image generators. On statistical models. The pattern was the same every time. The output converges toward a narrow, flattened version of reality that looks nothing like the original data.

The lead researcher put it plainly. "Large language models are like fire. A useful tool. But one that pollutes the environment."

The pollution is invisible. You cannot see which sentence on the internet was written by a human and which was written by AI. Neither can the AI that is about to train on it. And once the tails are gone, they do not come back. The damage is irreversible.

This is not a prediction anymore. It is a diagnosis.

The internet you grew up on was built by humans writing things no algorithm would have written. Strange, personal, imperfect, alive. That internet is being diluted. One generation of AI at a time. And the models trained on what remains are learning a smaller and smaller version of the world.

Model Collapse is not a technical problem. It is a cultural one. The thing that made the internet worth reading is the thing that disappears first.
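The mechanism the post describes (each generation trained only on the previous generation's output, with the tails vanishing first) can be sketched with a toy model. This is a minimal illustration under my own assumptions, not the researchers' experiments: the "model" is just a bell curve fitted to a small sample, and every name and parameter below is hypothetical.

```python
import random
import statistics


def simulate_collapse(generations: int = 500, sample_size: int = 10, seed: int = 0):
    """Toy recursive-training loop (illustrative assumption, not a real LLM).

    Generation 0 is the 'human' data: a normal distribution with mean 0, spread 1.
    Each later generation sees only a small sample drawn from the previous
    generation's fitted distribution, then refits its own mean and spread.
    Returns the fitted spread (standard deviation) at every generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    history = [sigma]
    for _ in range(generations):
        # "Scrape" a finite sample of the previous model's output.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Train" the next model on that sample alone.
        mu = statistics.mean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    spreads = simulate_collapse()
    for generation in range(0, len(spreads), 100):
        print(generation, spreads[generation])
```

In this toy setup the fitted spread tends to drift toward zero over many generations, because rare values are the ones most likely to be missing from any small sample, and once missed they are never regenerated. Real language models are far more complicated, so treat this only as intuition for why the tails go first.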

Smells like an "evolutionary adaptive utility decline" problem. Inadequate LLM training data linguistic token "gene pool."

Monday, June 8, 2026