Artificial [jagged/operational] General Intelligence *is* here

tl;dr version: AjoGI = (jagged expert-level competence across many cognitive tasks) + (operational outcomes: focus on what it does, not what it “is”) is here. AGI, or at least AoGI, isn't far away. This doesn't mean we'll have Einstein-in-a-box, but let's also not underrate AI or humans*.

Einstein is not the benchmark, even for a Nobel Prize-winning physicist.
------

I originally wrote a longer version of this post on December 31, 2025, with the title "The AI Skeptic's guide to 2026". However, it kept growing and growing, and I didn't publish it.
I feel compelled to revisit it, truncate it and post some thoughts today because a rather excellent piece appeared in Nature this past week *and* because I have been amazed by the capabilities of some of the models lately.

The Nature article by Eddy Chen, Mikhail Belkin, Leon Bergen, and David Danks (combined expertise in philosophy, data science, AI and linguistics) makes a strong claim: by any reasonable criteria, the vision of human-level machine intelligence has arrived.  

  • They ask us not to be distracted by the necessarily vague definition of AGI, and define “general intelligence” as breadth + depth, not perfection or universality (no human can do every cognitive task).
  • They argue that the common “do almost all tasks a human can do” definition contains a trap: which human? If you mean a composite superhuman expert, then essentially no individual human qualifies either.
  • They also address many common criticisms.

One doesn't have to agree with everything (or anything) presented in this piece, but it lays out the arguments better than any article I have come across on this topic, if only as a launch point for discussion. I strongly recommend reading it with an open mind (neither pro nor anti) before dissecting the arguments. Again, no need to take it as definitive, but it is a good starting point.

The article covers a lot of ground and says many of the things I wanted to say (and does so more eloquently and with more authority), so I won't repeat what's in there. Here are just a few additional thoughts:

a) Since the summer of 2023, I have mainly focused on the rate at which models were getting better, and not so much on how good they are. I believe we have reached a point where model capabilities are astounding on some tasks. The jumps from 3.5 -> 4.0 -> o1 -> 5 were all big, but what has shocked me is how big and how quick the jump from 5 -> 5.2 has been (same with Claude recently), and we all know there is a lot more unhobbling to do.

b) On the above point, as cheesy as "PhD-level intelligence" sounded, frontier models are now *undeniably* better than the top PhD students on some tasks. Of course, a PhD is much more than accomplishing some tasks, but that's not my point here.

c) The 'agentic AI' of 2025 has mostly been unimpressive, functioning more like a glorified wrapper. That is going to change dramatically in 2026, because putting these models in a multi-agent setting (think 100 Claude Opus 4.6 instances running simultaneously, some of them adversarially, all interacting with tools) will be a game changer. A toy sketch of the idea follows.
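
To make that pattern concrete, here is a toy Python sketch of "many instances, a few adversarial". Everything in it is illustrative: `call_model` is a stand-in for whatever provider API you actually use, and the roles, counts, and prompts are made up for the example.

```python
# Illustrative only: call_model is a placeholder, not a real provider API.
import asyncio
import random


async def call_model(role: str, prompt: str) -> str:
    """Stand-in for a real model call (an HTTP request to a hosted model, say)."""
    await asyncio.sleep(random.uniform(0.1, 0.3))  # pretend network latency
    return f"[{role}] response to: {prompt[:60]}"


async def run_swarm(task: str, n_solvers: int = 8, n_critics: int = 2) -> dict:
    # Most instances draft solutions in parallel...
    proposals = await asyncio.gather(
        *(call_model(f"solver-{i}", task) for i in range(n_solvers))
    )
    # ...while a few others are pointed at those drafts adversarially.
    critiques = await asyncio.gather(
        *(call_model(f"critic-{j}", f"find flaws in: {proposals}") for j in range(n_critics))
    )
    return {"proposals": list(proposals), "critiques": list(critiques)}


if __name__ == "__main__":
    print(asyncio.run(run_swarm("prove or refute claim X")))
```

The point is not this particular toy, but that the coordination, the parallelism, and the adversarial checking live in the scaffolding around the model calls, not in any single call.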

d) The big revelation for me, by Spring 2023, was that human intelligence is not as special as I had thought (Note: I am not saying it is not special, just that quite a bit of what it can do can be recreated. Again, I am not saying all of it can be recreated). Every jump in capability reinforces this opinion.


e) I have thus far maintained (and even ranted) that "AGI" is a distraction and that we'll all be better off focusing on how to make these models do useful things (as an example, see page 3 here). I admit I was wrong. AjoGI is here, and as amorphous as the definition of AGI is, I do not think it is too far away, nor do I think it is unachievable (see the figure above).

-   j is for 'jagged' and refers to the fact that the models do some cognitive tasks at the level of experts but fail (though not as catastrophically as they used to) at others.
-   o is for 'operational' and refers to looking at what these models do, rather than anthropomorphizing some vague notion of how they do it.

The Nature paper puts it this way: "We don’t expect a physicist to match Einstein’s insights, or a biologist to replicate Charles Darwin’s breakthroughs. Few, if any, humans have perfect depth even within specialist areas of competence. Human general intelligence does not require perfection; neither should AGI", and argues that by this (and other) measures, AGI is here. I'd call this AjoGI, and I believe AGI, or at least AoGI, isn't far away. 2030 seems far away to me, honestly!

This doesn't mean we'll have Einstein-in-a-box, but let's also not underrate humans or AI. The best physicists in the world don't generate Einstein-level insights; that doesn't mean they are not amazing. Let's use the same yardstick here, and also remember it is not AI vs. human, but AI + human + tool integration that will lead to many breakthroughs.

I'll conclude with something that really belongs in next week's post, but I want to leave it here because it provides context for some of my comments above.

Don't call these models "LLMs". That is so 2023.
Yes, even the Nature paper uses “LLM” constantly. A modern “model” in the wild is not just a static text generator. It is a proto-agentic system: a model wrapped in scaffolding, with planning, tool use, code execution, retrieval, memory, self-critique, verification loops, and reinforcement signals (yes, I am still talking about the commercially available GPT/Claude/Gemini models, through the standard interface; all of this runs in the background). And, by the way, it is not trained on just language. I still see a number of critics talking about modern models the same way they used to talk about GPT-3.5 or 4.0.

This matters because:

  • The biggest performance jumps often come from the scaffolding around the model, not the base generator.
  • These systems can interact with “reality” via tools: they write code, run tests, check math, query databases, simulate, iterate, and close the loop. That feedback changes the nature of the computation (a minimal sketch of such a loop follows this list).
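
Here is that loop in its most minimal form, again with a placeholder in place of the model call. `propose_solution` and the toy task are made up for illustration; this is a sketch of the general generate -> execute -> verify pattern, not any vendor's actual pipeline.

```python
# Minimal generate -> execute -> verify loop. propose_solution is a placeholder
# for a model call that returns Python source code as text.
from typing import Callable, Optional


def propose_solution(task: str, feedback: str) -> str:
    """Stand-in for a model call; a real system would send task + feedback to the model."""
    return "def add(a, b):\n    return a + b\n"


def verify(source: str, tests: Callable[[dict], None]) -> Optional[str]:
    """Run the generated code and its tests; return an error message, or None on success."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # actually execute the code, not just read it
        tests(namespace)         # close the loop with a concrete check
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


def solve(task: str, tests: Callable[[dict], None], max_iters: int = 3) -> str:
    feedback = ""
    for _ in range(max_iters):
        candidate = propose_solution(task, feedback)
        error = verify(candidate, tests)
        if error is None:
            return candidate     # verified against reality, not merely plausible text
        feedback = error         # feed the failure back into the next attempt
    raise RuntimeError("no verified solution within the iteration budget")


if __name__ == "__main__":
    def tests(ns: dict) -> None:
        assert ns["add"](2, 3) == 5

    print(solve("write add(a, b)", tests))
```

The execution feedback is what distinguishes this from "next-word completion": the system's output is filtered by whether it actually runs and passes checks, not by how plausible it sounds.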

So yes: the underlying base LLM generator may still be pre-trained (not "trained") with next-token prediction. But reducing the whole system to “next-word completion” is like describing a jet engine as “just spinning metal.” Seriously. We have reached that point.

Oh, and: you can still criticize these agentic systems! There are many things to criticize! Just don't take the LLM/Stochastic Parrot/autocomplete route. Please.


My next post will be a more detailed AI Skeptic/Critic's guide. 
