I tried the elephant prompt on ChatGPT. It took some convincing that the pictures were actually uploaded, but eventually it 'analysed' the images and spat out a response similar to yours.
Thanks for writing this! I hope you are right that AIs won't be able to automate AI R&D and that the intelligence explosion won't happen. I agree that overreliance on dumb AI systems might cause lots of problems. Some of what you said makes me feel like you have an axe to grind against mainstream AI safety people such as myself. I'm glad you chose to engage in dialogue with us, at least.
My overall take, which won't surprise you I suppose given that you've read AI 2027, is that I do expect the intelligence explosion to happen sometime in the next few years, and I think that afterwards, AIs will be scarily competent, such that the problems they cause will be of a different sort than the problems you describe. Also, I think that AIs will be rather incompetent before the intelligence explosion begins, such that the problems you describe could happen--but, for the most part, I think they won't happen (at least not in ways that shake civilization) because there simply won't be enough time for businesses and governments to adopt AI that heavily. A criticism we often get is that AI 2027 depicts AI adoption happening too quickly; people point out that things take time, there are frictions, red tape, etc. They usually say this about the 2028+ period in our scenario, where there is an army of superintelligences! The frictions and barriers to adoption are much higher, and harder to route around, for dumb pre-intelligence-explosion AI systems.
Thanks for taking the time to read it. Yes, a big part of the disagreement is on intelligence explosions: I've been highly skeptical of the idea ever since I encountered it a decade ago, and the advent of hallucinating LLMs has downgraded its likelihood to nigh impossibility in my mind. It's like trying to build a skyscraper on a foundation of mud.
I do have an axe to grind, and I don't apologise for it. I think the people promoting the idea of near-term superintelligence are both wrong and acting in a reckless manner. You do not have the epistemological rigor or degree of evidence required to justify the level of confidence you are displaying. Sending out AI 2027 as a viral short story before it has been thoroughly vetted and checked by independent experts is deeply irresponsible. Parts of it, like the fitting of superexponential curves, seem highly flawed from the brief look-throughs I've had of it. That doesn't matter to the layman, though: people are taking you at your word and making major life decisions based on an unreplicated, unreviewed blog post.
If you are wrong, you have essentially provided free hype to all the world's leading AI companies, and promoted a false threat model that leaves people vulnerable to the real threats of AI misuse and misdeployment. You absolutely are crying wolf here, and the consequences could be disastrous.
Hey, two can play that game! The consequences could be disastrous if people listen to me and I turn out to be wrong? Compare that to the consequences if people listen to you and you turn out to be wrong... (/if people don't listen to me and I turn out to be right...)
We just gotta do our best to figure out what's coming. AI 2027 is currently state of the art; you may not like it but it's what peak AI forecasting performance looks like. Yes, it'll probably be wrong in various ways; yes, the trend extrapolations and analysis could be improved, but it's nevertheless the most rigorous scenario forecast that currently exists. (And we had many experts look at it beforehand, and many more have read it and endorsed it afterwards). If you don't like it, we encourage you to go write a better one!
I'm glad we both agree that people having an incomplete picture of reality is dangerous.
The truth is, we don't know what's coming, and it's dangerous to pretend that we do. You have taken one extremely specific set of assumptions--one that almost every expert outside of your bubble disagrees with--and blasted it across the internet, prioritizing virality and influence over genuine truth-seeking. You wrote a short story before you did a sensitivity analysis.
Your analysis could not just "be improved"; it's shoddy as hell: as far as I can tell, the topline prediction of "AI 2027" is mostly determined by you putting something like half your probability mass on a specific "superexponential" curve, an assumption you barely justify.
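To make concrete why that one choice dominates, here's a rough toy extrapolation of my own (the starting horizon, target horizon, doubling time, and shrink factor are all illustrative numbers I'm making up; this is not their actual model):

```python
import numpy as np

# Toy illustration (my own, not the AI 2027 model): extrapolate a task-horizon
# trend under an exponential vs. a "superexponential" growth assumption and see
# how far apart the two crossing times end up. All numbers are made up.

start_horizon_min = 30.0          # assumed current task horizon, in minutes
target_horizon_min = 160_000.0    # assumed "AGI-relevant" horizon (~1 work-year)
doubling_time_months = 7.0        # assumed initial doubling time

def months_to_target_exponential(start, target, doubling):
    """Constant doubling time: the horizon doubles every `doubling` months."""
    doublings_needed = np.log2(target / start)
    return doublings_needed * doubling

def months_to_target_superexponential(start, target, doubling, shrink=0.9):
    """Each successive doubling takes `shrink` times as long as the previous one."""
    months, horizon, dt = 0.0, start, doubling
    while horizon < target:
        months += dt
        horizon *= 2
        dt *= shrink
    return months

print("exponential:     ", months_to_target_exponential(
    start_horizon_min, target_horizon_min, doubling_time_months), "months")
print("superexponential:", months_to_target_superexponential(
    start_horizon_min, target_horizon_min, doubling_time_months), "months")
```

With these made-up numbers the two assumptions land roughly three years apart, so however you split probability between them largely determines the headline date.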
Look, I genuinely respect that you put a lot of effort into the forecast, and that you consulted with some experts and so on. The actual data you gathered could be of interest to future modellers. But three guys writing a short blog post with extremely basic modelling techniques are not going to achieve a level of magical prognostication that beats all the world's best market analysts.
But while I'm here, could you explain why you excluded GPT-2 and GPT-3 from this graph here: https://x.com/DKokotajlo/status/1916520276843782582?
I'm glad of that as well. I don't pretend to know what's coming. Also, I really hope that I'm wrong and you are right. Seriously. I'd feel embarrassed in 2030 if AI has turned out to be a big empty hype bubble, but I'd also feel happy and grateful to be alive.
Yoshua Bengio (godfather of AI), Anthropic, OpenAI, GDM, and Dean Ball (White House) are all in my bubble? It must be a pretty huge bubble then.
I assure you we did not prioritize virality and influence over genuine truth-seeking. We spent almost a whole year working on this: we did multiple drafts, got hundreds of people to give feedback, built models, looked at data, etc. We had Scott rewrite it to be more engaging and easier to read, but it was a painful process because we kept having to correct the ways in which his rewrites made it less accurate. As for blasting it over the whole internet--all we did was a tweet thread, a Dwarkesh podcast, and an interview with Kevin Roose. ~Everything else was people reaching out to us afterwards. Would you advise us to turn them down? We did in fact turn some of them down; I'm looking forward to getting past all the interviews and returning to research.
As for beating the world's best market analysts: the market for predicting AGI capabilities trends is not exactly efficient. My experience talking to people at the quant firms is that they mostly aren't thinking very much about AGI at all; they don't have anything published that is close to as good as our stuff, and as far as I can tell there isn't anything better in private either. Yes, I agree, that's more a sad commentary on the sorry state of human civilization than glowing praise for our thing. Our thing is SOTA, but it's a shitty SOTA which hopefully will be dramatically improved upon.
As for the sensitivity analysis thing: look, the METR graph basically didn't influence AI 2027; it came out after AI 2027 was almost finished. The benchmarks+gaps argument was the core argument we used, and in any case I had had 2027 timelines for a few years before that. The sensitivity analysis for the METR extrapolation just wasn't a crux for me. I acknowledged plenty of times that the future is uncertain and that AGI could come sooner, or later, than 2027, or even never. In fact, around the time AI 2027 was published, I redid my thinking and updated towards 2028 as my median! And it's possible I'll update again. I think it's still valuable to make our best guesses and publish them, along with our reasoning, and still valuable to use concrete scenarios to illustrate our guesses, even though the future is uncertain. (We talk about this on the website in the "Why is this valuable?" section.)
Re: excluding GPT-2 and GPT-3: Good question, I'll ask Romeo--he made the graph.
Yeah, look, I don't want to be too harsh; I understand that a lot of work went into this. I think that some of the work can still be valuable, even if I think the forecast itself is fatally flawed.
I've had a deeper dive through the timelines forecast today and I'm noticing some problems with the math that look concerning (see my response to your other comment below). I may take some time to write up a critique in the next few weeks.
My general concern is that people are going to see all the fancy graphs and data, and the title "AI 2027", and assume that means there is a ton of serious data backing up the idea that AI will arrive in 2027. I don't think your team has done a good enough job of emphasising just how shaky the foundations of these forecasts are. Like, people might see your "probability mass" graphs and think those contain all the uncertainty involved, but that would be a mistake.
I bring up sensitivity analyses because I believe a lot of the parameters you put in there have next to zero effect on the topline results: I believe your final figure rests on much less of the data than one might naively expect from all the stuff you've compiled.
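To be clear about what I mean by a sensitivity analysis, here is the kind of crude one-at-a-time check I have in mind (the toy forecast function and parameter names below are mine and purely illustrative; they are not the actual AI 2027 model):

```python
import numpy as np

# Sketch of a one-at-a-time sensitivity check. The toy forecast function and
# its parameters are invented for illustration; this is not the AI 2027 model.

def toy_forecast_year(doubling_time=7.0, p_super=0.5, gap_years=1.0,
                      start_horizon=0.5, target_horizon=2000.0, shrink=0.9):
    """Median arrival year under a crude mixture of exponential and
    superexponential horizon growth (in hours), plus a fixed 'gap' afterwards."""
    doublings = np.log2(target_horizon / start_horizon)
    exp_years = doublings * doubling_time / 12.0
    super_years = sum(doubling_time * shrink**k for k in range(int(doublings) + 1)) / 12.0
    return 2025 + (1 - p_super) * exp_years + p_super * super_years + gap_years

baseline = dict(doubling_time=7.0, p_super=0.5, gap_years=1.0)
base_year = toy_forecast_year(**baseline)

# Halve and double each input in turn; a parameter that barely moves the
# output is doing very little work in the headline number.
for name, value in baseline.items():
    for factor in (0.5, 2.0):
        perturbed = {**baseline, name: value * factor}
        delta = toy_forecast_year(**perturbed) - base_year
        print(f"{name} x{factor}: median shifts by {delta:+.2f} years")
```

If most rows barely move the output while one or two dominate it, that's the sort of thing I think readers deserve to see stated up front.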
OK, I asked Romeo. He said he didn't think about it much, but basically, since GPT-2 and GPT-3 were so long ago, including them made the graph look ugly (almost all the data points way over on the right), and he figured they were low-quality data points anyway (like, GPT-3 was so dumb, we can squint at it and say maybe it's capable of doing 1-second tasks, but is that even meaningful?).
If you think it matters, we'd be happy to make a new version of the graph with those data points and give it to you and/or tweet it.
I don't think it makes sense to exclude GPT-3's 2-second horizon but include GPT-3.5's 8-second horizon. If one is meaningless, the other should be too. And the actual effect of excluding them is to take out two points that would sit well below the drawn curve.
Also: the text on the graph says there is a "15%" speedup happening, but the AI 2027 timelines report says the speedup is only 10%. Which is the actual value? Is the curve on the graph the same as the one in the report?
This was excellent!
Next time, consider splitting posts like this into two posts: the short story and the analysis. I kept putting it off, despite knowing you write great stuff, because I was intimidated by the 39-minute reading time.
Also consider editing this post to give the "Slopworld ingredients" subsections H4 subheaders, so people can scan them and find them again using the sidebar navigation menu (on the left).