"It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." - Not Mark Twain
I think the rationalist mantra of "If It’s Worth Doing, It’s Worth Doing With Made-Up Statistics" will turn out to hurt our information landscape much more than it helps.
> In my world, you generally want models to have strong conceptual justifications or empirical validation with existing data before you go making decisions based off their predictions: this fails at both.
Thank you for putting succinctly into words what I've been feeling for months. Thanks for all this invaluable work; I hope this post gets a ton of views and subsequent critique itself.
I'm leaving the same comment here and in reply to eli on lesswrong.
First, thank you for engaging in good faith and rewarding deep critique. Hopefully this dialogue will help people understand the disagreements over AI development and modelling better, so they can make their own judgements.
I think I’ll hold off on replying to most of the points there, and make my judgement after Eli does an in-depth writeup of the new model. However, I did see that there was more argumentation over the superexponential curve, so I’ll try out some more critiques here: not as confident about these, but hopefully it sparks discussion.
The impressive achievements in LLM capabilities since GPT-2 have been driven by many factors, such as drastically increased compute, drastically increased training data, algorithmic innovations such as chain-of-thought, increases in AI workforce, etc. The extent that each contributes is a matter of debate, which we can save for when you properly write up your new model.
Now, let’s look for a second at what happens when the curve goes extreme: using median parameters and starting the superexponentional today, the time horizon of AI would improve from one-thousand work-years to ten-thousand work-years In around five weeks. So you release a model, and it scores 80% on 1000 work year tasks, but only like 40% on 10,000 work year tasks (the current ratio of 50% to 80% time horizons is like 4:1 or so). Then five weeks later you release a new model, and now the reliability on the much harder tasks has doubled to 80%.
Why? What causes the reliability to shoot up in five weeks? The change in the amount of available compute, reference data, or labor force will not be significant in that time, and algorithmic breakthroughs do not come with regularity. It can’t be due to any algorithmic speedups from AI development because that’s in a different part of the model: we’re talking about three weeks of normal AI development, like it’s being done by openAI as it currently stands.. If the AI is only 30x faster than humans, then the time required for the AI to do the thousand year task is 33 years! So where does this come from? Will we have developed the perfect algorithm, such that AI no longer needs retraining?
I think a mistake could be made in trying to transfer intuition about humans to AI here: perhaps the intuition is “hey, a human who is good enough to do a 1 year task well can probably be trusted to do a 10 year task”.
However, if a human is trying to reliably do a “100 year” task (a task that would take a team of a hundred about a year to do), this might involve spending several years getting an extra degree in the subject, read a ton of literature, improving their productivity, get mentored by an expert in the subject, etc. While they work on it, they learn new stuff and their actual neurons get rewired.
But the AI equivalent to this would be getting new algorithms, new data, new computing power, new training. ie, becoming an entirely new model, which would take significantly more than a few weeks to be built. I think there may be some double counting going on between this superexp and the superexp from algo speedups. ity, get mentored by an expert in the subject, etc. While they work on it, they learn new stuff and their actual neurons get rewired.
But the AI equivalent to this would be getting new algorithms, new data, new computing power, new training. ie, becoming an entirely new model, which would take significantly more than a few weeks to be built. I think there may be some double counting going on between this superexp and the superexp from algo speedups.
I remarked to Eli yesterday that people seem to really hate the superexponential even though it doesn't really affect the bottom line that much; I could have avoided so much flak if I had just used a faster exponential trend (akin to your "the new normal" curve) and made some qualitative arguments for why I thought things were going to go faster than the METR fit, such as (1) the feedback loop from AI partially automating AI R&D, and (2) my hypothesis that doubling horizon length should in some sense get inherently easier the longer your horizon lengths already are. (i.e. the skills you need to operate at horizon 2N are not that different from the skills you need to operate at horizon N, for large N)
I'm curious if you think that would have been an improvement, and if so, how significant of an improvement.
I respect what AI 2027 is doing (and there's way worst forecasts out there idk if you've seen Greg Colbourn's but it brings a tear to my eye with how horribly horribly Yudpilled it is) but this is good work
Even with low p(doom) i appreciate what they're doing to promote AI safety but like many Rats their work seems and is delivered as a lot more airtight than it is
I appreciated the critique (though I partly agree with AI 2027's responses). I've seen many critiques of AI 2027, and this is one of very few good ones. Two objections that I haven't seen anyone else raise:
"So a backast towards 2022 predicts an AI R&D speedup factor of around 0.6 for both type of forecasts. With a current factor of about 1.1, this means that a backcast is modelling that current AI progress is 66% faster than it was in 2022."
Backcasting the speedup to values below 1 is an unfair extrapolation. The speedup models how much faster using AI tools is than doing stuff manually. Even if the backcast perfectly describes how historical progress *would have proceeded using AI tools* (i.e. the model does its job perfectly in that regime), it won't describe actual historical progress. Back when using AI tools would've had a <1 speedup, people would've just done stuff manually, putting the effective progress speedup at 1, regardless of the theoretical using-AI-tools speedup.
"Most of these models predict superhuman coders in the near term, within the next ten years. This is because most of them share the assumption that a) current trends will continue for the foreseeable future, b) that “superhuman coding” is possible to achieve in the near future, and c) that the METR time horizons are a reasonable metric for AI progress."
You don't need assumption b. It follows from a and c. If time horizons are the (coding) task length that AI can do reliably, and that length becomes long enough, which will happen if current progress continues, then you get SCs (minus some nuances about cost which are quite likely to be resolved).
Rather different from your methods, but the strongest case against ASI I know is that ASI makes the Fermi Paradox way worse.
"It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." - Not Mark Twain
I think the rationalist mantra of "If It’s Worth Doing, It’s Worth Doing With Made-Up Statistics" will turn out to hurt our information landscape much more than it helps.
> In my world, you generally want models to have strong conceptual justifications or empirical validation with existing data before you go making decisions based off their predictions: this fails at both.
Thank you for putting succinctly into words what I've been feeling for months. Thanks for all this invaluable work; I hope this post gets a ton of views and subsequent critique itself.
Hey! Thanks for this critique. Due to formatting issues our reply can be found here: https://www.lesswrong.com/posts/PAYfmG2aRbdb74mEp/a-deep-critique-of-ai-2027-s-bad-timeline-models?commentId=pFp3WoJ7RoQPwELDr
I'm leaving the same comment here and in reply to Eli on LessWrong.
First, thank you for engaging in good faith and rewarding deep critique. Hopefully this dialogue will help people understand the disagreements over AI development and modelling better, so they can make their own judgements.
I think I’ll hold off on replying to most of the points there, and make my judgement after Eli does an in-depth writeup of the new model. However, I did see that there was more argumentation over the superexponential curve, so I’ll try out some more critiques here: not as confident about these, but hopefully it sparks discussion.
The impressive achievements in LLM capabilities since GPT-2 have been driven by many factors: drastically increased compute, drastically increased training data, algorithmic innovations such as chain-of-thought, increases in the AI workforce, and so on. The extent to which each contributes is a matter of debate, which we can save for when you properly write up your new model.
Now, let’s look for a second at what happens when the curve goes extreme: using median parameters and starting the superexponential today, the time horizon of AI would improve from one thousand work-years to ten thousand work-years in around five weeks. So you release a model, and it scores 80% on 1,000-work-year tasks but only around 40% on 10,000-work-year tasks (the current ratio between the 50% and 80% time horizons is roughly 4:1). Then five weeks later you release a new model, and reliability on the much harder tasks has doubled to 80%.
Why? What causes reliability to shoot up in five weeks? The amount of available compute, reference data, and labor will not change significantly in that time, and algorithmic breakthroughs do not arrive with any regularity. It can’t be due to algorithmic speedups from AI automating AI development, because that’s in a different part of the model: we’re talking about five weeks of normal AI development, as done by OpenAI as it currently stands. And if the AI is only 30x faster than humans, the time required for the AI to actually do the thousand-year task is 33 years! So where does this come from? Will we have developed the perfect algorithm, such that AI no longer needs retraining?
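To make the arithmetic concrete, here is a minimal sketch of the kind of curve being described, where each successive doubling of the horizon takes a constant fraction less calendar time than the previous one. The starting horizon, first doubling time, and shrink factor below are illustrative guesses, not AI 2027's actual fitted parameters.

```python
# A minimal sketch of a superexponential time-horizon curve: each doubling
# of the horizon takes a constant fraction less calendar time than the last.
# All parameter values here are illustrative guesses, not AI 2027's fit.
import math

WORK_YEAR_HOURS = 2000  # assumed hours in a work-year

def months_until_horizon(target_hours,
                         start_hours=1.5,            # assumed current horizon, in hours
                         first_doubling_months=4.5,  # assumed
                         shrink_per_doubling=0.9):   # assumed
    """Calendar months until the horizon grows from start_hours to target_hours."""
    n = math.log2(target_hours / start_hours)   # doublings needed
    full, frac = int(n), n - int(n)
    # geometric series of shrinking doubling times, plus the partial last doubling
    t = first_doubling_months * (1 - shrink_per_doubling**full) / (1 - shrink_per_doubling)
    return t + first_doubling_months * shrink_per_doubling**full * frac

weeks = 4.345 * (months_until_horizon(10_000 * WORK_YEAR_HOURS)
                 - months_until_horizon(1_000 * WORK_YEAR_HOURS))
print(f"1,000 -> 10,000 work-year horizon in ~{weeks:.0f} weeks")
# With these made-up numbers this prints a handful of weeks, even though
# actually executing a 1,000 work-year task at a 30x speedup would still
# take roughly 33 calendar years.
```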
I think a mistake could be made in trying to transfer intuition about humans to AI here: perhaps the intuition is “hey, a human who is good enough to do a 1 year task well can probably be trusted to do a 10 year task”.
However, if a human is trying to reliably do a “100 year” task (a task that would take a team of a hundred about a year to do), this might involve spending several years getting an extra degree in the subject, reading a ton of literature, improving their productivity, getting mentored by an expert in the subject, and so on. While they work on it, they learn new things and their actual neurons get rewired.
But the AI equivalent of this would be getting new algorithms, new data, new computing power, new training: i.e., becoming an entirely new model, which would take significantly more than a few weeks to build. I think there may be some double counting going on between this superexponential and the superexponential from algorithmic speedups.
(This is not a full response yet, sorry)
I remarked to Eli yesterday that people seem to really hate the superexponential even though it doesn't really affect the bottom line that much; I could have avoided so much flak if I had just used a faster exponential trend (akin to your "the new normal" curve) and made some qualitative arguments for why I thought things were going to go faster than the METR fit, such as (1) the feedback loop from AI partially automating AI R&D, and (2) my hypothesis that doubling horizon length should in some sense get inherently easier the longer your horizon lengths already are. (i.e. the skills you need to operate at horizon 2N are not that different from the skills you need to operate at horizon N, for large N)
I'm curious if you think that would have been an improvement, and if so, how significant of an improvement.
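As a rough illustration of the comparison being described (made-up numbers throughout, not the actual AI 2027 parameters): a superexponential with shrinking doubling times and a plain exponential with a somewhat faster fixed doubling time can end up reaching a long horizon at fairly similar dates.

```python
# Sketch: compare time-to-reach a long horizon under (a) a superexponential
# with shrinking doubling times and (b) a plain exponential with a faster
# fixed doubling time. Every number below is an illustrative assumption.
import math

def months_to_reach(target_hours, start_hours=1.5,
                    first_doubling_months=4.5, shrink=1.0):
    """shrink=1.0 is a plain exponential; shrink<1.0 shrinks each doubling time."""
    n = math.log2(target_hours / start_hours)
    if shrink == 1.0:
        return first_doubling_months * n
    full, frac = int(n), n - int(n)
    t = first_doubling_months * (1 - shrink**full) / (1 - shrink)
    return t + first_doubling_months * shrink**full * frac

LONG_HORIZON = 160 * 2000  # assumed ~160 work-years, in hours

super_exp = months_to_reach(LONG_HORIZON, shrink=0.9)
fast_exp  = months_to_reach(LONG_HORIZON, first_doubling_months=3.0)
print(f"superexponential: ~{super_exp/12:.1f} yrs; faster exponential: ~{fast_exp/12:.1f} yrs")
# With these assumptions the two curves arrive within a year or so of each
# other, which is the "doesn't really affect the bottom line much" point.
```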
I respect what AI 2027 is doing (and there are way worse forecasts out there; idk if you've seen Greg Colbourn's, but it brings a tear to my eye with how horribly, horribly Yudpilled it is), but this is good work.
Even with a low p(doom), I appreciate what they're doing to promote AI safety, but like many Rats, their work seems, and is delivered as, a lot more airtight than it actually is.
Only part-way through, but thoroughly agree with your points. Thank you for diving deep here!
I appreciated the critique (though I partly agree with AI 2027's responses). I've seen many critiques of AI 2027, and this is one of very few good ones. Two objections that I haven't seen anyone else raise:
"So a backast towards 2022 predicts an AI R&D speedup factor of around 0.6 for both type of forecasts. With a current factor of about 1.1, this means that a backcast is modelling that current AI progress is 66% faster than it was in 2022."
Backcasting the speedup to values below 1 is an unfair extrapolation. The speedup models how much faster using AI tools is than doing stuff manually. Even if the backcast perfectly describes how historical progress *would have proceeded using AI tools* (i.e. the model does its job perfectly in that regime), it won't describe actual historical progress. Back when using AI tools would've had a <1 speedup, people would've just done stuff manually, putting the effective progress speedup at 1, regardless of the theoretical using-AI-tools speedup.
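A toy way to state the objection in code (my framing, not anything taken from the model itself):

```python
# Toy illustration of the objection: even if the fitted curve says AI tools
# would have been a 0.6x "speedup" back in 2022, researchers would simply
# have worked manually instead, so realized progress never drops below 1x.
def effective_speedup(modeled_ai_tool_speedup: float) -> float:
    # fall back to manual work whenever the tools would slow you down
    return max(1.0, modeled_ai_tool_speedup)

print(effective_speedup(0.6))  # 2022 backcast -> 1.0 in practice
print(effective_speedup(1.1))  # roughly today -> 1.1
```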
"Most of these models predict superhuman coders in the near term, within the next ten years. This is because most of them share the assumption that a) current trends will continue for the foreseeable future, b) that “superhuman coding” is possible to achieve in the near future, and c) that the METR time horizons are a reasonable metric for AI progress."
You don't need assumption b. It follows from a and c. If time horizons are the (coding) task length that AI can do reliably, and that length becomes long enough, which will happen if current progress continues, then you get SCs (minus some nuances about cost which are quite likely to be resolved).
Rather different from your methods, but the strongest case against ASI I know is that ASI makes the Fermi Paradox way worse.
“this is a task that would take an immortal CS graduate a thousand years to accomplish,”
The Windows OS would take an immortal CS graduate millions of years to write, but it’s not too hard to imagine an AI doing it within a year.