John Schulman on dead ends, scaling RL, and building research institutions

Cursor

December 17, 2025

A conversation with John Schulman on the first year LLMs could have been useful, building research teams, and where RL goes from here.

00:00 - Speedrunning ChatGPT
09:22 - Archetypes of research managers
11:56 - Was OpenAI inspired by Bell Labs?
Speakers: John Schulman
**SPEAKER_1** (0:20)
If the group of people that started OpenAI went back to 2015-2016 and wanted to speedrun building ChatGPT, how fast could they do it? What would be the bottlenecks keeping them from doing it even faster? And what moves would that group make that would be different from what actually happened?

**John Schulman** (0:38)
Yeah, I think if you wanted to make ChatGPT with a lot less compute, you could. And we've seen things like NanoGPT that sort of do this. I mean, like sometimes it's easier to do something with more compute, but then by adding more clever tricks, you can do it with less compute.
I mean, also, I guess we could have scaled a lot faster, or it would have been possible to scale, if we had known that the returns would be what they were. Yeah, I think if you wanted to do it a lot earlier, and you had the whole recipe in mind, you probably could have built it a lot earlier. You could put together a big cluster and pre-train a model, and then, given all the things we know now about post-training, you can effectively increase your compute a lot by doing post-training better. So even if it takes a GPT-3-level model to create a good few-shot-prompted chat model, if you're willing to do a lot of fine-tuning and construct the fine-tuning dataset in a clever way, you can get a much smaller model to be quite good.

**SPEAKER_1** (1:55)
How many people do you think it would have required? And what year do you think it could have been done? And how many GPUs?

**John Schulman** (2:02)
I mean, if we assume full hindsight, I think...

**SPEAKER_1** (2:04)
Full hindsight, yeah.

**John Schulman** (2:05)
So NanoChat is just programmed by one person, and it runs on one box. That took him probably like half a year to write, so that's at least an upper bound. Obviously, this is on like H100s, and we would have had like V100s or something earlier.
But I think we could have networked together a few GPU boxes. You could have gotten something that was ChatGPT 3.5 level maybe back in 2018 or 2019 with a couple of people. I might be underestimating all the different parts of the stack, but I think you could... Yeah, if you had a few talented people working for a year or so with full hindsight, I think you could get something. Actually, this is also building on pre-training datasets and scrapes that other people have done. So, yeah, I haven't thought this through fully, but I'd say you could probably do something back in 2018 or 2019 with a few people that would get to GPT-3.5 level. And maybe in the future, we'll get even more extreme, and there will be the demoscene ChatGPT: one file that trains the whole thing and scrapes the web and does the whole thing in a day of training.

**SPEAKER_1** (3:31)
Well, so, OpenAI is one of the biggest companies in the world now from a market capitalization standpoint and, among technology companies, maybe CapEx investment. But I think it's easy to lose sight of how informal and kind of ragtag a group it was early on. I'd be curious if you agree with that premise, that it really was a group that felt very scaled down, informal, you know, maybe stuff felt much less weighty in kind of 2016, 2017. And then maybe to help us fill in a picture of what early OpenAI looked like: I'm curious, what was one false start that the group worked on? Like a project that just was a complete dead end, didn't work, and now doesn't really get talked about as much in 2025?

**John Schulman** (4:17)
Yeah, I'd say early on it was sort of more ragtag, maybe even a little bit like an academic group. I mean, there were just a bunch of different research projects that people were working on, sort of driven by their own taste. And people were working in groups of one, two, three people on some kind of research project that would turn into a paper or a blog post. So I'd say the first couple of years of OpenAI had a lot of that flavor. I mean, there was also the idea of big projects, the idea that, compared to academia, we could go a lot further by doing serious engineering and putting together bigger groups of people on a project. So I'd say that idea was with us the whole time. And we were also influenced by DeepMind, who had pioneered this way of working to a large extent with projects like AlphaGo.
