🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik

Latent Space: The AI Engineer Podcast

April 20, 2026

Today, we explain this piece of “clickbait” from our guest!
Speakers: Ron Alfa, Daniel Bear, RJ Honicky, Brandon Anderson
**Ron Alfa** (0:00)
So we basically opened the lab, we hired a team, we got all the instruments, we started sourcing tumor samples.
There was no prior here that any of this would work. Like zero. We just started generating data: sourcing human tumors, processing them. We built this whole processing pipeline to get the tumors into these arrays and formats. So you've got these two-week runs where you're processing two slides, and we're just churning data for months. And we couldn't even train a model. So we sort of just built all this, and then, let's say 18 months later: hey, I wonder, can we train a model off this? And it wasn't obvious. There wasn't really anything major to go off of. I mean, there were transformers developed for single-cell data. There just weren't really datasets out there that people had been able to develop on. We do a lot of custom model building.

**RJ Honicky** (0:55)
Hi there, I'm RJ Honicky, and this is Brandon Anderson. We're the co-hosts of the Latent Space Science Podcast. And today we're really happy to be in the studio with some of the people from Noetik.

**Ron Alfa** (1:06)
I'm Ron Alfa, co-founder and CEO of Noetik, physician-scientist by training. My hobbies are making hot takes about AI curing cancer.

**Daniel Bear**
Hi, I'm Dan Bear. I'm VP of AI at Noetik. I'm a biologist by training. I did PhD work in neuroscience and then moved into comp neuro, computer vision, self-supervised learning, and have been doing AI research at Noetik for the past few years.

**RJ Honicky** (1:32)
Maybe we should start with: what is Noetik? Why did you found it? What is the difference between Noetik and the other virtual cell companies?

**Ron Alfa** (1:41)
Maybe just start with a little bit of a contrarian thesis, which is really the reason for founding Noetik.
We all know the numbers: 90 percent, 95 percent of cancer drugs fail in the clinic.
Why do they fail? Our thesis is they fail not because we're bad at pharmacology, not because we're bad at target selection or at making the drug. We're actually better at that process than we have ever been in the history of drug development. Most of those drugs fail, we'd argue, because we're bad at selecting which patients those drugs are going to work in. There is no placebo effect in cancer, and oftentimes you see trials where some patients respond to these drugs. If you have a patient that responds, that tells you there's some biology that's active there, but you have a problem in patient selection. Really, that's the thesis behind it: can we build models that fundamentally understand patient biology from the very beginning, and help you position molecules in the right patient population?

**RJ Honicky** (2:39)
So you're actually using the models, partly at least, to select the patient cohort. You can imagine it working either way: you could say, I think that this molecule will do well because I know something about the patient population, but you could also say, I think that this patient population is the match for this molecule.

**Ron Alfa** (2:59)
That's where the power of the models is: once you've trained these models on patient data, you can use them on both sides of the equation. You can use them for discovering new targets directly from the patient data, which people often refer to as reverse translation: starting from humans and trying to understand which targets to go after. Then you can use that to develop molecules.
But you can also use them directly on patient data. If you have, let's say, a phase 2 or phase 3 trial, you can use these models to understand which patients, or what underlying biology of the patients in the trial, is a predictor of response. We've been doing a ton of that recently.

**RJ Honicky** (3:43)
Are you doing a lot of rescuing trials that had a bad effect?

**Ron Alfa** (3:48)
We are doing a lot of looking at data from phase 2 and phase 3 trials, and then using the models essentially to run inference on patient biopsies and understand whether there's underlying biology that would help us design the next trial. We haven't shared any of that yet, but you'll see this soon.

**Brandon Anderson** (4:07)
So cancer is kind of infamous in that way. There are many, many different types of cancers. Whenever someone says, like, cure cancer, that is almost a meaningless, fatuous statement. So your point is that even within cancer, where you pick a specific type of cancer, and then a subtype, and a subtype, there's a bunch of different patient populations, and each one of them will respond differently to drugs. And your point is you can figure this out right now.
