AI Can’t “Cure All Diseases” Until It Beats Phase 2

One of the big dreams of AI researchers is that AI will soon solve drug discovery and unleash a boom in new life-saving therapies. Alphabet committed $600 million in new capital to Isomorphic Labs on that rhetoric, promising to “cure all diseases” as its first AI‑designed molecules head to humans next year. And the first wave of AI molecules is moving quickly, with Insilico, Recursion, Exscientia, Nimbus, DeepCure, and others all touting pipelines flush with AI‑generated candidates.

I can’t help but step back and ask if these AI efforts are focused on the right problem. We have no doubt increased the shots on goal upstream in the drug discovery process and (hopefully) have improved the quality of drug candidates being prosecuted.  

But have we solved the Phase 2 problem with AI yet? I think the jury is still out.  

As a young McKinsey consultant, I was staffed on several projects benchmarking R&D for pharma companies, analyzing the probability of success for molecules graduating from Phase 1 through Phase 3 and achieving regulatory approval. Two decades and billions of dollars in R&D later, the brutal statistic that is impossible to ignore is that more than 70 percent of development programs still die in Phase 2.

Phase 2 timelines, meanwhile, stretched from 23.1 to 29.4 months between 2020 and 2023 as narrower inclusion criteria collided with stagnant site productivity. Dose‑finding missteps and operational glitches matter, but lack of efficacy still explains most Phase 2 failures, and that comes down to how well we understand human biology.

Human‑biology validation 1.0 — population genetics and its ceiling

When Amgen bought deCODE in 2012, it placed a $415 million bet that large‑scale germ‑line sequencing could de‑risk targets by exploiting “experiments of nature.” I remember the puzzlement in the industry over why a drug company would acquire a genomics company with an Icelandic cohort, but Amgen’s leadership had an inspired vision for human genetics. The purchase was less about PCSK9, whose genetic validation and clinical program were already well advanced, and more about institutionalizing a genetics-first playbook for the next wave of targets. PCSK9 showed the concept works; deCODE was Amgen’s bet that lightning could strike again, this time in-house rather than through the literature. Regeneron followed a cleaner genetics-first path: its in-house Genetics Center linked ANGPTL3 loss-of-function to ultra-low lipids, and the company later developed evinacumab, now approved for homozygous familial hypercholesterolaemia.

Yet even these success stories expose the model’s constraints. The deCODE Icelandic cohort is 94 percent Scandinavian; it produces brilliant cardiovascular signals but scant power in oncology, autoimmune disease, or psychiatry. Variants of large effect are vanishingly rare; deCODE’s 400,000 individuals yielded only thirty high‑confidence loss‑of‑function genes with drug‑like tractability in its first decade. More importantly, germ‑line data are static and de‑identified. Researchers cannot pull a fresh sample or biopsy from a knock‑out when a resistance mechanism appears, nor can they prospectively route those carriers into an adaptive arm without new consent and ethics review.

National mega‑registries were meant to fix that scale problem. The UK Biobank now pairs half a million exomes with three decades of clinical metrics, All of Us has over 450,000 electronic health records, and Singapore’s SG100K is sequencing a hundred thousand diverse genomes. Each has already contributed massively to science (UKB linked Lp(a) to coronary risk; All of Us resolved ancestry‑specific HDL loci), yet they remain fundamentally retrospective with high latency. Access to UK Biobank takes a median of fifteen weeks from application to data release, and physical samples trigger an additional governance review whose queue exceeded 2,000 requests in 2024. All of Us explicitly bars direct re‑contact of participants except under a separate ancillary‑study board, adding six to nine months before a living cohort can be re‑surveyed. SG100K requires separate negotiation with every contributing hospital before a single tube can leave the freezer. None of these infrastructures was built for real‑time iteration, and so none of them breaks the Phase 2 bottleneck.

Twenty years after deCODE, the first hint that real‑time human biology could collapse development timelines came from Penn Medicine. By keeping leukapheresis, viral‑vector engineering, cytokine assays, and the clinic within one building, the Abramson group iterated through more than a hundred vector designs in four years and delivered CTL019, later commercialized by Novartis as Kymriah. Even in that earlier era, the triumph proved that proximity and feedback loops matter.

Human‑biology validation 2.0 — live tissue, live data, live patients

I believe the next generation of translational engines should be built around a simple rule: test the drug on the same biology it is meant to treat, while that biology is still evolving inside the patient. Academic hubs and data‑first companies can now collect biopsies and blood draws in real time, run single‑cell or organoid assays rapidly, and stream the results into AI and ML models that sit on the same network as the electronic health record. Because the material is fresh, the read‑outs still carry the stromal, immune and epigenetic signals that drive clinical response. In controlled comparisons, patient-derived organoid (PDO) assays explain roughly two-thirds of clinical response variance; immortal lines barely crack ten percent. The effect is practical, not academic. And the payoff: drugs that light up fresh tissue advance into enriched cohorts with a much higher chance of clinical benefit.

The loop does more than accelerate timelines. Serial sampling turns the platform into a resistance radar: if an AML clone abandons BCL‑2 dependence and switches to CD70, the lab confirms whether a CD70 antibody kills the new population and, if it does, the inclusion criteria change before the next enrollment wave. What begins as rapid failure avoidance quickly translates into higher positive‑predictive value for efficacy—fewer false starts, more shots on goal that land.

Put simply, live‑biology platforms might do for Phase 2 what human genetics did for target selection: they raise the pre‑test odds. Only this time the bet is placed at the moment of clinical proof‑of‑concept, when the stakes are highest and the cost of guessing wrong is measured in nine figures.
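
A rough way to see what “raising the pre‑test odds” means is to run the numbers through Bayes' rule. The Python sketch below is illustrative only: the 30 percent base rate is taken loosely from the Phase 2 failure figure above, and the assay's sensitivity and specificity are hypothetical values of my own choosing, not measurements from any platform.

```python
def post_assay_probability(base_rate, sensitivity, specificity):
    """Bayes' rule: P(drug is efficacious | live-tissue assay was positive)."""
    true_positive = base_rate * sensitivity
    false_positive = (1.0 - base_rate) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)


# All three inputs are illustrative assumptions, not measured values.
BASE_RATE = 0.30    # share of programs assumed efficacious going into Phase 2
SENSITIVITY = 0.80  # hypothetical: assay flags 80% of drugs that truly work
SPECIFICITY = 0.80  # hypothetical: assay clears 80% of drugs that do not

if __name__ == "__main__":
    enriched = post_assay_probability(BASE_RATE, SENSITIVITY, SPECIFICITY)
    print(f"probability of efficacy without the assay: {BASE_RATE:.2f}")
    print(f"probability of efficacy after a positive assay: {enriched:.2f}")
```

Under these assumed inputs, a positive read‑out roughly doubles the probability that a program entering Phase 2 is truly efficacious, from 0.30 to about 0.63. The real gain depends entirely on how predictive the assay turns out to be in practice.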

The academic medical center’s moment

Academic medical centers already hold the raw ingredients for this 21st‑century learning healthcare system: biobanks, CLIA labs, petabytes of historical EHR data, and a captive patient population. What they typically lack is integration. Tissue flows into siloed freezers; governance teams treat every data pull as bespoke; pathologists and computational scientists report to different deans. Institutions that solder those pieces into a single engine are becoming indispensable to AI chemists and to capital.

Privacy is no longer the show‑stopper; the tools to protect it—tokenized patient IDs, one‑time broad consent, and secure cloud pipelines—already work in practice. The real lift is technical and operational. A live‑biology hub needs a single ethics board that can clear new assays in days, a Part 11–compliant cloud that crunches multi‑omic data at AI scale, and a wet‑lab team able to turn a fresh biopsy into single‑cell or spatial read‑outs before the patient’s next visit. Just as important, it needs a funding model in partnership with pharma that pays for translational speed and clinical impact, not for papers or posters.

From hype to human proof

The next leap in drug development will come when AI‑driven chemistry meets the living biology that only hospitals can provide. Molecules generated overnight will matter only if they are tested, refined, and validated in the same patients whose samples inspire them. Almost every academic medical center already holds the raw materials—tissue, data, expertise—to close that loop. What we need now is the ambition to connect the pieces and the partnerships to keep the engine running at clinical speed. If you are building, funding, regulating, or championing this kind of “live‑biology” platform, I want to hear from you. Let’s compare notes and turn today’s proof points into tomorrow’s standard of care.
