Could billions of dollars in AI funding lead to the same number of new drugs?

Drug discovery has quickly become the most enticing place to apply artificial intelligence. Billions of dollars are being invested in AI-driven “techbios.” In an industry where nothing changes overnight, even large biopharma companies are touting AI as key to how they’re transforming their discovery engines.

But in the race to integrate AI into drug discovery, the industry is investing heavily in scaling one part of the system while overlooking the rest. Failing to reimagine R&D systems to handle the new speed and scale of AI-driven discovery risks overpromising and under-delivering to the people who need new medicines.


There’s no question that new AI models for drug discovery deserve serious attention. Within the next five to 10 years, AI will fundamentally change the way drugs are designed, with the potential to produce an order of magnitude more high-quality candidates against a broad range of new diseases. In the last year alone, AI has been used to identify novel targets in areas like cardiomyopathy, generate novel antibodies, and even design newer modalities like optimized mRNA vaccines for influenza.

But the focus for AI can’t just be on discovery. The rest of the pharma R&D system — how drugs are developed, tested, approved, and manufactured — will need to accommodate this vast increase in speed and scale.

Pharma R&D has been heading in the wrong direction, becoming progressively less efficient over the last few decades. Large pharma companies now spend more than $6 billion on R&D per approved drug, compared to just $40 million (in today’s dollars) in the 1950s, with approximately 85% of that spending coming after discovery. The mismatch between near-zero-cost AI-driven drug discovery and the skyrocketing costs of clinical trials and regulatory approval will create a bottleneck, keeping promising drugs from reaching patients unless AI is applied equally across the entire R&D lifecycle. As co-founders of Benchling, a technology company focused on life science R&D, we’ve heard about this issue from scientists, R&D leaders, and CEOs across more than 1,200 companies.


To be sure, the biopharma industry has long known it needs to evolve its R&D systems. The paper describing the rapid decline in R&D efficiency, often known as Eroom’s Law, is more than a decade old. AI, which is already disrupting nearly every industry by bringing automation, scalability, and intelligence, should be used to improve every part of the R&D lifecycle, not just discovery, to increase throughput and efficiency.

Rethinking the R&D lifecycle with AI

This is no small lift: it will involve rethinking how R&D organizations operate, rather than simply layering AI onto how things work today.

Just look at how AI-driven drug discovery is already putting new and different pressures on lab-based experimentation. Drug candidates discovered with AI need to be tested through experiments in the lab, which in turn generate experimental data that is used to further refine AI models through a “lab in a loop” process. An enormous influx of new AI-generated candidates and new data-hungry AI models means labs need to run experiments at a vastly greater scale.
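To make the loop concrete, here is a minimal, hypothetical sketch of one design-test-learn iteration in Python. The generative model, the assay call, and the batch size are placeholders standing in for whatever a given lab actually uses, not any specific vendor's tooling.

```python
# Hypothetical sketch of a "lab in a loop" cycle; `generative_model` and
# `run_assay` are stand-ins, not a real API.
def lab_in_a_loop(generative_model, run_assay, n_rounds=5, batch_size=96):
    """Alternate between AI-driven design and lab measurement."""
    training_data = []  # accumulated (candidate, measured_activity) pairs
    for round_idx in range(n_rounds):
        # 1. Design: the model proposes a batch of candidates.
        candidates = generative_model.propose(batch_size)
        # 2. Test: the lab measures each candidate experimentally.
        results = [(c, run_assay(c)) for c in candidates]
        # 3. Learn: measured data flow back to refine the model.
        training_data.extend(results)
        generative_model.update(training_data)
        best = max(activity for _, activity in results)
        print(f"round {round_idx}: best measured activity = {best:.3g}")
    return training_data
```

The point of the sketch is the scaling pressure: every additional design round multiplies the number of physical experiments the lab has to run.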

That can’t be achieved by simply optimizing the manual processes at the bench that labs rely on today. Instead, labs need to be reinvented around complex imaging and single-cell omics assays that allow a more complete understanding of biology, along with robotic automation that enables these assays to be run at scale. Although this trend has already started, AI applications can rapidly accelerate it.

The dearth of specialized engineers needed to analyze data from complex assays and build robotic orchestration is a key bottleneck. AI models like scGPT can accelerate data analysis by automating code-intensive tasks such as reference mapping or cell annotation. AI agents will also enable scientists to set up robotic automation through natural language, democratizing access across the industry.
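As a rough illustration of what reference mapping means in practice, the workflow can already be scripted with standard single-cell tooling; the sketch below uses scanpy's ingest-based label transfer on its public demo datasets, a conventional baseline rather than scGPT itself.

```python
# Baseline reference mapping with scanpy (not scGPT): transfer cell-type
# labels from an annotated reference onto new, unannotated cells.
import scanpy as sc

adata_ref = sc.datasets.pbmc3k_processed()   # annotated reference ('louvain' labels)
adata_new = sc.datasets.pbmc68k_reduced()    # query dataset to annotate

# Restrict both datasets to their shared genes.
shared_genes = adata_ref.var_names.intersection(adata_new.var_names)
adata_ref = adata_ref[:, shared_genes]
adata_new = adata_new[:, shared_genes]

# Build the reference embedding, then map the new cells into it.
sc.pp.pca(adata_ref)
sc.pp.neighbors(adata_ref)
sc.tl.umap(adata_ref)
sc.tl.ingest(adata_new, adata_ref, obs="louvain")

print(adata_new.obs["louvain"].value_counts())
```

Foundation models like scGPT aim to make this kind of annotation more accurate and more automatic, and AI agents promise to generate and run pipelines like this without a specialist writing each one by hand.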

Advancements in AI also have the potential to address key challenges in clinical trials. Take, as an example, patient recruitment, the most time-consuming part of a trial. Even though millions of people are needed to participate in clinical trials, fewer than 5% of Americans have participated in clinical research of any kind.

Sound clinical trial design requires randomization, but that takes participants out of the driver’s seat — they no longer have final say over their treatment decisions, creating a barrier to recruiting. In 2022, the European Medicines Agency provided a qualification opinion allowing the use of AI models to develop predicted control outcomes for Phase 2/3 trials from historical control data, ultimately requiring fewer participants to make this difficult randomization choice.
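The idea can be illustrated with a toy example of prognostic covariate adjustment, one way predicted control outcomes get used: a model trained on historical control data predicts each new participant's expected outcome, and that prediction is included in the trial analysis so the treatment effect can be estimated with fewer randomized controls. The data and models below are synthetic, and the actual qualified methodology is considerably more involved.

```python
# Toy sketch (synthetic data): use a prognostic model trained on historical
# control-arm data as a covariate when analyzing a new, smaller trial.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
true_coefs = np.array([1.0, 0.5, -0.3, 0.0, 0.2])

# Historical control data: baseline features -> observed outcomes.
X_hist = rng.normal(size=(1000, 5))
y_hist = X_hist @ true_coefs + rng.normal(scale=1.0, size=1000)
prognostic_model = GradientBoostingRegressor().fit(X_hist, y_hist)

# New randomized trial with far fewer participants (true effect = 0.8).
X_trial = rng.normal(size=(200, 5))
treated = rng.integers(0, 2, size=200)
y_trial = X_trial @ true_coefs + 0.8 * treated + rng.normal(scale=1.0, size=200)

# Adjust for the predicted control outcome to sharpen the estimate.
predicted_outcome = prognostic_model.predict(X_trial)
design = np.column_stack([treated, predicted_outcome])
estimated_effect = LinearRegression().fit(design, y_trial).coef_[0]
print(f"estimated treatment effect: {estimated_effect:.2f} (true effect 0.8)")
```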

Beyond the lab work and clinical trials required to get a product to market, there is also a great deal of knowledge work, from R&D managers reporting decisions at key program milestones to medical writers drafting filings for health authorities, quality assurance staff confirming data integrity, and much more. This knowledge work is about translating R&D data into decisions and documentation, and it requires answering questions and generating content in the natural language of scientists. Scientific large language models, such as BioGPT (built on the GPT architecture) and BioMedGPT-LM (fine-tuned from Llama), have obvious potential.
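BioGPT, for instance, is distributed through the Hugging Face transformers library, so generating biomedical text with it takes only a few lines; the prompt below is illustrative, and production use would of course involve fine-tuning and careful validation.

```python
# Minimal sketch: biomedical text generation with BioGPT via Hugging Face
# transformers (requires `pip install transformers torch sacremoses`).
from transformers import BioGptForCausalLM, BioGptTokenizer

tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")

prompt = "CFTR modulators are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, num_beams=5, early_stopping=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```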

But a generative pre-trained transformer built for scientific language isn’t enough. The real challenge is rethinking how the underlying R&D data are structured and managed. To automate knowledge work, these large language models need to operate on top of data that comes from the lab, and that foundation has many problems today. Large pharma companies often employ hundreds of software applications within R&D labs alone, leading to data silos and a lack of data standardization and interoperability that makes it excruciatingly difficult to apply AI effectively. At Benchling, we’re taking the same modern platform approach that has transformed how many businesses digitally manage their sales or financial data and applying it to R&D to make automation of knowledge work a reality.
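As a toy illustration of why that foundation matters, consider harmonizing results from two differently structured lab systems into one shared schema that a language model or analysis pipeline can reliably query; every field name and format here is invented for the example.

```python
# Toy illustration of data harmonization: map records from two hypothetical
# lab systems into one shared schema. All field names here are invented.
from dataclasses import dataclass

@dataclass
class AssayResult:
    sample_id: str
    assay: str
    value: float
    unit: str

def from_plate_reader(row: dict) -> AssayResult:
    # e.g. instrument export: {"well_sample": "S-001", "od600": "0.42"}
    return AssayResult(row["well_sample"], "OD600", float(row["od600"]), "AU")

def from_lims_export(row: dict) -> AssayResult:
    # e.g. LIMS export: {"SampleID": "S-002", "Test": "OD600", "Result": 0.38, "Units": "AU"}
    return AssayResult(row["SampleID"], row["Test"], float(row["Result"]), row["Units"])

records = [
    from_plate_reader({"well_sample": "S-001", "od600": "0.42"}),
    from_lims_export({"SampleID": "S-002", "Test": "OD600", "Result": 0.38, "Units": "AU"}),
]
print(records)  # one consistent structure, regardless of the source system
```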

AI will change how pharma competes

Another key element in the work of biopharma companies needs to be reinvented to make the promise of AI in biotech a reality: how companies compete.

Profluent Bio, a Berkeley, Calif.-based biotech, recently open-sourced a novel AI-designed, CRISPR-based human gene editor. To open source such intellectual property was previously unthinkable. In an era where scientists working with AI can design many more drugs than we could possibly ever bring to market (imagine a world of drug abundance!), the competitive focus will shift away from protecting intellectual property and toward speed to market and creating step changes in the efficiency of clinical trials and regulatory approvals.

This shift will, in turn, help solve the biggest issue in making AI-driven drug discovery even more powerful: access to data. The historic focus on intellectual property has created an industry culture that treats all experimental data as proprietary. Yet the success of AlphaFold2 and AlphaFold3 is entirely predicated on the public availability of protein sequences and experimentally-resolved structures. Progress in developing new foundation models will require data abundance.

Open-source software has long been a pillar of the tech industry, fundamentally altering the nature of collaboration and competition. If the biopharma industry wants to realize the benefits of AI, companies must work together to generate the data needed to power it. Companies are already starting to collaborate on open-source software projects around managing data in areas like molecular modeling, connectivity to lab instruments, and bioinformatics code. Pre-competitive collaboration can extend even further to how AI models themselves are built. Federated learning allows companies to update a shared global model without sharing their underlying datasets with competitors. This approach has already shown significant improvement in AI models for small molecules, and could have an even larger impact for large molecules if companies invest in it together.
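Federated averaging, the simplest version of this idea, can be sketched in a few lines: each participant trains on its own private data and shares only model parameters, which a coordinator averages into the next global model. The example below uses synthetic data and plain gradient steps; real cross-company deployments layer on secure aggregation, privacy protections, and far larger models.

```python
# Minimal sketch of federated averaging (FedAvg) on synthetic data: each
# "company" trains locally; only model weights ever leave each site.
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0, 0.5])

def make_private_dataset(n=500):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

companies = [make_private_dataset() for _ in range(3)]  # three private datasets
global_w = np.zeros(3)

for _ in range(20):                                  # federated rounds
    local_weights = []
    for X, y in companies:
        w = global_w.copy()
        for _ in range(5):                           # a few local gradient steps
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= 0.05 * grad
        local_weights.append(w)                      # share weights, not data
    global_w = np.mean(local_weights, axis=0)        # coordinator averages

print("learned:", np.round(global_w, 2), "true:", true_w)
```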

The era of biotech AI

The era of rational drug design — an atom by atom, computer-aided approach to designing drugs for a specific target — started more than 30 years ago. It had a tremendous impact, leading to breakthroughs for debilitating diseases like cystic fibrosis. We are now entering an era of AI-driven drug discovery, which promises to be AI’s greatest contribution to humanity by finding treatments for the thousands of currently untreatable diseases.

If the bulk of biopharma R&D continues to operate as it does today, a treasure trove of new drugs will be created that may never make their way to patients. Applying AI beyond drug discovery and using it to reinvent all elements of R&D will ensure that doesn’t happen.

Ashu Singhal and Sajith Wickramasekara are the co-founders of Benchling, a company that provides a cloud platform for life sciences R&D. Singhal is the company’s president; Wickramasekara is its CEO.