The healthcare industry is experiencing a data and technological revolution that is accelerating drug discovery and delivery to patients. Over the past 15 years, we’ve seen an explosion of targeted therapies, including small molecules, immunotherapies, and cell-based treatments, all made possible by the integration of large-scale ‘omics data with clinical insights, treatment approaches, and patient outcomes. Despite these advances, we still have a long way to go before the insights gained from these breakthroughs reliably drive timely discoveries. The gap is stark: of the more than 7,000 known rare diseases, only 500 have approved treatments. In oncology, a long tail of less common cancer types continues to present unique challenges for diagnosis, research, and treatment. Moreover, the complex genetics underlying common diseases remain largely unexplored, and much work is needed to translate these insights into holistic improvements in patient care.
In an era where new data streams are constantly emerging, there is an urgent need for streamlined platforms that can seamlessly integrate the latest population-scale and proprietary datasets, including clinical, knowledge, and multi-omics data. Access to these varied data sources, combined with a diverse set of tools and workflows, will allow researchers to analyze complex data in real time, identify potential drug targets, and validate candidate treatments faster. This can be accomplished through novel expert systems, metadata- and data-driven mapping to appropriate analytics, and low-code solutions that streamline complicated processes, reduce manual steps, and accelerate research timelines. Together, these advances put drug development into double time without sacrificing quality, because rigorous validation remains in place, thus catalyzing the adoption of novel therapies as the standard of care.
The Promise of AI and Large Language Models
The healthcare industry has seen a boom in the use of artificial intelligence (AI) and large language models (LLMs), which are already changing how precision medicine is applied. AI is instrumental in recognizing patterns within complex datasets, improving the accuracy of data interrogation, and facilitating the exploration of large volumes of clinical and ‘omics-based research data. LLMs, with their ability to analyze and generate human-like text, offer significant potential for compiling medical knowledge, interpreting research papers, and making predictions from clinical data, letting researchers begin an analysis simply by conversing with the system.
Yet the full potential of these tools can only be realized when AI and LLMs are integrated into a unified, standardized analytics and insights platform: one that analyzes raw data and signals to identify discrete analytes, such as variants, expression profiles, and protein levels, and then derives insights from those analytes using advanced AI algorithms and annotations, including annotations derived with LLMs.
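To make this two-stage flow concrete, here is a minimal sketch in Python. The record shapes, the evidence threshold, and the small annotation table are illustrative assumptions, not any particular platform’s schema.

```python
from dataclasses import dataclass

# Stage 1: raw signals are reduced to discrete analytes (here, variants).
@dataclass
class Variant:
    gene: str
    change: str
    vaf: float  # variant allele fraction observed in the raw data

# Hypothetical curated annotation table; a real platform would draw on
# large knowledge bases plus LLM-derived annotations.
ANNOTATIONS = {
    ("EGFR", "L858R"): "Pathogenic; associated with response to EGFR inhibitors",
    ("TP53", "R175H"): "Pathogenic; common driver mutation",
}

def call_variants(raw_reads: list[dict]) -> list[Variant]:
    """Toy analyte extraction: keep signals above a simple evidence threshold."""
    return [
        Variant(r["gene"], r["change"], r["vaf"])
        for r in raw_reads
        if r["vaf"] >= 0.05  # illustrative cutoff, not a clinical threshold
    ]

def derive_insights(variants: list[Variant]) -> list[str]:
    """Stage 2: map analytes to annotated, reviewable insights."""
    return [
        f"{v.gene} {v.change}: "
        f"{ANNOTATIONS.get((v.gene, v.change), 'No curated annotation; flag for expert review')}"
        for v in variants
    ]

reads = [{"gene": "EGFR", "change": "L858R", "vaf": 0.22},
         {"gene": "BRCA2", "change": "K3326*", "vaf": 0.03}]
for insight in derive_insights(call_variants(reads)):
    print(insight)
```

Separating analyte extraction from insight derivation keeps each stage independently validatable, which matters for the rigorous validation discussed above.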
When connected, these technologies can analyze large datasets, infer disease mechanisms and pathophysiology, and surface trends, offering insights into potential future outcomes that might otherwise be missed, ultimately driving research and accelerating the path to new treatments. Pairing traditional data- and analytics-driven research ecosystems with AI and LLMs can speed the integration of promising treatments into standard care.
Overcoming Challenges in AI and LLM Integration
Despite the potential of AI and LLMs, their integration into healthcare is not without challenges. Researchers and healthcare providers need to be aware of the following:
- Data Quality and Integration: The accuracy of LLM-driven inferences depends on grounding them in curated data, such as annotation databases and analyzed multimodal datasets, alongside unstructured data like clinical notes and research papers. The entire analytical toolkit, including AI algorithms, LLMs, clustering and classification tools, biostatistical methods, and advanced computational techniques, must be applied together intelligently for each research or clinical context to ensure adequate specificity and sensitivity.
- Digital Hallucinations: Hallucinations occur when LLMs produce inaccurate information, a serious issue in healthcare. “Fit-for-purpose” models, which are fine-tuned with curated data or guided by prior knowledge, focus on specific tasks and thereby improve accuracy and reduce the risk of such errors (a sketch of this grounding pattern follows the list).
- Model Transparency: Tailored models that surface the underlying evidence for each observation let users trace results back to references that domain experts can verify, making fact-checking far more straightforward than it is with generalized models.
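A common pattern behind the last two points is retrieval-grounded generation: the model is constrained to answer only from curated passages and must return the references it used. The sketch below assumes a generic `call_llm` callable standing in for any LLM client; the passages and IDs are invented for illustration.

```python
# Minimal retrieval-grounded Q&A sketch: the model answers only from
# curated passages and returns the references it used, so domain
# experts can verify the evidence behind every statement.

CURATED_PASSAGES = [
    {"id": "PMID:0000001", "text": "EGFR L858R confers sensitivity to erlotinib in NSCLC."},
    {"id": "PMID:0000002", "text": "TP53 mutations are associated with poor prognosis across tumor types."},
]

def retrieve(question: str, k: int = 2) -> list[dict]:
    """Toy retrieval: rank passages by word overlap with the question.
    A production system would use a proper vector index instead."""
    q_words = set(question.lower().split())
    scored = sorted(
        CURATED_PASSAGES,
        key=lambda p: len(q_words & set(p["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def grounded_answer(question: str, call_llm) -> dict:
    """Constrain the model to the retrieved evidence and return citations."""
    evidence = retrieve(question)
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in evidence)
    prompt = (
        "Answer ONLY from the passages below and cite passage IDs. "
        "If the passages do not contain the answer, say so.\n"
        f"{context}\n\nQuestion: {question}"
    )
    return {"answer": call_llm(prompt), "references": [p["id"] for p in evidence]}
```

Because every answer carries the passage IDs it was built from, reviewers can trace the output back to its sources, addressing both the hallucination and the transparency concerns above.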
Data Security and Privacy
With the growth of AI and LLMs in clinical settings, data security and patient privacy are more important than ever. In 2024 alone, over 300 cyberattacks on healthcare systems were reported, highlighting the vulnerabilities inherent in handling sensitive medical data. To mitigate these risks, healthcare organizations need to implement robust data security protocols, including:
- Anonymizing Patient Records: Personal information is removed from medical data before it is used in AI systems, so that the data cannot be used to reidentify the individual. This allows AI to analyze data, identify patterns, and produce predictions, supporting research without compromising patient privacy or safety (a minimal de-identification sketch follows this list).
- Establishing Dedicated Networks: Just as they control their on-premises digital environments, researchers and precision medicine companies should establish their own dedicated cloud networks. With private, secure cloud infrastructure, sensitive data can be processed, stored, and accessed with greater control.
- Establishing Private Environments for AI/LLMs: Precision medicine companies and research institutions should ensure that proprietary models built on commercially available LLMs do not permit unauthorized use of their data by the LLM provider, and that a version-controlled, non-modifiable local copy of the LLM is in place before techniques such as prompt engineering or fine-tuning are used to create application-specific models (a sketch of pinning a local model also follows the list). These models should operate in isolated environments to prevent data exposure, with strict version control and regular validation to maintain accuracy, consistency, and integrity over time.
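As a minimal illustration of the first point, the sketch below drops direct identifiers and replaces the record number with a salted one-way hash. The field names and salt handling are assumptions; strictly speaking this is pseudonymization, and real de-identification must follow a recognized standard such as HIPAA Safe Harbor or Expert Determination.

```python
import hashlib

# Hypothetical set of direct identifiers to strip before AI use.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "email", "mrn", "date_of_birth"}

def deidentify(record: dict, salt: bytes) -> dict:
    """Remove direct identifiers and add a salted, one-way subject key
    so records can still be linked within a study without exposing identity."""
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    clean["subject_key"] = hashlib.sha256(salt + record["mrn"].encode()).hexdigest()[:16]
    return clean

record = {"mrn": "12345", "name": "Jane Doe", "date_of_birth": "1970-01-01",
          "diagnosis": "NSCLC", "variant": "EGFR L858R"}
print(deidentify(record, salt=b"study-specific-secret"))  # keep the salt secret
```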
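And as a sketch of the third point, assuming the Hugging Face `transformers` library and a placeholder model ID: pinning an exact revision and forbidding network access at load time keeps the local model copy non-modifiable and reproducible.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID and revision hash: any open model previously
# mirrored into the organization's isolated environment would do.
MODEL_ID = "org/clinical-llm"
PINNED_REVISION = "abc123def456abc123def456abc123def456abcd"

# local_files_only=True ensures inference never contacts the provider,
# and revision= pins the exact weights so the model cannot drift.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID, revision=PINNED_REVISION, local_files_only=True
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, revision=PINNED_REVISION, local_files_only=True
)
```

Application-specific variants fine-tuned from this pinned copy can then be versioned and validated on their own schedule, keeping the provenance of every deployed model auditable.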
Amidst these technological best practices, one component remains that is not technology based: the human touch, which is crucial to establishing a platform built on harnessing AI and LLMs.
The Human Touch
Despite their increasingly impressive capabilities, AI and LLMs cannot replace the critical role of healthcare professionals and researchers. The “human touch” remains essential for interpreting AI findings, applying insights in the context of individual patients, and ensuring that standards are upheld.
By pairing human experience and expertise with technology’s ability to consume extensive quantities of data, a synergy forms that accelerates the translation of discoveries into clinical practice while keeping high-quality patient care at the center of the process.
Moving in Double Time
The traditional pace of medical discovery is far too slow for today’s fast-moving healthcare landscape. By harnessing a comprehensive technology toolkit, including AI and LLMs, and incorporating the human touch, all within a secure data-sharing platform, the precision medicine industry can see scientific breakthroughs reach patients and create positive impacts sooner.
About Rakesh Nagarajan, MD, Chief Medical Officer at Velsera:
Dr. Rakesh Nagarajan is the Chief Medical Officer at Velsera, focusing on democratizing clinical genomics and advancing precision research through ‘omics technologies. With nearly thirty years of experience at the intersection of computer science, informatics, and medicine, he is a trained physician scientist committed to clinical and translational research.