The case for AI and machine learning in life sciences – Pharmaceutical Technology

The current tech landscape is rapidly evolving, with advancements in areas such as big data and conversational platforms. Most of us have heard of ChatGPT and AI, but where and how do they fit into the life sciences ecosystem?

Big data refers to extremely large data sets that can be analysed to reveal patterns, trends, and associations, especially relating to human behaviour and interactions. The AI industry relies heavily on this data to train machine learning (ML) models. Conversational platforms are AI-driven tools that simulate human-like conversations; one example is ChatGPT, which first burst onto the scene just three years ago.

GlobalData’s thematic research unit predicts further advancements in machine learning over the next two years, driven by trends in three main categories: technology, macroeconomic, and regulatory. Key technology trends include advancements in AI, AI chips, and natural language processing. Macroeconomic trends include quantum computing, while on the regulatory side, data privacy and algorithmic bias regulation are predicted to be major themes moving forward.

AI-powered solutions in healthcare

While the impact of AI will be far-reaching, its pattern-matching abilities and processing speed mean it has the potential to revolutionise life sciences and medicine in particular.

Healthcare and pharma are often at the forefront of innovation, but they are highly regulated industries, and change can sometimes be slow. However, now that the sector has seen the benefits of remote technologies and AI during the Covid-19 pandemic, AI is being used in many more areas across the value chain.

AI use in the healthcare space is expected to continue to increase in the next five years, with the potential to transform key aspects of the industry. Uses include data management, remote surgery, diagnostic and procedural AI assistants, drug discovery, and clinical trial design. GlobalData forecasts that the market for AI platforms for the entire healthcare industry will reach $4.3bn by 2024, up from $1.5bn in 2019[i].
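To put the forecast above in perspective, the implied compound annual growth rate (CAGR) can be worked out from the two figures quoted. This is a minimal sketch rather than GlobalData's own calculation: the function name `cagr` is ours, and it assumes steady compounding over the five years from 2019 to 2024, which the source does not state.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text: $1.5bn in 2019 rising to $4.3bn in 2024.
# Assumption: growth compounds evenly across the five-year period.
implied_rate = cagr(1.5, 4.3, 2024 - 2019)
print(f"Implied CAGR: {implied_rate:.1%}")
```

Under that assumption, the forecast corresponds to growth of a little over 23% a year.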

Using AI in Electronic Medical Record systems

AI already plays a significant role in Electronic Medical Records (EMR), which have evolved from electronic versions of personal health records into systems that use deep AI-driven analysis to deliver clinical decision support (CDS). Thematic Research[ii] shows that adoption of EMR systems is very high, especially in the US, and that CDS systems are leveraging AI technology and available EMRs to provide clinical insight to both physicians and patients.

In clinical trials, the use of plain language for participants is essential. Clinical trial documentation is often highly technical and full of jargon, but by translating it into plain-language summaries in the official languages of the countries taking part in the study, sponsors make the study processes and results clear, maintain transparency, and keep patients engaged.

Using electronic consent (eConsent) creates a further opportunity for both engagement and reader comprehension. Consent documents often describe technical medical conditions, and regulatory bodies are increasing pressure on sponsors to ensure that patient-facing information can be easily understood. eConsent, an online, paperless equivalent of the consent form, can also include links to videos or hover-over text explanations to clarify technical terminology.

Additionally, eConsent can be integrated into a broader digital ecosystem. Automation can push eConsent to those who qualify for a clinical trial, with significant savings in time and costs. This same ecosystem can also be used to help manage the clinical trial itself.

Can AI transform document translation?

Generative AI models are trained on large datasets of multilingual text to learn how to translate between languages, and these models are then used to translate text. Google Translate, for example, uses a Neural Machine Translation (NMT) system based on deep learning models to provide translations for over 100 languages[iii]. Similarly, Microsoft Translator uses a combination of statistical machine translation and NMT to provide translations for over 60 languages. Language Weaver from RWS supports over 130 languages using NMT.

It should be noted, however, that neither Google Translate nor Microsoft Translator is appropriate for sensitive data. Content input to these platforms becomes part of the engines and can be used publicly.

While such AI translation tools can be useful for the individual, they are not always accurate. Therefore, it is recommended to use AI translations as a tool to aid in communication rather than relying solely on them for important or sensitive information.

Although machine translation (MT) technology can translate words from one language into another, this linear method can result in errors because many MT programmes cannot recognise full sentences or their closest translations in the target language. Since some drugs and medical products can have severe side effects if used incorrectly, MT output for critical information such as dosage instructions must be checked and edited by human translators.
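One way to make sure critical segments never bypass a human linguist is to flag them automatically before machine-translated text is released. The sketch below is purely illustrative: the pattern `DOSE_PATTERN`, the keyword list `CRITICAL_TERMS`, and the function `needs_human_review` are hypothetical and are not taken from any real translation workflow. A production system would need a far richer, validated rule set, and flagging is no substitute for full human review of regulated content.

```python
import re

# Hypothetical pattern: a number followed by a common dose unit (e.g. "500 mg").
DOSE_PATTERN = re.compile(
    r"\b\d+(?:[.,]\d+)?\s?(?:mg|mcg|g|ml|iu|units?)\b", re.IGNORECASE
)
# Hypothetical keywords that suggest safety-critical content.
CRITICAL_TERMS = ("dose", "dosage", "overdose", "contraindicat", "do not exceed")


def needs_human_review(segment: str) -> bool:
    """Return True if a source segment looks safety-critical."""
    text = segment.lower()
    has_dose = DOSE_PATTERN.search(segment) is not None
    has_term = any(term in text for term in CRITICAL_TERMS)
    return has_dose or has_term


segments = [
    "Take 2 tablets of 500 mg twice daily.",
    "Store in a cool, dry place.",
    "Do not exceed the stated dose.",
]
flagged = [s for s in segments if needs_human_review(s)]
for s in flagged:
    print("Mandatory human review:", s)
```

Here the first and third segments are flagged, while the storage instruction is not.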

Using Neural Machine Translation for life sciences

Although products like the Google MT interface could look cost-effective, using free public tools gives away a certain amount of control and comes with some risk, particularly when dealing with highly regulated and sensitive content that demands an extremely high standard of security and accuracy throughout translation. Thus, public tools such as Google Translate and Microsoft Translator should not be used for highly proprietary data, clinical trial data, or sensitive personal data.

Expectations and reality about machine translation quality did not really start to converge until the release of Neural Machine Translation (NMT) in 2016, but now, NMT performance has finally reached a stage where the industry has confidence in its abilities, and its usage in highly regulated industries, such as the life sciences, is viable in many situations.

Neural networks learn to “translate” from language A to language B. However, while previous models used linear translations, i.e., one word at a time, now all the words in an input sentence are processed in parallel to capture context, with this processing backed by advances in AI chips and processing power. This lets the model weigh how relevant each word is to the others in the original sentence, which is useful when word order changes from one language to another.
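The idea of weighing every word against every other word in a sentence can be illustrated with a toy version of the attention mechanism used in Transformer-style models. The source does not name a specific architecture, so treating it as scaled dot-product self-attention is our assumption. The two-dimensional word vectors below are made up for illustration, and real NMT systems add learned projection matrices, multiple attention heads, and positional information.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(vectors[0])
    weights = []
    # Each word's row is independent of the others, so real systems compute
    # all rows at once as a matrix product; the loop here is for readability.
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    # Each output vector is a relevance-weighted mix of all word vectors.
    outputs = [
        [sum(w * value[i] for w, value in zip(row, vectors)) for i in range(dim)]
        for row in weights
    ]
    return weights, outputs


# Hypothetical embeddings for a three-word sentence.
sentence = ["take", "two", "tablets"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights, outputs = self_attention(embeddings)
for word, row in zip(sentence, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much attention one word pays to every word in the sentence, including itself; the weights in a row always sum to 1.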

The accuracy of translation depends on the quality of training data used, and this is why many life sciences organisations collaborate with bigger, international Language Service Providers (LSPs) that have access to substantial data resources.

Setting goals and expectations  

Multinational translation company RWS has long been at the forefront of linguistic innovation.

One essential aspect of its innovative vision is working with new regulatory frameworks, such as EudraVigilance[iv], which requires electronic reporting for marketing authorisation holders and sponsors of clinical trials using a data format based on international standards set by the ISO. This lends itself well to the use of NMT. Another area where NMT advances can bring automation to the translation process is eConsent, which reduces time and costs so that sponsors can get their product to market faster.

Oversight and editing of sensitive and critical documentation are, however, still the domain of real people, and RWS maintains human linguist post-editing control. This additional feedback also leads to improvements to the NMT, helping it deliver consistently high-quality translations.

Finally, not all translation projects are suitable for using NMT, and RWS works closely with clients to ensure expectations align with NMT’s current capabilities. This includes assessing content for MT viability, and pilot testing.

As the language services industry as a whole moves toward NMT, RWS is leading the way in integrating machine translation into production lifecycles.


[i] GlobalData: Thematic Research: Artificial Intelligence in Healthcare 2021

[ii] GlobalData: Thematic Intelligence: Medical Electronic Medical Record Systems

[iii] GlobalData: Text-to-X: how ChatGPT and generative AI can transform the future of business

[iv] https://www.ema.europa.eu/en/human-regulatory/research-development/pharmacovigilance/eudravigilance