In the healthcare ecosystem, automation and data intelligence continue to gain momentum as artificial intelligence (AI) and machine learning (ML) are integrated to streamline and accelerate processes.
The clinical trial space is one area where pharmaceutical companies are leveraging these technologies to improve operations, in particular, data processing. A typical clinical trial generates over 13,000 documents in various formats, including text, audio, video and images, making it difficult to collect, organize and analyze the data. Intelligent document processing (IDP) can help by automating these tasks, boosting productivity, speeding up processes, improving accuracy and saving money.
In this article, we dive into how to implement IDP in the digital content flow of a clinical trial using transformative technologies like digital twinning, AI/ML, natural language processing (NLP), and generative AI agents. These technologies can automate the implementation process, rapidly and intelligently transforming thousands of documents into research insights that help researchers uncover hidden patterns and trends that could lead to new breakthroughs in medicine.
Assessment and Planning
As the saying goes, “fail to plan, plan to fail,” and implementing IDP is no exception. Before implementing an IDP platform, sponsors must carefully consider their goals and objectives, specify their document processing requirements and the types of documents that need to be processed, and identify areas for improvement. It is important to be aware of the challenges that pharmaceutical companies face in their data flow during clinical trials. For example, manually populating site folders and electronic trial master files (eTMFs) is time-consuming and can lead to problems such as limited document security and data privacy, archiving and retrieval difficulties and human error, which can result in a failure rate of up to 25%.
In addition, the strict regulatory requirements in the healthcare industry mean that IDP systems must ensure patient privacy protection and maintain audit trails. Security continues to be a crucial aspect to prevent unauthorized access to sensitive data. It is because of these two challenges that the pharmaceutical industry has been cautious about integrating generative AI into regulated and sensitive data workflows, which in turn has slowed down adoption.
To overcome resistance to change in this conservative industry, it is important to demonstrate the value of AI-driven solutions and ensure that they are secure, private and compliant.
Implementing IDP in Clinical Trials
Selecting an IDP solution that meets the specific needs of the organization, including support for the relevant document formats and languages, is a key factor. It is equally essential to evaluate the solution’s AI/ML capabilities, its ability to integrate with existing systems, and its capacity to incorporate data science, AI and generative AI, because all of these affect how digital content is handled. The ability to analyze diverse documents often reveals hidden insights and patterns not easily detected through traditional methods.
To implement IDP successfully in a clinical trial’s digital content flow, companies must follow these crucial steps:
- Quality auto-review – Data quality is essential in healthcare, and since AI’s effectiveness relies on the quality of its training data, it is paramount to promptly identify incoming content and verify its layout correctness and data convertibility. Clean and organize the data to ensure that it is in a format that can be processed by the IDP solution. This may involve tasks such as removing errors, converting images and scans to text and standardizing data formats, if necessary. If the system detects any issues, it should flag the information for user review and resubmission in real time. This will speed up data collection and quality control processes.
- Digital twin and classification – Deploying digital twins – continuously updated virtual representations of assets – helps with digitization of all clinical trial content for universal accessibility. This data can be used to train generative AI models to recognize patterns and relationships. Companies can also use digital twins to perform pre-audit checks, proactively populate the eTMF and electronic common technical document (eCTD) and gain insights from past trials. Finally, maintaining the look and feel of the original content in the digital twin structure provides a better user experience and easy-to-understand traceability back to the source.
- Auto-translations – The ability to automatically convert content into other languages is an important step in deploying IDP. This can be done using domain-specific regulatory or safety language ontology sets. Auto-translation streamlines communications and drives efficiency gains.
- Sensitive data handling – Data privacy has become an intrinsic part of any clinical trial, so it is important to automate safe processes for sourcing, linking, combining, reusing and sharing protected data. Deploying privacy analytics can enhance the protection of sensitive data by applying redaction and providing only relevant content to users.
- Entity extraction – Once content is digitized, teams need to be able to recognize the sections within that content and find information. NLP and natural language understanding (NLU) enable understanding of text and its meaning. For example, NLP can be used to analyze scheduled assessments for a patient and find out what is required of the patient in their participation. This information can then be used to build appropriate models to best manage patient burden.
- Insights and best actions – Technology can be used to aid in content analysis and generate actionable insights in risk assessment, patient burden, protocol amendments, potential outcomes and theoretical modeling. AI deployment and generative AI training leverage digital twins and NLP to enable natural language generation. Entity extraction is used to identify text, and another program interprets its meaning. A third program generates responses, insights, and next steps. The combination of digital twins and NLP enhances data understanding, enabling generative AI models to learn essential patterns and relationships for precise predictions and content generation.
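To make the flow of the steps above more concrete, the following is a minimal Python sketch of how a few of them (quality auto-review, classification, sensitive data handling and entity extraction) might be chained together. It is an illustration under stated assumptions, not a description of any particular IDP product: the `ProcessedDocument` structure, the keyword lists and the regular expressions are all hypothetical stand-ins for the trained AI/ML and NLP models a real platform would use, and the digital twin and auto-translation steps are omitted for brevity.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword rules standing in for a trained document classifier.
DOC_TYPE_KEYWORDS = {
    "protocol_amendment": ("protocol amendment",),
    "informed_consent": ("informed consent",),
    "monitoring_report": ("monitoring visit",),
}
# Hypothetical patterns standing in for privacy analytics and NLP models.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
VISIT_PATTERN = re.compile(r"\b(?:Visit|Week)\s+\d+\b")


@dataclass
class ProcessedDocument:
    doc_id: str
    text: str
    doc_type: str = "unclassified"
    issues: list[str] = field(default_factory=list)
    entities: list[str] = field(default_factory=list)


def quality_review(doc: ProcessedDocument) -> None:
    """Flag content that cannot be processed further."""
    if not doc.text.strip():
        doc.issues.append("empty or unreadable text")


def classify(doc: ProcessedDocument) -> None:
    """Assign the first document type whose keywords appear in the text."""
    lowered = doc.text.lower()
    for doc_type, keywords in DOC_TYPE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            doc.doc_type = doc_type
            return


def redact(doc: ProcessedDocument) -> None:
    """Replace e-mail addresses with a redaction marker."""
    doc.text = EMAIL_PATTERN.sub("[REDACTED]", doc.text)


def extract_entities(doc: ProcessedDocument) -> None:
    """Collect scheduled-assessment mentions such as 'Visit 3'."""
    doc.entities = VISIT_PATTERN.findall(doc.text)


def process(doc_id: str, text: str) -> ProcessedDocument:
    doc = ProcessedDocument(doc_id, text)
    quality_review(doc)
    if doc.issues:
        return doc  # stop here so the sender can review and resubmit
    classify(doc)
    redact(doc)
    extract_entities(doc)
    return doc
```

In this sketch a document that fails the quality review is returned immediately with its issues attached, which mirrors the idea of flagging content for user review before any downstream step runs.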
The Benefits of Automating IDP
The automation of IDP is gaining popularity as a way to address legacy document processing challenges by speeding up operations, enabling continuous processing, improving accuracy, enhancing collaboration and ensuring regulatory compliance.
One of the key benefits of automating IDP in clinical trials is the ability to immediately review the quality of content from trial sites before it is fed into the eTMF for final trial file storage. This is achieved by adopting an API-enabled SaaS solution for automating IDP and applying it to site folders. Overall, this process allows teams to identify and resolve issues early on, such as missing signatures, scan issues and layout problems.
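As a rough illustration of such a pre-filing quality gate, the Python sketch below checks an incoming site document for the kinds of issues mentioned above and decides whether it should be filed or returned. The `SiteDocument` fields, the `MIN_SCAN_DPI` threshold and the routing labels are assumptions made for this example; in practice these signals would come from the IDP solution's own detection models.

```python
from dataclasses import dataclass

MIN_SCAN_DPI = 200  # assumed minimum resolution, chosen only for illustration


@dataclass
class SiteDocument:
    name: str
    signature_detected: bool  # hypothetical output of a signature detector
    scan_dpi: int
    pages_received: int
    pages_expected: int


def find_issues(doc: SiteDocument) -> list[str]:
    """Return a list of quality issues found in an incoming document."""
    issues = []
    if not doc.signature_detected:
        issues.append("missing signature")
    if doc.scan_dpi < MIN_SCAN_DPI:
        issues.append("low scan resolution")
    if doc.pages_received != doc.pages_expected:
        issues.append("page count mismatch")
    return issues


def route(doc: SiteDocument) -> tuple[str, list[str]]:
    """File clean documents; send anything with issues back to the site."""
    issues = find_issues(doc)
    destination = "return_to_site" if issues else "file_in_etmf"
    return destination, issues
```

Calling `route` on each document as it arrives in a site folder would let issues be surfaced to the site immediately, rather than after the document has already been filed.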
Automated eTMF systems offer a number of features that can help to address manual processing issues, including document version control, audit trails, notifications, remote accessibility and advanced search capabilities. They help eliminate manual eTMF entry while maintaining quality. These solutions index documents, automate workflows, aid translations, reduce processing time and free up employees for value-added tasks. Furthermore, they can handle scans and images in any language, extracting metadata and creating digital twins for better recognition.
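To illustrate how version control and an audit trail can work together, here is a small Python sketch of an append-only trail in which each entry records a document version and is linked to the previous entry by a hash. Hash chaining is one common way to make such a log tamper-evident; it is used here as an assumption for illustration and is not taken from any specific eTMF system. The function names and entry fields are hypothetical, and timestamps are left out to keep the example short.

```python
import hashlib
import json


def _digest(entry: dict) -> str:
    """Hash every field of an entry except the stored hash itself."""
    payload = {key: value for key, value in entry.items() if key != "hash"}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(trail: list, document_id: str, version: int,
                 action: str, user: str) -> dict:
    """Append an entry that is chained to the previous entry's hash."""
    entry = {
        "document_id": document_id,
        "version": version,
        "action": action,
        "user": user,
        "previous_hash": trail[-1]["hash"] if trail else "",
    }
    entry["hash"] = _digest(entry)
    trail.append(entry)
    return entry


def verify(trail: list) -> bool:
    """Return True only if no entry was altered, removed or reordered."""
    previous_hash = ""
    for entry in trail:
        if entry["previous_hash"] != previous_hash:
            return False
        if entry["hash"] != _digest(entry):
            return False
        previous_hash = entry["hash"]
    return True
```

Because each hash depends on the one before it, changing any earlier entry causes `verify` to fail, which is the property an auditor would rely on.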
Another important benefit of automating IDP in clinical trials is the ability to gain clear insights into whether trial sites have seen, acknowledged and understood protocol amendments. This is essential for optimizing communications with sites and ensuring regulatory compliance.
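One simple way to picture this kind of visibility is as a query over audit-trail events. The Python sketch below assumes a hypothetical event format of `(site_id, amendment_id, status)` and a hypothetical ordered `Status` scale; how a system would actually establish that an amendment was "understood" (for example, through a comprehension check) is outside the scope of this sketch.

```python
from enum import IntEnum


class Status(IntEnum):
    SENT = 0
    SEEN = 1
    ACKNOWLEDGED = 2
    UNDERSTOOD = 3  # assumed to be confirmed by some comprehension check


def latest_status(events):
    """Map each (site, amendment) pair to the furthest status it reached."""
    progress = {}
    for site_id, amendment_id, status in events:
        key = (site_id, amendment_id)
        progress[key] = max(progress.get(key, Status.SENT), status)
    return progress


def sites_needing_follow_up(events, amendment_id, required=Status.UNDERSTOOD):
    """List sites that have not yet reached the required status."""
    progress = latest_status(events)
    return sorted(
        site_id
        for (site_id, amendment), status in progress.items()
        if amendment == amendment_id and status < required
    )
```

A study team could run `sites_needing_follow_up` for the latest amendment to see at a glance which sites still need a reminder.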
Automated IDP solutions can provide some of the highest levels of visibility by monitoring version control and audit trails. At a glance, the key benefits of IDP are:
- Expediting clinical trial timelines – By automating document processing tasks, IDP can help to accelerate the pace of clinical trials.
- Improving accuracy – IDP can reduce the risk of human error in the data collection and analysis process.
- Gaining deeper insights – IDP can help organizations to uncover hidden patterns and trends in clinical trial data that would be difficult to detect manually.
- Reducing costs – IDP can help to reduce the costs associated with manual data processing and analysis.
The Future Role of Technology
As pharmaceutical companies continue to seek ways to automate and streamline clinical trial processes, IDP’s transformative technologies are playing an increasingly important role by driving greater efficiencies and enabling more significant insights into research.
Significantly, more pharmaceutical companies are bringing all their content into digital form, which is allowing the industry to embrace AI more fully through safe approaches such as IDP. Companies are increasingly exploring generative AI applications for data mining, template creation, quality control, site communication and clinical trial operation guides. Generative AI can rapidly identify potential trial participants from medical records and monitor patients by analyzing medical data promptly and detecting safety issues. In the next couple of years, we expect to see pharmaceutical companies set the foundation for their long-term journey with AI by developing and deploying “mini” versions of generative AI models internally. This will allow them to reap the benefits of AI while protecting the quality and security of their sensitive data.
The drug development journey with AI is already underway and is only going to accelerate in the coming years. IDP is a powerful tool that can help organizations to improve the efficiency and effectiveness of their clinical trials. By following the critical steps outlined above, organizations can successfully implement IDP and start benefiting from it. By taking a careful approach via stepping stones such as IDP, the pharmaceutical industry can ensure that it is well-positioned to capitalize on the full potential of AI.
About Gary Shorter – Head, AI and Data Science, IQVIA Technologies
Gary pursues the use of emerging technology to provide new and more efficient capabilities to enhance clinical trial management. This includes development of new design software through to more recent advancements with AI/ML capabilities where his team has developed several micro-products and micro-services that can be plugged in and used by any SaaS solution.