With latest tranche, U.K. Biobank has genome sequences from 500,000 people available for research

LONDON — Data from half a million people’s whole genome sequences are now available to researchers worldwide, as the U.K. Biobank on Thursday debuted the latest addition to what it aims to be the world’s most comprehensive health data resource.

The Biobank has been building its collection over 20 years, with 500,000 volunteers recruited to provide survey responses about their health, medical records, tests of molecular markers, and imaging scans. The whole genomes of some participants were already available, but the new release this week includes the DNA portraits of the full cohort of people — the result of more than 350,000 cumulative hours of sequencing.

Adding genomic data to the heaps of health data and medical records already available “again keeps Biobank very much in the lead globally in this space and will drive a whole set of new users to the data,” John Bell, a professor of medicine at Oxford University, said at a press briefing this week.

The idea behind such a bank is to allow researchers to comb through people’s genomes and look for links to different health issues, or find variants that can prove to be protective. They can glean who might benefit from earlier breast cancer screenings or from certain blood pressure-lowering medications if they’re at high risk for cardiovascular disease. With so many genomes available, scientists can track the impact of rare variants that may play a role in disease, or that could be targets for developing new medicines.

The timeframe of the project also opens up avenues of research. The volunteers were enlisted some 15 years ago, and many have gone on to develop different cancers or dementia over the years.

“The scientists are looking at this like Google Maps,” said Rory Collins, Biobank’s principal investigator. “When they want to know, what are the pathways from lifestyle, environment, genetics to disease, they don’t go to Google, they go to U.K. Biobank.”

What sets the project apart, Collins added, is the “unique combination of scale, of depth, of duration, and I think most importantly, of accessibility.”

Researchers have to pay to use Biobank data, with discounts for early-career scientists and those in low- and middle-income countries. Still, some 30,000 researchers from 90 countries have access to it, fueling a huge number of studies. Just last month, Nature published a study tracking how genetic variation shaped proteins circulating in the blood, based on protein samples from more than 50,000 Biobank participants.

Researchers also argue that having the full genome sequences, as opposed to just the stretches that encode proteins, allows scientists to delve into the parts of DNA that have been comparatively understudied but that are increasingly being found to influence our health.

The Biobank participants include 440,000 people who are white, or 88% of the total volunteers. (Roughly 82% of people in England and Wales are white.) Of the others, 10,000 are of African descent, 10,000 are of Asian background, and the remainder are of mixed or other racial and ethnic groups, Collins said. Genomic research has been limited by datasets that are disproportionately from white people and that produce results that are not applicable to people from underrepresented racial and ethnic groups.

The Biobank project has been funded by U.K. Research and Innovation and the Wellcome Trust, as well as the biopharma companies Amgen, AstraZeneca, GSK, and Johnson & Johnson. deCode Genetics (an Amgen subsidiary) and the Wellcome Sanger Institute performed the sequencing.

As funders, the companies have a several-month head start on access to the data before they’re made widely available. (Academic groups involved in the project also have early access to them.)

Ruth March, AstraZeneca’s senior vice president of precision medicine, said Biobank data had already informed the design of drugs that are moving through the company’s pipeline.

“We believe that having this resource available will help us toward developing better medicines,” she said.