By Monica Heger
Despite considerable advances in whole-genome sequencing in recent years, key information is still missing from all of the genomes sequenced with next-generation sequencing technology — haplotype information.
Now, separate papers published this week in Nature Biotechnology detail two different ways to haplotype whole genomes. In one paper, a group from the University of Washington combined next-gen sequencing with large-insert cloning to produce a sequenced genome with haplotype information. In the second paper, a group from Stanford University used microfluidics technology in combination with genotyping to obtain haplotype information at the single-cell level. Although the two methods differ, they could complement each other, said Jay Shendure, who led the University of Washington team.
“The two papers show different techniques that are available to use for research immediately,” Rade Drmanac, Complete Genomics’ chief scientific officer, told In Sequence. Drmanac, who is also developing a haplotyping method at Complete Genomics, was not affiliated with either study published this week.
As of now, the only two genomes that have been completely haplotyped are the reference human genome and Craig Venter’s genome, both of which relied on Sanger sequencing and clone mapping to resolve the haplotypes — a labor-intensive and costly process.
While the newer sequencing technologies have allowed for exponential cost reductions and much higher throughput, the shorter reads are not amenable to obtaining haplotype information, which will be critical in the fields of personalized medicine and population genetics.
“For personal genome sequencing and diagnostics, [haplotyping] is really going to be critical,” particularly when genomes are sequenced early in life with the goal of predicting disease risk, Drmanac said. Without haplotype information, “reported genomes are just a consensus of the two parental genomes,” making disease risk prediction difficult.
Old School Meets New School
In the University of Washington paper, the team combined “old school genomics with new school genomics,” said Shendure.
First, they made a fosmid library from the DNA of a HapMap individual of Indian descent, with inserts of about 37 kilobase pairs. They then split the library into more than 100 different pools, so that the odds of both alleles being contained in one pool were very low. Then, after barcoding the pools, the team shotgun-sequenced the libraries on the Illumina Genome Analyzer to a mean depth of 2.4-fold per haploid clone. Next, before phasing, the team used whole-genome resequencing to search for variants. They used the Illumina HiSeq 2000 with 50-base paired-end reads and sequenced the genome to 15-fold coverage.
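The pooling logic can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not the team's analysis. It assumes clones land uniformly at random across the genome, uses a haploid genome size of about 3 billion base pairs, and takes a hypothetical figure of 5,000 clones per pool, a number that does not come from the article. Under those assumptions, it estimates how often a given locus would be covered by clones from both parental haplotypes within the same pool.

```python
import math

# Approximate values; the genome size is a round-number assumption.
INSERT_BP = 37_000                  # fosmid insert size reported in the article
HAPLOID_GENOME_BP = 3_000_000_000   # assumed human haploid genome size


def per_haplotype_coverage(clones_per_pool, insert_bp=INSERT_BP,
                           haploid_genome_bp=HAPLOID_GENOME_BP):
    """Expected physical coverage of ONE haplotype within a single pool.

    Assumes half of the clones in a pool derive from each parental haplotype.
    """
    return (clones_per_pool / 2) * insert_bp / haploid_genome_bp


def prob_both_haplotypes_in_pool(clones_per_pool, insert_bp=INSERT_BP,
                                 haploid_genome_bp=HAPLOID_GENOME_BP):
    """Probability that a locus is covered by both haplotypes in one pool.

    Uses a Poisson approximation: a locus is covered on a given haplotype
    with probability 1 - exp(-coverage), and the two haplotypes are treated
    as independent.
    """
    c = per_haplotype_coverage(clones_per_pool, insert_bp, haploid_genome_bp)
    p_one = 1.0 - math.exp(-c)
    return p_one ** 2


if __name__ == "__main__":
    # 5,000 clones per pool is a hypothetical value chosen for illustration.
    for n in (1_000, 5_000, 20_000):
        print(n, prob_both_haplotypes_in_pool(n))
```

The collision probability grows roughly with the square of the number of clones per pool, which is why dividing a library across many small pools keeps each pool effectively haploid at most loci.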