By Monica Heger
Researchers from Harvard University and the University of Melbourne have used candidate gene-prediction algorithms combined with targeted sequencing on the Illumina Genome Analyzer to identify novel causal mutations in the mitochondrial disease human complex I deficiency, a respiratory disorder that causes skeletal muscle myopathy, cardiomyopathy, hypotonia, and other clinical manifestations.
In the study, published this week in Nature Genetics, the researchers sequenced 103 candidate genes in a cohort of 103 cases and 42 controls. In 60 of the cases, there had not been a molecular diagnosis, and the researchers were able to uncover the molecular cause in 13 of those cases, including identifying two previously unreported causal mutations. In total, the team identified 47 unique mutations in 20 different genes that appear to be associated with the disease.
The researchers said the method could be a good way to identify causal mutations for complex diseases because it enables the sequencing of many different genes in larger cohorts, without being prohibitively expensive.
“I think approaches like this will be popular in the next few years for certain groups of disease, such as heart disease, mental retardation, neurological disease, and cancer,” said David Thorburn, head of mitochondrial research at Murdoch Childrens Research Institute in Melbourne and a senior author of the study.
Those diseases have a strong genetic component, but typically involve hundreds of genes — unlike Mendelian diseases, for which whole-genome and whole-exome sequencing have worked well to find causal mutations by sequencing only a small number of related individuals (IS 3/16/2010 and 9/29/2009).
In the Nature Genetics study, the researchers first identified 103 genes they wanted to target. They began with 45 genetic subunits known to be involved in the enzymatic activity of the human complex I, said Vamsi Mootha, an associate professor of systems biology at Harvard Medical School and senior author of the paper. “We then used a phylogenetic strategy to identify additional assembly factors,” he said. The team looked at the evolutionary history of complex I, comparing organisms that have the complex to those that don’t, to determine which other genes are likely to be involved in the disease.
They then combined the DNA into five different pools for the cases and two pools for the HapMap controls, and performed PCR amplification reactions to capture the 103 genes, which comprised 145 kilobases of sequence. The resulting amplicons were then sequenced on the Illumina GA with 76-base single-end reads, to an average 168-fold coverage per individual.
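As a rough back-of-the-envelope check on those figures, the sketch below estimates how many 76-base reads correspond to 168-fold coverage of 145 kilobases. It assumes every read falls on target and that a kilobase is 1,000 bases; the function name `reads_needed` is illustrative and not from the study.

```python
import math

TARGET_BASES = 145_000   # 145 kilobases of targeted sequence (from the article)
MEAN_COVERAGE = 168      # average fold coverage per individual (from the article)
READ_LENGTH = 76         # single-end read length in bases (from the article)


def reads_needed(target_bases, coverage, read_length):
    """Minimum number of reads for the given mean coverage.

    Assumption: every read maps on target, which real enrichment never achieves.
    """
    return math.ceil(target_bases * coverage / read_length)


bases_per_individual = TARGET_BASES * MEAN_COVERAGE
reads_per_individual = reads_needed(TARGET_BASES, MEAN_COVERAGE, READ_LENGTH)
print(bases_per_individual, reads_per_individual)
```

In practice off-target and duplicate reads would push the required read count higher than this idealized estimate.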
Mootha said that since doing the experiment, there have been a number of technology developments that make the protocol easier and more accurate. For instance, the team is now using custom designed reagents on Agilent’s SureSelect platform instead of PCR amplification for target enrichment. Also, in the current study, the team did not barcode its samples before pooling, so after they did variant calling, they had to go back and match the variants to the individual.
The team called 898 single nucleotide variants and indels. They then filtered out variants present in healthy individuals, synonymous variants, non-coding variants that were not associated with splice sites or tRNA, and missense variants at sites with low evolutionary conservation. That narrowed the list down to around 200 variants, and the team then validated 151 likely deleterious variants.
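The filtering steps described above can be sketched as a simple rule chain. This is a minimal illustration and not the researchers' actual pipeline: the `Variant` fields, the 0-to-1 conservation score, and the `CONSERVATION_CUTOFF` value are all assumptions, and the example variants are made up.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    seen_in_controls: bool     # present in healthy individuals
    effect: str                # "synonymous", "missense", "nonsense", "noncoding"
    near_splice_or_trna: bool  # only meaningful for non-coding variants
    conservation: float        # hypothetical 0..1 evolutionary conservation score


CONSERVATION_CUTOFF = 0.5  # hypothetical threshold, not taken from the study


def likely_deleterious(v: Variant) -> bool:
    """Apply the four exclusion rules in the order the article lists them."""
    if v.seen_in_controls:
        return False
    if v.effect == "synonymous":
        return False
    if v.effect == "noncoding" and not v.near_splice_or_trna:
        return False
    if v.effect == "missense" and v.conservation < CONSERVATION_CUTOFF:
        return False
    return True


# Made-up variants, one per rule, to show which ones survive filtering.
variants = [
    Variant("v1", True, "missense", False, 0.9),     # dropped: seen in controls
    Variant("v2", False, "synonymous", False, 0.9),  # dropped: synonymous
    Variant("v3", False, "noncoding", False, 0.0),   # dropped: non-coding, no splice/tRNA link
    Variant("v4", False, "noncoding", True, 0.0),    # kept: non-coding at a splice site or tRNA
    Variant("v5", False, "missense", False, 0.2),    # dropped: poorly conserved missense
    Variant("v6", False, "missense", False, 0.8),    # kept: conserved missense
    Variant("v7", False, "nonsense", False, 0.1),    # kept: no exclusion rule applies
]
kept = [v.name for v in variants if likely_deleterious(v)]
print(kept)
```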
They then examined the variants in the 60 cases that lacked a molecular diagnosis, looking for known pathogenic mitochondrial DNA mutations as well as homozygous and compound heterozygous variants. Three individuals had previously reported pathogenic mitochondrial mutations and eight had recessive-type mutations in known disease genes. Additionally, two individuals had recessive-type mutations in candidate disease genes NUBPL and FOXRED1.
The thirteen mutations, including the two mutations in NUBPL and FOXRED1, which were previously not associated with the disease, were all confirmed as disease-causing. When the researchers repaired the mutation in patients’ fibroblasts, the complex I was no longer deficient.
“We now have 56 patients with complex I deficiency with molecular diagnoses. These diagnoses comprise 47 unique mutations in 20 different genes,” said Thorburn. “For comparison, a ‘simple’ genetic disease such as cystic fibrosis is always caused by mutations in one gene, and 95 percent of patients have the same mutation.”
Thorburn said that the team is continuing to follow the group of patients to try to identify further mutations that could be used for molecular diagnoses. He said they will continue to use sequencing, and also array-CGH, to look for additional mutations.
“It is likely that some of our patients have mutations in genes not included” in the initial set of 103 genes, he said, so they are also expanding the list of genes. Additionally, they are looking for interactions between mutations in different genes. He said he will continue to focus on mitochondrial diseases.
Mootha added that the study could have implications for other diseases as well. “There are a fair number of common human disorders that are linked mechanistically to complex I including Parkinson’s and type 2 diabetes,” he said. “The hope is that identifying the genes underlying the severe phenotypes will help understand these other disorders.”
By Monica Heger
In the search for cancer-causing mutations, sequencing studies have mostly identified genes mutated in a small proportion of cases, with a smoking gun remaining elusive.
But now, two separate research groups have used sequencing to identify a novel, commonly mutated gene in ovarian clear-cell carcinoma. The gene was mutated in about half of all cases, making it a promising candidate for diagnostics and therapeutics.
Researchers from Johns Hopkins University published the results of a whole-exome study in Science this week, while a separate group from the BC Cancer Agency used transcriptome sequencing to come to the same conclusions. Their study was published this week in the New England Journal of Medicine.
“There are very few genes in cancer that have been found to be mutated in high proportion,” Nickolas Papadopoulos, director of translational genetics at the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University and a senior author of the Science paper, told In Sequence. “When a finding like this happens, it’s exciting because of its novelty.”
Papadopoulos and his team sequenced the exomes of eight tumor samples from patients with ovarian clear-cell carcinoma to an average 84-fold coverage, using Agilent’s SureSelect target enrichment and the Illumina Genome Analyzer.
They identified 268 somatic mutations in 253 genes and confirmed 237 of them by Sanger sequencing. Four genes — PIK3CA, KRAS, PPP2R1A, and ARID1A — were mutated in more than one of the eight tumors.
The team then sequenced those four genes from matched tumor/normal DNA in 34 additional cases using Sanger sequencing.
In total, mutations in PIK3CA, KRAS, PPP2R1A, and ARID1A were identified in 40 percent, 4.7 percent, 7.1 percent, and 57 percent of the 42 tumors, respectively. Both PIK3CA and KRAS have been previously implicated in ovarian cancer and are well-characterized, but PPP2R1A and ARID1A are novel. The researchers hypothesize that PPP2R1A is an oncogene, while ARID1A is a tumor suppressor. Oncogenes tend to have mainly missense mutations all at the same codon, or clustered at codons adjacent to each other, while tumor suppressors tend to be mutated at a variety of different positions in the coding region, and the mutations typically truncate the encoded protein. Also, tumor suppressors tend to affect both alleles, while oncogenes often only affect one. The mutations in PPP2R1A were clustered, while mutations occurred throughout ARID1A and were predicted to truncate the protein.
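The rule of thumb above, clustered missense mutations versus scattered truncating ones, can be written as a small heuristic. This is only a sketch of that reasoning, not the method used in either study: the function `classify_gene`, its `cluster_window` and `majority` thresholds, and the codon positions in the examples are all hypothetical.

```python
def classify_gene(mutations, cluster_window=2, majority=0.5):
    """Label a gene from its mutation pattern.

    mutations: list of (codon_position, kind) tuples, where kind is
    "missense" or "truncating". Both thresholds are illustrative assumptions.
    """
    if not mutations:
        return "unclassified"
    codons = [codon for codon, _ in mutations]
    total = len(mutations)
    missense_frac = sum(kind == "missense" for _, kind in mutations) / total
    truncating_frac = sum(kind == "truncating" for _, kind in mutations) / total
    # "Clustered" here means all mutated codons fall within a small window.
    clustered = max(codons) - min(codons) <= cluster_window
    if missense_frac > majority and clustered:
        return "oncogene-like"
    if truncating_frac > majority and not clustered:
        return "tumor-suppressor-like"
    return "unclassified"


# Made-up positions, for illustration only.
clustered_missense = [(182, "missense"), (183, "missense"), (183, "missense")]
scattered_truncating = [(120, "truncating"), (540, "truncating"), (1310, "truncating")]

print(classify_gene(clustered_missense))
print(classify_gene(scattered_truncating))
```

A fuller version would also take allele status into account, since the article notes that tumor suppressors tend to have both alleles affected.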
Papadopoulos said the team would now focus on understanding the function of ARID1A. “We know it’s involved in a complex of proteins that remodels the chromatin, which packages DNA. And this packaging of the DNA has implications in which a whole series of genes are regulated,” he said.
He said his team decided to use whole-exome sequencing because focusing on the protein-coding region would provide “more of an immediate chance to develop clinical applications.” He added that for future studies, they would continue to do exome sequencing. “Whole-genome sequencing has its advantages, but we’re not going to abandon exome sequencing,” he said.
Meanwhile, the group from BC Cancer Agency independently obtained similar results using transcriptome sequencing. They sequenced the transcriptomes of 18 ovarian clear-cell carcinoma tumors and one cell line, using paired-end sequencing on the Illumina GA. When they began the study, they were sequencing with read lengths of 37 base pairs, but by the end, they had increased to 75 base pairs.
David Huntsman, a genetic pathologist at the BC Cancer Agency and senior author of the paper, said that transcriptome sequencing provided a “very rich data set. The appeal is that you get mutations not only in coding genes but also gene fusions, and also accurate gene expression.” However, he added, while it is a rich data set, it is imperfect and does not always catch every mutation. “If you marry transcriptome sequencing to exome or whole-genome sequencing, then you have a data set which is very well rounded.”
Huntsman and his team found mutations in the ARID1A gene in six samples. They then used a targeted exon resequencing strategy to sequence the gene in an additional 210 samples, including 101 samples from patients with clear-cell carcinoma, 33 samples from patients with endometrioid carcinoma, and 76 samples from patients with high-grade serous carcinoma.
They found mutations in 46 percent of the ovarian clear-cell carcinoma patients, 30 percent of the endometrioid carcinomas, and none of the high-grade serous ovarian carcinomas.
The team also did an immunohistochemical analysis in more than 400 additional ovarian cancer tumors for the protein BAF250a, which is encoded by ARID1A and is a key component of the chromatin remodeling complex. Loss of expression was strongly correlated with ARID1A mutations in both ovarian clear-cell carcinoma and endometrioid carcinoma, but not high-grade serous carcinoma.
The high-grade serous carcinoma subtype is being sequenced by the Cancer Genome Atlas project, but the current study’s finding that ARID1A does not play a role in that subtype appears to explain why the gene had not been previously implicated in ovarian cancer. Major disruption of genomic integrity is a key feature in that subtype, Huntsman said, but not in the clear-cell carcinoma subtype. Huntsman speculated that ARID1A mutations might define ovarian clear-cell carcinoma, and other cancer subtypes not marked by major genomic instability.
Huntsman added that the identification of ARID1A mutations in endometrioid cancer suggests that the gene could be used as a biomarker to determine which women with endometriosis are at the greatest risk for developing cancer. “Only a tiny fraction of women with endometriosis ever develop cancer,” he said. “Having a better tool to identify which women are at risk could help guide therapy” and determine which women would most benefit from surgery.
Both Huntsman and Papadopoulos said that it would be difficult to target the ARID1A gene directly with a drug because it is a tumor suppressor, so mutations in the gene cause it to lose function.
“It’s easier to have a protein that is still active and [use drugs to] try to prevent its activity, rather than to have something missing and try to substitute for it,” Papadopoulos said. As a result, a major next step will be to determine which genes are regulated by ARID1A, he added, and then to figure out which of those are druggable, or whether there are already drugs that target any of them.
The need for fast, efficient, and less costly means to screen genetic variants associated with disease predisposition led us to develop an oligonucleotide array-based process for gene-specific single nucleotide polymorphism (SNP) genotyping. This cost-effective, high-throughput strategy has high sensitivity and the same degree of accuracy as direct sequencing, the current gold standard for genetic screening. We used the BRCA1 breast and ovarian cancer predisposing gene model to validate the accuracy and efficiency of our strategy. This process could detect point mutations and insertions or deletions of any length, for both known and unknown variants, even in heterozygous conditions, without affecting sensitivity and specificity. The system could be applied to other disorders and can also b...