Development and analysis of a 20K SNP array for potato (Solanum tuberosum): an insight into the breeding history.
Theor Appl Genet. 2015 Aug 12;
Authors: Vos PG, Uitdewilligen JG, Voorrips RE, Visser RG, van Eck HJ
KEY MESSAGE: A 20K SNP array was developed and a comprehensive set of tetraploid cultivars was genotyped. This allowed us to identify footprints of the breeding history in contemporary breeding material, such as introgression segments, selection signatures and founder signatures. A non-redundant subset of 15,138 previously identified SNPs and 4,454 SNPs originating from the SolCAP project were combined into a 20K Infinium SNP array for genotyping a total of 569 potato genotypes. In this study we describe how this SNP array (encoded SolSTW array) was designed and analysed with fitTetra, software designed for autotetraploids. Genotypes from different countries and market segments, complemented with historic cultivars and important progenitors, were genotyped. This comprehensive set of genotypes combined with the deliberate inclusion of a large proportion of SNPs with a low minor allele frequency allowed us to distinguish genetic variation contributed by introgression breeding. This "new" (post 1945) genetic variation is located on specific chromosomal regions and enables the identification of SNP markers linked to R-genes. In addition, when the genetic composition of modern cultivars was compared with cultivars released before 1945, it appears that 96% of the genetic variants present in those ancestral cultivars remain polymorphic in modern cultivars. Hence, genetic erosion is almost absent in potato. Finally, we studied population genetic processes shaping the genetic composition of the modern European potato including drift, selection and founder effects. This resulted in the identification of major founders contributing to contemporary germplasm.
PMID: 26263902 [PubMed - as supplied by publisher]
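As an illustration of the quantities mentioned in the abstract above, the following is a minimal sketch of how a minor allele frequency could be computed from tetraploid dosage calls (0 to 4 copies of the alternative allele) and how the share of ancestral variants still polymorphic in a modern panel could be estimated. It is not the authors' pipeline and does not use fitTetra; the function names and the dictionary layout are assumptions made for this example only.

```python
from typing import Dict, List, Optional

PLOIDY = 4  # autotetraploid potato: dosage calls range from 0 to 4

Dosages = List[Optional[int]]  # None marks a missing call (hypothetical layout)


def minor_allele_frequency(dosages: Dosages) -> Optional[float]:
    """Return the minor allele frequency, ignoring missing calls."""
    called = [d for d in dosages if d is not None]
    if not called:
        return None
    p = sum(called) / (PLOIDY * len(called))  # alternative-allele frequency
    return min(p, 1 - p)


def is_polymorphic(dosages: Dosages) -> bool:
    """A SNP is polymorphic in a panel when both alleles are present."""
    maf = minor_allele_frequency(dosages)
    return maf is not None and maf > 0


def fraction_retained(old: Dict[str, Dosages],
                      modern: Dict[str, Dosages]) -> Optional[float]:
    """Share of SNPs polymorphic in the old panel that stay polymorphic
    in the modern panel."""
    old_poly = [snp for snp, d in old.items() if is_polymorphic(d)]
    if not old_poly:
        return None
    kept = sum(1 for snp in old_poly if is_polymorphic(modern.get(snp, [])))
    return kept / len(old_poly)
```

Applied to the study's data, a value of `fraction_retained` near 0.96 would correspond to the 96% reported in the abstract; the toy inputs used to check the sketch are invented.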
Scanning and Filling: Ultra-Dense SNP Genotyping Combining Genotyping-By-Sequencing, SNP Array and Whole-Genome Resequencing Data.
PLoS One. 2015;10(7):e0131533
Authors: Torkamaneh D, Belzile F
Genotyping-by-sequencing (GBS) represents a highly cost-effective high-throughput genotyping approach. By nature, however, GBS is subject to generating sizeable amounts of missing data and these will need to be imputed for many downstream analyses. The extent to which such missing data can be tolerated in calling SNPs has not been explored widely. In this work, we first explore the use of imputation to fill in missing genotypes in GBS datasets. Importantly, we use whole genome resequencing data to assess the accuracy of the imputed data. Using a panel of 301 soybean accessions, we show that over 62,000 SNPs could be called when tolerating up to 80% missing data, a five-fold increase over the number called when tolerating up to 20% missing data. At all levels of missing data examined (between 20% and 80%), the resulting SNP datasets were of uniformly high accuracy (96-98%). We then used imputation to combine complementary SNP datasets derived from GBS and a SNP array (SoySNP50K). We thus produced an enhanced dataset of >100,000 SNPs and the genotypes at the previously untyped loci were again imputed with a high level of accuracy (95%). Of the >4,000,000 SNPs identified through resequencing 23 accessions (among the 301 used in the GBS analysis), 1.4 million tag SNPs were used as a reference to impute this large set of SNPs on the entire panel of 301 accessions. These previously untyped loci could be imputed with around 90% accuracy. Finally, we used the 100K SNP dataset (GBS + SoySNP50K) to perform a GWAS on seed oil content within this collection of soybean accessions. Both the number of significant marker-trait associations and the peak significance levels were improved considerably using this enhanced catalog of SNPs relative to a smaller catalog resulting from GBS alone at ≤20% missing data. Our results demonstrate that imputation can be used to fill in both missing genotypes and untyped loci with very high accuracy and that this leads to more powerful genetic analyses.
PMID: 26161900 [PubMed - in process]
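The two measurements that the abstract above relies on, the per-SNP missing-data threshold and the accuracy of imputed genotypes against a trusted reference, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: the imputation step itself is not shown, genotypes are coded as plain integers with `None` for a missing call, and all function names are hypothetical.

```python
from typing import Dict, List, Optional

Calls = List[Optional[int]]  # one genotype per accession; None = missing


def missing_rate(calls: Calls) -> float:
    """Fraction of accessions without a genotype call at this SNP."""
    return sum(1 for c in calls if c is None) / len(calls)


def filter_by_missingness(snps: Dict[str, Calls],
                          max_missing: float) -> Dict[str, Calls]:
    """Keep SNPs whose missing rate does not exceed the tolerated level
    (for example 0.2 or 0.8, the two ends of the range in the abstract)."""
    return {name: calls for name, calls in snps.items()
            if missing_rate(calls) <= max_missing}


def imputation_accuracy(observed: Calls, imputed: Calls,
                        truth: Calls) -> Optional[float]:
    """Share of originally missing calls that imputation filled in
    correctly, judged against reference genotypes (e.g. resequencing).
    Positions without a reference genotype are skipped."""
    total = 0
    correct = 0
    for obs, imp, ref in zip(observed, imputed, truth):
        if obs is None and ref is not None:
            total += 1
            if imp == ref:
                correct += 1
    return correct / total if total else None
```

Relaxing `max_missing` retains more SNPs at the cost of more imputed calls per SNP, which is the trade-off the abstract evaluates between 20% and 80% missing data.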
Combined Analysis of SNP Array Data Identifies Novel CNV Candidates and Pathways in Ependymoma and Mesothelioma.
Biomed Res Int. 2015;2015:902419
Authors: Wajnberg G, Carvalho BS, Ferreira CG, Passetti F
Copy number variation is a class of structural genomic modifications that includes the gain and loss of a specific genomic region, which may include an entire gene. Many studies have used low-resolution techniques to identify regions that are frequently lost or amplified in cancer. Usually, researchers choose to use proprietary or non-open-source software to detect these regions because the graphical interface tends to be easier to use. In this study, we combined two different open-source packages into an innovative strategy to identify novel copy number variations and pathways associated with cancer. We used published mesothelioma and ependymoma datasets to assess our tool. We detected previously described and novel copy number variations that are associated with cancer chemotherapy resistance. We also identified altered pathways associated with these diseases, like cell adhesion in patients with mesothelioma and negative regulation of glutamatergic synaptic transmission in ependymoma patients. In conclusion, we present a novel strategy using open-source software to identify copy number variations and altered pathways associated with cancer.
PMID: 26185765 [PubMed - in process]
The Psychological Challenges of Replacing Conventional Karyotyping with Genomic SNP Array Analysis in Prenatal Testing.
J Clin Med. 2014;3(3):713-23
Authors: Riedijk S, Diderich KE, van der Steen SL, Govaerts LC, Joosten M, Knapen MF, de Vries FA, van Opstal D, Tibben A, Galjaard RJ
Pregnant couples tend to prefer a maximum of information about the health of their fetus. Therefore, we implemented whole genome microarray instead of conventional karyotyping (CK) for all indications for prenatal diagnosis (PND). The array detects more clinically relevant anomalies, including early onset disorders, not related to the indication and more genetic anomalies of yet unquantifiable risk, so-called susceptibility loci (SL) for mainly neurodevelopmental disorders. This manuscript highlights the psychological challenges in prenatal genetic counselling when using the array and provides counselling suggestions. First, we suggest that pre-test decision counselling should emphasize deliberation about what pregnant couples wish to learn about the future health of their fetus more than information about possible outcomes. Second, pregnant couples need support in dealing with SL. Therefore, in order to consider the SL in a proportionate perspective, the presence of phenotypes associated with SL in the family and the incidence of a particular SL in control populations and in postnatally ascertained patients need highlighting during post-test genetic counselling. Finally, the decision that couples need to make about the course of their pregnancy is more complicated when the expected phenotype is variable and not quantifiable. Therefore, during post-test psychological counselling, couples should concretize the options of continuing and ending their pregnancy; all underlying feelings and thoughts should be made explicit, as well as the couple's resources, in order to attain adequate decision-making. As such, pre- and post-test counselling aids pregnant couples in handling the uncertainties that may accompany offering a broader scope of genetic PND using the array.
PMID: 26237473 [PubMed]