Effects of reduced panel, reference origin, and genetic relationship on imputation of genotypes in Hereford cattle.
J Anim Sci. 2012 Aug 2;
Authors: Huang Y, Maltecca C, Cassady JP, Alexander LJ, Snelling WM, Macneil MD
Abstract
The objective of this study was to investigate alternative methods of designing and utilizing reduced single nucleotide polymorphism (SNP) panels for imputing SNP genotypes. Two purebred Hereford populations were utilized: an experimental population known as Line 1 Hereford (L1, N=240) and Herefords registered with the American Hereford Association (AHA, N=311). Using different reference samples of 62 to 311 animals with 39,497 SNPs on 29 autosomes, and study samples of 57 or 62 animals for which genotypes were available for ∼2,600 SNPs (reduced panels), imputations were performed to predict the other ∼36,900 loci, which had been masked. An imputation package including LinkPHASE and DAGPHASE (Druet and Georges, 2010) was used for imputation. Four reduced panels differing in minor allele frequency (MAF) and marker spacing were evaluated: every fifteenth SNP across the genome (SNP_space); the commercial Illumina Bovine3K BeadChip (SNP_3K); SNPs with the highest MAF (SNP_MAF); and SNPs with high MAF that were also evenly spaced across the genome (SNP_MS). Imputation accuracy was defined as the correlation between imputed genotypes and real genotypes. Reference samples were from either L1 or AHA. Among animals with genotypes, genetic relationships were estimated from molecular marker genotypes or from pedigree. Reduced panel design, number of animals in the reference sample, reference origin, and the genetic relationship between animals in the reference and study samples all affected imputation accuracy (P < 0.001). Across genotyping schemes, imputed genotypes from SNP_MS had the greatest accuracy. A 0.1 increase in average pedigree relationship or average molecular relationship between reference and study samples increased imputation accuracy by 10 to 20%. Using reference samples from the L1 population resulted in lower imputation accuracy than using reference samples from the admixed population AHA (P < 0.001). Increasing the number of animals in the reference panel by 100 individuals increased imputation accuracy by 8% when pedigree relationship was used as a covariate and by 6% when molecular relationship was used as a covariate. It was concluded that imputation accuracy would be increased through optimization of reduced panel design and genotyping strategy.
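As an illustration of two ideas in the abstract, the sketch below builds an evenly spaced reduced panel (in the spirit of SNP_space) and scores imputation as the correlation between imputed and real genotypes at the masked loci. It is not the authors' pipeline: the function names, the 0/1/2 genotype coding, the choice to start the panel at the first SNP, and the pooling of all masked loci of one animal into a single Pearson correlation are assumptions made here for illustration. The imputation step itself (LinkPHASE and DAGPHASE in the study) is not reproduced.

```python
from math import sqrt


def every_kth_panel(n_snps, k=15):
    """Return 0-based indices of a reduced panel keeping every k-th SNP.

    Assumption: the panel starts at the first SNP in map order.
    """
    return list(range(0, n_snps, k))


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant genotypes")
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(true_genotypes, imputed_genotypes, panel):
    """Correlation of imputed vs. real genotypes over masked loci only.

    Genotypes are assumed to be coded as allele counts (0, 1, 2).
    Loci on the reduced panel were observed, not imputed, so they
    are excluded from the comparison.
    """
    on_panel = set(panel)
    masked = [i for i in range(len(true_genotypes)) if i not in on_panel]
    return pearson(
        [true_genotypes[i] for i in masked],
        [imputed_genotypes[i] for i in masked],
    )


if __name__ == "__main__":
    panel = every_kth_panel(39497, k=15)
    print(len(panel), "SNPs kept;", 39497 - len(panel), "masked")
```

Under these assumptions, keeping every fifteenth of 39,497 SNPs leaves 2,634 on the panel and 36,863 masked, which is in line with the ∼2,600 and ∼36,900 reported above.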
PMID: 22859753 [PubMed - as supplied by publisher]