Source CiteSeerX
Content type Text
File Format PDF
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword IBD Probability ♦ Sequence Data ♦ Genetic Epidemiology ♦ Dense SNP ♦ Founder Haplotype ♦ Fundamental Importance ♦ Genotyped Marker Locus ♦ Haplotype Block ♦ Individual Share ♦ Family-based Test ♦ High-throughput Single Nucleotide Polymorphism ♦ Previous Finding ♦ Important Innovation ♦ Current Algorithm ♦ Quantitative Trait Mapping ♦ New Model Result ♦ Whereas Merlin Scale ♦ Reasonable Assumption ♦ Cluster Model ♦ Linkage Equilibrium ♦ New Model ♦ High-density Single Nucleotide Polymorphism ♦ Individual Sequencing ♦ Disease Gene ♦ Simple New Model ♦ New Algorithm Scale ♦ Imminent Arrival ♦ Linkage Disequilibrium
Abstract The probabilities that two individuals share 0, 1, or 2 alleles identical by descent (IBD) at a given genotyped marker locus are quantities of fundamental importance for disease gene and quantitative trait mapping and in family-based tests of association. Until recently, genotyped markers were sufficiently sparse that founder haplotypes could be modelled as having been drawn from a population in linkage equilibrium for the purpose of estimating IBD probabilities. However, with the advent of high-throughput single nucleotide polymorphism genotyping assays, this is no longer a reasonable assumption. Indeed, the imminent arrival of individual sequencing will enable high-density single nucleotide polymorphism genotyping on a scale for which current algorithms are not equipped. In this paper, we present a simple new model in which founder haplotypes are modelled as a Markov chain. Another important innovation is that genotyping errors are explicitly incorporated into the model. We compare results obtained using the new model to those obtained using the popular genetic linkage analysis package Merlin, with and without using the cluster model of linkage disequilibrium that is incorporated into that program. We find that the new model results in accuracy approaching that of Merlin with haplotype blocks, but achieves this with orders of magnitude faster run times. Moreover, the new algorithm scales linearly with the number of markers, irrespective of density, whereas Merlin scales supralinearly. We also confirm a previous finding
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study