The loss of statistical power to distinguish populations when certain samples are ambiguous

O'Hely, Martin; Slatkin, M

File(s) under permanent embargo

journal contribution

posted on 2003-09-01, 00:00 authored by Martin O'Hely, M Slatkin

Case-control studies are used to map loci associated with a genetic disease. The usual case-control study tests for significant differences in allele frequencies at marker loci. In this paper, we consider the problem of comparing two or more marker loci simultaneously and testing for significant differences in haplotype rather than allele frequencies. We consider two situations. In the first, genotypes at marker loci are resolved into haplotypes by biochemical methods or by genotyping family members. In the second, genotypes at marker loci are not resolved into haplotypes but, assuming random mating, haplotypes can be inferred using a likelihood method such as the expectation-maximization (EM) algorithm. We assume that a causative locus has two alleles with a multiplicative effect on the penetrance of a disease, with one allele increasing the penetrance by a factor π. For small values of π − 1 and large sample sizes, we find asymptotic results that predict the statistical power of a test for significant differences in haplotype frequencies between cases and a random sample of the population, both when haplotypes can be resolved and when they have to be inferred. The gain in power from resolving haplotypes can be expressed as a ratio R: the factor by which the sample size must be increased when haplotypes are not resolved in order to achieve the same power as when they are resolved. In general, R depends on the pattern of linkage disequilibrium between the causative allele and the marker haplotypes but is independent of the frequency of the causative allele and, to a first approximation, of π. For the special case of two di-allelic marker loci, we obtain a simple expression for R and its upper bound.

History

Journal: Theoretical Population Biology
Volume: 64
Issue: 2
Pagination: 177-192
Publisher: Elsevier
Location: Amsterdam, The Netherlands
Publisher DOI: https://doi.org/10.1016/S0040-5809(03)00084-4
ISSN: 0040-5809
Language: eng
Publication classification: C1.1 Refereed article in a scholarly journal
Copyright notice: 2003, Elsevier
