
Testing assumptions prior to analysis

 

Before data are analysed for population substructure, the genetic variation found at microsatellites must be tested to ensure that the basic assumptions underlying the subsequent theory are not violated. Three main assumptions need to be tested. First, the selective neutrality of each locus should be analysed. Second, the presence of 'null alleles' (alleles which are not detected via PCR analysis) should be identified. Finally, before the data from various loci are combined, the independent assortment of the loci must be tested.

The assumption of the selective neutrality of microsatellite loci is the key tenet behind most analyses of these data. All of the subsequent analyses are based on the interaction of the forces of genetic drift (random change of allele frequency), mutation, and/or migration. Over time, the effects of drift and mutation will lead to the divergence of allele frequencies among subpopulations, while migration will lead to a homogenisation of allele frequencies. Strong selection may overcome these forces. Selection at a locus may stabilise allele frequencies (e.g. via overdominance) across all subpopulations and therefore lead to an underestimation of population substructure or genetic distance. Alternatively, differences in selective pressures among regions may cause the fixation of alternate alleles in different subpopulations and cause the overestimation of these parameters. The effects of selection can therefore confound results, and any loci that are under selective pressure should be excluded from the analysis. Although the vast majority of microsatellites are believed to be neutral, these markers may nonetheless be linked to loci under selection. One only has to look at the wealth of information on human genetic diseases that has been uncovered via the analysis of microsatellite loci tightly linked to candidate 'disease loci' to understand that this problem is not trivial (e.g. Robinson et al. 1996). Since the exact location of most microsatellites used in the analysis of wildlife populations is unknown, the detection of the effects of selection becomes an important first step in any analysis.
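The opposing effects of drift and migration described above can be illustrated with a small simulation. The following is a minimal sketch, not a method from the cited literature: it assumes a single neutral biallelic locus, an island model in which every subpopulation receives migrants from the average of all subpopulations, and no mutation. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate(num_demes, diploid_size, generations, migration_rate, seed=0):
    """Island-model Wright-Fisher sketch for one neutral biallelic locus.

    Returns the final frequency of one allele in each subpopulation.
    Mutation is ignored (an assumption made to keep the sketch short).
    """
    rng = random.Random(seed)
    copies = 2 * diploid_size          # gene copies per subpopulation
    freqs = [0.5] * num_demes          # all subpopulations start identical
    for _ in range(generations):
        mean_freq = sum(freqs) / num_demes
        new_freqs = []
        for p in freqs:
            # Migration pulls each subpopulation towards the overall mean.
            p_mig = (1 - migration_rate) * p + migration_rate * mean_freq
            # Drift: binomial sampling of the gene copies of the next generation.
            count = sum(1 for _ in range(copies) if rng.random() < p_mig)
            new_freqs.append(count / copies)
        freqs = new_freqs
    return freqs


def variance(values):
    """Population variance of a list of allele frequencies."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


isolated = simulate(50, 50, 100, migration_rate=0.0)
connected = simulate(50, 50, 100, migration_rate=0.1)
print("variance among subpopulations, no migration:", variance(isolated))
print("variance among subpopulations, migration 0.1:", variance(connected))
```

Under these assumptions the variance in allele frequency among isolated subpopulations should be much larger than among connected ones, which is the divergence and homogenisation described in the text. Selection is deliberately absent from the sketch; it is the departure from this neutral behaviour that the tests in this section try to detect.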

Comparing observed genotype frequencies with those expected under Hardy-Weinberg equilibrium may detect the presence of selection. This comparison is not straightforward at microsatellite loci because of the combined effects of modest sample sizes and a large number of alleles. Thus, data are usually pooled prior to comparison with expected values to increase the power of the tests. Several tests, at different levels of comparison, are recommended. A selection of these tests is described below. First, observed and expected levels of heterozygosity can be compared (e.g. Edwards et al. 1992). An unbiased estimate of heterozygosity is

$$\hat{H} = \frac{n}{n-1}\left(1 - \sum_{i} p_i^{2}\right)$$

where $n$ is the sample size and $p_i$ is the frequency of the ith allele. In addition, the data for rare alleles can be pooled. A common way to achieve this is to compare three classes: homozygotes for the most common allele, heterozygotes carrying the most common allele, and all other, rarer genotypes (e.g. Gottelli et al. 1996). Finally, the observed and expected values for all genotype classes can be compared (e.g. Allen et al. 1995; Edwards et al. 1992). This final comparison should be conducted with caution, as one of the common statistics used to test the significance of these data, the likelihood-ratio test (G-statistic; Sokal and Rohlf 1981), does not follow a standard distribution in the presence of a large number of alleles and a moderate sample size (Edwards et al. 1992). Thus, levels of statistical significance have to be estimated by permutation of the data (Deka et al. 1991).
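The three levels of comparison can be sketched in code. This is an illustrative sketch under stated assumptions, not the procedure of any cited paper: the heterozygosity estimator is taken to be (n / (n - 1)) * (1 - sum of squared allele frequencies) with n assumed to count gene copies (two per individual); the permutation scheme assumed here shuffles alleles among individuals; and all function names are hypothetical. Genotypes are given as pairs of integer allele labels.

```python
import math
import random
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies from a list of diploid genotypes (pairs of alleles)."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def unbiased_heterozygosity(genotypes):
    """(n / (n - 1)) * (1 - sum(p_i^2)); ASSUMPTION: n counts gene copies."""
    n = 2 * len(genotypes)
    homozygosity = sum(p * p for p in allele_freqs(genotypes).values())
    return n / (n - 1) * (1 - homozygosity)


def pooled_classes(genotypes):
    """Observed and Hardy-Weinberg expected counts for the three pooled classes:
    [homozygote for commonest allele, heterozygote with it, all other genotypes]."""
    freqs = allele_freqs(genotypes)
    common = max(freqs, key=freqs.get)
    p, size = freqs[common], len(genotypes)
    observed = [0, 0, 0]
    for a, b in genotypes:
        copies = (a == common) + (b == common)   # 0, 1 or 2 copies
        observed[2 - copies] += 1
    expected = [size * p * p, 2 * size * p * (1 - p), size * (1 - p) ** 2]
    return observed, expected


def g_statistic(genotypes, freqs):
    """G = 2 * sum(O * ln(O / E)) over the genotype classes actually observed."""
    size = len(genotypes)
    counts = Counter(tuple(sorted(pair)) for pair in genotypes)
    total = 0.0
    for (a, b), obs in counts.items():
        exp = size * (freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b])
        total += obs * math.log(obs / exp)
    return 2 * total


def permutation_p_value(genotypes, reps=999, seed=0):
    """Estimate significance of G by shuffling alleles among individuals."""
    rng = random.Random(seed)
    freqs = allele_freqs(genotypes)   # unchanged by shuffling
    g_obs = g_statistic(genotypes, freqs)
    alleles = [allele for pair in genotypes for allele in pair]
    hits = 0
    for _ in range(reps):
        rng.shuffle(alleles)
        shuffled = list(zip(alleles[0::2], alleles[1::2]))
        if g_statistic(shuffled, freqs) >= g_obs - 1e-9:
            hits += 1
    return (hits + 1) / (reps + 1)


sample = [(1, 1), (1, 2), (2, 2), (1, 2), (1, 3), (3, 3)]  # made-up genotypes
print(observed_heterozygosity(sample), unbiased_heterozygosity(sample))
print(pooled_classes(sample))
print(permutation_p_value(sample))
```

The permutation step avoids relying on a chi-square approximation for G: the observed statistic is ranked against values obtained from data sets in which alleles have been randomly re-paired, so the reference distribution reflects the actual sample size and number of alleles.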


