Documentation for immanc program

last updated October 8, 1998

PROGRAM DESCRIPTION

immanc.c is a program designed to test whether an individual is an immigrant or has recent immigrant ancestry. The test is described in detail in: Rannala, B. and Mountain, J.L. (1997) Detecting immigration by using multilocus genotypes. Proc. Natl. Acad. Sci. USA 94:9197-9201. To apply the test, one must have genotype data for multiple loci for multiple individuals from two or more populations (see the section on the input file below). Ideally, but not necessarily, all individuals have been typed at all loci. The method is appropriate for use with allozyme, microsatellite, or restriction fragment length data. Loci are assumed to be in linkage equilibrium. The power of the test depends on the number of loci, the number of individuals sampled, and the extent of genetic differentiation between populations. The program uses Monte Carlo simulations to determine the power and significance of the test; the number of replications requested by the user (see below) determines the precision of these values.
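
As a rough guide to how the number of replications affects precision: a proportion p estimated from n independent Monte Carlo replicates has a standard error of about sqrt(p(1-p)/n). The sketch below is in Python and is not part of immanc; the function name mc_standard_error is hypothetical. It only evaluates this binomial approximation, assumes the replicates are independent, and says nothing about how immanc computes its own estimates.

```python
import math


def mc_standard_error(p, n_reps):
    """Binomial standard error of a proportion p estimated from
    n_reps independent Monte Carlo replicates (an approximation,
    not taken from the immanc source)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n_reps <= 0:
        raise ValueError("n_reps must be positive")
    return math.sqrt(p * (1.0 - p) / n_reps)


# Compare the replication counts mentioned in this document,
# for an estimated proportion near 0.05.
for n in (100, 1000, 10000, 100000):
    print(n, round(mc_standard_error(0.05, n), 5))
```

Under this approximation, an estimate near 0.05 has a standard error of roughly 0.02 with 100 replications and roughly 0.002 with 10,000, which is consistent with treating a 100-replication run as a format check only.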

INPUT FILE

The input file contains one line per genotype, that is, one line per individual per genetic marker. Each line consists of five alphanumeric items, separated by tabs or spaces, in the following order:

 UniqueIndividualID UniquePopulationID UniqueMarkerID FirstAllele SecondAllele

 Note that FirstAllele and SecondAllele need not be unique: the same allele name may be used for different markers, because the program links each allele with the corresponding MarkerID. Blank lines and extra spaces between items are acceptable. Do not include spaces within any item. A comment may be included by starting the line with the character '%'.

 See example input file, immanc.inp.
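
 To check a data file against the format described above before running immanc, something like the following Python sketch could be used. It is not part of the immanc distribution: the function name parse_immanc_input and the sample records are hypothetical, invented only for illustration. It assumes that a '%' preceded by leading whitespace still marks a comment, which the description above does not state.

```python
def parse_immanc_input(lines):
    """Parse genotype records of the form:
    IndividualID PopulationID MarkerID FirstAllele SecondAllele

    Blank lines and lines starting with '%' are skipped.
    Returns a list of 5-tuples of strings.
    """
    records = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Assumption: leading whitespace before '%' is tolerated here.
        if not line or line.startswith("%"):
            continue
        fields = line.split()  # splits on any run of tabs or spaces
        if len(fields) != 5:
            raise ValueError(
                f"line {lineno}: expected 5 items, found {len(fields)}"
            )
        records.append(tuple(fields))
    return records


# Hypothetical data, invented for illustration only.
sample = """\
% individual population marker allele1 allele2
ind01  popA  locus1  120  124
ind01  popA  locus2  B    B

ind02\tpopB\tlocus1\t124\t124
"""

records = parse_immanc_input(sample.splitlines())
for rec in records:
    print(rec)
```

 A line with more or fewer than five items, for example one caused by a space inside an identifier, raises a ValueError that names the offending line number.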

RUNNING THE PROGRAM

Run the program by entering, at the command line prompt:

 immanc

 Respond to the following questions, as requested.

 Name of input file (RETURN for stdin):

(enter name of file containing data)

 Name of output file (RETURN for stdout):

(enter name of file for storing output; make sure this is different from the input file!)

 Print out data?

(type y or Y if you would like the data printed. Do this at least once to ensure that the data are read properly. Note that the data may appear in a different order from the input)

 Comment line to be included in output (max 100 characters):

(here you may enter a note that you would like to include at the beginning of the output file)

 Number of generations in past:

(0: present generation only; 1: present and parental generations; 2: present, parental, and grandparental generations; etc.)

 Number of replications:

(recommended: between 1000 and 100,000, preferably at least 10,000. We recommend running the program with 100 replications initially, just to make sure that the input file is in the appropriate format and that the program runs properly)

 Significance level for power tests (default is 0.050000):

(this is new to release 5.0. In the previous version alpha was fixed at 0.05; this is now the default value, but other values may be specified)

 Wait for the program to finish; the run time depends on how many replications you have requested.

OUTPUT

The output will include the following:

 1) Comment line, if you have chosen this option

 2) Printout of data, if you have requested this. As noted above, the records may not be in the same order as the input file. They will be sorted by population.

 3) A table giving the number of markers for which data are available for both of the two populations.

 4) Matrices indicating the power of each possible test. The row of an entry indicates the sample population; the column indicates the potential source population. Note that the test of whether an individual of population A has immigrant ancestors from population B may have more or less power than the test of whether an individual of population B has immigrant ancestors from population A. Power depends on the heterozygosity within the populations as well as the genetic heterogeneity between the populations.

 5) A table providing the results of the immigration/immigrant ancestry test for each individual.

 Column 1: Individual - the identification code of the individual under consideration.

 Column 2: Sample - the population of which an individual is currently a member.

 Column 3: Source - the population from which the individual or ancestors may have immigrated.

 Column 4: ObsRatio - the logarithm of the observed posterior probability ratio, lambda or lambda_d. This quantity is defined in Rannala and Mountain (1997) as log(lambda) (see equation 16) or log(lambda_d) (see equation 19).

 Column 5: MixGen - d, the number of generations in the past that the immigration is hypothesized to have taken place. 0 corresponds to the hypothesis that the individual immigrated; 1 corresponds to the hypothesis that a parent immigrated, etc.

 Column 6: Alpha - the significance level of the test.

 Column 7: Power - the power of the test.

 Column 8: Markers - the number of markers used to perform the test.

 Column 9: One asterisk indicates that the test is significant at the 5% level; two asterisks indicate that the test is significant at the 1% level.
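
 Item 3 of the output reports the number of markers for which data are available in both populations. To cross-check that table against an input file, the count can be approximated independently, as in the Python sketch below. It is not part of immanc: the function name shared_marker_counts and the records are hypothetical. It assumes that a marker counts as available in a population when at least one individual of that population has a record for it; how immanc itself treats missing genotypes is not described here.

```python
from itertools import combinations


def shared_marker_counts(records):
    """Count markers with data in both populations, for every pair.

    records: iterable of (individual, population, marker, allele1, allele2).
    Returns a dict mapping (pop_a, pop_b) to the number of shared markers.
    """
    markers_by_pop = {}
    for _ind, pop, marker, _a1, _a2 in records:
        markers_by_pop.setdefault(pop, set()).add(marker)

    counts = {}
    for pop_a, pop_b in combinations(sorted(markers_by_pop), 2):
        shared = markers_by_pop[pop_a] & markers_by_pop[pop_b]
        counts[(pop_a, pop_b)] = len(shared)
    return counts


# Hypothetical records, invented for illustration only.
records = [
    ("ind01", "popA", "locus1", "120", "124"),
    ("ind01", "popA", "locus2", "B", "B"),
    ("ind02", "popB", "locus1", "124", "124"),
    ("ind03", "popC", "locus1", "120", "120"),
    ("ind03", "popC", "locus2", "A", "B"),
]

print(shared_marker_counts(records))
```

 A pair of populations with few shared markers is worth noting when reading the Markers and Power columns, since the number of loci is one of the factors that determine the power of the test.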