Purpose:
This program estimates the gene flow parameter theta for a collection of two or more semi-isolated populations by (pseudo) maximum likelihood, using single-locus or multilocus, haploid or diploid, genotype data. Suitable molecular data types include, at a minimum, allozymes, microsatellites, and RFLPs. The theory underlying the method is given in Rannala and Hartigan (1996). The method is explicitly based on Wright's (1931) island model of population demographic structure. An exact solution exists for the likelihood function of this model under Wright's diffusion approximation. However, the form of the likelihood function appears quite robust to the specific assumptions of the model, and other, more general, population demographic models may have an identical likelihood function (Rannala 1996). For a (haploid) island model with a constant population size, the parameter theta can be interpreted as Nm, the expected number of immigrants per island, per generation. This quantity is most often estimated by the widely used formula Nm = (1/Fst - 1)/4 (for diploid nuclear loci). However, the maximum likelihood estimator has less bias and lower variance than this classical estimator over widely varying levels of gene flow and is therefore preferable.
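For comparison with the program's estimate, the classical Fst-based estimator can be sketched in a few lines. This is only an illustration, not part of PMLE: it assumes Wright's island-model relation Fst = 1/(1 + 4Nm) for diploid loci (and Fst = 1/(1 + 2Nm) for haploid loci), and the function name nm_from_fst and its ploidy argument are hypothetical.

```python
def nm_from_fst(fst: float, ploidy: int = 2) -> float:
    """Classical island-model estimate of Nm from Fst.

    Assumes Fst = 1 / (1 + 2 * ploidy * Nm), so that
    Nm = (1/Fst - 1) / (2 * ploidy).
    ploidy=2 gives the diploid form (divide by 4),
    ploidy=1 gives the haploid form (divide by 2).
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("Fst must lie in the interval (0, 1]")
    if ploidy not in (1, 2):
        raise ValueError("ploidy must be 1 (haploid) or 2 (diploid)")
    return (1.0 / fst - 1.0) / (2 * ploidy)


if __name__ == "__main__":
    # Example with an arbitrary Fst value (not taken from any data set).
    print(nm_from_fst(0.2))
    print(nm_from_fst(0.2, ploidy=1))
```

Note that this estimator is undefined when Fst is zero or negative, which can happen with sample estimates of Fst.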
Where to get the software:
You may retrieve the files by anonymous ftp from mw511.biol.berkeley.edu
in directory pub or on the web at http://mw511.biol.berkeley.edu/.
Executables are available for the following operating systems:
Microsoft Windows 95/98/NT: PMLE22.exe (PMLE22.zip)
Macintosh: coming soon.
Note: the program is distributed in compressed format with
documentation included.
Using the program:
Input file format
The input file format is now compatible with the NEXUS standard implemented in many phylogenetics packages, such as PAUP, and in the population genetics package GDA developed by Paul Lewis and Dmitri Zaykin. One advantage of this is that input files created for PMLE22 can subsequently be used with GDA with very little modification. In addition, the file conversion routines in GDA allow a PMLE22 input file to be translated into other formats, including the BIOSYS format used by several population genetics packages. PMLE22 is not quite as flexible about format requirements as GDA, so input files that work with GDA are not guaranteed to work with PMLE22. However, files that work with PMLE22 will almost certainly work with GDA after changing the keyword pmledata to gdadata (with diploid data, the keyword diploid must also be removed from the format statement).
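The two changes described above (pmledata to gdadata, and removal of the diploid keyword from the format statement) are easy to make by hand, but they could also be scripted. The following is a minimal sketch, not a tool distributed with PMLE22 or GDA; the function name pmle_to_gda is hypothetical, and it assumes the format statement ends at the first semicolon after the word format.

```python
import re


def pmle_to_gda(text: str) -> str:
    """Return a copy of a PMLE22 input file edited for use with GDA."""
    # 1. Rename the data block keyword.
    out = re.sub(r"\bpmledata\b", "gdadata", text, flags=re.IGNORECASE)

    # 2. Remove the keyword 'diploid', but only inside the format statement
    #    (from the word 'format' up to the next semicolon).
    def strip_diploid(match: "re.Match[str]") -> str:
        return re.sub(r"\bdiploid\b\s*", "", match.group(0),
                      flags=re.IGNORECASE)

    out = re.sub(r"\bformat\b[^;]*;", strip_diploid, out,
                 count=1, flags=re.IGNORECASE)
    return out


if __name__ == "__main__":
    header = (
        "#nexus\n"
        "begin pmledata;\n"
        "dimensions nloci=3 npops=3;\n"
        "format diploid\n"
        "missing=? separator=/ datapoint=standard;\n"
    )
    print(pmle_to_gda(header))
```

Haploid files are left with their format statement unchanged, since only the diploid keyword is said to require removal.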
An example input file based on mtDNA from Channel Island foxes (Wayne et al. 1991) is given below:
#nexus
begin pmledata;
dimensions nloci=1 npops=6;
format haploid missing=? datapoint=standard;
locusallelelabels
1 locus_1;
matrix
Pop_SMi: _1_ 1
_2_ 1
_3_ 1
_4_ 1
_5_ 1
_6_ 1
_7_ 1
_8_ 1
_9_ 1
10_ 1
11_ 1
12_ 1
13_ 1
14_ 1
15_ 1
16_ 1
17_ 1
18_ 5
19_ 5
20_ 5
21_ 5
22_ 5,
Pop_SRo: _1_ 4
_2_ 4
_3_ 4
_4_ 4
_5_ 4
_6_ 4
_7_ 4
_8_ 5
_9_ 5
10_ 5
11_ 5
12_ 5
13_ 5
14_ 5
15_ 5
16_ 5
17_ 5
18_ 5
19_ 5
20_ 5
21_ 5
22_ 5
23_ 5
24_ 5
25_ 5
26_ 5
27_ 5
28_ 5
29_ 5
30_ 5,
Pop_SCr: _1_ 4
_2_ 4
_3_ 4
_4_ 4
_5_ 4
_6_ 5
_7_ 5
_8_ 5
_9_ 5
10_ 5
11_ 5
12_ 5
13_ 5
14_ 5
15_ 5
16_ 5
17_ 5
18_ 5
19_ 5
20_ 5
21_ 5
22_ 5
23_ 5
24_ 5
25_ 5
26_ 5
27_ 5
28_ 5,
Pop_SNi: _1_ 2
_2_ 2
_3_ 2
_4_ 2
_5_ 2
_6_ 2
_7_ 2
_8_ 2
_9_ 2
10_ 2
11_ 2
12_ 2
13_ 2
14_ 2
15_ 2
16_ 2
17_ 2
18_ 2
19_ 2
20_ 2
21_ 2
22_ 2,
Pop_SCa: _1_ 3
_2_ 3
_3_ 3
_4_ 4
_5_ 4
_6_ 4
_7_ 4
_8_ 4
_9_ 4
10_ 4
11_ 4
12_ 4
13_ 4
14_ 4
15_ 4
16_ 4
17_ 4
18_ 4
19_ 4
20_ 4
21_ 5
22_ 5
23_ 5
24_ 5,
Pop_SCl: _1_ 1
_2_ 1
_3_ 1
_4_ 1
_5_ 1
_6_ 1
_7_ 1
_8_ 1
_9_ 1
10_ 1
11_ 1
12_ 1
13_ 1
14_ 1
15_ 1
16_ 1
17_ 1
18_ 1
19_ 1
20_ 1
21_ 1
22_ 1
23_ 1
24_ 1
25_ 1
26_ 1
27_ 1;
end;
Executing the program with this data file (wayne.nex) produces the
following screen output:
Pseudo Maximum Likelihood Estimator of Gene Flow
Bruce Rannala, SUNY Stony Brook
Release 2.2 for Windows 95/NT (23/08/98)
Wed Jul 29 12:33:52 1998
Enter name of input file: wayne.nex
Opening NEXUS data file wayne.nex
Reading PMLE 2.2 data format
Data matrix has 6 populations, 1 loci, and 153 individuals
Missing data represented by the symbol ?
All loci haploid.
Estimate of theta: 0.42063 standard error: 0.206354
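Before running the program it can be useful to check that an input file contains the number of populations and individuals you expect, since these totals are echoed in the screen output. The sketch below counts individuals per population in the matrix block. It is not part of PMLE22; the function name count_individuals is hypothetical, and it assumes the layout used in the example above: one individual per line, population labels ending in a colon, populations separated by commas, and the matrix terminated by a semicolon.

```python
from typing import Dict


def count_individuals(nexus_text: str) -> Dict[str, int]:
    """Count individuals per population in a PMLE-style matrix block."""
    counts: Dict[str, int] = {}
    in_matrix = False
    current = None
    for raw in nexus_text.splitlines():
        line = raw.strip()
        if not in_matrix:
            # Skip everything up to the 'matrix' keyword.
            if line.lower() == "matrix":
                in_matrix = True
            continue
        if not line:
            continue
        if ":" in line:
            # A population label opens a new population; the first
            # individual may follow on the same line.
            label, line = line.split(":", 1)
            current = label.strip()
            counts[current] = 0
            line = line.strip()
        if current is not None and line.rstrip(",;").strip():
            counts[current] += 1
        if line.endswith(";"):
            # A semicolon closes the matrix.
            break
    return counts


if __name__ == "__main__":
    sample = """#nexus
begin pmledata;
dimensions nloci=1 npops=2;
format haploid missing=? datapoint=standard;
locusallelelabels
1 locus_1;
matrix
Pop_A: _1_ 1
_2_ 1
_3_ 5,
Pop_B: _1_ 4
_2_ 5;
end;
"""
    per_pop = count_individuals(sample)
    print(per_pop, sum(per_pop.values()))
```

The number of keys in the returned dictionary should match npops in the dimensions statement, and the sum of the counts should match the number of individuals reported by the program.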
An example of a diploid (allozyme) input file is given below:
#nexus
begin pmledata;
dimensions nloci=3 npops=3;
format diploid
missing=? separator=/ datapoint=standard;
locusallelelabels
1 ADH,
2 LDH,
3 MDH;
matrix
Pop_1: _1_ A/a b/b A/?
_2_ A/A b/? A/A
_3_ A/A B/b A/A
_4_ A/a B/B A/A
_5_ A/a B/B A/A,
Pop_2: _1_ A/A B/B a/a
_2_ A/A B/b a/a
_3_ A/a B/B a/a
_4_ A/a B/b a/a,
Pop_3: _1_ A/A B/B B/B
_2_ A/A B/B B/B
_3_ A/A C/C B/B
_4_ A/A C/c B/B;
end;
To run the above diploid input file under GDA, change pmledata to gdadata and remove the keyword diploid. Executing the program with this input file (diploid.nex) produces the following screen output:
Pseudo Maximum Likelihood Estimator of Gene Flow
Bruce Rannala, SUNY Stony Brook
Release 2.2 for Windows 95/NT (23/08/98)
Wed Jul 29 12:31:50 1998
Enter name of input file: diploid.nex
Opening NEXUS data file diploid.nex
Reading PMLE 2.2 data format
Data matrix has 3 populations, 3 loci, and 13 individuals
Missing data represented by the symbol ?
Different genes at one locus separated by the symbol /
All loci diploid.
Estimate of theta: 1.08421 standard error: 0.592875
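If the program is run on many data sets, the estimate and its standard error can be pulled out of saved screen output with a short script. The sketch below assumes the output has been captured to a string by some means and that the final line has exactly the wording shown above; the function name parse_estimate is hypothetical and is not part of PMLE22.

```python
import re
from typing import Optional, Tuple

# Matches the final line of the screen output, for example
# "Estimate of theta: 1.08421 standard error: 0.592875".
_ESTIMATE_LINE = re.compile(
    r"Estimate of theta:\s*([-+0-9.eE]+)\s+standard error:\s*([-+0-9.eE]+)"
)


def parse_estimate(screen_output: str) -> Optional[Tuple[float, float]]:
    """Return (theta, standard_error), or None if the line is absent."""
    match = _ESTIMATE_LINE.search(screen_output)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    captured = (
        "All loci diploid.\n"
        "Estimate of theta: 1.08421 standard error: 0.592875\n"
    )
    print(parse_estimate(captured))
```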
Support:
You may email me if you have any problems.
Bruce Rannala
Department of Ecology and Evolution
State University of New York
Stony Brook, NY 11794-5245
Email: rannala@life.bio.sunysb.edu
References
Rannala, B., and J. A. Hartigan. 1996. Estimating gene flow in island populations. Genetical Research 67:147-158.
Rannala, B. 1996. The sampling theory of neutral alleles in an island population of fluctuating size. Theoretical Population Biology 50:91-104.