MAXDIP 

Overview
MAXDIP is an application that estimates a reciprocal recombination (crossover) rate parameter (rho) and a gene conversion rate parameter (f) from population variation data. We apply an operational definition of gene conversion that is consistent with a number of genetic mechanisms whereby daughter chromosomes contain small tracts from a different parental chromosome. MAXDIP uses unphased diploid polymorphism data, tolerates missing data, and can use information on the ancestral state of alleles when that information is available. Sites with three or more alleles are ignored. At present, samples of up to 50 diploids can be used.

MAXDIP uses the maximum composite likelihood method described in Hudson (2001) and extended to include gene conversion in Frisse et al. (2001). An infinite-sites, constant-population-size neutral model is assumed. The gene conversion model is that of Wiuf and Hein (2000), which assumes that the tract length is geometrically distributed.

The program can be run in one of two modes. In the first mode, the user provides a range of rho values and, optionally, a mean gene conversion tract length and a single value of the gene conversion rate parameter f. The program then computes the estimated composite likelihood of the data for each of a discrete set of rho values in the range. In the second mode, the user provides an initial rho value and, optionally, a mean gene conversion tract length and a range of f values. The program then finds the value of rho that maximizes the composite likelihood of the data for each value of f, as well as the (rho, f) pair that gives the maximum composite likelihood over the specified set of f values. Ranges for both rho and f are specified by entering minimum and maximum values of the parameter, and a step size. Multiple data sets, with different sample sizes and numbers of sites, can be analyzed together by simply concatenating the files.
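The two modes described above can be sketched as follows. This is only an illustration of the control flow, not MAXDIP's implementation: the function names, the default f of 0, and the toy likelihood surface are all assumptions, and the second mode is shown here as a plain search over a list of candidate rho values rather than whatever optimizer MAXDIP actually uses.

```python
def scan_rho(log_cl, rho_values, f=0.0):
    """First mode: composite log-likelihood at each rho for one fixed f.

    log_cl is a hypothetical callable log_cl(rho, f); f=0.0 is an assumed default.
    """
    return [(rho, log_cl(rho, f)) for rho in rho_values]


def best_rho_per_f(log_cl, rho_values, f_values):
    """Second mode: best rho for each f, plus the best (f, rho) pair overall."""
    per_f = []
    for f in f_values:
        rho_hat, value = max(scan_rho(log_cl, rho_values, f), key=lambda pair: pair[1])
        per_f.append((f, rho_hat, value))
    best = max(per_f, key=lambda row: row[2])
    return per_f, best


def toy_log_cl(rho, f):
    # Placeholder surface with a single peak at rho=0.004, f=2.
    # It is NOT the composite likelihood computed by MAXDIP.
    return -((rho - 0.004) * 1000) ** 2 - (f - 2.0) ** 2


per_f, best = best_rho_per_f(toy_log_cl, [0.002, 0.004, 0.006], [0.0, 2.0, 4.0])
print(best)
```

In practice the placeholder `toy_log_cl` would be replaced by a function that evaluates the composite likelihood of the data.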
It should be emphasized that the method is a composite likelihood method, not a maximum likelihood method. For example, approximate confidence intervals obtained by treating these composite likelihoods as if they were true likelihoods will almost certainly grossly underestimate the size of such intervals. Also, it should be noted that when recombination rates are heterogeneous, either within or between loci, estimates of f using MAXDIP may be biased. Source code for a version of MAXDIP that uses command line arguments is also available. References are listed under MAXDIP Citations.

1. Upload a polymorphism data file. (File size is limited to 1 MB.)  
2. Select either mode A or B and enter the appropriate parameter values.  
 
3. Select optional settings for reading input file.  










4. Enter and confirm an email address to which results should be sent.  
Input File Formats  
MAXDIP accepts nucleotide polymorphism data in the following formats: Click the above links to get descriptions and examples of each format.  
Parameter Definitions  
rho: rho equals 4Nr, where N is the effective population size and r is the reciprocal recombination rate between adjacent base pairs.  
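As a quick illustration of this definition, the snippet below computes rho from N and r. The function name and the numerical values are illustrative assumptions, not values taken from MAXDIP.

```python
def rho_per_bp(effective_population_size, r_per_bp):
    """Return rho = 4 * N * r for adjacent base pairs."""
    return 4.0 * effective_population_size * r_per_bp


# Illustrative inputs only: N = 10,000 and r = 1e-8 per base pair per generation.
rho = rho_per_bp(10_000, 1e-8)
print(rho)
```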
Initial value of rho: MAXDIP limits its search for the rho value that maximizes the composite likelihood to within two orders of magnitude of the initial rho value. If MAXDIP returns a rho value near a limit of that search range (about 100 times the initial value, or about one hundredth of it), the program should be rerun with a larger or smaller initial rho value, respectively.
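A small helper along these lines could flag an estimate that sits at the edge of the search range. It assumes the range runs from one hundredth of the initial rho to 100 times the initial rho; the function name and the 5% tolerance used to decide what counts as "near" are arbitrary choices for this sketch.

```python
def search_bound_reached(initial_rho, estimated_rho, tolerance=0.05):
    """Return "upper", "lower", or None depending on which search limit is hit.

    Assumes the search covers initial_rho / 100 through initial_rho * 100.
    """
    lower = initial_rho / 100.0
    upper = initial_rho * 100.0
    if estimated_rho >= upper * (1.0 - tolerance):
        return "upper"  # rerun with a larger initial rho
    if estimated_rho <= lower * (1.0 + tolerance):
        return "lower"  # rerun with a smaller initial rho
    return None


print(search_bound_reached(0.001, 0.0999))
```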
Minimum rho value: This is the lower end of rho values for which MAXDIP will compute the composite likelihood.  
Maximum rho value: This is the upper end of rho values for which MAXDIP will compute the composite likelihood.
Step size of rho: This is the increment that MAXDIP will apply to the rho values when computing the likelihoods between the minimum and maximum rho values.  
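The minimum, maximum, and step size together define a grid of parameter values, which might be generated as in the sketch below. Whether MAXDIP evaluates the maximum value itself when the step divides the range evenly is an assumption here, and the function name is hypothetical. The same construction applies to a range of f values.

```python
import math


def parameter_grid(minimum, maximum, step):
    """Return minimum, minimum + step, ... up to (and including) maximum."""
    if step <= 0:
        raise ValueError("step must be positive")
    if maximum < minimum:
        raise ValueError("maximum must not be smaller than minimum")
    # The small epsilon guards against floating-point round-off in the division.
    n_steps = math.floor((maximum - minimum) / step + 1e-9)
    return [minimum + i * step for i in range(n_steps + 1)]


rho_values = parameter_grid(0.0, 0.01, 0.002)
print(len(rho_values))
```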
Gene Conversion Parameters  
f: f is the gene conversion parameter. It is defined as the ratio g/r, where g is the probability per generation that a gamete has a gene conversion tract that starts at a specified site (g is the same as in Wiuf and Hein, 2000) and r is the reciprocal recombination rate between adjacent base pairs. Thus, f is the ratio of the gene conversion rate to the reciprocal recombination rate per base pair.
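Because rho = 4Nr and f = g/r, the product f * rho equals 4Ng, the gene conversion rate on the same population scale as rho. The sketch below shows both relationships; the function names and numbers are illustrative assumptions.

```python
def conversion_ratio(g, r):
    """Return f = g / r."""
    return g / r


def scaled_conversion_rate(rho, f):
    """Return 4Ng, obtained as f * rho since rho = 4Nr and f = g / r."""
    return f * rho


# Illustrative values only.
f = conversion_ratio(g=2e-8, r=1e-8)
print(f, scaled_conversion_rate(rho=4e-4, f=f))
```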
Minimum value of f: This specifies the lower end of the range of f values.  
Maximum value of f: This specifies the upper end of the range of f values.  
Step size of f: This specifies the size of the intervals between the f values in the range.  
Mean gene conversion tract length: This is the mean of a geometric distribution of tract length (equal to 1/q in the notation of Wiuf and Hein, 2000).  
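To make the relationship between the mean tract length and q concrete, the sketch below evaluates a geometric distribution with mean 1/q. It assumes tract lengths start at 1 base pair, so that P(length = k) = q(1 - q)^(k - 1); the function name and the example mean of 500 are illustrative.

```python
def tract_length_pmf(length, mean_tract_length):
    """Probability of a tract of the given length under a geometric model.

    Assumes lengths 1, 2, 3, ... with q = 1 / mean_tract_length.
    """
    q = 1.0 / mean_tract_length
    return q * (1.0 - q) ** (length - 1)


# Numerical check that the distribution has the requested mean.
mean_check = sum(k * tract_length_pmf(k, 500.0) for k in range(1, 20001))
print(round(mean_check, 6))
```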
Settings for Reading Input File  
Ignore ancestral data (if present): If 'Yes' is selected, the program will assume all ancestral states are unknown. If the file does not contain the ancestor line (or if all the ancestral alleles are labelled missing) this button will have no effect.  
Parse individual IDs for populations: If 'Yes' is selected, the input file will be parsed into separate populations based on the leading alphabetic characters of the individuals' IDs. MAXDIP will then return rho estimates for each population in the data set. For example, if a file contains individual IDs CA23, AA100, and CAU54, MAXDIP will return separate rho estimates for populations CA, AA, and CAU.  
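The grouping rule described above can be sketched as follows, using the example IDs from the text. The function names are hypothetical, and how MAXDIP treats IDs with no leading letters is not stated; this sketch files them under an empty label.

```python
import re
from collections import defaultdict


def population_label(individual_id):
    """Return the leading alphabetic characters of an individual ID."""
    match = re.match(r"[A-Za-z]+", individual_id)
    return match.group(0) if match else ""


def group_by_population(individual_ids):
    """Map each population label to the IDs that carry it."""
    groups = defaultdict(list)
    for individual_id in individual_ids:
        groups[population_label(individual_id)].append(individual_id)
    return dict(groups)


groups = group_by_population(["CA23", "AA100", "CAU54"])
print(groups)
```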
Estimate rho for total sample: If MAXDIP identifies more than one sample in the data (see Parse individual IDs for populations) and 'Yes' is selected, MAXDIP will provide a rho estimate for a sample that includes all individuals in the file, in addition to providing rho estimates for all subsamples.
Parse alleles in case-sensitive mode: Unless 'Yes' is selected, MAXDIP will assume uppercase and lowercase letters represent the same allele.
Missing data symbols: Those symbols checked will be read as representing missing data (i.e., the allele is unspecified). The user can also enter an arbitrary string of characters to represent missing data.  
Allele frequency threshold: This is the minor allele or derived allele (if ancestral data is present) threshold frequency that must be met in order for the polymorphic site to be included in the MAXDIP analysis.  
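One way such a filter could work for a single site is sketched below. It is not MAXDIP's code: the function name, the missing-data symbols, the use of a fraction rather than a percentage, and the choice to include sites whose frequency equals the threshold are all assumptions. Sites that do not have exactly two observed alleles are excluded.

```python
from collections import Counter


def passes_threshold(alleles, threshold, ancestral=None, missing=("N", "?")):
    """Decide whether a site meets the allele frequency threshold.

    alleles: observed allele calls at one site, pooled over all chromosomes.
    threshold: minimum frequency as a fraction (assumed inclusive).
    ancestral: ancestral allele if known; otherwise the minor allele is used.
    """
    counts = Counter(a for a in alleles if a not in missing)
    if len(counts) != 2:
        return False  # monomorphic, or three or more alleles
    total = sum(counts.values())
    if ancestral in counts:
        derived = next(a for a in counts if a != ancestral)
        frequency = counts[derived] / total
    else:
        frequency = min(counts.values()) / total
    return frequency >= threshold


site = list("AAAAAAAGGN")  # 7 A, 2 G, 1 missing
print(passes_threshold(site, 0.2), passes_threshold(site, 0.25))
```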
Output Definitions  
MCLE of rho: Value of rho which maximizes the composite likelihood of the data.  
MAXDIP Citations
Hudson, R. R. Two-locus sampling distributions and their application. Genetics 159:1805-1817. (2001)
Wiuf, C., and Hein, J. The coalescent with gene conversion. Genetics 155:451-462. (2000)
Contact Information  
Send email inquiries to: David Witonsky
Supported by the National Institute of General Medical Sciences (GM61393S1) 