			###########################
			##### Linkage Scripts #####
			###########################

Copyright 2001,2004 Greenwood Genetic Center

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA


************
**Contents**
************

Linkage.pm is the linkage module containing functions that process
pedfile.dat and datafile.dat.  It is required by all the programs included in
this package.

markerdist.pl is a simple script to output all the markers in a file and the
distances between them.  It is also a good simple example of how to use the
Linkage module.

mlink.pl is a script for automating the process of running mlink.  It requires
lsp, unknown, and mlink.

linkmap.pl is a script for automating the process of running linkmap.  It
requires lsp, unknown, linkmap, and linklods.


****************
**INSTALLATION**
****************

1. Uncompress and untar this archive to /usr/local/linkage:
     # tar -C /usr/local -zxvf linkage_scripts-0.11.tgz

2. Install the following utilities for linkage analysis:

   Note that these programs are covered by their respective licenses.

     a) The FASTLINK package by Cottingham et al.
        In particular, the utilities unknown, mlink, and linkmap are required.
        Download: ftp://fastlink.nih.gov/pub/fastlink/

     b) The Linkage Auxiliary Programs by Peter Cartwright
        In particular, the utility lsp is required.
        Download: ftp://linkage.rockefeller.edu/software/linkage/

     c) The Linklods program by Jurg Ott
        Download: http://fog.bio.unipd.it/pub/fastlink/dos-binaries/
        Linklods is a DOS program; however, it can be run under Linux by
        using WINE (http://www.winehq.org/).  Simply install WINE and copy
        linklods.exe to the /usr/local/linkage directory.  The linklods.sh shell
        script runs linklods using WINE; edit this script if necessary.

   Also, the Linkage manual at http://linkage.rockefeller.edu/soft/linkage/ is
   extremely helpful.

3. After downloading, uncompressing, and untarring the linkage scripts, create a
   symbolic link to the Linkage module in a Perl module directory.  To list
   the Perl module directories for your system, one per line, execute the
   command
     # perl -e 'print "$_\n" for @INC'
   Assuming you would like to use the module directory /usr/lib/perl5/site_perl
   and the Linkage module is in /usr/local/linkage, execute the command
     # ln -s /usr/local/linkage/Linkage.pm  /usr/lib/perl5/site_perl

4. Create symbolic links so that linklods, mlink.pl, and linkmap.pl can be run
   without typing the full path to each program:
     # ln -s /usr/local/linkage/linklods.sh /usr/local/bin/linklods
     # ln -s /usr/local/linkage/mlink.pl /usr/local/bin/mlink.pl
     # ln -s /usr/local/linkage/linkmap.pl /usr/local/bin/linkmap.pl
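
A quick way to confirm the installation is to check that every required
utility can be found on your PATH.  The following is a minimal sketch, not part
of the distributed scripts; the function name missing_tools is hypothetical, and
the utility names are the ones listed in the steps above.

```shell
# Print the name of every listed utility that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of this package.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Utilities and scripts named in the installation steps above.
missing_tools lsp unknown mlink linkmap linklods mlink.pl linkmap.pl
```

If the command prints nothing, all of the utilities were found.  Any name that
is printed still needs to be installed or linked into a directory on your PATH.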

***********************
**Running the Scripts**
***********************

1. Create a pedfile from your ipedfile and call it "pedin.dat".  For example,
   if your existing file is called "ipedfile2.dat", use the command
     $ makeped ipedfile2.dat pedin.dat
     Does your pedigree file contain any loops?    (y/n) -> n
     Do you want probands selected automaticaly?   (y/n) -> y

2. Make a copy of your parameter file called "datain.dat".  For example, if
   your existing file is called "datafile2.dat", use the command
     $ cp datafile2.dat datain.dat

3. Run one (or both) of the scripts.
     $ mlink.pl
     $ linkmap.pl
   mlink.pl creates two files: mlink.txt, which contains the same text that is
   printed to the screen when mlink.pl finishes, and mlink.csv, which contains
   the same data formatted for a spreadsheet program.  Since linkmap.pl
   typically produces too much data to print to the screen at one time, it
   produces only the file linkmap.csv.

4. If a script freezes, the likely cause is that one of the required utilities
   is displaying an error of some sort.  Try the debug option:
     $ mlink.pl DEBUG
     $ linkmap.pl DEBUG
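
The input preparation can also be wrapped in a small shell function so that a
missing input file is reported before a script is started.  This is only a
sketch under stated assumptions: the function name prepare_inputs is
hypothetical, and it assumes pedin.dat has already been created with makeped in
the current directory.

```shell
# Copy a parameter file to datain.dat and confirm that pedin.dat is present.
# Returns non-zero, with a message on stderr, if either input is missing.
# (prepare_inputs is a hypothetical helper, not part of this package.)
prepare_inputs() {
    datafile=$1
    if [ ! -f "$datafile" ]; then
        echo "parameter file not found: $datafile" >&2
        return 1
    fi
    if [ ! -f pedin.dat ]; then
        echo "pedin.dat not found; run makeped first" >&2
        return 1
    fi
    cp "$datafile" datain.dat
}

# Example use:
#   prepare_inputs datafile2.dat && mlink.pl
```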

*******************************
**Automating Linkage Analysis**
*******************************

The following sections will only be useful if you intend to write your own
scripts to do linkage analysis or intend to edit my scripts.

****************************************************
**General Strategy for Automating Linkage Analysis**
****************************************************

Learn what programs need to be executed to get the results.  One good strategy
for this is to run lcp to create a shell script that does what you want done.
By examining this shell script it is fairly simple to determine what the syntax
is for using various programs.  Also, examining the scripts I have written is a
good starting point.

****************************
**Using the Linkage Module**
****************************

The Linkage module exports by default the functions parseDatafile() and
parsePedfile().  These functions load the data from datain.dat and pedin.dat
into the exported variables %datainfo and @pedinfo.

%datainfo is a hash containing the following elements from the datafile.  See
the Linkage manual for the datafile to determine what each element means.

  1st Line of the Datafile
$datainfo{"nlocus"}
$datainfo{"risklocus"}
$datainfo{"sexlink"}
$datainfo{"nprogram"}

  2nd Line of the Datafile
$datainfo{"mutsys"}
$datainfo{"mutmale"}
$datainfo{"mutfem"}
$datainfo{"disequil"}

  3rd Line of the Datafile
$datainfo{"order"} is a reference to an array of locus numbers.  For example,
if the first locus on the chromosome is 3 (the first value on line 3), then
$datainfo{"order"}->[0] will be 3.

  The Locus Data
The data for each locus is loaded into a 3D anonymous array.  Only Affection
Status and Numbered Alleles are supported.  For example, if the first locus is
a Numbered Allele, then the frequency for the first locus can be printed
with
if($datainfo{1}->[0]->[0] == 3) { #verify the first locus is a numbered allele
   print $datainfo{1}->[1]->[0];
}

  The Line After The Loci In The Datafile
$datainfo{"sexdiff"} must be 0 (other values are not supported)
$datainfo{"sexintf"} must be 0 (other values are not supported)

$datainfo{"recomb"} is a reference to an anonymous array containing the
recombination values between adjacent loci.  For example, to find the value
between the first and second loci, use
$datainfo{"recomb"}->[0];

Placing "#NUM NAME" at the end of the comment for a locus causes NAME to be
loaded into an anonymous array.  To find the name of the first locus, use
$datainfo{"names"}->[0]


@pedinfo contains all of the information from the pedfile.  Each entry is a
reference to an array with the information on one person.  For example, to find
the first person's family number, use
$pedinfo[0]->[0]
Also, the last two elements of each person's array are the original family and
person numbers, taken from the comments produced by makeped.