Primer design
Primer design for multiplex polony amplification involves four steps:
(i) picking candidate primers with a modified version of the MIT
Primer 3 program, (ii) finding all partial and complete matches of each
candidate primer sequence in the target genome with Blast, (iii)
predicting the amplification products of all possible primer
combinations from a thermodynamic model, and (iv) searching for a set
of compatible primers that are specific to the target loci. This
procedure is implemented in a Perl program. The program depends on NCBI
Blast (obtain it from NCBI directly) and on a modified version of MIT
Primer 3 (included), in which the nearest-neighbor thermodynamic
parameters were updated. Please note that the version downloaded from
the MIT Primer 3 website does not work for our purpose!
The primer design program requires a Unix/Linux system with at least
15 GB of hard disk space and 4 GB of RAM. The easiest way to run the
program is to do everything in one working directory. Detailed
instructions follow:
(1) Download and extract yamPCR.zip in your working folder.
(2) Download the genome sequence for human (or mouse, or another
organism) from http://genome.ucsc.edu to the same directory. You need
the chromFa.zip file, in which each chromosome's sequence is in its own
file (for human:
http://hgdownload.cse.ucsc.edu/goldenPath/hg17/bigZips/chromFa.zip).
Once you extract the zip file, you will see chr1.fa, chr2.fa, ...,
chrX.fa, chrY.fa, chrM.fa. You will also see a number of files
containing "random" sequences that could not be assembled into the
contigs. These files can be deleted.
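The unplaced-sequence files can be removed in one pass. The sketch below
assumes they are named chrN_random.fa (check your own directory listing
first; the pattern is an assumption about the download, not something
yamPCR requires). The helper name remove_random_fa is hypothetical.

```shell
# Hypothetical helper: delete unplaced-sequence files from a directory.
# Assumes the files are named chr*_random.fa; adjust the pattern if your
# genome build names them differently.
remove_random_fa() {
    dir="${1:-.}"
    for f in "$dir"/chr*_random.fa; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        echo "removing $f"
        rm -f -- "$f"
    done
}
```

Run it as "remove_random_fa ." from the working directory; the regular
chromosome files (chr1.fa, chrX.fa, chrM.fa and so on) are left alone.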
(3) Prepare a Blast database. (This assumes NCBI Blast is installed on
your system; if it is not, download the software from
http://ncbi.nlm.nih.gov/blast and follow the installation
instructions.) First merge the sequences of all chromosomes into one
file:
cat *.fa > HsGenome
(or MmGenome if you are working on mouse).
Then you can run the formatdb program included in the Blast package:
formatdb -i HsGenome -p F
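Before formatting the database, it can be worth confirming that the
merged file contains what you expect. The sketch below is one possible
check, assuming each input file holds exactly one sequence (one '>'
header line); the helper name merge_genome is hypothetical.

```shell
# Hypothetical helper: merge per-chromosome FASTA files and verify the result.
# Assumes each input file holds exactly one sequence (one '>' header line).
# Usage: merge_genome OUTPUT_FILE INPUT.fa [INPUT.fa ...]
merge_genome() {
    out="$1"; shift
    cat "$@" > "$out" || return 1
    n_files=$#
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    n_headers=$(grep -c '^>' "$out" || true)
    echo "$n_files files merged, $n_headers sequences in $out"
    # Succeed only if the header count matches the number of input files.
    [ "$n_files" -eq "$n_headers" ]
}
```

For example, "merge_genome HsGenome chr*.fa && formatdb -i HsGenome -p F"
runs formatdb only when the counts agree.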
(4) Download and install the following Perl modules from CPAN:
- Bioperl
- Algorithm::Numerical::Shuffle
- Getopt::Long
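To confirm that the modules are visible to your Perl installation, you
can try loading each one. The sketch below uses Bio::Seq as a stand-in
for "BioPerl is installed" (an assumption; any BioPerl module would do),
and the helper name check_perl_modules is hypothetical.

```shell
# Hypothetical helper: report whether each named Perl module can be loaded.
# Returns success only if every module loads.
check_perl_modules() {
    missing=0
    for m in "$@"; do
        # "perl -MModule -e 1" loads the module and then does nothing.
        if perl -M"$m" -e 1 2>/dev/null; then
            echo "ok       $m"
        else
            echo "MISSING  $m"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

A typical call would be
"check_perl_modules Bio::Seq Algorithm::Numerical::Shuffle Getopt::Long".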
(5) Edit the yamPCR.pl program (lines 81-85) to tell it the locations
of your Blast program, Blast database, and other related files.
(6) Now you may run yamPCR.pl with the test input file included in the
package.
./yamPCR.pl --targetFile=yamPCR_test_input.txt > yamPCR_test.log &
tail -f yamPCR_test.log
The whole-genome Blast search and the search for an optimal primer set
are both slow, so please be patient.
(7) When you design primers for many loci, the program often fails to
find a complete set of primers covering all the target loci. In such
cases, it attempts to design primers for as many loci as possible. If
you need primers for the loci that failed, prepare a file containing
the good primers already found, remove every locus that already has
good primers from the target file, and run the program again on the
remaining loci. The primer file format is very simple: one primer
sequence per line, with no annotation, spaces, or tabs. If you know
that some primers do not work at all, you may create another file
listing primers that should be excluded.
Then you can feed these primer sequences to yamPCR:
./yamPCR.pl --targetFile=remaining_loci_input.txt \
    --excludedPrimers='excluded-primers.txt' \
    --includedPrimers='included-primers.txt' > yamPCR_test.log &
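Since the primer files in step (7) must hold one bare sequence per
line, a quick format check before a long run may save time. The sketch
below assumes primers contain only A, C, G and T in either case, and
that the excluded-primer file uses the same layout as the included one;
both are assumptions, and the helper name check_primer_file is
hypothetical.

```shell
# Hypothetical helper: check that a primer list has one sequence per line,
# with no annotation, spaces or tabs. Assumes primers use only A, C, G, T
# (either case); loosen the pattern if your primers contain other symbols.
check_primer_file() {
    file="$1"
    # Print offending lines with their line numbers.
    # grep exits 0 when at least one such line exists.
    if grep -n -v -E '^[ACGTacgt]+$' "$file"; then
        echo "$file: the lines above are not plain primer sequences" >&2
        return 1
    fi
    return 0
}
```

For example,
"check_primer_file included-primers.txt && check_primer_file excluded-primers.txt"
succeeds only when both files are clean. Blank lines and Windows-style
line endings are reported as errors too.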