Primer design for multiplexed polony amplification

Primer design

Primer design for multiplex polony amplification involves four steps: (i) picking candidate primers using a modified MIT Primer 3 program, (ii) finding all partial and complete matches of each candidate primer sequence in the target genome by Blast, (iii) predicting the amplification products of all possible primer combinations based on a thermodynamic model, and (iv) searching for a set of compatible primers that are specific to the target loci. This procedure is implemented in a Perl program, yamPCR.pl. It depends on NCBI Blast (which should be obtained directly from NCBI) and on a modified version of MIT Primer 3 (included in the package), in which the nearest-neighbor thermodynamic parameters were updated. Please note that the version downloaded from the MIT Primer 3 website does not work for this purpose!

The primer design program requires a Unix/Linux system with at least 15 GB of hard disk space and 4 GB of RAM. The easiest way to run the program is to do everything in one working directory. Detailed instructions follow:
(1) Download and extract yamPCR.zip in your working directory.

(2) Download the human (or mouse, or other) genome sequence from http://genome.ucsc.edu to the same directory. You need the chromFa.zip file, in which each chromosome is stored in its own file (for human: http://hgdownload.cse.ucsc.edu/goldenPath/hg17/bigZips/chromFa.zip). Once you extract the zip file, you will see chr1.fa, chr2.fa, ..., chrX.fa, chrY.fa, chrM.fa. You will also see a number of files containing "random" sequences that could not be assembled into the contigs. These files can be deleted.
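
The cleanup at the end of step (2) could be scripted roughly as follows. This is a minimal sketch, not part of yamPCR: the function names are made up for illustration, and the `*_random.fa` file name pattern is an assumption about how the unplaced-sequence files are named, so list the files and check them before deleting anything.

```shell
# list_unplaced_files DIR
# Print the unplaced-sequence FASTA files found directly in DIR.
# ASSUMPTION: these files are named like chr1_random.fa; verify this
# against the actual contents of your chromFa.zip.
list_unplaced_files() {
    find "$1" -maxdepth 1 -type f -name '*_random.fa' | sort
}

# remove_unplaced_files DIR
# Delete those files so that a later "cat *.fa" merges only the
# assembled chromosomes (chr1.fa ... chrY.fa, chrM.fa).
remove_unplaced_files() {
    find "$1" -maxdepth 1 -type f -name '*_random.fa' -exec rm -f {} +
}
```

For example, run `list_unplaced_files .` in the working directory to review the candidates, then `remove_unplaced_files .` once the list looks right.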

(3) Prepare a Blast database. This assumes that NCBI Blast is installed on your system; if it is not, please download the software from http://ncbi.nlm.nih.gov/blast and follow the installation instructions. First, merge the sequences of all chromosomes into one file:
    cat *.fa > HsGenome
(Use MmGenome instead if you are working on mouse.) Then run the formatdb program included in the Blast package:
    formatdb -i HsGenome -p F
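
Before moving on, it may be worth checking that the merge and the formatdb run in step (3) produced what is expected. The helpers below are an illustrative sketch and are not part of the package; the function names are hypothetical. The check relies on formatdb writing a `.nin` index for a nucleotide database, or a `.nal` alias file if the database was split into volumes.

```shell
# count_fasta_records FILE
# Print the number of sequences in FILE (lines starting with ">").
# For the merged genome this should equal the number of chr*.fa
# files that were concatenated.
count_fasta_records() {
    grep -c '^>' "$1"
}

# blast_db_exists NAME
# Succeed if a nucleotide Blast index for NAME is present:
# NAME.nin for a single-volume database, or NAME.nal when formatdb
# split the database into volumes.
blast_db_exists() {
    [ -e "$1.nin" ] || [ -e "$1.nal" ]
}
```

For example, `count_fasta_records HsGenome` prints the number of merged sequences, and `blast_db_exists HsGenome || echo "Blast database not found"` flags a failed formatdb run.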

(4) Download and install the following Perl modules from CPAN:

BioPerl
Algorithm::Numerical::Shuffle
Getopt::Long
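
One way to confirm that the modules in step (4) are visible to Perl is to try loading each of them. This is a sketch with hypothetical function names. `Bio::Perl` is used here only as a representative BioPerl module, which is an assumption; yamPCR.pl may load different `Bio::` modules.

```shell
# check_perl_module NAME
# Succeed if the Perl interpreter on the PATH can load module NAME.
check_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

# report_perl_modules
# Print one status line per required module.
# ASSUMPTION: Bio::Perl stands in for the BioPerl distribution.
report_perl_modules() {
    for m in Bio::Perl Algorithm::Numerical::Shuffle Getopt::Long; do
        if check_perl_module "$m"; then
            echo "ok       $m"
        else
            echo "MISSING  $m"
        fi
    done
}
```

Running `report_perl_modules` prints three lines; any module marked MISSING still needs to be installed from CPAN.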

(5) Edit the yamPCR.pl program (lines 81-85) to tell it the locations of your Blast program, Blast database, and other related files.

(6) Now you may run yamPCR.pl with the test input file included in the package.
    ./yamPCR.pl --targetFile=yamPCR_test_input.txt > yamPCR_test.log &
    tail -f yamPCR_test.log
The whole-genome Blast search and the optimal primer search are slow, so please be patient.

(7) When you design primers for many loci, the program often fails to find a complete set of primers covering all target loci. In such cases, it designs primers for as many loci as possible. If you need primers for the loci that failed, prepare a file containing the good primers already found, and rerun the program on the failed loci only (remove all loci that already have good primers from the target file). The file format is very simple: one primer sequence per line, with no annotation, spaces, or tabs. If you know that some primers do not work at all, you may create another file listing primers that should be excluded. Then feed these primer sequences to yamPCR:
    ./yamPCR.pl --targetFile=remaining_loci_input.txt --excludedPrimers='excluded-primers.txt' --includedPrimers='included-primers.txt' > yamPCR_test.log &
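
Because the primer list must contain one bare sequence per line, a quick format check before running step (7) can catch stray spaces, tabs, or annotations. The function below is a sketch with a hypothetical name. It assumes primers use only the letters A, C, G, and T, and that the excluded-primers file follows the same format as the included-primers file; adjust the pattern if either assumption does not hold for your input.

```shell
# primer_file_ok FILE
# Succeed only if FILE is non-empty and every line consists solely of
# A, C, G, T (upper or lower case): no spaces, tabs, annotation, or
# blank lines.
# ASSUMPTION: ambiguity codes such as N are not expected in the file.
primer_file_ok() {
    [ -s "$1" ] && ! grep -qv '^[ACGTacgt][ACGTacgt]*$' "$1"
}
```

For example, `primer_file_ok included-primers.txt || grep -nv '^[ACGTacgt][ACGTacgt]*$' included-primers.txt` prints the offending lines with their line numbers.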