Using CAD-PAM

The overall purpose of CAD-PAM is to enable the design, synthesis, assembly, and testing of DNA constructs. Details of how the CAD-PAM software helps are given in the paper:

Tian J, Gong H, Sheng N, Zhou X, Gulari E, Gao X, & Church GM (2004) Accurate Multiplex Gene Synthesis from Programmable DNA Chips.

You can run the CAD-PAM software by using this link:
https://arep.med.harvard.edu/cgi-bin/cadpam/worktest/cadpamworks.pl

Note that by using this secure link to the CAD-PAM interface, all your web transactions will be encrypted, reducing the likelihood of their falling into unauthorized hands. You are also provided with links for setting a password, which is confirmed with you by email. That password is then required for anyone to access any results directory that is made for you.

In the descriptions of input and output files below, the file names are links to typical examples of these files that you can download and examine. Note that many of the actual output file names also include a string based on the run series designation you provide when you use CAD-PAM. For example, the file called chipProduction.txt was originally named chipProductionfeb2-3.txt, which shows that it came from the third run made with a run identifier of "feb2".
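
The naming pattern in the chipProductionfeb2-3.txt example can be taken apart with a short script. The following Python sketch is only an illustration and is not part of CAD-PAM: the function name parse_run_file_name, the RunFileName type, and the list of base names are hypothetical, and it assumes every suffixed file follows the same "base name + run identifier + hyphen + run number" layout as that one example.

```python
import re
from typing import NamedTuple, Optional


class RunFileName(NamedTuple):
    base: str        # e.g. "chipProduction"
    run_series: str  # e.g. "feb2"
    run_number: int  # e.g. 3
    extension: str   # e.g. "txt"


# Base names taken from the output file list on this page. Which files
# actually carry the run suffix is an assumption. The "raw" names come
# first so they are tried before their shorter counterparts.
KNOWN_BASES = (
    "rawchipProduction", "rawchipQA", "rawchipSelectionA", "rawchipSelectionB",
    "chipProduction", "chipQA", "chipSelectionA", "chipSelectionB",
)


def parse_run_file_name(name: str) -> Optional[RunFileName]:
    """Split a name such as 'chipProductionfeb2-3.txt' into its parts.

    Returns None when the name has no recognizable run suffix.
    """
    for base in KNOWN_BASES:
        # <base><run series>-<run number>.<extension>
        match = re.fullmatch(re.escape(base) + r"(.+)-(\d+)\.(\w+)", name)
        if match:
            return RunFileName(base, match.group(1), int(match.group(2)), match.group(3))
    return None
```

Under these assumptions, parse_run_file_name("chipProductionfeb2-3.txt") yields the base name "chipProduction", the run series "feb2", and the run number 3, while a plain "chipProduction.txt" yields None.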

CAD-PAM Input
- Run Series Designation - You can group related CAD-PAM runs by giving them a shared identifier. Using the same identifier again later gives you easy access to the run instruction file that was used with it.
- Email address - You will be notified of the location of the results of your CAD-PAM run at the email address you provide, and it will also be used to give you access to previously used versions of the CAD-PAM run instructions file. We check that the email address is well formed. If a message bounces because of a typing error, the bounce report goes to a staff member, who will attempt to correct the error and resend the message.
- Input list of sequences to provide oligos for - This is a FASTA formatted list of sequences that you can either enter directly or upload as a file. The FASTA format is described here. An example of a typical input FASTA file that contains a single E. coli gene sequence represented according to ISO standards in lowercase letters is at http://arep.med.harvard.edu/cadpamresults/examples/input.seq.
- cadpam.properties - This is the name of the file that provides the directions for running CAD-PAM. This link goes to the first default file, which is designed to handle DNA sequences without breaking them into fragments first. For now there are two additional default files to choose from, both of which are also designed to handle DNA sequences. The second breaks the input sequences into 400 base pair fragments and adds universal primers that are well suited for E. coli sequences to both ends before generating the oligos needed to create each fragment. The third breaks the input sequences at specified restriction enzyme recognition sites and also adds the universal primers to the ends of each fragment. You can change the restriction enzyme recognition sites it looks for if you wish. The CAD-PAM software can also generate oligonucleotides for creating sequences that code for amino acid sequences, but we do not yet have a default cadpam.properties file designed for that use. At a minimum, you will need to change the SeqType variable value from "DNA" to "PROTEIN" to use it on an input list of amino acid sequences. It currently uses a codon usage table that is designed to work well for E. coli sequences, which can be found at ecoli_k12.txt. Right now there is no mechanism for substituting a different codon usage file, but you are welcome to prepare one for another organism and send it to us, and we will incorporate its selection into this interface.
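
To give a feel for what breaking an input sequence into 400 base pair fragments means, here is a minimal Python sketch. It is not CAD-PAM's algorithm: the function name split_into_fragments is hypothetical, the fragments are simple consecutive chunks (so the last one may be shorter), and no universal primers or restriction sites are handled.

```python
from typing import List, Tuple


def split_into_fragments(sequence: str, fragment_length: int = 400) -> List[Tuple[int, int, str]]:
    """Cut a DNA sequence into consecutive fragments of at most fragment_length bases.

    Returns (start, end, fragment) tuples with 1-based, inclusive positions.
    Illustrative only: CAD-PAM's real fragmenting rules may differ.
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    # Drop whitespace and line breaks, and use lowercase like the example input.
    seq = "".join(sequence.split()).lower()
    fragments = []
    for offset in range(0, len(seq), fragment_length):
        piece = seq[offset:offset + fragment_length]
        fragments.append((offset + 1, offset + len(piece), piece))
    return fragments
```

With this sketch, a 1000-base input gives three fragments covering positions 1-400, 401-800, and 801-1000. The positions CAD-PAM actually used for a run are reported in its standard output.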

Output Files

cadpam.properties - This is the actual cadpam.properties file that was used in this run of the CAD-PAM software. This linked example of cadpam.properties shows the changes that were made to provide the default control file that breaks the input sequences into 400 base pair fragments. You can choose from the default list of control files or from the most recently used cadpam.properties file of any run series you have previously run. You can also upload edited files and recover them later by using the same run series identifier you uploaded them with.
cadpam.txt - This file contains the standard output produced by the CAD-PAM software, which is also shown on the web page. It includes information about which sequence positions are covered by each fragment, as well as the universal primer sequences if you have chosen to use these on the ends of each fragment.
chipProduction.txt - This file lists the oligonucleotides to be created for the synthesis of each fragment of the input sequences.
rawchipProduction.txt - The list of construction oligos that can be used to synthesize the requested sequences, with each oligo on its own line and separate header lines that identify the input sequence and the fragment the oligos are for.
chipQA.txt - The list of oligos for a QA chip that can be used to help identify construction oligos that have problems in the synthesis process.
rawchipQA.txt - The same list of QA chip oligos, with each oligo on its own line and separate header lines that identify the input sequence and the fragment the oligos are for.
chipSelectionA.txt, chipSelectionB.txt - The Selection files list the oligos for the chips that are used, as described in the paper, for hybridization selection of the construction oligos to reduce error rates.
rawchipSelectionA.txt, rawchipSelectionB.txt - The same Selection files presented in raw format.
full_sequences.txt - This file lists the fragments the input sequences were broken into. It looks like FASTA format but does not observe the restriction of 80 base pairs per line of the file. The CAD-PAM software also tolerates FASTA formatted input files that ignore that restriction.
info.txt - This file provides additional details about the oligos generated in this run.
*.seq - This file has a name based on the run identifier you used with the file extension ".seq" and it contains the input file that you provided for this run.
*.tar.gz - This file also has a name based on the run identifier you used, followed by ".tar.gz". It is a compressed tar file that you can download and unpack with the Unix tar command or the Windows WinZip utility to create your own copy of this results directory and all the files in it, except for the tar file itself.
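
Because full_sequences.txt does not keep to 80 base pairs per line, a reader for it should not assume any line length. The Python sketch below shows one way to load such a file. The function name read_fasta is hypothetical, and the layout of the header lines in real CAD-PAM output is an assumption: the code only expects each record to start with a ">" line.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-like text into a {header: sequence} dictionary.

    Sequence lines of any length are accepted and joined per record.
    """
    records: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError("duplicate header: " + header)
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first header line")
        else:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}
```

An open file can be passed directly, for example with open("full_sequences.txt") as handle: fragments = read_fasta(handle). The same function also works on an input .seq file.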

Please contact George Church for more information, or with any questions, comments, or concerns.