Using CAD-PAM
The overall purpose of CAD-PAM is to enable design, synthesis, assembly
and testing of DNA constructs. Details of how the CAD-PAM software helps
are found in the paper
Tian J, Gong H, Sheng N, Zhou X, Gulari E, Gao X, & Church GM (2004) Accurate Multiplex Gene Synthesis from Programmable DNA Chips.
You can run the CAD-PAM software by using this link:
https://arep.med.harvard.edu/cgi-bin/cadpam/worktest/cadpamworks.pl
Note that this secure link encrypts all your web transactions with the
CAD-PAM interface, reducing the likelihood of their falling into unauthorized
hands. You are provided with links that use email to confirm with the user in
question the setting of a password, which is then required for anyone to
access any results directory that is made for you. In the descriptions of
input and output files below, the file names are links to typical examples of
these files that you can download and examine. Note that many of the actual
output file names also include a string based on the run series designation
you provide when you use CAD-PAM. For example, the file called
chipProduction.txt was originally named chipProductionfeb2-3.txt, which shows
that it came from the third run made with the run identifier "feb2".
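The naming scheme in that example can be taken apart programmatically. The following is a minimal sketch that assumes every suffixed output file follows the pattern seen in the single example above (base name, then run identifier, then a hyphen and the run number); split_output_name and KNOWN_BASES are hypothetical names introduced here and are not part of CAD-PAM.

```python
import re

# Base names taken from the output file list; extend as needed.
KNOWN_BASES = ["chipProduction", "rawchipProduction", "chipQA", "rawchipQA"]


def split_output_name(filename, base_names=KNOWN_BASES):
    """Split e.g. 'chipProductionfeb2-3.txt' into (base, run_id, run_number).

    Assumed pattern: <base name><run identifier>-<run number>.txt
    Returns None when the file name does not fit that pattern.
    """
    for base in base_names:
        match = re.fullmatch(re.escape(base) + r"(.+)-(\d+)\.txt", filename)
        if match:
            return base, match.group(1), int(match.group(2))
    return None


print(split_output_name("chipProductionfeb2-3.txt"))
```

If the real naming scheme differs for some files, the function simply returns None for them.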
-
CAD-PAM Input
-
Run Series Designation - You can group CAD-PAM runs by associating an identifier
with each related group of runs, which will give you easy access to the same
run instruction file with subsequent use of the same identifier.
-
Email address - You will be notified of the location of the results of your
CAD-PAM run at the email address you provide, and it also will be used to
provide you access to previously used versions of the CAD-PAM run instructions
file. We check that the email address is well formed. Should a message
bounce because of a typing error, the bounce report goes to a staff member
who will attempt to correct the error and resend the message.
-
Input list of sequences to provide oligos for - This is a FASTA-formatted
list of sequences that you can either enter directly or upload as a file. The
FASTA format is described
here.
An example of a typical input FASTA file that contains a single E. coli gene
sequence represented according to ISO standards in lowercase letters is at
http://arep.med.harvard.edu/cadpamresults/examples/input.seq.
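Before uploading, it can help to confirm that a file really is FASTA formatted. The following is a minimal sketch of a reader for the common FASTA layout (a ">" header line followed by one or more sequence lines of any length); parse_fasta is a hypothetical helper and the record shown is made up for illustration, not taken from the example file.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines are skipped
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-line record, for illustration only.
example = ">demo_gene made-up example\natgaaacgc\nattagc\n"
for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```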
-
cadpam.properties - This is the name of the file that provides the
directions for running CAD-PAM. This link goes to the first default file,
which is designed to handle DNA sequences without breaking them into fragments
first. For now there are two additional default files to choose from, both
of which are also designed to handle DNA sequences. One breaks the input
sequences into 400 base pair fragments and adds universal primers that are
well-suited for E. coli sequences to both ends before generating the oligos
needed to create each fragment. The third default file breaks the input
into fragments that are created by breaking the sequence at specified
restriction enzyme recognition sites and also adds the universal primers
to the ends of each fragment. You can change the restriction enzyme
recognition sites it looks for if you wish. The CAD-PAM software can
also generate oligonucleotides that can be used to create sequences that
code for amino acid sequences, but we do not yet have any default
cadpam.properties designed for that use. At a minimum you will need to
change the SeqType variable value from "DNA" to "PROTEIN" to use it on
an input list of amino acid sequences. It currently uses a codon usage
table that is designed to work well for E. coli sequences, which can be
found at ecoli_k12.txt. Right now there is no mechanism for
substituting a different codon usage file, but you are welcome to prepare
one for another organism and send it to us, and we will incorporate
its selection into this interface.
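The SeqType change described above can be made with a text editor or scripted before uploading an edited file. The sketch below is illustrative only: it assumes that cadpam.properties holds one key=value assignment per line, which this page does not confirm, and set_property is a hypothetical helper rather than part of CAD-PAM. Only the variable name SeqType and the values DNA and PROTEIN come from the description above.

```python
import re


def set_property(properties_text, key, value):
    """Return the text with the value assigned to `key` replaced.

    Assumes one "key=value" assignment per line (an assumption about the
    file layout). Raises KeyError if the key is not present.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, properties_text)
    if count == 0:
        raise KeyError(key)
    return new_text


# Hypothetical excerpt; the real file contains other settings as well.
sample = "# made-up excerpt\nSeqType=DNA\n"
print(set_property(sample, "SeqType", "PROTEIN"))
```

Check the downloaded default file first; if its assignments use a different separator or quoting, adjust the pattern accordingly.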
-
Output Files
-
cadpam.properties - This is the actual cadpam.properties
file that was used in this run of the CAD-PAM software. This linked example of
cadpam.properties shows the changes that were made to provide the default
control file that breaks the input sequences into 400 base pair fragments.
You can choose not only from the default list of control files but also from
the most recently used cadpam.properties file for any run series you have
previously run. You can also upload edited files and recover them later by
using the same run series identifier you uploaded them with.
-
cadpam.txt - This file contains the standard output produced by the
CAD-PAM software. This is also shown on the web page and includes information
about what sequence positions are covered by each fragment as well as
the universal primer sequences if you have chosen to use these on the ends
of each fragment.
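To get a feel for the position ranges reported per fragment, the sketch below lists the ranges that a plain split into consecutive, non-overlapping pieces would give. This is an assumption made for illustration; the actual rule CAD-PAM uses to place fragment boundaries is not described here and may differ. fragment_ranges is a hypothetical helper, and the default of 400 mirrors the 400 base pair default control file.

```python
def fragment_ranges(sequence_length, fragment_size=400):
    """Return 1-based inclusive (start, end) positions of consecutive fragments.

    Assumes simple non-overlapping fragments; the last one may be shorter.
    """
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    ranges = []
    for start in range(0, sequence_length, fragment_size):
        end = min(start + fragment_size, sequence_length)
        ranges.append((start + 1, end))
    return ranges


print(fragment_ranges(950))
```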
-
chipProduction.txt - This file lists the oligonucleotides to be
created for the synthesis of each fragment of the input sequences.
-
rawchipProduction.txt - The list of construction oligos that can be used to synthesize the requested sequences, with each oligo alone on its own line and separate header lines that identify which sequence in the input list and which fragment the oligos are for.
-
chipQA.txt - The list of oligos for a QA chip that can be used to help identify construction oligos that have problems in the synthesis process.
-
rawchipQA.txt - The same list of QA chip oligos in raw format, with each oligo alone on its own line and separate header lines that identify which sequence in the input list and which fragment the oligos are for.
-
chipSelectionA.txt, chipSelectionB.txt - The Selection files list the oligos for the chips that are used, as described in the paper, for hybridization selection of the construction oligos to reduce error rates.
-
rawchipSelectionA.txt, rawchipSelectionB.txt - The same Selection files presented in raw format.
-
full_sequences.txt - This file enumerates the fragments the
input sequences were broken into. It looks like FASTA format but does not
observe the restriction of 80 characters per line of the file. The CAD-PAM
software also tolerates input FASTA-formatted files that ignore that
restriction.
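If another tool does insist on the line-length restriction, the sequence lines can be rewrapped. This is a minimal sketch that assumes header lines start with ">" as in ordinary FASTA; wrap_fasta is a hypothetical helper, not something CAD-PAM provides.

```python
def wrap_fasta(text, width=80):
    """Rewrap the sequence lines of FASTA-like text to at most `width` characters."""
    out = []
    chunks = []

    def flush():
        # Join the collected sequence lines and cut them into fixed-width pieces.
        sequence = "".join(chunks)
        chunks.clear()
        for i in range(0, len(sequence), width):
            out.append(sequence[i:i + width])

    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            flush()
            out.append(line)  # header lines are kept unchanged
        elif line:
            chunks.append(line)
    flush()
    return "\n".join(out) + "\n" if out else ""


print(wrap_fasta(">frag1\n" + "a" * 100 + "\n"))
```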
-
info.txt - This file provides additional details about the oligos generated
in this run.
-
*.seq - This file has a name based on the run identifier you used with the file extension ".seq" and it contains the input file that you provided for this run.
-
*.tar.gz - This file also has a name based on the run identifier you used followed by ".tar.gz". It is a compressed tar file that you can download
and unpack with the Unix tar command or the Windows WinZip utility to create your own copy of this results directory and all the files in it except for the tar file itself.
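As an alternative to the tar command or WinZip, the archive can be unpacked with Python's standard tarfile module. The following is a minimal sketch; extract_results is a hypothetical helper, and any archive name passed to it stands in for whatever name your run produced.

```python
import tarfile
from pathlib import Path


def extract_results(archive_path, destination):
    """Unpack a .tar.gz results archive into `destination`; return member names."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(destination, **kwargs)
    return names
```

A call such as extract_results("feb2-3.tar.gz", "cadpam_results") would unpack into a local cadpam_results directory; both names here are examples only.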
Please contact George Church for more information, or with any questions, comments, or concerns.
Copyright (c) 2006 by Wayne Rindone and the President and Fellows of Harvard University