Example of upstream region prediction. If the gene lies within an operon, its promoter could lie several genes upstream, so we must include several possible intergenic regions. Here an operon is defined as two tandem genes separated by less than a certain cutoff distance. We include up to 300 bp of noncoding sequence directly upstream of the head of the predicted operon, as well as the entire sequence of all intergenic segments longer than 20 bp between the gene of interest and the operon head. This figure shows the predicted upstream region for gene F.

First, the algorithm checks the length of the upstream region for the gene in question. If an intergenic region is shorter than the distance cutoff, then the entire intergenic region is stored for motif-finding and the next intergenic region further upstream is considered as well. This continues until an intergenic region is encountered that is either divergently transcribed or longer than the distance cutoff.
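
The walk described above can be sketched in Python as follows. This is a minimal illustration, not the program's actual code: the Gap record, the function name upstream_segments, and its parameter names are hypothetical, and the sketch assumes the intergenic regions have already been listed in order, starting at the gene of interest and moving upstream. The 300 bp and 20 bp defaults are the values quoted above; the distance cutoff has no default because no value is given here.

```python
from typing import List, NamedTuple, Tuple


class Gap(NamedTuple):
    """Hypothetical record for one intergenic region, walking upstream."""
    length: int        # bp of noncoding sequence in this region
    same_strand: bool  # False if the region is divergently transcribed


def upstream_segments(
    gaps: List[Gap],
    distance_cutoff: int,
    max_upstream: int = 300,
    min_segment: int = 20,
) -> List[Tuple[int, int]]:
    """Return (gap_index, bp_kept) pairs for the regions stored for motif-finding.

    gaps[0] is the region directly upstream of the gene of interest,
    gaps[1] the next one further upstream, and so on.
    """
    kept = []
    for i, gap in enumerate(gaps):
        within_operon = gap.same_strand and gap.length < distance_cutoff
        if not within_operon:
            # Operon head reached: keep at most max_upstream bp and stop.
            kept.append((i, max(0, min(gap.length, max_upstream))))
            break
        # Short region inside the operon: keep it whole if it is long
        # enough to be worth searching, then continue upstream.
        if gap.length > min_segment:
            kept.append((i, gap.length))
    return kept


# Usage: two short same-strand regions, then a long one at the operon head.
example = [Gap(30, True), Gap(10, True), Gap(500, True)]
segments = upstream_segments(example, distance_cutoff=50)
```

With these made-up numbers, the 30 bp region is kept whole, the 10 bp region is skipped because it is not longer than 20 bp, and the 500 bp region at the operon head is trimmed to 300 bp.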

Parameters:

The upstream region extraction program also has an option for "no operon prediction". The only parameter that you need in this case is the exact length to take upstream of each gene. This is useful if you simply want to know the upstream region of a particular gene or if you already know the operon structure.
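
As a rough illustration of the "no operon prediction" mode, the sketch below takes a fixed length upstream of a single gene. The function name fixed_upstream and the coordinate convention (0-based, half-open positions on the forward strand of a linear sequence) are assumptions made for this example, not the program's interface. The sketch also assumes that a window running past the end of the sequence is simply truncated.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def fixed_upstream(genome: str, start: int, end: int, strand: str, length: int) -> str:
    """Return `length` bp upstream of a gene, read 5' to 3' on the gene's strand.

    start/end are 0-based, half-open forward-strand coordinates (an assumption
    of this sketch). The window is truncated at the ends of the sequence.
    """
    if strand == "+":
        # Upstream of a forward-strand gene lies before its start.
        return genome[max(0, start - length):start]
    if strand == "-":
        # Upstream of a reverse-strand gene lies after its end on the
        # forward strand, so take that window and reverse-complement it.
        window = genome[end:end + length]
        return window.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")


# Usage on a toy sequence with a gene at positions 4..6.
toy = "ACGTACGGTT"
plus_region = fixed_upstream(toy, 4, 7, "+", 3)
minus_region = fixed_upstream(toy, 4, 7, "-", 3)
```

For a reverse-strand gene, the returned sequence is reverse-complemented so that it reads in the same orientation as the gene, which is usually what a motif-finder expects.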