Predicting DNA binding sites in E. coli: Choosing over-represented clusters of matrix search hits


We used recognition matrices constructed by aligning footprinted binding sites to make predictions about additional genes regulated by known transcription factors. Because many E. coli transcription factors bind to two or more closely spaced motifs within a single upstream region, we searched for over-represented spacings between pairs of matrix search hits for 49 known transcription factors in the E. coli genome. The most over-represented pairings correspond to the top predictions. We also examined spacing patterns between matrix search hits for all possible combinations of pairs of search matrices for different transcription factors and generated a ranked list of predicted transcription factor binding sites and possible interactions between transcription factors.

Bulyk, M. L., McGuire, A. M., Masada, N., Church, G. M. A motif co-occurrence approach for genome-wide prediction of transcription-factor binding sites in Escherichia coli. Genome Res. 14(2): 201-8, 2004.
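
As a rough illustration of the spacing analysis summarized above, the following Python sketch counts the distances between pairs of matrix search hits that fall in the same upstream region and ranks each spacing by how often it occurs relative to a uniform expectation. It is a simplified stand-in, not the statistic used in the paper: the function names, the `max_spacing` window, the uniform background model, and the hit positions are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def spacing_counts(hits_by_region, max_spacing=100):
    """Count start-to-start distances between every pair of hits in a region.

    hits_by_region maps a region name to a list of hit start positions (bp).
    Only distances in the range 1..max_spacing are kept.
    """
    counts = Counter()
    for positions in hits_by_region.values():
        for a, b in combinations(sorted(positions), 2):
            distance = b - a
            if 0 < distance <= max_spacing:
                counts[distance] += 1
    return counts


def rank_spacings(counts, max_spacing=100):
    """Rank spacings by observed count divided by a uniform expectation.

    Assumption: under the null model every spacing from 1 to max_spacing
    is equally likely. This is an illustrative background, not the
    paper's statistical model.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    expected = total / max_spacing
    ranked = [(d, n, n / expected) for d, n in counts.items()]
    # Highest enrichment first; ties broken by the smaller spacing.
    ranked.sort(key=lambda item: (-item[2], item[0]))
    return ranked


# Hypothetical hit positions (bp from the start of each upstream region).
# These are made-up numbers, not real E. coli data.
hits = {
    "regionA": [10, 31, 80],
    "regionB": [5, 26],
    "regionC": [40, 61, 95],
}

counts = spacing_counts(hits)
ranked = rank_spacings(counts)
for distance, observed, enrichment in ranked:
    print(f"spacing={distance:3d}  observed={observed}  enrichment={enrichment:.1f}")
```

In this toy input a spacing of 21 bp recurs in all three regions, so it ranks first. Applying the same counting to hits from two different matrices, rather than one, would give the analogous ranking for pairs of transcription factors.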

Tab-delimited files containing lists of pairings between transcription factors:

  • Predictions based on individual spacings.
  • Predictions based on spacing bins.
  • All predictions in one table.

    An explanation of the file formats is available.

    Abigail Manson McGuire
    Genetics Department
    Harvard Medical School
    200 Longwood Ave.
    Boston, MA 02115
    Telephone: (617) 432-4136