John Aach, Wayne Rindone, and George Church
Department of Genetics, Harvard Medical School, 200 Longwood Ave., Boston, MA, 02115, USA
and
Lipper Center for Computational Functional Genomics, 77 Avenue Louis Pasteur, Boston, MA, 02115, USA
ExpressDB can also be accessed in other ways than the EXD application described in the text. First, we developed a tool for bulk batch database extracts that was used in the generation of the estimated relative abundance file. Second, users familiar with Structured Query Language (SQL) can compose custom queries and have them submitted directly to the database. As a case in point, a query was submitted to determine which time point of the seven provided in the Chu_spo set of experiments (see Table 1) is closest to the initial time point, where closeness was measured as the mean square difference (MSD), over all ORFs, of time point to control condition mRNA expression ratios compared to the ratios for the initial time point. The database returned the reasonable result that expression profiles of cells undergoing sporulation get progressively more distant from the initial profile for the early time points (note, however, time point 7): (time in hours, MSD) = (0.5, 0.764), (2, 0.869), (5, 2.988), (9, 4.916), (7, 10.095), (11.5, 13.358). NOTE: This computation did not exclude values called in error.
Please contact John Aach if you are interested in submitting SQL directly to ExpressDB.
use ExpressDB go set statistics time on go -- experiment to see if one can design SQL that compares two measures select MeasureName=substring(M2.MeasureName,1,30), RMS=avg(power(convert(real,E1.Value-E2.Value),2)) from Measure M1, Measure M2, EDPD E1, EDPD E2 where M1.MeasureName = "R/G:0" and M2.MeasureNo != M1.MeasureNo and M2.MeasureName like "R/G:%" and E1.MeasureNo = M1.MeasureNo and E2.MeasureNo = M2.MeasureNo and E1.ORFNo = E2.ORFNo and E1.DataPointNo = E2.DataPointNo group by M2.MeasureName order by RMS go
1>; use ExpressDB 1>; set statistics time on Execution Time 0. SQL Server cpu time: 0 ms. SQL Server elapsed time: 1094018792 ms. 1> -- experiment to see if one can design SQL that compares two measures 2> select MeasureName=substring(M2.MeasureName,1,30), 3> RMS=avg(power(convert(real,E1.Value-E2.Value),2)) 4> from Measure M1, Measure M2, EDPD E1, EDPD E2 5> where M1.MeasureName = "R/G:0" 6> and M2.MeasureNo != M1.MeasureNo 7> and M2.MeasureName like "R/G:%" 8> and E1.MeasureNo = M1.MeasureNo 9> and E2.MeasureNo = M2.MeasureNo 10> and E1.ORFNo = E2.ORFNo 11> and E1.DataPointNo = E2.DataPointNo 12> group by M2.MeasureName 13> order by RMS Parse and Compile Time 0. SQL Server cpu time: 0 ms. MeasureName RMS ------------------------------ -------------------- R/G:0.5 0.764014 R/G:2 0.869452 R/G:5 2.987885 R/G:ndt80-delete-early 3.662190 R/G:gal-ndt80 4.765511 R/G:9 4.916045 R/G:ndt80-delete-mid 5.363342 R/G:7 10.094530 R/G:11.5 13.358002 Execution Time 13920. SQL Server cpu time: 1392000 ms. SQL Server elapsed time: 2292063 ms. (9 rows affected)
ExpressDB took ~ 38.2 minutes to process this query. This reflects the size of the database at the time of submission (~17.5 M "data point" records), as well as other activity on the system at the time. Early in the loading of ExpressDB, the above query took ~ 3 minutes.