Accessing Databases
There is a large and ever-changing number of methods for accessing these databases. However, they fall into several major classes:
E-mail servers
- Using an E-mail server is easy: you send a formatted message to a special Internet E-mail address, and the server returns the sequences (or search results). The major drawbacks are:
- Difficult to integrate into other programs
- Mailbox may overflow with large search results
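As a rough illustration of the "formatted message" idea, the sketch below builds such a request with Python's standard `email` package. The server address, the subject line, and the `GET <accession>` command syntax are all hypothetical placeholders; each real E-mail server defines its own address and message format, so consult its help text before adapting this.

```python
from email.message import EmailMessage

# Hypothetical address: substitute the one published by the server you use.
SERVER_ADDRESS = "retrieve@seqserver.example.org"


def build_request(sender, accessions):
    """Build (but do not send) a retrieval request for the given accessions."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg["Subject"] = "sequence request"  # hypothetical; many servers ignore it
    # Hypothetical command syntax: one "GET <accession>" line per sequence.
    body = "\n".join(f"GET {acc}" for acc in accessions) + "\n"
    msg.set_content(body)
    return msg
```

The message object could then be handed to `smtplib.SMTP(...).send_message(msg)` using whatever mail relay is available locally. Building the request programmatically is straightforward; the harder part, as the drawbacks above suggest, is collecting and parsing the reply that later arrives in a mailbox.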
Internet servers
- A growing number of services allow access to sequence data over Internet links, using either standard or specialized tools.
- Entrez
- Entrez is a combined bibliographic, protein sequence, and nucleotide sequence database maintained by the National Center for Biotechnology Information (NCBI). A major advantage of Entrez is the interconnection between the various databases: you can move quickly from a sequence to its reference to another sequence. Another central concept in Entrez is neighboring, the grouping together of sequences and references by computed similarity scores. Entrez has sprouted several variants. First, the Entrez database can be accessed either from CD ROM or over the Internet. Second, Entrez can be used through a custom graphical interface client, a World Wide Web browser, a command-line browser (CLEVER), or NCBI's toolkit, which is written in C.
- World Wide Web / Gopher
- Many biosequence databases are available as hypertext on the World Wide Web, or as flat files from Gopher servers.
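The neighboring concept mentioned for Entrez, grouping records by computed similarity scores, can be pictured with a toy example. The sketch below is not how Entrez computes its scores (that method is not described here); it assumes a deliberately crude score, the fraction of identical positions with no alignment, and a hypothetical threshold of 0.8, just to show how a neighbor table could be built once and then looked up.

```python
from itertools import combinations


def identity(a, b):
    """Toy score: matching positions divided by the longer length (no alignment)."""
    if not a and not b:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def neighbors(seqs, threshold=0.8):
    """Map each sequence name to the names scoring >= threshold, best first."""
    table = {name: [] for name in seqs}
    for (n1, s1), (n2, s2) in combinations(seqs.items(), 2):
        score = identity(s1, s2)
        if score >= threshold:
            table[n1].append((score, n2))
            table[n2].append((score, n1))
    # Sort by descending score, breaking ties by name for a stable order.
    return {
        name: [n for _, n in sorted(hits, key=lambda h: (-h[0], h[1]))]
        for name, hits in table.items()
    }
```

With three made-up sequences where only the first two are similar, the table links those two to each other and leaves the third with no neighbors. Moving "from a sequence to another sequence" then amounts to a dictionary lookup rather than a fresh search.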
Local copy of database
- You can, of course, maintain a local copy of a database. This is particularly advantageous for intensive users, as it keeps the network from becoming the limiting factor. Most databases can be obtained by anonymous FTP; many can also be obtained on CD ROM. The major drawbacks are:
- Requires allocation of local storage space
- Requires local access software, which may be expensive
- Keeping up to date may become a major time sink
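Fetching a database by anonymous FTP, as described above, can be scripted so that refreshing the local copy is less of a time sink. The following is a minimal sketch using Python's standard `ftplib`; the host name and remote paths are left as parameters because they differ for every database, and the helper names `fetch_file` and `mirror` are made up for this illustration.

```python
from ftplib import FTP
from pathlib import Path, PurePosixPath


def fetch_file(ftp, remote_path, local_dir):
    """Download one file over an already-connected FTP session into local_dir."""
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the file name; remote FTP paths use forward slashes.
    target = local_dir / PurePosixPath(remote_path).name
    with open(target, "wb") as out:
        ftp.retrbinary(f"RETR {remote_path}", out.write)
    return target


def mirror(host, remote_paths, local_dir):
    """Log in anonymously to host and download every path in remote_paths."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return [fetch_file(ftp, path, local_dir) for path in remote_paths]
```

A call would look like `mirror(host, ["/path/on/server/file.dat"], "db")`, with the host and path taken from the database's own distribution notes. This sketch downloads files unconditionally; comparing sizes or timestamps before fetching would reduce the transfer and storage cost listed among the drawbacks.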
This document is intended to serve as a guide to using certain bioinformatics programs. It cannot be guaranteed to be free of errors or completely up-to-date. If you know of errors or other shortcomings of this document, please mail them to Keith Robison (Church Lab, HMS Genetics)
KRobison@nucleus.harvard.edu