Personal Genome Project

Personal Genome Project (PGP)

(backup/mirror PGP site)

** News **
Nov-2007 Advice and Financial support
Oct-2007 PGP volunteer scale-up plans submitted for approval
Dec-2006 First PGP EBV-cell lines are available from Coriell.
Oct-2006 Protein coding regions (1% of the genome) sequencing strategy adopted as a PGP milestone
Sep-2006 PGP research subject volunteers

Background information. The goal of this project is to develop affordable "personal genome sequences" and a variety of user-friendly applications of such data. This web page focuses on the practical issues of recruiting and informing volunteers for the (research-only, password-protected) PGP, with frank discussion of "open" options -- a testbed for personalized medicine and new ways of interfacing with the research subjects.

See these links for more background on potential technologies and societal implications: 2006 Scientific American "Genomes for all"; Nature Reviews Genetics article; Dec-2005 Nature MSB editorial; ELSI news & resources; dbGaP; 2007 Edge World Question; Question of the Year.

Potential benefits. Some examples of benefits to society are described in the review article above. This project is more holistic and personalized (NOT anonymous, generic parts, or one-size-fits-all) than most biomedical research because integration of whole-body systems biology was problematic under previous informed-consent practices. In addition, benefits to the individual (which would hopefully be initially perceived as minor in order to reduce coercive motivations) might include (1) early-adopter status, (2) self-curiosity, (3) suggestions for diagnostic tests. It should be emphasized that, at the beginning, while the sample size is small, we should not expect to make statistically significant associations between genotype and phenotype, but rather to generate long lists of hypotheses and develop advanced data integration tools (especially for common, 'obvious' traits).
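To illustrate why a small initial cohort cannot be expected to yield statistically significant associations, the following minimal Python sketch computes the best-case one-sided Fisher exact p-value for a single variant that separates cases from controls perfectly. The function names, the even case/control split, and the genome-wide threshold of 5e-8 (a commonly used convention, not a PGP figure) are assumptions made for illustration only.

```python
from math import comb

# Assumed genome-wide significance threshold (a common convention,
# not a figure taken from the PGP).
GENOME_WIDE_ALPHA = 5e-8


def best_case_p_value(n_cases: int, n_controls: int) -> float:
    """One-sided Fisher exact p-value when every case carries a variant
    and no control does (the most favorable outcome possible)."""
    # With fixed margins, only one table is this extreme, so the p-value
    # is the hypergeometric probability of that single table.
    return 1 / comb(n_cases + n_controls, n_cases)


def smallest_balanced_group(alpha: float = GENOME_WIDE_ALPHA) -> int:
    """Smallest per-group size at which even a perfect association
    could fall below alpha."""
    n = 1
    while best_case_p_value(n, n) >= alpha:
        n += 1
    return n


if __name__ == "__main__":
    # Ten volunteers split evenly into five cases and five controls.
    p = best_case_p_value(5, 5)
    print(f"best-case p-value with 10 volunteers: {p:.4f}")
    print(f"below genome-wide threshold: {p < GENOME_WIDE_ALPHA}")
    print(f"per-group size needed in the best case: {smallest_balanced_group()}")
```

Real genotype-phenotype effects are far weaker than this perfect-separation case, so the cohort sizes needed in practice would be much larger than the best-case figure the sketch reports.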

Potential risks. Volunteers should be aware of the ways in which knowledge of their genome and phenotype might be used against them. For example, in principle, anyone with sufficient knowledge could take a volunteer's genome and/or open medical records and use them to (1) infer paternity or other features of the volunteer's genealogy, (2) claim statistical evidence that could affect employment or insurance for the volunteer, (3) claim relatedness of the volunteer to infamous villains, (4) make synthetic DNA corresponding to the volunteer and plant it at a crime scene, (5) reveal a disease lacking a current cure. (Note that this last example does not necessarily imply only helpless waiting, e.g. the affected individual can become an advocate for research on that disease.) The genetic information posted here, while directly associated only with the primary subject, may also have relevance to family members. Some may feel that the risk to the primary subjects is small, since they are recruited as healthy individuals, and the risk to relatives smaller still, but this evaluation should involve detailed discussions. (We note that most "healthy" individuals have some medical problems past, present, or future. The point is not to exclude anyone "unhealthy", but rather to have the PGP recruit individuals who, in consultation with their family and health care providers, feel that they can give well-informed consent, accepting the risks of revealing whatever medical conditions they might have.) Anything that is later inferred solely from their DNA sequence will be speculative with respect to the primary subjects, and even less predictive with respect to their family, since inheritance of nearly all alleles is 50:50 random.
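The point that DNA-based inferences are less predictive for family members can be made concrete with a small calculation. The sketch below assumes a simplified model in which each meiosis passes on a given allele with probability 1/2 along a single line of descent, ignoring inbreeding and the chance that the relative carries the same allele from another ancestor. The function names are hypothetical.

```python
import random


def prob_relative_carries(meioses: int) -> float:
    """Probability that one specific allele of the subject is also carried
    by a relative separated by `meioses` transmissions on a single lineage
    (1 = child or parent, 2 = grandchild or grandparent, and so on)."""
    return 0.5 ** meioses


def simulate(meioses: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo check: follow one allele through repeated 50:50
    transmissions and report how often it reaches the relative."""
    rng = random.Random(seed)
    carried = 0
    for _ in range(trials):
        # The allele reaches the relative only if it is passed on at every step.
        if all(rng.random() < 0.5 for _ in range(meioses)):
            carried += 1
    return carried / trials


if __name__ == "__main__":
    for meioses, label in [(1, "child"), (2, "grandchild"), (3, "great-grandchild")]:
        exact = prob_relative_carries(meioses)
        approx = simulate(meioses)
        print(f"{label}: exact {exact:.3f}, simulated {approx:.3f}")
```

Under this model the chance that a relative carries any particular allele of the primary subject halves with each generation of separation, so an inference that is already speculative for the subject becomes weaker still for more distant relatives.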

Informed consent. The potential advantages of having large sets of phenotype and genotype information in the public domain ideally would greatly exceed potential harm to the volunteers involved. (See note on challenges to "anonymous" methods.) A diverse set of ten volunteers will initially (Aug 2006) come from the same well-informed communities as the advisory board (requested by the HMS IRB as "master's level or equivalent training in genetics or equivalent understanding of genetics research") and will need to prove their knowledge of all of the relevant aspects, in particular the risks described above. We will try to minimize coercion, including lures of personal benefit to the volunteers, so that motivations will be confined to curiosity, science, and eventual societal impact. Similarly, potential volunteers would not be employees or students of the PGP leaders. Women and under-represented minorities are especially encouraged to apply. Participants are not encouraged to donate funds to the PGP, as the research component should not resemble a service relation. Fund-raising is considered a completely separate activity.

Advisory board. As the PGP technology is advancing rapidly, we would like to begin discussing the best ways to recruit people to have their genome sequenced (in part or whole). This web page is a work in progress. Volunteers for designing this plan as well as potential volunteers for genome sequencing are welcome. This may involve ethicists, attorneys, database-security experts, and people with experience in medical records, management, fund raising, public health, public relations, education, etc.

Commercial vs non-profit. If the Human Genome Project (HGP) and HapMap are taken as examples, there will be strong motivation to make the results of a PGP-like project freely accessible (even if it starts as "privately owned"). Conversely, if it starts as a public project, then the appearance of companies which offer related services may be considered a sign of success of a PGP. This web page represents an initial attempt at a non-profit model. The "openness" of the PGP technology and volunteers may inspire more support and new approaches. A non-profit PGP may inspire more trust from participants.

Costs. While the PGP costs are dropping very rapidly, the costs of research and development and of sequencing volunteer genomes at each stage deserve support. Production costs per subject range from $1K for a limited subset of the genome to over $200K to cover a significant fraction of their DNA. Current R&D efforts include: (1) higher accuracy, (2) lower costs, (3) user-friendly software, (4) security for private genomes, (5) enhanced access for public genomes, (6) statistical association of genotype, environment, and phenotypes, (7) ethical, legal and social impacts (ELSI).
Part of the personal environmental data will entail "pathogen/allergen weather maps" tracking the spread of recurring and emergent infectious agents in analogy to how we currently track weather for broad audiences.
Eventually, to get excellent statistics, the PGP may require millions of volunteers. Possibly after an initial trial with dozens of volunteers, some fraction of this PGP will transition to a more private model, but only if this is the desire of the volunteers and/or scientific communities. If you would like to help raise funds, please contact us.
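For a rough sense of scale, the per-subject figures quoted above can be multiplied out for different cohort sizes. This is a back-of-the-envelope sketch only: it assumes a flat cost per subject with no economies of scale, treats $200K as the upper figure even though the text says "over $200K", and uses illustrative cohort sizes of 24 (for "dozens") and 1,000,000 (for "millions").

```python
# Per-subject production costs in US dollars, taken from the text above.
COST_SUBSET = 1_000       # limited subset of the genome
COST_FRACTION = 200_000   # significant fraction (the text says "over" this)


def cohort_cost(volunteers: int, cost_per_subject: int) -> int:
    """Total production cost assuming a flat cost per subject."""
    return volunteers * cost_per_subject


if __name__ == "__main__":
    # Cohort sizes are illustrative assumptions, not PGP plans.
    for volunteers in (10, 24, 1_000_000):
        low = cohort_cost(volunteers, COST_SUBSET)
        high = cohort_cost(volunteers, COST_FRACTION)
        print(f"{volunteers:>9,} volunteers: ${low:,} to ${high:,}")
```

At a million volunteers this flat-cost estimate runs from $1 billion to $200 billion, which illustrates why lower cost per genome appears among the R&D efforts listed above.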

Volunteers & suggestions: Please contact us at the following email address: DNA It would be helpful if you mentioned your fields of interest or expertise and whether you are interested in being a donor of advice or funds. Please allow a week for a response. Also note the U.S. Surgeon General's Family History Initiative.

The Human Study Protocol was submitted 16-Sep-2004 and the revised protocol was approved by the HMS IRB on 31-Aug-2005. It was approved by the Partners HealthCare IRB on 18-May-2007 and the MIT IRB on 12-Apr-2007. Non-government funding sources: COUQ Foundation, Harvard Royalties, and the Broad Inst. of Harvard & MIT.

Available on request: consent form, protocol, and questions and answers from the HMS-IRB and NIH-NHGRI approval processes, as is HMS-IRB approval # M11665-101,102,103,105,106,109,110. Latest approval valid through Aug 2007.

While the PGP is unique in its full human genome sequencing goals and open source nature, here are other examples of "Personal Genomics":
The Personal Genome page, DNAdirect.
Genographic project: Over 120,000 participants (2005)
Swedish Twin Registry: 140,000 twins.
: 100,000 volunteers in Iceland ("over half of the adult population").
Genomics Collaborative, Inc.: over 120,000 patients.
Sorenson Molecular Genealogy Foundation: 50,000 participants.

Other related volunteer efforts
Software: Linux, GNU/FSF, PGPi
Wikipedians: over 3,000,000 user accounts.
SourceForge.Net: 1,260,983 registered users.
Red Cross Red Crescent: 95 million volunteers.
Nurses Health Study: 122,000 participants.
Women's Health Initiative: 161,000 participants.

Related work: Ting Wu Lab, I2B2, HPCGG, GeneWatch, Genetic Screening Study Group (GSSG).

Updated 29-Aug-2007 (original Dec 2003) by GMC.