Here Is a Human Being
Authors: Misha Angrist
It was at Harvard that George met his future wife and Harvard colleague Ting Wu, also then a graduate student in genetics. The two were in a class together on the structure of chromatin, the essential but still somewhat mysterious molecular scaffolding found in chromosomes, an irony not lost on her. “Chromatin is one of the aspects of inheritance not entirely coded for by DNA.”
26
Her first impressions? “I remember noticing that this guy knew everything. Someone mentioned that an interesting article had just come out but couldn’t remember what journal or who the authors were. George always knew: ‘It’s in PNAS, volume such and such, page so and so.’ But when he said those things it would never come off in a know-it-all kind of way. He was just a nice guy. And even when I first met him he was fascinated by sequencing.”
27
“When I got to Wally’s lab in ’77 I was already thinking about it,” George said. “It just wasn’t rational thought.”
28
The first quasi-complete sequence assembled by the public Human Genome Project (HGP) in 2003* was actually a composite of several anonymous people who were recruited via ads in the Buffalo News in 1997.
29
One sample known as “RP11” appears to have been overrepresented (George told me that reporters have since gone to Buffalo to try to learn the identity of RP11
30
). In parallel to the publicly funded HGP, Craig Venter used his own DNA as part of a private sequencing effort led by the company he once ran, Celera,
31
and subsequently by his own research institute. Venter’s entire sequence was published in 2007
32
and represented the first complete human reference genome from a single individual.*
In any case, depending upon when one marks the start date, obtaining that first composite sequence took something like thirteen years and $3 billion in public funds plus the hundreds of millions spent by Celera and its stockholders.
Until 2006 or so, few gave the idea of personal genomics much consideration, rational or otherwise. Yes, sequencing had already gotten much cheaper by the early 2000s, but until about 2005 a human genome was still going to cost you tens of millions of dollars.
Since his arrival in Harvard Yard in the late 1970s, Church has spent his time there (and at various companies he’s helped found) looking for ways to make sequencing cheaper and easier. In 2003, Genome Technology magazine suggested he would make “a good candidate for a lifetime achievement award in genome sequencing.”
33
He and later his small army of students and postdocs found clever chemical shortcuts and cost-savers. Over the last few years, much of his lab’s attention has been on polymerase colonies, or “polonies,” a method that uses enzymes to amplify billions of short DNA fragments and stitch those together into a form that can be sequenced.
34
Polony technology has since been licensed to several companies. As we’ll see, Church has teamed up with an engineering firm to make and sell a polony sequencer (“the Polonator”) for significantly less than other companies in the sequencing business charge for their machines.
35
Among DNA sequencers, it was and is a thing of beauty: even the beta testers who couldn’t get it to work admired its aesthetic qualities. The Polonator has a robotic arm inside humming along and dispensing chemicals, and a platform that holds the slide upon which sits the DNA. An expensive camera takes pictures of the samples. The whole apparatus sits inside a slightly Daliesque warped blue cube about the size of a small clothes dryer.
For the last few years sequencing costs have been in free fall, thanks in no small part to George’s work. In 2007 he told me he expected that soon his lab would be able to use the Polonator to sequence the entire human exome, that is, the protein-coding 1 percent of the human genome, for as little as a thousand dollars. In reality this was still more than two years away, but I came to learn that this type of error is a chronic occupational hazard for an incurable optimist such as George.
The implications of an affordable exome and/or genome left him both excited and unsettled. And he wasn’t the only one. He forged ahead with the PGP, but was forced to use private money. He believed the National Institutes of Health (NIH) was balking at paying for the project (despite having approved it and every other aspect of his $10 million genome technology grant) because he refused to do it under the standard ethical paradigm set forth by the agency, that is, one in which subjects give informed consent and in return are promised, more or less, privacy and confidentiality.
36
“I just feel uncomfortable signing up people under the supposition of anonymity if that’s not something that can be assured.”
37
Why couldn’t it be assured? On his website, George offered a laundry list of real-life scenarios where presumptively anonymous subjects were reidentified without their consent.
38
Some of these became infamous stories. In the 1990s, for example, then-MIT grad student Latanya Sweeney used publicly available voter records and a public, anonymized database of state employees to identify the medical records of Massachusetts governor William Weld. She was also able to identify the five African-American women living in the predominantly gay enclave of Provincetown, Massachusetts, purely on the basis of public data.
39
Another example: a few years ago, a fifteen-year-old boy used a combination of a commercial DNA test of his Y chromosome, genealogical records, and Internet searches to locate his “anonymous” sperm donor father.
40
In George’s eyes, these types of privacy-hacker stories would only become more common as genomic data proliferated. DNA is the ultimate digital identifier, after all—a social security number is only nine digits, while a genome is three billion. Not that anywhere near that number would even be necessary to hack one’s identity: a paper in Science suggested that as few as eight genetic markers could be considered a risk for reidentifying humans.
41
Twenty-five such markers would likely be fully identifying, akin to the thirteen forensic DNA markers typically typed from crime scene samples.*
And in 2008, researchers from the Translational Genomics Research Institute and University of California at Los Angeles were able to identify individual DNA samples from a complex mixture of samples from as many as two hundred people, even if a particular individual’s sample accounted for only one-tenth of 1 percent of the total DNA mixture.
42
In response the NIH immediately backpedaled on its promise to facilitate widespread sharing of human DNA samples among the studies it funded.
43
George saw this coming. Given the technical realities, he said, to promise privacy and confidentiality would be disingenuous at best.
44
And there’s another reason not to do it, he said: it makes for bad, or at least limited, science. Ensuring anonymity, assuming it can be done at all, means restricting the use of any phenotypic information that could be used to reidentify a subject. But it is exactly those sorts of unique bits of data on human traits—hair color, eye color, facial features, cognitive measures—that George saw as necessary to fully leverage whole genomes and really begin to understand the human gestalt. “Some people look at a person’s face and think they can tell everything about his past, present and future,” he said. “That’s not true, of course. But when you ask who you are, a huge fraction is what your face is. So take the face plus the genome, plus metabolites, plus proteomics, and then you’re starting to get something that is without parallel in current practice.”
45
Giving broad access can have practical benefits, too. One of George’s favorite stories—recounted on his website, naturally
46
—is from 2004. He was preparing to give a lecture in Seattle at the University of Washington Medical School when a hematologist in the audience raised his hand and said, “You really ought to get your cholesterol checked.” The hematologist had looked at George’s personal medical records Web page and seen that his total cholesterol was 288 mg/dL (the normal level is less than 200). The 288 measurement was more than a year old and George had not been back to his doctor to see if the statin he was taking was having the desired effect. It turned out it was not. His doctor doubled the dose and George went back to a strict vegan diet. In six weeks his cholesterol had fallen to 156 mg/dL. What struck George was that this interaction with a total stranger had had a tangible positive effect on his health. “That total-stranger expert will eventually be replaced by software,” he predicted.
47
For that to happen and for it to be of practical value will require both genetic data and health records. George’s Harvard colleague Isaac “Zak” Kohane, a pediatrician and champion of electronic health records, told me essentially the same thing as every other genome scientist I spoke to: “Without the phenotype, the genome is just not that useful.”
48
But even ambitious genome sequencing efforts such as the internationally sponsored “1000 Genomes Project” were not prepared to collect detailed trait data on their subjects, preferring instead to use “old” DNA samples from “anonymized” subjects who had consented years ago.
49
Collecting trait data is hard, it’s time-consuming, it’s expensive, and yes, it makes the subjects that much easier to identify.
For the last problem, George’s solution was “open consent.” Why not, he asked, recruit subjects willing to forgo guarantees of privacy and confidentiality—that is to say, people like him? This is why he became Subject Number One in the PGP.
50
Of the first two prominent public genomes, Craig Venter has been open about his sequence, even though it told him that he is at increased risk for Alzheimer’s and macular degeneration.
51
James Watson’s DNA now resides in a public DNA database.
52
Watson’s son Rufus has schizophrenia.
53
When I asked Watson if he consulted with his family members before making his genome public, he shook his head and smiled. “They might have said no.”
54
Indeed, one sometimes wonders what might ultimately be found in Watson’s DNA. The same week in which he publicly implied that Africans were less intelligent than whites and lost his job because of it,
55
he suggested that crystallographer and onetime rival the late Rosalind Franklin had been partially autistic.
56
One wants to kick him under the table or pull him aside and say, “Dude. Stop.” When I asked George about Watson being one of the first complete and public genomes, he gave a less than ringing endorsement. “I don’t think it was an ideal choice. If you’re gonna put all your eggs in one basket, he doesn’t seem like the obvious first basket.”
57
In 2005, the Harvard Medical School Institutional Review Board approved the Personal Genome Project, but only after lengthy discussion.*
Initially, according to rabbi and IRB member Terry Bard, a longtime lecturer on pastoral counseling in psychiatry at Harvard, the Harvard IRB was not even sure the PGP was in its bailiwick. “Everyone, including me,” said Bard, “was scratching their heads and saying, ‘Why is this here?’ A number of members were not convinced—and maybe remain unconvinced—that, as presented, this was actually a research study. Did it meet the basic scientific criteria for research?” George appeared before the IRB twice to convince members that it was indeed research and to seek IRB guidance as to what to include in the protocol and the consent form.
58
At first, George was asked to limit the subject total to one: himself. Bard shepherded him through the consent process. Since George was putting both his genotype and phenotype on the Web and therefore making them available for public scrutiny, Bard said the IRB wanted to gauge the initial experience first. “We wanted to know how it would play in Peoria, so to speak,” said Bard. “Dr. Church made monthly reports back to us on how his decision was being received and what kinds of interactions he had had.”
59
Bard is a short man who, when I met with him in his softly lit office at Beth Israel Deaconess Medical Center, wore a white hospital coat. As a perpetually confused Jew, I was anxious to avail myself of his other skill set. So at the end of our hour-long conversation, I asked him what the halachic view of the PGP might be—how would it be perceived through the lens of Jewish law? He said he’d never really thought about it. “Are you exposing yourself or others to irrevocable harm that can be avoided? If that’s the case, then in the Jewish tradition that’s not permissible. But on the other hand, you know the joke from Rabbi Akiva: ‘All is foreseen and free will is given.’ I don’t have an answer. You’re never gonna find anything that two Jews agree upon except that some third person should give to charity.”
60
Eventually the IRB agreed to admit two more people but still wanted to proceed with extreme caution. Each of the first ten participants would have to have a master’s degree in genetics or equivalent. Presumably those people could be expected to understand the risks of open consent. Or to put it more bluntly, as one bioethicist familiar with the PGP said to me, “If we have highly educated ‘altruists’ willing to take a hit and potentially go through life without insurance, then we should explore the unintended consequences with them because at least they could afford it.”
61
In May 2006 the National Human Genome Research Institute asked for a “single coherent document” that would put forward a scholarly presentation of the PGP and address the attendant ethical, legal, and social issues (“ELSI”).
62
George assembled a virtual collection of seventeen bioethics and legal experts (including, I should say, one of my colleagues at Duke, Bob Cook-Deegan, and me). With Dutch doctoral candidate Jeantine Lunshof and Harvard Law student Dan Vorhaus (both have since graduated), George drafted a white paper that tried to make the two cases I articulated above, namely 1) promising genomic privacy is dangerous and unrealistic, and 2) the only way to fully exploit genomic data is by integrating it with other biological and phenotypic data, some of which are intrinsically identifying (faces, for example).