Here Is a Human Being

Authors: Misha Angrist



“Watson may not know his APOE genotype, but I do,” Cariaso told McGuire in front of the stunned crowd. “And if anyone else wants to know, the information is still on the [National Center for Biotechnology Information] server.”¹³

He returned to his seat. Someone in the audience, an aghast and agitated Baylor genomicist I imagined, her face pale, marched up to him and began firing questions, each of which Cariaso answered with quiet confidence. The facts were inescapable: The Nobel Prize–winning and DNA-discovering source of the second completely sequenced human genome had asked that of his 20,000+ genes, sequence information from just one lousy gene—one!—not be made public. This task was left to the molecular brain trust at Baylor University, one of the top genome centers in the world. Its mission was to keep secret a single genotype from a single gene. But the Baylor team was outfoxed by a thirty-year-old autodidact with a bachelor’s degree who preferred to spend most of his time on the Thai-Burmese border distributing laptops and teaching kids how to program and perform Google searches.

If there were any remaining doubts as to the relatively easy availability of Watson’s APOE status, they were erased a few months later when Australian researchers came to the same conclusion as Cariaso. The title of their paper in the European Journal of Human Genetics said it all: “On Jim Watson’s APOE Status: Genetic Information Is Hard to Hide.”¹⁴

After I got home I asked Mike via email what he made of the minor shitstorm he had started at Marco Island. He wrote back: “The ethical conundrum is: What did Watson intend not to know? Was it: 1. ‘Don’t tell me my APOE sequence’; 2. ‘Don’t tell me my ApoE4 [trait] status’; 3. ‘Don’t tell me anything that might reveal my ApoE4 [trait] status’; or 4. ‘Don’t tell me anything that predicts Alzheimer’s?’

“Number 1 and Number 2 were addressed. Number 4 is impossible, since it’s based on what we might discover tomorrow. Given the best data we have today, we know that Number 3 wasn’t covered due to the high linkage disequilibrium with a distant neighbor [on the same chromosome]. If they’d scrubbed APOE [plus another] 30,000 base pairs on either side, then they would have covered what we know today. But that doesn’t mean tomorrow we won’t learn a new way of determining [APOE genotypes] from some sequence 50,000 base pairs away or even on a different chromosome. It’s tough to guard against the future.”¹⁵

Mike and his friend Greg Lennon run SNPedia, a wiki-based website that is in some ways the do-it-yourself version of 23andMe. Sort of. Mike, Greg, and anyone else who wants to can “simply” dig through the human genetics literature and look for associations between genetic variants and human traits. They catalog these and write brief narrative descriptions of them:

rs6457617 has been reported in a large study to be associated with rheumatoid arthritis. This SNP is reported to be the most statistically significant of many SNPs similarly located in the MHC region. The risk allele (oriented to the dbSNP entry) is (T); the odds ratio associated with heterozygotes is 2.36 (CI 1.97–2.84), and for homozygotes, 5.21 (CI 4.31–6.30).¹⁶

What the hell does this mean? To start with, “rs6457617” is the SNP number; that is, it is the unique identifier of a particular variant in human DNA (“rs” stands for “Reference SNP,” one that has been validated and mapped to a particular place in the genome). Now, recall the DNA alphabet: A, G, C, and T. Our genomes are the 3 billion A’s, G’s, C’s, and T’s we get from our mothers and the 3 billion we get from our fathers. Of the thousands of people who have been screened for this particular SNP associated with rheumatoid arthritis, virtually everyone on earth is one of the following: CC, CT, TC, or TT. People who inherited a C allele at this SNP from each parent are at average risk for developing rheumatoid arthritis. People who inherited a T from one parent and a C from the other at this position are 2.36 times more likely to develop RA than average. People who inherited a T from both parents (as I did) are 5.21 times more likely to develop RA.
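One way to read the SNPedia entry is as a simple lookup from genotype to reported odds ratio. The sketch below is illustrative only: the function and table names are hypothetical, the numbers are the ones quoted in the entry for rs6457617, and it treats the odds ratios as fixed multipliers relative to the CC genotype, which is a simplification.

```python
# Odds ratios quoted in the SNPedia entry for rs6457617, keyed by the
# number of copies of the risk allele (T). Zero copies (CC) is the baseline.
RS6457617_ODDS_RATIOS = {0: 1.0, 1: 2.36, 2: 5.21}
RISK_ALLELE = "T"


def odds_ratio_for(genotype: str) -> float:
    """Return the reported odds ratio for a two-letter genotype such as 'CT'.

    The order of the letters does not matter: 'CT' and 'TC' both carry
    one copy of the risk allele.
    """
    genotype = genotype.upper()
    if len(genotype) != 2 or any(allele not in "CT" for allele in genotype):
        raise ValueError(f"expected a genotype made of C and T, got {genotype!r}")
    risk_copies = genotype.count(RISK_ALLELE)
    return RS6457617_ODDS_RATIOS[risk_copies]


if __name__ == "__main__":
    for g in ("CC", "CT", "TC", "TT"):
        print(g, odds_ratio_for(g))
```

Keying the table by the count of risk alleles, and not by the genotype string, is what makes CT and TC interchangeable.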

Okay, but what does that mean in absolute terms? We don’t know with 100 percent accuracy, but after typing me for this SNP and five others associated with arthritis, Navigenics told me that my lifetime risk of developing RA was 2.8 percent, or a little less than twice the average. The first big caveat: from twin studies, we know that only slightly more than half of the risk for RA is inherited; the rest is likely due to environmental factors, which Navigenics has not measured (nor, as far as I know, have any of the other commercial or noncommercial genome scanners, probably because no one knows exactly what they are).¹⁷ The second big caveat: there’s no reason to think that scientists won’t find another dozen SNPs in the human genome that contribute to RA susceptibility. The model for risk prediction in RA will probably look much different in a few years (if not months) and it will probably be much more complicated.

But for Mike Cariaso and Greg Lennon, that wasn’t the point. Over a long lunch at a cheap French restaurant in a nondescript part of Bethesda, Maryland, near the hotel where NIH likes to hold meetings, Greg recounted the genesis of SNPedia.¹⁸ He took me back to 2005–2007, a simpler time when there were no direct-to-consumer genomics companies or gargantuan databases brimming over with information on human genomic variation. This state of affairs didn’t sit right with Lennon, a handsome man in his early fifties with thinning gray hair and bright blue eyes. He spoke in relaxed, measured tones, although one sensed impatience just below the surface. In the early 1990s he had finished his postdoc with übergeneticist Hans Lehrach at the Imperial Cancer Research Fund in London and had followed that with a successful career as a biotech scientist and executive. He and Cariaso met when both were working at Lawrence Livermore in northern California. Cariaso then followed Lennon to Gene Logic, one of the first companies to take seriously the idea that gene expression—to what extent certain sets of genes were active in particular cells and tissues—could be used to identify drug targets. Thus, for example, white blood cells express high levels of genes that code for infection-fighting proteins; neurons express high levels of genes that code for neurotransmitters such as dopamine; and so on. To a large extent, different cell types can be defined by the genes they do or don’t express. (Alas, thus far this has not led to much in the way of drugs.)

The lingua franca for measuring gene expression in the late 1990s and early 2000s was the microarray: tiny spots of DNA fixed to a solid surface such as a glass microscope slide or nylon membrane. A microarray typically contains thousands of genes. To survey the expression of those genes in a cell or tissue with a microarray, one would prepare fluorescently labeled RNA (the intermediate coded for by DNA that usually goes on to code for protein) from that cell type. All of those bits of RNA serve as probes: they find their complementary DNA partners and stick to them like molecular Velcro. When they find their match, they fluoresce. The strength of the fluorescent signal that lights up at the spot in the array representing each individual gene provides a snapshot of how active that gene is in the sample. Genes that are especially active or inactive in diseased cells and tissues are potential drug targets.

But by 2000, microarrays were beginning to be used for purposes beyond gene expression. Among the new applications was SNP detection, and this was even easier than gene expression. Instead of a curve measuring the extent to which a gene was expressed, genotyping was binary: In any given individual, was a particular DNA variant present or absent? And if present, was there one copy or two? By doing case-control studies on hundreds or thousands of people, say, half with a certain disease and half without, and by finding SNPs that were more frequently found in those with the disease, SNP scans could be used to identify disease susceptibility genes.
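The case-control comparison can be sketched with a little arithmetic. The example below is a rough illustration under stated assumptions, not the method of any particular study: the function name and the allele counts are hypothetical, and a real genome-wide scan would add significance testing and correction for the very large number of SNPs examined.

```python
def allelic_odds_ratio(case_risk: int, case_other: int,
                       control_risk: int, control_other: int) -> float:
    """Odds ratio for carrying a given allele in cases versus controls.

    Each argument is a count of alleles (two per person): how many copies
    of the candidate risk allele and of the other allele were seen among
    people with the disease (cases) and without it (controls).
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")
    # Odds of the allele among cases divided by its odds among controls.
    return (case_risk * control_other) / (case_other * control_risk)


if __name__ == "__main__":
    # Hypothetical counts: the allele makes up 60 of 100 alleles in cases
    # but only 40 of 100 alleles in controls.
    print(allelic_odds_ratio(60, 40, 40, 60))  # 2.25
```

An odds ratio near 1 means the allele is about equally common in both groups; a value well above 1 is the kind of signal that flags a SNP for follow-up.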

The leap Cariaso and Lennon made was to take those findings and begin to apply them to individuals. Because once those genome-wide association studies (GWAS) were done on one or more populations, then hypothetically anyone could examine some fraction of her complement of SNPs and see whether she carried SNPs that raised or lowered her disease risks or otherwise contributed to her traits. In 2007 this was still an expensive, labor-intensive, and high-risk proposition for two guys in their basements. So why even bother?

“I was frustrated,” Greg Lennon told me. “I could go into any restaurant and ask, ‘Has anyone here benefited from the Human Genome Project? Do you know anything about your own genetics? Do your doctors know anything about it?’ The answer in general would be—and for the most part, still is—a resounding no. Not even at the level of cocktail banter. I spent my career in an area of science that I felt and continue to feel is very promising. Yet it doesn’t seem to matter. It hasn’t affected anybody. So I began to ask myself, ‘Have I just wasted my time?’”¹⁹

After talking about it for more than a year, Lennon and Cariaso decided to go native. From Gene Logic’s deep and abiding work on gene expression, the two were quite familiar with Affymetrix GeneChips, the dominant microarray platform at the time.²⁰ They had both done some genotyping. So how hard could it be to run some Affymetrix chips on themselves and have a look at their own genomes?

Arguably the pair’s most difficult hurdle turned out to be getting DNA out of their own bodies. Spit kits and spit parties were still many months away. Blood was easier and cheaper to process. But there was a problem. “When you’re a random individual wandering the streets,” Lennon said, “no one really wants to collect blood from you.”²¹ Cariaso stopped in a fire station in suburban Maryland and found that paramedics have a lot of time on their hands between calls. He chatted them up, got them interested in what he was doing, and soon had his sleeves rolled up. And that would have been that, except … “Sitting in the back of an ambulance with the needle in my arm, the station alarms went off.”²² Another possibility was the Red Cross: Lennon told me that if you had “the right attitude,” then you could get some of your own blood to go. He wound up getting his own blood sample from a general practitioner who had a soft spot for human genetics, despite the admission that he remembered almost nothing from his cursory medical school training in the subject.²³

Lennon and Cariaso found a contract molecular biology lab to isolate DNA from their white blood cells and to run the latest and greatest Affy chip (five hundred thousand markers). According to Lennon, that’s when the fun began. “We got that data back, and for all of our brilliance, we just stared at those huge files going, ‘Now what?’”²⁴

They were convinced that at some point during the thirteen-year, $2.7 billion effort to map the human genome, surely someone had taken the initiative and developed a database that systematically linked variation across the genome to human phenotypes. There was Online Mendelian Inheritance in Man,²⁵ an incredibly useful tool developed by the late Victor McKusick, the father of clinical and medical genetics. The catalog began as a hardcover book, Mendelian Inheritance in Man, in 1966.²⁶ But even though it now lived online, OMIM was a text-based catalog, not a digital one, and it mostly contained diseases and phenotypes caused by rare changes in single genes: if a doctor in Saudi Arabia observed a child with widely spaced eyes and elevated enzyme levels in 1968 and published a case report, McKusick would make a note of it. The catalog was and is remarkably comprehensive: an astounding collection of our species’ variation.²⁷ That said, I often found reading OMIM to be annoying: helpful and fascinating case reports and research studies were amassed under gene and/or disease headings and subdivided (“clinical features,” “animal models,” “pathogenesis,” etc.), but without any real narrative flow. At its best, it was like an Audubon field guide for clinicians—a terrific, handy reference. At its worst, it could be a painful slog for anyone interested in a high-level view of any particular genetic disease. I would be terribly upset if it didn’t exist, but as the great Irish writer Roddy Doyle said of Ulysses, it might benefit from a little more editing.²⁸

The public database of genetic variants, dbSNP, began in 1999. In 2002 it contained 1.3 million unique, validated human variants. In early 2010 it had 9.5 million.²⁹ But unlike OMIM, dbSNP had no intrinsic clinical content, and until recently it didn’t “talk” to OMIM. “I respect McKusick and the way he put OMIM together,” said Lennon. “But that doesn’t mean it’s kept up with the times. It doesn’t help you annotate your genome. The vast majority of the information is effectively anecdotal. There’s nothing wrong with that. But you can’t actually have software work with OMIM. At least dbSNP had the nomenclature part of it roughly right.”³⁰

Disappointed that the billions spent on the Human Genome Project had not resulted in a resource linking genotype to phenotype, Lennon and Cariaso were at a loss. Despite their ambition, the idea that the two of them could take all of the world’s human genetics literature and turn it into useful information for the benefit of a dozen or a thousand people was laughable. The idea that any number of people could do it was debatable. Lennon and Cariaso could pour the foundation, but other people would have to finish the floors and furnish the house. It would take a village … or at least, something like Wikipedia. And that was the approach they took: the two put other people’s SNP data up on the site (they never got around to posting their own) and let the world have at it.

“We faced and still face the exact same questions that Wikipedia faces,” said Lennon. “How do you control quality? Is the information credible? Fine—those are fair things to ask. Why not be skeptical about anything you read? On the other hand, we have a huge advantage over Wikipedia because we’re not trying to cover everything from the Israeli-Palestinian conflict to Britney Spears. And we don’t get a lot of flame wars on the site. Genetics is usually pretty boring.”³¹
