What Stays in Vegas (18 page)

Authors: Adam Tanner

Before accepting new volunteers, Church requires that they take an online exam about privacy risks. They must intimately know the details of the twenty-four-page consent form. “The Personal Genome Project is a new form of public genomics research and, as a result, it is impossible to accurately predict all of the possible risks and discomforts that you might experience,” it says. Having someone identify participants is one of the listed risks. The exam does not pose a simple generic question such as “Do you understand the risks?” It lists twenty questions, and Church requires a perfect score. Potential volunteers can take the test as many times as they need until they pass. One person took the test ninety times before getting the required perfect score.
15

Of course, almost no one reads privacy policies because they are so dull and abstruse. One study found that it would take between eight and twelve minutes to read a typical website privacy statement. The study's authors estimated that it would take a person between 181 hours and 304 hours a year to read all the privacy statements he or she came across over that period, well over a month of working hours.
16
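As a rough check on the "well over a month of working hours" figure, the study's yearly totals can be converted into work weeks. The forty-hour work week used here is an assumption for illustration, not a number taken from the study.

```python
# Assumption for illustration: a 40-hour work week (not a figure from the study).
HOURS_PER_WORK_WEEK = 40


def work_weeks(hours_per_year, hours_per_week=HOURS_PER_WORK_WEEK):
    """Convert a yearly total of reading hours into equivalent work weeks."""
    return hours_per_year / hours_per_week


# 181 and 304 hours a year are the low and high estimates quoted in the text.
low = work_weeks(181)
high = work_weeks(304)
print(f"{low:.1f} to {high:.1f} work weeks per year")
```

Under that assumption, even the low estimate exceeds four full work weeks.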

People likely read privacy policies about medical experiments more carefully, but Church says most studies are disingenuous in describing privacy risks. “This is one of the ways people get in over their heads in terms of personal data being exposed,” he says. In fact, many surveys can expose personally identifiable information even if they say they are anonymous.

As of 2014, more than three thousand people had volunteered their data to the Personal Genome Project. Church would like to recruit up to one hundred thousand people, but he needs additional funding (it costs about $4,000 per person to take the DNA test and cover related administrative costs).

Every year the project hosts a conference where scientists and participants meet for two days of formal lectures as well as informal discussions. For the 2013 event in Boston, Sweeney set up a table in the hallway with her assistant to demonstrate that she could unmask the identity of many participants. Ahead of the conference she programmed her computers to collect publicly posted data on 1,130 of the volunteers. Of this number, 579 provided ZIP code, date of birth, and gender, the key information her 1997 study had shown could be used to identify large swaths of the US population. By cross-referencing the three pieces of information against voter registration records or other public documents, Sweeney identified 241 people, 42 percent of the 579 who had supplied all three.
17
The Personal Genome Project confirmed that she had the names right 84 percent of the time, or 97 percent when adding nicknames and other variations on the first name.
18
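The cross-referencing step described above can be sketched in a few lines of code. This is a minimal illustration, not Sweeney's actual program: the function name, the field names, and every record below are invented for the example, and it assumes both data sets store ZIP code, birth date, and gender in the same format.

```python
from collections import defaultdict


def link_records(profiles, voter_rolls):
    """Match each profile to a named public record sharing ZIP, birth date, and gender.

    Hypothetical helper for illustration. A profile counts as re-identified
    only when exactly one public record shares all three fields.
    """
    # Index the public records by the three-field combination.
    index = defaultdict(list)
    for voter in voter_rolls:
        key = (voter["zip"], voter["dob"], voter["gender"])
        index[key].append(voter["name"])

    matches = {}
    for profile in profiles:
        key = (profile["zip"], profile["dob"], profile["gender"])
        candidates = index.get(key, [])
        if len(candidates) == 1:  # a single candidate means a likely identification
            matches[profile["id"]] = candidates[0]
    return matches


# Invented sample data: no real people or records.
profiles = [
    {"id": "P1", "zip": "02138", "dob": "1950-03-14", "gender": "F"},
    {"id": "P2", "zip": "93101", "dob": "1941-07-02", "gender": "M"},
    {"id": "P3", "zip": "60601", "dob": "1980-01-01", "gender": "F"},
]
voter_rolls = [
    {"name": "Alice Example", "zip": "02138", "dob": "1950-03-14", "gender": "F"},
    {"name": "Bob Sample", "zip": "93101", "dob": "1941-07-02", "gender": "M"},
    {"name": "Carl Sample", "zip": "93101", "dob": "1941-07-02", "gender": "M"},
    {"name": "Dana Example", "zip": "02138", "dob": "1950-03-14", "gender": "M"},
]

matches = link_records(profiles, voter_rolls)
print(matches)
```

In this toy data only P1 is matched: P2 shares its three fields with two voters, and P3 appears in no public record. In such ambiguous cases a real attacker would need extra clues to settle on one name.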

Participants at the conference reacted to Sweeney's findings largely by saying they expected one day to be identified. Gabriel Dean, who works at a telephone company, signed up after hearing about the Personal Genome Project on National Public Radio. He checked first with his siblings because he realized what he gave away about himself could reflect on them. As open as he was about his medical data, he remains concerned about revealing information on social networks, so he does not maintain profiles on Facebook or LinkedIn.

Throughout the two-day conference, study participants stopped by the table where Sweeney walked them through her website aboutmyinfo.org to demonstrate how easily she could identify them. She asked people to enter their ZIP code, date of birth, and gender into the site, which in turn told users if they were unique and thus identifiable. One woman came up and asked in a somewhat feisty tone why she should care. Sweeney responded that, for example, a life insurance company could theoretically deny writing a policy based on personal data. The woman turned pale. “I was just denied life insurance,” she said. The Harvard professor quickly replied that someone could be denied life insurance for many reasons and that it was far from clear that anyone had actually seen her medical data. But the woman did seem to recognize the potential danger from having such intimate medical details out there.

Harvard professor Latanya Sweeney talks with her research assistant Sean Hooley at her office. Source: Author photo.

Many attending the conference embraced a let-the-world-know ethos. Steven Pinker, a well-known experimental psychologist and author of the 2011 book The Better Angels of Our Nature, stepped forward as one of the first ten volunteers in the study. He posts his genome and a 1996 scan of his brain on his website and insists even that amount of information does not reveal much about him as a person.
19
“There just isn't going to be an ‘honesty gene' or anything else that would be nearly as informative as a person's behavior, which, after all, reflects the effect of all three billion base pairs and their interactions together with chance, environmental effects, and personal history,” he says. “As for the medical records, I just don't think anyone is particularly interested in my back pain.”

Sweeney's goal in publishing such findings is not to humiliate people by outing them. She believes that researchers with access to medical data on millions of patients may be able to find new cures for diseases or different patterns of effective treatment. Yet she wants to encourage people to find a better balance between sharing data and preserving some privacy. For example, people could list just their year of birth rather than full birth date, and just three rather than five or nine digits from their ZIP code. “Vulnerabilities exist, but there are solutions too,” she says. “If they change those demographics, they can thwart that attack without losing research value.”
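The coarsening Sweeney suggests, birth year instead of full birth date and three ZIP digits instead of five or nine, can be illustrated with a short sketch. The helper names and the sample records are hypothetical, and the code assumes birth dates are stored as YYYY-MM-DD strings.

```python
from collections import Counter


def generalize(record):
    """Return a copy with birth date cut to the year and ZIP code to three digits."""
    return {
        **record,
        "dob": record["dob"][:4],  # assumes a YYYY-MM-DD string
        "zip": record["zip"][:3],
    }


def count_unique(records):
    """Count records whose (ZIP, birth date, gender) combination no one else shares."""
    keys = [(r["zip"], r["dob"], r["gender"]) for r in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1)


# Invented sample data: no real people.
records = [
    {"zip": "02138", "dob": "1950-03-14", "gender": "F"},
    {"zip": "02139", "dob": "1950-11-30", "gender": "F"},
    {"zip": "02140", "dob": "1950-06-21", "gender": "F"},
    {"zip": "93101", "dob": "1941-07-02", "gender": "M"},
]

unique_before = count_unique(records)
unique_after = count_unique([generalize(r) for r in records])
print(unique_before, unique_after)
```

With the full fields every sample record is unique. After coarsening, the three Massachusetts records share one combination and only the fourth still stands alone, which suggests why generalization helps without guaranteeing anonymity.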

Does someone need Sweeney's training, an advanced degree in computer science, to reidentify people from the Personal Genome Project? Apparently not. To test Sweeney's findings, I tried to find three participants who had especially detailed and lengthy medical histories. One profile listed an abortion, anal itching, constipation, marijuana use, urinary tract infection, and many other ailments. She gave her weight as 160 pounds and said she took medication for high blood pressure. I went to the site of a commercial data broker and entered the birth date of a woman in a certain ZIP code. Instantly two names came up for that birth date in that ZIP code, only one of which was a woman.

She turned out to be a professor and well-known scholar. She was surprised when I contacted her out of the blue. “I certainly did pay attention to the caveats about ‘personal identification' when I signed up for the PGP, but didn't realize it would be so ridiculously easy to track down an individual,” she said. “It doesn't worry me over-much, perhaps because I'm at an age where I'm not all that concerned ‘what people might think' of various aspects of my history. I can imagine, though, that would not always have been the case.”

Although she did not object to my using her case to illuminate the problem, I checked back several times. She works for a faith-based institution that strongly disapproves of abortion. The school staff handbook warns that engaging in conduct detrimental to the reputation of the institution could lead to dismissal. She said she was confident nothing would happen to her after serving as a tenured professor for decades. “It certainly isn't anything I hide (in fact, I use it as an example in class). That said, it might have made a difference many years ago, especially given that some former administrators were much more conservative than those we have today,” she said. But in the end she did not want her name published; the details were just too intimate.

Another woman, sixty-eight, admitted on her Personal Genome Project survey to using cocaine and marijuana in the past and gave a long list of her ailments and the medications that she took. She also said she had suffered child abuse from 1946 to 1963. For her birth date and ZIP code combination, four or five female names appeared (one name was Lee, which could have been either gender). Yet the volunteer had also uploaded a genetic test to her profile that included her name. It took a little searching to find the phone number and email address for the woman, who had been involved in her high school's fiftieth reunion. She had left her web address details on a school alumni website, which in turn led to her email address.

The third volunteer was a seventy-two-year-old man whose profile listed alcoholism, bed-wetting to age twelve, bipolar disorder, cocaine use, depression, and many other ailments. Two men share his birth date in his Santa Barbara, California, ZIP code. One appeared to have moved. Searching the name of the second man led to a LinkedIn page saying he had gone to Harvard and had worked as a scientist. Fred Gamble confirmed he did indeed participate in the Personal Genome Project. He expressed surprise that someone had identified him, but not concern. He was retired and had become an active gardener. “Mine is detailed and there is some stuff in it that a younger person might not want broadcast, but I'm seventy-two and I don't really care,” he told me.
20

People volunteer for the Personal Genome Project because they are more open about their personal data in the first place. But not everyone wants to share such intimate data so freely. As a black woman who grew up in the South in the 1960s, Sweeney is sensitive to the potential for discrimination based on personal identity. She is also concerned about the vulnerability of anonymized medical data made public, which includes hospital discharge records released by most states. Such records exclude a patient's name, address, and Social Security number but still contain identifying clues. Insurance companies, labs, pharmacies, and various middlemen also have wide access to claims data related to medical conditions.

Selling deidentified data has become a multibillion-dollar business, even if such practices are largely hidden from the public. For example, when you fill a prescription, the pharmacy sells details about that transaction, earning about a penny. American pharmacies fill more than 2.5 million prescriptions every day, so over time those pennies add up. Nearly all of the country's sixty thousand pharmacies send out details of each transaction to companies that compile and analyze the information to resell to others. The data include age and gender of the patient; the doctor's name, address, and contact details; and details about the prescription.
21

Despite assurances from the health-care industry, some privacy advocates say the trade in personal medical data will eventually harm people through reidentification. One prominent medical privacy advocate is Deborah Peel, a Freudian psychoanalyst. The first week Peel opened her practice in 1977, a patient startled her with an unusual question.

“If I pay you cash, will you keep my medical records private?”

At medical school Peel thought that mental health records could not be released without the patient's explicit permission. Yet she learned that records did get to employers who either fired people or demoted them. She agreed to keep records off the insurance rolls for cash. Over the years, she became ever more concerned about patient privacy. In 2004 she stopped taking new patients and set up the Patient Privacy Rights Foundation, based in Austin, Texas. “It's really hard not to come off as kind of a wing nut or separatist or I don't know what. But I'm just a doctor who's watched this for thirty-five years,” she says. “With all this data out there, it's going to be the greatest source of job discrimination we've ever seen in this country, and it's going to start very early with your kid.”

“The dirty little secret . . . whenever people talk about privacy rights, it always devolves to health data, and most lawyers and most of the public cannot believe that we have no control over our health data. But we don't! We fucking don't.”

Good Data Intentions Gone Bad: Netflix/AOL Data Releases

In recent years, the ability to identify people thought to be anonymous has embarrassed well-known companies and institutions that have released data. One well-known incident occurred in 2006 after Netflix announced a dramatic contest: the company offered to pay $1 million to anyone who could improve its movie rating system by 10 percent. In theory, the plan would benefit everyone: predicting what movies would most appeal to individual subscribers could boost business and create a better customer experience. By outsourcing the effort to the public, Netflix could lure some of the best minds in data science and fill the wallet of a computer expert or team of researchers. “We're quite curious, really. To the tune of one million dollars,” the company advertised.
22
The contest attracted wide attention. Hoping to snag the seven-figure payout, 51,051 people on 41,305 teams entered. Netflix released recommendations from nearly half a million subscribers, replacing the names of its customers with internal ID numbers.
