A WSJ op-ed with J. J. Plecs, formerly of Roam Analytics, which does a lot of health-related data work. This is the full op-ed now that 30 days have passed. The previous blog post has a lot of interesting updates and commentary.
The discovery that cigarettes cause cancer greatly improved human health. But that discovery didn’t happen in a lab or spring from clinical trials. It came from careful analysis of mounds of data.
Imagine what we could learn today from big-data analysis of everyone’s health records: our conditions, treatments and outcomes. Then throw in genetic data, information on local environmental conditions, exercise and lifestyle habits and even the treasure troves accumulated by Google and Facebook.
The gains would be tremendous. We could learn which treatments and dosages work best for which people; how treatments interact; which genetic markers are associated with treatment success and failure; and which life choices keep us healthy. Integrating payment and other data could transform medical pricing and care provision. And all this information is sitting around, waiting to be used.
So why isn’t it already happening? It’s not just technology: Tech companies are overcoming the obstacles to uniting dispersed, poorly organized and incompatible databases. Rather, the full potential of health-care data analysis is blocked by regulation—and for a good reason: protecting privacy. Obviously, personal medical records can’t be open for all to see. But medical-data regulations go far beyond what’s needed to prevent concrete harm to consumers, and underestimate the data’s enormous value.
Most of us have seen how regulations kept medicine in the fax-machine era for decades, and how electronic medical records are still mired in complexity. It’s tough enough for patients to access their own data, or transfer it to a new doctor. Researchers face more burdensome restrictions.
“Open Data” initiatives in medical research, which make medical data freely available to researchers, are hobbled by Health Insurance Portability and Accountability Act (HIPAA) regulations and data-management procedures that reduce the data’s value and add long lead times. For example, regulations mandate the deletion of much data to ensure individual privacy. But if the data are de-identified to the point that patients can’t possibly be distinguished, nobody will be able to tell why a given patient experienced a better or worse result.
HIPAA “safe harbor” guidelines require removing specific dates from patient data. Only the year when symptoms emerged or treatments were tried can be shown. So which treatment was tried first? And for how long? Was the patient hospitalized before the treatment or three months later? All of a sudden, the data aren’t so helpful.
Health-care data released for public use are also closely hemmed in. For instance, Medicare prescription data are censored if a doctor wrote 10 or fewer prescriptions for a particular drug. That means whole categories of usage and prescribers are systematically missing from the publicly available data.
Regulators need to place greater weight on the social value of data for research. Data use can be limited to research purposes. Specific dangers, rather than amorphous privacy concerns, can be enumerated and addressed. The Internal Revenue Service seems to have figured out how to keep individual-level tax data private while allowing economic researchers to study it. Similar exploration is needed for health data; the opportunity cost of medical discoveries not made is too high to ignore.
Research consortia or governmental agencies can release patient-level data sets, including high-resolution detail on symptoms, treatments, lab-test results and medical outcomes, but with names and identifying details anonymized. Such data should be made freely available to researchers first for conditions with the most serious need for new insights, such as Alzheimer’s, ALS or pancreatic cancer. These can be the leading edge for which regulators develop data-control systems they can trust.
Laws and regulations can stipulate that patients’ medical data can’t be used for nonmedical and nonresearch purposes such as advertising. Patients can be explicitly protected against any harms related to being identified by their data. Data couldn’t be used to deny access to insurance, set the cost of insurance, or for employment decisions. Patients should be opted in by default to share their medical records for research purposes, but should always be able to decline to share if they’d like.
Free societies have long benefited from a wise balance between the open exchange of ideas and information, and individuals’ rights and sensitivities. We need to get that balance right for medical data. Otherwise, societies less concerned with individual rights and privacy may seize the opportunities we’re giving up.
Mr. Plecs is a consultant for the pharmaceutical and biotechnology industries. Mr. Cochrane is a senior fellow of the Hoover Institution and an adjunct scholar of the Cato Institute.
More updates. In addition to Roam, Tafi, Datavant and the FDA Sentinel Initiative mentioned in the previous blog post, a colleague points out Project Data Sphere, which aims to "share, integrate, and analyze our collective historical cancer research data in a single location." It also mixes a wide variety of data sources, and makes data available to academics.