Wednesday, December 26, 2018

Imagine what we could cure -- full oped

WSJ oped with J J Plecs, formerly of Roam Analytics, which does a lot of health-related data work. This is the full oped now that 30 days have passed. The previous blog post has a lot of interesting updates and commentary.

The discovery that cigarettes cause cancer greatly improved human health. But that discovery didn’t happen in a lab or spring from clinical trials. It came from careful analysis of mounds of data.

Imagine what we could learn today from big-data analysis of everyone’s health records: our conditions, treatments and outcomes. Then throw in genetic data, information on local environmental conditions, exercise and lifestyle habits and even the treasure troves accumulated by Google and Facebook.

The gains would be tremendous. We could learn which treatments and dosages work best for which people; how treatments interact; which genetic markers are associated with treatment success and failure; and which life choices keep us healthy. Integrating payment and other data could transform medical pricing and care provision. And all this information is sitting around, waiting to be used.

So why isn’t it already happening? It’s not just technology: Tech companies are overcoming the obstacles to uniting dispersed, poorly organized and incompatible databases. Rather, the full potential of health-care data analysis is blocked by regulation—and for a good reason: protecting privacy. Obviously, personal medical records can’t be open for all to see. But medical-data regulations go far beyond what’s needed to prevent concrete harm to consumers, and underestimate the data’s enormous value.

Most of us have seen how regulations kept medicine in the fax-machine era for decades, and how electronic medical records are still mired in complexity. It’s tough enough for patients to access their own data, or transfer it to a new doctor. Researchers face more burdensome restrictions.

“Open Data” initiatives in medical research, which make medical data freely available to researchers, are hobbled by Health Insurance Portability and Accountability Act (HIPAA) regulations and data-management procedures that reduce the data’s value and add long lead times. For example, regulations mandate the deletion of much data to ensure individual privacy. But if the data are de-identified to the point that patients can’t possibly be distinguished, nobody will be able to tell why a given patient experienced a better or worse result.

HIPAA “safe harbor” guidelines require removing specific dates from patient data. Only the year when symptoms emerged or treatments were tried can be shown. So which treatment was tried first? And for how long? Was the patient hospitalized before the treatment or three months later? All of a sudden, the data aren’t so helpful.
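To make the loss concrete, here is a minimal sketch of the year-only rule described above. The patient events, event names and helper functions are invented for illustration, and the sketch covers only the date rule mentioned here, not the full set of de-identification requirements.

```python
from datetime import date

# Hypothetical events for one patient; names and dates are invented.
events = [
    ("hospitalized", date(2017, 3, 2)),
    ("treatment_a_started", date(2017, 6, 5)),
    ("treatment_b_started", date(2017, 9, 14)),
]


def year_only(records):
    """Keep only the year of each event, as the year-only rule requires."""
    return [(name, when.year) for name, when in records]


def order_is_recoverable(records):
    """True if every event has a distinct time value, so events can be sorted."""
    times = [when for _, when in records]
    return len(set(times)) == len(times)


coarsened = year_only(events)

print(order_is_recoverable(events))     # full dates: ordering is known
print(order_is_recoverable(coarsened))  # years only: all three events tie
```

With full dates, the hospitalization clearly precedes both treatments. After coarsening, all three events carry the same year, and a researcher can no longer tell which came first or how far apart they were.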

Health-care data released for public use are also closely hemmed in. For instance, Medicare prescription data are censored if a doctor wrote 10 or fewer prescriptions for a particular drug. That means whole categories of usage and prescribers are systematically missing from the publicly available data.
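A small sketch shows how a threshold like this skews what researchers see. The prescriber names, drug names and counts are invented, and the sketch assumes that censoring means dropping the whole row; the actual public files may handle small counts differently.

```python
# Threshold taken from the text: 10 or fewer prescriptions are censored.
THRESHOLD = 10

# Hypothetical prescriber-by-drug counts; all values are invented.
rows = [
    {"prescriber": "dr_a", "drug": "common_drug", "claims": 240},
    {"prescriber": "dr_a", "drug": "rare_drug", "claims": 4},
    {"prescriber": "dr_b", "drug": "rare_drug", "claims": 10},
    {"prescriber": "dr_c", "drug": "rare_drug", "claims": 11},
]


def suppress_small_counts(records, threshold=THRESHOLD):
    """Drop rows at or below the threshold (assumed censoring behavior)."""
    return [r for r in records if r["claims"] > threshold]


def total_claims(records, drug):
    """Sum the prescription counts for one drug."""
    return sum(r["claims"] for r in records if r["drug"] == drug)


public = suppress_small_counts(rows)

print(total_claims(rows, "rare_drug"))    # true total across all prescribers
print(total_claims(public, "rare_drug"))  # total visible in the public file
```

In this toy data the rarely prescribed drug has 25 prescriptions in total, but only 11 survive into the public file, and two of its three prescribers vanish entirely. The loss is concentrated in exactly the low-volume uses that the text says go missing.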

Regulators need to place greater weight on the social value of data for research. Data use can be limited to research purposes. Specific dangers, rather than amorphous privacy concerns, can be enumerated and addressed. The Internal Revenue Service seems to have figured out how to keep individual-level tax data private while allowing economic researchers to study it. Similar exploration is needed for health data; the opportunity cost of medical discoveries not made is too high to ignore.

Research consortia or governmental agencies can release patient-level data sets, including high resolution on symptoms, treatments, lab-test results and medical outcomes, but with names and identifying details anonymized. These data sets should be freely available to researchers first for conditions with the most serious need for new insights, such as Alzheimer’s, ALS or pancreatic cancer. These can be the leading edge for which regulators develop data-control systems they can trust.

Laws and regulations can stipulate that patients’ medical data can’t be used for nonmedical and nonresearch purposes such as advertising. Patients can be explicitly protected against any harms related to being identified by their data. Data couldn’t be used to deny access to insurance, to set the cost of insurance, or to make employment decisions. Patients should be opted in by default to share their medical records for research purposes, but always be able to decline to share if they’d like.

Free societies have long benefited from a wise balance between the open exchange of ideas and information, and individuals’ rights and sensitivities. We need to get that balance right for medical data. Otherwise, societies less concerned with individual rights and privacy may seize the opportunities we’re giving up.

Mr. Plecs is a consultant for the pharmaceutical and biotechnology industries. Mr. Cochrane is a senior fellow of the Hoover Institution and an adjunct scholar of the Cato Institute.

More updates. In addition to RoamTafi, Datavant and the FDA Sentinel Initiative mentioned in the previous blog post, a colleague points out Project Data Sphere, which aims to "share, integrate, and analyze our collective historical cancer research data in a single location." It also mixes a wide variety of data sources, and makes data available to academics.


  1. Cochrane, I think you and WSJ editors know this is hypocrisy. MSM is pushing drugs and investment in drugs. Alcohol is a drug. Vaping is about drugs. Cigarettes are drugs, ...and money is a drug. Happy new year to all the addicts in MSM.

  2. I've run into similar issues with FERPA. When I built the data model to do some research, I purposely designed it to have the data anonymized to prevent primary and secondary discovery. Yes, we lost some valuable demographic data, but it's a first step.

    We managed to convince the upper echelons this sort of research was worthwhile and they pulled the necessary levers to make things happen. There was a fair amount of bureaucratic red tape but we eventually sliced through it all by designing a process to snag and anonymize the data from the data warehouse. So, now, they've got a recyclable process that can be used over and over again so that other kinds of research can be done while keeping everyone happy.

    It would have been nice to have demographic data, but that will come later. To snag that data we would have needed all kinds of release forms and such for individuals who used the platform. And, we would have needed all kinds of secure systems to store that data. So, yes, we had to build a workaround to get meaningful data.

    As it relates to healthcare, the Leviathan is real. The recent messes with FB only make people more fearful as to how their data can be used.

    As an alternative, look at China. They're building a social credit score that tracks a person's activities to determine what benefits they qualify for. Now, imagine integrating health data into a person's overall profile. Imagine the kinds of inferences that could be made. Somewhat scary.

    I'm all for using data in the right way. There's a treasure trove out there that could be used for good. What people need are assurances they can believe in. It's all about trust. It's a currency that has real value.

  3. What's the 30 days note about in the 2nd sentence?

  4. What's the "30 days have passed" comment about?

  5. Would be interesting to better identify the contours of "the good reason" for impeding better health outcomes -- "protecting privacy" -- since my interactions with healthcare providers on behalf of myself and my family are impeded by privacy protections I neither want nor need.

  6. I would like to get your opinion on the following Healthcare reform ideas:

