May 10, 2019: Treating Data as an Asset: Data Entrepreneurship in the Service of Patients (Eric Perakslis, PhD)

Speaker

Eric D. Perakslis MS, PhD
Rubenstein Fellow, Duke University
Lecturer, Department of Biomedical Informatics
Harvard Medical School

Topic

Treating Data as an Asset: Data Entrepreneurship in the Service of Patients

Keywords

Digital health; Health data; General Data Protection Regulation (GDPR); Data sharing

Key Points

  • The only 100% common element of digital transformation across all industries is data.
  • With data and digital transformation, patients are changing: They are active, connected, informed, and savvy.
  • Security, compliance, and privacy are different things.

Discussion Themes

Is there any hope of data sharing policies helping to bridge the micro and macro silos of healthcare data?

As data start to flow through institutions, they end up in multiple places. Part of sharing data is protecting a single source of truth.

If something is relevant to the bedside, it’s worth doing.

Read Dr. Perakslis’s commentary in The Lancet (May 2019).

Tags

#healthdata, #pctGR, @Collaboratory1

September 28, 2018: Assessing and Reducing Risk of Re-identification When Sharing Sensitive Research Datasets (Greg Simon, MD, MPH, Deven McGraw, JD, MPH, Khaled El Emam, PhD)

Speakers

Gregory Simon MD, MPH
Investigator, Kaiser Permanente Washington Health Research Institute

Deven McGraw, JD, MPH, LLM
General Counsel & Chief Regulatory Officer, Ciitizen

Khaled El Emam, PhD
Department of Pediatrics, University of Ottawa
Children’s Hospital of Eastern Ontario Research Institute

Topic

Assessing and Reducing Risk of Re-identification When Sharing Sensitive Research Datasets

Keywords

Clinical trials; Research ethics; Data security; Data sharing; Sensitive research data; De-identified data

Key Points

  • The cycle of risk de-identification involves setting a risk threshold, measuring the risk, evaluating the risk, and applying transformations to reduce the risk.
  • The Safe Harbor method of de-identification (removal of 18 categories of data) is a legal minimum standard that does not take context into account, and may not be sufficient when sharing sensitive data publicly.
  • A higher standard for de-identification is the “Expert Determination” method, whereby an expert with contextual knowledge of the broader data ecosystem can determine whether the risk is “not greater than very small.”
  • With increasing concern about the risks of sensitive data sharing, it is important to be transparent with data participants and continue to build trust for data uses.
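
The cycle in the first key point can be pictured with a rough Python sketch. It is an illustration only, not the speakers' method. The column names, the sample records, the threshold of 0.5, and the helper functions `max_risk`, `generalize`, and `deidentify` are all assumptions made for this example. Risk is measured here as 1 divided by the size of the smallest group of records that share the same quasi-identifier values, which is one common simplification; a real expert determination would also weigh who receives the data and what other data they hold.

```python
from collections import Counter

# Hypothetical quasi-identifier columns, chosen for illustration only.
QUASI_IDENTIFIERS = ("age", "zip", "sex")


def max_risk(records, keys=QUASI_IDENTIFIERS):
    """Measure risk as 1 / size of the smallest group of records
    that share the same quasi-identifier values."""
    groups = Counter(tuple(r[k] for k in keys) for r in records)
    return 1.0 / min(groups.values())


def generalize(record, age_band, zip_digits):
    """Transform a record: coarsen age into bands, keep leading ZIP digits."""
    out = dict(record)
    low = (record["age"] // age_band) * age_band
    out["age"] = f"{low}-{low + age_band - 1}"
    out["zip"] = record["zip"][:zip_digits]
    return out


def deidentify(records, threshold, steps):
    """Set a threshold, measure, evaluate, and transform until risk <= threshold."""
    risk = max_risk(records)
    if risk <= threshold:
        return records, risk
    for age_band, zip_digits in steps:
        candidate = [generalize(r, age_band, zip_digits) for r in records]
        risk = max_risk(candidate)
        if risk <= threshold:
            return candidate, risk
    raise ValueError("no listed transformation met the risk threshold")


# Made-up sample records; every row is unique, so the starting risk is 1.0.
records = [
    {"age": 34, "zip": "98101", "sex": "F"},
    {"age": 36, "zip": "98109", "sex": "F"},
    {"age": 52, "zip": "98115", "sex": "M"},
    {"age": 57, "zip": "98118", "sex": "M"},
]

# Try 5-year age bands with full ZIP first, then 10-year bands with 3-digit ZIP.
shared, risk = deidentify(records, threshold=0.5, steps=[(5, 5), (10, 3)])
print(risk, shared[0])
```

In this toy data set the first transformation still leaves every record unique, so the loop moves on to the coarser one. The threshold used here is far too permissive for real data and is chosen only so that four records can satisfy it.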

Discussion Themes

When is a dataset safe for sharing? What is the risk of re-identification, and how can we reduce the risk? Consider who you are releasing the data to and what other kinds of data they might have access to that could potentially lead to re-identification.

For more information on the de-identification of protected health information, visit the U.S. Department of Health and Human Services’s Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule.

The Health Information Trust Alliance de-identification framework identifies 12 criteria for a successful de-identification program and methodology.

Tags

#pctGR, #PragmaticTrials, #HealthData, @HealthPrivacy, @Collaboratory1, @PCTGrandRounds

November 17, 2017: New Video in Living Textbook Explores Data Sharing and Embedded Research

As part of an article published in Annals of Internal Medicine, Dr. Greg Simon created a short video in which he describes concerns related to data sharing and embedded research, as well as potential solutions for those concerns. We recently added this video to the Living Textbook chapter on Data Sharing and Embedded Research. In the chapter, the authors expand on the ideas presented in the Annals article and frame them using lessons learned from the Collaboratory’s Demonstration Projects. Data collected as part of research embedded in a health system come from a fundamentally different context than data from stand-alone explanatory trials. When taken out of context or used for comparisons, these data have the potential to do harm, which can discourage health systems from volunteering to participate in embedded research. The authors suggest that data sharing plans for embedded research be developed in partnership with health system leaders in ways that maximize the amount of data that can be shared while protecting patient privacy and healthcare system interests.

“Ultimately, it’s a practical question: if we want healthcare providers and healthcare systems to participate in research, we shouldn’t expect them to bear extra risk. In an ideal world, all information about the quality of health care and healthcare outcomes across the country would be completely open to everyone, but we don’t live in that world now. So if we are asking healthcare providers and healthcare systems to open up and be more transparent by participating in research, we certainly would not want to punish those who volunteer.” — Simon et al. in video for Ann Intern Med

 

Simon G, Coronado G, DeBar L, et al. Data Sharing and Embedded Research: Introduction. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda, MD: NIH Health Care Systems Research Collaboratory. Available at: https://rethinkingclinicaltrials.org/data-share-top/data-sharing-and-embedded-research-introduction/. Updated November 13, 2017.

October 3, 2017: New Collaboratory Article Explores Data Sharing and Embedded Research

In an article published in Annals of Internal Medicine, authors from the NIH Collaboratory describe concerns and solutions regarding data sharing and embedded research. Pragmatic research embedded in health systems uses data from the electronic health record and comes from a fundamentally different context than explanatory trials, which collect research-specific data. Data from embedded research have the potential to do harm if taken out of context or used for comparisons. Therefore, while the authors enthusiastically support data sharing, they also recognize that mandating data sharing may discourage health systems from volunteering to participate in embedded research.

“In an ideal world of transparency regarding healthcare processes and outcomes, health systems would have no expectation of or need for privacy regarding quality of health care delivery.  But the current world is not perfect, and unintentional disclosures from participation in embedded research could be far greater than that required for public quality measures. Health systems volunteering to participate in research to improve public health may not be willing to bear the additional risk of misuse of sensitive information.” — Simon et al. Ann Intern Med

The authors use examples from the NIH Collaboratory Demonstration Projects to illustrate potential solutions, and emphasize that data sharing plans for embedded research should be developed in partnership with health system leaders in ways that maximize the amount of data that can be shared while protecting patient privacy and healthcare system interests.

Journal Editors Propose New Requirements for Data Sharing

On January 20, 2016, the International Committee of Medical Journal Editors (ICMJE) published an editorial in 14 major medical journals proposing that, as a condition of publication in one of their member journals, clinical researchers must agree to share the deidentified data set used to generate results (including tables, figures, and appendices or supplementary material) no later than six months after publication. By changing the requirements for manuscripts they will consider for publication, the editors aim to ensure reproducibility (independent confirmation of results), foster data sharing, and enhance transparency. To meet the new requirements, authors will need to include a plan for data sharing as a component of clinical trial registration that includes where the data will be stored and a mechanism for sharing the data.

Evolving Standards for Data Reporting and Sharing

As early as 2003, the National Institutes of Health published a data sharing policy for research funded through the agency, stipulating that “Data should be made as widely and freely available as possible while safeguarding the privacy of participants, and protecting confidential and proprietary data.” Under this policy, federally funded studies receiving over $500,000 per year were required to have a data sharing plan that describes how data will be shared, that shared data be available in a usable form for some extended period of time, and that the least restrictive method for sharing of research data is used.

In 2007, Congress enacted the Food and Drug Administration Amendments Act. Section 801 of the Act requires study sponsors to report certain kinds of clinical trial data within a specified interval to the ClinicalTrials.gov registry, where it is made available to the public. Importantly, this requirement applied to any study classified as an “applicable clinical trial” (typically, an interventional clinical trial), regardless of whether it was conducted with NIH or other federal funding or supported by industry or academic funding. However, recent academic and journalistic investigations have demonstrated that overall compliance with FDAAA requirements is relatively poor.

In 2015, the Institute of Medicine (now the National Academy of Medicine) published a report that advocates for responsible sharing of clinical trial data to strengthen the evidence base, allow for replication of findings, and enable additional analyses. In addition, these efforts are being complemented by ongoing initiatives aimed at widening access to clinical trial data and improving results reporting, including the Yale University Open Data Access project (YODA), the joint Duke Clinical Research Institute/Bristol-Myers Squibb Supporting Open Access to clinical trials data for Researchers initiative (SOAR), and the international AllTrials project.

Responses to the Draft ICMJE Policy

The ICMJE recommendations are appearing in the midst of a growing focus on issues relating to the integrity of clinical research, including reproducibility of results, transparent and timely reporting of trial results, and widespread data sharing. The release of the draft policy is amplifying ongoing national and international conversations taking place on social media and in prominent journals. Although many researchers and patient advocates have hailed the policy as timely and needed, others have expressed concerns, including questions about implementation and possible unforeseen consequences.

The ICMJE is welcoming feedback from the public regarding the draft policy at www.icmje.org and will continue to collect comments through April 18, 2016.

Resources

Journal editors publish editorial in 14 major medical journals stipulating that clinical researchers must agree to share a deidentified data set: Sharing clinical trial data: A proposal from the International Committee of Medical Journal Editors (Annals of Internal Medicine version). January 20, 2016.

A New England Journal of Medicine editorial in which deputy editor Dan Longo and editor-in-chief Jeffrey Drazen discuss details of the ICMJE proposal: Data sharing. January 21, 2016.

A follow-up editorial in the New England Journal of Medicine by Jeffrey Drazen: Data sharing and the Journal. January 25, 2016.

Editorial in the British Medical Journal: Researchers must share data to ensure publication in top journals. January 22, 2016.

Commentary in Nature from Stephan Lewandowsky and Dorothy Bishop: Research integrity: Don’t let transparency damage science. January 25, 2016.

National Public Radio interview on Morning Edition: Journal editors to researchers: Show everyone your clinical data with Harlan Krumholz. January 27, 2016.

Institute of Medicine (now the National Academy of Medicine) report advocating for responsible sharing of clinical trial data: Sharing clinical trial data: maximizing benefits, minimizing risk. National Academies Press, 2015.

Rethinking Clinical Trials Living Textbook Chapter, Acquiring and using electronic health record data, which describes the use of data collected in clinical practice for research and the complexities involved in sharing data. November 3, 2015.

NIH Health Care Systems Research Collaboratory data sharing policy. June 23, 2014.

List of International Committee of Medical Journal Editors (ICMJE) member journals.

New Living Textbook Chapter on Acquiring and Using Electronic Health Record Data for Research

Meredith Nahm Zozus and colleagues from the NIH Collaboratory’s Phenotypes, Data Standards, and Data Quality Core (now the Electronic Health Records Core) have published a new Living Textbook chapter about key considerations for secondary use of electronic health record (EHR) data for clinical research.

In contrast to traditional randomized controlled clinical trials where data are prospectively collected, many pragmatic clinical trials use data that were primarily collected for clinical purposes and are secondarily used for research. The chapter describes the steps a prospective researcher will take to acquire and use EHR data:

  • Gain permission to use the data. When a prospective researcher wishes to use data, a data use agreement (DUA) is usually required that describes the purpose of the research and the proposed use of the data. This section also describes use of de-identified data and limited data sets.
  • Understand fundamental differences in context. Data collected in routine care settings reflect standard procedures at an individual’s healthcare facility, and are not collected in a standard, structured manner.
  • Assess the availability of health record data. Few assumptions can be made about what is available from an organization’s healthcare records; up-front, detailed discussions about data element collection over time at each facility are required.
  • Understand the available data. A secondary data user must understand both the data meaning and the data quality; both can vary greatly across organizations and affect a study’s ability to support research conclusions.
  • Identify populations and outcomes of interest. Because healthcare facilities are obligated to provide only the minimum necessary data to answer a research question, investigators must identify the needed patients and data elements with specificity and sensitivity to answer the research question given the available data.
  • Consider record linkage. Studies using data from multiple records and sources will require matching data to ensure they refer to the correct patient.
  • Manage the data. The investigator is responsible for receiving, managing, and processing data and must demonstrate that the data are reproducible and support research conclusions.
  • Archive and share the data after the study. Data may be archived and shared to ensure reproducibility, enable auditing for quality assurance and regulatory compliance, or to answer other questions about the research.
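
The record linkage step in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not the approach described in the chapter. The field names (`last_name`, `first_name`, `dob`), the sample records, and the helper functions `normalize`, `link_key`, and `link_records` are all hypothetical. The sketch uses simple deterministic matching on a normalized name and date of birth. Any record with no match or more than one candidate is set aside for review rather than guessed at.

```python
def normalize(text):
    """Lowercase and drop punctuation and spaces so "O'Neil" matches "ONEIL"."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def link_key(record):
    """Build the matching key from hypothetical name and birth date fields."""
    return (
        normalize(record["last_name"]),
        normalize(record["first_name"]),
        record["dob"],
    )


def link_records(source_a, source_b):
    """Pair each record in source_a with its single exact-key match in source_b."""
    index = {}
    for rec in source_b:
        index.setdefault(link_key(rec), []).append(rec)

    linked, unmatched = [], []
    for rec in source_a:
        candidates = index.get(link_key(rec), [])
        if len(candidates) == 1:
            linked.append((rec, candidates[0]))
        else:
            # No match or an ambiguous match: leave for manual review.
            unmatched.append(rec)
    return linked, unmatched


# Made-up records from two hypothetical sources.
ehr = [
    {"id": "A1", "last_name": "O'Neil", "first_name": "Ann", "dob": "1970-02-03"},
    {"id": "A2", "last_name": "Smith", "first_name": "Lee", "dob": "1985-11-30"},
]
claims = [
    {"id": "B7", "last_name": "ONEIL", "first_name": "ann ", "dob": "1970-02-03"},
    {"id": "B9", "last_name": "Smith", "first_name": "Lee", "dob": "1985-12-30"},
]

linked, unmatched = link_records(ehr, claims)
print([(a["id"], b["id"]) for a, b in linked])
print([r["id"] for r in unmatched])
```

Exact matching of this kind misses records with typos or changed names, as the second pair shows with its differing birth month. Probabilistic or fuzzy matching is a common alternative when such errors are expected.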

In Nature: The Precision Medicine Initiative & DNA Data Sharing


A recent article in Nature highlights the Precision Medicine Initiative, launched in January 2015 and spearheaded by the National Institutes of Health. Precision medicine is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person. This initiative will involve collection of data on genomes, electronic health records, and physiological measurements from 1 million participants. A main objective is for participants to be active partners in research.

But a major decision faced by the initiative’s working group is how much information to share with participants about disease risk, particularly genetic data. Though there is much debate in the field, the article suggests that public opinion on data sharing may be shifting toward openness.

The Precision Medicine Initiative working group will be releasing a plan soon. For details on the goals of the Precision Medicine Initiative, read the perspective by NIH Director Dr. Francis Collins in the New England Journal of Medicine.


 

Study Examines Public Attitudes Toward Data-Sharing Networks


A new study examining public attitudes about the sharing of personal medical data through health information exchanges and distributed research networks finds a mixture of receptiveness and concerns about privacy and security. The study, conducted by researchers from the University of California, Davis and University of California, San Diego and published online in the Journal of the American Medical Informatics Association (JAMIA), reports results from a telephone survey of 800 California residents. Participants were asked for their opinions about the importance of sharing personal health data for research purposes and their feelings about related issues of security and privacy, as well as the importance of notification and permission for such sharing.

The authors found that a majority of respondents felt that sharing health data would “greatly improve” the quality of medical care and research. Further, many either somewhat or strongly agreed that the potential benefits of sharing data for research and care improvement outweighed privacy considerations (50.8%) or the right to control the use of their personal information (69.8%), although study participants also indicated that transparency regarding the purpose of any data sharing and controlling access to data remained important considerations.

However, the study’s investigators also found evidence of widespread concern over privacy and security issues, with substantial proportions of respondents reporting a belief that data sharing would have negative effects on the security (42.5%) and privacy (40.3%) of their health data. The study also explored attitudes about the need to obtain permission for sharing health data, as well as whether attitudes toward sharing data differed according to the purpose (e.g., for research vs. care) and the groups or individuals among which the data were being shared.

The authors note that while data-sharing networks are increasingly viewed as a crucial tool for enabling research and improving care on a national scale, they ultimately rely upon trust and acceptance from patients. As such, the long-term success of efforts aimed at building effective data-sharing networks may depend on accurately understanding the views of patients and accommodating their concerns.


Read the full article here: 

Kim KK, Joseph JG, Ohno-Machado L. Comparison of consumers' views on electronic data sharing for healthcare and research. J Am Med Inform Assoc. 2015 Mar 30. pii: ocv014. doi: 10.1093/jamia/ocv014. [Epub ahead of print]

NIH Finalizes Policy on Genomic Data Sharing


The National Institutes of Health has issued a final NIH Genomic Data Sharing (GDS) policy to promote data sharing as a way to speed the translation of data into knowledge, products, and procedures that improve health while protecting the privacy of research participants. The NIH news release contains highlights of the policy.

The GDS policy is an extension of and replaces the Genome-Wide Association Studies (GWAS) data sharing policy. A key tenet of the policy is the expectation that researchers obtain the informed consent of study participants for the potential future use of their de-identified data for research and for broad sharing. NIH has similar expectations for studies that involve the use of de-identified cell lines or clinical specimens.

NIH officials finalized the GDS policy after reviewing public comments on a draft released September 2013. Starting January 25, 2015, the policy will apply to all NIH-funded, large-scale human and non-human projects that generate genomic data. This includes research conducted with the support of NIH grants and contracts and within the NIH Intramural Research Program. A report from members of the NIH Genomic Data Sharing policy team appears in the August 27, 2014, advance online issue of Nature Genetics.


“Diagnosis: Data” Series Chronicles Applications of Healthcare Data in Camden, NJ


A series aired on American Public Media’s Marketplace program profiled real-world examples of using healthcare data to improve care and reduce costs in Camden, NJ.

1. Using data to treat the sickest and most expensive patients
2. Data: The secret ingredient in hospital cooperation
3. Data opens doors in healthcare, but then what?
Jeffrey Brenner, MD. Courtesy: MacArthur Foundation.

The first segment follows Dr. Jeffrey Brenner, a MacArthur award–winning family physician who is conducting a randomized controlled trial of a care program targeted at healthcare “superutilizers”—patients with chronic conditions who accumulate high numbers of hospital visits and associated costs. The second story highlights how hospitals in Camden have created a health information exchange to share data for their Medicaid patients, with the goal of providing better care and preventing waste associated with duplicate tests. Interviewees explain that despite the potential benefits, hospitals have traditionally been reluctant to share data, but this position may be changed by incentives. The final program demonstrates that data can only go so far, sometimes revealing challenges that are not easily addressed.

“Diagnosis: Data” was produced in collaboration with Healthy States as part of reporting work examining changes in healthcare in the wake of the Affordable Care Act.