Inpatient Endpoints in Pragmatic Clinical Trials

Choosing and Specifying Endpoints and Outcomes

Section 5



Eric L. Eisenstein, DBA

Kevin J. Anstrom, PhD

Meredith Zozus, PhD

Davera Gabriel, RN

Keith A. Marsolo, PhD

Bradley G. Hammill, PhD

Miguel Vazquez, MD

Lesley H. Curtis, PhD

Contributing Editor

Karen Staman, MS

Pragmatic Trial Inpatient Endpoints

In a traditional explanatory trial, data are usually collected on a case report form (CRF) by a study coordinator, but this can be an expensive and inefficient mechanism for gathering data. Explanatory trials also may include outcomes and endpoints that typically are not captured in clinical practice and may not be familiar to health care practitioners. In contrast, embedded pragmatic clinical trials (ePCTs) typically do not rely upon study coordinators for data collection and focus on outcomes that are “directly relevant to participants, funders, communities, and healthcare practitioners” (Califf and Sugarman 2015). Pragmatic trial approaches for acquiring patient information may include information from the electronic health record, claims from Medicare or health insurers, and patient reports. There are a number of considerations for using these “real-world” data for inpatient-based event ascertainment, and there are many different sources for acquiring this information. These sources have different time lags in their availability and differing types and extents of error and bias. These factors must be understood because they could affect a study’s design and results.

When an inpatient event is a PCT endpoint, the necessary data to answer a particular research question might vary depending on the level of specificity required. It might be enough to know that the patient was hospitalized in the last six months, or it might be important to know the particular cause, for example, if the patient was hospitalized for a heart attack. Using real-world data for inpatient event ascertainment also may require that inpatient events are redefined to make them more meaningful for practitioners and amenable to capture by real-world data sources. In other words, in order to be more pragmatic in data collection, an investigator may need to be more pragmatic in event definitions.

In this section, we will define how inpatient events are classified, describe the different data sources that can be used for inpatient event ascertainment and include information from the literature (where possible) about the reliability of these data sources. For most PCTs, a hybrid approach that uses more than one patient data source may be the best way to get reliable information as inexpensively as possible (Perkins et al. 2000). We also will provide case studies of pragmatic trials that use inpatient events as endpoints, and propose methods for evaluating the relative accuracy of different inpatient event data collection methods.

Role of inpatient endpoints

In certain therapeutic areas, inpatient events may capture the progression of disease, but this can be an imprecise assessment. Patients can be hospitalized for many reasons, so, depending on what is needed for a trial, the mere occurrence of a patient hospitalization may not provide sufficient information.

Current concepts about what constitutes inpatient and outpatient endpoints are grounded in Medicare’s Part A and Part B definitions. Medicare defines hospital status as follows:

  • An inpatient stay is when a patient is formally admitted to a hospital. “An inpatient admission is generally appropriate for payment under Medicare Part A when you’re expected to need 2 or more midnights of medically necessary hospital care, but your doctor must order this admission and the hospital must formally admit you for you to become an inpatient” (Centers for Medicare and Medicaid Services 2018).
  • An outpatient stay includes emergency department services, observation services, outpatient surgery, lab tests, X-rays, or any other hospital services when the doctor hasn’t written an order to admit a patient to a hospital, even if the patient stays overnight (Centers for Medicare and Medicaid Services 2018).

Other inpatient care, such as stays in rehab and skilled nursing facilities, is defined differently. Further, over the years, as inpatient census acuity has steadily risen, some care that used to be inpatient, such as diagnostic catheterization and percutaneous coronary intervention, can occur at a hospital but is considered outpatient (observation) care if the patient stay is not long enough to be considered inpatient. Not only is the definition of inpatient care changing, but there are also regional and health system-related variations in how inpatient care is defined and provided. Therefore, investigators may need to consider not only inpatient stays for outcome ascertainment, but also observation stays and emergency department visits for events that may be similar to a hospitalization.

For more, see the information sheet from Medicare: Are you an Inpatient or an Outpatient?

How endpoints are classified

Hospitalization endpoints may be classified differently depending on diagnosis. For example, in the 2017 Cardiovascular and Stroke Endpoints for Clinical Trials, the definitions for cardiovascular death, myocardial infarction, coronary artery bypass surgery, and stroke do not include the word “hospitalization” (Hicks et al. 2018), although it is likely that the patient was hospitalized for these conditions. The endpoint definitions that explicitly include the term “hospitalization” are for the conditions that are more challenging to diagnose, i.e., “hospitalization for unstable angina” and “hospitalization for heart failure.” For diseases like cancer, hospitalization is not a common endpoint; rather, endpoints are based on the progression of the disease. Recommended endpoints for cancer include disease-free survival, time to progression and progression-free survival, and treatment failure (U.S. Department of Health and Human Services 2018). Taking another tack, an FDA perspective on clinical trial endpoints focuses on direct endpoints (survival, symptoms, functioning) and indirect endpoints (biomarkers, walk tests, and tumor size) and does not focus on hospitalization per se (Burke 2012). Nonetheless, many FDA endpoints will occur in the inpatient setting and will require inpatient data for their determination.

In some cases, a hospitalization can be a marker of disease progression, and in other cases, a hospitalization can mean that a patient is getting well enough for treatment. As an example, consider acute heart failure syndrome, where there is a general lack of agreement about appropriate endpoints (Allen et al. 2009). As articulated by Allen et al., there are a number of reasons why hospital stay endpoints are challenging in patients with acute heart failure:

  • Preferences regarding staying in hospitals differ across patients
  • Regional practice patterns vary
  • Patients with a long index hospital stay are less at risk for a repeat hospital stay, because they have less time out of the hospital
  • Patients who do not survive are also not at risk for a repeat hospital stay (Allen et al. 2009).
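The last two points above concern time at risk: a participant who is still in the index hospital stay, or who has died, cannot be rehospitalized. One way to make this explicit is to report repeat hospitalizations per unit of time actually at risk. The following is a minimal sketch under stated assumptions; the `Participant` fields and the per-100-days scaling are hypothetical choices for illustration, not a method taken from Allen et al., and days spent in repeat hospital stays are ignored for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    followup_days: int            # length of the follow-up window (hypothetical field)
    index_stay_days: int          # days of the index hospital stay inside the window
    death_day: Optional[int]      # day of death within the window, or None if alive
    rehospitalizations: int       # repeat hospital stays observed in the window


def days_at_risk(p: Participant) -> int:
    """Days on which a repeat hospitalization could actually have occurred."""
    observed_end = p.followup_days if p.death_day is None else min(p.death_day, p.followup_days)
    return max(observed_end - p.index_stay_days, 0)


def rate_per_100_days(participants) -> float:
    """Repeat hospitalizations per 100 participant-days at risk."""
    events = sum(p.rehospitalizations for p in participants)
    exposure = sum(days_at_risk(p) for p in participants)
    return 100.0 * events / exposure if exposure else float("nan")


cohort = [
    Participant(followup_days=180, index_stay_days=5, death_day=None, rehospitalizations=1),
    Participant(followup_days=180, index_stay_days=30, death_day=None, rehospitalizations=1),
    Participant(followup_days=180, index_stay_days=10, death_day=60, rehospitalizations=0),
]
print(rate_per_100_days(cohort))
```

A simple count of rehospitalizations would treat all three hypothetical participants as equally exposed; the exposure-based rate does not.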

However, for a pragmatic trial, an inpatient hospitalization (acute care, observation, or emergency department) still may be the most appropriate outcome measurement.

Pragmatic Trial Outcomes

There are many factors to consider when choosing what outcomes to measure and how to measure them, including feasibility, interference with practice, validity, and precision of specific measures (Welsing et al. 2017). Because pragmatic trials focus on problems that are directly relevant to patients and clinician decision-making, the choice regarding outcome should be made with stakeholder preferences and requirements in mind (Welsing et al. 2017). There also are specific considerations related to inpatient events, such as:

  • What is the concept of interest?
  • What data are needed to answer the research question? What data elements are needed to classify or make a decision about an event?
  • What is routinely measured for the anticipated study population? Are the data commonly documented with consistency during routine care? When and how are the data captured? Does the source provide those data with the right level of specificity and timing?
  • What is the quality of the data? Are the measures valid, accurate, and precise?
  • What opportunities are available to measure the accuracy of the data?
  • What other data sources can be combined to make the data more reliable?
  • Can an inpatient event be attributed to a specific cause? How? What confirmatory data are needed? Can manual review be avoided?
  • Are there cases that do not present to the facility?
  • How will you capture out-of-network events?
  • Are there disease modifiers that will change the treatment effect? (Welsing et al. 2017)

After some initial decisions are made, an investigator will need to consider how to collect the data with as little burden and disruption to the health care system’s workflow as possible (Larson et al. 2015).

Key questions to be considered include:

  • Will the act of measuring change clinical practice or generalizability of the study’s results?
  • Do different local practices and workflow result in different interpretations of the same data? How might data availability and interpretation differ by facility, unit or provider?
  • Are the documentation procedures consistent across facilities, units, and providers?
  • What costs are associated (consider both monetary and time costs)?

If, after considering these questions, it does not seem practicable to measure the selected outcome, an investigator may need to reconsider what to measure.

Inpatient Event Data Sources

In explanatory trials, inpatient events for enrolled patients typically are identified through four inter-related mechanisms: (1) reports run on EHR-based patient lists, (2) hospital service admissions, (3) follow-up patient phone contact, and (4) retrospective medical reviews. Many EHRs have the ability to identify patients associated with specific studies who meet certain criteria (e.g., inpatient admission or emergency room visit) and can send an alert to a provider associated with that study. When these patients are admitted, for instance, the site’s clinical research coordinator (CRC) will be notified. Additionally, CRCs will screen patients admitted to their services each day to see whether any study patients have been admitted. Many studies require follow-up phone contact with a patient to schedule visits or obtain patient information. These contacts also provide an opportunity to obtain hospitalization information. However, patients may not recall all of their inpatient events or may not be contacted in a specific follow-up interval. For this reason, CRCs will periodically conduct retrospective medical reviews as an additional source of inpatient event data. All of these methods work together to allow CRCs to identify admissions at their institutions. However, patient report may be the only method for identifying admissions at other institutions.

Each of the four methods described above can be used to trigger an inpatient event. Some trials will require that these triggered events be validated by independent clinical events classification (CEC) procedures. This is beneficial when there is subjectivity and inconsistency involved in event ascertainment. Each trial’s site instructions will identify which types of information are required to validate specific inpatient events (e.g., stroke or myocardial infarction). The CRC will forward event-specific information, such as ECGs, procedure reports, and test results, to the CEC organization, which will then use standard definitions and procedures to determine whether an event occurred or to classify an event in some other way.

In contrast with explanatory trials, pragmatic trials seek to minimize site-based direct data collection activities. This means they will rely upon patient-reported data or upon the secondary use of data collected as a part of patient care. In the following section, we discuss each of these data collection options.


Patient-Reported Data

The participant can be a valuable source of data on hospitalizations. Patient-reported data are a relatively low-cost option and help keep participants engaged in the research (Marsolo 2019); however, the accuracy of patient-reported outcomes, including outcomes as notable as hospitalizations, has been questioned (Klungel et al. 1999; Barr et al. 2009; Krishnamoorthy et al. 2016). One study compared the cumulative incidence of re-hospitalizations identified by patient report versus medical bills and found that 10% of patients over-reported and 18% under-reported hospitalizations (Krishnamoorthy et al. 2016). Of note, there were significant differences in patient characteristics associated with being an under- or over-reporter: under-reporters were more likely to be older, female, African American, unemployed, or a non-high-school graduate; over-reporters were more likely to be female and unemployed (see figure; Krishnamoorthy et al. 2016).

The authors of this study conclude that pragmatic research studies should not rely on patient report alone to identify re-hospitalizations and suggest that additional mechanisms are needed. All sources of data have flaws, and this is especially true of real-world data. Using multiple sources is a common technique for addressing this.

“A major challenge of pragmatic research trial design is to navigate the tension between the uncertain accuracy of patient self-report and the need to efficiently utilize study resources and lower trial costs.” (Krishnamoorthy et al. 2016)
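A comparison of self-report against billing records, like the one described above, can be sketched in a few lines. This is an illustrative sketch only: the function names, the dictionary layout (participant ID mapped to a hospitalization count), and the sample values are assumptions, and it does not reproduce the analysis in Krishnamoorthy et al. 2016.

```python
def classify_reporting(self_reported: dict, billed: dict) -> dict:
    """Label each participant present in both sources as over, under, or concordant."""
    labels = {}
    for pid in self_reported.keys() & billed.keys():
        if self_reported[pid] > billed[pid]:
            labels[pid] = "over"        # reported more stays than the bills show
        elif self_reported[pid] < billed[pid]:
            labels[pid] = "under"       # reported fewer stays than the bills show
        else:
            labels[pid] = "concordant"
    return labels


def proportions(labels: dict) -> dict:
    """Share of participants in each reporting category."""
    n = len(labels)
    if n == 0:
        raise ValueError("no participants to summarize")
    return {k: sum(1 for v in labels.values() if v == k) / n
            for k in ("over", "under", "concordant")}


# Hypothetical counts of re-hospitalizations per participant.
self_reported = {"a": 2, "b": 0, "c": 1, "d": 1}
billed = {"a": 1, "b": 1, "c": 1, "d": 1}
labels = classify_reporting(self_reported, billed)
print(proportions(labels))
```

Treating the bills as the reference is itself an assumption; bills can also miss events, which is one reason to combine sources.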

There are a number of different methods to get information regarding patient-reported hospitalizations, and some are more passive (and less expensive) than others. The three most frequently used are patient portals, mobile apps, and call centers.


Patient portals

As part of a clinical study, participants may be asked to self-report hospitalizations either through a portal or a mobile app. For example, the ADAPTABLE trial studies the effect of low-dose aspirin (baby aspirin) versus regular-dose aspirin on outcomes such as myocardial infarction, mortality, and hospitalizations in individuals with heart disease. To report their health status, participants are prompted to log in to a patient portal. However, because this method might miss some patients, a call center additionally contacts patients who do not log in to the portal (National Patient-Centered Clinical Research Network (PCORnet) 2015).

Mobile apps

Mobile apps can be used as standalone methods for patient data collection or in combination with other technologies.

Some “enhanced recovery after surgery” programs use text messaging for extended surveillance. For example, text messaging has been used for home-surveillance after colorectal surgery, where text message questionnaires led to follow-up care, including  in-hospital care, re-hospitalizations, and unplanned surgeries (Carrier et al. 2016).

iPhones and other mobile technologies can be used to report hospitalizations. For example, the National Evaluation System for Health Technology (NEST) was developed to ensure the post-market safety and effectiveness of medical devices (Fleurence and Shuren 2019). A NEST feasibility/pilot Demonstration Project is being conducted to evaluate an mHealth platform for obtaining real-world post-market surveillance data for patients either after bariatric surgery (sleeve gastrectomy or gastric bypass) or after catheter-based atrial fibrillation ablation (Dhruva 2018). The goal is to enroll 60 participants at their pre-operative appointment for 8 weeks of post-procedure follow-up (ClinicalTrials.gov identifier NCT03436082). The investigators plan to obtain electronic medical record data, pharmacy data, mobile device data, and patient-reported outcomes (through questionnaires). The mobile application is HugoPHR, which aggregates data from the EHR, pharmacy portals, wearable devices, and patient-reported outcomes (PROs). HugoPHR is a sync-for-science model: people gain access to their EHR and pharmacy data and sync these (along with their wearable device data) for research purposes (Dhruva 2018). It is important to note that a comprehensive picture can be obtained only if patients link their data at all the places where they receive care, and only approximately 600 sites are currently linked. This will become more feasible as more sites adopt the Fast Healthcare Interoperability Resources (FHIR) standards. The data will include information on encounters, such as a hospitalization, duration of hospitalization, and reason for hospitalization, if the patient links the app with the health system.

Call center

In a pragmatic trial, a call center can be used as back-up for other secondary information sources (as with the ADAPTABLE example above), or it can be the primary source of information (as with the TRANSFORM-HF example later in this section). The important point is that a call center can be used to validate inpatient events that are triggered by other means. For example, the INfluenza Vaccine to Effectively Stop Cardio Thoracic Events and Decompensated heart failure (INVESTED) trial is a pragmatic trial comparing the effectiveness of high-dose flu vaccine versus the standard dose in patients with a history of recent heart failure or myocardial infarction hospitalization (Vardeny et al. 2018).  Participants will be asked to inform site personnel of hospitalizations and will be called during influenza season and during the summer to ascertain outcomes.

Secondary Data Sources: EHR

One of the problems with longitudinal data collection is ensuring you get data from all the sites where a patient is treated. How do you collect EHR data from another hospital that is outside the participating system? If a patient has a heart attack while on vacation, how will you capture that? It is generally recommended to use multiple mechanisms to get secondary use data on longitudinal outcomes, such as a call center and Medicare claims. Here we describe some of the methods that use the EHR to collect inpatient event information from the enrolling and other health systems. The three most frequently used methods are database extraction from individual sites, the use of a common data model by all participating sites in a study, and the use of standards-based methods for extracting data from site EHRs.
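When several mechanisms feed the same endpoint, the same admission may arrive more than once (for example, from the enrolling site's EHR and again from a call center interview). The sketch below shows one possible way to merge such reports and keep track of which sources support each event. The tuple layout, the source labels, and the one-day matching tolerance are assumptions made for illustration; a real study would define its own matching rules.

```python
from datetime import date


def merge_events(reports, tolerance_days=1):
    """Merge (participant_id, admit_date, source) reports into distinct events.

    Reports for the same participant whose admission dates fall within
    `tolerance_days` of the first date in a cluster are treated as one event.
    """
    merged = []
    for pid, admit, source in sorted(reports, key=lambda r: (r[0], r[1])):
        last = merged[-1] if merged else None
        if (last is not None
                and last["participant_id"] == pid
                and (admit - last["admit_date"]).days <= tolerance_days):
            last["sources"].add(source)   # same event seen by another source
        else:
            merged.append({"participant_id": pid, "admit_date": admit, "sources": {source}})
    return merged


# Hypothetical reports from four mechanisms.
reports = [
    ("P1", date(2019, 3, 1), "ehr"),
    ("P1", date(2019, 3, 2), "patient_report"),
    ("P1", date(2019, 6, 10), "claims"),
    ("P2", date(2019, 3, 1), "call_center"),
]
events = merge_events(reports)
single_source = [e for e in events if len(e["sources"]) == 1]
print(len(events), len(single_source))
```

Events supported by a single source (here, the out-of-network claim and the call center report) are natural candidates for confirmation through another mechanism.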

Database Extract

IT staff or a vendor-based IT resource can generate a database extract at a given site.  These extracts can be large scale and automated, but complex queries rely on the skills of the local analyst, and extracts are more difficult to do at smaller sites that may not have the resources (Marsolo 2019). Hence, this method may have limited applicability in multi-center PCTs.

Common Data Model

Information about hospitalizations or encounters can be recorded in different ways across sites (and, as we discussed in the previous sections, across diagnoses). Sites that participate in distributed research networks (such as PCORnet or Sentinel) or national registries (such as the American College of Cardiology’s National Cardiovascular Data Registry) may have agreed to format their study data in a pre-specified way, i.e., by using a common data model (CDM). If a cause for a hospitalization is needed, the required level of specificity may not be available because it is not a part of the CDM definition, although common data models usually use standard controlled terminologies such as ICD-9 or ICD-10 codes for diagnoses or RxNorm for drugs. However, these terminologies do not guarantee that data from different facilities are comparable. For example, different site coding practices can cause differences in which patients are counted as having one condition versus another. Further, common data models require mapping data from an EHR repository or institutional data warehouse to the common data model. Usually the common data model is the more abstract of the two, and useful context and other detail are lost in the mapping (Garza et al. 2016). The encounter endpoints that include hospitalization for PCORnet and Sentinel’s common data models are described below.

Sentinel is a U.S. medical product surveillance system designed to monitor FDA-regulated medical products. The Patient-Centered Outcomes Research Institute (PCORI) funded PCORnet to build a network-of-networks to support clinical research. Their common data model was built based on the Sentinel model (Table below), but was extended to support many of the data elements found in EHRs. The table below shows the “type of encounter” definitions for the Sentinel Common Data Model and the National Patient-Centered Clinical Research Network (PCORnet).


| Name | Definition | Sentinel | PCORnet |
|---|---|---|---|
| Ambulatory Visit (AV) | Includes visits at outpatient clinics, same day surgeries, urgent care visits, and other same-day ambulatory hospital encounters, but excludes emergency department encounters. | x | x |
| Emergency Department (ED) | Includes ED encounters that become inpatient stays (in which case inpatient stays would be a separate encounter). Excludes urgent care visits. ED claims should be pulled before hospitalization claims to ensure that ED with subsequent admission won't be rolled up in the hospital event. | x | x |
| ED Admit to Inpatient (EI) | Emergency Department Admit to Inpatient Hospital Stay: Permissible substitution for preferred state of separate ED and IP records. Only for use with data sources where the individual records for ED and IP cannot be distinguished. | | x |
| Inpatient Hospital (IP) | Includes all inpatient stays, same-day hospital discharges, hospital transfers, and acute hospital care where the discharge is after the admission date. (PCORnet only: Does not include observation stays, where known.) | x | x |
| Observation Stay (OS) | “Hospital outpatient services given to help the doctor decide if the patient needs to be admitted as an inpatient or can be discharged. Observation services may be given in the emergency department or another area of the hospital.” (Definition from Medicare, CMS Product No. 11435.) | | x |
| Institutional Professional Consult (IC) | Permissible substitution when services provided by a medical professional cannot be combined with the given encounter record, such as a specialist consult in an inpatient setting; this situation can be common with claims data sources. This includes physician consults for patients during inpatient encounters that are not directly related to the cause of the admission (e.g., an ophthalmologist consult for a patient with diabetic ketoacidosis; guidance updated in v4.0). | | |
| Non-Acute Inst. Stay (IS) | Includes hospice, skilled nursing facility (SNF), rehab center, nursing home, residential, overnight non-hospital dialysis and other non-hospital stays. | x | x |
| Other Ambulatory (OA) | Includes other non-overnight AV encounters such as hospice visits, home health visits, skilled nursing facility visits, other non-hospital visits, as well as telemedicine, telephone and email consultations. (PCORnet only: May also include "lab only" visits [when a lab is ordered outside of a patient visit], "pharmacy only" [e.g., when a patient has a refill ordered without a face-to-face visit], "imaging only", etc.) | x | x |
| Other (OT) | | | |
| Unknown (UN) | | | |
| No Information (NI) | | | x |

There are many possible encounter types, and it is important for investigators to understand encounter type definitions and to harmonize them across sites if possible. As an example, in 2014-15 the University of California Research Exchange (UCReX) harmonized data from EHR sources across five University of California medical campuses in order to establish a common definition against which a single query would return patient counts across the geographically distributed but federated system architecture (Gabriel et al. 2014). The data harmonization team discovered that across the sites contributing EHR data extracts there were 60 unique encounter types (Gabriel et al. 2014). Many EHRs have 100-200 encounter types, which creates two important considerations: (1) whether people can correctly map their encounter types to the common data model’s specified value set, and (2) whether those value sets contain enough granularity for the research question. In some cases, there may be benefit in using the “raw” encounter types instead of the harmonized ones.
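Both considerations can be checked mechanically once a site has written down its mapping. The sketch below is a hypothetical illustration: the local encounter labels in `SITE_MAP` are invented, and only the two-letter codes come from the encounter type table above. It keeps the raw label next to the harmonized code, so that the raw types remain available, and it reports labels that have no mapping rather than silently assigning them.

```python
from collections import Counter

# Encounter type codes from the table above.
CDM_ENCOUNTER_TYPES = {"AV", "ED", "EI", "IP", "OS", "IC", "IS", "OA", "OT", "UN", "NI"}

# Hypothetical local encounter labels for one site; real sites will differ.
SITE_MAP = {
    "Hospital Encounter - Inpatient": "IP",
    "Observation": "OS",
    "Emergency": "ED",
    "Office Visit": "AV",
    "Telephone": "OA",
}


def harmonize(raw_types, site_map):
    """Return (raw, code) pairs plus a count of raw labels with no mapping."""
    invalid = set(site_map.values()) - CDM_ENCOUNTER_TYPES
    if invalid:
        raise ValueError(f"Mapping targets not in value set: {sorted(invalid)}")
    mapped = [(raw, site_map.get(raw)) for raw in raw_types]   # code is None if unmapped
    unmapped = Counter(raw for raw, code in mapped if code is None)
    return mapped, unmapped


mapped, unmapped = harmonize(["Office Visit", "Observation", "Surgery Admit"], SITE_MAP)
print(mapped)
print(unmapped)
```

A large or growing `unmapped` count is a signal that the site mapping, or the granularity of the value set, needs review before the data are used for an endpoint.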

Standards-Based Data Exchange

Fast Healthcare Interoperability Resources (FHIR) is emerging as a standard to (1) extract data from an EHR and (2) insert it in the correct place in a study database (Garza et al. 2019). There have been federal efforts to require that all certified EHRs be able to send data via FHIR.

To date, most FHIR applications have been for ‘business purposes’ (e.g., pharmacy prescriptions), while FHIR’s research resources remain less well developed. There are currently only two FHIR research resources, ResearchStudy and ResearchSubject, and both are at HL7 maturity level zero. Hence, while the use of FHIR for research has great potential, it is largely untested.

Dynamic Data Pull (DDP) is one of the few data exchange products available for research. This middleware pulls patient data from Epic into a REDCap database, is readily configurable by non-developers, and can be used across multiple sites (Campion et al. 2017), making it an option for collection of outcome data in PCTs. However, it is currently not 21 CFR Part 11 compliant and does not use the FHIR standard (Campion et al. 2017).

Data exchange approaches face two major hurdles. First, because EHR data are not collected in a standard way, there is potential for mapping issues (Marsolo 2019). That is, two sites with the same EHR may map FHIR resources in slightly different ways. Additionally, many sites will have limited experience delivering data in this way, and the skillset to develop, maintain, and deliver data through data exchange is highly specialized (Marsolo 2019). Some of these issues are being mitigated through the US Core and Argonaut (consensus mappings) and by EHR vendors implementing these mappings as a part of their products. Nonetheless, facility-specific implementation decisions will impact the EHR vendor standard mappings.
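To make the mapping concern concrete, the sketch below filters hospital-like encounters out of a FHIR Bundle that has already been retrieved and parsed into a Python dictionary. It assumes FHIR R4, where `Encounter.class` is a single coding, and it assumes the site populates that field with HL7 v3 ActCode values (`IMP`, `ACUTE`, `OBSENC`, `EMER`). The function name, the grouping of codes into settings, and the sample bundle are hypothetical; sites that map their encounter types differently would need a different lookup.

```python
# Assumed grouping of HL7 v3 ActCode encounter classes into study settings.
HOSPITAL_LIKE = {
    "IMP": "inpatient",
    "ACUTE": "inpatient",
    "OBSENC": "observation",
    "EMER": "emergency",
}


def hospital_like_encounters(bundle: dict) -> list:
    """Extract inpatient, observation, and emergency encounters from a parsed Bundle."""
    found = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Encounter":
            continue
        code = resource.get("class", {}).get("code")   # R4: class is one Coding
        if code in HOSPITAL_LIKE:
            period = resource.get("period", {})
            found.append({
                "id": resource.get("id"),
                "setting": HOSPITAL_LIKE[code],
                "start": period.get("start"),
                "end": period.get("end"),
            })
    return found


# Hand-written sample bundle for illustration only.
bundle = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Encounter", "id": "e1", "class": {"code": "IMP"},
                      "period": {"start": "2019-03-01", "end": "2019-03-04"}}},
        {"resource": {"resourceType": "Encounter", "id": "e2", "class": {"code": "AMB"}}},
        {"resource": {"resourceType": "Encounter", "id": "e3", "class": {"code": "OBSENC"},
                      "period": {"start": "2019-05-10"}}},
        {"resource": {"resourceType": "Patient", "id": "p1"}},
    ],
}
print(hospital_like_encounters(bundle))
```

Because two sites with the same EHR may populate the class field differently, the lookup table is the part most likely to need site-specific review.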

Secondary Data Sources: Claims

Claims are another secondary data source for collecting inpatient event information from the enrolling and other health systems. However, billed diagnoses have been shown to be less reliable than physician adjudication for identifying potential events (Guimarães et al. 2017). For example, in the Treatment With Adenosine Diphosphate Receptor Inhibitors: Longitudinal Assessment of Treatment Patterns and Events After Acute Coronary Syndrome (TRANSLATE-ACS) study, investigators compared the 1-year incidence of events after acute myocardial infarction as identified by medical claims or physician adjudication. They found modest agreement for MI and stroke, and poor agreement for bleeding (Guimarães et al. 2017).
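Agreement between two ascertainment methods is often summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below computes it for a yes/no event indicator per participant. It is offered as one common statistic under stated assumptions: the sample values are invented, and nothing here is taken from the TRANSLATE-ACS analysis itself.

```python
def cohens_kappa(claims, adjudicated):
    """Cohen's kappa for two equal-length lists of 0/1 event indicators."""
    n = len(claims)
    if n == 0 or n != len(adjudicated):
        raise ValueError("inputs must be non-empty and of equal length")
    a = sum(1 for c, j in zip(claims, adjudicated) if c and j)          # both say event
    b = sum(1 for c, j in zip(claims, adjudicated) if c and not j)      # claims only
    c_ = sum(1 for c, j in zip(claims, adjudicated) if not c and j)     # adjudication only
    d = n - a - b - c_                                                  # both say no event
    observed = (a + d) / n
    expected = ((a + b) * (a + c_) + (c_ + d) * (b + d)) / n ** 2
    if expected == 1.0:
        return float("nan")   # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Hypothetical indicators for ten participants.
claims = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
adjudicated = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
print(round(cohens_kappa(claims, adjudicated), 3))
```

In this invented example the two sources agree on 8 of 10 participants, but because most participants have no event, a substantial share of that agreement is expected by chance.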

There are several sources of claims data that can be used for research:

  • Medicare Claims Data includes data from Medicare beneficiaries who enroll in the traditional fee-for-service Medicare program (and does not include data from patients who enroll in Medicare Advantage).
  • Medicare Advantage Data includes both Parts A and B but is provided by private insurance companies and is therefore not included in the data sources that we describe below.
  • Collected bills from private insurance companies. For example, in the Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE) study, investigators engaged with two large, national insurers to support record linkage for participating members. (See the chapter on Using EHR Data for more information.)
  • Collected bills from a patient’s inpatient care facilities. (See the TRANSLATE-ACS example below)

It is important to note that for the data sources described below, patients authorize data to be provided from their providers to a study. This is different from the workflow in the previous section that described EHR data use in research, i.e., pulling data for patients on a study from facilities where (1) patients are associated with a study in the EHR and (2) the study consented the patients for the data to be pulled.

Blue Button / Hugo / Direct (Summary of Care)

Blue Button was created by the US Department of Health and Human Services as an online tool that allows patients to view, print, and download their medical records (Turvey et al. 2014) and was intended to help with coordination of care. Blue Button is available on the patient portal for Medicare beneficiaries, for veterans (MyHealtheVet), and on the patient portals of those practices and clinics that choose to use it. Medicare beneficiaries can download three years of claims data, and veterans can download “demographic information (age, gender, ethnicity and more), emergency contacts, a list of their prescription medications, clinical notes, and wellness reminders.” With Blue Button, the patient provides the data for research; however, the completeness of the data varies by site and EHR. The document is text-based (an XML file) and would need to be parsed through an app (in this case Hugo), and a patient would need to request a file from each site where they receive care (Marsolo 2019). The validity and reliability of Blue Button for use in research have not been tested.

The goal of Hugo is to combine clinical data from all of a person’s encounters (pharmacy, labs, health systems, and payors) with patient-generated and patient-reported data for use by the patient (Beckman and Gupta 2018). Although the use of Hugo is relatively new in “sync-for-science” research, a recent study demonstrated its feasibility for enrollment and continued engagement in 25 people after percutaneous coronary intervention (Dhruva et al. 2019). Hugo Health can be downloaded in the app store for both Apple and Android use on a smartphone or tablet or accessed through a website, and a person can add a study code (given to them by their health care provider) to join a study. Apple Health Records is in beta testing and is an option that is available for use on iPhones; it uses an application programming interface (API) that leverages FHIR so that the records are computable, but there is still some question about the validity of these data because the product is so new (Marsolo 2019). For both Hugo and Apple Health Records, the participant needs to request their information at each health system or pharmacy they go to, and the list of sites that use these solutions is growing.

The CMS Blue Button 2.0 API enables Medicare beneficiaries to authorize third parties to obtain and use their Part A, B, and D claims data directly from CMS (as opposed to through the Blue Button patient portal) for coordinating care, services, and research. The API uses the FHIR standard.

A proposed rule (mentioned above) from the Office of the National Coordinator for Health Information Technology (ONC) and the Centers for Medicare and Medicaid Services (CMS) would mandate a United States Core Data for Interoperability (USCDI) standard, which, if adopted, would require that CMS agencies support FHIR APIs. The rule would also add data beyond those included in the current common clinical data model to support nationwide interoperability of CMS data. A Fact Sheet describes the data classes, including those that will be added if the measures are adopted (clinical notes, data provenance, pediatric vital signs, patient address and phone number [to support data matching]).

CMS Research Identifiable Files

The traditional method for obtaining CMS data for research is through a formal request process and shipment of data files. Data can be requested from the Medicare program (which covers 95% of people aged 65 and older) or the Medicaid program (which covers low-income children, pregnant women, people with disabilities, and some elderly and non-elderly adults, although coverage differs by state, since Medicaid programs are state-run). Researchers can request the public-use data set, a limited data set with de-identified data, or a research identifiable file (RIF) with individual-level data. For use in trial follow-up, RIFs are the only option; to obtain RIF data on individuals, an investigator must obtain protected health information (PHI), such as social security numbers and dates of birth, and send it to the CMS data distributor for linking. This adds more difficulty to the process. Although these data are well curated, gaining access to them can be expensive, the CMS request process can be time-consuming, and data latency can be an issue (Marsolo 2019).

Collected Participant Bills

Explanatory trial economic and quality of life studies frequently collect and abstract participant bills from study sites. However, this process is expensive and requires specially trained individuals with expertise in hospital billing and accounting systems. Hence, this method would not be preferred for a pragmatic trial.

Case Studies

In this section, we use case studies to illustrate some of the challenges faced in pragmatic trials that use inpatient events as endpoints.

1. ICD-Pieces: Understanding how reimbursements affect outcomes measurement

As described earlier in this section, Medicare defines stays in the hospital as either inpatient, outpatient, emergency department, or observation stays.

Medicare Part A pays for inpatient care at a hospital. As part of the plan, a typical patient pays $0 for the first 20 days of inpatient stay, although there is a Part A inpatient deductible of $1364 (in 2019).

Medicare Part B pays for outpatient care or care received under observation status. A typical patient usually pays 20% for this care (Medicare Interactive 2019).

To reduce costs, in 2016, Medicare updated the rule regarding when inpatient admissions are appropriate for payment under Medicare Part A:

Two-Midnight rule: “Inpatient admissions would generally be payable under Part A if the admitting practitioner expected the patient to require a hospital stay that crossed two midnights and the medical record supported that reasonable expectation (Centers for Medicare and Medicaid Services 2015).”  Note that Medicare considers observation care an outpatient service, even if that patient stays overnight in the hospital. In addition, Medicare pays for nursing home care only if a patient spends at least three consecutive days in the hospital as an admitted patient, and observation care doesn’t count on that clock (Medicare Interactive 2019).
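To make the "crossed two midnights" criterion concrete, the following minimal sketch counts the midnights between an admission and a discharge timestamp. It is only a retrospective illustration: the rule itself rests on the admitting practitioner's expectation at the time of admission, not on the realized length of stay, and the function names are hypothetical.

```python
from datetime import datetime


def midnights_crossed(admit, discharge):
    """Count the midnights that fall between admission and discharge."""
    if discharge < admit:
        raise ValueError("discharge precedes admission")
    return (discharge.date() - admit.date()).days


def crosses_two_midnights(admit, discharge):
    """True when the stay spans at least two midnights."""
    return midnights_crossed(admit, discharge) >= 2


# A 26-hour stay can cross two midnights ...
short_stay = (datetime(2019, 3, 1, 23, 0), datetime(2019, 3, 3, 1, 0))
# ... while a 46-hour stay may cross only one.
long_stay = (datetime(2019, 3, 1, 1, 0), datetime(2019, 3, 2, 23, 0))

print(crosses_two_midnights(*short_stay))
print(crosses_two_midnights(*long_stay))
```

The two example stays show why a midnight-based definition and an hours-based definition of a hospitalization can classify the same stay differently.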

This Two-Midnight rule differs from definitions currently used by clinical researchers and should be considered when investigators are designing a pragmatic clinical trial. For example, the Improving Chronic Disease Management with Pieces (ICD-Pieces) trial was designed to determine if clinical decision support tools (PIECES) used by practice facilitators can improve the outcomes of patients with chronic kidney disease (CKD), diabetes, and hypertension. The primary outcome is the one-year hospitalization rate (unanticipated hospitalization based on the hospital-wide readmission standard from CMS; see definition below), and secondary outcomes include ED visits, readmissions, CV events, and death. About two years into the trial, the two-midnight rule was implemented. The original planned primary outcome was unanticipated hospitalization based on the hospital-wide readmission standard from CMS. Early on in the planning year (the UH2 phase), because of implementation of new criteria for a “2 midnight rule,” some hospital systems were shifting admissions into observation status (Vazquez and Oliver 2019).

Because of this change in criteria, the investigators defined an unanticipated hospitalization to include both observation stays and unplanned hospitalization stays (Vazquez and Oliver 2019). The ICD-Pieces team was thus able to modify their definition of hospitalization as an endpoint so that it more accurately reflects actual care (Vazquez and Oliver 2019).

CMS defines a hospital readmission as occurring for Medicare beneficiaries who have a readmission for any cause, except for certain planned readmissions, within 30 days from the date of discharge after an eligible index admission. If a beneficiary has more than one unplanned admission (for any reason) within 30 days after discharge from the index admission, only one is counted as a readmission (Centers for Medicare and Medicaid Services 2016).
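The counting logic in this definition can be sketched as a yes/no flag per index admission, so that several unplanned admissions inside the window still count once. This is a simplified illustration, not the CMS measure specification: the function name, the `(admit_date, is_planned)` input format, and the handling of the window boundaries are assumptions, and eligibility rules for the index admission are not modeled.

```python
from datetime import date


def has_30_day_readmission(index_discharge, admissions, window_days=30):
    """Return True if any unplanned admission falls within the window.

    admissions: iterable of (admit_date, is_planned) tuples.
    Because the result is a single flag, multiple unplanned admissions
    within the window are counted as one readmission.
    """
    for admit_date, is_planned in admissions:
        days_after = (admit_date - index_discharge).days
        if not is_planned and 0 <= days_after <= window_days:
            return True
    return False


# Hypothetical patient: two unplanned admissions and one planned admission.
index_discharge = date(2019, 1, 10)
admissions = [
    (date(2019, 1, 20), False),  # unplanned, day 10
    (date(2019, 2, 1), False),   # unplanned, day 22
    (date(2019, 2, 5), True),    # planned, excluded
]

readmission_count = int(has_30_day_readmission(index_discharge, admissions))
print(readmission_count)
```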

CMS guidance for reporting outcomes has gone from just hospitalization and readmission to a new standard: excess days in acute care. In response, health systems have kept patients in observation care or emergency departments. This is important because Medicare’s hospitalization definition may differ from that used in clinical practice.

  • Part A (hospital insurance) pays for care received at the hospital by an inpatient.
  • Part B (medical insurance) pays for care received at the hospital by an outpatient under observation status.
  • Part C (Medicare Advantage) includes both Parts A and B but is provided by private insurance companies.


2. Case Example: TRANSFORM-HF

The ToRsemide compArisoN with furoSemide FOR Management of Heart Failure (TRANSFORM-HF) trial (Identifier: NCT03296813) is a cluster-randomized PCT for patients hospitalized for new or worsening heart failure (~6,000 patients and ~50 sites). Patients receive a prescription for an oral diuretic (torsemide or furosemide; both are commonly prescribed) prior to hospital discharge; all-cause mortality (see Using Death as an Endpoint) and re-hospitalization over 12 months are endpoints.

A re-hospitalization event is defined as “an admission to an inpatient unit or a visit to an emergency department that results in at least a 24-hour stay (or a change in calendar date if the time of admission/discharge is not available) after discharge from the index hospitalization. This definition excludes observational stays that are less than 24 hours. Only hospitalizations verified with an official medical record, as described in the process detailed below, and that meet the minimum stay definition will count as a re-hospitalization event for the secondary endpoint analysis. The type and reason for the admission are not factored into the event definition (from the TRANSFORM-HF statistical analysis plan).”
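The minimum-stay portion of this definition can be written as a small rule: require at least 24 hours when admission and discharge times are known, and otherwise fall back to a change in calendar date. The sketch below is one possible reading of the quoted definition, with a hypothetical function name; it does not cover the medical record verification step.

```python
from datetime import date, datetime, timedelta


def meets_minimum_stay(admit, discharge):
    """Apply the 24-hour rule, or the calendar-date fallback.

    Pass datetime objects when times are available and date objects
    when only the calendar dates are known.
    """
    times_known = isinstance(admit, datetime) and isinstance(discharge, datetime)
    if times_known:
        return discharge - admit >= timedelta(hours=24)
    # Fallback: times unavailable, so require a change in calendar date.
    admit_day = admit.date() if isinstance(admit, datetime) else admit
    discharge_day = discharge.date() if isinstance(discharge, datetime) else discharge
    return discharge_day > admit_day


# 23-hour stay with known times: does not qualify.
print(meets_minimum_stay(datetime(2019, 5, 1, 22, 0), datetime(2019, 5, 2, 21, 0)))
# Dates only, discharge on the following day: qualifies under the fallback.
print(meets_minimum_stay(date(2019, 5, 1), date(2019, 5, 2)))
```

Note that the same 23-hour stay would qualify if its times were missing, which is one way data availability can influence event counts.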

Re-hospitalization events may be triggered via telephone interviews at 1, 6, and 12 months with the patient or proxy, through official medical records obtained when neither the patient nor any of their proxies can be reached, or through discharge summaries from a 12-month medical record query that screens for hospitalizations 12 months after randomization at the patient’s hospital/site (Figure; TRANSFORM HF Statistical Analysis Plan). If an event is triggered, then the event is recorded in the database along with the date and time of admission and discharge.

Figure. TRANSFORM-HF Hospitalization Events

It is worth noting the process for identifying hospitalizations that occur in health systems other than the systems participating in the trial: if at one of the follow-up calls a patient or proxy indicates that a hospitalization occurred, the study staff will request a discharge summary from sites that are not participating in the trial.

3. Case Example: ADAPTABLE

The Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE) study is a large, pragmatic trial designed to determine the optimal maintenance aspirin dose for patients with coronary artery disease. According to the protocol, the primary endpoint of this study is the composite rate of all-cause mortality, hospitalization for nonfatal MI, or hospitalization for nonfatal stroke (National Patient-Centered Clinical Research Network (PCORnet) 2015). The sites participating in ADAPTABLE are part of the Patient Centered Outcomes Research Network (PCORnet), and have agreed to format their data according to a common data model (CDM). Algorithms are being validated for extracting endpoint data from the EHR through the Common Data Model, from Medicare claims data, and from select private health plan data (for more on longitudinal data linkage in ADAPTABLE, see this Case Study.) ADAPTABLE expects to enroll around 15,000 participants; they are referred by their physician, and enrolled through an online portal. As a part of this process, they are asked to report any hospitalizations through the portal. The reconciliation of these patient-reported hospitalizations is shown in the figure below.

From the ADAPTABLE Protocol. Used with permission.

The ADAPTABLE investigators researched patient-reported health data and meta-data standards for patient-reported outcome (PRO) data. They found no useful standardized concepts for hospitalization for nonfatal myocardial infarction (MI) or stroke and state “We do not believe there is much to be gained for the purposes of this study by combining existing atomic concepts to represent these more complex items” (Yang and Tenenbaum 2018).
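The composite primary endpoint described in the protocol (all-cause mortality, hospitalization for nonfatal MI, or hospitalization for nonfatal stroke) is typically analyzed as the first of its components to occur. The sketch below illustrates that time-to-first-event idea only; the function name, inputs, and example dates are hypothetical and do not represent the ADAPTABLE algorithms, which were still being validated.

```python
from datetime import date


def first_composite_event(death_date=None, mi_hospitalizations=(), stroke_hospitalizations=()):
    """Return (date, component) for the earliest component event, or None."""
    candidates = []
    if death_date is not None:
        candidates.append((death_date, "death"))
    candidates.extend((d, "nonfatal MI hospitalization") for d in mi_hospitalizations)
    candidates.extend((d, "nonfatal stroke hospitalization") for d in stroke_hospitalizations)
    return min(candidates) if candidates else None


# Hypothetical participant with an MI hospitalization followed by a stroke.
event = first_composite_event(
    mi_hospitalizations=[date(2018, 6, 3)],
    stroke_hospitalizations=[date(2018, 9, 15)],
)
print(event)
```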

Data Source Accuracy

The NIH Collaboratory has defined three data quality dimensions for determining research data fitness for use: completeness, accuracy, and consistency (see Assessing Data Quality for Healthcare Systems Data Used in Clinical Research, Version 1.0; Zozus et al. 2016). Although each data quality dimension has direct relevance for inpatient event data quality, the importance of data accuracy often is neglected in discussions regarding the choice of inpatient event data collection methods.

Although numerous methods have been proposed for collecting inpatient event data, there is little comparative research regarding the relative accuracy of these methods and the resulting implications for pragmatic clinical trial design. Comparative accuracy research is needed to allow clinical trial planners to understand the limitations associated with different data collection methods, properly estimate inpatient event rates, and inform their sample size estimates.

As an example, the Treatment With Adenosine Diphosphate Receptor Inhibitors: Longitudinal Assessment of Treatment Patterns and Events After Acute Coronary Syndrome (TRANSLATE-ACS; Identifier: NCT01088503) study was designed to evaluate the use of prasugrel and other ADP receptor inhibitor therapies among myocardial infarction (MI) participants treated with percutaneous coronary intervention (PCI) (Chin et al. 2011).  It is one of the few multi-center studies to compare inpatient event data collection methods (Krishnamoorthy et al. 2016; Guimarães et al. 2017). These investigators compared (a) patient-reported versus physician-adjudicated myocardial infarction (MI) events and (b) hospital bill-derived versus physician-adjudicated MI events. We use this as a case study because it provides insights into the limitations inherent in using patient-reported and medical claims data as the sole sources for determining inpatient endpoints.


TRANSLATE-ACS was an observational study that enrolled 12,365 acute myocardial infarction patients. After discharge, a centralized call center conducted telephone interviews with patients at 6 weeks, and at 6, 12, and 15 months follow-up. During these interviews patients were asked for information regarding re-hospitalizations. In a subset of interviews, patients were asked for their re-hospitalization reason. A source document collection process to obtain objective evidence, such as a claim or bill for a hospitalization, was triggered by patient-reported re-hospitalizations and by automatic queries of their enrolling hospital at 12 months follow-up. In a second stage, the investigators requested patient medical records when a patient’s hospital bill indicated the patient may have experienced a major adverse cardiovascular event (MACE). An independent physician committee (also called a CEC above) then adjudicated patient medical records to validate MACE events. In this study’s context, hospital bill data collection assumes prior patient interviews, and physician adjudication data collection assumes prior triggering of events, collection of source documents for those events, and expert adjudication of the events based on the source documents. Subsequent analyses compared results from these different inpatient event data collection methods through 12 months follow-up. At one year after MI, they found that event rates for MI, stroke, and bleeding were lower when medical claims were used to identify events than when adjudicated by physicians (Guimarães et al. 2017).

Validating an inpatient event data collection method is conceptually similar to validating a computable phenotype. The data collection method must be able to identify the endpoint event it purports to identify and meet a desired accuracy level when compared with the best methods for assessing the endpoint event (i.e., the gold standard, which is physician-adjudicated medical records) (Richesson and Smerek 2014). Without such validation, the researcher can’t confidently state that the data support the conclusions, whereas with such validation, the researcher can provide evidence and will have quantified the uncertainty. There is a trade-off between a definition that can be applied consistently [across multiple reviewers (in the case of CEC) or EHR data from multiple facilities in the case of computational definitions over EHR data] versus clinical accuracy (agreement with the truth). The existence of the trade-off emphasizes that measurement of the inaccuracy is needed.

We will define “data collection ascertainment accuracy” using three metrics: sensitivity, specificity, and positive predictive value (PPV). We also will define Type I and II errors associated with data collection accuracy. While these data collection errors may influence traditional hypothesis testing Type I and II error rates, they are distinct and represent another factor to be considered in pragmatic trial sample size estimation. Essentially, traditional sample size estimation methods make implicit assumptions regarding data accuracy. Here, we are making those assumptions more explicit.


For TRANSLATE-ACS data collection accuracy ascertainment:

  • True positive: both physician adjudication and the patient report or hospital bill indicate that an MI event occurred.
  • False negative: physician adjudication indicates that an MI event occurred, but according to patient report or hospital bill, it did not occur.
  • False positive: patient report or hospital bill indicates that an MI event occurred, but according to physician review of the medical record, it did not actually occur.
  • True negative: both physician adjudication and patient report or hospital bill indicate that an event did not occur.


Gold Standard Condition: physician adjudication indicates an event occurred. Comparator Condition: patient report or hospital bill indicates an event occurred.

| Comparator Condition | Gold Standard: Yes | Gold Standard: No |
| --- | --- | --- |
| Yes | True positive (TP) | False positive (FP) |
| No | False negative (FN) | True negative (TN) |


Sensitivity is the true positive (TP) rate, in this case, the proportion of actual inpatient myocardial infarction (MI) events that are identified by a given inpatient event data collection method. Sensitivity is measured using the number of true positives (TP) and false negatives (FN) with this formula: TP/(TP + FN).

Sensitivity will help gauge the possibility of a Type II Error (failing to reject the null hypothesis when it is false) (Sharma et al. 2009). The Type II error rate with respect to measurement is defined as 1 – sensitivity by Sharma et al. (2009). As shown in the table below, the sensitivity of patient-reported MI events in TRANSLATE-ACS is quite low, meaning that patients tended to understate their MI events, while the sensitivity of hospital bills is higher and improves with the number of ICD diagnosis codes that are used to identify the MI event.


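Myocardial Infarction Event Data Sources

| Standard | Comparator | TP | FP | FN | TN | Sensitivity | Specificity | PPV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Physician adjudicated | Patient report | 103 | 257 | 254 | not reported | 0.289 | not reported | 0.286 |
| Physician adjudicated | Hospital bill, 1st dx code | 482 | 66 | 264 | 1145 | 0.646 | 0.945 | 0.880 |
| Physician adjudicated | Hospital bill, 2nd dx code | 588 | 90 | 158 | 1121 | 0.788 | 0.926 | 0.867 |
| Physician adjudicated | Hospital bill, all dx codes | 625 | 103 | 121 | 1108 | 0.838 | 0.915 | 0.859 |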


Specificity is the true negative (TN) rate: in this case, the proportion of events that are not MIs that a given inpatient event data collection method correctly identifies as not being MIs. Specificity is measured using the number of true negatives (TN) and false positives (FP) with this formula: TN/(TN + FP).

Specificity will help us gauge the possibility of a Type I Error (rejecting the null hypothesis when it is true) (Sharma et al. 2009). The Type I error rate is 1 – specificity as defined by Sharma et al. (2009). In the table above, there is no specificity for patient-reported MI events because the TRANSLATE-ACS authors did not report the associated true negatives. In contrast with sensitivity, the specificity of hospital bill inpatient event data collection is high but decreases as the number of ICD diagnosis codes used to identify the MI event increases.

Positive Predictive Value (PPV) is defined as the proportion of actual inpatient events among those identified by a given inpatient event data collection method (TP/(TP + FP)) and will help gauge the precision associated with an inpatient event data collection method. As with sensitivity, the PPV was low for patient report and higher for hospital bills; unlike sensitivity, however, the hospital bill PPV decreases slightly (from 0.880 to 0.859) as the number of ICD diagnosis codes used to identify the MI event increases.
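The three formulas above can be checked directly against the TRANSLATE-ACS counts. The short sketch below recomputes sensitivity, specificity, and PPV from the TP, FP, FN, and TN values in the table; the function name and the dictionary layout are choices made for this illustration, and specificity is left undefined where true negatives were not reported.

```python
def accuracy_metrics(tp, fp, fn, tn=None):
    """Compute sensitivity, specificity, and PPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        # Specificity needs true negatives, which may be unreported.
        "specificity": tn / (tn + fp) if tn is not None else None,
        "ppv": tp / (tp + fp),
    }


# Counts (TP, FP, FN, TN) taken from the TRANSLATE-ACS table above.
counts = {
    "Patient report": (103, 257, 254, None),
    "Hospital bill, 1st dx code": (482, 66, 264, 1145),
    "Hospital bill, 2nd dx code": (588, 90, 158, 1121),
    "Hospital bill, all dx codes": (625, 103, 121, 1108),
}

results = {name: accuracy_metrics(*row) for name, row in counts.items()}

for name, metrics in results.items():
    rounded = {k: (round(v, 3) if v is not None else None) for k, v in metrics.items()}
    print(name, rounded)
```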

We can extend these metrics to compute the measurement concepts of Type I (1 – specificity) and Type II (1 – sensitivity) Error rates (Sharma et al. 2009), as shown in the table below. For hospital bills using all diagnosis codes, the Type I Error Rate is 0.085 (1 – 0.915), and the Type II Error Rate is 0.162 (1 – 0.838).


Myocardial Infarction Event Data Sources: Errors

| Standard | Comparator | Type I (false positive) | Type II (false negative) |
| --- | --- | --- | --- |
| Physician adjudicated | Patient report | 0.714* | 0.711 |
| Physician adjudicated | Hospital bill, 1st diagnosis code | 0.055 | 0.354 |
| Physician adjudicated | Hospital bill, 2nd diagnosis code | 0.074 | 0.212 |
| Physician adjudicated | Hospital bill, all diagnosis codes | 0.085 | 0.162 |

* This value equals 1 – PPV (1 – 0.286) rather than 1 – specificity, because true negatives were not reported for patient report.
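Under the definitions given above (Type I = 1 – specificity, Type II = 1 – sensitivity), the measurement error rates can be computed from the same 2x2 counts, since 1 – specificity equals FP/(FP + TN) and 1 – sensitivity equals FN/(FN + TP). The following sketch does this for the hospital bill rows; the function name is hypothetical.

```python
def measurement_error_rates(tp, fp, fn, tn):
    """Return measurement Type I and Type II error rates from 2x2 counts."""
    return {
        "type_i": fp / (fp + tn),   # false positive rate = 1 - specificity
        "type_ii": fn / (fn + tp),  # false negative rate = 1 - sensitivity
    }


# Hospital bill counts (TP, FP, FN, TN) from the TRANSLATE-ACS table.
hospital_bill_counts = {
    "1st diagnosis code": (482, 66, 264, 1145),
    "2nd diagnosis code": (588, 90, 158, 1121),
    "All diagnosis codes": (625, 103, 121, 1108),
}

error_rates = {
    name: measurement_error_rates(*row) for name, row in hospital_bill_counts.items()
}

for name, rates in error_rates.items():
    print(name, {k: round(v, 3) for k, v in rates.items()})
```

Adding diagnosis code positions lowers the false negative rate while raising the false positive rate, which is the trade-off an investigator has to weigh when choosing a claims-based definition.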


Our Type I and Type II Error metrics tell us that the combined patient report and hospital bill data collection methods will miss a number of true MIs (FN) and include events that are not true MIs (FP). One reason for false positives may be the stringent definitions used in adjudication of endpoints within traditional trials. Nonetheless, this information will help us understand how the accuracy of data collection methods may impact sample size estimation and subsequent hypothesis testing.

Metrology (the science of measurement) evaluates the reliability of measurements in terms of their objectivity and intersubjectivity (Mari et al. 2012). Objectivity guarantees that measurement results are independent of their context (e.g., properties of the object being measured, the measurement system, and the person who is measuring). To gauge objectivity, we can compare the TRANSLATE-ACS hospital billing data collection results with those from the Women’s Health Initiative (WHI) (Hlatky et al. 2014). In the WHI study, physician adjudicators used standardized forms and definitions to review patient medical records and identify MI events. Inpatient hospital bill MI events then were identified by linking study patients with their CMS Medicare Part A claims. The resulting analysis only compared physician-adjudicated MIs with CMS claims-identified MIs in the first or second diagnosis codes. The resulting sensitivity (0.790) and specificity (0.988) are close to their corresponding TRANSLATE-ACS values. However, the PPV (0.708) is lower. This difference illustrates how data from different facilities, units, and providers (and patients/sub-groups of patients) may yield different and biased answers. A researcher’s options are to (1) measure the impact, or (2) measure the range and location of the errors and simulate the impact to show whether or not the error is inconsequential (Richesson et al. 2013).
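One general reason PPV can differ between studies even when sensitivity and specificity are similar is that PPV also depends on how common the event is in the population being assessed. The sketch below shows this relationship using Bayes' rule; it is offered as a possible contributing factor, not as an explanation established by either study, and the 5% prevalence is a hypothetical value chosen for illustration.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# TRANSLATE-ACS hospital bill row (2nd dx code): TP=588, FP=90, FN=158, TN=1121.
sensitivity = 588 / (588 + 158)
specificity = 1121 / (1121 + 90)
sample_prevalence = (588 + 158) / (588 + 90 + 158 + 1121)

# Reproduces the tabulated PPV of 0.867 at the sample's own event prevalence.
print(round(ppv_from_prevalence(sensitivity, specificity, sample_prevalence), 3))

# Hypothetical lower prevalence: same sensitivity and specificity, lower PPV.
print(round(ppv_from_prevalence(sensitivity, specificity, 0.05), 3))
```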

Metrology’s intersubjectivity evaluation standard requires that measurement results convey both information regarding measurement values and the degree of trust that should be attributed to those values (Mari et al. 2012). This information can be presented as a target measurement uncertainty that defines the minimal quality needed to support a specific decision. The implication is that if the actual uncertainty is greater than the target, the measurement value is not considered valid because it cannot support the intended use. This is similar to the use of Type I and II Error rates in sample size estimation and hypothesis testing. In the example above, measurement values are the MI event rates obtained by different data collection methods and the degree of trust attributed to those values are their associated Type I and Type II Error rates. The unanswered question is whether these rates meet minimal data quality accuracy standards. While it could be argued that data quality accuracy errors may have minimal effect upon clinical trial outcomes, the burden of proof lies with the investigator. Calls for reporting data quality assessment results with published research results have been made (Kahn et al. 2015). Those who use these data and rely upon their associated study results in decision-making should be aware of potential data quality accuracy limitations that may influence their use and interpretation of study results. For this reason, it is essential that minimum inpatient event data quality accuracy standards are developed.
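One way to report both a measurement value and the trust it deserves is to attach a confidence interval to an accuracy estimate and compare it with a pre-specified target. The sketch below uses a Wilson score interval for the sensitivity of hospital bills with all diagnosis codes (625 true positives among 746 adjudicated MIs). The choice of the Wilson interval, the function names, and the target of 0.80 are assumptions made for illustration; no minimum standard is defined in this chapter.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def meets_target(successes, n, target_lower_bound):
    """True when the interval's lower bound is at or above the target."""
    lower, _ = wilson_interval(successes, n)
    return lower >= target_lower_bound


# Sensitivity of hospital bills (all dx codes): TP = 625, TP + FN = 746.
lower, upper = wilson_interval(625, 746)
print(f"sensitivity 95% CI: {lower:.3f} to {upper:.3f}")

# Hypothetical target: lower bound of sensitivity must be at least 0.80.
print(meets_target(625, 746, target_lower_bound=0.80))
```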

One way to compensate for measurement error is to increase the sample size, but this can increase financial costs and risks; another way to compensate is to 1) improve the reliability of raters through training or 2) use multiple methods for ascertaining information (Perkins et al. 2000). Meaningful reductions in sample size have been gained from using the mean of multiple sources and other improvements in reliability (Perkins et al. 2000). Nonetheless, investigators will need to determine the data collection methods most appropriate for their study before designing their study and making their sample size estimates.
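To illustrate why measurement error can inflate the required sample size, the sketch below applies non-differential misclassification to two hypothetical true event rates and then recomputes a standard two-proportion sample size. The true rates (10% versus 8%), the use of the hospital bill sensitivity and specificity as population-level values, and the normal-approximation formula are all assumptions made for this example, not a recommended design method.

```python
from math import ceil
from statistics import NormalDist


def observed_rate(true_rate, sensitivity, specificity):
    """Event rate that an imperfect ascertainment method would report."""
    return sensitivity * true_rate + (1 - specificity) * (1 - true_rate)


def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size for comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical true event rates in the control and intervention arms.
true_control, true_treated = 0.10, 0.08

# Sample size if events were ascertained without error.
n_perfect = n_per_arm(true_control, true_treated)

# Sample size when the same error rates apply in both arms
# (sensitivity 0.838, specificity 0.915, taken from the table above).
obs_control = observed_rate(true_control, 0.838, 0.915)
obs_treated = observed_rate(true_treated, 0.838, 0.915)
n_noisy = n_per_arm(obs_control, obs_treated)

print(n_perfect, n_noisy)
```

In this setup the misclassification shrinks the observed difference between arms and adds spurious events to both, so the required sample size grows.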

The example above uses CEC adjudicated endpoints as the gold standard. However, data collection methods will vary in their accuracy versus this gold standard and in their applicability to clinical practice. Previous research has demonstrated that the accuracy of claims data is high for cardiovascular procedures (e.g., CABG surgery and PCI) and much lower for bleeding events. Perhaps, this is because bleeding event definitions used in explanatory trials (Mehran et al. 2011) are not relevant for clinical practice and could be replaced by the number of transfusions. Similarly, the explanatory trial CEC myocardial infarction definition may differ from that commonly used in clinical practice.

All data collection methods are associated with type I and II errors. These error rates will vary by endpoint and data collection method. These errors typically are not accounted for in sample size estimates. Research is needed to determine these error rates and how they may influence sample size estimates. It is also the case that certain endpoints used in explanatory trials may have little relevance in actual practice and may not be recorded in EHRs. Because of these issues, the pragmatic trial community needs to collectively determine which endpoints are relevant for pragmatic trials, how they can be measured and validated, and how the accuracy of these measurement methods may impact hypothesis testing sample size estimates.








Allen LA, Hernandez AF, O’Connor CM, Felker GM. 2009. End Points for Clinical Trials in Acute Heart Failure Syndromes. Journal of the American College of Cardiology. 53:2248–2258. doi:10.1016/j.jacc.2008.12.079.

Barr ELM, Tonkin AM, Welborn TA, Shaw JE. 2009. Validity of self-reported cardiovascular disease events in comparison to medical record adjudication and a statewide hospital morbidity database: the AusDiab study. Intern Med J. 39:49–53. doi:10.1111/j.1445-5994.2008.01864.x.

Beckman AL, Gupta S. 2018. Empowering people with their healthcare data: An Interview with Harlan Krumholz. Healthcare. 6:238–239. doi:10.1016/j.hjdsi.2018.08.002.

Burke LB. 2012. An FDA Perspective on Clinical Trial Endpoint Measurements.

Califf RM, Sugarman J. 2015. Exploring the ethical and regulatory issues in pragmatic clinical trials. Clinical Trials. 12:436–441. doi:10.1177/1740774515598334.

Campion TR, Sholle ET, Davila MA. 2017. Generalizable Middleware to Support Use of REDCap Dynamic Data Pull for Integrating Clinical and Research Data. AMIA Jt Summits Transl Sci Proc. 2017:76–81.

Carrier G, Cotte E, Beyer-Berjot L, Faucheron JL, Joris J, Slim K. 2016. Post-discharge follow-up using text messaging within an enhanced recovery program after colorectal surgery. Journal of Visceral Surgery. 153:249–252. doi:10.1016/j.jviscsurg.2016.05.016.

Centers for Medicare and Medicaid Services. 2015. Fact sheet: Two-Midnight Rule.

Centers for Medicare and Medicaid Services. 2016. Hospital-Wide All-Cause Unplanned Readmission Measure (NQF #1789).

Centers for Medicare and Medicaid Services. 2018. Are you an Inpatient or an Outpatient.

Chin CT, Wang TY, Anstrom KJ, et al. 2011. Treatment with adenosine diphosphate receptor inhibitors-longitudinal assessment of treatment patterns and events after acute coronary syndrome (TRANSLATE-ACS) study design: expanding the paradigm of longitudinal observational research. Am Heart J. 162:844–851. doi:10.1016/j.ahj.2011.08.021.

Dhruva SS. 2018. Using a Novel mHealth Platform to Obtain Real-World Data for Post-Market Surveillance: A NEST Demonstration Project.

Dhruva SS, Mena-Hurtado C, Curtis J, et al. 2019. Learning How to Successfully Enroll and Engage People in a Mobile Sync-for-Science Platform to Inform Shared Decision Making. Journal of the American College of Cardiology. 73:3039. doi:10.1016/S0735-1097(19)33645-9.

Fleurence RL, Shuren J. 2019 Mar 19. Advances in the Use of Real‐World Evidence for Medical Devices: An Update From the National Evaluation System for Health Technology. Clinical Pharmacology & Therapeutics. doi:10.1002/cpt.1380.

Gabriel D, Meeker D, Bell D, Matheny M. 2014. Data Harmonization and Synergies: OMOP, PCORnet CDM and the CTSA cohort identification models.

Garza M, Del Fiol G, Tenenbaum J, Walden A, Zozus MN. 2016. Evaluating common data models for use with a longitudinal community registry. J Biomed Inform. 64:333–341. doi:10.1016/j.jbi.2016.10.016.

Garza M, Myneni S, Nordo A, et al. 2019. ESource for Standardized Health Information Exchange in Clinical Research: A Systematic Review. Stud Health Technol Inform. 257:115–124.

Guimarães PO, Krishnamoorthy A, Kaltenbach LA, et al. 2017. Accuracy of Medical Claims for Identifying Cardiovascular and Bleeding Events After Myocardial Infarction: A Secondary Analysis of the TRANSLATE-ACS Study. JAMA Cardiol. 2:750–757. doi:10.1001/jamacardio.2017.1460.

Hicks KA, Mahaffey KW, Mehran R, et al. 2018. 2017 Cardiovascular and Stroke Endpoint Definitions for Clinical Trials. Circulation. 137:961–972. doi:10.1161/CIRCULATIONAHA.117.033502.

Hlatky MA, Ray RM, Burwen DR, et al. 2014. Use of Medicare Data to Identify Coronary Heart Disease Outcomes in the Women’s Health Initiative. Circulation: Cardiovascular Quality and Outcomes. 7:157–162. doi:10.1161/CIRCOUTCOMES.113.000373.

Kahn MG, Brown JS, Chun AT, et al. 2015. Transparent reporting of data quality in distributed data networks. EGEMS (Wash DC). 3:1052. doi:10.13063/2327-9214.1052.

Klungel OH, de Boer A, Paes AH, Seidell JC, Bakker A. 1999. Cardiovascular diseases and risk factors in a population-based study in The Netherlands: agreement between questionnaire information and medical records. Neth J Med. 55:177–183.

Krishnamoorthy A, Peterson ED, Knight JD, et al. 2016. How Reliable are Patient-Reported Rehospitalizations? Implications for the Design of Future Practical Clinical Studies. J Am Heart Assoc. 5. doi:10.1161/JAHA.115.002695.

Larson EB, Tachibana C, Thompson E, et al. 2015 Jul. Trials without tribulations: Minimizing the burden of pragmatic research on healthcare systems. Healthcare. doi:10.1016/j.hjdsi.2015.07.005.

Mari L, Carbone P, Petri D. 2012. Measurement Fundamentals: A Pragmatic View. IEEE Trans Instrum Meas. 61:2107–2115. doi:10.1109/TIM.2012.2193693.

Marsolo K. 2019. Approaches to Patient Follow-Up for Clinical Trials: What’s the Right Choice for Your Study?

Medicare Interactive. Medicare in 2019.

Mehran R, Rao SV, Bhatt DL, et al. 2011. Standardized bleeding definitions for cardiovascular clinical trials: a consensus report from the Bleeding Academic Research Consortium. Circulation. 123:2736–2747. doi:10.1161/CIRCULATIONAHA.110.009449.

National Patient-Centered Clinical Research Network (PCORnet). 2015. Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE) Study Protocol.

Perkins DO, Wyatt RJ, Bartko JJ. 2000. Penny-wise and pound-foolish: the impact of measurement error on sample size requirements in clinical trials. Biol Psychiatry. 47:762–766.

Richesson RL, Rusincovitch SA, Wixted D, et al. 2013 Sep 11. A comparison of phenotype definitions for diabetes mellitus. J Am Med Inform Assoc. doi:10.1136/amiajnl-2013-001952.

Richesson RL, Smerek M. Electronic Health Records-based Phenotyping, in Rethinking Clinical Trials A Living Textbook in Pragmatic Clinical Trials. NIH Health Care Systems Research Collaboratory. Published June 27, 2014.

Sharma D, Yadav UB, Sharma P. 2009. The concept of sensitivity and specificity in relation to two types of errors and its application in medical research. J Reliability Stat Stud. 2:53–58.

Turvey C, Klein D, Fix G, et al. 2014. Blue Button use by patients to access and share health record information using the Department of Veterans Affairs’ online patient portal. J Am Med Inform Assoc. 21:657–663. doi:10.1136/amiajnl-2014-002723.

U.S. Department of Health and Human Services. 2018. Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics Guidance for Industry.

Vardeny O, Udell JA, Joseph J, et al. 2018. High-dose influenza vaccine to reduce clinical outcomes in high-risk cardiovascular patients: Rationale and design of the INVESTED trial. American Heart Journal. 202:97–103. doi:10.1016/j.ahj.2018.05.007.

Vazquez MA, Oliver G. 2019. ICD-Pieces: Lessons Learned in an Ongoing Trial.

Welsing PM, Oude Rengerink K, Collier S, et al. 2017. Series: Pragmatic trials and real world evidence: Paper 6. Outcome measures in the real world. J Clin Epidemiol. 90:99–107. doi:10.1016/j.jclinepi.2016.12.022.

Yang Z, Tenenbaum J. 2018. ADAPTABLE Supplement Report: Patient-Reported Health Data and Metadata Standards in the ADAPTABLE Study.

Zozus MN, Hammond WE, Green BB, et al. 2016. Assessing Data Quality for Healthcare Systems Data Used in Clinical Research.

Version History

Published June 29, 2019.


Eisenstein E, Anstrom K, Zozus M, et al. Choosing and Specifying Endpoints and Outcomes: Inpatient Endpoints in Pragmatic Clinical Trials. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda, MD: NIH Health Care Systems Research Collaboratory. Available at: Updated September 10, 2019. DOI: 10.28929/117.