Using Death as an Endpoint

Choosing and Specifying Endpoints and Outcomes

Section 4


Eric L. Eisenstein, DBA

Lesley H. Curtis, PhD

Kristi Prather, MPH

Mike Hogarth, MD

Kevin J. Anstrom, PhD

Davera Gabriel, RN

Daniel Mark, MD

Robert Mentz, MD

Stephen Greene, MD


Lillian Ingster, PhD, Director, NDI

Ursula Rogers, DHTS ETL Expert

Tina Harding

Amanda Harrington, BSW

Contributing Editor

Karen Staman, MS

Prevention of death through therapeutic intervention is often a major focus of clinical research. At the individual patient level, death is the hardest of the "hard" or objectively measurable endpoints. In traditional explanatory trials, patient deaths are identified by site personnel, and the study's clinical events committee adjudicates types of death. However, death identification and adjudication may be more complicated in pragmatic clinical trials (PCTs) that rely on data collected from the patient's electronic health record (EHR), medical claims, self-report, or medical devices. With these data sources, ascertaining whether and how a patient has died is harder, especially if the patient dies outside the clinical care system (Eisenstein et al. 2019).

Since the typical United States healthcare system does not have standardized processes to ascertain patient deaths, explanatory clinical trials frequently dedicate significant resources to collecting death data; a study coordinator may contact a family member or other proxy to schedule a study visit, or may search the internet to determine the patient's current location (Eisenstein et al. 2019). Compounding the issue, there is no timely and comprehensive national death database that efficiently links with EHR records (Eisenstein et al. 2019). This is due to differences between the types of information EHRs collect, what death databases require for linking, and state restrictions on the reuse of death data and other vital records (da Graca et al. 2013). However, for PCTs that use death as an endpoint, there are alternate procedures for obtaining death data. In this section, we describe alternative death data sources and methods for obtaining information from them. We then illustrate the use of these procedures in a hybrid death identification and verification approach that is being used in the ToRsemide compArisoN with furoSemide FOR Management of Heart Failure (TRANSFORM-HF) PCT (ClinicalTrials.gov Identifier: NCT03296813).

Sources of Death Data

Obtaining information to determine whether someone is alive or dead would seem to be a simple task. However, the reported mortality rate for a group of individuals can vary widely depending upon the death information source used and the method for ascertainment (Warren et al. 2017). As a first step in death event identification planning, researchers should determine what data they require from the death event. This will determine the most appropriate death data source for their study, as sources of death data vary in both the amount and type of data they provide. For example, it may be sufficient to simply know that the study participant has died, otherwise known as “fact of death” (FOD) information about that decedent. This often involves a non-comprehensive death file with enough data elements to link to study records and determine the date of death. However, some studies will want the date of death, cause of death, and other related conditions, and perhaps occupation and educational level.

Next, researchers should consider whether a single death information source is sufficient for their needs or whether a hybrid approach that combines multiple data sources might yield better results. Factors to consider include whether one source makes the most sense for a specific study and, if multiple sources are combined, how death discrepancies between data sources will be addressed. While researchers can reason about the appropriateness of specific death information sources for a particular study, there is scant empirical data to use in making those determinations. Three national databases that should be considered are the Death Master File from the Social Security Administration (SSA), the Medicare Master Beneficiary Summary File, and the National Death Index (NDI) from the National Center for Health Statistics (NCHS). Other sources of death information are individual state vital statistics and the use of a central call center that consolidates site follow-up activities typically conducted in explanatory clinical trials. An additional recent source to consider is not a database but rather an FOD web service provided through the nonprofit National Association for Public Health Statistics and Information Systems (NAPHSIS).

The Federal-State Relationship in Collecting Vital Records

A brief overview of the process that produces vital event data and statistics is a useful introduction to explaining the nuances of who “owns” the data and why it is so challenging to have a single, timely, and complete national file available for adjudication of vital status.

The United States has always had a highly decentralized federal vital statistics system. Collection of vital statistics in the United States is a state function rather than a federal function. The reason is due both to how vital record collection evolved and also to the legal situation: because the collection of these data is not explicitly outlined as a federal responsibility in the Constitution, federal authority in directly conducting this process is limited. On a practical level, the local nature of vital event registration is most likely due to its origins as a local government function and because it would have been impossible to undertake the process efficiently at a federal level until the advent of modern technologies. The civil registration of births, deaths, and marriages is one of the oldest systematic collections of data in the United States. Births, deaths, and marriages were a civil registration function in the Commonwealth of Virginia as early as 1632. The modern era of systematic registration of vital events began in 1842 in Massachusetts with passage of the first law requiring statewide registration of vital events. Since 1933, all states and territories have required registration of vital events. Since then, the registration of vital events has broadened to include not only civil registration but also the collection of public health data. More recently, vital records offices have the added responsibility of helping to ensure national security through the effective stewardship of birth certificates (National Research Council 2009).

Today, vital event data reported to the federal government includes data on birth, death, and fetal death events. Although local vital record offices also collect data on marriages and divorce, the federal government stopped routine collection of this information in 1995. The NCHS is charged with collecting and aggregating these data at the federal level. Since vital event registration happens at a local level, the NCHS obtains data from local registrations through the Vital Statistics Cooperative Program (VSCP), which pays each state/territory for its data. Each jurisdiction has a VSCP agreement with NCHS to provide data according to NCHS national standards for quality and timeliness (National Research Council 2009).

There are a number of challenges with national aggregation of vital statistics. One of these is timeliness. Vital statistics compiled centrally at the national level can only be as timely as the slowest-reporting state. Many states have implemented electronic systems, but not fully. As of June 2018, NAPHSIS reported 46 jurisdictions with an electronic death registration system (EDRS), but only 39 had over 75% of the death events registered through an EDRS (NAPHSIS EDRS Map). Four jurisdictions did not have an EDRS of any kind. Another limitation is the need to ensure state laws governing the release of vital records are honored by federal agencies that receive these data. The redaction from the death master file, explained in more detail below, is the result of this legal constraint. Another challenge is ensuring a high degree of data quality in a process that involves 57 separate jurisdictions, with those using electronic data capture systems rarely using the same system. The public health aspects of data collection are perhaps the most challenging as they often involve a clinical provider. The vast majority of clinical providers use EHRs today. Yet, integration between EHRs and electronic vital record systems is rare. A handful of states have enabled integration of an EHR with their electronic birth registration system (EBRS). As of June 2018, only two states (California and Utah) had demonstrated an integration between their EDRS and an EHR. The California experience showed relatively high implementation costs for health systems (>$50,000 to start), which could be a significant barrier to broader adoption.

Furthermore, deriving the important “cause of death” is not immediate, and in 25% of cases it requires manual review and adjudication by a trained nosologist. There is only one cause of death for the decedent. Think of it as "classifying" the death into only one cause, or assigning it a single code from the International Classification of Diseases (ICD). To accomplish this, the model death certificate in the United States has four blocks of narrative text known as the “underlying causes of death.” The medical certifier uses these to outline the cascade, or sequence, of events that resulted in the death. The NCHS uses a semi-automated process to classify the single "cause of death" in the NDI database based on the information on the death certificate. The national file that includes cause of death is not complete until all the deaths have been classified for the reporting time period, typically a calendar year.

Death Master File

To administer its programs, the Social Security Administration (SSA) collects death information from family members, funeral homes, financial institutions, postal authorities, states, and federal agencies. Prior to 2011, the SSA Death Master File (DMF) was the timeliest, most comprehensive, and least expensive method for obtaining patient death data (da Graca et al. 2013). However, in 2011, the SSA agreed with closed record states that §205(r) of the Social Security Act (SSA 1983) could not supersede state laws limiting disclosure of the state records. This resulted in the removal of 4 million records (5%) and in the annual exclusion of 1 million new records (40% of new deaths) from the DMF (National Technical Information Service 2011; da Graca et al. 2013). While the public DMF does not contain death data received from states, it still includes information obtained from other sources and remains a valuable death data resource for researchers. Potential users can apply to the certification program and pay an annual subscription fee for access to the files. Alternatively, DMF data are commercially available, in whole or in part, through third-party services. When using the DMF, researchers should be aware that these death records are incomplete. A recent analysis compared the DMF to Medicare and commercial insurance databases and demonstrated that the DMF markedly underestimated mortality rates (Navar et al. 2019). This under-capture of death data varied significantly over time and between states, leading the authors to conclude that “Researchers should avoid relying on mortality estimates based on the SSDMF alone and be aware of heterogeneity in SSDMF data completeness” (Navar et al. 2019). The implication is that while DMF deaths are actual deaths, the absence of a DMF death does not mean that a patient is alive.

Medicare Master Beneficiary Summary File

If a substantial number of study patients are Medicare beneficiaries, the Medicare Master Beneficiary Summary File may be an option for obtaining death data. This file includes death information received from Medicare claims, family member online date of death edits, and Medicare benefits information collected from the Railroad Retirement Board and the SSA. This file is available from the Research Data Assistance Center (ResDAC) with a 9-month lag from the close of the calendar year (da Graca et al. 2013). The standard linking approach relies on direct identifiers: Social Security/Health Insurance Claims number (Medicare ID number), date of birth, and sex. In the past, ResDAC also has allowed the use of deterministic linkage approaches based on dates of service (Hammill et al. 2009). Researchers should be aware that Medicare Master Beneficiary Summary File death records only include people with a Medicare beneficiary number. This means that the absence of a death in this file does not mean that a non-Medicare patient is alive.

State Vital Statistics

Only states have the authority to collect birth and death data, and they typically collect this information through vital records offices within departments of public health. The process of registering the death event has both legal and epidemiological components and involves at least six separate sources of information, making it a complex process often not well understood outside the confines of the small public health vital statistics community. Coroner cases are more complex and involve additional sources of information including a coroner (law enforcement) and a forensic pathologist. Some states, like California, have made the vital statistics process entirely digital and accept applications for public use birth and death files. California has two types of death files available. The Comprehensive Death File requires review and approval by the California Vital Statistics Advisory Committee (VSAC) and includes a substantial amount of data, including cause of death, Social Security number, etc. The California non-comprehensive death file is an FOD file and includes deaths from 2005 to present. The death data has a short lag time (85% of deaths are less than 30 days old when included in the file, 96% of deaths are less than 60 days old). The cost is minimal ($120 per year). California does not include a Social Security number in the non-comprehensive death file, as the statute regarding public release of death information prohibits it. Depending on the number and location of sites in a clinical trial, the use of state vital statistics, alone or in combination with another death data source, may be a viable option for obtaining death endpoint data. However, information from more than one state may be necessary to detect the death of a patient who resided in one state and died in another. Researchers should be aware that state vital statistics death records are incomplete and only contain information for deaths occurring in that state. This means that the absence of a death in a state death file does not mean that a patient is alive.

NAPHSIS EVVE Fact of Death Service

The NAPHSIS FOD service is not a database but instead a secure online web service derived from a service originally developed to verify birth certificates. Birth certificates are often required to be presented for benefit eligibility, and origination of key identification documents (driver’s licenses, Social Security cards, passports, etc.). NAPHSIS electronic verification of vital events (EVVE) allows a state agency office in one state to verify a birth certificate being presented by the applicant from another state. NAPHSIS extended this service to provide fact of death querying based on death certificate data from participating states. A key aspect of the NAPHSIS EVVE system is that it is a distributed querying system that directly accesses vital records databases of participating states on demand, so it is using the most current information for each state. The SSA DMF, Medicare Master Beneficiary Summary File, and the Centers for Disease Control and Prevention's NDI have varying degrees of latency, or “lateness,” because they must aggregate data from all the states into a single centralized system, which can make the databases anywhere from 6 to 36 months behind when the death events occurred. State files like the California non-comprehensive file are timelier but still have a lag time of around 15 to 30 days. Although the NAPHSIS EVVE FOD does not have the 6- to 36-month aggregation time lag, not all states participate. As of July 2018, 40 states and the Commonwealth of Puerto Rico were part of the EVVE FOD. Pricing for nongovernmental use varies and is based on volume. For example, submitting 5000 to 10,000 records to the service costs $3000, or $1.60 per record. A query with 1 million records costs $10,000, or $0.01 per record. It is important to note that although this is the timeliest source of death data, it is only fact of death information, and some states (including California) do not participate.

National Death Index

The Centers for Disease Control and Prevention’s NCHS contracts with state vital statistics offices to receive and compile annual death registries in the NDI, a centralized database of all US deaths. The mission of NCHS is to help guide public health and health policy decisions and to “aid health and medical investigators with their mortality ascertainment activities” (NCHS). This service does not allow access to the public or organizations for legal, administrative, or genealogy purposes. The NCHS only allows use of the NDI for mortality determination in qualifying research studies. This is not the result of statutory restriction, but rather because of how the NDI has evolved. The NDI is not provisioned by law nor is it funded through Congressional appropriation. Instead, the NDI is the result of over 40 years of trust building between the 57 jurisdictions and the NCHS (Dr. Charles Rothwell, personal communication). The agreed-upon process and use has been acceptable to all 57 jurisdictions under the existing research-only use model, which is also an accepted use in every jurisdiction. NCHS is, in essence, an “honest broker,” trusted by the 57 jurisdictions to use their data to support research studies in any jurisdiction. This is key in ensuring the NDI has all the deaths nationally and is a complete data set. Broadening the use of the NDI beyond the research-only scope would invariably bring it into conflict with states that have restrictive laws on the use of these data, thus leading to redactions and an incomplete data set. What has been acceptable to all 57 jurisdictions has been tightly controlled use for research vital status adjudication. The NDI service is self-supporting through fees, which have portions of revenue allocated back to the state/territory jurisdictions that provision the data. Unlike the other death data sources listed above, the NDI is a complete death data set. All jurisdictions report all deaths to the NCHS, which makes the NDI the most complete death data set available in the United States today. The implication is that NDI deaths are actual deaths and the absence of an NDI death means that a patient can be considered alive at the end of the reporting year. This distinction becomes important when determining a study subject’s last known status (ie, dead or alive).
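
The asymmetry between sources described above (a matched death counts as a death in any source, but only the NDI supports treating the absence of a match as evidence of survival) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed algorithm: the `SearchResult` class, its field names, and the source labels are hypothetical, and the sketch assumes that any matched death has already been verified by the investigator.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SearchResult:
    """Outcome of searching one death data source for one patient (hypothetical structure)."""
    source: str                            # illustrative label, e.g. "NDI", "DMF", "STATE"
    death_date: Optional[date] = None      # None when no death record was matched
    final_file_year: Optional[int] = None  # latest NDI final-file year searched, if any


def last_known_status(result: SearchResult) -> str:
    """Return 'dead', 'alive through <year>', or 'unknown'."""
    # A verified match is treated as an actual death regardless of source.
    if result.death_date is not None:
        return "dead"
    # Only the NDI is described as complete, so only an NDI search with no
    # match supports "alive at the end of the reporting year".
    if result.source == "NDI" and result.final_file_year is not None:
        return f"alive through {result.final_file_year}"
    # For the DMF, Medicare, and state files, absence of a record is not
    # evidence that the patient is alive.
    return "unknown"
```

Under these assumptions, a patient with no DMF match is classified as "unknown", whereas a patient with no match in the 2017 NDI final file is classified as "alive through 2017".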

NDI Cause of Death

NDI Plus reports patient cause of death using death certificate information. However, this information is not always reliable (Lauer et al. 1999). Death determination can be inexact due to low patient autopsy rates and complex medical conditions. Physicians typically receive no training in completing death certificates and often confuse the mechanism of death (eg, cardiac arrest) with the underlying cause of death (eg, cancer). Lloyd-Jones and colleagues compared cause of death information from death certificates with that adjudicated by a panel of three physicians (Lloyd-Jones et al. 1998). These researchers found that coronary heart disease as the cause of death was 24% greater on death certificates versus physician panel adjudications. They also found that this error rate increases with patient age.

Using the NDI

To gain access to the NDI, potential users submit an application form along with a current Institutional Review Board (IRB) approval document. There is an approximate 2- to 3-month review period. Once the application is approved, investigators are not granted direct access to the database but rather submit study subject records in a standard text file (flat file) format, using NDI’s coding specifications, on a password-protected CD-ROM. The investigator will receive a password-protected CD-ROM with potential death matches for the investigator to verify.

The fees for using the NDI are as follows:

  • $350 for the first search and $100 for each subsequent search
  • Routine search is $0.15 per unique patient per year being searched
  • NDI Plus search (which adds cause of death) is $0.21 per unique patient per year being searched
  • Known death search is $5 per patient regardless of how many years searched

NCHS provides a worksheet for calculating charges. Depending on the number of patients enrolled in a trial (many PCTs enroll tens of thousands of patients) and the number of years for follow-up, use of the NDI can be expensive.
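
To give a sense of scale, the fee schedule listed above can be turned into a rough estimate. The following Python sketch is not the NCHS worksheet; it assumes that the per-search charge is added to the per-patient-per-year charges and that known-death patients are counted separately from the routine search population. The function name and parameters are hypothetical. Amounts are held in cents to avoid floating-point rounding.

```python
def ndi_search_fee_cents(n_patients: int,
                         n_years: int,
                         plus: bool = False,
                         first_search: bool = True,
                         n_known_deaths: int = 0) -> int:
    """Estimate the charge for one NDI search, in cents (assumed fee structure)."""
    # $350 for the first search, $100 for each subsequent search.
    search_charge = 35_000 if first_search else 10_000
    # $0.15 (routine) or $0.21 (NDI Plus) per unique patient per year searched.
    rate = 21 if plus else 15
    # $5 per known-death patient, regardless of the number of years searched.
    known_death_charge = 500 * n_known_deaths
    return search_charge + n_patients * n_years * rate + known_death_charge


if __name__ == "__main__":
    cents = ndi_search_fee_cents(n_patients=6000, n_years=1)
    print(f"${cents / 100:,.2f}")
```

For a hypothetical first routine search of 6000 patients over a single year, this estimate comes to $1250.00; the same population searched with NDI Plus over three years in a subsequent search would be $3880.00. Actual charges should be taken from the NCHS worksheet.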

Sample worksheet for NDI fees

Source: Screenshot from the NCHS worksheet for calculating National Death Index charges.

Time Lag for NDI Data

Deaths are submitted electronically by states to NCHS throughout the calendar year. Death records are added to the NDI database (in a batch) after the end of each calendar year. An early release file is made available for both NDI Routine and NDI Plus searches when approximately 90% of the (previous) year's death records have been received and processed. This file typically is available for investigator search requests in late January or early February and is considered preliminary. Additional deaths can still be added and demographic variables (such as age) are subject to change. At the time of this file’s internal release, the NCHS also provides a summary table showing the completion rate by state so that studies can decide whether they should use the preliminary search. Note that although the CDC uses the nomenclature “early release” and “final file,” these files are not available for downloading; they are available only for matching against submitted files through the application process described in the "Using the NDI" section.

NDI Early Release File – Vital Status Reporting Completion Status Sample Summary Table.

Note: The second column reflects cause of death data, which are included in an NDI Plus search. (The full table is available from NCHS. Screenshot taken June 17, 2018.)

The final file reflects “all” of the (previous) year’s death records. This file is available in late October or early November. Once the final file for a given year is available, users of the early release file can submit the same records for one free rerun search, provided that all parameters are identical to the original early release search and submitted within 6 months of notification that the final file is available. The final file is static and won’t be modified unless there was a serious error; for example, about 125 records from Tennessee and Massachusetts were modified in the 2014 death final file. The cause of death was changed from “unspecified external” cause of death to either “suicide” or “homicide.” In this example, studies were allowed to resubmit their “true matches” from 2014 to obtain the corrected cause of death at no cost.

Is Final Really Final?

Belated records do exist. For example, a death in 2014 deemed a suicide may be disputed by family members and therefore not released to the NCHS for inclusion in the NDI until 2016.

However, the 2014 Final Death File would not be modified to include that death. Instead, it would show up in the 2016 file (released in 2017) with a date of death in 2014. Bottom line: once a year’s death records have been searched in the final file, there is no need to repeat that year in future searches, except in rare cases of modification, in which case NCHS would notify the investigator.

See Appendix for the NDI death identification and adjudication process.

Data Elements

Death data sources use different data elements for matching. This means that investigators need to ensure that adequate matching information and required patient consents are obtained during the course of the trial. For example, in the TRANSFORM-HF case example described below, the investigators will use the NDI, which requires more detailed Asian race categories than typically are reported for clinical trials. These investigators also decided to obtain data elements that are not required but increase the likelihood of an NDI match. These include patient marital status (using NDI categories), patient Social Security number, and patient middle initial. The table below compares the types of information used in searching different death data sources. Researchers should be advised that some of these data elements are optional for specific data sources and that the specific coding for data elements may differ between data sources. For this reason, researchers should consult the instructions for each data source when planning data collection for their study.

Comparison of Data Elements Used for Matching
Data element Death Master File National Death Index Medicare Master Beneficiary File California’s non-comprehensive death file NAPHSIS EVVE FOD
Social security number x x Last 4 digits x
Beneficiary identification code x
First name x x x x
Middle name x x x
Middle initial x
Last name x x x x
Birth date x x x x x
Month of birth x
Day of birth x
Year of birth x
Death date x x x x
Last known zip code x
Sex x x x
Father’s surname x x
State of birth x
Place of birth x
Race x x
County code x
State of residence x
State of death x
Place of death (county or state/county) x
Marital status x

Call Centers and Research Staff

Some studies use a central call center that consolidates follow-up activities typically performed by study sites. Call centers will perform telephone interviews with patients or their proxies at regular intervals using standard procedures that enhance overall study data quality and completeness. Research staff in the call center also may perform internet searches, visit social media sources, search for obituaries and grave markers, and contact the patient’s health care providers for patient death information.

Comparison of Sources of Death Data for use in PCTs

The table below compares different death information sources.

Comparison of Sources for Death Data
| | Death Master File | National Death Index | Call center | California non-comprehensive | Medicare Master Beneficiary Summary File |
|---|---|---|---|---|---|
| Creator | Social Security Administration | Centers for Disease Control and Prevention National Center for Health Statistics | Coordinating Center | California Department of Public Health | Centers for Medicare and Medicaid |
| Source data | Primarily from family members of the deceased when a claim is made by a beneficiary but also by funeral homes, financial institutions, and the states | Family members and funeral homes report deaths to the state. The states have monetary contracts to provide death registries annually to the NDI | A proxy, grave marker, online searches, social media, obituaries | Death certificates | Death information received from Medicare claims, family member online date of death edits, and Medicare benefits information |
| Cost | Subscribers pay annual fee | Per record/subscription. Can be expensive | Cost of call center | Relatively inexpensive ($120/year) | Variable. See fee worksheet |
| Lag | 4-6 months | Reporting occurs in batches at the end of a calendar year. The early release file (approximately 90% complete) is generally available in February and the final file is available by the end of the calendar year. | Depends on call/follow-up schedule defined in the protocol | 60 days for 96% of deaths | 9 months from the close of the calendar year |
| Cause of death | No | Included in an NDI Plus search | Sometimes, depending on the study and which records are collected and adjudicated | | |
| Data acquisition | Researchers can run their own queries | Data on patients must be submitted and a file will be returned | | | |
| Cons | In 2011, SSA stopped reporting 40% of deaths due to state laws limiting disclosure of state records (National Technical Information Service 2011; da Graca et al. 2013). | 1. Because deaths are included at the end of the calendar year, if a patient dies in January, and a researcher waits for the final file, it could involve almost a 2-year lag. 2. Depending on how many patients are involved in a trial and the follow-up duration, this method can be prohibitively expensive. | Involves extra effort and cost | Depending on the number of states involved in the trial, collecting the data from every state can be time consuming. | File only includes information on Medicare beneficiaries |

While the NDI will eventually report complete death data, there are reasons why a PCT may consider a hybrid approach that combines a call center with NDI searching. As an example, if a PCT participant dies in January, it potentially will be a year and a month before their death data are available in the early release file and almost two years before the data are available in the final file. If a participant dies in December, the lag to the early release file is only a month or two. This time lag can create problems for studies that rely upon NDI data and have reached the end of their follow-up period. Either they can wait until NDI reports all potential deaths for their study or they can supplement NDI with other death data sources. As an example, assume a clinical trial contacts study subjects at six-month intervals for telephone interviews. If a patient enrolls in January 2017 and the trial ends follow-up in February 2018, the NDI early release file for February 2018 deaths will not be available until early 2019 and the final file will not be available until October-November 2019, whereas death information derived from a 6-month follow-up telephone call to a relative or friend will be available in August 2018. In contrast, if the patient enrolls in October 2017 and the trial ends follow-up in February 2018, the NDI early release file for a November 2017 death will be available in early 2018 and the final file in October-November 2018. In this case, a death identified in the NDI’s early release file will be available before the patient’s next scheduled telephone interview in April 2018. Due to time lags in death information availability, investigators may choose a hybrid approach that relies upon multiple death data sources. This hybrid approach may be particularly useful in an event-driven trial that ends follow-up when a prespecified number of events have occurred.
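
The timing argument above can be made concrete with a small calculation. The Python sketch below approximates when a death first becomes searchable in the NDI; it assumes, purely for illustration, that the early release file opens on February 1 and the final file on November 1 of the year after the death (the text gives only "late January or early February" and "late October or early November"). The function names are hypothetical.

```python
from datetime import date
from typing import Tuple


def ndi_availability(death_date: date) -> Tuple[date, date]:
    """Approximate dates when a death is searchable in the early release and final files."""
    following_year = death_date.year + 1
    early_release = date(following_year, 2, 1)   # assumed stand-in for "late Jan/early Feb"
    final_file = date(following_year, 11, 1)     # assumed stand-in for "late Oct/early Nov"
    return early_release, final_file


def lag_in_months(death_date: date, available: date) -> int:
    """Whole calendar months between the death and file availability."""
    return (available.year - death_date.year) * 12 + (available.month - death_date.month)


if __name__ == "__main__":
    for death in (date(2018, 1, 15), date(2017, 12, 15)):
        early, final = ndi_availability(death)
        print(death, lag_in_months(death, early), lag_in_months(death, final))
```

Under these assumptions, a January death has a lag of 13 months to the early release file and 22 months to the final file, while a December death has lags of 2 and 11 months, consistent with the ranges described above.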


Use of NDI and Call Center for Ascertaining a Death Endpoint

In the ToRsemide compArisoN with furoSemide FOR Management of Heart Failure (TRANSFORM-HF) PCT (ClinicalTrials.gov Identifier: NCT03296813), investigators used both the NDI and a call center to ascertain death as an endpoint. Details of death ascertainment in TRANSFORM-HF were previously published by the authors of this chapter (Eisenstein et al. 2019), and we briefly summarize the salient aspects here.

TRANSFORM-HF is a randomized PCT of ~6000 patients hospitalized for new or worsening heart failure. The trial is cluster randomized by site (~50 sites), and patients receive a prescription for an oral diuretic (torsemide or furosemide; both are commonly prescribed) prior to hospital discharge.

All-cause mortality is the primary endpoint for TRANSFORM-HF, and investigators anticipate more than 720 deaths. The DCRI Call Center will conduct follow-up interviews with patients at 30 days, 6 months, 12 months, and at 6-month intervals thereafter to ascertain patient status. Because the information sources used by the call center (ie, contacting a proxy, searching for an obituary or acceptable grave marker, requesting medical records related to death) may be incomplete, investigators will also search NDI for patient deaths. Either method can be used to verify a patient death (Figure).

Flow Diagram for Confirmation of a Death

From: Eisenstein et al. 2019. Used with permission.

The TRANSFORM-HF study investigators considered the following Key Questions in their use of death as an endpoint. We believe these questions can be applied to other clinical trials where sites will not be responsible for identifying subject deaths and clinical events committees will not determine cause of death.

  • Do we need just fact of death (FOD), or do we need other data such as cause of death, occupation, marital status, educational level, manner of death, etc.? Could EVVE FOD be sufficient instead of using NDI?
    • TRANSFORM-HF’s primary endpoint is all-cause mortality. This FOD endpoint is the simplest death endpoint to collect.
      • If other death-related data (e.g., cause or location) and non-death-related data (e.g., occupation, marital status) are required, this will constrain the number of potential death data sources and may degrade the overall quality of information available to determine the death endpoint.
  • How often should we submit the NDI search?
    • TRANSFORM-HF’s NDI search plan is based upon the anticipated accrual of death events during the trial. TRANSFORM-HF will use both the early and final release files to capture death events at the earliest times they are available.
    • Since NDI data are available for specific calendar years, trials typically will conduct at least one search per year. Because of the time involved in managing NDI searches, many trials may choose only to use each year’s final release file.
  • Should we search all patients or only when vital status is unknown?
    • TRANSFORM-HF’s NDI search plan includes all patients not matching a previous NDI search. This means that patients whose deaths have been verified by the DCRI Call Center may be included in NDI searches. This decision was made to permit comparison of death data accrual for NDI and the Call Center.
    • Other studies may choose to eliminate patients from NDI searches when they are identified by another death data source.
  • If we find a "true match" during a search, can we drop the subject from future searches?
    • TRANSFORM-HF’s NDI search plan stipulates that when patients are identified as a true NDI match, they will not be included in subsequent years searched.
    • Since the NDI file is final for each year, there is no reason for a study to include matched patients in subsequent years searched.
  • We’re only doing follow-up phone calls for a maximum of 30 months post-randomization. Should we keep doing the NDI search on the early cohort of patients even if the death file being searched is beyond 30 months from their randomization date?
    • TRANSFORM-HF’s NDI search plan excludes patients who have reached their 30-month anniversary from subsequent-year searches.
    • Other studies may choose to extend NDI searches beyond the follow-up period as a means of obtaining long-term mortality data for their patients. However, this use of NDI data should be noted in the protocol, informed consent, and other regulatory documents.
  • Similar question for the later cohorts with less follow-up time by the Call Center.
    • As stated above, other studies may choose to extend NDI searches beyond their follow-up period. However, this use of NDI data should be noted in the protocol, informed consent, and other regulatory documents.
  • What are the possible permutations of non-agreement between the Call Center and NDI results that we should account for?
    • Most instances of non-agreement will be timing related: one death data source identified a death event before the other.
    • TRANSFORM-HF’s statistical analysis plan considers both death event sources (DCRI Call Center and NDI search) as being equal. Hence, if either source identifies a death event, it is accepted without verification by the other death data source.
    • In situations where both data sources have had sufficient time for death event identification, there may be instances where the NDI search identifies a death and the Call Center does not. Presumably, this will be due to deficiencies in Call Center death data sources. However, if the Call Center identifies a death event and NDI does not, the Call Center death will be referred for further investigation.
    • Other studies will determine how they will manage disagreements between different death data sources. This should occur before the study commences enrollment.
  • Can we assume if a patient wasn’t found dead by the Call Center or NDI that they are alive?
    • Positive contact by the Call Center will be used as evidence the patient is alive as of the call date. However, if the Call Center does not verify the patient is dead, that does not mean the patient is alive. All that is known is that the patient was alive on their last contact date.
    • In contrast, if the NDI search does not identify the patient as dying in a calendar year, that patient is presumed to be alive on the last day of the calendar year.
  • What will we consider the event-free censoring date for mortality, given that we have staggered amounts of follow-up per “cohort”, Call Center and NDI methodology, etc.?
    • Although TRANSFORM-HF subjects do not have an “end of study visit” with the enrolling sites, they are scheduled to have a final telephone visit with the DCRI Call Center. Depending upon their entry cohort, this final visit will occur at 12, 18, 24, or 30 months of follow-up. If that visit occurs, the visit date will be the subject’s censoring date.
    • Other studies faced with this question will need to decide how to set censoring dates for study subjects. In making this determination, they will need to consider what is stated in the study protocol, informed consent, and other regulatory documents.
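
The reconciliation and censoring rules discussed in the questions above can be expressed as a small decision function. The sketch below is not the TRANSFORM-HF analysis code. The class and field names are hypothetical, and two choices are assumptions that the text does not specify: using the earlier date when both sources report a death, and falling back to the last positive contact date when the final telephone visit did not occur.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class VitalStatus:
    call_center_death: Optional[date]   # death date verified by the call center
    ndi_death: Optional[date]           # death date from a true NDI match
    final_call_date: Optional[date]     # completed final telephone visit, if any
    last_contact_date: date             # last positive contact (known alive)
    ndi_final_searched: bool = False    # final NDI file for the relevant year searched


def mortality_outcome(v: VitalStatus) -> Tuple[bool, date, str]:
    """Return (died, event_or_censoring_date, note) for one participant."""
    deaths = [d for d in (v.call_center_death, v.ndi_death) if d is not None]
    if deaths:
        # Either source is accepted without verification by the other.
        note = "death accepted"
        if v.ndi_death is None and v.ndi_final_searched:
            # NDI had sufficient time but found nothing: flag for follow-up.
            note = "call-center death without NDI match: refer for investigation"
        # Assumption: when both sources report, use the earlier date.
        return True, min(deaths), note
    if v.final_call_date is not None:
        return False, v.final_call_date, "censored at final telephone visit"
    # Assumption: without a final visit, censor at the last positive contact.
    return False, v.last_contact_date, "censored at last contact (assumed fallback)"


if __name__ == "__main__":
    example = VitalStatus(None, date(2018, 3, 1), None, date(2018, 1, 5))
    print(mortality_outcome(example))
```

A study that extends NDI searches beyond telephone follow-up might instead censor unmatched patients at the end of the last fully searched calendar year; that rule would replace the final fallback branch.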

Appendix. NDI Death Identification and Adjudication Process

NDI Death identification process

  • Each submitted record must contain at least one of the following combinations of identifying data elements:
    • Social Security number, sex, full date of birth present
    • Last name, first initial, month of birth, year of birth present
    • Last name, first initial, Social Security number present
  • Additional demographic variables increase the odds of a true match:
    • Middle initial
    • Father’s surname
    • State of Birth
    • State of residence
    • Marital Status
    • Race
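
Before submitting records, a study team might screen each one against the three minimum identifier combinations listed above. The following is a minimal sketch that assumes each record is a plain dictionary; the field names (ssn, birth_day, and so on) are hypothetical and do not reflect the NDI submission file layout.

```python
# Each tuple is one acceptable combination of identifying data elements.
# Field names are illustrative assumptions, not the NDI file specification.
REQUIRED_COMBINATIONS = [
    ("ssn", "sex", "birth_day", "birth_month", "birth_year"),
    ("last_name", "first_initial", "birth_month", "birth_year"),
    ("last_name", "first_initial", "ssn"),
]


def has_values(record: dict, fields: tuple) -> bool:
    """True when every named field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in fields)


def meets_minimum_identifiers(record: dict) -> bool:
    """True when the record satisfies at least one required combination."""
    return any(has_values(record, combo) for combo in REQUIRED_COMBINATIONS)


if __name__ == "__main__":
    ok = {"last_name": "Doe", "first_initial": "J", "ssn": "123456789"}
    too_sparse = {"last_name": "Doe", "birth_year": 1950}
    print(meets_minimum_identifiers(ok), meets_minimum_identifiers(too_sparse))
```

Records that fail the check could be routed back for additional data collection rather than submitted, since they cannot be searched.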

Death Adjudication Process

Search Results Score

  • Records are returned with a score reflecting the degree of agreement between the identifying information on the submission record and the NDI death record.
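  • The score is based upon probabilistic weights assigned to each of the identifying data items used in the NHIS-NDI record match (Fellegi and Sunter 1969).
    $$\text{Score} = \sum_{i=1}^{9} W_{\text{SSN}_i} + W_{\text{firstname} \times \text{sex} \times \text{birthyear}} + W_{\text{middleinitial} \times \text{sex}} + W_{\text{lastname}} + W_{\text{race}} + W_{\text{sex}} + W_{\text{maritalstatus} \times \text{sex} \times \text{age}} + W_{\text{birthdate}} + W_{\text{birthmonth}} + W_{\text{birthyear}} + W_{\text{stateofbirth}} + W_{\text{stateofresidence}}$$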


  • Then, each match is categorized into one of five mutually exclusive classes that take into account which identifying items agree.
  • Classes reflect that some of the 12 NDI identifying items are more important for determining true matches than others (e.g. SSN versus state of birth) and that non-changing identifying information is more important than information that can change over time (e.g. birth surname versus marital status).
  • As SSN is a key identifier in the matching process, each NHIS-NDI record match is initially classified according to whether SSN is:
    • present and agrees (Class 1 or 2), or
    • present but disagrees (Class 5), or
    • missing (Class 3 or 4).
  • Class 1: Agrees on at least 8 (of 9) digits of SSN, first name, middle initial (including blank), last name, birth year (+/- 3 years), birth month, sex, and state of birth.
  • Class 2: Agrees on at least 7 (of 9) digits of SSN and at least 5 more of the following items: first name, middle initial (including blank), last name, birth year (+/- 3 years), birth month, sex, and state of birth.
  • Class 3: There are two types of Class 3 matches - Type A and B
  • Class 4: SSN is unknown on either the NHIS submission record or the NDI record and fewer than 8 of the items listed in Class 3 match.
  • Class 5: SSN is present but fewer than 7 (of 9) digits on SSN agree.

Algorithm for Determining True Matches

1.  Exclude poor matches:

–      If NDI death date is before the randomization date

–      If NDI score<=0

–      If NDI class=5

2.  Narrow down potential matches to best single match:

–      Drop duplicates (i.e., match records with same death certificate)

–      Select the one with the smallest value of class for the patient

–      Select the one with the largest score

–      Manual review in event of a tie (use importance of matching items as tiebreaker)
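
The two steps above can be sketched as a single selection function. This is an illustration only, not the study's production code: the NdiMatch fields are hypothetical names for values returned with each NDI match, and keeping the best-ranked record among duplicates of the same death certificate is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class NdiMatch:
    certificate_id: str   # identifies the death certificate behind the match
    death_date: date
    score: float          # NDI probabilistic score
    match_class: int      # NDI class, 1 (best) to 5 (worst)


def select_true_match(
    matches: List[NdiMatch], randomization_date: date
) -> Tuple[Optional[NdiMatch], bool]:
    """Return (best_match, needs_manual_review) for one patient."""
    # Step 1: exclude poor matches.
    kept = [
        m for m in matches
        if m.death_date >= randomization_date
        and m.score > 0
        and m.match_class != 5
    ]
    # Step 2a: drop duplicates that point to the same death certificate
    # (assumption: keep the record with the lowest class, then highest score).
    unique: Dict[str, NdiMatch] = {}
    for m in kept:
        prev = unique.get(m.certificate_id)
        if prev is None or (m.match_class, -m.score) < (prev.match_class, -prev.score):
            unique[m.certificate_id] = m
    candidates = list(unique.values())
    if not candidates:
        return None, False
    # Step 2b: keep only the smallest class, then the largest score.
    best_class = min(m.match_class for m in candidates)
    candidates = [m for m in candidates if m.match_class == best_class]
    best_score = max(m.score for m in candidates)
    top = [m for m in candidates if m.score == best_score]
    # Step 2c: a tie cannot be resolved automatically.
    if len(top) > 1:
        return None, True
    return top[0], False
```

When the second return value is true, a reviewer would compare the tied records by hand, weighing the more important matching items (for example, SSN over state of birth) as the tiebreaker.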





da Graca B, Filardo G, Nicewander D. 2013. Consequences for Healthcare Quality and Research of the Exclusion of Records From the Death Master File. Circulation: Cardiovascular Quality and Outcomes. 6(1):124–128. doi:10.1161/CIRCOUTCOMES.112.968826. PMID: 23322808.

Eisenstein EL, Prather K, Greene SJ, Harding T, Harrington A, Gabriel D, Jones I, Mentz RJ, Velazquez EJ, Anstrom KJ. 2019. Death: The Simple Clinical Trial Endpoint. Stud Health Technol Inform. 257:86–91. DOI: 10.3233/978-1-61499-951-5-86. PMID: 30741178.

Fellegi IP, Sunter AB. 1969. A Theory for Record Linkage. Journal of the American Statistical Association. 64(328):1183–1210. doi:10.1080/01621459.1969.10501049.

Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. 2009. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. American Heart Journal. 157(6):995–1000. doi:10.1016/j.ahj.2009.04.002. PMID: 19464409.

Lauer MS, Blackstone EH, Young JB, Topol EJ. 1999. Cause of death in clinical research: time for a reassessment? J Am Coll Cardiol. 34(3):618–620. PMID: 10483939.


Lloyd-Jones DM, Martin DO, Larson MG, Levy D. 1998. Accuracy of death certificates for coding coronary heart disease as the cause of death. Ann Intern Med. 129(12):1020–1026. DOI: 10.7326/0003-4819-129-12-199812150-00005. PMID: 9867756.

National Center for Health Statistics. National Death Index. [accessed 2018 Jun 17].

National Research Council. 2009. Vital Statistics: Summary of a Workshop. Washington, D.C.: National Academies Press. doi: 10.17226/12714. PMID: 25032356.

National Technical Information Service. 2011. Important Notice: Change in Public Death Master File Records. [accessed 2018 Jun 17].

Social Security Administration. 1983. Compilation of the Social Security Laws. [accessed 2018 Jun 17].

Navar AM, Peterson ED, Steen DL, et al. 2019. Evaluation of Mortality Data From the Social Security Administration Death Master File for Clinical Research. JAMA Cardiol. doi: 10.1001/jamacardio.2019.0198. PMID: 30840023.

Warren JR, Milesi C, Grigorian K, Humphries M, Muller C, Grodsky E. 2017. Do inferences about mortality rates and disparities vary by source of mortality information? Ann Epidemiol. 27(2):121–127. doi:10.1016/j.annepidem.2016.11.003. PMID: 27964929.

Version History

Published February 28, 2019.


Eisenstein E, Curtis L, Prather K, et al. Choosing and Specifying Endpoints and Outcomes: Using Death as an Endpoint. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda, MD: NIH Health Care Systems Research Collaboratory. Available at: Updated August 13, 2019. DOI: 10.28929/118.