HEALTHCARE COST & UTILIZATION PROJECT

HCUP Using Multiple Years of Data

Introduction

Thank you for joining us for this Healthcare Cost and Utilization Project, or HCUP, online tutorial on multi-year analysis. This course presents solutions that may be necessary when conducting analyses that span multiple years.

One of the strengths of HCUP is that multiple years of data are available. This makes it possible to study trends over time on topics such as utilization, access, charges, quality, and outcomes. It also allows researchers to study rare conditions by combining multiple years to gain sufficient sample size.

However, errors in study results may occur when two or more years of data are combined. This course will describe problems that may arise when using multiple years of HCUP data and provide you with easy solutions for addressing these issues.

This tutorial presents examples and solutions using the HCUP National (Nationwide) Inpatient Sample, or NIS, and Kids’ Inpatient Database, or KID. However, some of the issues about data elements are also relevant to other HCUP databases. This course will take 30 minutes to complete.


Learning Objectives

Objective #1: The first objective is to become familiar with the ways that HCUP data have changed over the years and how these changes can affect your research. This course does not provide an exhaustive list of database changes. Rather, the course highlights the types of changes, alerts you as a user to these changes, and then provides solutions and more detailed resources.

Objective #2: The second objective is to learn how to prevent problems that can arise from combining multiple years of data.

Objective #3: The third objective is to learn how the NIS Trend Weights Files and the KID Trend Weights File can help with analysis using multiple years of HCUP data prior to 2012.


About HCUP

Before we get started, a quick word about HCUP: HCUP is sponsored by the Agency for Healthcare Research and Quality (AHRQ), which is part of the U.S. Department of Health and Human Services. HCUP is a family of databases, software tools, and related research products that enable research on a variety of healthcare topics, including cost and quality of health care services, medical practice patterns, access to health care, and treatment outcomes.

If you are unfamiliar with HCUP or would like a refresher, please consider taking our general HCUP Overview Course.


Summary of Changes That Affect Multi-Year Analysis

HCUP databases have grown and changed over the years. To calculate results accurately when using multiple years of HCUP data, it is important to understand the following changes and account for them in your analyses.

Changes to NIS and KID Designs and Weights

Sampling design changes were made to the NIS in 1998 and 2012 and to the KID in 2000. These changes caused a discontinuity in the application of weights between the 1997 and 1998 NIS, the 2011 and 2012 NIS, and the 1997 and 2000 KID.

In addition to design changes, the data available for the NIS and KID sampling frame have changed annually, because over time more States have been added. Estimates from earlier years of the NIS and KID may be subject to more sampling bias than later years of the NIS and KID.

Changes to Data Elements

Data element changes occur across all HCUP databases. For example, the labels, values, and availability of some data elements have changed over time, and other data elements have been discontinued.

Previous    Current
DISPUB92    DISPUB04
DISCWT_U    DISCWT
HOSPID      HOSP_NIS
HOSPST      HOSP_DIVISION
KEY         KEY_NIS



Overview of the NIS

  • The National (Nationwide) Inpatient Sample (NIS) is HCUP’s nationally representative database for hospital inpatient stays.


  • Beginning with 2012 data, the NIS was redesigned to improve national estimates. The NIS is a sample of discharges from all community hospitals participating in HCUP, approximating a 20% sample of discharges from U.S. community hospitals, excluding rehabilitation and long-term acute care hospitals.


  • Prior to 2012, the NIS was a sample of hospitals, approximating a 20% sample of U.S. community hospitals, excluding rehabilitation hospitals. All discharges from sampled hospitals were retained.


  • The NIS data are available yearly from 1988 and can be used to identify, track, and analyze national trends in healthcare utilization, access, charges, quality, and outcomes.


  • To highlight the design change, beginning with 2012 data, AHRQ renamed the NIS from the "Nationwide Inpatient Sample" to the "National Inpatient Sample".

For more detailed information about the NIS and other HCUP databases, please visit the HCUP Overview Course and the HCUP Sample Design Tutorial.


Overview of the KID

The Kids' Inpatient Database (KID) is HCUP’s nationwide database of hospital inpatient stays for children. It is a sample of pediatric discharges (age 20 or less at admission) from community, nonrehabilitation hospitals from participating HCUP State Partners included in the HCUP State Inpatient Databases (SID).

Pediatric discharges in the KID are stratified into three categories:

  1. uncomplicated in-hospital births
  2. complicated in-hospital births
  3. all other pediatric hospital stays

Systematic random sampling is then used to select 10% of uncomplicated in-hospital births and 80% of complicated in-hospital births and other pediatric cases from each frame hospital.
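
The sampling step above can be illustrated with a minimal Python sketch. It is not the procedure used to build the KID: the function name, the record labels, the fixed seed, and the single hypothetical hospital are all assumptions made for illustration, and any sorting applied before sampling is omitted.

```python
import random

def systematic_sample(records, rate, rng):
    """Select roughly `rate` of `records` at a fixed interval from a random start."""
    interval = 1.0 / rate              # e.g. 10.0 for a 10% sample
    start = rng.random() * interval    # random starting point in [0, interval)
    picked = []
    k = 0
    while start + k * interval < len(records):
        picked.append(records[int(start + k * interval)])
        k += 1
    return picked

# Hypothetical frame hospital: 100 uncomplicated births and 100 other stays.
rng = random.Random(2015)  # fixed seed so the sketch is repeatable
uncomplicated = [f"birth-{i}" for i in range(100)]
other = [f"stay-{i}" for i in range(100)]

sample = (systematic_sample(uncomplicated, 0.10, rng)
          + systematic_sample(other, 0.80, rng))
print(len(sample))
```

Because the two strata are sampled at very different rates, each sampled record represents a different number of discharges, which is why the discharge weights matter when making estimates.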

The KID is available every three years starting in 1997. Researchers and policymakers can use the KID to identify, track, and analyze national trends in hospital utilization, access, charges, quality, and outcomes for children.


Overview of the NEDS

The Nationwide Emergency Department Sample (NEDS) is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-based ED visits.

The NEDS is sampled from HCUP State Partners participating in the HCUP State Inpatient Databases, or SID, and the HCUP State Emergency Department Databases, or SEDD.

The NEDS is produced annually beginning in 2006. Researchers can use the NEDS to support health care policy and research on a variety of topics, such as hospital and patient characteristics of ED visits, medical treatment effectiveness, and impact of health policy changes.

Many of the principles discussed in this tutorial apply to the NEDS. However, no Trend Weights are necessary for a multi-year analysis of the NEDS, because it has not been redesigned since it was originally developed in 2006.


NIS, KID, and NEDS for National Estimates

The NIS, KID, and NEDS are large databases with sample designs that are intended to be nationally and regionally representative of all hospitalizations and emergency department visits for adult and pediatric patients in the U.S.

To make accurate national estimates, it is important to weight the observations in your studies.
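
As a rough illustration of what weighting means, the Python sketch below computes a weighted count and a weighted mean from a few made-up records. The record values are hypothetical; DISCWT is the discharge weight named in this tutorial, and LOS is assumed here to be a length-of-stay element. The sketch produces point estimates only and does not attempt variance estimation, which depends on the sample design.

```python
# Hypothetical sample records: each sampled discharge carries a weight
# giving the number of discharges it represents nationally.
records = [
    {"DISCWT": 5.0, "LOS": 3},
    {"DISCWT": 5.0, "LOS": 7},
    {"DISCWT": 4.0, "LOS": 2},
]

def weighted_total(rows, weight="DISCWT"):
    """Estimated national number of discharges: the sum of the weights."""
    return sum(r[weight] for r in rows)

def weighted_mean(rows, value, weight="DISCWT"):
    """Weighted mean of `value`, skipping rows where it is missing (None)."""
    usable = [r for r in rows if r[value] is not None]
    total_weight = sum(r[weight] for r in usable)
    return sum(r[weight] * r[value] for r in usable) / total_weight

print(weighted_total(records))
print(weighted_mean(records, "LOS"))
```

An unweighted count of these records would be 3, while the weighted estimate is the sum of the weights.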


Changes in the NIS Design and Weights 2011-2012

Major changes in the NIS sampling design occurred in 2012 to improve national estimates.

  • Beginning with 2012 data, the NIS was redesigned and is now created using a sample of discharges from all community hospitals participating in HCUP, approximating a 20% sample of discharges from U.S. community hospitals, excluding rehabilitation and long-term acute care hospitals.


  • Prior to 2012 the NIS was created from a sample of hospitals, approximating a 20% sample of U.S. community hospitals, excluding rehabilitation hospitals. All discharges from sampled hospitals were retained.

As a result of the changes implemented in the 2012 redesign, users should expect one-time disruptions to historical trends for counts, rates, and means estimated from the NIS, beginning with data year 2012.

Some of the one-time differences anticipated between the 2011 and 2012 data are:
  • discharge counts are expected to decline by about 4.3 percent,
  • average length of stay is expected to decline by about 1.5 percent,
  • total charges are expected to decline by about 0.5 percent, and
  • hospital mortality is expected to decline by about 2.0 percent.

To facilitate analysis of trends using multiple years of NIS data, AHRQ provides files with adjusted weights for the years 1993 to 2011. The data element TRENDWT contains weights that are consistent with those used in the redesigned 2012 NIS. For trends analysis, use TRENDWT instead of the original NIS discharge weight (DISCWT).

For analysis spanning the 1993 to 2002 NIS data years, users can use either the 1993 to 2011 NIS Trend Weights Files or the 1993 to 2002 NIS Supplemental Discharge-Level Files. Both contain the new NIS trend weight (TRENDWT), which accounts for the redesign of the 2012 NIS. The NIS Supplemental Discharge-Level Files also provide data elements that were added or changed in later data years of the NIS (through 2002). Because the Supplemental Discharge-Level Files are available only through 2002, the hospital-level Trend Weights Files are the only option for 2003 to 2011. The NIS Supplemental Discharge-Level Files are available only through the HCUP Central Distributor.

Please note that this tutorial focuses on the 1993 to 2011 NIS Trend Weights Files. These files stop at data year 2011 because their purpose is to make 1993 to 2011 NIS data comparable to the 2012 NIS; trend weights are not needed for the 2012 NIS itself.

The comparison below shows how the NIS Trend Weights Files and the NIS Supplemental Discharge-Level Files differ:

Years Available
  • NIS Trend Weights Files: 1993-2011
  • NIS Supplemental Discharge-Level Files: 1993-2002

Trend Weight
  • NIS Trend Weights Files: To create consistent weights across years, use the new Trend Weight (TRENDWT) contained in the NIS Trend Weights Files instead of the original discharge weight (DISCWT), which is contained in the NIS Core File.
  • NIS Supplemental Discharge-Level Files: To create consistent weights across years, use the new Trend Weight (TRENDWT) contained in the NIS Supplemental Discharge-Level Files instead of the original discharge weight (DISCWT), which is contained in the NIS Core File.

Additional Data Elements
  • NIS Trend Weights Files: None (other than YEAR and HOSPID for linkage)
  • NIS Supplemental Discharge-Level Files: Additional data elements that were added or changed for later data years of the NIS through 2002 (example: FEMALE, an indicator of sex)

Source
  • NIS Trend Weights Files: Available for download, free of charge, through the HCUP User Support (HCUP-US) Web site
  • NIS Supplemental Discharge-Level Files: Available only through the HCUP Central Distributor (HCUP Supplemental Files)




Changes in the KID Design and Trend Weights 1997-2000

The KID is a sample of pediatric hospital discharges from each participating HCUP Partner organization. (The NIS is a sample of hospital discharges.) Beginning with the 2000 KID, short-term rehabilitation hospitals were excluded, and the definition of total discharges in the universe was revised. Additional changes beginning with the 2000 KID included the addition of patients aged 19 to 20 to the database.


Using the NIS Trend and KID Trend Weights Files

  • Facilitate analysis of trends using multiple years of NIS and KID data

    The NIS Trend Weights and KID Trend Weights files were created in order to facilitate analysis of trends using multiple years of NIS and KID data.
    • For the NIS, the 1993 to 2011 trend weights contained in the NIS Trend Weights Files were calculated for consistency with the weights in the redesigned 2012 NIS and should be used instead of the original 1993 to 2011 NIS discharge weights included in the NIS Core files.


    • For the KID, the 1997 trend weight contained in the KID Trend Weights File was calculated for consistency with the weights for the 2000 and later years of the KID. Trend analyses for 2000 and later KID data do not need the KID trend weights.


  • Merge with the original database files to produce a single data set for each year

    The NIS and KID Trend Weights files were designed to be merged with the original database files to produce a single data set for each year.
    • For trends analysis that includes 2012 and earlier NIS data, the trend weight (TRENDWT), found in the NIS Trend Weights Files, should be used for data years prior to 2012 to make estimates consistent with the new 2012 NIS design. Trend Weights are not needed for the 2012 NIS; the regular discharge weight included in the NIS Core file (DISCWT) should be used for 2012 (and future) data.


    • For multi-year analysis of KID data, researchers should use the KID Trend Weights (named DISCWT) found in the KID Trend Weights File in place of the original KID yearly weights (named DISCWT_U) included in the Core file of the KID for data years prior to 2000.


One-time Revisions in NIS and KID Data Elements

HCUP made one-time revisions to several data elements in the NIS and KID with the 2012 NIS redesign. There were three types of data element revisions: replaced, new, and discontinued.

Below are some examples of these one-time data element changes. This is not an exhaustive list.

Replaced Data Elements

There are several replaced data elements.

An example of this type of change occurred with HCUP-encrypted Hospital identifiers in the NIS and KID. Prior to 2012, the NIS and KID used the HCUP hospital number (HOSPID). In 2012, HOSPID was replaced with HOSP_NIS or HOSP_KID.

New Data Elements

The 2012 NIS and KID contain a few data elements that were not included in earlier years.

For example, Census Division of hospital (HOSP_DIVISION) was added to the NIS and KID beginning with 2012 data. Hospital State (HOSPST) is no longer available, and the NIS is stratified by Census Divisions rather than Census Regions.

Discontinued Data Elements

Several data elements that could not be used for national estimates were discontinued from the NIS and KID beginning with 2012 data.

For example, the Expected secondary payer (PAY2) and Expected secondary payer as received from source (PAY2_X) were dropped from the NIS and KID beginning with 2012 data as they are not available uniformly across the States.

Choosing NIS and KID Data Elements for Multi-year Analysis

For all data elements you plan to use in your analysis, first perform descriptive statistics and examine the range of values, including the number of missing cases.
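
A descriptive check of this kind could be sketched in Python as follows. The function name and the sample rows are hypothetical, and the `missing` argument is an assumption: pass whatever values your load program uses to represent missing data.

```python
def describe_by_year(rows, element, missing=(None,)):
    """Per data year: record count, missing count, and observed range of `element`."""
    summary = {}
    for row in rows:
        stats = summary.setdefault(
            row["YEAR"], {"n": 0, "missing": 0, "min": None, "max": None}
        )
        stats["n"] += 1
        value = row.get(element)   # an element absent from a year counts as missing
        if value in missing:
            stats["missing"] += 1
            continue
        stats["min"] = value if stats["min"] is None else min(stats["min"], value)
        stats["max"] = value if stats["max"] is None else max(stats["max"], value)
    return summary

# Hypothetical rows: the element is partly missing in 2005 and absent in 2006.
rows = [
    {"YEAR": 2005, "RACE": 1},
    {"YEAR": 2005, "RACE": None},
    {"YEAR": 2005, "RACE": 3},
    {"YEAR": 2006},
    {"YEAR": 2006, "RACE": None},
]
summary = describe_by_year(rows, "RACE")
for year in sorted(summary):
    print(year, summary[year])
```

Running the summary by year, rather than on the pooled file, makes it easier to spot a data element that appears, disappears, or changes its range of values partway through the study period.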

Not all data elements in the NIS and KID are provided by each hospital. These data elements are provided on the NIS and KID because they can be valuable for research purposes, but they should be used with caution. For example, RACE is missing for some hospitals; thus, national estimates using RACE should be interpreted and reported with caveats.

Differences exist across the State data sources in the collection of information. For example, ED services prior to admission can be indicated by various data elements, such as hospital charges for ED services, ED as the point of origin to the hospital, or CPT procedure codes indicating ED physician services. Unfortunately, a State data source may report only some of these data elements. The most reliable way to identify ED admissions in the HCUP databases is to use the data element HCUP_ED, which considers all available indicators of ED services.


How to Use the NIS and KID Trend Weights Files

  • The NIS and KID Trend Weights files are to be used in conjunction with the original NIS and KID files when conducting an analysis of multiple years of NIS and KID data.
  • Merge each year’s NIS or KID Trend Weights files with the original NIS or KID database for that year.
  • Then, combine the merged output files for each year necessary for your study.
  • For the NIS, the new Trend Weight (TRENDWT), found in the NIS Trend Weights Files, should be used in place of the original discharge weight (DISCWT), included in the NIS Core file, to create national estimates for trends analysis.
  • For the KID, the new discharge weight (DISCWT) should be used in place of the original discharge weight (DISCWT_U).
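
The merge step for the NIS could look like the Python sketch below. It assumes that the hospital-level NIS Trend Weights file for a year links to the Core file through YEAR and HOSPID; the function name and all row values are made up for illustration.

```python
# Hypothetical rows standing in for one year's NIS Core file (discharge level)...
core_2010 = [
    {"YEAR": 2010, "HOSPID": 101, "KEY": 1, "DISCWT": 5.0},
    {"YEAR": 2010, "HOSPID": 101, "KEY": 2, "DISCWT": 5.0},
    {"YEAR": 2010, "HOSPID": 202, "KEY": 3, "DISCWT": 4.8},
]
# ...and for that year's NIS Trend Weights file (hospital level).
trend_2010 = [
    {"YEAR": 2010, "HOSPID": 101, "TRENDWT": 4.7},
    {"YEAR": 2010, "HOSPID": 202, "TRENDWT": 4.6},
]

def merge_trend_weights(core_rows, trend_rows):
    """Attach TRENDWT to every discharge: a many-to-one join on (YEAR, HOSPID)."""
    lookup = {(t["YEAR"], t["HOSPID"]): t["TRENDWT"] for t in trend_rows}
    merged = []
    for row in core_rows:
        key = (row["YEAR"], row["HOSPID"])
        if key not in lookup:
            # Fail loudly rather than silently dropping or unweighting records.
            raise KeyError(f"no trend weight found for (YEAR, HOSPID) = {key}")
        merged.append({**row, "TRENDWT": lookup[key]})
    return merged

merged_2010 = merge_trend_weights(core_2010, trend_2010)
print(merged_2010[0])
```

The sketch keeps the original DISCWT alongside TRENDWT so the two can be compared, and it raises an error for any hospital without a trend weight so that a failed match cannot go unnoticed.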

Trend Examples

Below are trend examples and steps to follow.

Trend
One example of a trend analysis might be a multi-year analysis using the 2003 to 2012 NIS to examine a condition across 10 years of the NIS databases.

Small Sample
An example of a small sample is a study of a relatively rare disease in which one year of the NIS does not provide sufficient sample size and statistical power to detect differences in outcomes. You estimate that about 300 people are diagnosed with this disease every year and plan to pull the records with this diagnosis from 10 years of the NIS.

Steps to Follow
Regardless of whether you will use multiple years of the NIS to calculate a trend or to build a larger sample to allow you to study a rare disease, the following steps are the same.

Example
In this example, you are going to use SAS to analyze the data.

  1. First, obtain the NIS databases for 2003 to 2012.
  2. Next, obtain the NIS Trend Weights files for 2003 to 2011.
  3. Then load these databases into SAS.
  4. You will then merge the complete NIS Core file for each year with the corresponding NIS Trend Weights file for that year, for the years 2003 to 2011. Or, in other words, merge the 2003 NIS Core file with the 2003 NIS Trend Weights file, then merge the 2004 NIS Core file with the 2004 NIS Trend Weights file, and so on.
  5. The next step is to check the availability of data elements in the time frame. If you plan to use the data elements included in the NIS Hospital Weights File, Severity Measures File, or Diagnosis and Procedure Groups File, merge these files with the merged NIS Core–Trend Weight file.
  6. You will then concatenate all the databases: the 2003 to 2011 merged NIS Core–Trend files and the 2012 complete NIS Core file.
  7. Finally, subset the database above, keeping the data elements necessary for your analysis.

Remember to keep and use the NIS Trend Weight (TRENDWT), found in the NIS Trend Weights Files, in place of the original NIS weight (DISCWT), included in the NIS Core files, for all 2003 to 2011 records that are necessary for your study. For 2012 records, use DISCWT.

These steps will produce a file that can be used to calculate 2003-2012 trends or to study the relatively rare disease with about 3,000 total observations over the 10-year time period. Example SAS programs to concatenate multiple years of the NIS can be found at the conclusion of this course.
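
Although the example in this course uses SAS, the concatenate-and-subset steps can be sketched in Python to show the logic. The sketch is illustrative only: ANALYSIS_WT is a hypothetical column name that holds TRENDWT for pre-2012 records and DISCWT for 2012 records, and all row values are made up.

```python
from collections import defaultdict

def stack_years(merged_pre_2012, core_2012, keep):
    """Concatenate merged pre-2012 rows with 2012 Core rows.

    Only YEAR, ANALYSIS_WT (hypothetical name), and the elements listed in
    `keep` are retained. ANALYSIS_WT is TRENDWT before 2012 and DISCWT in 2012.
    """
    stacked = []
    for row in merged_pre_2012:
        slim = {name: row.get(name) for name in keep}
        slim.update(YEAR=row["YEAR"], ANALYSIS_WT=row["TRENDWT"])
        stacked.append(slim)
    for row in core_2012:
        slim = {name: row.get(name) for name in keep}
        slim.update(YEAR=row["YEAR"], ANALYSIS_WT=row["DISCWT"])
        stacked.append(slim)
    return stacked

def estimated_discharges_by_year(stacked):
    """Weighted discharge count per data year."""
    totals = defaultdict(float)
    for row in stacked:
        totals[row["YEAR"]] += row["ANALYSIS_WT"]
    return dict(totals)

# Hypothetical rows: a merged Core-Trend Weights file and a 2012 Core file.
merged_2011 = [
    {"YEAR": 2011, "HOSPID": 101, "DISCWT": 5.0, "TRENDWT": 4.5, "DIED": 0},
    {"YEAR": 2011, "HOSPID": 101, "DISCWT": 5.0, "TRENDWT": 4.5, "DIED": 1},
]
core_2012 = [
    {"YEAR": 2012, "HOSP_NIS": 9001, "DISCWT": 5.0, "DIED": 0},
    {"YEAR": 2012, "HOSP_NIS": 9002, "DISCWT": 5.0, "DIED": 0},
]

stacked = stack_years(merged_2011, core_2012, keep=["DIED"])
print(estimated_discharges_by_year(stacked))
```

Storing the applicable weight in a single column means later estimation code does not need to branch on the data year.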

KID Trend Weights File

The KID Trend Weights file is a discharge-level file that provides KID data users with discharge weights that are consistently defined across data years.

  • The KID Trend Weights file is available and needed only for the 1997 KID.
  • To facilitate analysis using multiple years of data, the discharge weights included in this file were calculated for 1997 data to be consistent with the weights for the 2000 and later years of the KID.
  • The KID database for 2000 and forward can be used for multi-year analysis as is, because there are no KID design changes that affect weights during that time.

The KID Trend Weights file should be used in conjunction with the original 1997 KID database in the same manner as the NIS Trend Weights files are used with the NIS.

Yearly Data Element Changes Overview

In addition to HCUP database design changes that affect weights, there are three general types of data element changes that researchers must take into account when using multiple years of HCUP databases.

States Adding/Modifying Data Elements

The first type of data element change is yearly changes for a few data elements as a result of either States adding data elements or modifying previously available data elements.

Changes in the Standard Billing Form

The second type of data element change is in the standard billing form, such as UB92 and UB04, which results in changes to data elements.

Updates to Coding Systems

The third type of data element change is yearly updates to coding systems, including the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) and Diagnosis Related Groups (DRGs).

These types of changes in data element coding and definitions need to be adjusted manually by the user. The pages in this section include some examples of these changes, their effect on research studies, and solutions that users can implement. While these examples do not represent an exhaustive list, they are designed to encompass the data element changes that could affect your research study.

States Modifying Data Elements

HIV/AIDS diagnosis data is an example of how the availability of data elements may change over time due to state modifications to existing data elements.

Caution

Over time, policies in some States have changed regarding whether certain data elements are made available and, if so, whether they are perturbed or masked in some way. These decisions are made by HCUP Partner organizations to protect the privacy of patients.

Effect

Beginning in 2001, Iowa prohibited the release of information about inpatient records with HIV infections (defined by MDC of 25).

Nebraska also does not release inpatient discharge records to HCUP for patients with HIV diagnoses. In addition, New York masks information if the inpatient record contains an HIV/AIDS diagnosis.

These missing data could result in underestimation of the number of discharges with HIV/AIDS.

Solution

Determine which data elements are available and if there are any restrictions on them for the years you will include in your multi-year analysis.

Check the State Specific Restrictions in the Introduction to the NIS and the Introduction to the KID.

Also check the Availability of Data Elements in the NIS and the Availability of Data Elements in the KID.

Changes in the Standard Billing Form

Changes occur for a few data elements as a result of changes in the standard billing form, such as UB92 and UB04. An example of this type of change occurred with source of admission.

Caution

Beginning in the 2007 data year, the National Uniform Billing Committee updated the specifications on admission source coding methods. Some hospitals adopted the new codes quickly, but others took more than one year to adopt them. Detailed information can be found on the HCUP-US Web site.

Effect

The continuity of admission source information was disrupted because of the transition from ASOURCEUB92 to PointOfOriginUB04.

Because many hospitals shifted gradually from the old admission source codes to the new codes, record counts for ASOURCEUB92 and ASOURCE have declined steadily since 2007, while record counts for PointOfOriginUB04 and PointOfOrigin_X have increased.

Solution

Be aware of changes across years if you are using the following admission source and point-of-origin data elements:
  • ASOURCEUB92
  • ASOURCE
  • ASOURCE_X
  • PointOfOriginUB04
  • PointOfOrigin_X

If your analysis includes the 2008 to 2012 NIS, consider using the TRAN_IN data element instead of the data elements above. The data element TRAN_IN indicates that the patient was transferred into the hospital and is defined using either admission source (ASOURCE) or point of origin (PointOfOriginUB04).

For additional information, see the NIS Description of Data Elements.

Updates to Coding Systems

Caution

The ICD-9-CM diagnosis codes and DRGs are updated on October 1 each year. The National Center for Health Statistics updates ICD-9-CM diagnosis codes for the U.S., and the Centers for Medicare & Medicaid Services updates DRGs. Most codes remain the same across multiple years, but some codes are added and retired each year.

Effect

Using diagnosis codes that are not correctly matched to the data years you are studying may produce inaccurate results.

Solution

Check to make sure the ICD-9-CM diagnosis codes and DRG codes are consistent across the years of data you are using and have a consistent meaning for the years you will include in your multi-year analysis. The annual update to ICD-9-CM diagnosis codes can be found on the CDC Web site.

You may need to consult a medical coder to determine the correct ICD-9-CM diagnosis and DRG codes.
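
Because the updates take effect on October 1, a calendar year of data can span two versions of a code set. The Python sketch below shows one way to match each record to the version in effect at discharge. It assumes that YEAR is the calendar year of discharge and that a discharge quarter (DQTR) and a principal diagnosis (DX1) are available on the record; the function names are hypothetical, and the code values are placeholders, not real ICD-9-CM codes.

```python
def coding_fiscal_year(year, dqtr):
    """Code-set version in effect at discharge.

    Updates take effect October 1, so discharges in quarter 4 of calendar
    year Y fall under the version labeled fiscal year Y + 1.
    """
    if dqtr not in (1, 2, 3, 4):
        raise ValueError("discharge quarter must be 1, 2, 3, or 4")
    return year + 1 if dqtr == 4 else year

def has_condition(record, codes_by_fiscal_year):
    """True if the principal diagnosis is in the code list valid at discharge."""
    fiscal_year = coding_fiscal_year(record["YEAR"], record["DQTR"])
    return record["DX1"] in codes_by_fiscal_year.get(fiscal_year, set())

# Placeholder code lists, one per code-set version (NOT real ICD-9-CM codes).
codes_by_fiscal_year = {
    2009: {"PLACEHOLDER_OLD"},
    2010: {"PLACEHOLDER_OLD", "PLACEHOLDER_NEW"},
}

early = {"YEAR": 2009, "DQTR": 3, "DX1": "PLACEHOLDER_NEW"}
late = {"YEAR": 2009, "DQTR": 4, "DX1": "PLACEHOLDER_NEW"}
print(has_condition(early, codes_by_fiscal_year))
print(has_condition(late, codes_by_fiscal_year))
```

Records with a missing discharge quarter would need separate handling. The code lists themselves should come from the annual code updates, reviewed with a medical coder where necessary.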


Obtain the NIS and KID Trend Weights Files

To obtain the NIS and KID Trend Weights files free of charge, contact HCUP User Support:

E-mail: hcup@ahrq.gov
Phone (toll free): (866) 556-4287
FAX: (866) 792-5313

The new NIS Trend Weights files and the KID Trend Weights files are available for download from the HCUP-US Web site as self-extracting PKZIP compressed ASCII files, along with SAS and SPSS load programs.


Hot Tips

  • Familiarize yourself with the HCUP databases you plan to use and the changes to the databases over time before beginning to analyze the data.
  • Make sure that the data elements you need for your analysis are available across each year of the databases you will use.
  • Be aware of changes to the HCUP databases over time, and use the solutions provided in this tutorial if you think these changes might affect your analyses.


Checklist

Take a moment to review the common troubleshooting questions below to become familiar with some of the most common errors and how to avoid them.

Does your multi-year analysis use the 1993-2002 NIS?

If your analysis uses the 1993–2002 NIS, you have the choice to either use the 1993–2002 NIS Trend Weights files or the 1993–2002 NIS Supplemental Discharge-Level files, which are available through the HCUP Central Distributor. Both files have been updated with the new NIS discharge trend weights (TRENDWT), which were calculated in the same way as the weights for the redesigned 2012 NIS. The only difference between the files is that the 1993–2002 NIS Supplemental Discharge-Level files include additional uniform data elements that are defined consistently with later years of the NIS. The NIS Supplemental Discharge-Level files are not available for data years of the NIS after 2002.

For additional information, please see the 1993-2002 NIS Supplemental Discharge-Level files and 1993-2011 NIS Trend Weights Files.

For additional information regarding the 1993-2002 NIS Supplemental Discharge-Level files, you may contact the HCUP Central Distributor.

Does your multi-year analysis use the 1997 KID?

Merge the KID Trend Weights file and the 1997 KID database, and use the weighting variable in the KID Trend Weights file (DISCWT) in place of the one in the 1997 KID (DISCWT_U).

Please see "Using the Kids' Inpatient Database to Estimate Trends" for more information.

Will your multi-year analysis use ICD-9-CM and DRG codes?

Check for any changes in ICD-9-CM and DRG codes of interest across the years.

Will your multi-year analysis rely on any data elements that have been added, changed, or removed over time?

Check for the availability of and any changes to data elements of interest across the years.

Why are NIS data years 1988 to 1992 not included in the NIS Trend Weights files or NIS Supplemental Discharge-Level files?

The previous 1988 to 1992 NIS Trend Files and Supplemental Discharge-Level files were retired because NIS Trend analysis is not recommended for data years prior to 1993.


Conclusion

This concludes the HCUP Trend Analysis course.

Internet Citation: HCUP Using Multiple Years of Data - Accessible Version. Healthcare Cost and Utilization Project (HCUP). September 2015. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/tech_assist/trends/508/508course_2015.jsp.
Last modified 9/25/15