HCUP Sample Design: National Databases - Accessible Version
Thank you for joining us for this Healthcare Cost and Utilization Project or HCUP online tutorial on Sample Design of National Databases. This tutorial was created for researchers who are using HCUP national databases, and who have some background in basic research methods.
In order for you to create accurate and unbiased estimates in your research, it is essential for you to understand the sampling methods of HCUP national databases.
In this tutorial, you'll learn about three national HCUP databases created from the state-level HCUP databases. The three national databases are: the National Inpatient Sample (NIS); the Nationwide Emergency Department Sample (NEDS); and the Kids' Inpatient Database (KID). The Nationwide Readmissions Database (NRD) is another national database, but it won't be covered in this tutorial. To learn more about the NRD, please visit the NRD tutorial on the HCUP Online Tutorial Series page.
Because the HCUP national databases each serve a different purpose, each one is designed slightly differently. Understanding these differences and how they impact your research is critical to ensure your data estimates are accurate and unbiased, and that you draw sound conclusions. This course will take approximately 30 minutes to complete.
Before we get started, a quick word about HCUP:
HCUP is sponsored by the Agency for Healthcare Research and Quality or AHRQ. HCUP is a family of databases, software tools, and related research products that enable research on a variety of healthcare topics.
If you are unfamiliar with HCUP or would like a refresher, please consider taking our general Overview Course.
The underlying goal of this module is to ensure that you select the best HCUP databases for your research. One important factor in doing so is understanding the sample designs of the national databases. By the end of this module, you will:
Because this module is focused on sample design, there are a few key terms that are helpful to review. We will be tying this information to the HCUP database design later in the module.
We have state- and national-level databases that reflect inpatient, emergency department, and ambulatory surgery care. The national databases are derived from our state-level databases. In other words, the state-level databases serve as the sample frame for the national databases: the NIS, the NEDS, and the KID.
The State Inpatient Databases (SID) are a set of databases that include all inpatient hospital discharges from community hospitals in participating states.
The State Emergency Department Databases (SEDD) are a set of databases that contain all treat-and-release hospital emergency department visits from community hospitals in participating states.
The State Ambulatory Surgery and Services Databases (SASD) are a set of databases that include encounter-level data for ambulatory surgery and other outpatient services from hospital-owned facilities in participating states. In addition, some states provide ambulatory surgery and outpatient services data from nonhospital-owned facilities.
In summary, HCUP has seven types of databases that cover inpatient, emergency department, and ambulatory surgery data at state, regional, and/or national levels.
The next three sections will focus on the design of three of HCUP's national-level databases:
The NIS, the NEDS, and the KID. To learn more about the NRD, refer to the NRD tutorial on the HCUP Online Tutorials page.
The National Inpatient Sample (NIS) is a unique and powerful database of hospital inpatient stays. Researchers and policymakers use the NIS to identify, track, and analyze national and regional trends in hospital utilization, access, charges, and quality.
The NIS contains annual data from 1988 forward.
The NIS is a sample of discharges from the State Inpatient Databases (SID).
Starting with the 2012 data year, the NIS was redesigned to improve national estimates.
To highlight the design changes, AHRQ renamed the "Nationwide Inpatient Sample (NIS)" to the "National Inpatient Sample (NIS)".
Previously, the NIS sample design was defined as follows. The target universe was all US community hospitals. The sample frame was the SID, which included only 90% of the target universe. The sample strata were US region, urban or rural location, teaching status, ownership, and bed size. The sample unit was the hospital: a 20% stratified sample of hospitals was drawn, and 100% of the discharges from each sampled hospital were included in the NIS. This was a single-stage cluster sample.
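The earlier design can be illustrated with a minimal Python sketch. The hospital IDs, stratum labels, discharge counts, and the `sample_hospitals` helper are hypothetical and exist only for illustration; the 20% hospital sampling rate and the rule of keeping all discharges from each sampled hospital come from the description above.

```python
import random
from collections import defaultdict

def sample_hospitals(hospitals, rate=0.20, seed=0):
    """Draw a stratified sample of hospitals (the clusters).

    hospitals: list of dicts with hypothetical keys 'id' and 'stratum'.
    Returns the set of sampled hospital ids.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for h in hospitals:
        by_stratum[h["stratum"]].append(h["id"])
    sampled = set()
    for ids in by_stratum.values():
        # About 20% of hospitals per stratum; the floor of one hospital
        # is an assumption made for this toy example.
        n = max(1, round(rate * len(ids)))
        sampled.update(rng.sample(ids, n))
    return sampled

# Toy frame: 2 strata with 10 hospitals each, 5 discharges per hospital.
hospitals = [
    {"id": f"H{i}", "stratum": "urban" if i < 10 else "rural"}
    for i in range(20)
]
discharges = [
    {"hospital": h["id"], "record": r} for h in hospitals for r in range(5)
]

chosen = sample_hospitals(hospitals)
# Single-stage cluster sample: keep 100% of the discharges
# from each sampled hospital.
sample = [d for d in discharges if d["hospital"] in chosen]
print(len(chosen), len(sample))  # 4 hospitals, 20 discharges
```

Because whole hospitals are selected, discharges from the same hospital enter or leave the sample together, which is what makes this a cluster sample.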
In creating the National Inpatient Sample, the first step was to stratify the SID hospitals according to five strata: census division, location, teaching status, ownership, and bed size.
To produce national or regional estimates, the HCUP databases provide a "weight" variable that you can apply to your data. If you're interested in learning more about the weighting on the National Inpatient Sample (NIS), please access the NIS Trend Weights Files for more details.
Now that you understand the NIS sample design, you should know that revisions have been made to the NIS sample design that could affect estimates calculated from the NIS.
You should always check the NIS online documentation on the HCUP User Support website, before starting your research project.
Over time there have been changes to the NIS. States have been added to the sampling frame. In 1988, the NIS was based on 8 states. The more recent years of the NIS have 40+ states. There were important sample design changes in 1998. The NIS excluded short-term rehabilitation hospitals from the frame, changed the definition of discharges from total discharges to hospital discharges, discontinued the preference for NIS hospitals that were in the sample in prior years, and redefined the hospital stratification variables for sampling.
There were also important design changes in 2012. The 2012 NIS excluded long-term acute care hospitals from the sampling frame, improved the estimates of discharges in the universe, used State hospital identifiers rather than AHA hospital identifiers, and drew a sample of discharges from all hospitals in the sampling frame, rather than drawing all discharges from a sample of hospitals.
The sample designs are refined over time in other databases as well. There is useful documentation on the HCUP User Support website that details how you can account for these sample design changes.
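The 2012 change from sampling hospitals to sampling discharges can be sketched in Python as follows. This is an illustration only: the systematic selection, the 1-in-5 step, the stratum labels, and the `systematic_sample` helper are assumptions made for the example, not details taken from this tutorial. Consult the NIS documentation for the actual procedure.

```python
from collections import defaultdict

def systematic_sample(records, step=5, start=0):
    """Keep every `step`-th record beginning at index `start`."""
    return records[start::step]

# Toy frame: 4 hospitals in 2 hypothetical strata, 10 discharges each.
frame = [
    {"hospital": f"H{h}", "stratum": "A" if h < 2 else "B", "record": r}
    for h in range(4)
    for r in range(10)
]

by_stratum = defaultdict(list)
for d in frame:
    by_stratum[d["stratum"]].append(d)

sample = []
for recs in by_stratum.values():
    # Sort within the stratum so the draw spreads across every hospital.
    recs.sort(key=lambda d: (d["hospital"], d["record"]))
    sample.extend(systematic_sample(recs, step=5))

hospitals_in_sample = {d["hospital"] for d in sample}
print(len(sample), len(hospitals_in_sample))  # 8 discharges, 4 hospitals
```

In this toy frame every hospital contributes some discharges to the sample, whereas a hospital-level draw would contribute all discharges from only a subset of hospitals.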
In this section you have learned the following information about the NIS database:
The Nationwide Emergency Department Sample (NEDS) is a unique and powerful database of emergency department visits. Researchers and policymakers use the NEDS to identify, track, and analyze national and regional hospital emergency department care, utilization, access, charges, and quality. The NEDS contains annual data from 2006 forward.
The NEDS is a stratified sample of hospitals from the State Emergency Department Databases (SEDD) and the State Inpatient Databases (SID).
The NEDS is a stratified, single-stage cluster sample. The NEDS is constructed by categorizing hospitals according to five strata. The strata include geographic region, location, teaching status, ownership, and trauma-level designation.
To produce national or regional estimates, the HCUP databases provide a "weight" variable that you can apply to your data. If you're interested in learning more about weighting the national databases, please access the HCUP tutorial on weighting entitled Producing HCUP National Estimates.
In this section you have learned the following information about the NEDS database:
The third national-level database we will review, the Kids' Inpatient Database (KID), is specifically designed for pediatric research, particularly for the study of rare pediatric conditions.
The KID is produced every three years starting with 1997 data.
The way the KID is created is quite different from the way the NIS and the NEDS are created.
This is great from a public health perspective, but all these healthy, uncomplicated births overwhelm the data making it difficult to identify rare pediatric hospitalizations.
The KID is designed to accommodate research on rare pediatric conditions that require hospitalization, such as congenital anomalies, as well as rare pediatric medical procedures, such as heart surgery and organ transplantation.
While the NIS does include pediatric discharges, the NIS is not optimized for research on rare pediatric inpatient hospitalizations. It's best to use the KID for this kind of research.
Note that the NEDS is well-suited for research on pediatric emergency care.
The KID is a stratified sample of discharges from the State Inpatient Databases (SID).
The KID is stratified by uncomplicated in-hospital births, complicated in-hospital births, and all other pediatric hospital stays. Unlike the NIS or NEDS, the KID records are post-stratified in order to enable users to create national and regional estimates. The discharges are post-stratified in proportion to the number of AHA newborns and the total number of non-newborn AHA admissions.
In order to produce national or regional estimates of pediatric hospitalizations using the KID, discharge weights are developed using the American Hospital Association or AHA target universe as the standard.
To do so, KID records are post-stratified by US region, urban or rural location, teaching status, ownership, and bed size, with the addition of a stratum for freestanding children's hospitals.
The KID is stratified by freestanding children's or other hospitals. Children's hospitals restrict admissions to children, while other hospitals admit both adults and children. There may be significant differences in practice patterns, severity of illness, and available services between children's hospitals and other hospitals. Children's units in general hospitals are not stratified as children's hospitals.
If you're interested in learning more about weighting the national databases, please access the weighting tutorial Producing HCUP National Estimates.
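As a rough illustration of post-stratified weighting, the Python sketch below assumes that each post-stratum's weight is simply the ratio of universe discharges to sampled discharges in that stratum. The stratum labels, the counts, and the `poststratified_weights` helper are invented for the example; the actual KID weight calculation is described in the HCUP documentation.

```python
def poststratified_weights(universe_counts, sample_counts):
    """Return one weight per post-stratum: universe count / sample count.

    Both arguments are dicts keyed by a hypothetical stratum label.
    """
    weights = {}
    for stratum, n_sample in sample_counts.items():
        if n_sample == 0:
            raise ValueError(f"no sampled records in stratum {stratum!r}")
        weights[stratum] = universe_counts[stratum] / n_sample
    return weights

# Invented counts for two hypothetical post-strata.
universe_counts = {"children's hospital": 1200, "other hospital": 9000}
sample_counts = {"children's hospital": 960, "other hospital": 1800}

weights = poststratified_weights(universe_counts, sample_counts)

# Summing weights over the sampled records reproduces the universe total.
estimated_total = sum(weights[s] * n for s, n in sample_counts.items())
print(weights, estimated_total)
```

The closing check shows why the weights matter: within each post-stratum, the weighted sample adds back up to the universe count used as the standard.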
In this section, you have learned the following information about the KID database:
There are some mistakes that are easy to make when working with the HCUP national databases. Understanding the sample design of each database will help you avoid these errors.
One of the most common errors is not weighting the NIS, NEDS, and KID data when attempting to produce national and/or regional estimates. Remember that these national databases are based on samples - they must be weighted to derive national and/or regional estimates. If you do not weight the data, what you have are sample record counts, not national and/or regional estimates.
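The difference between a sample record count and a weighted estimate can be seen in a small Python sketch. The records, the `diagnosis` and `weight` field names, and the weight values are invented; `weight` stands in for whatever weight variable the database you are using provides.

```python
# Hypothetical sample records; each carries a discharge weight.
records = [
    {"diagnosis": "asthma", "weight": 5.0},
    {"diagnosis": "asthma", "weight": 4.5},
    {"diagnosis": "asthma", "weight": 5.5},
    {"diagnosis": "other", "weight": 5.0},
    {"diagnosis": "other", "weight": 5.0},
]

def sample_count(records, diagnosis):
    """Unweighted count: the number of sampled records only."""
    return sum(1 for r in records if r["diagnosis"] == diagnosis)

def weighted_estimate(records, diagnosis):
    """Weighted estimate: the sum of weights over matching records."""
    return sum(r["weight"] for r in records if r["diagnosis"] == diagnosis)

print(sample_count(records, "asthma"))       # 3 sampled records
print(weighted_estimate(records, "asthma"))  # 15.0 estimated discharges
```

Reporting the first number as if it were the second would understate the estimate by roughly the size of the weights.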
A serious violation occurs if users report cell sizes less than or equal to 10 in their publications. Remember that you signed an HCUP Data Use Agreement, or DUA, that prohibits you from reporting any cell sizes less than or equal to 10. This is required as a privacy precaution. From a sample design perspective, any estimate based on such a low count probably isn't reliable anyway. If you'd like a refresher on the HCUP DUA, please consider reviewing the HCUP DUA module - it's only 15 minutes in length, and can be accessed via the link on the screen.
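One way to guard against this is to mask small cells before a table leaves your analysis environment. The Python sketch below is a minimal example: the `suppress_small_cells` helper, the `*` marker, and the table contents are hypothetical, and you should confirm in the DUA exactly which counts the rule applies to.

```python
def suppress_small_cells(table, threshold=10, marker="*"):
    """Replace any cell count at or below `threshold` with `marker`.

    table: dict mapping a cell label to its count.
    """
    return {
        label: (marker if count <= threshold else count)
        for label, count in table.items()
    }

# Hypothetical cell counts for a results table.
cell_counts = {"Northeast": 42, "Midwest": 10, "South": 7, "West": 11}

print(suppress_small_cells(cell_counts))
# {'Northeast': 42, 'Midwest': '*', 'South': '*', 'West': 11}
```

Note that a count of exactly 10 is masked, matching the "less than or equal to 10" wording above.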
Another error is that sometimes new users attempt to produce state-level estimates from the national databases. Remember that none of the HCUP national databases have a sample design that includes "state" as a stratification variable. Only national and regional estimates should be produced from the national databases. Trying to produce state-level estimates from the NIS, NEDS, or KID could result in biased results. In fact, AHRQ removed the data element identifying states beginning with the 2012 NIS.
New users sometimes use an inappropriate database for a particular study. For example, remember to use the KID, rather than the NIS, for your research on rare pediatric conditions, as the sample design of the KID is specifically created to accommodate rare pediatric research. Also, use caution with any of the HCUP national databases for race-related research, as race data are not uniformly available across the HCUP state databases, or, put another way, across the "sampling frame."
Sometimes users try to work with the HCUP national databases in software packages that are not designed to account for complex survey design, such as Microsoft Excel. You must use statistical software, such as SAS, SUDAAN, or Stata, that can handle data derived from complex sampling designs. This is important because analyses that fail to account for the sample design could yield biased estimates and may have a direct impact on your variance calculations.
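To show what "accounting for the sample design" involves, the Python sketch below computes a weighted total and a design-based standard error for a stratified cluster sample. It uses a generic textbook estimator (between-cluster variation within each stratum, with no finite population correction), which is an assumption for illustration and not a method taken from this tutorial. All field names and values are invented, and real analyses should rely on the survey procedures in software such as SAS, SUDAAN, or Stata.

```python
import math
from collections import defaultdict

def weighted_total_and_se(records):
    """Weighted total and standard error for a stratified cluster sample.

    records: dicts with hypothetical keys 'stratum', 'hospital' (the
    cluster), 'weight', and 'y' (1 if the record has the outcome).
    """
    # Weighted total of y within each cluster, grouped by stratum.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        cluster_totals[r["stratum"]][r["hospital"]] += r["weight"] * r["y"]

    total = 0.0
    variance = 0.0
    for clusters in cluster_totals.values():
        z = list(clusters.values())
        n = len(z)
        if n < 2:
            raise ValueError("need at least two clusters per stratum")
        mean = sum(z) / n
        total += sum(z)
        # Between-cluster variation within the stratum.
        variance += n / (n - 1) * sum((v - mean) ** 2 for v in z)
    return total, math.sqrt(variance)

def make(stratum, hospital, n):
    """Build n toy records with weight 5.0 and outcome 1."""
    return [
        {"stratum": stratum, "hospital": hospital, "weight": 5.0, "y": 1}
        for _ in range(n)
    ]

records = (make("A", "H1", 2) + make("A", "H2", 6)
           + make("B", "H3", 4) + make("B", "H4", 4))

total, se = weighted_total_and_se(records)
print(total, se)  # 80.0 20.0
```

A spreadsheet-style calculation that treats the 16 records as independent observations would ignore the strata and hospital clusters entirely, which is the situation this paragraph warns against.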
Users sometimes neglect to check their estimates against other data sources. At a minimum, it is recommended that you check your estimates against HCUPnet, which is a free online query system with access to HCUP data.
When looking at key differences in sample design amongst the NIS, NEDS, and KID, remember that each database has a unique purpose and that the target universe, frame, strata, and unit of each database differ. The table below highlights those differences.
| |NIS|NEDS|KID|
|Target Universe|All community, non-rehabilitation hospitals in the United States, excluding long-term acute care hospitals|All ED visits from hospital-owned ED units in community, non-rehabilitation hospitals in the United States|Pediatric discharges from community, non-rehabilitation hospitals in the United States|
|Sample Frame|All discharges from community, non-rehabilitation hospitals, excluding long-term acute care hospitals, in the participating HCUP Partner States|All ED visits from hospital-owned ED units in community, non-rehabilitation hospitals in the participating HCUP Partner States|Pediatric discharges from community, non-rehabilitation hospitals in the participating HCUP Partner States|
|Sample Strata|US Census Division, urban or rural location, teaching status, ownership, bed size|US Region, urban or rural location, teaching status, ownership, trauma-level|Uncomplicated births, complicated births, all other pediatric hospital stays|
|Sample Unit|Discharge-level|Hospital-owned ED-level|Pediatric discharges|
As you begin your work with the HCUP national databases, you will want to keep in mind the following key points:
If you are looking for more information on the subject matter covered here, many resources are available on the HCUP User Support website.
If you can't find what you need, feel free to email the HCUP Technical Assistance staff at firstname.lastname@example.org. AHRQ has senior research personnel available to answer technical questions you may have.
Thank you for accessing this module. There are several other HCUP online tutorials that can be accessed. Take a look to see if there are other topics that could be helpful to you.
If you have any feedback regarding this module, please email us at email@example.com.
Internet Citation: HCUP Sample Design: National Databases - Accessible Version. Healthcare Cost and Utilization Project (HCUP). November 2018. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/tech_assist/sampledesign/508_compliance/index508_2018.jsp.
Last modified 11/5/18