HEALTHCARE COST & UTILIZATION PROJECT

HCUP National Estimates Tutorial - Accessible Version



Welcome to the HCUP National Estimates Tutorial

Thank you for joining us for this Healthcare Cost and Utilization Project, or HCUP, online tutorial on calculating national estimates using the HCUP nationwide databases.

HCUP's nationwide databases provide estimates for hospital stays, emergency department visits, or major ambulatory surgery encounters across the United States. They are built from the HCUP State databases. The databases contain information on all discharges or encounters, regardless of expected payer. They can be used to create national estimates of healthcare utilization, access, charges, quality, and outcomes. The nationwide databases are available for purchase through the HCUP Central Distributor. Statistics from select databases are available on HCUPnet.

This tutorial is organized into five modules, one for each nationwide database:

  1. National Inpatient Sample, or NIS
  2. Kids' Inpatient Database, or KID
  3. Nationwide Ambulatory Surgery Sample, or NASS
  4. Nationwide Emergency Department Sample, or NEDS
  5. Nationwide Readmissions Database, or NRD

Each module is divided into four sections that provide background information on the nationwide database and demonstrate how it can be used to produce national and, in some cases, regional estimates for healthcare-related analyses. Also provided is sample SAS® code demonstrating how to use the nationwide database to produce such estimates.

This tutorial is self-paced. Therefore, the time to complete this tutorial will vary based on the individual user's experience. This tutorial includes narration.




Overview of the HCUP National Estimates Tutorial Structure

This tutorial contains five modules, with each corresponding to an HCUP nationwide database. Each module is divided into four sections that cover specific topic areas for the respective HCUP nationwide database. These topic areas are the same across all modules.

Modules
Module 1
National Inpatient Sample (NIS), which includes the following sections:
  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates
Module 2
Kids' Inpatient Database (KID), which includes the following sections:
  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates
Module 3
Nationwide Ambulatory Surgery Sample (NASS), which includes the following sections:
  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates
Module 4
Nationwide Emergency Department Sample (NEDS), which includes the following sections:
  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates
Module 5
Nationwide Readmissions Database (NRD), which includes the following sections:
  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates



Module 1: National Inpatient Sample (NIS)

The National Inpatient Sample, or NIS, is the largest publicly available all-payer inpatient care database in the United States, containing data on more than 7 million hospital stays.

Information on the NIS is organized into the four sections below:

  • Overview
  • Weighting the Data
  • SAS Code Examples, and
  • Validating Estimates

Additional information about the NIS is available on the NIS Database Documentation page on the HCUP User Support, or HCUP-US, website.




Module 1: National Inpatient Sample (NIS), Overview of the NIS

The NIS is the largest publicly available all-payer inpatient healthcare database in the United States. It is designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from more than 7 million hospital stays each year. Weighted, it estimates more than 35 million hospitalizations nationally.

The NIS is sampled from the HCUP State Inpatient Databases (SID), which include all inpatient data from the HCUP Partners that currently contribute to HCUP. Available since data year 1988, the NIS sampling frame has grown from 8 HCUP Partners to 49 HCUP Partners, covering about 97 percent of discharges from U.S. community hospitals.

Additional information on the sample design of the NIS is available in the NIS Introduction or the HCUP Sample Design tutorial.




Module 1: National Inpatient Sample (NIS), Weighting the NIS

NIS Data Element Discharge Weight

To produce nationally or regionally representative estimates, the NIS data must be weighted. This can be done using the data element discharge weight, or DISCWT, which is assigned to each record in the NIS. Beginning with data year 2012, the value of DISCWT is approximately 5 for all records because the NIS is a self-weighted sample.

When the discharge weights are applied to the NIS data, the result is an estimate of the number of discharges for the target universe, which includes all inpatient discharges from community hospitals in the United States, excluding rehabilitation hospitals beginning in 1998 and long-term acute care hospitals beginning in 2012.
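
As a minimal sketch of what applying the weight means, the step below compares the unweighted record count with the sum of DISCWT in the 2019 NIS Core File. The library path is a placeholder and the file name nis_2019_core follows the naming used in this tutorial's SAS examples, so adjust both to your installation. PROC MEANS is used here only to illustrate the arithmetic. It does not account for the sample design and should not be used for standard errors.

```sas
/* Placeholder path: point this at your own copy of the 2019 NIS */
libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;

/* N   = unweighted number of records in the sample                 */
/* SUM = sum of the discharge weights = estimated number of         */
/*       discharges in the target universe                          */
proc means data=nis2019.nis_2019_core n sum maxdec=0;
    var DISCWT;
run;
```

For the 2019 NIS, the SURVEYMEANS output shown later in this module reports 7,083,805 observations and a sum of weights of 35,419,023, which are the two values this step is expected to reproduce.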

Per the American Hospital Association, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term acute care, rehabilitation, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.

NIS Discharge Weights Over Time

NIS data are available annually beginning with data year 1988. Users should be mindful of changes to the discharge weight variable over time. These changes are listed in the table below.

Years        Variable Name    Use
---------    -------------    ----------------------------------------------------------------------------
2001+        DISCWT           All national estimates
2000         DISCWT           National estimates except those including total charge (data element TOTCHG)
2000         DISCWTcharge     National estimates of total charge (data element TOTCHG)
1998-1999    DISCWT           All national estimates
1988-1997    DISCWT_U         All national estimates
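
The year-to-weight mapping listed above can be sketched as a small macro that returns the weight name for a given data year. The macro name nis_discwt and its charges= parameter are hypothetical and are not part of HCUP; SAS variable names are case-insensitive, so DISCWTCHARGE refers to the same data element as DISCWTcharge.

```sas
/* Hypothetical helper: returns the discharge weight name for a NIS data year. */
/* charges=Y requests the weight used for total charge (TOTCHG) estimates.     */
%macro nis_discwt(year, charges=N);
    %if &year >= 2001 %then DISCWT;
    %else %if &year = 2000 and %upcase(&charges) = Y %then DISCWTCHARGE;
    %else %if &year >= 1998 %then DISCWT;
    %else %if &year >= 1988 %then DISCWT_U;
    %else %put ERROR: NIS data are not available before data year 1988.;
%mend nis_discwt;

/* Write the selected weight names to the SAS log */
%put NOTE: 2019 weight = %nis_discwt(2019);
%put NOTE: 2000 charge weight = %nis_discwt(2000, charges=Y);
%put NOTE: 1995 weight = %nis_discwt(1995);
```

In a procedure, the macro call would stand in for the variable name, for example `weight %nis_discwt(2019);`.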

Accounting for the NIS Redesign in Data Year 2012 for Multi-Year Analysis

In 2012, the NIS was redesigned to improve national estimates:

  • Beginning with data year 2012, the NIS is a sample of discharges from all hospitals in HCUP.
  • Through data year 2011, the NIS is composed of all discharges from a sample of hospitals in HCUP.

If conducting a trend analysis that uses NIS data before and after data year 2012, it is recommended that users account for the redesign when creating national estimates. The NIS Trend Weights (data element TRENDWT) for data years 1993-2011 have been developed to assist with such an analysis.

For example, if conducting a trend analysis for NIS data years 2010-2013, you should use a combination of TRENDWT and DISCWT to generate comparable national estimates.

Additional information on the NIS Trend Weights is available on the HCUP-US website. For information on how to conduct a trend analysis using NIS data before and after data year 2012, refer to the HCUP Multi-Year Analysis tutorial. The SAS code examples included in the next section do not provide an example of a trend analysis.
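
As a rough sketch of the 2010-2013 case only, a single analysis weight could be built as shown below. The input data set WORK.NIS_2010_2013 and the new data element ANALYSIS_WT are hypothetical: the sketch assumes the Core records for the four years have already been stacked, that YEAR and DISCWT are present on every record, and that TRENDWT has already been merged on for data years 2010 and 2011. The HCUP Multi-Year Analysis tutorial remains the reference for the complete procedure.

```sas
/* Hypothetical input: WORK.NIS_2010_2013, a stacked file of NIS Core records */
/* for data years 2010-2013, with TRENDWT already merged on for 2010-2011.    */
data nis_trend;
    set nis_2010_2013;
    /* Years before the redesign use the trend weight; 2012 onward use DISCWT */
    if YEAR <= 2011 then ANALYSIS_WT = TRENDWT;
    else ANALYSIS_WT = DISCWT;
    label ANALYSIS_WT = "TRENDWT through 2011, DISCWT from 2012";
run;
```

ANALYSIS_WT would then be named in the WEIGHT statement in place of DISCWT.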


NIS Hospital Weight

As described above, the NIS was redesigned in data year 2012. For data years 1988-2011, the NIS is composed of all discharges from a sample of hospitals. To project NIS hospitals to the number of hospitals in the target universe, users must apply a hospital weight (data element HOSPWT) to the NIS data for these years. For data years 1988-1997, this data element was named HOSPWT_U.

Beginning in 2012, HOSPWT is no longer applicable because the NIS is a sample of discharges from all hospitals from participating HCUP Partners.

The SAS code included in the next section does not provide an example of using HOSPWT. However, such an example is available in Module 4, which covers the Nationwide Emergency Department Sample (NEDS), where the use of HOSPWT is still applicable.




Module 1: National Inpatient Sample (NIS), SAS Code Examples

SAS Code for Producing National Estimates by Expected Payer

This example SAS code produces national estimates of discharges by the primary expected payer or data element PAY1 in the 2019 NIS.


Title "Produce National Estimate of Discharges By Primary Expected Payer from 2019 NIS File (Weighted)";
Libname NIS2019	"V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;

proc format;
    Value FPAY /* PAY1 and PAY2 */ 
         1 = " 1: Medicare"
         2 = " 2: Medicaid"
         3 = " 3: Private insurance"
         4 = " 4: Self-pay"
         5 = " 5: No charge"
         6 = " 6: Other"
         . = " .: Missing"
        .A = ".A: Invalid"
        ;
run;

proc surveymeans data=nis2019.nis_2019_core missing sumwgt ;
     cluster HOSP_NIS ;
     strata NIS_STRATUM ;
     domain PAY1 ;
     format PAY1 fpay. ;
     weight DISCWT ;
     var KEY_NIS ;
run;
 

The first section of this example SAS code includes a PROC FORMAT, which is a procedure that assigns data labels to the data values in the output. For this example, we are focused on data element PAY1, which has the following mappings:

  • Numeric value 1 for Medicare
  • Numeric value 2 for Medicaid
  • Numeric value 3 for Private insurance
  • Numeric value 4 for Self-pay
  • Numeric value 5 for No charge
  • Numeric value 6 for Other
  • A decimal point (.) means the value is missing
  • A decimal point followed by the uppercase letter A (.A) means the value is invalid

This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for patient race or data element RACE, the proc format would include the mapping for that data element.
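
For the patient race case just mentioned, the modification might look like the sketch below. The value labels are an assumption based on the usual HCUP coding of RACE and should be checked against the NIS data element documentation for your data year. The format name FRACE and the library path are placeholders.

```sas
/* Assumed HCUP coding of RACE; verify against the NIS documentation */
proc format;
    value FRACE
         1 = " 1: White"
         2 = " 2: Black"
         3 = " 3: Hispanic"
         4 = " 4: Asian or Pacific Islander"
         5 = " 5: Native American"
         6 = " 6: Other"
         . = " .: Missing"
        .A = ".A: Invalid"
        ;
run;

/* Placeholder path: point this at your own copy of the 2019 NIS */
libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;

/* Same design statements as the PAY1 example; only the domain changes */
proc surveymeans data=nis2019.nis_2019_core missing sumwgt;
    cluster HOSP_NIS;
    strata NIS_STRATUM;
    domain RACE;
    format RACE frace.;
    weight DISCWT;
    var KEY_NIS;
run;
```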

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. This procedure includes the following statements:

  • The CLUSTER statement, which includes the NIS hospital identifier or data element HOSP_NIS.
  • The STRATA statement, which includes the NIS stratum identifier or data element NIS_STRATUM.
  • The DOMAIN and FORMAT statements, which are specific to this analysis, and produce national estimates by data element PAY1.
  • The WEIGHT statement, which includes the NIS discharge weight or data element DISCWT.
  • The VAR statement, which includes the NIS record identifier or data element KEY_NIS.

Note that for analyses that include data years before 2012: replace HOSP_NIS with HOSPID in the CLUSTER statement and use the NIS Trend Weight (TRENDWT) in place of the original discharge weight (DISCWT) in the WEIGHT statement for those years. See the Multi-Year Analysis Tutorial for more information.
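
A hedged sketch of that substitution for a single pre-redesign year follows. The input data set WORK.NIS_2011_TW is hypothetical: it is assumed to be the 2011 NIS Core File with TRENDWT already merged on from the NIS Trend Weights file and with NIS_STRATUM present. KEY is assumed to be the name of the record identifier in that data year; any numeric data element can serve in the VAR statement when only the sum of weights is requested.

```sas
/* Hypothetical input: WORK.NIS_2011_TW, the 2011 NIS Core File with TRENDWT */
/* already merged on from the NIS Trend Weights file for that year.          */
proc surveymeans data=nis_2011_tw missing sumwgt;
    cluster HOSPID;        /* hospital identifier used before the 2012 redesign */
    strata NIS_STRATUM;
    domain PAY1;
    weight TRENDWT;        /* trend weight in place of DISCWT */
    var KEY;               /* assumed record identifier name before 2012 */
run;
```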


Produce National Estimate of Discharges By Primary Expected Payer from 2019 NIS File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for PAY1 domains                                 

                                                                              Sum of
    PAY1                        Variable        Label                        Weights
    --------------------------------------------------------------------------------
     .: Missing                 KEY_NIS         NIS record number              42179 
    .A: Invalid                 KEY_NIS         NIS record number        1945.003493
     1: Medicare                KEY_NIS         NIS record number           14529614
     2: Medicaid                KEY_NIS         NIS record number            7927119
     3: Private insurance       KEY_NIS         NIS record number           10249101
     4: Self-pay                KEY_NIS         NIS record number            1525965
     5: No charge               KEY_NIS         NIS record number             109380
     6: Other                   KEY_NIS         NIS record number            1033720
    --------------------------------------------------------------------------------
 

The output for this example SAS code provides the following weighted record counts for PAY1 in the 2019 NIS:

  • Missing: 42,179
  • Invalid: 1,945
  • Medicare: 14,529,614
  • Medicaid: 7,927,119
  • Private insurance: 10,249,101
  • Self-pay: 1,525,965
  • No charge: 109,380
  • Other: 1,033,720

Example SAS Code for Producing National Estimates for Asthma

This example SAS code identifies the number of weighted records in the 2019 NIS with a principal diagnosis of asthma, which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, RSP009 for Asthma.


Title "Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)";
Libname NIS2019	"V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;

data asthma;
    merge nis2019.nis_2019_core (keep=HOSP_NIS KEY_NIS DISCWT NIS_STRATUM)
          nis2019.nis_2019_dx_pr_grps (keep=HOSP_NIS KEY_NIS DXCCSR_Default_DX1)
    ;
    by HOSP_NIS KEY_NIS;
    Attrib Asthma length=3 label='Asthma default DXCCSR=RSP009';
    Asthma =(DXCCSR_Default_DX1='RSP009');
run;

proc surveymeans data=asthma sum std mean stderr;
     cluster HOSP_NIS ;
     strata NIS_STRATUM;
     var Asthma;
     weight DISCWT;
run;
 

The first section of this example SAS code includes the DATA step, which is looking for records with a default CCSR category of RSP009, Asthma, for the principal diagnosis. This step includes the following statements:

  • The MERGE statement, which combines the NIS Core File with the NIS Diagnosis and Procedure Groups File by HOSP_NIS and KEY_NIS. This process results in the acquisition of the default CCSR category for the principal diagnosis or data element DXCCSR_Default_DX1.
  • The KEEP statements, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and DXCCSR_Default_DX1.
  • The ATTRIB statement, which assigns a length and a label to a new data element (ASTHMA) specific to our example analysis. The next statement, Asthma =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of RSP009 for the principal diagnosis (NIS data element DXCCSR_Default_DX1=RSP009).

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_NIS.
  • The STRATA statement, which includes NIS_STRATUM.
  • The WEIGHT statement, which includes the data element DISCWT.
  • The VAR statement, which includes the data element ASTHMA that we defined in the DATA step above.


Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)         

                                       The SURVEYMEANS Procedure                        

                                             Data Summary                              

                              Number of Strata                 201                  
                              Number of Clusters              4568                  
                              Number of Observations       7083805
                              Sum of Weights              35419023


                                              Statistics                               

                                                        Std Error                     Std Error
          Variable      Label               Mean          of Mean          Sum           of Sum
          -------------------------------------------------------------------------------------
          Asthma        Asthma default      0.004781     0.000109       169330      4029.071202
                        DXCCSR=RSP009
 

The output for this example SAS code provides the total number of weighted records in the 2019 NIS with a default CCSR category for the principal diagnosis of RSP009, Asthma, which is 169,330.
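
If confidence limits are wanted alongside the point estimates, one option is to request them with the CLSUM and CLM keywords of PROC SURVEYMEANS, as in the sketch below. It assumes WORK.ASTHMA was created by the DATA step in the example above and uses the procedure's default 95 percent confidence level.

```sas
/* Assumes WORK.ASTHMA was created by the DATA step in the example above */
proc surveymeans data=asthma sum clsum mean clm;
    cluster HOSP_NIS;
    strata NIS_STRATUM;
    var Asthma;
    weight DISCWT;
run;
```

CLSUM adds confidence limits for the weighted total, and CLM adds confidence limits for the weighted proportion of discharges with asthma.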

Example SAS Code for Producing Regional Estimates for Asthma

This example SAS code is focused on producing regional estimates for asthma in the 2019 NIS, which we defined based on the default CCSR category of RSP009 for the principal diagnosis.


Title "Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)";
Libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
     
proc format;
     Value St_Regn 
         1 = "1: Northeast"
         2 = "2: Midwest"
         3 = "3: South"
         4 = "4: West"
     ;
run;

data asthma;
    merge nis2019.nis_2019_core (keep=HOSP_NIS KEY_NIS DISCWT NIS_STRATUM)
          nis2019.nis_2019_dx_pr_grps (keep=HOSP_NIS KEY_NIS DXCCSR_Default_DX1)
    ;
    by HOSP_NIS KEY_NIS;
    /* Look up region */
    if _n_=1 then do;
        /* Add HOSP_REGION to the program data vector without reading any records */
        if 0 then set nis2019.nis_2019_hospital (keep=HOSP_REGION);
        /* Load the NIS Hospital File into a hash table keyed by HOSP_NIS */
        declare hash h (dataset: "nis2019.nis_2019_hospital");
        h.defineKey('HOSP_NIS');
        h.defineData('HOSP_REGION');
        h.defineDone();
    end;
    if h.find() ne 0 then abort; /* all discharges should have a matching hospital record */
    Attrib Asthma length=3 label='Asthma default DXCCSR=RSP009';
    Asthma = (DXCCSR_Default_DX1='RSP009');
run;

proc surveymeans data=asthma missing sum mean ;
    cluster HOSP_NIS ;
    strata NIS_STRATUM ;
    var Asthma;
    weight DISCWT ;
    domain HOSP_REGION ;
    format HOSP_REGION St_Regn. ;
run;

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:

  • Numeric value 1 for Northeast
  • Numeric value 2 for Midwest
  • Numeric value 3 for South
  • Numeric value 4 for West

The second section includes the DATA step, which includes the following statements:

  • The MERGE statement, which links the NIS Core File with the NIS Diagnosis and Procedure Groups File keeping essential data elements from each file.
    • For this specific example, there is an additional step that uses the hash technique to acquire the data element, HOSP_REGION, from the NIS Hospital File. The NIS hospital identification number, HOSP_NIS, is used for linkage.
  • The ATTRIB statement, which assigns a length and a label to a new data element (ASTHMA) specific to our example analysis. The next statement, Asthma =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of RSP009 for the principal diagnosis (NIS data element DXCCSR_Default_DX1=RSP009).
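
For readers who are less familiar with the hash technique, the same linkage can be sketched with a PROC SQL join. This is an alternative under stated assumptions, not the tutorial's method: the output name ASTHMA_REGION and the library path are placeholders, and unlike the hash lookup with ABORT, an inner join silently drops any discharge without a matching hospital record.

```sas
/* Placeholder path: point this at your own copy of the 2019 NIS */
libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;

proc sql;
    create table asthma_region as
    select c.HOSP_NIS, c.KEY_NIS, c.DISCWT, c.NIS_STRATUM,
           h.HOSP_REGION,
           /* 1 if the default CCSR for the principal diagnosis is asthma, else 0 */
           (g.DXCCSR_Default_DX1 = 'RSP009') as Asthma length=3
               label='Asthma default DXCCSR=RSP009'
    from nis2019.nis_2019_core as c
         inner join nis2019.nis_2019_dx_pr_grps as g
             on c.HOSP_NIS = g.HOSP_NIS and c.KEY_NIS = g.KEY_NIS
         inner join nis2019.nis_2019_hospital as h
             on c.HOSP_NIS = h.HOSP_NIS;
quit;
```

Comparing the number of rows in ASTHMA_REGION with the number of records in the Core File is a simple way to confirm that no discharges were lost in the join.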

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_NIS.
  • The STRATA statement, which includes NIS_STRATUM.
  • The WEIGHT statement, which includes the data element DISCWT.
  • The VAR statement, which includes the data element ASTHMA that we defined in the DATA step above.
  • The DOMAIN and FORMAT statements, which are specific to HOSP_REGION as we are interested in regional estimates.


Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)     
                                                                                                         
                                        The SURVEYMEANS Procedure                                        
                                                                                                         
                                   Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                 Std Error                     Std Error
HOSP_REGION     Variable     Label                    Mean         of Mean           Sum          of Sum
--------------------------------------------------------------------------------------------------------
1: Northeast    Asthma       Asthma default       0.006485        0.000358         41550     2419.380819
                             DXCCSR=RSP009
2: Midwest      Asthma       Asthma default       0.004217        0.000221         33065     1775.993606
                             DXCCSR=RSP009
3: South        Asthma       Asthma default       0.004415        0.000135         62150     2058.065599
                             DXCCSR=RSP009
4: West         Asthma       Asthma default       0.004590        0.000240         32565     1729.226390
                             DXCCSR=RSP009
--------------------------------------------------------------------------------------------------------
 

The output for this example SAS code provides the total number of weighted records in the 2019 NIS with a default CCSR category of RSP009, Asthma, by hospital region:

  • Northeast: 41,550
  • Midwest: 33,065
  • South: 62,150
  • West: 32,565


Module 1: National Inpatient Sample (NIS), Validating National and Regional Estimates

There are three resources that can be used to validate national and regional estimates for the NIS.

  • The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
  • The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
  • HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.

HCUP Summary Statistics


In our first example analysis, which produced national estimates of discharges by primary expected payer (data element PAY1) in the 2019 NIS, we obtained the following weighted record counts:

  • Missing: 42,179
  • Invalid: 1,945
  • Medicare: 14,529,614
  • Medicaid: 7,927,119
  • Private insurance: 10,249,101
  • Self-pay: 1,525,965
  • No charge: 109,380
  • Other: 1,033,720

For validation, we are going to compare the output with the 2019 NIS Summary Statistics.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NIS Database Documentation.

The NIS Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.

The NIS Summary Statistics page includes all years of the NIS. We will scroll down to the section specific to data year 2019. Our data element of interest, PAY1, is in the NIS Core File, which means we will want to select the Summary Statistics for the NIS Core File, and, specifically, the file that provides weighted estimates (i.e., NIS 2019 Core Weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element PAY1. We can do this easily by searching for this data element name within the downloaded PDF. We are now ready to compare the Summary Statistics with our output from SAS.

HCUP Weighted Summary Statistics Report: NIS 2019 Core File Weighted Frequency Distribution for PAY1

PAY1                     Weighted Frequency    Percent
---------------------    ------------------    -------
.: Missing                           42,179      0.12%
.A: Invalid                           1,945      0.01%
1: Medicare                      14,529,614     41.02%
2: Medicaid                       7,927,119     22.38%
3: Private insurance             10,249,101     28.94%
4: Self-pay                       1,525,965      4.31%
5: No charge                        109,380      0.31%
6: Other                          1,033,720      2.92%


A comparison of the PAY1 frequency from the 2019 NIS Weighted Core Summary Statistics and the output from the example SAS code in this tutorial demonstrates that our results match.
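
To line up with the layout of the Summary Statistics, which report a weighted frequency and a percent for each value, a weighted PROC FREQ is one possible sketch. It is suitable for checking point estimates only, since PROC FREQ does not account for the sample design. The library path is a placeholder.

```sas
/* Placeholder path: point this at your own copy of the 2019 NIS */
libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;

proc freq data=nis2019.nis_2019_core;
    tables PAY1 / missing;   /* MISSING keeps . and .A as rows and in the percents */
    weight DISCWT;           /* weighted frequencies instead of record counts */
run;
```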

HCUP Diagnosis and Procedure Frequency Tables

In our second example analysis, which produced national estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis, we obtained a count of 169,330.

For validation, we are going to compare the output with the NIS Diagnosis and Procedure Frequency Tables.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NIS Database Documentation.

The NIS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.

Once the file has been downloaded, we will navigate to the tab T.1_By_DXCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-CM diagnosis category. We will then navigate to the row for CCSR category RSP009, Asthma, and scroll over to the columns that are specific to the 2019 NIS. Note that you can filter to RSP009 by using either Column A or Column B.

Table 1. Weighted and Unweighted Number of Records by Clinical Classifications Software Refined (CCSR) for ICD-10-CM Diagnoses, v2021.2
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), National Inpatient Sample (NIS), 2016-2019

Note: Counts for all-listed diagnoses include all possible CCSR category assignments. Unduplicated means that if two or more diagnosis codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.

Column                                                         Value
-----------------------------------------------------------    -------------
CCSR for ICD-10-CM Category, v2021.2                           RSP009
CCSR Description, v2021.2                                      RSP009 Asthma
2019 NIS: Weighted N for DX1 CCSR Default                      **169,330
2019 NIS: Weighted N for All-Listed CCSR (Unduplicated)        2,274,896
2019 NIS: Unweighted N for DX1 CCSR Default                    33,866
2019 NIS: Unweighted N for All-Listed CCSR (Unduplicated)      454,979



Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)         

                                       The SURVEYMEANS Procedure                        

                                             Data Summary                              

                              Number of Strata                 201                  
                              Number of Clusters              4568                  
                              Number of Observations       7083805
                              Sum of Weights              35419023


                                              Statistics                               

                                                        Std Error                     Std Error
          Variable      Label               Mean          of Mean          Sum           of Sum
          -------------------------------------------------------------------------------------
          Asthma        Asthma default      0.004781     0.000109     **169330      4029.071202
                        DXCCSR=RSP009
 

A comparison of the count obtained from the NIS Diagnosis and Procedure Frequency Tables and the output from the example SAS code in this tutorial (denoted by **) demonstrates that our results match.
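
The unweighted column of the same table offers a second check. As a sketch, summing the 0/1 ASTHMA indicator without a weight gives the unweighted number of records, which can be compared with the "Unweighted N for DX1 CCSR Default" value of 33,866. This assumes WORK.ASTHMA was created by the DATA step in the national asthma example.

```sas
/* Assumes WORK.ASTHMA was created by the DATA step in the national asthma example */
proc means data=asthma n sum maxdec=0;
    var Asthma;   /* SUM of the 0/1 indicator = unweighted number of asthma records */
run;
```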


HCUPnet


Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)     
                                                                                                         
                                        The SURVEYMEANS Procedure                                        
                                                                                                         
                                   Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                 Std Error                     Std Error
HOSP_REGION     Variable     Label                    Mean         of Mean           Sum          of Sum
--------------------------------------------------------------------------------------------------------
1: Northeast    Asthma       Asthma default       0.006485        0.000358         41550     2419.380819
                             DXCCSR=RSP009
2: Midwest      Asthma       Asthma default       0.004217        0.000221         33065     1775.993606
                             DXCCSR=RSP009
3: South        Asthma       Asthma default       0.004415        0.000135         62150     2058.065599
                             DXCCSR=RSP009
4: West         Asthma       Asthma default       0.004590        0.000240         32565     1729.226390
                             DXCCSR=RSP009
--------------------------------------------------------------------------------------------------------
 

Here is the output from our final example analysis, which produced regional estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis:

  • Northeast: 41,550
  • Midwest: 33,065
  • South: 62,150
  • West: 32,565

For validation, we are going to compare the output with HCUPnet.

As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select the "Inpatient Setting" dashboard. Once selected, we will expand the option for "National Inpatient" and select "Diagnoses and Procedures."

The HCUPnet results will default to displaying trends in the total number of discharges with a default CCSR category of BLD001, Nutritional anemia, for the principal diagnosis. We need to modify the selections on the left-hand side of the screen to align with our analysis as follows:

  1. First, select the option for "Cross-Sectional" analysis.
  2. Next, retain the default data year of "2019" in the "Years" drop-down, the "Diagnoses—Clinical Classifications Software Refined or CCSR" in the "Classification Types" drop-down, and the "Principal" option in the "Principal or All-Listed" drop-down.
  3. Next, under the "Diagnoses/Procedures" drop-down, clear the (All) selection so that the query no longer runs on all CCSR categories. Scroll down through the list to CCSR category RSP009, Asthma, or use the search bar, and ensure its box is checked.
  4. Next, ensure only "Number of discharges" is selected in the "Outcome" drop-down.
  5. Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
  6. Next, retain the default option of "All" for the "Characteristic Levels" drop-down so that all four U.S. census regions are included in the results.
  7. Last, select the box for "Show 95% CI" to display the standard error of the estimates if you wish to view this information.

A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-down menus.

Diagnoses/Procedures    Characteristic Levels    Total number of discharges
                                                 Estimate      Std. Error
---------------------------------------------------------------------------
RSP009: Asthma          Northeast                  41,550           2,419
                        Midwest                    33,065           1,776
                        South                      62,150           2,058
                        West                       32,565           1,729



A comparison of our output from HCUPnet with the output from the example SAS code in this tutorial demonstrates that our results match.


You have completed Module 1, National Inpatient Sample (NIS).

For any questions about the NIS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:

  • Email: hcup@ahrq.gov
  • Phone: 866-290-HCUP (4287) (toll free)
  • International users, please contact HCUP User Support by email.

The staff reviews messages daily and usually responds to inquiries within 3 business days.


Return to Contents


Module 2: Kids' Inpatient Database (KID)

The Kids' Inpatient Database, or KID, is the largest publicly available all-payer pediatric inpatient care database in the United States, containing data from 2 to 3 million hospital stays each year.

Information on the KID is organized into the four sections below:

  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates

Additional information about the KID is available on the KID Database Documentation page on the HCUP User Support (HCUP-US) website.


Return to Contents


Module 2: Kids' Inpatient Database (KID), Overview of the KID

The KID is the largest publicly available all-payer pediatric inpatient care database in the United States, yielding national estimates of hospital inpatient stays for patients younger than 21 years. The KID can be used to identify, track, and analyze national trends in healthcare utilization, cost, quality, and outcomes for the pediatric population. The unique design of the KID enables national and regional studies of rare conditions (e.g., congenital anomalies) as well as uncommon treatments (e.g., cardiac surgery).

The KID includes a sample of pediatric discharges from the HCUP State Inpatient Databases (SID), which include all inpatient data from participating HCUP Partners. The KID is generally available every 3 years, and its sampling frame has grown from 22 HCUP Partners to 49 HCUP Partners.

Additional information on the sample design of the KID is available in the KID Introduction and the HCUP Sample Design tutorial.


Return to Contents


Module 2: Kids' Inpatient Database (KID), Weighting the KID

KID Data Element Discharge Weight

To produce nationally or regionally representative estimates, the KID data must be weighted. This can be done using the data element discharge weight, or DISCWT, which is assigned to each record in the KID with the value varying across records.

When the discharge weights are applied to the KID data, the result is an estimate of the number of discharges for the target universe, which includes all pediatric inpatient discharges from community hospitals in the United States, excluding rehabilitation hospitals beginning in 2000. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals are non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.

Weights are developed after discharges sampled from the SID are stratified by six hospital characteristics: (1) indicator of freestanding children's hospital, (2) U.S. census region, (3) urban/rural location, (4) teaching status, (5) bed size, and (6) hospital ownership/control. Total discharge counts for the target universe are calculated using birth estimates from the AHA Annual Survey for newborns and a combination of AHA and SID data for other pediatric discharges.

Pediatric discharges included in the KID are a combination of newborn discharges and non-newborn pediatric discharges. Within each stratum, a separate weight is created for each group:

  • Newborn discharges: the number of universe newborns in the stratum divided by the number of KID newborns in the stratum
  • Non-newborn pediatric discharges: the number of universe non-newborn pediatric discharges in the stratum divided by the number of KID non-newborn pediatric discharges in the stratum
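
The division described above can be sketched in a short DATA step. The stratum label, group names, counts, and the data element name discwt_sketch are hypothetical values chosen only to illustrate the arithmetic; they are not actual KID figures or KID data elements.

```sas
/* Hypothetical stratum counts for illustration only; not actual KID values. */
data weight_sketch;
    input stratum $ group :$12. universe_n kid_n;
    /* Weight = universe discharges in the stratum / KID discharges in the stratum */
    discwt_sketch = universe_n / kid_n;
    datalines;
A newborn 1000 100
A non_newborn 2400 1920
;
run;

proc print data=weight_sketch noobs;
run;
```

In this sketch, each sampled newborn record would represent 10 universe newborns (1000 / 100), and each sampled non-newborn record would represent 1.25 universe discharges (2400 / 1920).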

KID Discharge Weights Over Time

KID data are generally available every 3 years beginning with data year 1997. Users should be mindful of changes to the discharge weight variable over time. These changes are listed in the table below.


Data Year(s)    Data Element Name    Use
--------------------------------------------------------------------------------------------------------------
2003+           DISCWT               All national estimates
2000            DISCWT               National estimates except those including total charge (data element TOTCHG)
2000            DISCWTcharge         National estimates of total charge (data element TOTCHG)
1997            DISCWT_U             All national estimates
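
For analyses that span several KID data years, one way to avoid picking the wrong weight is to encode the table above in a small macro. The macro name kid_weight and its charge= parameter are hypothetical helpers introduced for this sketch; they are not part of HCUP's distributed code.

```sas
/* Hypothetical helper: returns the KID weight data element name for a data year. */
/* Set charge=1 when the estimate involves total charge (TOTCHG).                 */
%macro kid_weight(year, charge=0);
    %if &year >= 2003 %then DISCWT;
    %else %if &year = 2000 and &charge = 1 %then DISCWTcharge;
    %else %if &year = 2000 %then DISCWT;
    %else %if &year = 1997 %then DISCWT_U;
    %else %put ERROR: No KID discharge weight is listed for data year &year..;
%mend kid_weight;

/* Example: write the resolved names to the SAS log */
%put Weight for 2019: %kid_weight(2019);
%put Weight for 2000 total charge estimates: %kid_weight(2000, charge=1);
%put Weight for 1997: %kid_weight(1997);
```

Inside a SURVEYMEANS step, the macro call could stand in for the data element name, for example weight %kid_weight(2000, charge=1) ; in place of weight DISCWTcharge ;.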

Return to Contents


Module 2: Kids' Inpatient Database (KID), SAS Code Examples

Example SAS Code for Producing National Estimates by Patient Location

This example SAS code produces national estimates of discharges by patient location in the 2019 KID, defined by the National Center for Health Statistics (NCHS) urban-rural code (data element PL_NCHS).


Title "Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)";
Libname KID2019	"V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;

proc format;
    Value NCHSF 
         1 = " 1: Large Central Metro"
         2 = " 2: Large Fringe Metro"
         3 = " 3: Medium Metro"
         4 = " 4: Small Metro"
         5 = " 5: Micropolitan"
         6 = " 6: Noncore"
         . = " .: Missing"
        ;
run;

proc surveymeans data=kid2019.kid_2019_core missing sumwgt ;
     cluster HOSP_KID ;
     strata KID_STRATUM ;
     domain PL_NCHS ;
     format PL_NCHS nchsf. ;
     weight DISCWT ;
     var RECNUM ;
run;
 

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on data element PL_NCHS, which has the following mappings:

  • Numeric value 1 for Large central metro
  • Numeric value 2 for Large fringe metro
  • Numeric value 3 for Medium metro
  • Numeric value 4 for Small metro
  • Numeric value 5 for Micropolitan
  • Numeric value 6 for Noncore
  • A period (.) for a missing value

This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for patient age or data element FEMALE, the proc format would include the mapping for that data element.
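
As a sketch of that modification, the code below swaps in the data element FEMALE. It assumes the same libname as the example above and assumes the usual HCUP coding for FEMALE (0 for male, 1 for female); the format name sexf is hypothetical, and the coding should be confirmed against the KID data element documentation before use.

```sas
proc format;
    /* Assumed coding for FEMALE; confirm against the KID documentation */
    value sexf
         0 = " 0: Male"
         1 = " 1: Female"
         . = " .: Missing"
        ;
run;

proc surveymeans data=kid2019.kid_2019_core missing sumwgt ;
     cluster HOSP_KID ;
     strata KID_STRATUM ;
     domain FEMALE ;           /* domain of interest replaces PL_NCHS */
     format FEMALE sexf. ;
     weight DISCWT ;
     var RECNUM ;
run;
```

Only the PROC FORMAT and the DOMAIN and FORMAT statements change; the CLUSTER, STRATA, WEIGHT, and VAR statements stay the same because they describe the sample design rather than the analysis.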

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:

  • The CLUSTER statement, which includes the KID hospital identifier or data element HOSP_KID.
  • The STRATA statement, which includes the KID stratum identifier or data element KID_STRATUM.
  • The DOMAIN and FORMAT statements, which are specific to this analysis because we want national estimates by data element PL_NCHS.
  • The WEIGHT statement, which includes the KID discharge weight or data element DISCWT.
  • The VAR statement, which includes the KID record identifier or data element RECNUM.
    • Note that the KID record identifier data element name does not include KEY, which differs from the other nationwide databases.


Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for PL_NCHS domains                                 

                                                                              Sum of
    PL_NCHS                     Variable       Label                         Weights
    --------------------------------------------------------------------------------
     .: Missing                 RECNUM         KID record number               19590
     1: Large Central Metro     RECNUM         KID record number             1908413
     2: Large Fringe Metro      RECNUM         KID record number             1415496
     3: Medium Metro            RECNUM         KID record number             1229632
     4: Small Metro             RECNUM         KID record number              509407
     5: Micropolitan            RECNUM         KID record number              492487
     6: Noncore                 RECNUM         KID record number              327514
    --------------------------------------------------------------------------------
 

The output for this example SAS code provides the following weighted record counts for PL_NCHS in the 2019 KID:

  • Missing: 19,590
  • Large central metro: 1,908,413
  • Large fringe metro: 1,415,496
  • Medium metro: 1,229,632
  • Small metro: 509,407
  • Micropolitan: 492,487
  • Noncore: 327,514

Example SAS Code for Producing National Estimates for Appendectomies

This example SAS code identifies the number of weighted records in the 2019 KID with a principal procedure of appendectomy, which is based on the HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-PCS procedure category, GIS008 (Appendectomy).


Title "Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)";
Libname KID2019	"V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;

data Appendectomy;
    merge kid2019.kid_2019_core (keep=HOSP_KID RECNUM DISCWT KID_STRATUM)
          kid2019.kid_2019_dx_pr_grps (keep=HOSP_KID RECNUM PRCCSR_GIS008)
    ;
    by HOSP_KID RECNUM;
    /* 1 is principal only, 2 is both principal and secondary, 3 is secondary only, 0 is none */
    Attrib Appendectomy length=3 label='Appendectomy (PRCCSR_GIS008=1 or 2)';
    Appendectomy = (PRCCSR_GIS008 in (1:2));
run;

proc surveymeans data=Appendectomy sum std mean stderr;
     cluster HOSP_KID ;
     strata KID_STRATUM;
     var Appendectomy;
     weight DISCWT;
run;
 

The first section of this example SAS code includes the DATA step, which identifies records with a CCSR category for the principal procedure of GIS008, Appendectomy. This step includes the following statements:

  • The MERGE statement, which combines the KID Core File with the KID Diagnosis and Procedure Groups File. The KID Diagnosis and Procedure Groups File includes the data element specific to the CCSR category for appendectomy, which is PRCCSR_GIS008.
  • The KEEP statements, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and PRCCSR_GIS008.
  • The ATTRIB statement, which assigns a length and a label to a new data element (Appendectomy) specific to our example analysis. The next statement, Appendectomy =, assigns a value to this new data element, which in our example, is defined based on the CCSR category of GIS008 for the principal procedure (KID data element PRCCSR_GIS008 where the value is equal to 1 or 2).
    • A value of 1 means that the CCSR category was triggered by only the principal procedure on the record, and a value of 2 indicates it was triggered by both the principal and a secondary procedure on the record.
    • Omitting the value 2 from this statement is a common mistake among users of HCUP data. A procedure code (or diagnosis code) is often repeated on a record, and more than one code on the same record can map to the same CCSR category. If you limit your analysis to records where the respective CCSR category is equal to 1, that is, records where the category was triggered by the principal procedure (or diagnosis) only, you will not obtain accurate results.
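
To see how many records would be lost by keeping only the value 1, it can help to tabulate the values of PRCCSR_GIS008 before building the indicator. This is a minimal, unweighted sketch that assumes the libname from the example above; the format name ccsrflag is hypothetical, and its labels paraphrase the comment in the DATA step.

```sas
proc format;
    /* Labels follow the value definitions given in the DATA step comment */
    value ccsrflag
         0 = "0: Not triggered"
         1 = "1: Principal only"
         2 = "2: Principal and secondary"
         3 = "3: Secondary only"
        ;
run;

/* Unweighted record counts for each value of the CCSR indicator */
proc freq data=kid2019.kid_2019_dx_pr_grps;
    tables PRCCSR_GIS008 / missing;
    format PRCCSR_GIS008 ccsrflag. ;
run;
```

Any records in the "2: Principal and secondary" row are principal-procedure appendectomies that a filter on the value 1 alone would drop.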

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_KID.
  • The STRATA statement, which includes KID_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The VAR statement, which includes the value, Appendectomy, that we defined in the DATA step above.


Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  95                  
                             Number of Clusters              3998                  
                             Number of Observations       3089283
                             Sum of Weights            5902538.38


                                           Statistics                               

                                                        Std Error                     Std Error
   Variable        Label                    Mean          of Mean          Sum           of Sum
   --------------------------------------------------------------------------------------------
   Appendectomy    Appendectomy             0.007910     0.000275        46687     42075.679551
                   (PRCCSR_GIS008=1 or 2)
 

The output for this example SAS code provides the total number of weighted records in the 2019 KID with a CCSR category of GIS008, Appendectomy, which is 46,687.
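
If confidence limits are wanted alongside the weighted total, the same step can request them. This is a sketch that assumes the Appendectomy data set created by the DATA step above; CLSUM and ALPHA= are standard PROC SURVEYMEANS options, and 0.05 is the default alpha, shown here only to make the 95 percent level explicit.

```sas
proc surveymeans data=Appendectomy sum std clsum alpha=0.05;
     cluster HOSP_KID ;
     strata KID_STRATUM ;
     var Appendectomy ;
     weight DISCWT ;
run;
```

The CLSUM keyword adds lower and upper confidence limits for the weighted sum to the Statistics table.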

Example SAS Code for Producing Regional Estimates for Appendectomies

The example SAS code below produces regional estimates for records with a CCSR category for the principal procedure (PR1) of GIS008, Appendectomy.


Title "Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)";
Libname KID2019 "V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;
     
proc format;
     Value St_Regn 
         1 = "1: Northeast"
         2 = "2: Midwest"
         3 = "3: South"
         4 = "4: West"
     ;
run;

data appendectomy;
    merge kid2019.kid_2019_core (keep=HOSP_KID RECNUM DISCWT KID_STRATUM)
          kid2019.kid_2019_dx_pr_grps (keep=HOSP_KID RECNUM PRCCSR_GIS008)
    ;
    by HOSP_KID RECNUM;
    /* Look up region */
    if _n_=1 then do;
        if 0 then set kid2019.kid_2019_hospital (keep=HOSP_REGION);
        declare hash h (dataset: "kid2019.kid_2019_hospital");
        h.defineKey('HOSP_KID');
        h.defineData('HOSP_REGION');
        h.defineDone();
    end;
    if h.find() ne 0 then abort; /* all discharges should have a matching hospital record */
    /* 1 is principal only, 2 is both principal and secondary, 3 is secondary only, 0 is none */
    Attrib Appendectomy length=3 label='Appendectomy (PRCCSR_GIS008=1 or 2)';
    Appendectomy = (PRCCSR_GIS008 in (1:2));
run;

proc surveymeans data=Appendectomy missing sum std mean stderr ;
    cluster HOSP_KID ;
    strata KID_STRATUM ;
    var Appendectomy ;
    weight DISCWT ;
    domain HOSP_REGION ;
    format HOSP_REGION St_Regn. ;
run;

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:

  • Numeric value 1 for Northeast
  • Numeric value 2 for Midwest
  • Numeric value 3 for South
  • Numeric value 4 for West

The second section of this example SAS code includes the DATA step. Like the second example above, this step is looking for records with a CCSR category for the principal procedure of GIS008, Appendectomy. This step includes the following statements:

  • The MERGE statement, which combines the KID Core File with the KID Diagnosis and Procedure Groups File keeping essential data elements from both files.
    • An additional step uses a hash object, keyed on HOSP_KID, to look up the data element HOSP_REGION, which resides in the KID Hospital File and is needed to produce regional estimates.
  • The ATTRIB statement, which assigns a length and a label to a new data element (Appendectomy) specific to our example analysis. The next statement, Appendectomy =, assigns a value to this new data element, which in our example, is defined based on the CCSR category of GIS008 for the principal procedure (KID data element PRCCSR_GIS008 where the value is equal to 1 or 2).
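
For readers less familiar with hash objects, the same three-file combination can be sketched with PROC SQL. This is an alternative technique, not the tutorial's method; it assumes the libname from the example above, and the output data set name appendectomy_sql is hypothetical.

```sas
proc sql;
    create table appendectomy_sql as
    select c.HOSP_KID, c.RECNUM, c.DISCWT, c.KID_STRATUM,
           h.HOSP_REGION,
           /* 1 when the CCSR category was triggered by the principal procedure, else 0 */
           (g.PRCCSR_GIS008 in (1, 2)) as Appendectomy length=3
               label='Appendectomy (PRCCSR_GIS008=1 or 2)'
    from kid2019.kid_2019_core as c
         inner join kid2019.kid_2019_dx_pr_grps as g
             on c.HOSP_KID = g.HOSP_KID and c.RECNUM = g.RECNUM
         inner join kid2019.kid_2019_hospital as h
             on c.HOSP_KID = h.HOSP_KID;
quit;
```

One design difference is worth noting: the inner join silently drops any discharge without a matching hospital record, whereas the hash lookup in the DATA step aborts in that case. Comparing the row count of the result with the row count of the Core File is a simple way to confirm that nothing was dropped.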

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_KID.
  • The STRATA statement, which includes KID_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The VAR statement, which includes the value, Appendectomy, which we defined in the DATA step above.
  • The DOMAIN and FORMAT statements, which are specific to HOSP_REGION as we are interested in regional estimates.


    Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)     
                                                                                                         
                                          The SURVEYMEANS Procedure                                        
                                                                                                         
                                     Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                    Std Error                       Std Error
HOSP_REGION    Variable        Label                    Mean          of Mean              Sum         of Sum
-------------------------------------------------------------------------------------------------------------
1: Northeast   Appendectomy    Appendectomy             0.007744     0.000653      7430.721673     846.785971
                               (PRCCSR_GIS008=1 or 2)
2: Midwest     Appendectomy    Appendectomy             0.005521     0.000348      7025.478501     662.349843
                               (PRCCSR_GIS008=1 or 2)
3: South       Appendectomy    Appendectomy             0.006957     0.000384            16252    1277.861496
                               (PRCCSR_GIS008=1 or 2)
4: West        Appendectomy    Appendectomy             0.011976     0.000875            15980    1659.952615
                               (PRCCSR_GIS008=1 or 2)
-------------------------------------------------------------------------------------------------------------
 

The output for this example SAS code provides the total number of weighted records in the 2019 KID with a CCSR category for the principal procedure of GIS008, Appendectomy, by hospital region:

  • Northeast: 7,430
  • Midwest: 7,025
  • South: 16,252
  • West: 15,980

Return to Contents

Module 2: Kids' Inpatient Database (KID), Validating National and Regional Estimates

There are three resources that can be used to validate national and regional estimates for the KID.

  • The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
  • The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
  • HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.

HCUP Summary Statistics


Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for PL_NCHS domains                                 

                                                                              Sum of
    PL_NCHS                     Variable       Label                         Weights
    --------------------------------------------------------------------------------
     .: Missing                 RECNUM         KID record number               19590
     1: Large Central Metro     RECNUM         KID record number             1908413
     2: Large Fringe Metro      RECNUM         KID record number             1415496
     3: Medium Metro            RECNUM         KID record number             1229632
     4: Small Metro             RECNUM         KID record number              509407
     5: Micropolitan            RECNUM         KID record number              492487
     6: Noncore                 RECNUM         KID record number              327514
    --------------------------------------------------------------------------------
 

Here is the output from our first example analysis, which produced national estimates for records in the 2019 KID by patient location using the NCHS urban-rural code, or data element PL_NCHS. We have separate weighted counts for each PL_NCHS value as well as for the missing value.

For validation, we are going to compare the output with the 2019 KID Summary Statistics.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the KID Database Documentation.

The KID Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.

The KID Summary Statistics page includes all years of the KID. We will scroll down to the section specific to data year 2019. Our data element of interest, PL_NCHS, is in the KID Core File, which means we will want to select the Summary Statistics for the KID Core File and, specifically, the file that provides weighted estimates (i.e., KID 2019 Core Weighted). Once the file has been downloaded, we will navigate to the frequency table for the data element PL_NCHS. We can do this easily by searching for this data element name within the downloaded PDF.

HCUP Weighted Summary Statistics Report: KID 2019 Core File Weighted Frequency Distribution for PL_NCHS

PL_NCHS                      Frequency    Percent of Total
----------------------------------------------------------
.: Missing                      19,590               0.33%
1: Large Central Metro       1,908,413              32.33%
2: Large Fringe Metro        1,415,496              23.98%
3: Medium Metro              1,229,632              20.83%
4: Small Metro                 509,407               8.63%
5: Micropolitan                492,487               8.34%
6: Noncore                     327,514               5.55%


A comparison of the PL_NCHS frequency from the 2019 KID Weighted Core Summary Statistics and the output from SAS demonstrates that our results match.
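
When only the weighted counts and percentages are needed for this kind of check, a weighted PROC FREQ is a lighter-weight sketch of the same tabulation. It assumes the libname and the nchsf format from the first SAS code example. It does not account for the sample design, so it should not be used for standard errors.

```sas
/* Weighted frequency distribution of PL_NCHS; counts are sums of DISCWT */
proc freq data=kid2019.kid_2019_core;
    tables PL_NCHS / missing;    /* MISSING keeps the missing level in counts and percents */
    weight DISCWT ;
    format PL_NCHS nchsf. ;
run;
```

The Frequency and Percent columns from this step can be compared directly with the Frequency and Percent of Total columns in the Summary Statistics table.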

KID Diagnosis and Procedure Frequency Tables

In the output from our second example analysis, which produced national estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure, we obtained a count of 46,687.

For validation, we are going to compare the output with the KID Diagnosis and Procedure Frequency Tables.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the KID Database Documentation.

The KID Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.

Once the file has been downloaded, we will navigate to the tab T.3_By_PRCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-PCS procedure categories. We will then navigate to the row for CCSR category GIS008, Appendectomy, and scroll over to the columns that are specific to the 2019 KID. Note that you can filter to GIS008 by using either Column A or Column B. We are now ready to compare the values with our output from SAS.

Table 3. Weighted and Unweighted Number of Records by Clinical Classifications Software Refined (CCSR) for ICD-10-PCS Procedures, v2021.1
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Kids' Inpatient Database (KID), 2016 and 2019

Note: Unduplicated means that if two or more procedure codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.

CCSR for ICD-10-PCS Category, v2021.1                         GIS008
CCSR Description, v2021.1                                     GIS008 Appendectomy
2019 KID: Weighted N for PR1 CCSR                             **46,687
2019 KID: Weighted N for All-Listed CCSR (Unduplicated)       51,239
2019 KID: Unweighted N for PR1 CCSR                           34,537
2019 KID: Unweighted N for All-Listed CCSR (Unduplicated)     37,927



Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  95                  
                             Number of Clusters              3998                  
                             Number of Observations       3089283
                             Sum of Weights            5902538.38


                                           Statistics                               

                                                        Std Error                     Std Error
   Variable        Label                    Mean          of Mean          Sum           of Sum
   --------------------------------------------------------------------------------------------
   Appendectomy    Appendectomy             0.007910     0.000275      **46687     42075.679551
                   (PRCCSR_GIS008=1 or 2)
 

A comparison of the weighted count for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure with the output from SAS (denoted by **) demonstrates that our results match.
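
The same table also reports an all-listed (unduplicated) weighted count. A sketch of how that column could be checked is shown below. It assumes the libname from the earlier examples and relies on the value definitions given in the DATA step comment (1 principal only, 2 principal and secondary, 3 secondary only); the data set and data element name Appendectomy_any is hypothetical.

```sas
data Appendectomy_any;
    merge kid2019.kid_2019_core (keep=HOSP_KID RECNUM DISCWT KID_STRATUM)
          kid2019.kid_2019_dx_pr_grps (keep=HOSP_KID RECNUM PRCCSR_GIS008)
    ;
    by HOSP_KID RECNUM;
    /* Values 1, 2, and 3 together cover any procedure position on the record */
    Attrib Appendectomy_any length=3 label='Appendectomy, all-listed (PRCCSR_GIS008=1, 2, or 3)';
    Appendectomy_any = (PRCCSR_GIS008 in (1:3));
run;

proc surveymeans data=Appendectomy_any sum std mean stderr;
     cluster HOSP_KID ;
     strata KID_STRATUM ;
     var Appendectomy_any ;
     weight DISCWT ;
run;
```

Because each record contributes at most once to the indicator, the weighted sum from this step should be comparable to the "Weighted N for All-Listed CCSR (Unduplicated)" column rather than the PR1 column.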


HCUPnet


    Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)     
                                                                                                         
                                          The SURVEYMEANS Procedure                                        
                                                                                                         
                                     Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                    Std Error                       Std Error
HOSP_REGION    Variable        Label                    Mean          of Mean              Sum         of Sum
-------------------------------------------------------------------------------------------------------------
1: Northeast   Appendectomy    Appendectomy             0.007744     0.000653      7430.721673     846.785971
                               (PRCCSR_GIS008=1 or 2)
2: Midwest     Appendectomy    Appendectomy             0.005521     0.000348      7025.478501     662.349843
                               (PRCCSR_GIS008=1 or 2)
3: South       Appendectomy    Appendectomy             0.006957     0.000384            16252    1277.861496
                               (PRCCSR_GIS008=1 or 2)
4: West        Appendectomy    Appendectomy             0.011976     0.000875            15980    1659.952615
                               (PRCCSR_GIS008=1 or 2)
-------------------------------------------------------------------------------------------------------------
 

Here is the output from our final example analysis, which produced regional estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure. These weighted counts can be validated using HCUPnet. It is important to note that within HCUPnet, procedure-related statistics are limited to operating room (OR) procedures only. OR procedures are identified using the HCUP Procedure Classes Refined for ICD-10-PCS. For reference, all ICD-10-PCS procedures included in CCSR category GIS008 are classified as OR procedures by the Procedure Classes tool.

As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select the "Inpatient Setting" dashboard. Once selected, we will expand the option for "Children Only" and select "Diagnoses and Procedures."

The output will default to displaying trends in the total number of discharges for all major diagnostic categories, or MDCs. We need to modify the selections on the left-hand side of the screen to align with our analysis.

  1. First, select the option for "Cross-Sectional" analysis.
  2. Next, retain the default data year of "2019" in the "Years" drop-down.
  3. Next, select "Procedures—Clinical Classifications Software Refined or CCSR, Restricted to Operating Room Only" in the "Classification Types" drop-down, and retain the "Principal" option in the "Principal or All-Listed" drop-down.
  4. Next, under the "Diagnoses/Procedures" drop-down, clear the (All) selection so that the query does not run on all CCSR categories. Then scroll down the list to CCSR category GIS008, Appendectomy (or use the search bar), and ensure its box is checked.
  5. Next, ensure only "Number of discharges" is selected in the "Outcome" drop-down.
  6. Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
  7. Next, retain the default option of "All" for the "Characteristic Levels" drop-down.
  8. Last, select the box for "Show 95% CI" to display the standard error of the estimates if you wish to view this information.

A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-downs.

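Diagnoses/Procedures | Characteristic Levels | Total number of discharges: Estimate | Total number of discharges: Std. Error
GIS008: Appendectomy | Midwest | 7,025 | 662
GIS008: Appendectomy | Northeast | 7,431 | 847
GIS008: Appendectomy | South | 16,252 | 1,278
GIS008: Appendectomy | West | 15,980 | 1,660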



    Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)     
                                                                                                         
                                          The SURVEYMEANS Procedure                                        
                                                                                                         
                                     Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                    Std Error                       Std Error
HOSP_REGION    Variable        Label                    Mean          of Mean        Sum               of Sum
-------------------------------------------------------------------------------------------------------------
1: Northeast   Appendectomy    Appendectomy             0.007744     0.000653      7430.721673     846.785971
                               (PRCCSR_GIS008=1 or 2)
2: Midwest     Appendectomy    Appendectomy             0.005521     0.000348      7025.478501     662.349843
                               (PRCCSR_GIS008=1 or 2)
3: South       Appendectomy    Appendectomy             0.006957     0.000384      16252          1277.861496
                               (PRCCSR_GIS008=1 or 2)
4: West        Appendectomy    Appendectomy             0.011976     0.000875      15980          1659.952615
                               (PRCCSR_GIS008=1 or 2)
-------------------------------------------------------------------------------------------------------------
 

A comparison of our output from HCUPnet with the output from the example SAS code in this tutorial demonstrates that our results match.


You have completed Module 2, Kids' Inpatient Database (KID)!

For any questions about the KID that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:

  • Email: hcup@ahrq.gov
  • Phone: 866-290-HCUP (4287) (toll free)
  • International users, please contact HCUP User Support by email.

The staff reviews messages daily and usually responds to inquiries within 3 business days.


Return to Contents


Module 3: Nationwide Ambulatory Surgery Sample (NASS)

The Nationwide Ambulatory Surgery Sample, or NASS, is the largest all-payer ambulatory surgery database in the United States. It produces national estimates of major ambulatory surgery encounters in hospital-owned facilities.

Information on the NASS is organized into the four sections below:

  • Overview
  • Weighting the Data
  • SAS Code Examples, and
  • Validating Estimates

Additional information about the NASS is available on the NASS Database Documentation page on the HCUP User Support, or HCUP-US, website.


Module 3: Nationwide Ambulatory Surgery Sample (NASS), Overview of the NASS

The NASS is the largest all-payer ambulatory surgery database in the United States, yielding national and regional estimates of major ambulatory surgery encounters performed in hospital-owned facilities.

Major ambulatory surgeries are defined as selected major therapeutic procedures that require the use of an operating room, penetrate or break the skin, and involve regional anesthesia, general anesthesia, or sedation to control pain (that is, surgeries flagged as "narrow" in the HCUP Surgery Flags Software).

The NASS is limited to encounters with at least one in-scope major ambulatory surgery on the record performed at hospital-owned facilities. Procedures intended primarily for diagnostic purposes are not considered in scope. Unweighted, the NASS contains about 9 million ambulatory surgery encounters each year and about 11.8 million ambulatory surgery procedures. Weighted, it estimates about 11.9 million ambulatory surgery encounters and 15.7 million ambulatory surgery procedures.

The NASS is sampled from the HCUP State Ambulatory Surgery and Services Databases (SASD), and is available beginning with data year 2016.

Additional information on the NASS sample design is available in the NASS Introduction.


Return to Contents

Module 3: Nationwide Ambulatory Surgery Sample (NASS), Weighting the NASS

NASS Data Element Encounter Weight

To produce nationally or regionally representative estimates, the NASS data must be weighted. This can be done using the encounter weight data element, DISCWT, which is assigned to each record in the NASS.

When the encounter weights are applied to the NASS data, the result is an estimate of the number of major ambulatory surgery encounters for the target universe, which includes all major ambulatory surgery encounters performed in facilities owned by community hospitals in the United States, excluding rehabilitation hospitals and long-term acute care hospitals. Prior to data year 2019, specialty hospitals were also excluded. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.

The NASS target universe is major ambulatory surgery encounters in hospital-owned facilities in the U.S. The NASS sample comprises 100 percent of the major ambulatory surgery encounters for facilities in the SASD that are also in the NASS target universe. Ambulatory surgery volume for the target universe is derived from encounters in the SASD for facilities in HCUP States and estimated for facilities in non-HCUP States using predictive modeling.

Weights are developed by first summarizing target and sample ambulatory surgery volume by strata defined by four hospital characteristics: (1) ownership/control, (2) bed size, (3) location and teaching status, and (4) U.S. census region.

NASS encounter weights are calculated by dividing the number of universe major ambulatory surgery encounters by the number of sampled SASD major ambulatory surgery encounters within each stratum.
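
The ratio described above can be illustrated with a minimal sketch. The stratum totals below are hypothetical values chosen only for illustration, and the names Weight_Sketch, Universe_N, Sample_N, and Sketch_Weight are not part of the NASS; the actual weights are already supplied on each record as DISCWT and do not need to be recomputed.

```sas
/* Hypothetical stratum totals for illustration only; not actual NASS values */
data Weight_Sketch;
    input NASS_STRATUM Universe_N Sample_N;
    /* Weight = universe encounters / sampled encounters within the stratum */
    Sketch_Weight = Universe_N / Sample_N;
    datalines;
1 12000 9000
2 5000 4000
;
run;

proc print data=Weight_Sketch noobs;
run;
```

Under these assumed totals, each sampled encounter in the first stratum would stand in for about 1.33 encounters in the universe, and each sampled encounter in the second stratum for 1.25.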

Changes to the NASS Sampling Design Over Time

Changes have occurred to the NASS design since its initial release for data year 2016. These changes include:

  • Procedures considered in scope can change year to year.
  • Earlier years of the NASS (2016–2018) undercount certain emergent surgeries.
  • The hospital-owned facility universe was modified between data years 2018 and 2019 to include specialty hospitals and to limit the universe to hospitals included in the AHA Annual Survey that reported performing outpatient surgery.

Additional information on these changes is available in the NASS Introduction.

These changes may cause discontinuities in trend analyses of major ambulatory surgery encounters over time. Unlike the redesign of the NIS in 2012, the NASS design changes have not resulted in the development of special trend weight files. The NASS encounter weight (data element DISCWT) should be used to obtain national or regional estimates.



Return to Contents


Module 3: Nationwide Ambulatory Surgery Sample (NASS), SAS Code Examples

Example SAS Code for Producing National Estimates by Race and Ethnicity

This example SAS code produces national estimates of major ambulatory surgeries by patient race and ethnicity (data element RACE) in the 2019 NASS.


Title "Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;

proc format;
    Value FRACE 
         1 = " 1: White"
         2 = " 2: Black"
         3 = " 3: Hispanic"
         4 = " 4: Asian/Pacific Islander"
         5 = " 5: Native American"
         6 = " 6: Other"
         . = " .: Missing"
        .A = ".A: Invalid"
        .B = ".B: Unavailable from source"
        ;
run;

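Title2 "Add NASS_STRATUM from Hospital file";
/* Sort the Encounter file, keeping only the elements needed for linkage, weighting, and RACE */
Proc Sort Data=NASS2019.NASS_2019_encounter (Keep=HOSP_NASS KEY_NASS DISCWT RACE) Out=Encounter ;
     By HOSP_NASS KEY_NASS;
Run;

/* Sort the Hospital file, keeping the linkage key and the stratum */
Proc Sort Data=NASS2019.NASS_2019_hospital (Keep=HOSP_NASS NASS_STRATUM) Out=Hospital ;
     By HOSP_NASS;
Run;

/* Attach NASS_STRATUM to each encounter; stop if a hospital is missing */
Data NASS;
     Merge Encounter (in=inE)
           Hospital (in=inH)
     ;
     By HOSP_NASS;
     if inE;
     if not inH then abort;
Run;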
Title2;

proc surveymeans data=NASS missing sumwgt ;
     cluster HOSP_NASS ;
     strata NASS_STRATUM ;
     domain RACE ;
     format RACE FRACE. ;
     weight DISCWT ;
     var KEY_NASS ;
run;
 

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on data element RACE, which has the following mappings:

  • Numeric value 1 for White
  • Numeric value 2 for Black
  • Numeric value 3 for Hispanic
  • Numeric value 4 for Asian or Pacific Islander
  • Numeric value 5 for Native American
  • Numeric value 6 for Other
  • A period (.) for a missing value
  • A period followed by the uppercase letter A (.A) for an invalid value, and
  • A period followed by the uppercase letter B (.B) for a value unavailable from the source

This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for the primary expected payer or data element PAY1, the proc format would include the mapping for that data element.
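
As a sketch of that modification, a format for PAY1 might look like the following. The value labels shown are an assumption based on the usual HCUP coding of expected payer and should be confirmed against the NASS data element documentation; the format name FPAYER is hypothetical (SAS format names cannot end in a digit, so FPAY1 would not be valid).

```sas
proc format;
    /* Assumed PAY1 value labels; confirm against the NASS data element documentation */
    Value FPAYER
         1 = " 1: Medicare"
         2 = " 2: Medicaid"
         3 = " 3: Private insurance"
         4 = " 4: Self-pay"
         5 = " 5: No charge"
         6 = " 6: Other"
         . = " .: Missing"
        .A = ".A: Invalid"
        .B = ".B: Unavailable from source"
        ;
run;
```

To use it, PAY1 would also need to be kept when reading the Encounter file, and the SURVEYMEANS step would specify PAY1 in the DOMAIN statement with the format applied (format PAY1 FPAYER.).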

The second section of this example SAS code involves two procedures that sort the NASS Encounter File by HOSP_NASS and KEY_NASS and the NASS Hospital File by HOSP_NASS. KEEP= data set options in these procedures limit the Encounter and Hospital files to only those data elements necessary for linkage, weighting the data, and adding the stratum and RACE fields.

The third section in this code employs a SAS data step that creates a temporary file called NASS by using the MERGE statement to join the NASS Encounter File with the NASS Hospital File to obtain the field NASS_STRATUM.

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:

  • The CLUSTER statement, which includes the NASS hospital identifier or data element HOSP_NASS.
  • The STRATA statement, which includes the NASS stratum identifier or data element NASS_STRATUM.
  • The DOMAIN and FORMAT statements, which are specific to this analysis because it requests national estimates by data element RACE.
  • The WEIGHT statement, which includes the NASS encounter weight or data element DISCWT.
  • The VAR statement, which includes the NASS record identifier or data element KEY_NASS.


Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  61                  
                             Number of Clusters              2958                  
                             Number of Observations       8994101
                             Sum of Weights            11880487.3


                                           Statistics                               

                   Variable        Label                    Sum of Weights
                   -------------------------------------------------------
                   KEY_NASS        NASS record number             11880487
 

Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for RACE Domains                                 

                                                                              Sum of
    RACE                        Variable         Label                       Weights
    --------------------------------------------------------------------------------
     .: Missing                 KEY_NASS         NASS record number           343473
    .A: Invalid                 KEY_NASS         NASS record number       188.485640
     1: White                   KEY_NASS         NASS record number          8425840
     2: Black                   KEY_NASS         NASS record number          1101118
     3: Hispanic                KEY_NASS         NASS record number          1247939
     4: Asian/Pacific Islander  KEY_NASS         NASS record number           318776
     5: Native American         KEY_NASS         NASS record number            66099
     6: Other                   KEY_NASS         NASS record number           377055
    --------------------------------------------------------------------------------
 

The output for this example SAS code provides the weighted record counts for RACE in the 2019 NASS:

  • Missing: 343,473
  • Invalid: 188
  • White: 8,425,840
  • Black: 1,101,118
  • Hispanic: 1,247,939
  • Asian/Pacific Islander: 318,776
  • Native American: 66,099
  • Other: 377,055

Example SAS Code for Producing National Estimates for Arthroplasty of Knee

This example SAS code identifies the number of weighted records in the 2019 NASS with any-listed procedure of knee arthroplasty, which is based on the HCUP Clinical Classifications Software (CCS) For Services and Procedures category 152.


Title "Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;

Title2 "Add NASS_STRATUM from Hospital file";
Proc Sort Data=NASS2019.NASS_2019_encounter (keep=HOSP_NASS KEY_NASS DISCWT CPTCCS1-CPTCCS30) Out=Encounter ;
    By HOSP_NASS KEY_NASS;
Run;

Proc Sort Data=NASS2019.NASS_2019_hospital (keep=HOSP_NASS NASS_STRATUM) Out=Hospital ;
    By HOSP_NASS;
Run;

Title2 "Define Knee Arthroplasty";
Data NASS;
    Merge Encounter (in=inE)
          Hospital (in=inH)
    ;
    by HOSP_NASS;
    if inE;
    /* Stop if an encounter's hospital is missing from the Hospital file */
    if not inH then abort;
    Attrib Knee_Arthroplasty length=3 label='Knee arthroplasty (CPTCCSn=152)';
    array CPTCCS{*} CPTCCS1-CPTCCS30;
    /* Flag the encounter if any listed CCS category equals 152 */
    Knee_Arthroplasty=0;
    do i=1 to dim(CPTCCS) until (Knee_Arthroplasty=1);
        if CPTCCS(i)=152 then Knee_Arthroplasty=1;
    end;
    drop i;
Run;

Title2;
proc surveymeans data=NASS missing sum;
     cluster HOSP_NASS ;
     strata NASS_STRATUM ;
     weight DISCWT ;
	 var Knee_Arthroplasty ;
run;
 

The first section of this example SAS code involves two procedures that sort the NASS Encounter File by HOSP_NASS and KEY_NASS and the NASS Hospital File by HOSP_NASS. KEEP= data set options in these procedures limit the Encounter and Hospital files to only those data elements necessary for linkage, weighting the data, and adding the stratum and CPTCCS fields.

The second section in this example SAS code employs a SAS DATA step that creates a temporary file called NASS. This step includes the following statements:

  • The MERGE statement, which combines the NASS Encounter File with the NASS Hospital File to obtain the field NASS_STRATUM. If a hospital is not found in the NASS Hospital File, the DATA step is aborted. This can be seen in the line with the ABORT statement.
  • The ATTRIB statement, which assigns a length and a label to a new data element (Knee_Arthroplasty) specific to our example analysis. The statements that follow initialize Knee_Arthroplasty to 0 and then loop over CPTCCS1 through CPTCCS30, setting it to 1 when any of them equals the CCS for Services and Procedures category of 152 (NASS data element CPTCCSn=152).
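
Before running the weighted analysis, it can be useful to confirm that the DATA step produced the flag as intended. The following is a minimal sketch of such a check, assuming the temporary file NASS created above; it is not part of the original example, and the counts it prints are unweighted sample records, not national estimates.

```sas
/* Unweighted check of the derived flag; counts are sample records, not national estimates */
proc freq data=NASS;
    tables Knee_Arthroplasty / missing;
run;
```

The flag should take only the values 0 and 1, with no missing values, since it is initialized to 0 for every record.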

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_NASS.
  • The STRATA statement, which includes NASS_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The VAR statement, which includes the value Knee_Arthroplasty, which we defined in the DATA step above.


Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  61                  
                             Number of Clusters              2958                  
                             Number of Observations       8994101
                             Sum of Weights            11880487.3


                                           Statistics                               

                                                                     Std Error
              Variable             Label                    Sum         of Sum
              ----------------------------------------------------------------
              Knee_Arthroplasty    Knee arthroplasty     301910          10285
                                   (CPTCCSn=152)
 

The output for this example SAS code provides the total number of weighted records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee, which is 301,910.

Example SAS Code for Producing Regional Estimates for Arthroplasty of Knee

This example SAS code produces regional estimates for knee arthroplasty in the 2019 NASS, which is based on the CCS for Services and Procedures category 152.


Title "Produce Regional Estimate of Encounters With Any-Listed Knee Arthroplasty Procedure from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;
     
proc format;
     Value St_Regn 
         1 = "1: Northeast"
         2 = "2: Midwest"
         3 = "3: South"
         4 = "4: West"
     ;
run;

Title2 "Add NASS_STRATUM and Region from Hospital file";
Proc Sort Data=NASS2019.NASS_2019_encounter (keep=HOSP_NASS KEY_NASS DISCWT CPTCCS1-CPTCCS30) Out=Encounter ;
    By HOSP_NASS KEY_NASS;
Run;

Proc Sort Data=NASS2019.NASS_2019_hospital (keep=HOSP_NASS NASS_STRATUM HOSP_REGION) Out=Hospital ;
    By HOSP_NASS;
Run;

Title2 "Define Knee Arthroplasty";
Data NASS;
    Merge Encounter (in=inE)
          Hospital (in=inH)
    ;
    by HOSP_NASS;
    if inE;
    /* Stop if an encounter's hospital is missing from the Hospital file */
    if not inH then abort;
    Attrib Knee_Arthroplasty length=3 label='Knee arthroplasty (CPTCCSn=152)';
    array CPTCCS{*} CPTCCS1-CPTCCS30;
    /* Flag the encounter if any listed CCS category equals 152 */
    Knee_Arthroplasty=0;
    do i=1 to dim(CPTCCS) until (Knee_Arthroplasty=1);
        if CPTCCS(i)=152 then Knee_Arthroplasty=1;
    end;
    drop i;
Run;
Title2;

proc surveymeans data=NASS missing sum ;
    cluster HOSP_NASS ;
    strata NASS_STRATUM ;
	weight DISCWT ;
    domain HOSP_REGION ;
	format HOSP_REGION st_regn. ;
	var Knee_Arthroplasty ;
run;

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:

  • Numeric value 1 for Northeast
  • Numeric value 2 for Midwest
  • Numeric value 3 for South
  • Numeric value 4 for West

The second section in the example SAS code involves two procedures that sort the NASS Encounter File by HOSP_NASS and KEY_NASS and the NASS Hospital File by HOSP_NASS, keeping only the essential data elements from each file.

The third section in this example SAS code employs a SAS DATA step that combines the NASS Encounter File with the NASS Hospital File to obtain the fields NASS_STRATUM and HOSP_REGION. If a hospital is not found in the NASS Hospital File, the DATA step is aborted. This can be seen in the line with the ABORT statement. The ATTRIB statement assigns a length and a label to a new data element (Knee_Arthroplasty) specific to our example analysis. The statements that follow initialize Knee_Arthroplasty to 0 and then loop over CPTCCS1 through CPTCCS30, setting it to 1 when any-listed CCS for Services and Procedures category equals 152 (NASS data element CPTCCSn=152).

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:

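  • The CLUSTER statement, which includes HOSP_NASS.
  • The STRATA statement, which includes NASS_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The DOMAIN and FORMAT statements, which request estimates by data element HOSP_REGION and apply the region labels defined in the PROC FORMAT.
  • The VAR statement, which includes the value Knee_Arthroplasty, which we defined in the DATA step above.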


Produce Regional Estimate of Encounters With Any-Listed Knee Arthroplasty Procedure from 2019 NASS File (Weighted)     
                                                                                                         
                                              The SURVEYMEANS Procedure                                        
                                                                                                         
                                         Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                                             Std Error
           HOSP_REGION    Variable               Label                         Sum              of Sum
           -------------------------------------------------------------------------------------------
           1: Northeast   Knee_Arthroplasty      Knee arthroplasty           35372         3981.738273
                                                 (CPTCCSn=152)
           2: Midwest     Knee_Arthroplasty      Knee arthroplasty           77946         4121.714968
                                                 (CPTCCSn=152)
           3: South       Knee_Arthroplasty      Knee arthroplasty          117928         6265.562914
                                                 (CPTCCSn=152)
           4: West        Knee_Arthroplasty      Knee arthroplasty           70664         5804.339212
                                                 (CPTCCSn=152)
           -------------------------------------------------------------------------------------------
 

The output for this example SAS code provides the total number of weighted records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee, by hospital region:

  • Northeast: 35,372
  • Midwest: 77,946
  • South: 117,928
  • West: 70,664
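
To make these values easier to compare with published tables, the domain estimates can be captured as a data set and printed with thousands separators. The following is a minimal sketch, assuming the temporary file NASS and the St_Regn format from the example above. The data set name Knee_By_Region is hypothetical, and the ODS table name (Domain) and its column name (Sum) are stated here as assumptions to verify against the PROC SURVEYMEANS documentation for your SAS release.

```sas
/* Capture the domain statistics table as a data set (assumed ODS table name: Domain) */
ods output Domain=Knee_By_Region;
proc surveymeans data=NASS missing sum ;
    cluster HOSP_NASS ;
    strata NASS_STRATUM ;
    weight DISCWT ;
    domain HOSP_REGION ;
    format HOSP_REGION st_regn. ;
    var Knee_Arthroplasty ;
run;

/* Print the weighted counts rounded to whole encounters with thousands separators */
proc print data=Knee_By_Region noobs;
    var HOSP_REGION Sum;
    format Sum comma12.0;
run;
```

As a quick consistency check, the four regional counts listed above add up to the national estimate of 301,910 from the previous example.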

Return to Contents

Module 3: Nationwide Ambulatory Surgery Sample (NASS), Validating National and Regional Estimates

There are two resources that can be used to validate national and regional estimates for the NASS.

  • The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
  • The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-10-CM codes (individually and grouped by clinical category) and CPT codes grouped by Clinical Classifications Software (CCS) for the NASS. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.

HCUP Summary Statistics

Here is the output from our first example analysis, which produced national estimates by patient race and ethnicity, or data element RACE, from the 2019 NASS. We have separate weighted counts for each RACE value as well as for missing and invalid values.


Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for RACE Domains                                 

                                                                              Sum of
    RACE                        Variable         Label                       Weights
    --------------------------------------------------------------------------------
     .: Missing                 KEY_NASS         NASS record number           343473
    .A: Invalid                 KEY_NASS         NASS record number       188.485640
     1: White                   KEY_NASS         NASS record number          8425840
     2: Black                   KEY_NASS         NASS record number          1101118
     3: Hispanic                KEY_NASS         NASS record number          1247939
     4: Asian/Pacific Islander  KEY_NASS         NASS record number           318776
     5: Native American         KEY_NASS         NASS record number            66099
     6: Other                   KEY_NASS         NASS record number           377055
    --------------------------------------------------------------------------------
 

For validation, we are going to compare the output with the 2019 NASS Summary Statistics.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NASS Database Documentation.

The NASS Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.

The NASS Summary Statistics include all years of the NASS. We will scroll down to the section specific to data year 2019. Our data element of interest, RACE, is in the NASS Encounter File, which means we will want to select the Summary Statistics for the NASS Encounter File and, specifically, the file that provides weighted estimates (i.e., 2019 NASS Encounter File, Weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element RACE. We can do this easily by searching for this data element name within the downloaded PDF.

NASS Summary Statistics 2019 Weighted Frequency Distribution for RACE

RACE | Frequency | Percent of Total
.: Missing | 343,473 | 2.89%
.A: Invalid | 188 | 0.00%
1: White | 8,425,840 | 70.92%
2: Black | 1,101,118 | 9.27%
3: Hispanic | 1,247,939 | 10.50%
4: Asian/Pacific Islander | 318,776 | 2.68%
5: Native American | 66,099 | 0.56%
6: Other | 377,055 | 3.17%


Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)         

                                The SURVEYMEANS Procedure                        

                               Statistics for RACE Domains                                 

                                                                              Sum of
    RACE                        Variable         Label                       Weights
    --------------------------------------------------------------------------------
     .: Missing                 KEY_NASS         NASS record number           343473
    .A: Invalid                 KEY_NASS         NASS record number       188.485640
     1: White                   KEY_NASS         NASS record number          8425840
     2: Black                   KEY_NASS         NASS record number          1101118
     3: Hispanic                KEY_NASS         NASS record number          1247939
     4: Asian/Pacific Islander  KEY_NASS         NASS record number           318776
     5: Native American         KEY_NASS         NASS record number            66099
     6: Other                   KEY_NASS         NASS record number           377055
    --------------------------------------------------------------------------------
 

A comparison of the RACE frequency from the 2019 NASS Weighted Encounter Summary Statistics and the output from SAS demonstrates that our results match.

Diagnosis and Procedure Frequency Tables

Here is the output from our second example analysis, which produced national estimates for records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee.


Produce National Estimate of Encounters With Any-Level Knee Arthroplasty Procedures from 2019 NASS File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  61                  
                             Number of Clusters              2958                  
                             Number of Observations       8994101
                             Sum of Weights            11880487.3


                                           Statistics                               

                                                                     Std Error
              Variable             Label                    Sum         of Sum
              ----------------------------------------------------------------
              Knee_Arthroplasty    Knee_Arthroplasty     301910          10285
                                   (CPTCCSn=152)
 

For validation, we are going to compare the output with the NASS Diagnosis and Procedure Frequency Tables.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NASS Database Documentation.

The NASS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.

Once the file has been downloaded, we will navigate to the tab T.3_By_CPTCCS_Category, which includes the unweighted and weighted number of major ambulatory surgery encounters by individual CCS for Services and Procedures category. We will then navigate to the row for CCS for Services and Procedures category 152, Arthroplasty of knee, and scroll over to the columns that are specific to the 2019 NASS. Note that you can filter to CCS for Services and Procedures category 152 by using either Column A or Column B. We are now ready to compare the values with the output from SAS.

Table 3. Weighted and Unweighted Number of Records by Clinical Classifications Software for CPT Codes by Clinical Classifications Software (CCS) for Services and Procedures Category
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Nationwide Ambulatory Surgery Sample (NASS), 2016-2019

Note: Unduplicated means that if two or more procedures on the encounter record mapped to the same CCS category, the record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size. Blank cells indicate that the CCS category was not in scope for the data year.
CCS for Services and Procedures Category | CCS for Services and Procedures Description | 2019 NASS: Weighted N for CPT1 CCS | 2019 NASS: Weighted N for All-Listed CCS (Unduplicated) | 2019 NASS: Unweighted N for CPT1 CCS | 2019 NASS: Unweighted N for All-Listed CCS (Unduplicated)
152 | 152: Arthroplasty | 294,917 | **301,910 | 220,773 | 225,866



Produce National Estimate of Encounters With Any-Level Knee Arthroplasty Procedures from 2019 NASS File (Weighted)         

                                   The SURVEYMEANS Procedure                        

                                          Data Summary                              

                             Number of Strata                  61                  
                             Number of Clusters              2958                  
                             Number of Observations       8994101
                             Sum of Weights            11880487.3


                                           Statistics                               

                                                                     Std Error
              Variable             Label                    Sum         of Sum
              ----------------------------------------------------------------
              Knee_Arthroplasty    Knee_Arthroplasty   **301910          10285
                                   (CPTCCSn=152)
 

Comparing the weighted count from the frequency table for records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee, with the output from SAS (matching values denoted by **) demonstrates that our results match.



Module 3: Nationwide Ambulatory Surgery Sample (NASS)

You have completed Module 3, Nationwide Ambulatory Surgery Sample (NASS)!

For any questions about the NASS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:

  • Email: hcup@ahrq.gov
  • Phone: 866-290-HCUP (4287) (toll free)
  • International users, please contact HCUP User Support by email.

The staff reviews messages daily and usually responds to inquiries within 3 business days.



Module 4: Nationwide Emergency Department Sample (NEDS)

The Nationwide Emergency Department Sample (NEDS) can be used to produce national estimates of emergency department (ED) visits across the country. The NEDS includes both ED visits that result in admission to the hospital and those that do not.

Information on the NEDS is organized into the four sections below:

  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates

Additional information about the NEDS is available on the NEDS Database Documentation page on the HCUP User Support (HCUP-US) website.




Module 4: Nationwide Emergency Department Sample (NEDS), Overview of the NEDS

One of the most distinctive features of the NEDS is its large sample size, which allows for analysis across hospital types and the study of relatively uncommon disorders and procedures. Unweighted, the NEDS contains data from 33 million ED visits from nearly 1,000 hospital-owned EDs. Weighted, the NEDS represents 143 million ED visits.

The NEDS is sampled from the HCUP State Emergency Department Databases (SEDD) and State Inpatient Databases (SID). The SEDD capture information on ED visits that do not result in an admission (e.g., treat-and-release visits and transfers to another hospital). The SID contain information on patients initially seen in the ED and then admitted to the same hospital. The NEDS is available annually beginning with data year 2006.

Additional information on the sample design of the NEDS is available in the NEDS Introduction and the HCUP Sample Design tutorial.




Module 4: Nationwide Emergency Department Sample (NEDS), NEDS Data Elements Discharge Weight and Hospital Weights

NEDS Data Element Discharge Weight

To produce nationally or regionally representative estimates, the NEDS data must be weighted.

When the discharge weight (DISCWT) is applied to NEDS discharge-level data, the result is an estimate of the number of ED visits for the target universe. When the hospital weight (HOSPWT) is applied to hospital-level data, the result is an estimate of the number of EDs in the target universe. The target universe covers all ED visits in facilities owned by community hospitals in the United States, excluding rehabilitation hospitals. As defined by the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.

Weights are calculated after hospitals from the SEDD and SID have been stratified and sampled. Hospitals are stratified (grouped) based on five hospital characteristics: (1) ownership/control, (2) teaching status, (3) urban/rural location, (4) trauma center designation, and (5) location in the four U.S. Census regions. Within each stratum, 20% of hospitals in the target universe are sampled from the combined SEDD and SID. The number of hospitals in the target universe stratum is determined from the American Hospital Association (AHA) Annual Survey data for all States and hospitals, including those without data in the SEDD or SID.

Then the NEDS discharge weight is calculated by dividing the number of ED visits in the target universe by the number of ED visits in the sampled hospitals within each stratum.
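The division described above can be illustrated with a small worked sketch. The stratum counts below are hypothetical values chosen only for illustration, not actual NEDS or AHA figures, and the data set and variable names (stratum_example, universe_visits, sample_visits, discwt_example) are assumptions rather than NEDS data elements.

```sas
/* Hypothetical stratum counts for illustration only; not actual NEDS values */
data stratum_example;
    input NEDS_STRATUM universe_visits sample_visits;
    /* Discharge weight = ED visits in the target universe
       divided by ED visits in the sampled hospitals, within each stratum */
    discwt_example = universe_visits / sample_visits;
    datalines;
1 500000 100000
2 240000 60000
;
run;

proc print data=stratum_example noobs;
run;
```

In this sketch, each sampled ED visit in the first stratum would represent 5 ED visits in the target universe, and each sampled ED visit in the second stratum would represent 4.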

NEDS Hospital Weight

To produce hospital-level estimates, such as the number of hospital-owned EDs in the United States located in a metropolitan area, you need to apply a hospital weight (HOSPWT) to the data. HOSPWT is assigned to each hospital within the NEDS Hospital File, with the value varying across records.

HOSPWT is also calculated according to the NEDS strata of ownership/control, teaching status, urban/rural location, trauma center designation, and the four U.S. census regions. The number of hospital-owned EDs in the target universe is determined from the AHA data for all States and hospitals, including those without data in the SEDD or SID.

NEDS hospital weights are calculated by dividing the number of hospital-owned EDs in the target universe by the number of sampled hospital-owned EDs within each stratum.
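Because HOSPWT is the ratio of universe EDs to sampled EDs within each stratum, summing HOSPWT across all records in the NEDS Hospital File should give an estimate of the number of hospital-owned EDs in the target universe. The following is a minimal sketch of that check. It assumes the same library path and Hospital File name used in the example code in this module, and it is intended as a quick sanity check rather than a design-based estimate.

```sas
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;

/* Count the sampled hospital-owned EDs (N) and sum their hospital weights (SUM) */
proc means data=neds2019.neds_2019_hospital n sum;
    var HOSPWT;
run;
```

The N statistic is the number of sampled hospital-owned EDs, and the SUM statistic is the weighted number of hospital-owned EDs that those records represent.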




Module 4: Nationwide Emergency Department Sample (NEDS), SAS Code Examples

Example SAS Code for Producing National Estimates by Type of ED Visit

This example SAS code produces national estimates by the type of ED visit or source of the ED record (data element HCUPFILE) in the 2019 NEDS.


Title "Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)";
Libname NEDS2019  "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;

proc surveymeans data=neds2019.neds_2019_core missing sumwgt ;
     cluster HOSP_ED ;
     strata NEDS_STRATUM ;
     domain HCUPFILE ;
     weight DISCWT ;
     var KEY_ED ;
run;
 

This example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:

  • The CLUSTER statement, which includes the NEDS hospital identifier or data element HOSP_ED.
  • The STRATA statement, which includes the NEDS stratum identifier or data element NEDS_STRATUM.
  • The DOMAIN statement, which is specific to this analysis and produces national estimates by data element HCUPFILE.
  • The WEIGHT statement, which includes the NEDS discharge weight or data element DISCWT.
  • The VAR statement, which includes the NEDS record identifier or data element KEY_ED.


Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)         

                     The SURVEYMEANS Procedure                        

                           Data Summary                              

                Number of Strata                141                  
                Number of Clusters              989                  
                Number of Observations     33147251
                Sum of Weights            143432284


                            Statistics                               

   Variable      Label                              Sum of Weights
   ---------------------------------------------------------------
   KEY_ED        HCUP NEDS record identifier             143432284
 

  Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)         

                         The SURVEYMEANS Procedure                        

                      Statistics for HCUPFILE domains                                 

                                                                   Sum of
HCUPFILE     Variable       Label                                 Weights
-------------------------------------------------------------------------
SEDD        KEY_ED         HCUP NEDS record identifier          123058750
SID         KEY_ED         HCUP NEDS record identifier           20373534
-------------------------------------------------------------------------
 

The output for this example SAS code provides the total number of weighted records in the 2019 NEDS, which is 143,432,284, as well as the total number of weighted records for HCUPFILE:

  • SEDD: 123,058,750
  • SID: 20,373,534

Example SAS Code for Producing National Estimates for Urinary Tract Infection

This example SAS code identifies the number of weighted records in the 2019 NEDS with a principal or first-listed diagnosis of urinary tract infection (UTI), which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, GEN004 (Urinary tract infection).



Title "Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)";
Libname NEDS2019  "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;

data neds;
    merge neds2019.neds_2019_core (keep=HOSP_ED NEDS_STRATUM KEY_ED DISCWT)
          neds2019.neds_2019_dx_pr_grps (keep=HOSP_ED KEY_ED DXCCSR_Default_DX1)
    ;
    by HOSP_ED KEY_ED;
    Attrib UTI length=3 label='Urinary Tract Infection (Default CCSR=GEN004)';
    UTI=(DXCCSR_Default_DX1='GEN004');
run;

proc surveymeans data=neds sum mean nomcar;
     cluster HOSP_ED ;
     strata NEDS_STRATUM;
     weight DISCWT;
     var UTI;
run;
 

The first section of this example SAS code includes the DATA step, which identifies records with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis. This step includes the following statements:

  • The MERGE statement, which combines the NEDS Core File with the NEDS Diagnosis and Procedure Groups File. The NEDS Diagnosis and Procedure Groups File includes the default CCSR category for the principal or first-listed diagnosis, or data element DXCCSR_Default_DX1.
  • The KEEP statements, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and DXCCSR_Default_DX1.
  • The ATTRIB statement, which assigns a length and a label to a new data element (UTI) specific to our example analysis. The next statement, UTI =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of GEN004 for the principal or first-listed diagnosis (NEDS data element DXCCSR_Default_DX1=GEN004).
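The assignment UTI=(DXCCSR_Default_DX1='GEN004') relies on SAS evaluating a comparison to 1 when it is true and 0 when it is false. The following minimal sketch shows this behavior on a few hypothetical records. The data set name uti_demo and the non-UTI category values are placeholders chosen for illustration, not actual NEDS data.

```sas
/* Hypothetical records for illustration only; not actual NEDS data */
data uti_demo;
    length DXCCSR_Default_DX1 $6;
    input DXCCSR_Default_DX1 $;
    /* The comparison in parentheses returns 1 when true and 0 when false */
    UTI = (DXCCSR_Default_DX1 = 'GEN004');
    datalines;
GEN004
RSP002
GEN004
INJ001
;
run;

proc print data=uti_demo noobs;
run;
```

Because UTI is coded as 1 or 0, its weighted sum estimates the number of ED visits with the diagnosis, and its weighted mean estimates the proportion of ED visits with the diagnosis.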

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_ED.
  • The STRATA statement, which includes NEDS_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The VAR statement, which includes the value, UTI, that we defined in the DATA step above.


Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)         

                                                   The SURVEYMEANS Procedure                        

                                                          Data Summary                              

                                           Number of Strata                  141                  
                                           Number of Clusters                989                  
                                           Number of Observations       33147251
                                           Sum of Weights              143432284
							 
							 
                                                      Variance Estimation                              

                                           Method                  Taylor Series                  
                                           Missing Values                 NOMCAR 


                                                          Statistics                               

                                                                      Std Error                       Std Error
        Variable        Label                               Mean        of Mean            Sum           of Sum
        -------------------------------------------------------------------------------------------------------
        UTI             Urinary Tract Infection         0.025498       0.000305        3657277            85457
                        (Default CCSR=GEN004)
 

The output for this example SAS code provides the total number of weighted records in the 2019 NEDS with a default CCSR category for the principal or first-listed diagnosis of GEN004, Urinary tract infection, which is 3,657,277.
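The standard error reported alongside the weighted sum can be used to gauge the precision of the estimate. The sketch below computes a relative standard error and approximate 95 percent limits from the two values in the output above. The normal approximation (multiplier of 1.96) is an assumption made for this illustration, and the variable names are hypothetical.

```sas
/* Values copied from the SURVEYMEANS output above */
data _null_;
    estimate = 3657277;   /* weighted sum for UTI */
    std_err  = 85457;     /* standard error of the sum */
    /* Relative standard error, expressed as a percentage */
    rse_pct = 100 * std_err / estimate;
    /* Approximate 95% limits, assuming a normal approximation (z = 1.96) */
    lower = estimate - 1.96 * std_err;
    upper = estimate + 1.96 * std_err;
    put rse_pct= 8.2 lower= comma12. upper= comma12.;
run;
```

Under these assumptions, the relative standard error is a little over 2 percent, and the approximate limits run from roughly 3.49 million to 3.82 million ED visits. PROC SURVEYMEANS can also report confidence limits for a sum directly through the CLSUM statistic keyword, which uses the design-based degrees of freedom instead of the normal approximation.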

Example SAS Code for Producing Regional Estimates for Urinary Tract Infection

The example SAS code below produces regional estimates for records in the 2019 NEDS with a principal or first-listed diagnosis of UTI (default CCSR category GEN004).


Title "Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)";
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
     
proc format;
     Value St_Regn 
         1 = "1: Northeast"
         2 = "2: Midwest"
         3 = "3: South"
         4 = "4: West"
     ;
run;

data neds;
    merge neds2019.neds_2019_core (keep=HOSP_ED NEDS_STRATUM KEY_ED DISCWT)
          neds2019.neds_2019_dx_pr_grps (keep=HOSP_ED KEY_ED DXCCSR_Default_DX1)
    ;
    by HOSP_ED KEY_ED;
    Attrib UTI length=3 label='Urinary Tract Infection (Default CCSR=GEN004)';
    UTI=(DXCCSR_Default_DX1='GEN004');
    /* look up region */
    if _n_=1 then do;
        if 0 then set neds2019.neds_2019_hospital (keep=HOSP_REGION); %* initiates the variable;
        declare hash h (dataset: "neds2019.neds_2019_hospital");
        h.defineKey('HOSP_ED');
        h.defineData('HOSP_REGION');
        h.defineDone();
    end;
    if h.find() ne 0 then abort; %* all discharges should have a matching hospital record;
    format HOSP_REGION st_regn.;
run;

proc surveymeans data=neds sum mean nomcar ;
    cluster HOSP_ED ;
    strata NEDS_STRATUM ;
    domain HOSP_REGION ;
    weight DISCWT ;
    var UTI ;
run;

The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:

  • Numeric value 1 for Northeast
  • Numeric value 2 for Midwest
  • Numeric value 3 for South
  • Numeric value 4 for West

The second section includes the DATA step, which includes the following statements:

  • The MERGE statement, which links the NEDS Core File with the NEDS Diagnosis and Procedure Groups File keeping essential data elements from each file.
    • For this specific example, there is an additional step that uses a hash object keyed on HOSP_ED to look up the data element HOSP_REGION, which resides in the NEDS Hospital File.
  • The ATTRIB statement, which assigns a length and a label to a new data element (UTI) specific to our example analysis. The next statement, UTI =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of GEN004 for the principal or first-listed diagnosis (NEDS data element DXCCSR_Default_DX1=GEN004).
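The hash lookup in the DATA step is one way to attach HOSP_REGION to each record. As an alternative sketch, and not the approach used in this tutorial, the same data element could be attached with a PROC SQL join on HOSP_ED. The output data set name neds_region and the table aliases c and h are assumptions made for this illustration.

```sas
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;

/* Alternative sketch: attach HOSP_REGION with an SQL join instead of a hash lookup */
proc sql;
    create table neds_region as
    select c.HOSP_ED, c.KEY_ED, c.NEDS_STRATUM, c.DISCWT, h.HOSP_REGION
    from neds2019.neds_2019_core as c
         inner join neds2019.neds_2019_hospital as h
         on c.HOSP_ED = h.HOSP_ED;
quit;
```

Two differences are worth noting. An inner join silently drops any record without a matching hospital, whereas the DATA step in the tutorial aborts in that situation. PROC SQL also does not guarantee the order of the output rows, so the result would need to be sorted by HOSP_ED and KEY_ED before it is used in a MERGE with a BY statement.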

The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_ED.
  • The STRATA statement, which includes NEDS_STRATUM.
  • The WEIGHT statement, which includes data element DISCWT.
  • The VAR statement, which includes the value, UTI, which we defined in the DATA step above.
  • The DOMAIN statement, which is specific to HOSP_REGION as we are interested in regional estimates.


Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)     
                                                                                                         
                                                  The SURVEYMEANS Procedure                                        
                                                                                                         
                                             Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                          Std Error                       Std Error
HOSP_REGION    Variable     Label                             Mean          of Mean           Sum            of Sum
-------------------------------------------------------------------------------------------------------------------
1: Northeast   UTI          Urinary Tract Infection       0.021364         0.000509        554359             34012
                            (Default CCSR=GEN004)
2: Midwest     UTI          Urinary Tract Infection       0.024167         0.000529        776972             39084
                            (Default CCSR=GEN004)
3: South       UTI          Urinary Tract Infection       0.027824         0.000614       1623836             58585
                            (Default CCSR=GEN004)
4: West        UTI          Urinary Tract Infection       0.026030         0.000496        702110             34442
                            (Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
 

The output for this example SAS code provides the total number of weighted records in the 2019 NEDS with a default CCSR category of GEN004, UTI, by hospital region:

  • Northeast: 554,359
  • Midwest: 776,972
  • South: 1,623,836
  • West: 702,110
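One quick consistency check is to confirm that the four regional estimates add up to the national estimate of 3,657,277 produced in the previous example. The sketch below uses the values reported in the two outputs. The variable names are hypothetical.

```sas
/* Values copied from the regional and national SURVEYMEANS outputs */
data _null_;
    regional_total = sum(554359, 776972, 1623836, 702110);
    national_total = 3657277;
    put regional_total= national_total=;
    if regional_total = national_total then
        put "Regional estimates sum to the national estimate.";
    else
        put "Regional estimates do not sum to the national estimate.";
run;
```

The regional estimates sum to 3,657,277, which matches the national estimate.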


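Example SAS Code for Producing National Estimates of Hospital-Owned EDs With a Trauma Center Designation

This example SAS code produces a national estimate of the number of hospital-owned EDs with any trauma center designation (data element HOSP_TRAUMA greater than 0) in the 2019 NEDS.


Title "Produce National Estimate of Hospitals with HOSP_TRAUMA>0 from 2019 NEDS File (Weighted)";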
Libname NEDS2019  "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;

data neds;
    set neds2019.neds_2019_hospital;
    Attrib Trauma_Hosp length=3 label='Trauma Hospital';
    Trauma_Hosp=(HOSP_TRAUMA>0);
run;

proc surveymeans data=neds sum mean nomcar;
     cluster HOSP_ED ;
     strata NEDS_STRATUM;
     weight HOSPWT;
     var Trauma_Hosp;
run;
 

The first section of this example SAS code includes the DATA step. Included in this step is the ATTRIB statement, which assigns a length and a label to a new data element (Trauma_Hosp) specific to our example analysis. The next statement, Trauma_Hosp =, assigns a value to this new data element, which in our example, is defined based on any trauma center designation; therefore, we are interested in a value greater than 0 for the NEDS data element HOSP_TRAUMA.

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:

  • The CLUSTER statement, which includes HOSP_ED.
  • The STRATA statement, which includes NEDS_STRATUM.
  • The WEIGHT statement, which includes the NEDS hospital weight or data element HOSPWT given that we are interested in national estimates of hospital-owned EDs.
  • The VAR statement, which includes the value, Trauma_Hosp, that we defined in the DATA step above.


   Produce National Estimate of Hospitals with HOSP_TRAUMA>0 from 2019 NEDS File (Weighted)         

                                    The SURVEYMEANS Procedure                        

                                           Data Summary                              

                              Number of Strata                  141                  
                              Number of Clusters                989                  
                              Number of Observations            989
                              Sum of Weights                   4549


                                       Variance Estimation                              

                              Method                  Taylor Series                  
                              Missing Values                 NOMCAR 


                                            Statistics                               

                                                  Std Error                           Std Error
Variable        Label                   Mean        of Mean              Sum             of Sum
-----------------------------------------------------------------------------------------------
Trauma_Hosp     Trauma Hospital     0.239613              0      1090.000000       3.335209E-15
 

The output for this example SAS code includes the number of hospital-owned EDs in the United States designated as a trauma center in the 2019 NEDS, which is 1,090.



Module 4: Nationwide Emergency Department Sample (NEDS), Validating National and Regional Estimates

There are three resources that can be used to validate national and regional estimates for the NEDS.

  • The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
  • The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
  • HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.

HCUP Summary Statistics

Here is output from our first example analysis, which produced national estimates by source of the ED record, or data element HCUPFILE, from the 2019 NEDS. We have separate weighted counts for each of the two HCUPFILE values.


  Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)         

                         The SURVEYMEANS Procedure                        

                      Statistics for HCUPFILE domains                                 

                                                                   Sum of
HCUPFILE     Variable       Label                                 Weights
-------------------------------------------------------------------------
SEDD        KEY_ED         HCUP NEDS record identifier          123058750
SID         KEY_ED         HCUP NEDS record identifier           20373534
-------------------------------------------------------------------------
 

For validation, we are going to compare the output with the 2019 NEDS Summary Statistics.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NEDS Database Documentation.

The NEDS Summary Statistics page includes all years of the NEDS. We will scroll down to the section specific to data year 2019. Our data element of interest, HCUPFILE, is in the NEDS Core File, which means we will want to select the Summary Statistics for the NEDS Core File and, specifically, the file that provides weighted estimates (i.e., 2019 NEDS Core File, weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element, HCUPFILE. We can do this easily by searching for this data element name within the downloaded PDF.

HCUP Weighted Summary Statistics Report: NEDS 2019 Core File Weighted Frequency Distribution for HCUPFILE
HCUPFILE Frequency Percent
SEDD 123,058,750 85.80%
SID 20,373,534 14.20%


  Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)         

                         The SURVEYMEANS Procedure                        

                      Statistics for HCUPFILE domains                                 

                                                                   Sum of
HCUPFILE     Variable       Label                                 Weights
-------------------------------------------------------------------------
SEDD        KEY_ED         HCUP NEDS record identifier          123058750
SID         KEY_ED         HCUP NEDS record identifier           20373534
-------------------------------------------------------------------------
 

A comparison of the HCUPFILE frequency from the 2019 NEDS Weighted Core Summary Statistics and the output from SAS demonstrates that our results match.
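The weighted frequency and percent shown in the Summary Statistics can also be reproduced with a weighted frequency table. The following is a minimal sketch that assumes the same library path and Core File name used in the example code in this module. PROC FREQ with a WEIGHT statement sums the weights within each category, so it is suitable for checking point estimates only.

```sas
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;

/* Weighted frequency and percent of ED visits by source of the ED record */
proc freq data=neds2019.neds_2019_core;
    tables HCUPFILE / missing nocum;
    weight DISCWT;
run;
```

Because PROC FREQ does not account for the strata and clusters in the NEDS sample design, standard errors should still be taken from the SURVEYMEANS procedure.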

Diagnosis and Procedure Frequency Tables

Here is output from our second example analysis, which produced national estimates for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis.


Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)         

                                                   The SURVEYMEANS Procedure                        

                                                          Data Summary                              

                                           Number of Strata                  141                  
                                           Number of Clusters                989                  
                                           Number of Observations       33147251
                                           Sum of Weights              143432284
							 

                                                      Variance Estimation                              

                                           Method                  Taylor Series                  
                                           Missing Values                 NOMCAR 


                                                          Statistics                               

                                                                      Std Error                       Std Error
        Variable        Label                               Mean        of Mean            Sum           of Sum
        -------------------------------------------------------------------------------------------------------
        UTI             Urinary Tract Infection         0.025498       0.000305        3657277            85457
                        (Default CCSR=GEN004)
 

For validation, we are going to compare the output with the NEDS Diagnosis and Procedure Frequency Tables.

From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NEDS Database Documentation.

The NEDS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.

Once the file has been downloaded, we will navigate to the tab T.1_By_DXCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-CM diagnosis categories for all ED visits. Note that if you wish to obtain counts separately for treat-and-release ED visits or ED visits that result in admission to the same hospital, you will need to use the two subsequent tabs that end in "TandR" or "EDadmit." We will then navigate to the row for CCSR category GEN004, Urinary tract infection, and scroll over to the columns that are specific to the 2019 NEDS. Note that you can filter to GEN004 using either Column A or Column B.

Table 1. Weighted and Unweighted Number of Records (All Emergency Department Visits) by Clinical Classifications Software Refined (CCSR) for ICD-10-CM Diagnoses, v2021.2
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Nationwide Emergency Department Sample (NEDS), 2016-2019

Note: Counts for all-listed diagnoses include all possible CCSR category assignments. Unduplicated means that if two or more diagnosis codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.
CCSR for         CCSR Description,                 2019 NEDS-All:     2019 NEDS-All:     2019 NEDS-All:     2019 NEDS-All:
ICD-10-CM        v2021.2                           Weighted N for     Weighted N for     Unweighted N for   Unweighted N for
Category,                                          DX1 CCSR Default   All-Listed CCSR    DX1 CCSR Default   All-Listed CCSR
v2021.2                                                               (Unduplicated)                        (Unduplicated)
----------------------------------------------------------------------------------------------------------------------------
GEN004           GEN004 Urinary tract infections   **3,657,277        7,963,917          851,106            1,856,213



Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)         

                                                   The SURVEYMEANS Procedure                        

                                                          Data Summary                              

                                           Number of Strata                  141                  
                                           Number of Clusters                989                  
                                           Number of Observations       33147251
                                           Sum of Weights              143432284
							 

                                                      Variance Estimation                              

                                           Method                  Taylor Series                  
                                           Missing Values                 NOMCAR 


                                                          Statistics                               

                                                                      Std Error                       Std Error
        Variable        Label                               Mean        of Mean            Sum           of Sum
        -------------------------------------------------------------------------------------------------------
        UTI             Urinary Tract Infection         0.025498       0.000305      **3657277            85457
                        (Default CCSR=GEN004)
 

A comparison of the weighted count for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis with the output from SAS (denoted by **) demonstrates that our results match.
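
As a quick internal consistency check alongside this comparison, the weighted sum reported by PROC SURVEYMEANS should be close to the weighted mean multiplied by the sum of weights. The following is a minimal sketch and is not part of the original tutorial code. The values are transcribed from the output above, and the variable names (sum_of_weights, implied_sum, and so on) are illustrative assumptions.

```sas
/* Sketch only: values are copied by hand from the SURVEYMEANS output */
/* and the NEDS frequency table shown above.                          */
data _null_;
    sum_of_weights = 143432284;   /* Sum of Weights from the Data Summary            */
    mean_uti       = 0.025498;    /* Mean for UTI (rounded to six decimals)          */
    sas_sum        = 3657277;     /* Sum for UTI from PROC SURVEYMEANS               */
    table_n        = 3657277;     /* Weighted N for DX1 CCSR Default, GEN004         */

    implied_sum = mean_uti * sum_of_weights;  /* differs slightly: the mean is rounded */
    diff_table  = sas_sum - table_n;          /* zero when the two sources agree       */

    put 'Sum implied by mean x sum of weights: ' implied_sum comma12.;
    put 'Difference, SAS sum vs. frequency table: ' diff_table;
run;
```

Because the printed mean is rounded, the implied sum is expected to differ from the reported sum by a small amount; a large gap would suggest that the weight or the analysis variable was specified incorrectly.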


HCUPnet

Here is output from our third example analysis, which produced regional estimates for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis.


Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)     
                                                                                                         
                                                  The SURVEYMEANS Procedure                                        
                                                                                                         
                                             Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                          Std Error                       Std Error
HOSP_REGION    Variable     Label                             Mean          of Mean           Sum            of Sum
-------------------------------------------------------------------------------------------------------------------
1: Northeast   UTI          Urinary Tract Infection       0.021364         0.000509        554359             34012
                            (Default CCSR=GEN004)
2: Midwest     UTI          Urinary Tract Infection       0.024167         0.000529        776972             39084
                            (Default CCSR=GEN004)
3: South       UTI          Urinary Tract Infection       0.027824         0.000614       1623836             58585
                            (Default CCSR=GEN004)
4: West        UTI          Urinary Tract Infection       0.026030         0.000496        702110             34442
                            (Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
 

For validation, we are going to compare the output with HCUPnet.

As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select "Emergency Department Setting." Once selected, we will subsequently expand the option for "National Emergency Department" and select "Diagnoses."

The output will default to displaying trends in the total number of ED visits with a default CCSR category of BLD001, Nutritional anemia, for the principal or first-listed diagnosis. We need to modify the selections on the left-hand side of the screen to align with our analysis.

  1. First, select the option for "Cross-Sectional" analysis.
  2. Next, retain the default data year of "2019" in the "Years" drop-down.
  3. Next, select "Diagnoses—Clinical Classifications Software Refined (CCSR)" in the "Classification Types" drop-down, and retain the "Principal/First-listed" option in the "Principal/First-listed or All-Listed" drop-down.
  4. Next, under the "Diagnoses/Procedures" drop-down, clear the (All) selection so that the query no longer runs on all CCSR categories. Scroll down the list to CCSR category GEN004, Urinary tract infection, or use the search bar, and ensure its box is checked.
  5. Next, retain the default value "ED Visits Resulting in Hospital Admission" in the "Type of ED Visit" drop-down. Note that there is no option for "All ED Visits." As a result, we will need to query the two ED visit types separately and then add the results together to obtain all ED visits. Because each ED visit has only one principal/first-listed diagnosis and falls into only one visit type, no visit is counted twice and the two counts can be summed.
  6. Next, ensure only "Number of ED visits" is selected in the "Outcomes" drop-down.
  7. Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
  8. Next, retain the default option of "All" for the "Characteristic Levels" drop-down.
  9. Last, select the box for "Show 95% CI" to display the standard error of the estimates if you wish to view this information.

A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for ED visits that result in hospital admission in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal diagnosis. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-downs.

Now, we need to go back and select the "Treat-and-release ED visits" option from the "Type of ED Visit" drop-down. The results table will automatically update to reflect this type of ED visit.

Diagnosis: GEN004: Urinary Tract Infections

                    ED Visits Resulting      Treat-and-Release        Sum of ED Visits Resulting in Admission
Characteristic      in Admission             ED Visits                + Treat-and-Release ED Visits
Levels              Number of ED visits      Number of ED visits      Number of ED visits
-------------------------------------------------------------------------------------------------------------
Midwest             97,389               +   679,582              =   776,971
Northeast           92,261               +   462,098              =   554,359
South               205,833              +   1,418,003            =   1,623,836
West                67,451               +   634,659              =   702,110



Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)     
                                                                                                         
                                                  The SURVEYMEANS Procedure                                        
                                                                                                         
                                             Statistics for HOSP_REGION Domains                               
                                                                                                         
                                                                          Std Error                       Std Error
HOSP_REGION    Variable     Label                             Mean          of Mean           Sum            of Sum
-------------------------------------------------------------------------------------------------------------------
1: Northeast   UTI          Urinary Tract Infection       0.021364         0.000509        554359             34012
                            (Default CCSR=GEN004)
2: Midwest     UTI          Urinary Tract Infection       0.024167         0.000529        776972             39084
                            (Default CCSR=GEN004)
3: South       UTI          Urinary Tract Infection       0.027824         0.000614       1623836             58585
                            (Default CCSR=GEN004)
4: West        UTI          Urinary Tract Infection       0.026030         0.000496        702110             34442
                            (Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
 

A comparison of our output from HCUPnet with the output from SAS demonstrates that our results match except for the number of ED visits in the Midwest, which is lower by one in HCUPnet. This is a result of how estimates are rounded in HCUPnet. In some cases, the sum of estimates in HCUPnet for the two ED visit types may differ slightly from estimates obtained directly from the NEDS.
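
The comparison above can be reproduced with a short DATA step. This is a minimal sketch, not part of the original tutorial code: the HCUPnet counts and the SAS sums are typed in by hand from the tables above, and the data set and variable names (hcupnet_check, admitted, treat_release, sas_sum) are illustrative assumptions.

```sas
/* Sketch only: counts are transcribed from the HCUPnet table and the */
/* SURVEYMEANS regional output shown above.                           */
data hcupnet_check;
    length region $9;
    input region $ admitted treat_release sas_sum;
    hcupnet_total = admitted + treat_release;  /* all ED visits per HCUPnet     */
    diff          = sas_sum - hcupnet_total;   /* nonzero values flag rounding  */
    datalines;
Midwest 97389 679582 776972
Northeast 92261 462098 554359
South 205833 1418003 1623836
West 67451 634659 702110
;
run;

proc print data=hcupnet_check noobs;
    format admitted treat_release sas_sum hcupnet_total comma12.;
run;
```

A difference of one visit, as seen for the Midwest, is consistent with each HCUPnet estimate being rounded before the two visit types are added.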

Module 4: Nationwide Emergency Department Sample (NEDS)

You have completed Module 4, Nationwide Emergency Department Sample (NEDS)!

For any questions about the NEDS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:

  • Email: hcup@ahrq.gov
  • Phone: 866-290-HCUP (4287) (toll free)
  • International users, please contact HCUP User Support by email.

The staff reviews messages daily and usually responds to inquiries within 3 business days.




Module 5: Nationwide Readmissions Database (NRD)

The Nationwide Readmissions Database, or NRD, is a unique and powerful database designed to support various types of analyses of national readmissions for all patients regardless of the expected payer for the hospital stay.

Information on the NRD is organized into the four sections below:

  • Overview
  • Weighting the Data
  • SAS Code Examples
  • Validating Estimates

Additional information about the NRD is available on the NRD Database Documentation page on the HCUP User Support (HCUP-US) website.




Module 5: Nationwide Readmissions Database (NRD), Overview of the NRD

The Nationwide Readmissions Database (NRD) is a unique and powerful database designed to support various types of analyses of national readmissions for all patients regardless of the expected payer for the hospital stay. The NRD includes discharges for patients with and without repeat hospital visits in a year, as well as discharges for patients who died in the hospital. Repeat stays may or may not be related. The criteria for determining the relationship between hospital admissions are left to the analyst using the NRD. Unweighted, the NRD contains data from about 18 million discharges each year. Weighted, it estimates roughly 35 million discharges in the United States.

The NRD is drawn from HCUP State Inpatient Databases (SID) containing verified patient linkage numbers that can be used to track a person across hospitals within a State, while adhering to strict privacy guidelines. The NRD is available annually beginning with data year 2010.

Additional information on the sample design of the NRD is available in the NRD Introduction and the HCUP Sample Design tutorial.




Module 5: Nationwide Readmissions Database (NRD), Weighting the Nationwide Readmissions Database (NRD)

To produce nationally representative estimates, the NRD data must be weighted. This can be done using the data element discharge weight (DISCWT), which is available on each record in the NRD, with the value varying across records.

When the discharge weights are applied to the NRD data, the result is an estimate of the number of discharges for the target universe, which includes discharges from all community hospitals in the United States, excluding rehabilitation and long-term acute care hospitals. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.

The NRD is not designed to support regional estimates, because information on U.S. census region is not available.

Weights are developed after discharges sampled from the SID are stratified into counts using five hospital characteristics and two patient characteristics: (1) ownership/control, (2) bed size, (3) teaching status, (4) urban/rural location, (5) the four U.S. census regions, (6) patient age, in groups, and (7) patient sex. Total discharge counts for the target universe are estimated using total discharges from hospitals in the SID and the American Hospital Association (AHA) Survey estimates of discharges (admissions plus births) for hospitals not included in the NRD.

NRD discharge weights are calculated by dividing the number of universe discharges by the number of sampled discharges within each NRD stratum.
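
The weight calculation described above can be illustrated with a small example. This is a minimal sketch using made-up numbers: the strata and discharge counts are hypothetical and do not come from the NRD, and the data set and variable names are illustrative assumptions. The DISCWT values delivered on the NRD are already calculated and should be used as provided.

```sas
/* Sketch only: hypothetical strata and counts, not actual NRD values */
data weight_sketch;
    input stratum universe_discharges sampled_discharges;
    /* Weight = universe discharges / sampled discharges within the stratum */
    discwt_sketch = universe_discharges / sampled_discharges;
    datalines;
1 200000 100000
2 90000 60000
;
run;

proc print data=weight_sketch noobs;
run;
```

In this made-up example, each sampled discharge in stratum 1 would represent 2 discharges in the universe, and each sampled discharge in stratum 2 would represent 1.5.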




Module 5: Nationwide Readmissions Database (NRD), SAS Code Example

As described earlier in this module, the NRD is designed to support readmission analyses. It is not recommended for use in obtaining total national estimates of discharges in the United States because pairs of transfer records are collapsed into a single record in the NRD. For total national estimates of discharges, the National Inpatient Sample (NIS) should be used instead.

These are two critical components of a readmission analysis:

  1. Index event: Initial inpatient stay that indicates the starting point for analyzing repeat hospital stays and that is typically defined by specific inclusion and exclusion criteria.
  2. Readmission: A subsequent inpatient stay within a specified time period; the readmission may be for a specific cause or any cause.

Additional information on defining the index event and readmission is available in the NRD Introduction and the NRD tutorial.
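
To make the two components concrete, the sketch below flags index stays that are followed by another stay for the same patient within 30 days of discharge. It is a simplified illustration under stated assumptions and is not the tutorial's or HCUPnet's method: the four records are hypothetical toy data, the data set names (stays, readmit) and the flag readmit30 are invented for this example, and the gap is computed as the readmission's NRD_daystoevent minus the index stay's NRD_daystoevent plus LOS. A real analysis would also apply the index event and readmission criteria chosen by the analyst.

```sas
/* Sketch only: hypothetical toy records that mimic NRD data elements */
data stays;
    input NRD_visitlink $ KEY_NRD NRD_daystoevent LOS;
    datalines;
A 1 100 5
A 2 120 3
B 3 200 2
B 4 260 4
;
run;

proc sql;
    /* Pair each stay with other stays for the same patient and flag the  */
    /* stay when another admission starts 0 to 30 days after discharge.   */
    create table readmit as
    select i.KEY_NRD as index_key,
           max(case when r.KEY_NRD is not null then 1 else 0 end) as readmit30
    from stays as i
    left join stays as r
      on  i.NRD_visitlink = r.NRD_visitlink
      and i.KEY_NRD ne r.KEY_NRD
      and (r.NRD_daystoevent - (i.NRD_daystoevent + i.LOS)) between 0 and 30
    group by i.KEY_NRD;
quit;

proc print data=readmit noobs;
run;
```

In this toy data, only the first stay for patient A should be flagged, because the second stay begins 15 days after the first discharge; the gap for patient B is 58 days.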

Example SAS Code for Producing National Estimates for Index Events with Principal Diagnosis of Septicemia

This example SAS code determines the weighted number of index events in the 2019 NRD with a principal diagnosis of septicemia, which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, INF002 (Septicemia).

For this example, an index event is defined as follows:

  • The patient was discharged between January and November 2019.
  • The patient was discharged alive.
  • The length of stay was nonmissing.
  • The discharge was for a patient aged 1 year or older.
  • The patient may be a nonresident of the State.
  • And, the patient is allowed to have multiple index events, regardless of how far apart.

This index event definition is consistent with what is used on HCUPnet, which is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. However, it should be noted that users should define the index event (as well as the readmission) based on their own analytic purpose.



Title1 "Produce National Estimate of Index Discharges with Principal or First-Listed Diagnosis of Septicemia";
Title2 "(Default CCSR=INF002) from 2019 NRD File (Weighted)";
Libname NRD2019  "O:\NRD\2019\CD\CDNRD" access=readonly;
Options PS=51 LS=146 ;

data nrd;
    /* Link the Core file and the Diagnosis and Procedure Groups file */
    merge nrd2019.nrd_2019_core (keep=HOSP_NRD KEY_NRD DISCWT NRD_STRATUM NRD_visitlink NRD_daystoevent AGE LOS DMONTH DIED)
          nrd2019.nrd_2019_dx_pr_grps (keep=HOSP_NRD KEY_NRD DXCCSR_Default_DX1)
    ;
    by HOSP_NRD KEY_NRD;
    attrib IndexEvent length=3 label='Index event with DX1 of Septicemia (Default CCSR=INF002)';
    if DIED=0                               /* discharged alive */
        and DMONTH in (1:11)                /* discharged Jan-Nov to allow 30-day follow-up */
        and not missing(NRD_daystoevent)    /* non-missing admission date */
        and not missing(LOS)                /* non-missing LOS to calculate discharge date */
        and AGE>=1                          /* match HCUPnet */
        and DXCCSR_Default_DX1='INF002'     /* DX1 of interest */
    then IndexEvent=1;
    else IndexEvent=0;
run;

proc surveymeans data=NRD sum mean nomcar;
     cluster HOSP_NRD ;
     strata NRD_STRATUM;
     weight DISCWT;
     var IndexEvent;
run;
 

The first section of this example SAS code includes the DATA step. This step includes the following statements:

  • The MERGE statement, which links the NRD Core File with the NRD Diagnosis and Procedure Groups File keeping essential data elements from each file. The NRD Diagnosis and Procedure Groups File includes the default CCSR category for the principal diagnosis or data element DXCCSR_Default_DX1.
  • The KEEP= data set options, which are specified for each file and list the data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and defining the index event, including DXCCSR_Default_DX1.
  • The ATTRIB statement, which assigns a length and a label to a new data element (IndexEvent). The IF-THEN/ELSE statements that follow set IndexEvent to 1 or 0 based on a combination of clinical criteria (default CCSR of INF002, Septicemia, for the principal diagnosis) and non-clinical criteria (e.g., the patient did not die in the hospital, or data element DIED = 0; the patient was discharged between January and November 2019, or data element DMONTH has a value within the range of 1 to 11).

The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NRD. This procedure includes the following statements:

  • The CLUSTER statement, which includes the NRD hospital identifier or data element HOSP_NRD.
  • The STRATA statement, which includes the NRD stratum identifier or data element NRD_STRATUM.
  • The WEIGHT statement, which includes the NRD discharge weight or data element DISCWT.
  • The VAR statement, which includes the data element, IndexEvent, that we defined in the DATA step above.


               Produce National Estimate of Index Discharges with Principal or First-Listed Diagnosis 
                         of Septicemia (Default CCSR=INF002) from 2019 NRD File (Weighted)         

                                            The SURVEYMEANS Procedure                        

                                                   Data Summary                              

                                        Number of Strata                  93                  
                                        Number of Clusters              2507                  
                                        Number of Observations      18132856
                                        Sum of Weights              35399480
							 
							 
                                                Variance Estimation                              

                                        Method                  Taylor Series                  
                                        Missing Values                 NOMCAR 


                                                     Statistics                               

                                                                      Std Error                     Std Error
Variable        Label                                      Mean        of Mean           Sum           of Sum
-------------------------------------------------------------------------------------------------------------
IndexEvent      Index event with DX1 of Septicemia     0.052613       0.000736       1862468            32880
                (Default CCSR=INF002)
 

The output for this example SAS code provides the weighted number of index events in the 2019 NRD with a principal diagnosis of septicemia.
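
The standard error in this output can be turned into an approximate 95 percent confidence interval and a relative standard error. The following is a minimal sketch, not part of the original tutorial code. The estimate and standard error are transcribed from the output above, the variable names are illustrative, and the degrees of freedom are assumed to be the number of clusters minus the number of strata, which is the usual default for Taylor series variance estimation.

```sas
/* Sketch only: values are copied by hand from the SURVEYMEANS output above */
data _null_;
    estimate = 1862468;            /* Sum for IndexEvent                     */
    se       = 32880;              /* Std Error of Sum                       */
    df       = 2507 - 93;          /* assumed: clusters minus strata         */
    t_crit   = tinv(0.975, df);    /* two-sided 95 percent critical value    */

    lower   = estimate - t_crit * se;
    upper   = estimate + t_crit * se;
    rse_pct = 100 * se / estimate; /* relative standard error, in percent    */

    put '95 percent CI for weighted sum: ' lower comma12. ' to ' upper comma12.;
    put 'Relative standard error (pct): ' rse_pct 6.2;
run;
```

Alternatively, adding the CLSUM statistic keyword to the PROC SURVEYMEANS statement requests confidence limits for the weighted sum directly from the procedure.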




Module 5: Nationwide Readmissions Database (NRD), Validating National Estimates

Unlike estimates from the databases covered in the other four modules of this tutorial, estimates from the NRD cannot be validated using the HCUP Summary Statistics, HCUP Diagnosis and Procedure Frequency Tables, or HCUPnet, because the definitions of an index event and a readmission vary depending on the analytic purpose.

Additional information for working with the NRD is available in the NRD Introduction and the NRD Tutorial.




Module 5: Nationwide Readmissions Database (NRD)

You have completed Module 5, Nationwide Readmissions Database (NRD)!

For any questions about the NRD that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:

  • Email: hcup@ahrq.gov
  • Phone: 866-290-HCUP (4287) (toll free)
  • International users, please contact HCUP User Support by email.

The staff reviews messages daily and usually responds to inquiries within 3 business days.





Internet Citation: HCUP National Estimates Tutorial - Accessible Version. Healthcare Cost and Utilization Project (HCUP). February 2023. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/tech_assist/nationalestimates/508_course/508course_2023.jsp.
Last modified 2/10/23