2023 National Survey on Drug Use and
Health (NSDUH)
Sample Design Plan
Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
Rockville, Maryland
September 2022
Acknowledgments
This report was prepared for the Substance Abuse and Mental Health Services Administration
(SAMHSA), U.S. Department of Health and Human Services (HHS), under Contract No.
75S20322C00001 with RTI International. Marlon Daniel served as the government project
officer and as the contracting officer representative.
Recommended Citation
Center for Behavioral Health Statistics and Quality. (2022). 2023 National Survey on Drug Use
and Health (NSDUH) sample design plan (unpublished internal documentation). Substance
Abuse and Mental Health Services Administration.
Originating Office
Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health
Services Administration, 5600 Fishers Lane, Room 15E01B, Rockville, MD 20857. For
questions about this report, please email CBHSQrequest@samhsa.hhs.gov.
Nondiscrimination Notice/Aviso de no discriminación
SAMHSA complies with applicable federal civil rights laws and does not discriminate on the
basis of race, color, national origin, age, disability, or sex. SAMHSA cumple con las leyes
federales de derechos civiles aplicables y no discrimina por motivos de raza, color, nacionalidad,
edad, discapacidad o sexo.
U.S. Department of Health and Human Services
Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
Populations Survey Branch
September 2022
Table of Contents
1. Introduction
   1.1 Purpose
   1.2 Target Population
   1.3 Design Overview
2. 2014-2023 Coordinated Sample
   2.1 Overview
   2.2 Stratification
   2.3 First-, Second-, and Third-Stage Sample Selections
   2.4 Fourth and Subsequent Stages of Sample Selection
3. 2023 National Survey on Drug Use and Health Hybrid Address-Based Sampling Frame and Field Enumeration Frame
   3.1 Overview
   3.2 2023 NSDUH Hybrid ABS Frame
      3.2.1 New Sample Hybrid ABS Approach
      3.2.2 Overlap Sample Hybrid ABS Approach
4. 2023 National Survey on Drug Use and Health Dwelling Unit and Person Samples
   4.1 Overview and Requirements
   4.2 Sample Allocation
   4.3 Fourth-Stage Sample (Dwelling Unit) Selection
   4.4 Fifth-Stage Sample (Person) Selection
   4.5 Pair Sampling Parameter
   4.6 Expected Precision and Projected Domain Sample Sizes
   4.7 Assignment of Respondents to 2010 Census Blocks
5. General Sample Allocation Procedures for the Mental Illness Calibration Study
   5.1 Respondent Universe and Sampling Methods for the Mental Illness Calibration Study
   5.2 Person Sample Allocation Procedures for the Mental Illness Calibration Study
References
List of Contributors
Appendix A. 2023 National Survey on Drug Use and Health Variance Modeling
List of Tables
2.1 Annual National Sample of Area Segments
2.2 Sample Rotation within a State Sampling Region; 2014 through 2023
3.1 Summary of 2023 NSDUH Hybrid Address-Based Sampling (ABS) Approach, New and Overlap Samples
4.1 Summary of the 2023 NSDUH Design
4.2 2023 NSDUH Sample Sizes and Projected Respondents; by State and Age Group
4.3 2023 NSDUH Expected Completed Screening Interviews and Interview Respondents; by Mode
4.4 Expected Pair Selection Counts and Response Rates for λ = 0.25
4.5 Relative Standard Errors and Completed Interviews for Key Outcome Measures; by Demographic Domain
5.1 MICS Age-Related Factors
5.2 Projected Yields of Predicted Positive Cases
5.3 2023 MICS Sample Allocation, by K6 Group
5.4 2023 MICS Sample Allocation, by WHODAS Score
A.1 Variance Component Notation
A.2 Variance Components of Residuals as a Percentage of Total Variance for Selected Measures, Based on 2021 NSDUH
A.3 Unequal Weighting Effects for the National Estimates, Based on 2021 NSDUH
A.4 Average Cluster Sizes and Coefficients of Variation, Based on 2019 NSDUH
A.5 Prevalence Estimates and Standard Errors for Key Outcome Measures; by Demographic Domain
A.6 2021 Sample Fraction of an Age Group Originating in Selected Domains
1. Introduction
1.1 Purpose
The goal of this report is to describe sampling plans for the 2023 National Survey on
Drug Use and Health (NSDUH). The report is organized into five chapters and includes a list of
cited references, a list of contributors, and an appendix. The remainder of Chapter 1 describes the
target population and provides a general overview of the 2023 NSDUH sample design. Chapter 2
discusses the 2014 through 2022 coordinated design and the motivation for extending it to 2023.
Chapters 3 and 4 focus on the features that will be employed in 2023, including a hybrid address-based sampling (ABS) 1 and field enumeration sampling frame (Chapter 3), the selection of
dwelling units (DUs) (Chapter 4), and the selection of people (Chapter 4). Then, Chapter 5
describes the sample design for the Mental Illness Calibration Study (MICS), which is planned
for the 2023 and 2024 NSDUHs. Finally, Appendix A presents the parametric variance model
and sample design parameters used to project relative standard errors for 25 key outcome
measures and domains of interest.
1.2 Target Population
The respondent universe for the 2023 NSDUH (conducted by RTI International 2) is the
civilian, noninstitutionalized population aged 12 years or older within the 50 states and the
District of Columbia. Consistent with NSDUH designs since 1991, the 2023 NSDUH universe
includes residents of noninstitutional group quarters (e.g., shelters, rooming houses, dormitories)
and civilians residing on military bases. Persons excluded from the universe include those with
no fixed household address (e.g., persons experiencing homelessness not in shelters) and
residents of institutional group quarters, such as jails and hospitals.
1.3 Design Overview
A coordinated sample design was developed for the 2014 through 2017 NSDUHs. A
large reserve sample of area clusters was selected at the time the 2014 through 2017 NSDUH
sample was selected. This reserve sample is being used to field the 2018 through 2022 NSDUHs.
Because 2020 Census data are not available, the reserve sample will also be used for the 2023
NSDUH. The 50 percent overlap in sampled clusters will be continued so that half of the
sampled clusters are carried over from the 2022 NSDUH and the other half are new.
For the 2023 through 2027 NSDUHs, the Substance Abuse and Mental Health Services
Administration (SAMHSA) aims to improve the precision of estimates for minorities or hard-to-reach populations. Toward this goal, the selection of smaller areas within secondary sampling
units (SSUs) will be eliminated in the new portion of the 2023 sample. Using larger geographic
areas is expected to reduce intracluster correlation and increase precision. Some larger SSUs may
need to be subsampled to make field enumeration feasible, however.
1 ABS refers to the sampling of residential addresses from a list based on the U.S. Postal Service’s
Computerized Delivery Sequence (CDS) file.
2 RTI International is a trade name of Research Triangle Institute. RTI and the RTI logo are U.S. registered
trademarks of Research Triangle Institute.
Within sampled areas, NSDUH has traditionally used field enumeration to construct DU
frames. A specially trained field lister visits each sampled area to obtain a complete and accurate
list of all potentially eligible DUs within the sampled area’s boundaries. As in the 2022 NSDUH,
a hybrid ABS approach will be implemented for the 2023 NSDUH. In this two-tiered hybrid
approach, areas with medium or low expected ABS coverage will be field enumerated, whereas
areas with high expected ABS coverage will use the ABS frame (see Chapter 3).
Over a period of several years, RTI International developed and tested an electronic
listing (eListing) application to transition NSDUH’s paper-based field enumeration process to an
electronic format. Beginning with the 2023 NSDUH, the eListing application will be used to
enumerate DUs in areas with medium or low expected ABS coverage and to locate sampled DUs
during data collection. Use of the eListing application at both stages (field enumeration and data
collection) is expected to result in process efficiencies and improve data quality. For sampled
areas carried over from the 2022 NSDUH, paper DU listings will be converted to eListings so
that only electronic maps are used to support data collection in 2023.
Similar to NSDUHs dating back to 1999, 3 the 2023 survey will provide state estimates
based on requested sample sizes per state. As in the 2021 and 2022 NSDUHs, selected DUs will
be mailed an invitation to participate in the survey online. Then, field interviewers will visit all
pending DUs to complete the screener and/or interview any selected individuals using computer-assisted interviewing methods. A $30 incentive will be provided to web and in-person survey
participants, as has been done since the 2002 NSDUH. The total sample size of 67,507
completed interviews will be distributed within five age groups as follows: 25 percent for youths
aged 12 to 17, 25 percent for young adults aged 18 to 25, 15 percent for adults aged 26 to 34, 20
percent for adults aged 35 to 49, and 15 percent for adults aged 50 or older. This sample size will
allow SAMHSA to continue to report on estimates for some demographic subgroups at the
national level with adequate precision without the need to oversample these subgroups, as was
required prior to the 1999 survey.
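The age-group distribution above can be checked with a short calculation. The following is a minimal sketch that applies the stated percentages to the stated national total; the function and variable names are illustrative, and the unrounded values it prints are not the official per-state targets, which are allocated in Chapter 4.

```python
from fractions import Fraction

TOTAL_INTERVIEWS = 67_507  # total completed interviews stated in the text

# Age-group shares stated in the text, in percent.
AGE_SHARES_PCT = {
    "12-17": 25,
    "18-25": 25,
    "26-34": 15,
    "35-49": 20,
    "50+": 15,
}


def expected_interviews(total, shares_pct):
    """Return the exact (unrounded) expected interview count per age group."""
    return {group: Fraction(total * pct, 100) for group, pct in shares_pct.items()}


targets = expected_interviews(TOTAL_INTERVIEWS, AGE_SHARES_PCT)
for group, count in targets.items():
    print(f"{group:>6}: {float(count):,.2f}")
```

Exact fractions are used so that the five group totals add back to the national total without rounding drift.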
The MICS is planned for the 2023 and 2024 NSDUHs. The goal of MICS is to recalibrate
estimate(s) of mental illness using criteria in the Diagnostic and Statistical Manual of Mental
Disorders, 5th edition (American Psychiatric Association, 2013). At the end of the NSDUH
interview, respondents selected for MICS will be asked to participate in a follow-up clinical
interview that will be conducted via Zoom or phone. The MICS sample design is similar to that
of the 2008-2012 Mental Health Surveillance Studies and is described in Chapter 5.
3 Prior to 2002, the survey was called the National Household Survey on Drug Abuse.
2. 2014-2023 Coordinated Sample
2.1 Overview
A coordinated sample design was developed for the 2014 through 2017 National Surveys
on Drug Use and Health (NSDUHs) and was extended to the 2018 through 2022 NSDUHs. In
summary, each state is stratified into state sampling regions (SSRs), then census tracts are
selected within SSRs (Stage 1), census block groups (CBGs) are selected within census tracts
(Stage 2), and census blocks are selected within CBGs (Stage 3). The 2010 decennial census and
2006-2010 American Community Survey (ACS) data were used to select the coordinated sample
originally. Because the 2020 decennial census data were not available, the Substance Abuse and
Mental Health Services Administration (SAMHSA) made the decision to extend the coordinated
sample design to the 2023 NSDUH.
In preparation for the 2014 NSDUH sample redesign, several optimization models and
other related analyses were conducted (RTI International, 2012). SAMHSA used the results from
these analyses to inform the 2014 through 2023 design. The multiyear design allows for a cost-efficient sample allocation to the largest states, while maintaining adequate sample sizes in
smaller states to support small area estimation (SAE) at the state and substate levels.
First-, second-, and third-stage sampling units were formed and selected ahead of time
and in sufficient numbers to support the 2014 through 2017 NSDUH main studies and several
large field tests. Additional “reserve” sampling units were selected to carry the sample through
the next decennial census. This reserve sample is being used to support the 2018 through 2023
NSDUHs. In addition to being efficient, the process of selecting sampled areas ahead of time
minimizes respondent burden by preventing most physical dwelling units (DUs) from being
selected more than once during the years of the coordinated sample. 4
The coordinated design for the 2014 through 2023 NSDUHs facilitates 50 percent
overlap in sampled areas between each 2 successive years from 2014 through 2023. This
designed sample overlap may slightly increase the precision of estimates of year-to-year trends if
a reused sampled area is somewhat homogeneous, creating a small positive correlation in
successive years. The 50 percent overlap of sampled areas significantly reduces costs because
DU frames need to be constructed for only one half of the sampled areas each year after 2014.
As with the design for most area household surveys, NSDUH’s 2014 through 2023
design offers the advantage of increasing interviewing efficiency by clustering the sample of
DUs within a sample of geographies. Also, because a complete frame of DUs does not exist in
all areas, the clustered area design enables the construction of a DU frame for a representative
sample of geographies so that all DUs have a chance of selection. The main concern of area
surveys is the potential variance-increasing effects of clustering and unequal weighting, but these
potential problems are directly addressed by (1) selecting a rather large sample of clusters at the
early stages of selection and (2) selecting these clusters with probability proportional to a
composite size measure defined as the population weighted by the state sampling rate in each age
group. This type of selection boosts precision by achieving an approximately self-weighting
sample within each state and age group (Center for Behavioral Health Statistics and Quality,
2022). Furthermore, the composite size measure approach tends to equalize final cluster sizes,
thus equalizing interviewer workloads within states.
4 For the 2023 NSDUH, using secondary sampling units (SSUs) as the smallest geographic unit allows for
some SSUs to partially overlap with smaller area segments used in previous surveys. As a result, some DUs that
were selected in previous years may be selected for the 2023 NSDUH. Duplicate sample dwelling units (SDUs)
within the 2023 sample will be removed.
SSUs and smaller area segments were designed to contain enough DUs to support two
annual survey samples (because of the 50 percent overlap in sampled areas) and one field test
sample. The minimum size requirement varies by state and the urban/rural status of the sampling
unit, as discussed in Section 2.3.
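The composite size measure described earlier in this section (population weighted by the state sampling rate in each age group) can be illustrated as follows. This is a sketch only: the three collapsed age groups, the sampling rates, and the tract populations are hypothetical values chosen for illustration and are not taken from the NSDUH design.

```python
def composite_size(pop_by_age, rate_by_age):
    """Composite size measure: sum over age groups of (sampling rate x population)."""
    return sum(rate_by_age[age] * pop_by_age[age] for age in rate_by_age)


def pps_selection_probabilities(sizes, n_select):
    """Selection probability of each unit under PPS: n * size / total size.

    A value above 1 would signal a unit large enough to be hit more than once.
    """
    total = sum(sizes.values())
    return {unit: n_select * size / total for unit, size in sizes.items()}


# Hypothetical state sampling rates by age group (not NSDUH values).
STATE_RATES = {"12-17": 0.020, "18-25": 0.015, "26+": 0.001}

# Hypothetical tract populations by age group (not NSDUH values).
TRACT_POPS = {
    "tract_A": {"12-17": 500, "18-25": 600, "26+": 4000},
    "tract_B": {"12-17": 300, "18-25": 200, "26+": 6000},
    "tract_C": {"12-17": 800, "18-25": 900, "26+": 3000},
}

sizes = {tract: composite_size(pops, STATE_RATES) for tract, pops in TRACT_POPS.items()}
probs = pps_selection_probabilities(sizes, n_select=1)
for tract in sizes:
    print(f"{tract}: size={sizes[tract]:.1f}, probability={probs[tract]:.3f}")
```

In this illustration, the tract with the most youths and young adults receives the largest size measure even though its total population is the smallest, which is how the measure steers the sample toward the oversampled age groups.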
The 2014 through 2023 design ensures a sufficient sample to produce national estimates
directly. Depending on the desired precision, either direct estimates or estimates using SAE are
produced for state and substate areas, often by combining several years of data. For example,
typically 2-year combined estimates are produced by age group within each state, and 3-year
combined estimates are produced by substate area and age group using SAE. Other examples
include direct estimates for some core-based statistical areas (CBSAs 5) by age group and gender
using multiple years of pooled data.
The 2014 through 2023 surveys are designed to yield the following:
• 4,560 completed interviews in California;
• 3,300 completed interviews each in Florida, New York, and Texas;
• 2,400 completed interviews each in Illinois, Michigan, Ohio, and Pennsylvania;
• 1,500 completed interviews each in Georgia, New Jersey, North Carolina, and Virginia;
• 967 completed interviews in Hawaii; and
• 960 completed interviews in each of the remaining 37 states and the District of Columbia.
An additional requirement of the 2014 through 2023 sample design is that the sample
yield a minimum of 200 completed interviews in Kauai County, Hawaii, over a 3-year period.
This allows for Kauai County to be included as a separate entity in the production of substate
estimates that are produced biennially and typically based on 3 years of data. To achieve this
goal while maintaining precision at the state level, Kauai County is treated separately from the
remainder of Hawaii for sample allocation and sample size management purposes. The annual
sample in Hawaii consists of 67 completed interviews in Kauai and 900 completed interviews in
the remainder of the state, for a total of 967 completed interviews each year.
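The state targets listed above can be reconciled with the national total of 67,507 completed interviews and with the Kauai County requirement. The following sketch uses only the figures stated in this chapter; the variable names are illustrative.

```python
# (label, number of states or equivalents, completed interviews per state)
STATE_TARGETS = [
    ("California", 1, 4_560),
    ("Florida, New York, Texas", 3, 3_300),
    ("Illinois, Michigan, Ohio, Pennsylvania", 4, 2_400),
    ("Georgia, New Jersey, North Carolina, Virginia", 4, 1_500),
    ("Hawaii", 1, 967),
    ("Remaining 37 states and the District of Columbia", 38, 960),
]

national_total = sum(count * per_state for _, count, per_state in STATE_TARGETS)
areas_covered = sum(count for _, count, _ in STATE_TARGETS)

# Hawaii is managed as Kauai County plus the remainder of the state.
KAUAI_ANNUAL = 67
REST_OF_HAWAII_ANNUAL = 900
KAUAI_MINIMUM_OVER_3_YEARS = 200

hawaii_annual = KAUAI_ANNUAL + REST_OF_HAWAII_ANNUAL
kauai_3_year_yield = 3 * KAUAI_ANNUAL

print(f"National total: {national_total:,} across {areas_covered} areas")
print(f"Hawaii annual total: {hawaii_annual}")
print(f"Kauai 3-year yield: {kauai_3_year_yield} (minimum {KAUAI_MINIMUM_OVER_3_YEARS})")
```

The annual Kauai allocation of 67 interviews yields 201 over 3 years, which just meets the 200-interview minimum.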
5 CBSAs include metropolitan and micropolitan statistical areas as defined by the Office of Management
and Budget (2009). Metropolitan statistical areas contain at least one urbanized area with 50,000 or more people and
may include adjacent territory with a high degree of social and economic integration with the core as measured by
commuting. Micropolitan statistical areas have an urban core with at least 10,000 but fewer than 50,000 people, plus
adjacent territory that is socioeconomically tied to the core by commuting. Both metropolitan and micropolitan
statistical areas are defined in terms of whole counties (or equivalent entities).
In expectation, data from roughly a random one-fourth of the final sample of respondents
are collected in each calendar quarter. 6 This design feature helps control the influence of
seasonal variation on annual drug use and mental health prevalence estimates and other
important NSDUH outcome measures of interest.
2.2 Stratification
The first level of stratification is the “state,” where the District of Columbia is treated as a
state. The next level of strata is defined by geographically partitioning each state into roughly
equal-sized (according to composite size measure) SSRs; in the NSDUH design, each SSR is
expected to yield roughly the same number of interviews within each state during each data
collection period. To form SSRs, counties or, when necessary, portions of counties 7 (census
tracts) were combined using a geographic information systems (GIS) application developed by
RTI until the specified number of SSRs was formed and the SSRs, as a group, spanned the entire
land area of the state. The formation of SSRs also took interviewer accessibility into account
(e.g., by considering mountain ranges, rivers, and other potential “difficult to travel” areas).
The partitioning of the United States resulted in the formation of 750 SSRs. Within each
of these SSRs, a sample of primary sampling units (PSUs; one or more census tracts) was
selected. Then, within sampled PSUs, SSUs (one or more CBGs) were selected. Finally, because
CBGs generally far exceed the minimum DU requirement (defined in Section 2.3), one smaller
geographic area was selected within each sampled SSU. In general, third-stage sampling units
consisted of adjacent census blocks. In summary, the first-stage stratification for the 2014
through 2023 studies involves states and SSRs within states, with the first-stage sampling units
being census tracts, the second-stage sampling units being CBGs, and the third-stage sampling
units being one or more census blocks. As discussed previously, the third stage of selection was
eliminated for the new portion of the 2023 NSDUH sample to increase the precision of estimates.
In the 2022 NSDUH, SSUs were used for areas with high expected ABS coverage, and smaller
area segments (third-stage sampling units) were used for areas requiring field enumeration. Thus,
for the 2023 NSDUH, “segment” refers to either an SSU or a third-stage sampling unit.
For the coordinated sample, additional implicit stratification was achieved by sorting the
first-stage sampling units (census tracts) by a CBSA/socioeconomic status (SES) indicator 8 and
by the percentage of the population who are non-Hispanic White prior to selection. The first-stage sampling units were systematically selected from this well-ordered sample frame.
6 A slight modification to equal allocation to calendar quarters is discussed in Section 4.3.
7 In Louisiana, parishes or portions of parishes were combined to form SSRs. In Alaska, whole or portions
of boroughs, city and boroughs, municipalities, and census areas were combined.
8 Four categories of the indicator are defined: (1) CBSA/low SES, (2) CBSA/high SES, (3) non-CBSA/low
SES, and (4) non-CBSA/high SES. To define SES, tract-level median rents and property values from the 2006-2010
ACS were given a rank (1…5) based on state and CBSA quintiles. The rent and value ranks then were averaged,
weighted by the percent of renter- and owner-occupied DUs, respectively. If the resulting score fell in the lowest
25th percentile by state and CBSA, the area was considered “low SES”; otherwise, it was considered “high SES.”
2.3 First-, Second-, and Third-Stage Sample Selections
The design of the first stage of selection began with the construction of an area sample
frame that contained one record for each census tract in the United States. If necessary, census
tracts were aggregated until each PSU had reached the minimum size requirement. In California,
Florida, Georgia, Illinois, Michigan, New Jersey, New York, North Carolina, Ohio,
Pennsylvania, Texas, and Virginia, this minimum size requirement was 250 DUs in urban areas
and 200 DUs in rural areas. In the remaining states and the District of Columbia, the minimum
requirement was 150 DUs in urban areas and 100 DUs in rural areas. After PSUs were formed,
a sample was selected within each SSR with probabilities proportional to a composite size
measure and with minimum replacement.
For the second stage of selection, adjacent CBGs were aggregated within selected PSUs
as necessary to form SSUs to meet the minimum DU requirements (150 or 250 DUs in urban
areas and 100 or 200 DUs in rural areas according to state). Then one SSU was selected per
sampled PSU with probability proportional to a composite size measure.
The systematic selection of census tracts at the first stage of selection and CBGs at the
second stage has the potential to reduce sampling variance by controlling the distribution of
selected areas and reducing the chance of selecting neighboring and possibly similar areas within
tracts and block groups. In addition, the merging of NSDUH data to external data sources for
future analytical purposes is simplified when sampled areas are contained within tract and block
group boundaries to the extent possible.
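The effect of systematic selection from a sorted frame can be illustrated with a simplified sketch. The code below implements ordinary systematic probability-proportional-to-size selection with a random start; it is not the minimum replacement procedure referenced above, and the frame, size values, and function name are hypothetical.

```python
import random


def systematic_pps(frame, n_select, rng=None):
    """Select n_select units from an ordered frame with probability proportional to size.

    frame is a list of (unit_id, size) pairs in sort order. A unit whose size
    exceeds the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(size for _, size in frame)
    interval = total / n_select
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n_select)]

    selected = []
    cumulative = 0.0
    next_target = 0
    for unit_id, size in frame:
        cumulative += size
        # Take every selection point that falls inside this unit's size range.
        while next_target < n_select and targets[next_target] < cumulative:
            selected.append(unit_id)
            next_target += 1
    return selected


# Hypothetical frame, already sorted by the implicit stratification variables.
FRAME = [
    ("tract_1", 100),
    ("tract_2", 300),
    ("tract_3", 50),
    ("tract_4", 250),
    ("tract_5", 300),
]

sample = systematic_pps(FRAME, n_select=4, rng=random.Random(2023))
print(sample)
```

Because the selection points are evenly spaced along the sorted frame, the selected units are spread across the sort order rather than bunched among neighboring units, which is the variance-reducing property described above.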
For the third stage of sampling, each selected CBG was partitioned into compact clusters 9
of DUs by aggregating adjacent census blocks. These geographic clusters of blocks or smaller
area segments were formed so that they contain the same minimum number of DUs as the PSU
(i.e., census tracts) and SSU (i.e., CBGs) to which they belong. That is, smaller area segments
contain at least 150 or 250 DUs in urban areas and 100 or 200 DUs in rural areas according to
state. Smaller area segments were constructed using 2010 decennial census data supplemented
with 2013 DU population estimates obtained from Claritas. 10 One smaller area segment was
selected within each sampled SSU with probability proportional to size.
SSUs and smaller area segments were formed to contain sufficient numbers of DUs to
support one field test and two annual NSDUH samples (because of the 50 percent overlap in
sampled areas between two successive survey years). Because each sampled area has more than
twice as many DUs as needed in any given year, each sampled area can be used for two survey
cycles. Therefore, half of the sampled areas used in any given year’s main sample are used again
in the following year as a means of improving the precision of measures of annual change. 11 This
SSU or smaller area segment size also allows for any special supplemental sample or field test
that SAMHSA may wish to conduct within the same sampled areas. Table 2.1 shows the
allocation of the annual area segment sample by state. As noted previously, for the 2023
NSDUH, “segment” refers either to an SSU or a smaller area segment.
9 Although the entire cluster is compact, the final sample of DUs represents a noncompact cluster.
Noncompact clusters differ from compact clusters in that only random units within the cluster are included in the
sample. Although compact cluster designs are less costly and more stable, a noncompact cluster design was used
because it provides for greater heterogeneity of dwellings within the sample. Also, social interaction (contagion)
among neighboring dwellings is sometimes introduced with compact clusters (Kish, 1965).
10 Claritas is a market research firm headquartered in Cincinnati, Ohio (see https://www.claritas.com/).
11 Each segment is fielded during the same calendar quarter in each of the 2 years it is used.
Table 2.1 Annual National Sample of Area Segments

Design Parameter              CA     FL, NY, and TX  IL, MI, OH, and PA  GA, NC, NJ, and VA  Remaining 38 States and DC  Total
Total Sample
  SSRs                        36     90              96                  60                  468                         750
  Segments                    288    720             768                 480                 3,744                       6,000
Total per State
  SSRs                        36     30              24                  15                  12
  Segments                    288    240             192                 120                 96
Total per SSR
  Segments per Quarter        2      2               2                   2                   2
  Segments over Four Quarters 8      8               8                   8                   8

CA = California; DC = District of Columbia; FL = Florida; GA = Georgia; IL = Illinois; MI = Michigan; NC = North Carolina;
NJ = New Jersey; NY = New York; OH = Ohio; PA = Pennsylvania; SSR = state sampling region; TX = Texas; VA = Virginia.
Within each SSR, a total of 48 segments were selected. With the 50 percent segment
overlap from one analysis year to the next, the 2014 through 2017 coordinated sample required a
sample of 20 segments per SSR. An additional 28 segments per SSR were selected to extend the
design to the next decennial census if desired and to support any additional studies embedded
within NSDUH. These 28 segments are referred to as the “reserve sample,” and 24 of the reserve
segments are being used to field the 2018 through 2023 NSDUHs. Thus, a total of 44 segments
per SSR are required for the 2014 through 2023 studies.
After selecting the 48 segments per SSR, 12 equal probability subsamples of 4 segments
were selected, and each subsample was assigned a sequential panel number. Thus, the SSRs
remain the same, and the 48 segments are allocated to 12 panels: 11 for the 2014 through 2023
NSDUHs plus 1 additional panel. Within panels, segments were assigned to quarters in the order
they were selected. Two panels or eight segments per SSR are used for each NSDUH year. The
panels used in the 2023 NSDUH are designated as panels 10 and 11. Panel 10 segments were
used for the 2022 survey and will be used for the second time in 2023, constituting the
50 percent overlap in survey samples. DUs not selected for the 2022 survey will be eligible for
selection in the panel 10 segments in 2023. The panel 11 segments will be used for the 2023
survey only.
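The panel rotation described above can be expressed as a simple rule: each survey year uses the panel carried over from the prior year plus one new panel. The following sketch reproduces that rule from the panel numbers stated in the text; the function and variable names are illustrative.

```python
from collections import Counter

FIRST_YEAR = 2014
LAST_YEAR = 2023
SEGMENTS_PER_PANEL = 4


def panels_for_year(year):
    """Return the (carried-over, new) panel numbers fielded in a survey year."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year must be between {FIRST_YEAR} and {LAST_YEAR}")
    older = year - FIRST_YEAR + 1
    return older, older + 1


schedule = {year: panels_for_year(year) for year in range(FIRST_YEAR, LAST_YEAR + 1)}

# Count how many survey years each panel is fielded.
usage = Counter(panel for pair in schedule.values() for panel in pair)
segments_needed = len(usage) * SEGMENTS_PER_PANEL
segments_per_year = 2 * SEGMENTS_PER_PANEL

for year, pair in schedule.items():
    print(year, pair)
print(f"Segments needed per SSR: {segments_needed}; per year: {segments_per_year}")
```

Under this rule, panels 1 and 11 are each fielded once and panels 2 through 10 are each fielded twice, which accounts for 11 panels of 4 segments, or 44 segments per SSR.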
Table 2.2 displays how a sample of 44 segments selected in an SSR is used to provide
8 segments per year for 10 years. Note that panels 1 and 11 are used only once, but all other
panels are used in 2 successive years.
Table 2.2 Sample Rotation within a State Sampling Region; 2014 through 2023

Panel   Panel
Number  Segments  2014  2015  2016  2017  2018  2019  2020  2021  2022  2023
1       4         X
2       4         X     X
3       4               X     X
4       4                     X     X
5       4                           X     X
6       4                                 X     X
7       4                                       X     X
8       4                                             X     X
9       4                                                   X     X
10      4                                                         X     X
11      4                                                               X

2.4 Fourth and Subsequent Stages of Sample Selection
The selection of DUs (fourth-stage units) and persons (fifth-stage units) is specific to
each annual survey or supplement. Sections 4.3 and 4.4 discuss these stages of selection for the
2023 NSDUH.
3. 2023 National Survey on Drug Use and Health Hybrid Address-Based Sampling Frame and Field Enumeration Frame
3.1 Overview
Address-based sampling (ABS) refers to the sampling of residential addresses from a list
purchased from a licensed vendor. The vendor lists are based on the U.S. Postal Service’s
Computerized Delivery Sequence (CDS) file. Relative to field enumeration, ABS has the
potential to greatly reduce costs, improve timeliness, and improve frame accuracy in areas with
controlled access.
ABS has some limitations. Some addresses may geocode into the wrong area segment
and therefore be incorrectly included in or excluded from a segment. For example, when a street
forms a segment’s boundary, an address that geocodes across the street from its physical location is
excluded from the address frame for the segment in which it belongs and is incorrectly included
on the address frame for the adjacent segment. In-person surveys also require addresses that can
be located on the ground. 12 Thus, rural areas with high concentrations of PO Box™ addresses
and noncity-style addresses 13 are undercovered 14 (Dohrmann et al., 2006, 2007). Finally, group
quarters and American Indian or Alaska Native areas also are known to be undercovered
(Dohrmann et al., 2006; Dohrmann & Sigman, 2013; McMichael, 2015).
To maximize coverage and minimize costs, many studies use a hybrid ABS approach that
uses both ABS and field enumeration, depending on the expected coverage of the ABS frame for
particular areas. 15 In a two-tiered hybrid approach, a net coverage estimate is used to assign
sampled areas to two coverage tiers. In general, the net coverage estimate is defined as the frame
count (i.e., count of ABS addresses that geocode into the sampled area) divided by an external
benchmark dwelling unit (DU) estimate (e.g., census count). Areas whose net coverage estimate
is below a predetermined threshold are field enumerated, whereas areas whose net coverage
estimate exceeds the threshold use ABS.
An ABS research report was prepared to summarize ABS research to date and identify
any continued areas of concern for the use of ABS frames on the National Survey on Drug Use
and Health (NSDUH) (Substance Abuse and Mental Health Services Administration
[SAMHSA], 2019). The findings were then used to inform research questions for an ABS field
test. Based on the ABS research report and preliminary results from the ABS field test,
SAMHSA approved the implementation of a hybrid ABS frame for the 2022 NSDUH. The 2022
approach allowed SAMHSA to deploy a hybrid ABS frame on a limited basis, without
significant risk, while realizing some of the cost and timeliness benefits. A less conservative,
larger scale deployment of ABS is planned for the 2023 NSDUH.

12 Although data will initially be collected via the web, all pending DUs will be transferred to in-person
data collection after a specified period of time. Thus, locatable addresses are required for 2023 NSDUH multimode
data collection.
13 City-style addresses are those with a street number and name, city name, state abbreviation or name, and
ZIP code. Noncity-style addresses include PO Boxes, rural route boxes, and highway contract boxes.
14 Coverage is the extent to which the frame includes the eligible survey population. At the dwelling unit
(DU) stage, a frame with complete coverage includes all eligible DUs in the sample segment.
15 Other studies that have employed a hybrid ABS frame include the National Health Interview Survey, the
Residential Energy Consumption Survey, and the U.S. Food & Drug Administration Tobacco Consumer Studies
Panel.
3.2 2023 NSDUH Hybrid ABS Frame
Half of the 2023 NSDUH secondary sampling units (SSUs) or segments will be retained
from the 2022 NSDUH, but the frame construction process for the new portion of the 2023
NSDUH sample differs from that of the overlap sample. Table 3.1 summarizes the hybrid ABS
approach for both portions of the sample. Then, Sections 3.2.1 and 3.2.2 discuss the methods
used to prepare the DU frames for new and overlap SSUs or segments, respectively, in greater
detail.
Table 3.1 Summary of 2023 NSDUH Hybrid Address-Based Sampling (ABS) Approach,
New and Overlap Samples
Hybrid ABS Frame Component: Net Coverage Estimate
   New Sample Approach: Computerized Delivery Sequence (CDS) + NoStat throwback count divided by
      occupied housing unit count from the 2019 5-year American Community Survey (ACS)
   Overlap Sample Approach: CDS + NoStat throwback count divided by total housing unit count from
      the 2019 5-year ACS
Hybrid ABS Frame Component: Coverage Threshold
   New Sample Approach: If the secondary sampling unit (SSU) net coverage estimate falls below
      90 percent, it will be assigned to electronic listing (eListing). Otherwise, it will use the
      ABS frame. SSUs will serve as both eListing and ABS segments.
   Overlap Sample Approach: If the SSU net coverage estimate fell below 95 percent, a smaller
      segment was selected for field enumeration. Otherwise, the SSU was the segment and used the
      ABS frame.
Hybrid ABS Frame Component: Group Quarters
   New Sample Approach: SSUs in which 1 percent or more of the dwelling units are group quarter
      units will be assigned to eListing.
   Overlap Sample Approach: SSUs with any adult (18+) group quarter population according to the
      2019 5-year ACS were assigned to field enumeration.
Hybrid ABS Frame Component: Drop Points¹
   New Sample Approach: If 25 percent or more of the addresses in an SSU are drop points, the SSU
      will be assigned to eListing.
   Overlap Sample Approach:
      1. If an SSU had at least one drop point with three or more units, it was assigned to
         field enumeration.
      2. If 25 percent or more of the addresses in an SSU were drop points, the SSU was
         assigned to field enumeration.
¹ A drop point is a mail receptacle serving multiple housing units.
3.2.1 New Sample Hybrid ABS Approach
As mentioned previously, in the new portion of the 2023 NSDUH sample, SSUs will
serve as geographic clusters both in areas with sufficient mailing address coverage and in areas
assigned to eListing. Using SSUs as both ABS and eListing segments is expected to reduce
intracluster correlation and increase the precision of estimates.
First, net coverage estimates will be computed for all SSUs that were selected at the
second stage of selection and assigned to the 2023 NSDUH (panel 11 in Table 2.2). The net
coverage estimate used to stratify SSUs will be computed using the CDS + NoStat throwback 16
count in the numerator because these are the addresses that constitute the ABS frames for
NSDUH. Occupied housing unit counts from the 2019 5-year American Community Survey
(ACS) will serve as the denominator.
The choice of coverage threshold used to stratify SSUs into those that use the ABS frame
and those that are field enumerated involves a trade-off between cost and coverage. Higher
thresholds yield smaller cost savings but better accuracy; lower thresholds yield larger cost
savings at the expense of accuracy. Based on findings from the ABS field test, a coverage
threshold of 90 percent is recommended for NSDUH. Thus, new SSUs with less than 90 percent
expected ABS coverage will be assigned to eListing (Table 3.1).
In addition to having high expected ABS coverage, new SSUs will be required to meet
separate criteria for group quarters and drop points (defined in the next paragraph) in order to use
the ABS frame (Table 3.1). Because group quarters are expected to be undercovered on the ABS
frame and because making contact with group quarter administrators at the listing stage improves
the likelihood of gaining access for screening and interviewing, it is preferable that SSUs
containing group quarters be assigned to field enumeration in a hybrid frame. For the new
portion of the sample, any SSU in which 1 percent or more of the DUs are group quarter units 17
will be assigned to eListing.
A drop point is a mail receptacle that is shared by multiple housing units (drop units).
While some drop points are large (e.g., gated communities and high-rise apartment buildings),
the majority are two-unit drop points (e.g., a duplex with one mailing address) (Amaya, 2017).
The ABS frame indicates the number of units at a drop point but does not include unit identifiers
(e.g., apartment numbers). Drop points present additional challenges for sample implementation
because of their one-to-many relationship to DUs. For this reason, if 25 percent or more of the
addresses in a new SSU are drop points, the SSU will be assigned to eListing.
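The three criteria for new SSUs can be combined into a single decision rule. The sketch below is a hedged illustration, not the production procedure: the function name, argument names, and the handling of a zero benchmark count are assumptions introduced here. Only the 90 percent, 1 percent, and 25 percent thresholds come from the text.

```python
def assign_new_ssu_frame(frame_count, acs_occupied_units,
                         gq_unit_share, drop_point_share):
    """Return 'ABS' or 'eListing' for a new 2023 SSU (illustrative sketch only).

    frame_count        -- CDS + NoStat throwback addresses geocoded to the SSU
    acs_occupied_units -- occupied housing units from the 2019 5-year ACS
    gq_unit_share      -- estimated share of DUs that are group quarter units
    drop_point_share   -- share of frame addresses that are drop points
    """
    coverage_threshold = 0.90
    gq_share_threshold = 0.01
    drop_point_share_threshold = 0.25

    if acs_occupied_units <= 0:
        # Assumption: without a benchmark count, coverage cannot be estimated,
        # so the SSU is treated as needing a field listing.
        return "eListing"

    net_coverage = frame_count / acs_occupied_units
    if net_coverage < coverage_threshold:
        return "eListing"
    if gq_unit_share >= gq_share_threshold:
        return "eListing"
    if drop_point_share >= drop_point_share_threshold:
        return "eListing"
    return "ABS"


print(assign_new_ssu_frame(950, 1000, 0.000, 0.05))  # meets all three criteria
print(assign_new_ssu_frame(850, 1000, 0.000, 0.05))  # low expected coverage
```

An SSU uses the ABS frame only when it passes every check; failing any one of them sends it to eListing.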
Based on the hybrid ABS approach described above, a total of 1,931 (64.37 percent) of
the 3,000 new SSUs in the 2023 sample will use the ABS frame and 1,069 (35.63 percent) will
be electronically listed beginning in August 2022. Similar to prior years, specially trained field
listers will list all DUs and potential DUs within each eListing SSU. However, for the first time,
listers will use the eListing application to record DU information. The electronic listings will be
based primarily on observation and may include vacant DUs and units that appear to be DUs but
may actually be used for nonresidential purposes. The objective of the listing is to attain as
complete a listing of potentially eligible residential addresses as possible. Any false positives for
residences—such as vacant DUs or nonresidential units—that are selected into the sample will
be eliminated during the household screening process after the sample is selected.
16 The NoStat file, another U.S. Postal Service file from CDS file vendors, contains addresses that do not
receive mail delivery (e.g., new construction). Active NoStat addresses include rural throwbacks and internal drops.
Rural throwbacks are locatable addresses for residents on rural postal routes who specify that their mail be delivered
to the post office rather than their home address. Internal drops are locatable addresses for units (i.e., include unit
type and identifying number) with identical street addresses on the CDS (Shook-Sa et al., 2013).
17 Group quarter units are estimated by dividing the 2019 ACS group quarter population by the average
number of people per group quarter unit from historical NSDUHs.

Some large SSUs will need to be subsampled to make eListing feasible. When possible,
SSUs with more than 750 expected DUs or more than 75 square miles and at least 200 expected
DUs will be subsampled using standard subsegmenting procedures (see Appendix D in the
sample design report [Center for Behavioral Health Statistics and Quality {CBHSQ}, 2022]).
The sample weights will be adjusted to reflect this subsampling.
3.2.2 Overlap Sample Hybrid ABS Approach
Compared with geocoding at the census block level, geocoding accuracy improves
significantly at the census block group (CBG) level in both rural and urban areas (Shook-Sa
et al., 2010). Thus, in the 2022 hybrid ABS approach, SSUs (one or more CBGs) served as
geographic clusters in areas with sufficient mailing address coverage. Net coverage estimates
were computed for all SSUs that were selected for the 2022 NSDUH (panel 10 in Table 2.2).
Those SSUs that met the ABS coverage criteria then served as sample segments. 18 SSUs that did
not meet the ABS coverage criteria required field enumeration. For cost-efficiency, the smaller
geographic area selected at the third stage of selection was field enumerated and considered the
sample segment. These ABS and field enumerated segments will be used for the second time in
2023 and will retain their 2022 ABS or field enumeration designation.
For the 2022 NSDUH overlap sample, the SSU-level net coverage estimates were
computed by dividing the CDS + NoStat throwback count by the total housing unit count from
the 2019 5-year ACS. Using total housing units instead of occupied housing units in the
denominator provided a conservative estimate of ABS coverage. A relatively high coverage
threshold of 95 percent was used to stratify SSUs based on the net coverage estimate. As noted
previously, a higher threshold allowed SAMHSA to roll out ABS on a limited basis for the 2022
NSDUH.
SSUs were also required to meet separate criteria for group quarters and drop points. For
the 2022 NSDUH overlap sample, any SSU with a nonzero adult (18 or older) group quarters
population according to the 2019 5-year ACS was assigned to field enumeration. In addition, two
drop point criteria were applied. First, if an SSU had at least one drop point with three or more
units, it was assigned to field enumeration. This step eliminated the need to subsample units if
more than five drop units were found at a selected drop point (and added as missed DUs).
Second, if 25 percent or more of the addresses in an SSU were drop points (even if all of them
were two-unit drop points), the SSU was assigned to field enumeration. This step eliminated the
need to subsample missed DUs if more than 10 drop units were added as missed DUs within the
segment.
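For comparison, the stricter 2022 rules applied to the overlap sample can be sketched the same way. Again, this is only an illustration under stated assumptions: the function and argument names are hypothetical, and representing drop points as a list of unit counts is a convenience chosen for this example.

```python
def assign_overlap_ssu_frame(frame_count, acs_total_units,
                             adult_gq_population, drop_point_unit_counts):
    """Return 'ABS' or 'field enumeration' for a 2022 SSU (illustrative sketch).

    frame_count            -- CDS + NoStat throwback addresses in the SSU
    acs_total_units        -- total housing units from the 2019 5-year ACS
    adult_gq_population    -- adult (18+) group quarters population from the ACS
    drop_point_unit_counts -- number of units at each drop point on the frame
    """
    if acs_total_units <= 0 or frame_count <= 0:
        # Assumption: no usable benchmark or frame, so enumerate in the field.
        return "field enumeration"
    if frame_count / acs_total_units < 0.95:
        return "field enumeration"
    if adult_gq_population > 0:
        return "field enumeration"
    # First drop point criterion: any drop point with three or more units.
    if any(units >= 3 for units in drop_point_unit_counts):
        return "field enumeration"
    # Second drop point criterion: 25 percent or more of addresses are drop points.
    if len(drop_point_unit_counts) / frame_count >= 0.25:
        return "field enumeration"
    return "ABS"


print(assign_overlap_ssu_frame(980, 1000, 0, [2, 2, 2]))
print(assign_overlap_ssu_frame(980, 1000, 0, [2, 3]))
```

The second call fails the first drop point criterion even though only two of its 980 addresses are drop points.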
Based on the 2022 NSDUH hybrid ABS approach, a total of 973 (32.43 percent) of the
3,000 overlap segments in the 2023 NSDUH sample will use ABS and 2,027 (67.57 percent)
were field enumerated on paper in the second half of 2021. As noted previously, the paper DU
listings will be converted to eListings prior to 2023 NSDUH data collection.
18 Because smaller geographic areas are not sampled within ABS segments, the third-stage probability of
selection is set to 1 for these segments.

For ABS segments retained from the 2022 NSDUH, the most recent version of the CDS
will be used; however, addresses selected for the 2022 NSDUH will be ineligible for selection in
the 2023 NSDUH sample.
Some ABS SSUs contain a large number of DUs. Prior to 2023, all NSDUH systems
were built around a 10-digit DU identification number (DUID) with the last three digits being the
DU line number. Rather than change the length of the DUID for the 2022 NSDUH, SSUs
assigned to ABS and with more than 999 DUs were subsampled using standard subsegmenting
procedures (CBHSQ, 2022). In summary, each large ABS SSU was divided into smaller areas
containing an approximately equal number of DUs and no more than 999 DUs. Then, one area
was randomly selected. The sample weights were adjusted accordingly. In 2023, NSDUH
systems will be updated to accommodate an 11-digit DU identification number so new ABS
SSUs will not need to be subsampled unless they contain more than 9,999 DUs.
4. 2023 National Survey on Drug Use and Health Dwelling Unit and Person Samples
4.1 Overview and Requirements
The requirement of 67,507 completed interviews for the 2023 National Survey on Drug
Use and Health (NSDUH) was derived from the following objectives:
• minimum sample sizes of 4,560 completed interviews in California; 3,300 completed
  interviews each in Florida, New York, and Texas; 2,400 completed interviews each in
  Illinois, Michigan, Ohio, and Pennsylvania; 1,500 completed interviews each in
  Georgia, New Jersey, North Carolina, and Virginia; 967 completed interviews in
  Hawaii; and 960 completed interviews in each of the remaining 37 states and the
  District of Columbia; and
• allocation to age groups as follows: 25 percent for youths aged 12 to 17, 25 percent
  for young adults aged 18 to 25, 15 percent for adults aged 26 to 34, 20 percent for
  adults aged 35 to 49, and 15 percent for adults aged 50 or older.
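The arithmetic behind the 67,507 total and the age group allocation can be checked with a short script. This is a sketch for verification only; the variable and function names are hypothetical, and simple rounding of each age group share is an assumption that happens to reproduce the state totals used in this plan but need not sum exactly for arbitrary targets.

```python
# (number of states or state equivalents, completed interviews required in each)
STATE_GROUP_TARGETS = [
    (1, 4560),   # California
    (3, 3300),   # Florida, New York, Texas
    (4, 2400),   # Illinois, Michigan, Ohio, Pennsylvania
    (4, 1500),   # Georgia, New Jersey, North Carolina, Virginia
    (1, 967),    # Hawaii
    (38, 960),   # remaining 37 states and the District of Columbia
]
AGE_GROUP_PERCENT = {"12-17": 25, "18-25": 25, "26-34": 15, "35-49": 20, "50+": 15}

total_interviews = sum(n * target for n, target in STATE_GROUP_TARGETS)


def allocate_by_age(state_target):
    """Split a state's interview target across age groups (rounded shares)."""
    return {age: round(state_target * pct / 100)
            for age, pct in AGE_GROUP_PERCENT.items()}


print(total_interviews)
print(allocate_by_age(960))
```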
The 1999 sample was the first to reflect the objective of the Substance Abuse and Mental
Health Services Administration (SAMHSA) to develop reliable and representative state-level
estimates using small area estimation (SAE) and direct estimation procedures. This objective
continues to apply for the 2023 sample. To achieve this objective, the targeted sample size by
state was set to be at least 960 completed interviews. In 13 states, the target was set at greater
than 960 completed interviews (as shown in Table 4.1). The larger overall sample makes it
possible to get adequate precision for national prevalence estimates for Hispanic and
non-Hispanic Black or African American populations without any targeted oversampling of high
concentration areas of these populations or any oversampling through screening for these
populations (as was done for the 1985 through 1998 surveys).
4.2 Sample Allocation
Similar to previous NSDUHs, at the final stage of selection, five age groups will be
sampled at different rates. These five age groups will be defined as follows: 12 to 17, 18 to 25,
26 to 34, 35 to 49, and 50 or older. For the 2023 NSDUH, each state’s sample will be allocated
to age groups, as described in Section 4.1. Table 4.2 displays the age group allocation by state
for the 2023 NSDUH sample.
Table 4.1 Summary of the 2023 NSDUH Design
Statistic                                         CA   FL, NY,   IL, MI,   GA, NC,        HI Remaining     Total
                                                        and TX   OH, and   NJ, and           37 States
                                                                      PA        VA              and DC
Total Sample
  SSRs                                            36        90        96        60        12       456       750
  Segments                                       288       720       768       480        96     3,648     6,000
  Selected DUs                                64,759   140,595   136,334    85,209    13,733   518,070   958,699
  Expected Eligible DUs                       55,170   119,778   116,148    72,593    11,700   441,363   816,752
  Expected Completed Screening Interviews     19,750    42,877    41,578    25,986     4,188   157,996   292,375
  Expected Selected Persons                   13,348    28,979    28,100    17,563     2,831   106,782   197,602
  Expected Completed Interviews                4,560     9,900     9,600     6,000       967    36,480    67,507
Total per State
  SSRs                                            36        30        24        15        12        12       N/A
  Segments                                       288       240       192       120        96        96       N/A
  Expected Selected DUs                       64,759    46,865    34,084    21,302    13,733    13,633       N/A
  Expected Completed Interviews                4,560     3,300     2,400     1,500       967       960       N/A
  Expected Interviews per Segment              15.83     13.75     12.50     12.50     10.07     10.00       N/A
Total per SSR and Segment, by Quarter
  Segments per SSR                                 2         2         2         2         2         2       N/A
  Expected Interviews per SSR                  31.67     27.50     25.00     25.00     20.15     20.00       N/A
  Expected Interviews per Segment              15.83     13.75     12.50     12.50     10.07     10.00       N/A
CA = California; DC = District of Columbia; DU = dwelling unit; FL = Florida; GA = Georgia; HI = Hawaii; IL = Illinois; MI =
Michigan; N/A = not applicable; NC = North Carolina; NJ = New Jersey; NY = New York; OH = Ohio; PA = Pennsylvania;
SSR = state sampling region; TX = Texas; VA = Virginia.
Note: This table was prepared using 2019-2021 National Survey on Drug Use and Health data.
Table 4.2 2023 NSDUH Sample Sizes and Projected Respondents; by State and Age Group
                                            Total
                          State          Selected    Total
                       Sampling    Total Dwelling Selected   Age Groups for Total Respondents
State                   Regions Segments    Units  Persons    12-17    18-25    26-34    35-49      50+    Total
Total Population            750    6,000  958,699  197,602   16,877   16,877   10,126   13,501   10,126   67,507
Northeast
Connecticut                  12       96   13,633    2,810      240      240      144      192      144      960
Maine                        12       96   13,633    2,810      240      240      144      192      144      960
Massachusetts                12       96   13,633    2,810      240      240      144      192      144      960
New Hampshire                12       96   13,633    2,810      240      240      144      192      144      960
New Jersey                   15      120   21,302    4,391      375      375      225      300      225    1,500
New York                     30      240   46,865    9,660      825      825      495      660      495    3,300
Pennsylvania                 24      192   34,084    7,025      600      600      360      480      360    2,400
Rhode Island                 12       96   13,633    2,810      240      240      144      192      144      960
Vermont                      12       96   13,633    2,810      240      240      144      192      144      960
Midwest
Illinois                     24      192   34,084    7,025      600      600      360      480      360    2,400
Indiana                      12       96   13,633    2,810      240      240      144      192      144      960
Iowa                         12       96   13,633    2,810      240      240      144      192      144      960
Kansas                       12       96   13,633    2,810      240      240      144      192      144      960
Michigan                     24      192   34,084    7,025      600      600      360      480      360    2,400
Minnesota                    12       96   13,633    2,810      240      240      144      192      144      960
Missouri                     12       96   13,633    2,810      240      240      144      192      144      960
Nebraska                     12       96   13,633    2,810      240      240      144      192      144      960
North Dakota                 12       96   13,633    2,810      240      240      144      192      144      960
Ohio                         24      192   34,084    7,025      600      600      360      480      360    2,400
South Dakota                 12       96   13,633    2,810      240      240      144      192      144      960
Wisconsin                    12       96   13,633    2,810      240      240      144      192      144      960
South
Alabama                      12       96   13,633    2,810      240      240      144      192      144      960
Arkansas                     12       96   13,633    2,810      240      240      144      192      144      960
Delaware                     12       96   13,633    2,810      240      240      144      192      144      960
District of Columbia         12       96   13,633    2,810      240      240      144      192      144      960
Florida                      30      240   46,865    9,660      825      825      495      660      495    3,300
Georgia                      15      120   21,302    4,391      375      375      225      300      225    1,500
Kentucky                     12       96   13,633    2,810      240      240      144      192      144      960
Louisiana                    12       96   13,633    2,810      240      240      144      192      144      960
Maryland                     12       96   13,633    2,810      240      240      144      192      144      960
Mississippi                  12       96   13,633    2,810      240      240      144      192      144      960
North Carolina               15      120   21,302    4,391      375      375      225      300      225    1,500
Oklahoma                     12       96   13,633    2,810      240      240      144      192      144      960
South Carolina               12       96   13,633    2,810      240      240      144      192      144      960
Tennessee                    12       96   13,633    2,810      240      240      144      192      144      960
Texas                        30      240   46,865    9,660      825      825      495      660      495    3,300
Virginia                     15      120   21,302    4,391      375      375      225      300      225    1,500
West Virginia                12       96   13,633    2,810      240      240      144      192      144      960
West
Alaska                       12       96   13,633    2,810      240      240      144      192      144      960
Arizona                      12       96   13,633    2,810      240      240      144      192      144      960
California                   36      288   64,759   13,348    1,140    1,140      684      912      684    4,560
Colorado                     12       96   13,633    2,810      240      240      144      192      144      960
Hawaii                       12       96   13,733    2,831      242      242      145      193      145      967
Idaho                        12       96   13,633    2,810      240      240      144      192      144      960
Montana                      12       96   13,633    2,810      240      240      144      192      144      960
Nevada                       12       96   13,633    2,810      240      240      144      192      144      960
New Mexico                   12       96   13,633    2,810      240      240      144      192      144      960
Oregon                       12       96   13,633    2,810      240      240      144      192      144      960
Utah                         12       96   13,633    2,810      240      240      144      192      144      960
Washington                   12       96   13,633    2,810      240      240      144      192      144      960
Wyoming                      12       96   13,633    2,810      240      240      144      192      144      960

4.3 Fourth-Stage Sample (Dwelling Unit) Selection
The sampling frame for the fourth stage of sample selection will be made up of dwelling
units (DUs) within selected segments (or secondary sampling units [SSUs]). A DU is either a
housing unit for a single household or a listing unit (e.g., a dormitory room or a shelter bed)
within one of the eligible noninstitutional group quarters that are part of the defined target
population. Before any sample selection within selected segments can proceed, DU frames will
be constructed using the hybrid address-based sampling (ABS) approaches described in
Section 3.2.
To estimate the number of DUs that will need to be selected at the fourth stage of
selection, a simulation was run using 2021 NSDUH data. The response rates used in the
simulation were computed using Quarter 2 through 4 2019 and Quarter 1 2020 data and were
adjusted to Quarter 4 2021 multimode experience. After accounting for eligibility, nonresponse,
and the fifth-stage (person) sample selection procedures, it was estimated that roughly 958,699
DUs will need to be selected to obtain a sample of 67,507 responding persons distributed by state
and age group as shown in Tables 4.1 and 4.2. While this estimate is based on current NSDUH
multimode experience, it will be refined closer to the time the quarterly samples are selected.
For the DU stage of selection, the sample segments (or SSUs) initially will be separated
into their respective states. DUs then will be systematically selected from the segment (or SSU)
address frames. The sample of DUs selected from each segment (or SSU) will be based on
sampling rates that will be predefined for the 50 states and the District of Columbia and will be
inversely proportional to the composite-size-measure-based selection probability of the census
tract, census block group, and segment. 19, 20 In addition, historical state-level NSDUH data will
be used to accurately predict interviews per household (yield), eligibility rates, and response
rates in the allocation of the DU samples.
Some constraints will be put on the DU sample sizes. For example, if at least five unused
DUs remain in a segment, a minimum of five sample dwelling units (SDUs) per segment will be
required for cost-efficiency. 21 Similarly, to ensure adequate samples for supplemental studies,
the DU sample size will be limited to half of the actual listing unit count. 22
The total sample will be allocated to calendar Quarters 1 through 4 in proportions of 24.0,
26.0, 25.5, and 24.5 percent, respectively (or 96, 104, 102, and 98 percent of an equal quarterly
share of the sample, respectively). This disproportionate allocation to calendar quarters allows for smaller
sample sizes during the “short” quarters (1 and 4) when either field interviewer (FI) training
(Quarter 1) or holidays (Quarter 4) force the interviewing schedule to be shortened. Although the
disproportionate allocation does produce some additional unequal weighting, the effect is small
and is compensated for by the improved field performance.
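The relationship between the two sets of percentages can be verified directly: each quarter's share is divided by the 25 percent that an equal split would give. The following is a minimal sketch with hypothetical names, applied for illustration to the estimated annual DU sample of 958,699 from Table 4.1.

```python
QUARTER_PERCENT = {1: 24.0, 2: 26.0, 3: 25.5, 4: 24.5}  # shares of the annual sample


def quarterly_allocation(annual_sample):
    """Split an annual sample size across quarters (unrounded)."""
    return {q: annual_sample * pct / 100 for q, pct in QUARTER_PERCENT.items()}


# Each quarter's share relative to an equal 25 percent split, in percent.
relative_to_equal = {q: round(pct / 25.0 * 100) for q, pct in QUARTER_PERCENT.items()}

print(relative_to_equal)
print({q: round(n) for q, n in quarterly_allocation(958_699).items()})
```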
In addition, the quarterly samples of DUs will be partitioned into a series of releases. This
partitioning will allow for greater control of sample sizes for each individual state within the
quarter. As with previous NSDUHs, input will be solicited from the field staff concerning the
ongoing quarter; this input plus the realized sample sizes from previous quarters (if available)
will be reviewed to determine whether an additional release of sample is needed to meet each
state’s yearly goal. In addition to the desired quarterly sample, a supplemental sample (typically
20 percent of the quarterly sample) will be selected that can be used to overcome any shortfall in
a state’s respondent sample size.
During the screening phase of the data collection period, FIs will ask each screening
respondent whether any other living quarters are within the structure or on the property of the
SDU, 23 such as a basement apartment or an apartment above a garage. If the respondent indicates
that a DU is on the premises of the sampled DU and was missed during frame construction, then
the new or missed dwelling(s) will be selected. In addition, FIs will be instructed to call their
supervisors if they notice large differences in the segment listing and what they encounter in the
field during either the screening or interviewing phase of data collection. Then, a special “bust”
procedure will be implemented to minimize bias associated with large numbers of missed DUs
by selecting a sample of them. A bust is defined as 150 or more missed DUs in a segment or 50
or more missed DUs following any one DU.

19 The third stage of selection is eliminated in overlap ABS segments and all new segments. For these
segments, the third-stage probability of selection will be set to 1.
20 As a consequence, all DUs within a specific stratum will be selected with approximately the same
probability and, therefore, will have approximately equal DU sampling weights.
21 If fewer than five unused DUs remain in the segment, fewer than five SDUs will be selected.
22 Although the 2023 segments are not planned to be used again, these DU sample size limits will be
retained in case, for example, the decennial census data do not become available before the 2024 NSDUH sample is
selected.
23 To avoid respondent confusion, the missed DU question is skipped for web respondents and residents of
larger multiunit structures (three or more units). FIs report to their supervisors large differences in the number of
units observed and listed. The missed DU question is also skipped for residents of group quarters and residents of
housing units associated with group quarters (e.g., a “house mother” apartment in a sorority house). The number of
units in a group quarters structure is generally confirmed with a gatekeeper when gaining access to the group
quarters.

4.4 Fifth-Stage Sample (Person) Selection
After DUs are selected within each segment, each selected DU will be mailed an
invitation to participate in the survey online. Web screening respondents will enter roster
information for all persons residing in the DU. If the DU does not respond initially via web, an
FI will visit the DU to obtain a roster of all persons residing in the DU. This roster information
will be used to select 0, 1, or 2 persons for the survey. Sampling rates will be preset by age group
and state. Roster information will be entered directly into the web or electronic screening
instrument, which will automatically implement this fifth stage of selection based on the state
and age group sampling parameters.
One benefit of using an electronic screening instrument in NSDUH is the ability to
automate a complicated person-level selection algorithm at the fifth stage of the NSDUH design.
The selection algorithm allows any pair of survey-eligible people within a DU to have some
known chance of being selected. This feature of the design is of interest to NSDUH researchers
because, for example, it allows analysts to examine, using appropriate design-based weights,
how the drug use propensity of one individual in a family relates to the drug use propensity of
another family member residing in the same DU (e.g., the relationship of drug use between a
parent and child).
As shown in Table 4.1, at the fifth stage of selection, roughly 197,602 people will be
selected from within 292,375 screened and eligible DUs. Assuming a 36 percent screening
completion rate and a 34 percent interview completion rate, these sample sizes are sufficient to
obtain the desired 67,507 person respondents. Based on Quarter 4 2021 experience,
approximately 28,353 (42 percent) of the 67,507 interviews are expected to be completed via
web. Table 4.3 displays the expected number of completed screeners and interviews by mode.
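The rates quoted in this paragraph follow from the counts in Table 4.1. The short check below is a sketch for the reader's convenience; the variable names are hypothetical, and the rates are simple ratios of the expected counts rather than weighted response rates.

```python
selected_dus = 958_699
eligible_dus = 816_752
completed_screenings = 292_375
selected_persons = 197_602
completed_interviews = 67_507
web_interviews = 28_353

eligibility_rate = eligible_dus / selected_dus
screening_rate = completed_screenings / eligible_dus
interview_rate = completed_interviews / selected_persons
web_share = web_interviews / completed_interviews

for label, rate in [("DU eligibility", eligibility_rate),
                    ("screening completion", screening_rate),
                    ("interview completion", interview_rate),
                    ("web share of interviews", web_share)]:
    print(f"{label}: {rate:.1%}")
```

Rounded to whole percentages, the screening and interview completion rates come to 36 and 34 percent, and the web share to 42 percent, matching the figures in the text.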
Table 4.3 2023 NSDUH Expected Completed Screening Interviews and Interview
Respondents; by Mode
Mode              Selected    Expected    Expected    Expected    Expected
                       DUs    Eligible   Completed    Selected   Completed
                                   DUs   Screening     Persons  Interviews
                                        Interviews
Total Sample       958,699     816,752     292,375     197,602      67,507
Web                 86,283      86,274      86,274      58,309      28,353
In-person          872,416     730,478     206,101     139,293      39,154
DU = dwelling unit.

4.5 Pair Sampling Parameter
The pair sampling algorithm in NSDUH is based on the Chromy and Penne (2002)
adaptation of the Brewer (1963, 1975) method for selecting samples of size 2. Chromy and
Penne (2002) adapted the method to select samples of 0, 1, or 2 persons within a selected DU
containing at least one eligible person. They also introduced a pair sampling parameter λ that
governs the number of pairs selected.
The target selection probability for person i in DU h is defined as $P_{hi}$. 24 Then, to ensure
that all pairs have a positive probability of selection, all person probabilities must be strictly less
than 1, and, arbitrarily, the maximum $P_{hi}$ is set to 0.99. In Brewer’s (unadapted) method of
sampling pairs, the sum of first-order inclusion probabilities is always equal to n = 2. However,
because the design calls for a selection of 0, 1, or 2 persons per DU, it is unlikely that the sum of
person probabilities within a DU, $S_h = \sum_i P_{hi}$, equals 2. The following adaptations are then
applied to the sampling algorithm:
• If $S_h > 2$, a multiplicative scaling factor, $F_h = 2/S_h$, is applied to all the target selection
  probabilities so that they are scaled down to sum to exactly 2.
• If $S_h < 2$, the problem is remedied by creating three dummy persons and distributing
  the remaining size measure $(2 - S_h)$ to them equally (i.e., the inclusion of dummy
  persons in the selection could result in the selection of 0 or 1 actual persons).
Operationally, this initially requires the application of the following multiplicative
scaling factor to the person probabilities:
\[ F_h = \min\left\{ \frac{2}{S_h}, \frac{0.99}{\max(P_{hi})} \right\}. \]
However, a further modification is applied to this scaling factor that allows some flexibility in
the actual number of pairs selected. This modification is governed by the pair sampling
parameter λ. Define
\[ T(\lambda) = S_h + \lambda (2 - S_h); \quad 0 \le \lambda \le 1. \]
Then the modified multiplicative scaling factor is expressed as
\[ F_h^{*} = \min\left\{ \frac{T(\lambda)}{S_h}, \frac{0.99}{\max(P_{hi})} \right\}. \]
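To make the scaling concrete, the sketch below computes the modified factor for a single DU. It is a hedged illustration and not the NSDUH screening program: the function names are hypothetical, and the code reflects one reading of the text, namely that the factor 2/S_h is used when S_h exceeds 2, that the λ-modified factor is used otherwise, and that the size measure left over after scaling is split equally among three dummy persons.

```python
def pair_scaling_factor(target_probs, lam=0.25, cap=0.99):
    """Multiplicative scaling factor for one DU's person selection probabilities."""
    s_h = sum(target_probs)
    if s_h > 2:
        # Scale down so the probabilities sum to exactly 2.
        return 2.0 / s_h
    # T(lambda) moves the target sum from S_h (lam = 0) toward 2 (lam = 1).
    t_lambda = s_h + lam * (2.0 - s_h)
    # The cap keeps every scaled probability at or below 0.99.
    return min(t_lambda / s_h, cap / max(target_probs))


def scale_with_dummies(target_probs, lam=0.25, n_dummies=3):
    """Return (scaled person probabilities, size measures of dummy persons)."""
    factor = pair_scaling_factor(target_probs, lam)
    scaled = [factor * p for p in target_probs]
    # Assumption: whatever is needed to reach a total of 2 goes to the dummies.
    remainder = max(0.0, 2.0 - sum(scaled))
    return scaled, [remainder / n_dummies] * n_dummies


scaled, dummies = scale_with_dummies([0.3, 0.2])
print(scaled, dummies)
```

For a DU with target probabilities 0.3 and 0.2 and λ = 0.25, S_h is 0.5 and T(λ) is 0.875, so under these assumptions both probabilities are multiplied by 1.75 and the remaining 1.125 is shared by the dummy persons.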
Simulation analyses resulted in the selection of λ = 0.50 for the 2002 to 2013 NSDUH
sample designs. However, changes to the sample design in 2014 with respect to age group and
state necessitated further simulation analyses to identify the value of λ best suited for the 2014
through 2023 design. Simulation analyses based on the 2012 screening data, modified to reflect
the required 2014 through 2023 age group sample proportions (but not modified to reflect the
new state proportions), were conducted, and λ = 0.25 was selected. Using λ = 0.25, the 2014
through 2023 design was expected to produce a similar number and distribution of selected pairs
as the previous design (Center for Behavioral Health Statistics and Quality, 2016). Table 4.4
displays the expected pair selection counts (scaled to sum to 67,507) and corresponding pair
response rates for λ = 0.25 updated using 2021 screening data.
24 $P_{hi}$ is equivalent to the state (h) and segment (j) probability of selection in age group a, or $S_{hja}$, in the
2021 NSDUH sample design report (Center for Behavioral Health Statistics and Quality, 2022).

Table 4.4 Expected Pair Selection Counts and Response Rates for λ = 0.25
Age Group Pair
12+, 12+
12-17, 12-17
12-17, 18-25
12-17, 26+
18-25, 18-25
18-25, 26+
26+, 26+
Selected Pairs
43,629
5,052
4,360
12,151
6,619
7,208
8,239
Pair Response Rate, %
27.8
34.9
25.8
27.8
28.0
25.9
26.2
Source: SAMHSA, Center for Behavioral Health Statistics and Quality, National Survey on Drug Use and Health, 2021.
4.6 Expected Precision and Projected Domain Sample Sizes
The multistage, stratified 2023 survey design is intended to achieve acceptable precision
for various person subpopulations of interest. The allocation of persons per state and age group
(12 to 17, 18 to 25, 26 to 34, 35 to 49, and 50 or older) is also taken as a requirement to support
direct estimation in some large sample states and SAE in the remaining states. Using the state
and age group distribution presented in Table 4.2 and 2021 NSDUH data, estimates and relative
standard errors for 25 key outcome measures and domains of interest were modeled and are
presented in Table 4.5. The parametric variance model represents the variance of key estimates
as a function of sample design parameters, including unequal weighting effects, clustering
effects, and the impact of respondent sample size differences across segments (or SSUs) and
DUs (see Appendix A).
Table 4.5 Relative Standard Errors and Completed Interviews for Key Outcome
Measures; by Demographic Domain

| Data File Variable Name | Measure | Domain | 2021 Prevalence | 2021 RSE | Projected RSE (2023) | Expected Completed Interviews (2023) |
|---|---|---|---|---|---|---|
| ALCMON | Past Month Alcohol Use | 12+ | 0.4755 | 0.0086 | 0.0108 | 67,507 |
| ALCMON | Past Month Alcohol Use | 12-20 | 0.1514 | 0.0306 | 0.0425 | 23,334 |
| ALCMON | Past Month Alcohol Use | 50+ | 0.4713 | 0.0156 | 0.0179 | 10,126 |
| ALCMON | Past Month Alcohol Use | API, 12+ | 0.3191 | 0.0501 | 0.0597 | 4,025 |
| ALCMON | Past Month Alcohol Use | AIAN, 12+ | 0.3756 | 0.0978 | 0.1277 | 849 |
| ALCMON | Past Month Alcohol Use | Pregnant, 12-44 | 0.0975 | 0.2151 | 0.2708 | 632 |
| BNGDRKMON | Past Month Binge Alcohol Use | 18-25 | 0.2915 | 0.0238 | 0.0219 | 16,877 |
| BNGDRKMON | Past Month Binge Alcohol Use | 12+ | 0.2145 | 0.0146 | 0.0176 | 67,507 |
| MRJMON | Past Month Marijuana Use | 12+ | 0.1299 | 0.0209 | 0.0215 | 67,507 |
| MRJMON | Past Month Marijuana Use | 12-17 | 0.0576 | 0.0617 | 0.0530 | 16,877 |
| MRJMON | Past Month Marijuana Use | 18-25 | 0.2413 | 0.0264 | 0.0244 | 16,877 |
| MRJMON | Past Month Marijuana Use | 50+ | 0.0756 | 0.0481 | 0.0576 | 10,126 |
| MRJMON | Past Month Marijuana Use | API, 12+ | 0.0613 | 0.1193 | 0.1166 | 4,025 |
| MRJMON | Past Month Marijuana Use | AIAN, 12+ | 0.2702 | 0.1370 | 0.1522 | 849 |
| MRJMON | Past Month Marijuana Use | Pregnant, 12-44 | 0.0714 | 0.2732 | 0.2904 | 632 |
| CIGMON | Past Month Cigarette Use | 12-17 | 0.0151 | 0.1280 | 0.1111 | 16,877 |
| CIGMON | Past Month Cigarette Use | 12+ | 0.1559 | 0.0202 | 0.0249 | 67,507 |
| PNRNMMON | Past Month Pain Reliever Misuse | 18-25 | 0.0064 | 0.1439 | 0.1661 | 16,877 |
| PNRNMMON | Past Month Pain Reliever Misuse | 12+ | 0.0086 | 0.0816 | 0.0935 | 67,507 |
| ABODALC | Past Year Alcohol Disorder | 12+ | 0.0523 | 0.0335 | 0.0349 | 67,507 |
| UDPYILL | Past Year Illicit Drug Disorder | 12+ | 0.0349 | 0.0369 | 0.0360 | 67,507 |
| UDPYILAL | Past Year Substance Use Disorder | 50+ | 0.0434 | 0.0688 | 0.0819 | 10,126 |
| TXYRSPILAL | Past Year Specialty Substance Use Treatment | 12+ | 0.0106 | 0.0732 | 0.0780 | 67,507 |
| SMIPY | Past Year SMI | 18+ | 0.0555 | 0.0299 | 0.0313 | 50,630 |
| IRAMDEYR | Past Year MDE | 18+ | 0.0829 | 0.0243 | 0.0257 | 50,630 |

AIAN = American Indian or Alaska Native (NEWRACE2 = 3); API = Asian or Pacific Islander (NEWRACE2 = 4 or 5); MDE
= major depressive episode; Pregnant, 12-44 = (PREG2=1); RSE = relative standard error; SMI = serious mental illness.
Note: Projected RSEs were determined using 2014 through 2023 state and age sample allocations in a variance component model.
All model components, except average cluster sizes and coefficients of variation, were updated using 2021 NSDUH data
(see Appendix A).
Source: SAMHSA, Center for Behavioral Health Statistics and Quality, National Survey on Drug Use and Health, 2021.

4.7 Assignment of Respondents to 2010 Census Blocks
To allow external data (e.g., census and American Community Survey data) to be
appended to the NSDUH analytic file, a census block will be assigned to each of the
approximately 67,507 respondents. In ABS segments, the census block will be based on the
geocoded location of the address. In field enumerated segments, the census block will be
assigned based on the SDU’s location on the electronic map. With eListing, manual census block
assignments will no longer be required.
5. General Sample Allocation Procedures for the Mental Illness Calibration Study
The overarching goal of the Mental Illness Calibration Study (MICS) within the National
Survey on Drug Use and Health (NSDUH) is to fit a prediction model for serious mental illness
(SMI) among adults aged 18 or older that can be used to create updated model-based estimates of
SMI and other mental illness categories at the national and domain levels (e.g., by age group and
race/ethnicity). This sample of respondents will be contacted following their main NSDUH
interview for a clinical follow-up interview. This clinical interview will be conducted using
criteria in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (American
Psychiatric Association, 2013) for diagnoses. These data will provide the dependent variables in
the prediction models that will be used to compute updated mental illness estimates for the
NSDUH data products and data files.
5.1 Respondent Universe and Sampling Methods for the Mental Illness Calibration Study
The 2023 and 2024 MICS samples are designed to yield 2,000 clinical interviews per
year. Each year, the probability sample will be distributed across four calendar quarters, resulting
in approximately 500 MICS follow-up clinical interviews per quarter. Similar to the Mental
Health Surveillance Study (MHSS) fielded during the 2008 through 2012 NSDUHs, the
probability sample will be embedded in the main study sample; therefore, the initial interview for
the validation cases will be included in the target study sample of approximately 50,630 main
study adult interviews. The target population for the MICS will exclude persons whose main
study interview was conducted in Spanish.
The selection algorithm developed and used in the 2012 MHSS to mitigate the problem
of extreme weights will be used for the MICS. A subsample of eligible respondents aged 18 or
older will be selected for clinical follow-up with probabilities based on their Kessler-6 (K6)
nonspecific psychological distress scale score (Kessler et al., 2003) and World Health
Organization Disability Assessment Scale (WHODAS) score (Novak et al., 2010; Rehm et al.,
1999), and an age group adjustment factor will be applied. A probability sampling algorithm will
be programmed in the computer-assisted interviewing instrument so that selected respondents
can be recruited for the subsequent clinical psychiatric interview conducted on Zoom.
5.2 Person Sample Allocation Procedures for the Mental Illness Calibration Study
Table 5.1 shows some of the factors used to compute sampling rates. Based on 2021
population estimates and the 2023 planned sample, the average weight[25] for persons aged 50
or older is 5.9 times as large as the average weight for persons aged 18 to 25. (Smaller
differences occur for intermediate age groups.) To compensate for this initial disparity in weights
and to focus on persons aged 18 or older as a whole, sampling rates were set for persons aged 18
to 25, then adjusted for the other three age groups by applying the equalization factor, F, shown
in Table 5.1.[26] Persons completing the Spanish-language questionnaire were not eligible to be
selected for the clinical follow-up. An eligibility factor of 97.31 percent (estimated from the
2021 NSDUH for adults 18 or older) was used for all age groups. The response rate used in the
calculations was set to 40 percent. This rate is the product of the percentage agreeing to the
follow-up survey (assumed to be 80 percent) and the proportion of those who actually complete
the follow-up (assumed to be 50 percent).[27]
[25] The average weight is equal to the estimated population totals per age group from the 2021 NSDUH
divided by the expected target sample for each age group planned for years 2023 and 2024.
Table 5.1 MICS Age-Related Factors

| Age | 2021 Population | Planned Sample | Average Weight | Weight Equalization Factor[1] |
|---|---|---|---|---|
| 18 to 25 | 33,458,433 | 16,877 | 1,982.5 | 1.0000 |
| 26 to 34 | 40,161,932 | 10,126 | 3,966.2 | 2.0006 |
| 35 to 49 | 62,151,458 | 13,501 | 4,603.5 | 2.0006 |
| 50 or Older | 118,052,839 | 10,126 | 11,658.4 | 2.0006 |

[1] The weight equalization factor that was computed for Age 26 to 34 was also used for Age 35 to 49 and
50 or Older to reduce the unequal weighting effect.
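The arithmetic behind Table 5.1 can be reproduced with a short Python sketch. The populations and planned samples are taken from the table; reading the equalization factor as the ratio of the 26 to 34 average weight to the 18 to 25 average weight is an assumption that happens to reproduce the published value of 2.0006, and the variable names are illustrative.

```python
# 2021 population estimates and planned sample sizes from Table 5.1.
population = {
    "18 to 25": 33_458_433,
    "26 to 34": 40_161_932,
    "35 to 49": 62_151_458,
    "50 or older": 118_052_839,
}
planned_sample = {
    "18 to 25": 16_877,
    "26 to 34": 10_126,
    "35 to 49": 13_501,
    "50 or older": 10_126,
}

# Average weight = estimated population total / planned sample, per age group.
average_weight = {age: population[age] / planned_sample[age] for age in population}

# Disparity quoted in the text: oldest group relative to the youngest group.
weight_ratio_50_vs_18 = average_weight["50 or older"] / average_weight["18 to 25"]

# Assumed derivation of the equalization factor applied to all three older groups.
equalization_factor = average_weight["26 to 34"] / average_weight["18 to 25"]

for age, weight in average_weight.items():
    print(f"{age:12s} average weight = {weight:,.1f}")
print(f"ratio 50+/18-25 = {weight_ratio_50_vs_18:.1f}; factor = {equalization_factor:.4f}")
```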
The general sample allocation strategy is to find an allocation that provides the most
precise estimate of SMI. A total of 225 strata were defined based on the combination of 25
possible K6 scores (0 to 24) and 9 possible WHODAS scores (0 to 8). First, the predicted
probability of SMI in each stratum is calculated as the average value of the predicted
probabilities of SMI (variable SMIPP in the NSDUH dataset) among all the adult respondents
within this stratum in the 2021 NSDUH. Then, the proportionality factors, $r_{h,age}$, for setting
sampling rates by stratum (denoted $h$) and age group are computed as:
$$r_{h,age} \propto \frac{\sqrt{P_h(1 - P_h)}}{E \times RR} \times F_{age},$$
where $P_h$ refers to the predicted probability of SMI in stratum $h$, and $F_{age}$, $E$, and $RR$ refer to the
age-specific weight equalization factors, the overall eligibility factor, and the overall response
rate factor, respectively. These proportionality factors will then be multiplied by the projected
sample counts and scaled to achieve an overall respondent sample of 2,000 persons aged 18 or
older to obtain the stratum and age-specific sampling rates. As an example, the predicted
probability of SMI for a person with a K6 score of 10 and a WHODAS score of 6 is 0.1123. For
the 18 to 25 age group, the proportionality factor then would be
$$r_{h,18-25} = \frac{\sqrt{0.1123(1 - 0.1123)}}{0.9731 \times 0.4} \times 1.000 = 0.8112.$$
An adjustment factor of 0.1444 was applied to each proportionality factor in order to achieve an
overall sample of 2,000 persons in each year. Thus, the sampling rate for this stratum and age
group was 0.8112 * 0.1444 = 0.1171.
[26] Use of the derived weight equalization factors in Table 5.1 would have greatly increased the sampling
rate for persons aged 50 or older. An adjusted set of factors that partially reduced the unequal weighting effects
across age groups was specified instead. The adjusted equalization factors for the 35 to 49 and 50 or older age
groups were set equal to the factor for the 26 to 34 age group.
[27] Response rate factors were projected using the 2012 Mental Health Surveillance Study and the 2020
Clinical Validation Study as well as consideration of other factors that might impact the response rate.
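The worked example for the K6 = 10, WHODAS = 6 stratum can be checked with the following Python sketch. The function name is hypothetical; the inputs (predicted SMI probability, eligibility factor, response rate, equalization factor, and the 0.1444 adjustment) are the values quoted in the text.

```python
import math


def proportionality_factor(p_smi: float, f_age: float,
                           eligibility: float = 0.9731,
                           response_rate: float = 0.40) -> float:
    """sqrt(P_h * (1 - P_h)) / (E * RR) * F_age for one stratum and age group."""
    return math.sqrt(p_smi * (1.0 - p_smi)) / (eligibility * response_rate) * f_age


ADJUSTMENT = 0.1444  # scaling quoted in the text to reach 2,000 interviews per year

# Stratum with K6 = 10 and WHODAS = 6, age group 18 to 25 (F = 1.0000).
r_18_25 = proportionality_factor(0.1123, f_age=1.0000)
sampling_rate_18_25 = r_18_25 * ADJUSTMENT

print(f"proportionality factor = {r_18_25:.4f}")
print(f"sampling rate          = {sampling_rate_18_25:.4f}")
```

Passing `f_age=2.0006` instead gives the corresponding factor for the three older age groups in the same stratum.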
Projected yields of positive cases based on the predicted probability of SMI broken out
by age group are provided in Table 5.2. In addition, Tables 5.3 and 5.4 provide the 2023 MICS
sample allocation by K6 group and WHODAS score, respectively.
Table 5.2 Projected Yields of Predicted Positive Cases

| Age Group | 18 or Older | 18 to 25 | 26 to 34 | 35 to 49 | 50 or Older |
|---|---|---|---|---|---|
| Total Selected Sample | 5,138 | 1,290 | 1,355 | 1,571 | 922 |
| Expected Completed Interviews | 2,000 | 502 | 527 | 612 | 359 |
| Expected Respondents with AMI | 1,095 | 349 | 323 | 314 | 109 |
| Expected Respondents with SMI | 389 | 144 | 115 | 103 | 27 |

AMI = any mental illness; SMI = serious mental illness.
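As a rough consistency check on Table 5.2, the expected completed interviews can be approximated from the selected sample. The sketch below assumes, without confirmation from the source, that expected completes equal the selected sample multiplied by the 97.31 percent eligibility factor and the 40 percent response rate from Section 5.2; small differences from the table are expected because the published counts are rounded.

```python
ELIGIBILITY = 0.9731   # eligibility factor from Section 5.2
RESPONSE_RATE = 0.40   # assumed overall response rate from Section 5.2

# Total selected sample by age group, from Table 5.2.
selected = {"18 to 25": 1_290, "26 to 34": 1_355, "35 to 49": 1_571, "50 or older": 922}

# Expected completed interviews published in Table 5.2, for comparison.
published = {"18 to 25": 502, "26 to 34": 527, "35 to 49": 612, "50 or older": 359}

expected = {age: n * ELIGIBILITY * RESPONSE_RATE for age, n in selected.items()}
total_expected = sum(expected.values())

for age in selected:
    print(f"{age:12s} approx = {expected[age]:7.1f}   published = {published[age]}")
print(f"total approx = {total_expected:.1f}")
```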
The probability sample of 2,000 clinical follow-up interviews will be distributed across
four calendar quarters in each year with approximately 500 follow-up interviews per quarter.
Throughout the 2023 and 2024 surveys, the MICS sample will be monitored, and the sampling
parameters will be modified on an as-needed basis.
The analysis weight for the MICS sample will be separate from the main study analysis
weight. To compute the MICS analysis weight, the main study analysis weight will be multiplied
by the inverse of the probability of selecting the person for clinical follow-up (the inverse of the
sampling rate used to select the person) and nonresponse and poststratification adjustments. In
addition to the analysis weight, MICS-specific variance estimation strata and replicates will be
created to appropriately account for the design when computing variances of estimates.
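The weight construction described above can be sketched as a simple product. This is an illustration only: the function name, the default adjustment values of 1.0, and the example inputs are hypothetical, and the actual nonresponse and poststratification adjustments would come from the weighting models.

```python
def mics_analysis_weight(main_weight: float, sampling_rate: float,
                         nonresponse_adj: float = 1.0,
                         poststrat_adj: float = 1.0) -> float:
    """Main study weight x inverse MICS selection probability x adjustments."""
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in (0, 1]")
    return main_weight * (1.0 / sampling_rate) * nonresponse_adj * poststrat_adj


# Hypothetical respondent: main study weight of 1,000, selected at a rate of 0.25.
base = mics_analysis_weight(1_000.0, 0.25)
adjusted = mics_analysis_weight(1_000.0, 0.25, nonresponse_adj=1.2, poststrat_adj=0.9)
```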
Table 5.3 2023 MICS Sample Allocation, by K6 Group

| K6 Group | Percent of Population[1] | Assumed SMI Rate (Percent)[1,2] | Expected Completed Interviews | Expected SMI Count | Overall Sampling Rate |
|---|---|---|---|---|---|
| 0 to 3 | 52.58 | 0.94 | 525 | 0 | 0.06018 |
| 4 to 5 | 11.19 | 1.31 | 151 | 0 | 0.06788 |
| 6 to 7 | 8.52 | 2.05 | 138 | 2 | 0.07726 |
| 8 to 9 | 5.95 | 3.15 | 122 | 3 | 0.09426 |
| 10 to 11 | 4.74 | 5.18 | 138 | 6 | 0.12043 |
| 12 to 15 | 8.31 | 11.20 | 354 | 59 | 0.16814 |
| 16 or Higher | 8.71 | 33.69 | 572 | 319 | 0.23591 |
| Total | 100.00 | 5.11 | 2,000 | 389 | |

K6 = Kessler-6, a 6-item psychological distress scale; SMI = serious mental illness.
[1] Source: 2021 National Survey on Drug Use and Health.
[2] To compute assumed SMI rates, SMI estimates by K6 and WHODAS score were averaged (weighted) across K6
scores. These rates are not the actual SMI rates that were used in the sample allocation.
Table 5.4 2023 MICS Sample Allocation, by WHODAS Score

| WHODAS Score | Percent of Population[1] | Assumed SMI Rate (Percent)[1,2] | Expected Completed Interviews | Expected SMI Count | Overall Sampling Rate |
|---|---|---|---|---|---|
| 0 | 70.15 | 1.01 | 777 | 1 | 0.06229 |
| 1 | 7.17 | 2.14 | 137 | 3 | 0.08699 |
| 2 | 4.86 | 4.54 | 141 | 7 | 0.12023 |
| 3 | 3.69 | 6.70 | 133 | 12 | 0.14334 |
| 4 | 3.33 | 11.36 | 141 | 22 | 0.17428 |
| 5 | 2.63 | 16.43 | 149 | 37 | 0.20859 |
| 6 | 2.59 | 23.74 | 155 | 61 | 0.23789 |
| 7 | 2.17 | 34.78 | 142 | 79 | 0.25612 |
| 8 | 3.42 | 46.91 | 224 | 167 | 0.27277 |
| Total | 100.00 | 5.11 | 2,000 | 389 | |

K6 = Kessler-6, a 6-item psychological distress scale; SMI = serious mental illness; WHODAS = World Health
Organization Disability Assessment Scale.
[1] Source: 2021 National Survey on Drug Use and Health.
[2] To compute assumed SMI rates, SMI estimates by K6 and WHODAS score were averaged (weighted) across K6
scores. These rates are not the actual SMI rates that were used in the sample allocation.
References
Amaya, A. E. (2017). RTI International’s Address-Based Sampling Atlas: Drop points. RTI
Press. RTI Press Occasional Paper No. OP-0047-1712
https://doi.org/10.3768/rtipress.2017.op.0047.1712h
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders
(5th ed.). https://doi.org/10.1176/appi.books.9780890425596
Brewer, K. R. W. (1963). A model of systematic sampling with unequal probabilities. Australian
Journal of Statistics, 5(1), 5-13. https://doi.org/10.1111/j.1467-842x.1963.tb00132.x
Brewer, K. R. W. (1975). A simple procedure for sampling πpswor. Australian Journal of
Statistics, 17(3), 166-172. https://doi.org/10.1111/j.1467-842x.1975.tb00954.x
Center for Behavioral Health Statistics and Quality. (2016). 2015 National Survey on Drug Use
and Health methodological resource book: Section 2, sample design report. Rockville, MD:
Substance Abuse and Mental Health Services Administration.
Center for Behavioral Health Statistics and Quality. (2022). 2021 National Survey on Drug Use
and Health (NSDUH) methodological resource book, Section 2, Sample design report.
Rockville, MD: Substance Abuse and Mental Health Services Administration.
Chromy, J. R., & Penne, M. A. (2002). Pair sampling in household surveys. In Proceedings of
the 2002 Joint Statistical Meetings, American Statistical Association, Survey Research Methods
Section, New York, NY (pp. 552-554). Alexandria, VA: American Statistical Association.
Retrieved from http://www.asasrms.org/Proceedings/index.html
Dohrmann, S., Han, D., & Mohadjer, L. (2006). Residential address lists vs. traditional listing:
Enumerating households and group quarters. In Proceedings of the 2006 Joint Statistical
Meetings, American Statistical Association, Survey Research Methods Section, Seattle, WA
(pp. 2959-2964). Alexandria, VA: American Statistical Association.
Dohrmann, S., Han, D., & Mohadjer, L. (2007). Improving coverage of residential address lists
in multistage area samples. In Proceedings of the 2007 Joint Statistical Meetings, American
Statistical Association, Section on Survey Research Methods, Salt Lake City, UT (pp. 3219-3126).
Alexandria, VA: American Statistical Association.
Dohrmann, S., & Sigman, R. (2013). Using an area linkage method to improve the coverage of
ABS frames for in-person household surveys. In Proceedings of Federal Committee on
Statistical Methodology (FCSM) Research Conference. Washington, DC: Federal Committee on
Statistical Methodology.
Kessler, R. C., Barker, P. R., Colpe, L. J., Epstein, J. F., Gfroerer, J. C., Hiripi, E., Howes, M. J.,
Normand, S. L., Manderscheid, R. W., Walters, E. E., & Zaslavsky, A. M. (2003). Screening for
serious mental illness in the general population. Archives of General Psychiatry, 60, 184-189.
Kish, L. (1965). Survey sampling. New York, NY: John Wiley & Sons.
McMichael, J. P. (2015, August). ABS coverage evaluation: Recommendations for evaluating
the household coverage of address-based sampling frames. In Proceedings of the 2015 Joint
Statistical Meetings, American Statistical Association, Section on Survey Research Methods,
Seattle, WA (pp. 2279-2280). Alexandria, VA: American Statistical Association.
Novak, S. P., Colpe, L. J., Barker, P. R., & Gfroerer, J. C. (2010). Development of a brief mental
health impairment scale using a nationally representative sample in the USA. International
Journal of Methods in Psychiatric Research, 19(S1), 49-60. https://doi.org/10.1002/mpr.313
Office of Management and Budget. (2009, December 1). OMB Bulletin No. 10-02: Update of
statistical area definitions and guidance on their uses. Washington, DC: The White House.
https://obamawhitehouse.archives.gov/sites/default/files/omb/assets/bulletins/b10-02.pdf
Rehm, J., Üstün, T. B., Saxena, S., Nelson, C. B., Chatterji, S., Ivis, F., & Adlaf, E. (1999). On
the development and psychometric testing of the WHO screening instrument to assess
disablement in the general population. International Journal of Methods in Psychiatric Research,
8, 110-123. https://doi.org/10.1002/mpr.61
RTI International. (2012). National Survey on Drug Use and Health: Sample redesign issues and
methodological studies (RTI/0209009.486.001 and 0211838.108.006.005, prepared for the
Substance Abuse and Mental Health Services Administration under Contract Nos. 283-2004-00022
and HHSS283200800004C). Research Triangle Park, NC: Author.
SAS Institute Inc. (2017). SAS/STAT software: Release 14.1. Cary, NC: Author.
Shook-Sa, B. E., McMichael, J. P., Ridenhour, J. L., & Iannacchione, V. G. (2010). The
implications of geocoding error on address-based sampling. In Proceedings of the 2010 Joint
Statistical Meetings, American Statistical Association, Section on Survey Research Methods
(pp. 3303-3312). Alexandria, VA: American Statistical Association. Retrieved from
http://www.asasrms.org/Proceedings/index.html
Shook-Sa, B. E., Currivan, D. B., McMichael, J. P., & Iannacchione, V. G. (2013). Extending the
coverage of address-based sampling frames. Public Opinion Quarterly, 75, 994-1005.
https://doi.org/10.1093/poq/nft041
Substance Abuse and Mental Health Services Administration. (2019). Address-based sampling
research report. Rockville, MD: Center for Behavioral Health Statistics and Quality, Substance
Abuse and Mental Health Services Administration.
https://www.samhsa.gov/data/sites/default/files/cbhsq-reports/NSDUHABSReport2019.pdf
List of Contributors
This methodological report was prepared by the Substance Abuse and Mental Health
Services Administration (SAMHSA), Center for Behavioral Health Statistics and Quality, and by
RTI International. Work by RTI was performed under Contract No. 75S20322C00001. Marlon
Daniel served as the government project officer and as the contracting officer representative, and
David Hunter served as the RTI project director.
Contributors to this report at SAMHSA included Rong Cai, Jingsheng Yan, and Iva Magas.
Significant contributors to this report at RTI included Katherine B. Morton, Rachel M.
Harter, Phillip S. Kott, Dan Liao, and Lauren K. Warren. Other contributors at RTI included
Charlotte Looby, Peilan C. Martin, Joe McMichael, Erin Murphy, and Matt Williams.
Appendix A: 2023 National Survey on Drug Use and Health Variance Modeling
Parametric variance models allow a sampling statistician to represent the variance of key
estimates as a function of sample design parameters, as mentioned in Section 4.6. Except where
noted, the 2021 National Survey on Drug Use and Health (NSDUH) data were used to estimate
some of the key population parameters needed for these models. The required parameters include
the following:
• variance components,
• unequal weighting effects (UWEs),
• averages and coefficients of variation for respondents per segment,
• averages and coefficients of variation for respondents per dwelling unit (DU), and
• prevalence estimates by domain.
A.1 Variance Components
The variability (variance) of an estimate across individuals, after adjusting for age, is
treated as being (on average) composed of random components due to stratum (state sampling
region or SSR), segment or secondary sampling unit (SSU) (within SSR),[28] DU (within segment
or SSU), and the person (within the DU). The variance components measure the contribution to
the variance from each of these sources. The notation for variance components associated
with SSR, segment or SSU, DU, and person is shown in Table A.1.
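Table A.1 Variance Component Notation

| Level | Symbol |
|---|---|
| State Sampling Region (SSR) | $\sigma^2_{SSR}$ |
| Segment or SSU | $\sigma^2_{segment}$ |
| Dwelling Unit (DU) | $\sigma^2_{DU}$ |
| Person | $\sigma^2_{person}$ |
| Total | $\sigma^2 = \sigma^2_{SSR} + \sigma^2_{segment} + \sigma^2_{DU} + \sigma^2_{person}$ |
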
Variance components were estimated using the method of moments in SAS PROC
NESTED with equal weights (SAS Institute Inc., 2017) and were computed for 11 substance use
and treatment variables of interest, controlling for age group. Table A.2 shows variance
components for the 11 selected measures.
[28] Because one segment or SSU is selected per sampled PSU, the PSU and segment/SSU components of
variance are the same.
Table A.2 Variance Components of Residuals as a Percentage of Total Variance for
Selected Measures, Based on 2021 NSDUH

| Variable | Mean | $\sigma^2_{SSR}$ (% of Total Variance) | $\sigma^2_{segment}$ (% of Total Variance) | $\sigma^2_{DU}$ (% of Total Variance) | $\sigma^2_{person}$ (% of Total Variance) |
|---|---|---|---|---|---|
| Past Month Alcohol Use, 12+ | 0.4755 | 2.7251 | 3.6335 | 28.0795 | 65.5619 |
| Past Month Binge Alcohol Use, 12+ | 0.2145 | 0.9129 | 1.7385 | 30.8592 | 66.4894 |
| Past Month Marijuana Use, 12+ | 0.1299 | 2.3194 | 1.1551 | 29.6786 | 66.8469 |
| Past Month Cigarette Use, 12+ | 0.1559 | 2.0814 | 3.7511 | 32.8277 | 61.3398 |
| Past Month Pain Reliever Misuse, 12+ | 0.0086 | 0.1027 | 0.0300 | 15.0013 | 84.8660 |
| Past Year Alcohol Disorder, 12+ | 0.0523 | 0.3935 | 0.0379 | 29.1925 | 70.3760 |
| Past Year Illicit Drug Disorder, 12+ | 0.0349 | 0.6746 | 0.0000 | 15.2491 | 84.0763 |
| Past Year Substance Use Disorder, 50+ | 0.0434 | 0.0000 | 5.7366 | 22.8460 | 71.4174 |
| Past Year Specialty Substance Use Treatment, 12+ | 0.0106 | 0.3074 | 0.0016 | 24.7496 | 74.9414 |
| Past Year SMI, 18+ | 0.0555 | 0.4503 | 0.0000 | 4.7326 | 94.8171 |
| Past Year MDE, 18+ | 0.0829 | 0.4086 | 0.0000 | 4.3883 | 95.2031 |

MDE = major depressive episode; SMI = serious mental illness.
Source: SAMHSA, Center for Behavioral Health Statistics and Quality, National Survey on Drug Use and Health, 2021.
A.2 Unequal Weighting Effects
The UWE for a domain, $d$, is
$$\mathrm{UWE}_d = \frac{n_d \sum_{i \in d} w_i^2}{\left(\sum_{i \in d} w_i\right)^2},$$
where $n_d$ is the domain sample size and $w_i$ is the weight for respondent $i$.
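The UWE formula translates directly into code. The following Python sketch uses a hypothetical function name and made-up weights; it is meant only to show how the statistic behaves, not to reproduce the values in Table A.3.

```python
from typing import Sequence


def unequal_weighting_effect(weights: Sequence[float]) -> float:
    """UWE_d = n_d * sum(w_i^2) / (sum(w_i))^2 over the respondents in a domain."""
    n_d = len(weights)
    if n_d == 0:
        raise ValueError("domain has no respondents")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return n_d * sum(w * w for w in weights) / (total * total)


# Equal weights give a UWE of exactly 1; unequal weights push it above 1.
uwe_equal = unequal_weighting_effect([2.0, 2.0, 2.0, 2.0])
uwe_unequal = unequal_weighting_effect([1.0, 1.0, 4.0])
```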
Table A.3 displays the UWEs for the 2021 national estimates by five age groups.
Table A.3 Unequal Weighting Effects for the National Estimates, Based on 2021 NSDUH

| Age Group | Unequal Weighting Effect |
|---|---|
| 12+ | 3.91031 |
| 12-17 | 2.70185 |
| 18-25 | 2.99204 |
| 26-34 | 2.93169 |
| 35-49 | 3.07589 |
| 50+ | 2.76581 |

A.3 Cluster Sizes and Cluster Size Variation
Parametric variance models also require the average cluster size and the coefficient of
variation of the cluster size at the segment and DU levels. For age group domain $\delta$, $m_{\delta,seg}$ is the
segment-level average cluster size, $CV_{m_\delta,seg}$ is the segment-level coefficient of variation of the
cluster size, $m_{\delta,DU}$ is the DU-level average cluster size, and $CV_{m_\delta,DU}$ is the DU-level coefficient of
variation of the cluster size. Because additional segments were added to the 2021 NSDUH
sample, the 2019 NSDUH average cluster sizes and coefficients of variation are closer to those
expected for the 2023 NSDUH and are presented in Table A.4.
Table A.4 Average Cluster Sizes and Coefficients of Variation, Based on 2019 NSDUH

| Age Group Domain ($\delta$) | Population Size ($\sum_{i \in \delta} w_i$) | $N_\delta$ | Respondents/DU[1]: $m_{\delta,DU}$ | Respondents/DU[1]: $CV_{m_\delta,DU}$ | Respondents/Segment: $m_{\delta,seg}$ | Respondents/Segment: $CV_{m_\delta,seg}$ |
|---|---|---|---|---|---|---|
| 12+ | 275,221,249 | 67,625 | 0.4569 | 1.5160 | 11.2708 | 0.5364 |
| 12-17 | 24,905,039 | 16,858 | 0.1139 | 3.2007 | 2.8097 | 0.8456 |
| 12-20 | 38,073,463 | 23,158 | 0.1564 | 2.7620 | 3.8597 | 0.8028 |
| 18-25 | 33,732,492 | 16,516 | 0.1116 | 3.2579 | 2.7527 | 1.0754 |
| 26-34 | 40,322,989 | 10,207 | 0.0690 | 3.9711 | 1.7012 | 0.9957 |
| 50+ | 115,300,847 | 10,509 | 0.0710 | 3.7744 | 1.7515 | 0.9553 |
| 18+ | 250,316,210 | 50,767 | 0.3430 | 1.6768 | 8.4612 | 0.5746 |
| API, 12+ | 16,977,699 | 3,432 | 0.0232 | 7.7135 | 0.5720 | 2.7743 |
| AIAN, 12+ | 1,563,640 | 804 | 0.0054 | 15.6505 | 0.1340 | 5.8021 |
| Pregnant, 12+ | 2,069,681 | 759 | 0.0051 | 13.9477 | 0.1265 | 2.9137 |

AIAN = American Indian or Alaska Native; API = Asian or Pacific Islander; DU = dwelling unit.
[1] The total number of DUs in the sample is actually more than the number of respondents. This does not matter because the key
statistic has the form $m_{\delta,DU}(1 + CV^2_{m_\delta,DU})$ (see Equation A1 in Section A.5), which is insensitive to the number of DUs in the
sample.
A.4 Estimates, by Demographic Domain
The last set of parameters required for the variance models is the prevalence estimates by domain.
Table A.5 displays these estimates and their standard errors.
Table A.5 Prevalence Estimates and Standard Errors for Key Outcome Measures;
by Demographic Domain

| Data File Variable Name | Measure | Domain | 2021 Prevalence | Standard Error |
|---|---|---|---|---|
| ALCMON | Past Month Alcohol Use | 12+ | 0.4755 | 0.0041 |
| ALCMON | Past Month Alcohol Use | 12-20 | 0.1514 | 0.0046 |
| ALCMON | Past Month Alcohol Use | 50+ | 0.4713 | 0.0073 |
| ALCMON | Past Month Alcohol Use | API, 12+ | 0.3191 | 0.0160 |
| ALCMON | Past Month Alcohol Use | AIAN, 12+ | 0.3756 | 0.0367 |
| ALCMON | Past Month Alcohol Use | Pregnant, 12-44 | 0.0975 | 0.0210 |
| BNGDRKMON | Past Month Binge Alcohol Use | 18-25 | 0.2915 | 0.0069 |
| BNGDRKMON | Past Month Binge Alcohol Use | 12+ | 0.2145 | 0.0031 |
| MRJMON | Past Month Marijuana Use | 12+ | 0.1299 | 0.0027 |
| MRJMON | Past Month Marijuana Use | 12-17 | 0.0576 | 0.0036 |
| MRJMON | Past Month Marijuana Use | 18-25 | 0.2413 | 0.0064 |
| MRJMON | Past Month Marijuana Use | 50+ | 0.0756 | 0.0036 |
| MRJMON | Past Month Marijuana Use | API, 12+ | 0.0613 | 0.0073 |
| MRJMON | Past Month Marijuana Use | AIAN, 12+ | 0.2702 | 0.0370 |
| MRJMON | Past Month Marijuana Use | Pregnant, 12-44 | 0.0714 | 0.0195 |
| CIGMON | Past Month Cigarette Use | 12-17 | 0.0151 | 0.0019 |
| CIGMON | Past Month Cigarette Use | 12+ | 0.1559 | 0.0032 |
| PNRNMMON | Past Month Pain Reliever Misuse | 18-25 | 0.0064 | 0.0009 |
| PNRNMMON | Past Month Pain Reliever Misuse | 12+ | 0.0086 | 0.0007 |
| ABODALC | Past Year Alcohol Disorder | 12+ | 0.0523 | 0.0018 |
| UDPYILL | Past Year Illicit Drug Disorder | 12+ | 0.0349 | 0.0013 |
| UDPYILAL | Past Year Substance Use Disorder | 50+ | 0.0434 | 0.0030 |
| TXYRSPILAL | Past Year Specialty Substance Use Treatment | 12+ | 0.0106 | 0.0008 |
| SMIPY | Past Year SMI | 18+ | 0.0555 | 0.0017 |
| IRAMDEYR | Past Year MDE | 18+ | 0.0829 | 0.0020 |

AIAN = American Indian or Alaska Native (NEWRACE2 = 3); API = Asian or Pacific Islander (NEWRACE2 = 4 or 5); MDE =
major depressive episode; Pregnant, 12-44 = (PREG2=1); SMI = serious mental illness.
Source: SAMHSA, Center for Behavioral Health Statistics and Quality, National Survey on Drug Use and Health, 2021.
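The 2021 RSEs reported in Table 4.5 are the standard errors in Table A.5 divided by the corresponding prevalence estimates. The short Python sketch below illustrates this for two rows; the function name is hypothetical, and because the published prevalences and standard errors are rounded to four decimals, the recomputed RSEs agree with Table 4.5 only up to that rounding.

```python
def relative_standard_error(prevalence: float, standard_error: float) -> float:
    """RSE = standard error / estimate."""
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return standard_error / prevalence


# Past month alcohol use, 12+ (Table A.5: prevalence 0.4755, SE 0.0041).
rse_alcohol_12plus = relative_standard_error(0.4755, 0.0041)

# Past month alcohol use, pregnant women 12-44 (prevalence 0.0975, SE 0.0210).
rse_alcohol_pregnant = relative_standard_error(0.0975, 0.0210)

print(f"{rse_alcohol_12plus:.4f}  {rse_alcohol_pregnant:.4f}")
```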
A.5 Variance Models
Let $\delta$ be a domain of interest (e.g., 12 to 20, 12 or older Asian or Pacific Islander [API],
18 or older). The parametric variance model for a U.S.-level estimate under the 2014 through
2023 design, with differing age group targets, is as follows:
$$\mathrm{Var}\left(\hat{p} \mid n = 67{,}507\right) = \sum_{\text{relevant } a} W_{a\delta}^2 \, \frac{p_a(1 - p_a)}{f_a c_{a\delta}\, 67{,}507} \, \mathrm{UWE}_{US,a} \left\{ \sigma^2_{seg}\, m_{\delta,seg}\left(1 + CV^2_{m_\delta,seg}\right) + \sigma^2_{DU}\, m_{\delta,DU}\left(1 + CV^2_{m_\delta,DU}\right) + \sigma^2_{person} \right\}, \quad \text{(A1)}$$
where
$a$ = an age group (e.g., relevant $a$ means that the 26 or older age group is excluded
when the domain is 18- to 25-year-olds);
$p_a$ = the estimated (domain) proportion of interest within age group $a$;
$W_{a\delta}$ = the (estimated) population share of age group $a$ within domain $\delta$, that is,
$\sum_{i \in a \cap \delta} w_i \big/ \sum_{i \in \delta} w_i$;
$f_a$ = the proposed sampling fraction for age group $a$; and
$c_{a\delta}$ = the fraction of the sample in age group $a$ that is also in domain $\delta$, that is,
$\sum_{i \in a \cap \delta} 1 \big/ \sum_{i \in a} 1$.
The value of $c_{a\delta}$ is 1 except when the domain is API, American Indian or Alaska Native
(AIAN), pregnant women, or 12- to 20-year-olds. For the 12 to 20 domain, 0.375 is used as the
fraction of 18- to 20-year-olds within the 18 to 25 age group. Table A.6 displays the other
needed sample fractions from the 2021 NSDUH sample.
Table A.6 2021 Sample Fraction of an Age Group Originating in Selected Domains

| Age Group | Pregnant Women | AIAN | API |
|---|---|---|---|
| 12-17 | 0.0014 | 0.0101 | 0.0483 |
| 18-25 | 0.0173 | 0.0103 | 0.0630 |
| 26-34 | 0.0335 | 0.0098 | 0.0615 |
| 35-49 | 0.0085 | 0.0084 | 0.0694 |
| 50+ | 0.0000 | 0.0066 | 0.0436 |

AIAN = American Indian or Alaska Native; API = Asian or Pacific Islander.
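To show how the pieces of Equation A1 fit together, the following Python sketch evaluates the model for a domain made up of a single age group: past month binge alcohol use among 18- to 25-year-olds. Several choices here are assumptions rather than statements of the production procedure: the variance components are the 12+ values for binge alcohol use from Table A.2 converted from percentages to fractions, the UWE is the 18-25 value from Table A.3, the cluster sizes come from Table A.4, and the function names are illustrative. Under these assumptions the modeled RSE comes out at roughly 0.022, close to the projected value of 0.0219 in Table 4.5.

```python
import math


def design_term(var_seg, var_du, var_person, m_seg, cv_seg, m_du, cv_du):
    """Braced clustering term of Equation A1 for one domain."""
    return (var_seg * m_seg * (1.0 + cv_seg ** 2)
            + var_du * m_du * (1.0 + cv_du ** 2)
            + var_person)


def modeled_variance(age_groups, design, n_total=67_507):
    """Sum the Equation A1 contributions over the relevant age groups."""
    total = 0.0
    for g in age_groups:
        binomial = g["p"] * (1.0 - g["p"]) / (g["f"] * g["c"] * n_total)
        total += g["W"] ** 2 * binomial * g["uwe"] * design
    return total


design = design_term(
    var_seg=0.017385, var_du=0.308592, var_person=0.664894,  # Table A.2, as fractions
    m_seg=2.7527, cv_seg=1.0754,                             # Table A.4, 18-25 row
    m_du=0.1116, cv_du=3.2579,
)

# One relevant age group: W = 1 and c = 1; f is the planned 18-25 share of the sample.
age_groups = [{"p": 0.2915, "W": 1.0, "c": 1.0, "f": 16_877 / 67_507, "uwe": 2.99204}]

variance = modeled_variance(age_groups, design)
rse = math.sqrt(variance) / 0.2915
print(f"modeled RSE = {rse:.4f}")
```

For domains that span several age groups, the list would hold one entry per relevant age group, with the population shares, sampling fractions, and Table A.6 sample fractions supplied for each.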
SAMHSA’s mission is to reduce the impact of substance abuse and mental illness on America’s communities.
1-877-SAMHSA-7 (1-877-726-4727) | 1-800-487-4889 (TDD) | www.samhsa.gov