Chief, Laboratory of Epidemiology and Biometry
Division of Intramural Clinical and Biological Research
National Institute on Alcohol Abuse and Alcoholism
5635 Fishers Lane
Rockville, MD 20852
Phone: (301) 443-7370
Fax: (301) 443-1400
Email: bgrant@willco.niaaa.nih.gov
Table of Contents
B. COLLECTIONS OF INFORMATION EMPLOYING STATISTICAL METHODS
B.1 Respondent Universe and Sampling Methods
B.2 Procedures for the Collection of Information
B.3 Methods to Maximize Response Rates and Deal with Nonresponse
B.4 Test of Procedures or Methods to be Undertaken
B.5 Individuals Consulted on Statistical Aspects and Individuals Collecting and/or Analyzing Data
(In Order of Appearance in Supporting Statement B)
Attachment 8. Overview of NESARC-III Interview Flow Including Consent Documents and Procedures, and DNA Collection Procedures and Associated Script and Screenshots
Attachment 21. Screening and Recontact Scripts
Attachment 22. Thank You Letter and Example Refusal Letter
Attachment 23. Sample Size Requirements for Genetic Analyses
B. Collections of Information Employing Statistical Methods
The target population of NESARC-III is the non-institutionalized, civilian population 18 years of age or older in the U.S. (the 50 states and the District of Columbia), including persons residing in noninstitutionalized group quarters such as college dormitories, group homes, and dormitories for workers. Note, however, that college students will be sampled at their permanent residence rather than at their dormitory, as described later in this document.
Estimates of the NESARC-III respondent universe are shown in Table 1, which presents the number of persons in specific age, sex, and race-ethnicity domains derived from population projections for 2011 from the U.S. Census Bureau Population Division. Population estimates are defined by sex, five age groups (18-24, 25-29, 30-39, 40-49, 50+), and five race-ethnicity groups. The mutually exclusive and exhaustive race-ethnicity groups presented in Table 1 are consistent with the most recent revision of the OMB Statistical Policy Directive No. 15, Race and Ethnic Standards for Federal Statistics and Administrative Reporting. They are: Hispanic or Latino (H/L), Not H/L Black alone, Not H/L Asian alone, Not H/L White alone, and Not H/L All Other Races. These groupings form the basis of the planned NESARC-III minority oversampling strategy described in more detail later. The classifications sum to the total projected 2011 non-institutionalized population of 237,610,430 Americans aged 18 years and older.
Table 2 presents the expected number of survey respondents in the final NESARC-III sample, with breakdowns analogous to those given in Table 1. Under the proposed sample design, the overall expected number of completed AUDADIS interviews is 46,500, including approximately 7,220 Hispanics, 7,845 Blacks, 2,210 Asians, 28,235 non-Hispanic Whites, and 990 from All Other Races. Note that within the race-ethnicity groups given in this table, the expected sample sizes for the corresponding age and sex domains are estimated to be proportional to the Respondent Universe figures from Table 1.
Table 1. NESARC-III Respondent Universe*
| Sex/Race-Ethnicity by Age Category | 18+ Years | 18-24 Years | 25-29 Years | 30-39 Years | 40-49 Years | 50+ Years |
|---|---|---|---|---|---|---|
| Total | 237,610,430 | 30,952,071 | 21,464,994 | 40,872,684 | 43,380,464 | 100,940,217 |
| Male | 115,630,207 | 15,796,572 | 10,902,878 | 20,607,571 | 21,583,308 | 46,739,878 |
| Female | 121,980,223 | 15,155,499 | 10,562,116 | 20,265,113 | 21,797,156 | 54,200,339 |
| Hispanic or Latino | | | | | | |
| Total | 33,670,124 | 6,033,402 | 3,893,999 | 8,045,377 | 6,658,641 | 9,038,705 |
| Male | 17,110,210 | 3,117,164 | 2,010,952 | 4,296,734 | 3,458,424 | 4,226,936 |
| Female | 16,559,914 | 2,916,238 | 1,883,047 | 3,748,643 | 3,200,217 | 4,811,769 |
| Not Hispanic or Latino: Black | | | | | | |
| Total | 28,053,265 | 4,563,937 | 2,975,755 | 5,308,327 | 5,303,983 | 9,901,263 |
| Male | 13,077,289 | 2,300,332 | 1,495,933 | 2,554,030 | 2,483,857 | 4,243,137 |
| Female | 14,975,976 | 2,263,605 | 1,479,822 | 2,754,297 | 2,820,126 | 5,658,126 |
| Not Hispanic or Latino: Asians | | | | | | |
| Total | 11,227,067 | 1,306,007 | 1,033,191 | 2,602,108 | 2,323,144 | 3,962,617 |
| Male | 5,228,832 | 646,834 | 486,388 | 1,223,491 | 1,109,506 | 1,762,613 |
| Female | 5,998,235 | 659,173 | 546,803 | 1,378,617 | 1,213,638 | 2,200,004 |
| Not Hispanic or Latino: White | | | | | | |
| Total | 159,920,230 | 18,114,763 | 12,975,487 | 23,995,293 | 28,295,940 | 76,538,747 |
| Male | 77,912,581 | 9,261,250 | 6,616,222 | 12,079,845 | 14,141,717 | 35,813,547 |
| Female | 82,007,649 | 8,853,513 | 6,359,265 | 11,915,448 | 14,154,223 | 40,725,200 |
| Not Hispanic or Latino: All Other Races | | | | | | |
| Total | 4,739,744 | 933,962 | 586,562 | 921,579 | 798,756 | 1,498,885 |
| Male | 2,301,295 | 470,992 | 293,383 | 453,471 | 389,804 | 693,645 |
| Female | 2,438,449 | 462,970 | 293,179 | 468,108 | 408,952 | 805,240 |
*Based on 2011 population projections from the US Census Bureau Population Division (http://www.census.gov/population/www/projections/downloadablefiles.html)
Table 2. NESARC-III Expected Number of Survey Respondents
| Sex/Race-Ethnicity by Age Category | 18+ Years | 18-24 Years | 25-29 Years | 30-39 Years | 40-49 Years | 50+ Years |
|---|---|---|---|---|---|---|
| Total | 46,500 | 6,220 | 4,284 | 8,151 | 8,531 | 19,314 |
| Male | 22,592 | 3,173 | 2,175 | 4,104 | 4,233 | 8,908 |
| Female | 23,908 | 3,048 | 2,109 | 4,047 | 4,298 | 10,406 |
| Hispanic or Latino | | | | | | |
| Total | 7,220 | 1,294 | 835 | 1,725 | 1,428 | 1,938 |
| Male | 3,669 | 668 | 431 | 921 | 742 | 906 |
| Female | 3,551 | 625 | 404 | 804 | 686 | 1,032 |
| Not Hispanic or Latino: Black | | | | | | |
| Total | 7,845 | 1,276 | 832 | 1,484 | 1,483 | 2,769 |
| Male | 3,657 | 643 | 418 | 714 | 695 | 1,187 |
| Female | 4,188 | 633 | 414 | 770 | 789 | 1,582 |
| Not Hispanic or Latino: Asians | | | | | | |
| Total | 2,210 | 257 | 203 | 512 | 457 | 780 |
| Male | 1,029 | 127 | 96 | 241 | 218 | 347 |
| Female | 1,181 | 130 | 108 | 271 | 239 | 433 |
| Not Hispanic or Latino: White | | | | | | |
| Total | 28,235 | 3,198 | 2,291 | 4,237 | 4,996 | 13,513 |
| Male | 13,756 | 1,635 | 1,168 | 2,133 | 2,497 | 6,323 |
| Female | 14,479 | 1,563 | 1,123 | 2,104 | 2,499 | 7,190 |
| Not Hispanic or Latino: All Other Races | | | | | | |
| Total | 990 | 195 | 123 | 192 | 167 | 313 |
| Male | 481 | 98 | 61 | 95 | 81 | 145 |
| Female | 509 | 97 | 61 | 98 | 85 | 168 |
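The proportional relationship between Tables 1 and 2 can be illustrated with a short calculation. The sketch below is an illustration only, not the procedure used to produce Table 2: it allocates the expected 7,220 Hispanic interviews across age categories in proportion to the Table 1 universe counts. The function name `allocate_proportionally` is hypothetical, and simple rounding is assumed.

```python
# Table 1 universe counts for the Hispanic or Latino group, by age category.
hispanic_universe = {
    "18-24": 6_033_402,
    "25-29": 3_893_999,
    "30-39": 8_045_377,
    "40-49": 6_658_641,
    "50+": 9_038_705,
}
hispanic_sample_total = 7_220  # expected Hispanic interviews from Table 2


def allocate_proportionally(universe, sample_total):
    """Split sample_total across cells in proportion to the universe counts.

    Hypothetical helper; simple rounding is an assumption of this sketch.
    """
    universe_total = sum(universe.values())
    return {
        cell: round(sample_total * count / universe_total)
        for cell, count in universe.items()
    }


hispanic_allocation = allocate_proportionally(hispanic_universe, hispanic_sample_total)
print(hispanic_allocation)
```

Under these assumptions the allocation reproduces the Hispanic "Total" row of Table 2 (1,294; 835; 1,725; 1,428; 1,938).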
The sample for NESARC-III will be selected using a four-stage, stratified probability sample design involving the selection of: (1) primary sampling units (PSUs) consisting of counties or groups of contiguous counties; (2) second-stage sampling units (referred to as segments); (3) dwelling units (DUs) or household-equivalents in group quarters; and (4) eligible persons within households occupying dwelling units. The frames to be used at each stage are described below.
For the initial stage of sampling, a PSU frame will be created using the Census 2010 county-level data files. In general, the PSUs will be formed as a single county or a group of contiguous counties, depending on the population size and the end-to-end distance within a PSU. The objective of the PSU formation process will be to minimize travel distance within a PSU (e.g., to ensure that the maximum distance is no more than 100 miles), subject to a specified minimum PSU population size of 15,000. Although some large metropolitan statistical areas (MSAs) will occasionally be split to form efficient areas for data collection, in general, metropolitan PSUs will be formed within MSA boundaries where feasible. Under these rules, we estimate that approximately 1,800 PSUs will be formed from the over 3,100 counties and county-equivalents in the United States. Since the Census long form is no longer administered under the decennial censuses but instead equivalent data are collected under the American Community Survey (ACS), income and other county-level characteristics that formerly were available from the long form will be obtained from the ACS for PSU stratification. The five-year ACS files (covering the years 2005-09), which will be available for all counties by the end of 2010, will be used for this purpose. Note that while the ACS data will be used for PSU stratification purposes, the 2010 Census population counts will be used to construct the PSU sampling measure of size. From the PSU frame, a stratified probability proportional to size (PPS) sample of 150 PSUs will be selected for NESARC-III.
The second-stage sampling units (referred to as segments) will be based on Census-defined blocks. The frame of segments will be created within the sampled PSUs using the 2010 Census Redistricting Data (P.L. 94-171) Summary File block data (which will be available by the end of March 2011 for all states). Within each PSU, the block-level records from the P.L. 94-171 summary file will be sorted by tract, block group, and block number before creating the segments. Blocks with no population in 2010 will be included in the segment formation process to ensure that areas containing dwelling units (DUs) constructed after the 2010 Census are given an appropriate chance of selection. A single block will be used as a segment if the number of dwelling units in the block exceeds 60. Neighboring blocks will be combined within a tract to reach either the required minimum of 60 dwelling units per segment or the end of the tract (segments will not cross tract boundaries). As discussed later in Section B.1d, the operational definition of the segment will differ depending on the frame used to select dwelling units at the subsequent stage of selection.
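The block-combination rule can be sketched as a single pass over the sorted block file. This is a minimal illustration under stated assumptions, not the production procedure: the threshold is treated as "at least 60 dwelling units", a short remainder at the end of a tract is left as its own segment, and the function name `form_segments` and the sample block records are hypothetical.

```python
def form_segments(blocks, min_dus=60):
    """Group consecutive blocks into segments without crossing tract lines.

    blocks: list of (tract_id, block_id, dwelling_units) tuples, assumed to be
    already sorted by tract, block group, and block number.
    """
    segments = []
    current, current_dus, current_tract = [], 0, None
    for tract, block, dus in blocks:
        # Close the open segment at a tract boundary, even if it is short.
        if current and tract != current_tract:
            segments.append(current)
            current, current_dus = [], 0
        current_tract = tract
        current.append(block)
        current_dus += dus
        # Close the segment once it reaches the minimum size.
        if current_dus >= min_dus:
            segments.append(current)
            current, current_dus = [], 0
    if current:
        segments.append(current)
    return segments


# Hypothetical block records; zero-population blocks stay in the frame.
example_blocks = [
    ("T1", "B1", 75), ("T1", "B2", 20), ("T1", "B3", 45), ("T1", "B4", 10),
    ("T2", "B5", 0), ("T2", "B6", 30), ("T2", "B7", 40),
]
example_segments = form_segments(example_blocks)
print(example_segments)
```

In this example block B1 forms a segment on its own, B2 and B3 are combined, B4 is a short segment closed by the tract boundary, and the empty block B5 is carried into a segment with B6 and B7.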
At the third stage of selection, a sample of addresses will be selected within sampled segments. The sample of addresses will be selected from a combination of two sources: (1) address-based sampling (ABS) frames derived from U.S. Postal Service (USPS) address lists; and (2) lists compiled by field data collectors. The USPS address lists, which are called Computerized Delivery Service Files (CDSFs), are derived from mailing addresses maintained and updated by the USPS. These files are available from commercial vendors. Recent studies suggest that the coverage of these lists is generally high for urban and large suburban areas, and sometimes reasonably high for parts of rural areas as well. Thus, the CDSF lists will serve as the main sampling frames in these cases. To handle any address noncoverage of the CDSF lists in these areas, the missed structure procedure will be used to develop in-person address listings as described below. There are, however, known under-coverage problems with CDSF lists in some rural areas or parts of rural areas (Montaquila, Hsu, Brick, English and O’Muircheartaigh, 2009; Dohrmann, Han, and Mohadjer, 2007; Iannacchione, Staab, Redden, 2003; O’Muircheartaigh, Eckman, Weiss, 2002). Even with the ongoing improvements in the CDSF lists in rural areas due to the conversion to urban-style street addresses to facilitate 911 emergency response services, as well as enhancements to CDSFs made by commercial vendors, there are likely to remain parts of some rural areas where the CDSF addresses are subject to substantial noncoverage of usable addresses because of the use of PO box addresses. The procedure described above for areas of high coverage will also be applied in cases where the coverage is moderate, but with a greater use of the missed structure procedure (e.g., it could be applied to a relatively larger fraction of segments than in areas where coverage is high).
However, in areas where the coverage of the CDSF appears to be particularly poor, it may be necessary to revert to standard area sampling listing procedures. A coverage quality evaluation will be done prior to making a determination of the extent of use of the missed structure procedure or whether to use listings compiled by field data collectors in specific areas. For example, such an evaluation might identify segments where the number of usable addresses in the CDSF frame falls far short of the corresponding 2010 Census household count. In such segments, a decision might be made to use listings compiled by field data collectors to develop the address frames.
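One way such a coverage evaluation might be expressed is as a ratio of usable CDSF addresses to the 2010 Census household count in each segment. The sketch below is illustrative only: the 0.8 cutoff, the function name `flag_low_coverage`, and the segment records are hypothetical, since the document does not specify a threshold.

```python
def flag_low_coverage(segments, min_ratio=0.8):
    """Return ids of segments whose CDSF coverage ratio is below min_ratio.

    segments: list of (segment_id, usable_cdsf_addresses, census_households).
    min_ratio is a hypothetical cutoff chosen for illustration.
    """
    flagged = []
    for seg_id, cdsf_addresses, census_households in segments:
        if census_households == 0:
            continue  # no Census benchmark to compare against
        if cdsf_addresses / census_households < min_ratio:
            flagged.append(seg_id)
    return flagged


# Hypothetical segment counts.
example_counts = [("S1", 58, 60), ("S2", 20, 65), ("S3", 0, 0), ("S4", 50, 60)]
low_coverage = flag_low_coverage(example_counts)
print(low_coverage)
```

Segments flagged in this way would be candidates for listings compiled by field data collectors rather than the CDSF frame.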
To take account of possible new construction and to improve coverage in general, two separate quality control procedures will be used: the “missed structure procedure” and the “hidden DU procedure.” The missed structure procedure will be implemented for a subset of the sampled segments to check for structures that are not part of any other structure within the segment boundary. Missed structures are those that should have been included on the original listing but were not, or were constructed after the lists were compiled. When a segment is selected for this procedure, the entire segment will be canvassed by the data collector and any newly identified structures deemed as missed or new construction will be added to the address sampling frame. Depending on the rate at which the procedure is applied, either all or a sample of the missed structures identified within a sampled segment will be selected for the study.
The second procedure, the “hidden DU procedure,” will be implemented in sampled dwelling units (DUs) at the end of the screener interview. Note that a “dwelling unit” is defined as “a group of rooms or a single room occupied as separate living quarters (or if vacant, intended for occupancy as separate living quarters); that is, the occupants do not live with any other person in the structure and there is direct access from the outside or through a common hall or area.” The term “household” includes all persons who occupy a dwelling unit. In this document, the terms “occupied dwelling unit” and “household” are used interchangeably for simplicity’s sake. The hidden DU procedure will attempt to identify DUs that are attached to the main DU where the screener interview is taking place, but that have no separate mailing address in the CDSF address list or that were not apparent to the canvasser during the initial listing for the area sample. These hidden DUs can either be part of a unit of a multi-unit building (apartment house), or can be additional or hidden DUs in a single family home (attic or basement apartment or other separate living quarters). Once identified, the hidden DU(s) will be entered into the data collector’s CAPI application, and screening and interviewing will take place within the newly identified unit or units. Note that if there are a large number of hidden DUs associated with any one sampled structure, subsampling of the newly identified hidden DUs will take place.
At the fourth stage of selection, the sampling frames will consist of lists of eligible persons 18 years of age and older that are obtained for households within each sampled dwelling address completing the screener.
As described earlier, the sample will be selected using a four-stage, stratified probability design which will result in a nationally representative sample of 46,500 respondents 18 years and older. It will provide for oversampling of Blacks, Hispanics, and Asians by oversampling geographic areas with high concentrations of these minority populations, and by giving minorities within sampled households greater probabilities of selection than nonminorities, for some household compositions. The sample will include both persons living in households and in group quarters.
At the first stage, a stratified sample of primary sampling units (PSUs) consisting of MSAs, parts of MSAs, or individual counties or groups of contiguous counties will be selected using probability proportional to size (PPS) sampling. The measure of size (MOS) will be defined to be a weighted sum of PSU-wide population counts by four major race-ethnic categories (Hispanic, Black, Asian, and all other groups), where the weights used to construct the MOS are proportional to the expected overall sampling rates to be applied to the given race-ethnic group. The data in the 2010 Census Redistricting Data (PL 94-171) Summary File will be used to obtain county-level population counts for the construction of a MOS for sampling PSUs. Such a composite MOS is designed to ensure approximately equal PSU sample sizes while maintaining self-weighted samples for the various race-ethnic groups (e.g., see Folsom, R., Potter, F., Williams, S., 1987, Notes on a composite size measure for self-weighting samples in multiple domains, Proceedings of the Section on Survey Research Methods, American Statistical Association). The PSUs with the largest MOS will be selected with certainty (with probability equal to one) using a certainty cutoff determined from probability proportional to size (PPS) sampling. We estimate that 50 PSUs will be selected with certainty. Each of the certainty PSUs will in effect be a single stratum. The remaining PSUs will be grouped together to form non-certainty strata. The strata for the non-certainty PSUs will be defined by census division, MSA status, percent minority population, and other characteristics to the extent feasible (e.g., income levels from the ACS). The main objective of the stratification process will be to control the variances of the estimates. The stratum population sizes will be kept as equal as possible so that roughly equal workloads are maintained in each stratum.
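The composite measure of size can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the project's actual computation: it approximates each group's overall sampling rate as expected interviews (Table 2) divided by universe count (Table 1), pools White and All Other Races into a single "Other" category, and uses hypothetical PSU population counts. The function name `composite_mos` is also hypothetical.

```python
# Expected interviews (Table 2) and universe counts (Table 1) for the four
# MOS categories; "Other" pools White and All Other Races (an assumption).
expected_interviews = {
    "Hispanic": 7_220,
    "Black": 7_845,
    "Asian": 2_210,
    "Other": 28_235 + 990,
}
universe = {
    "Hispanic": 33_670_124,
    "Black": 28_053_265,
    "Asian": 11_227_067,
    "Other": 159_920_230 + 4_739_744,
}

# Approximate overall sampling rate for each group.
sampling_rates = {g: expected_interviews[g] / universe[g] for g in universe}


def composite_mos(psu_counts, rates):
    """Weighted sum of a PSU's population counts, weighted by sampling rate."""
    return sum(rates[g] * psu_counts.get(g, 0) for g in rates)


# Two hypothetical PSUs with the same total population of 100,000.
psu_mixed = {"Hispanic": 40_000, "Black": 10_000, "Asian": 5_000, "Other": 45_000}
psu_other = {"Other": 100_000}
mos_mixed = composite_mos(psu_mixed, sampling_rates)
mos_other = composite_mos(psu_other, sampling_rates)
print(round(mos_mixed, 2), round(mos_other, 2))
```

Because the oversampled groups carry larger weights, the PSU with the larger minority population receives the larger MOS even though both PSUs have the same total population. Summed over the entire universe, the composite MOS equals the total expected sample size of 46,500.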
In addition to region, metropolitan status, and minority status, PSU-level variables to be considered for stratification purposes will include: per capita income, average household size, percentage of the population aged 18 to 29, percentage of the population living below 150 percent of the poverty guidelines, and possibly others. A total of 150 PSUs will be selected from the PSU frame, including the 50 largest PSUs, which will be selected with certainty, and an additional 100 non-certainty PSUs. All non-certainty PSUs will be assigned to one of 50 strata. Two PSUs will then be selected from each stratum with probabilities proportional to the MOS.
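The document does not spell out how the certainty cutoff will be computed. One common convention, assumed here purely for illustration, is to take as certainties all units whose MOS is at least the PPS sampling interval (total MOS divided by the number of units still to be selected), and to repeat the check on the remaining units until no new certainties appear. The function name `split_certainty` and the MOS values are hypothetical.

```python
def split_certainty(mos, n_sample):
    """Iteratively flag units whose MOS is at least the PPS sampling interval.

    mos: dict mapping unit id to measure of size.
    n_sample: total number of units to be selected.
    """
    certainty = set()
    while True:
        remaining = {u: m for u, m in mos.items() if u not in certainty}
        n_left = n_sample - len(certainty)
        if n_left <= 0 or not remaining:
            break
        interval = sum(remaining.values()) / n_left
        new = {u for u, m in remaining.items() if m >= interval}
        if not new:
            break
        certainty |= new
    return certainty


# Hypothetical MOS values for eight PSUs, with four PSUs to be selected.
example_mos = {"A": 500, "B": 200, "C": 100, "D": 90, "E": 60, "F": 50, "G": 40, "H": 40}
certainty_psus = split_certainty(example_mos, 4)
print(sorted(certainty_psus))
```

In this example PSU A exceeds the first interval (1,080 / 4 = 270), and PSU B then exceeds the recomputed interval (580 / 3, about 193), so both are taken with certainty; the remaining two selections would be drawn with PPS from the non-certainty strata.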
Within the selected PSUs, a sample of second-stage sampling units or segments consisting of blocks or groups of blocks will be drawn. To account for varying concentrations of the minority population and to allow for the oversampling of such groups, the segments will be stratified according to their prevalence with respect to the overall minority populations of interest (defined to include Blacks, Hispanics, and Asians). The segments where the prevalence is highest (e.g., segments in the two highest quintiles) will be deemed as “high density minority segments”. A sampling measure of size will be constructed for each segment (within sampled PSUs) that depends on the distribution of Blacks, Hispanics and Asians in the segment. Like the PSU-level MOS defined previously, the segment-level sampling measure of size will be defined as the weighted sum of the segment-level population counts by race-ethnicity, where the weights are proportional to the sampling rates to be used to select dwelling units for the various race-ethnic groups. Segments in strata with high concentrations of minorities will then be oversampled (given higher selection probabilities) relative to segments in the low minority strata. Ordering the frame of segments within each PSU by the proportion of the minority groups in the segment (as reflected in the 2010 Census files) will provide an implicit stratification by minority concentration for the second-stage sample. A systematic PPS sample of segments will then be selected from the sorted frame. Within each sampled non-certainty PSU exactly 48 segments will be selected, whereas 48 segments will be selected, on average, within certainty PSUs since these are essentially strata. This will result in a total of 7,200 segments for the overall national sample.
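Systematic PPS selection from a sorted frame can be sketched as follows. This is a minimal illustration, not the production sampling program: the frame is assumed to be already sorted by minority concentration, the MOS values are hypothetical, and the function name `systematic_pps` is an assumption. A unit whose MOS exceeds the sampling interval could be hit more than once; the example frame avoids that case.

```python
import random


def systematic_pps(frame, n, rng):
    """Select n units with probability proportional to size.

    frame: list of (unit_id, mos) tuples in the desired sort order.
    rng: a random.Random instance supplying the random start.
    """
    total = sum(mos for _, mos in frame)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    k = 0  # index of the next selection point
    for unit_id, mos in frame:
        cumulative += mos
        # A unit is hit by every selection point falling in its MOS range.
        while k < n and start + k * interval < cumulative:
            selected.append(unit_id)
            k += 1
    # Guard against floating-point rounding at the upper end of the frame.
    while k < n:
        selected.append(frame[-1][0])
        k += 1
    return selected


# Hypothetical frame of 200 segments within one PSU, already sorted.
segment_frame = [(f"seg{i:03d}", 10 + i % 7) for i in range(200)]
segment_sample = systematic_pps(segment_frame, 48, random.Random(2011))
print(len(segment_sample))
```

Because the selection points are evenly spaced along the sorted frame, the sample is spread across the full range of minority concentration, which is the implicit stratification described above.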
Finally, it should be noted that the operational definition of the segment will depend on whether the associated addresses are obtained from traditional area listing or from the CDSF frames. In the former case, the segment is defined by the actual geographic boundaries of the segment used by the 2010 census. In the latter case, the segment is defined by the collection of addresses that are assigned to the segment by available geocoding software. Although some addresses may be “non-geocodable” (e.g., due to inconsistent spelling of street names or errors in street numbers), it is often possible to assign such addresses to Census blocks using ZIP code information. Nevertheless, even where addresses are geocodable, since the process of geocoding down to the block/segment level can be imperfect, some addresses associated with a given segment may actually lie outside the segment boundaries, and conversely, some addresses outside of the segment boundaries may be incorrectly associated with the segment. Whether or not the addresses associated with the segment are physically located within Census-defined boundaries, for NESARC-III, they will collectively define the segment for sampling purposes. With this definition of the segment for the address-based samples, every geocodable address in the CDSF sampling frame has a known probability of selection for the survey sample.
At the third stage of sampling, current lists of addresses will be developed for the selected segments using address lists compiled from the CDSF lists or from traditional household listing procedures. The CDSF address lists will be used in what are primarily urban areas, and traditional listing will be employed in rural areas to the extent necessary. From these lists, samples of addresses will be drawn for screening using systematic sampling within the segments. As a result of the PPS selection of PSUs and segments, roughly equal numbers of addresses will be selected from each segment, thus maintaining a constant workload across segments. Once selected, a screener interview will be administered within the household associated with the sampled address to determine the race-ethnicity, sex and age of each household member, active duty status of military personnel, among other things. Under the proposed design that includes oversampling Hispanics, Blacks and Asians, we estimate that about 2,200 of the sampled White households in the high minority strata will be screened out of the study (i.e., de-selected from oversampled segments) in order to maintain the prescribed overall sample size of 46,500. Note that the de-selection of White households will be done in such a way as to equalize the weights of the retained White households to the extent possible. The reduction in sample size for Whites will not appreciably affect the analytic potential of the survey data since the standard errors based on the White subgroup only will be small regardless, given the large sample size for this group.
The fourth and final stage of selection will use the roster of age-eligible household members (aged 18 and older) for each dwelling unit associated with a selected address (or household-equivalent in the case of group quarters) to select up to two persons within each household. In households with three or fewer age-eligible members, one person will be selected at random. In households with four or more eligible persons, two persons will be sampled. Under the proposed sample design that includes oversampling of minority groups, an estimated 4.9 percent of the sampled households will include two sampled persons. Furthermore, in households with mixed racial composition (roughly 4.4 percent of US households according to 2007 American Community Survey data), minority members of the household will be given higher probabilities of selection relative to the nonminority members in some instances. The specific within household probabilities of selection depend on the particular household size and composition.
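The household-size rule for the number of sampled persons can be sketched as below. This simplified illustration selects persons with equal probability; it does not implement the higher selection probabilities for minority members of mixed households, because the document does not give those probabilities. The function name `select_persons` is hypothetical.

```python
import random


def select_persons(roster, rng):
    """Select one person from rosters of 3 or fewer, two from 4 or more.

    roster: list of ids of age-eligible (18+) household members.
    Equal-probability selection is an assumption of this sketch.
    """
    if not roster:
        return []
    n_select = 1 if len(roster) <= 3 else 2
    return rng.sample(roster, n_select)


rng = random.Random(7)
small_household = select_persons(["p1", "p2", "p3"], rng)
large_household = select_persons(["p1", "p2", "p3", "p4"], rng)
print(small_household, large_household)
```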
The types of noninstitutionalized group quarters that are of interest to NESARC-III include college dormitories, agricultural, vocational training, and other dormitories, half-way houses, hostels, YMCAs, shelters, campgrounds and carnivals. The noninstitutionalized group quarters population represents only about 1.5 percent of the total noninstitutionalized population 18 years of age or older. Thus, in a sample of 46,500 respondents, roughly 700 will be residing in group quarters housing. Students in college quarters comprise the largest component of the noninstitutional group quarters population, representing about 60 percent of the group. We propose to sample college students living in dormitories and fraternity and sorority houses through their permanent residence. If a student who lives in a dormitory for much of the year is identified as the sample person and is at home (their permanent residence) when the screening occurs, an attempt to administer the interview will be made before the student returns to the dormitory. Otherwise a time will be found during the field period when the student will be at home and an interview will be scheduled for that time. If this is not possible, the student will be contacted and interviewed at the dormitory if the dormitory is within easy reach of any sampled PSU (it need not be the PSU of the family residence). Identifying students in dormitories via their family residence is a simpler process than constructing a separate dormitory sampling frame from which to select students. It avoids the costs and complications of contacting and gaining permission from college and university officials, of obtaining and sampling from lists of dormitories, and of listing and sampling within selected dormitories.
Where traditional area sampling procedures are used, enumerators will be instructed to list all units in a segment, and will be given special instructions to include the appropriate group quarters and to obtain a rough count of the number of residents of the group quarters for sampling purposes.
Where address lists based on the CDSF are used, reliance on the lists alone could potentially give rise to a coverage problem. Although the CDSF address lists theoretically cover group quarters, many types of group quarters—such as some shelters and half-way houses—are classified as business establishments in the CDSF lists and hence would not be included on the residential file used as the sampling frame. However, the missed structure procedure described earlier will serve to identify missed group quarters (including group quarters that are classified as business establishments in the CDSF) in the same way as it will identify any new or missed housing units. Thus the use of the missed structure procedure will mitigate undercoverage of certain types of group quarters in the CDSF lists.
As noted earlier, once a dwelling unit is selected, a household screener will be administered in the field to determine the race-ethnicity and age of each household member. It is assumed that the response rate for this procedure will be 90 percent. In terms of screening, it is expected that the eligibility rate for households will be close to 100 percent since there are only a negligible number of households in the United States that do not contain any persons who are 18 years or older. Within high density minority segments where there is oversampling of Blacks, Hispanics and Asians, roughly 4.3 percent of the sampled White households (about 2,200 households) will be de-selected or “screened out” in order to maintain an overall sample size of 46,500. Depending on whether the household has fewer than four or four-or-more eligible persons, either one or two individuals within the sampled households will be selected and administered the AUDADIS interview. Although Wave 2 NESARC achieved a person level response rate of 93 percent, NESARC-III is expecting a somewhat more conservative response rate of 90 percent at the person level based on the overall decline in response rates observed in recent years. Thus, the overall response rate expected for NESARC-III is 81 percent (i.e., the product of the expected screener response rate and the expected person level response rate). Table 3 summarizes the overall sampling rate and expected response rate assumptions for NESARC-III.
Table 3. NESARC-III Sampling Rate and Expected Response Rate Assumptions
| | Sampling unit | Assumed | Expected |
|---|---|---|---|
| 1 | Primary sampling unit (PSU) | ––– | 150 |
| 2 | Area segments/CDSF segments | 48 per PSU | 7,200 |
| 3 | Dwelling units (DUs) | 9.0 per segment | 65,000 |
| 4 | Occupied dwelling units | 88.0% | 57,200 |
| 5 | Households completing screener enumeration | 90.0% | 51,480 |
| 6 | Households screened out due to race-ethnicity | 4.3% | 2,214 |
| 7 | Eligible households with persons 18+ | 100.0%* | 49,266 |
| | Households with <4 adults | 95.1% | 46,852 |
| | Households with 4+ adults | 4.9% | 2,414 |
| 8 | Number of sample persons in eligible households | | 51,680** |
| | In households with <4 adults | 1 per HH | 46,852 |
| | In households with 4+ adults | 2 per HH | 4,828 |
| 9 | Persons completing AUDADIS | | 46,512 |
| | In households with <4 adults | 90.0% | 42,167 |
| | In households with 4+ adults | 90.0% | 4,345 |
* A very small number of screened households may contain only persons under 18.
** Assumes 1 sample person in households with 3 or fewer persons, and 2 sample persons in households with 4 or more persons.
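The arithmetic behind the Table 3 assumptions can be traced step by step. The sketch below starts from the rounded figure of 65,000 sampled dwelling units and applies the assumed rates in sequence; the variable names are hypothetical, and rounding at each step is an assumption made so that the results can be compared with the table.

```python
dwelling_units = 65_000                       # row 3 (about 9 per segment x 7,200)
occupied = round(dwelling_units * 0.88)       # row 4: 88.0% occupied
screened = round(occupied * 0.90)             # row 5: 90.0% screener response
screened_out = round(screened * 0.043)        # row 6: 4.3% screened out
eligible = screened - screened_out            # row 7: eligible households

hh_small = round(eligible * 0.951)            # households with <4 adults
hh_large = eligible - hh_small                # households with 4+ adults

sample_persons = hh_small * 1 + hh_large * 2  # row 8: 1 or 2 persons per HH

completes_small = round(hh_small * 0.90)      # row 9: 90.0% person response
completes_large = round(hh_large * 2 * 0.90)
completes = completes_small + completes_large

# Overall response rate: screener rate times person-level rate.
overall_response_rate = round(0.90 * 0.90, 2)

print(occupied, screened, screened_out, eligible)
print(hh_small, hh_large, sample_persons, completes, overall_response_rate)
```

Under these assumptions the calculation reproduces the table: 49,266 eligible households, 51,680 sample persons, and 46,512 completed interviews, slightly above the 46,500 target, with an overall expected response rate of 0.81.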
The NESARC-III involves three main components: an automated CAPI screening instrument, an automated CAPI AUDADIS-V instrument, and collection of a saliva sample. In addition to the main data collection effort, two separate concurrent studies are also proposed: a Reliability Study and a Validity Study. The information presented sequentially below applies to the various components, while more detailed information on the Reliability and Validity methodological components is discussed in sections B.2e and B.2f. The flow diagram of the entire data collection activity, including all verbatim statements to the respondent, and consent documents and procedures appears in Attachment 8.
The primary objective of the data collector working on the NESARC-III study is to obtain complete and accurate information from sampled persons in each eligible household in their assignment. This requires that the data collector have a thorough understanding of the survey’s protocol, as well as an understanding of the techniques required to gain the respondent’s cooperation and maintain rapport throughout the interaction. All data collectors working on NESARC-III will receive extensive in-person training on the exact procedures to be followed in the administration of the data collection instruments themselves, as well as techniques to gain cooperation, such as understanding the importance of the study, answering respondent questions, and addressing respondent concerns.
The training provided to data collectors will be in two forms: home study and in-person. The 8-hour home study program will be designed to introduce trainees to the NESARC-III study, with a focus on the respondent contact materials. The home study will also provide data collectors with practice in gaining cooperation and establishing rapport. In-person training techniques are designed to maximize trainee involvement, maintain the interest of the trainees, and produce well trained data collectors who have the necessary skills for gaining respondent cooperation, correctly answering questions about the study, and adeptly completing all components of the interview. Training materials will be developed by experienced NESARC-III team members. In the 4-day in-person session, data collectors will be trained on techniques for obtaining consent, conducting the CAPI screener and AUDADIS-V interviews, collecting saliva samples, and issuing respondent incentives, in addition to administrative procedures such as data transmission and reporting to the supervisor.
In addition to the in-person training, data collectors will be provided with a data collector manual, providing detailed reference materials on locating sampled addresses, determining household membership, the interviewing process, questionnaire content and saliva collection procedures.
During the data collection period, numerous quality control procedures will be utilized to ensure that data collectors are following the specified procedures and protocols and that the data collected are of the utmost quality. Data collectors who successfully complete training but show any area of potential weakness will be observed in person at least once by a supervisor or home office staff member. Observing data collectors conducting their job in the field is a very effective way of monitoring their skills to conduct the interview, as well as their adherence to survey procedures. It also provides the observer with an appreciation of the data collectors’ tasks and provides the opportunity to experience first-hand the administration of the NESARC-III instruments and saliva collection procedures. Observations will be concentrated in the early weeks of data collection so that problems are detected as early as possible and corrective feedback can be provided to the data collectors.
Brief quality control interviews will be conducted to verify that an interview was administered or attempted as reported by the data collector. Quality control procedures will be implemented to verify at least ten percent of each data collector’s finalized work to ensure that the interview was conducted according to study procedures. This includes cases finalized as complete, as well as those with non-complete dispositions, such as vacant or refusal. Quality control will begin early in the data collection period to allow for any identified problems to be addressed immediately. Quality control interviews will be conducted by separate trained data collection staff over the telephone, whenever possible. However, if unable to complete a quality control interview via telephone (e.g., the dwelling unit is vacant), the interview will be assigned to an experienced, specially trained data collector who will conduct the quality control procedure in person.
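The ten percent verification rule described above can be illustrated with a short sketch. It assumes simple random selection within each data collector's finalized cases; the function name, case identifiers, and the use of random selection are illustrative assumptions and are not taken from the study's quality control system.

```python
import math
import random

def select_qc_cases(finalized_by_collector, rate=0.10, seed=0):
    """Pick at least `rate` of each data collector's finalized cases.

    Finalized cases include completes and non-complete dispositions
    (e.g., vacant, refusal). Random selection is an assumption here.
    """
    rng = random.Random(seed)
    selected = {}
    for collector, cases in finalized_by_collector.items():
        # Round up so the verified share never falls below the target rate.
        n = math.ceil(rate * len(cases)) if cases else 0
        selected[collector] = sorted(rng.sample(cases, n))
    return selected

# Hypothetical assignments for two data collectors.
work = {
    "DC01": [f"A{i:03d}" for i in range(25)],
    "DC02": [f"B{i:03d}" for i in range(7)],
}
qc = select_qc_cases(work)
```

Rounding up guarantees that even a data collector with only a few finalized cases has at least one case verified.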
Additionally, throughout the field period, supervisors will remain in close contact with the data collectors. Scheduled weekly telephone conferences will be held in which all non-finalized cases in the data collector’s assignment will be reviewed to determine the best approach for working the case and the need for additional resources.
Management staff at all levels will have access to a supervisor management system, including automated management and production reports that will be used to monitor the data collection effort and ensure that the data collection and quality control goals are being attained. Data collectors will be required to transmit data on a daily basis. Data will be transmitted to a secure server at Westat’s Rockville offices, which will then be used to update the automated management reports. These data are also used to produce weekly reports that might provide evidence of suspicious data collector behavior, such as overall interview administration length, individual instrument administration time, amount of time between interviews, interviews conducted very early in the morning or late in the evening, and number of interviews conducted per day.
The random selection of one respondent (or two, for less than 5% of the sample) per eligible household (as described in Section B.1) is conducted through the use of an automated screening instrument (see Attachment 21). The screener uses a full household enumeration process to collect the following information for each reported household member: first name, gender, age or age range (if exact age is not known or refused), active military service status and race-ethnicity. If a household member is selected to participate, as will usually be the case, the relationship of all household members to each sample person (SP) is collected. The respondent who answers the screener will receive $10 in cash (the screener incentive). In addition to household enumeration information, household and SP telephone numbers are collected to allow the recontact of the household for quality control purposes, or to set appointments for the AUDADIS-V if the sample person is not available at the time of the screening. Finally, if the mailing address differs from the street address, the household mailing address is collected. The mailing address allows written follow-up with nonresponse cases, as discussed in Section B.3 below.
The proposed sampling algorithm for selecting one respondent (or two) per household has been programmed within the Blaise screener software. To check that the screener is working properly, it has been tested using 100,000 households from 2007 American Community Survey (ACS) Micro-data files. Although NESARC-III will include a much smaller sample of households (around 46,500), a larger ACS sample for the test was used to smooth out the variability in selected person sample sizes due to random selection within households. In these 100,000 households, there were 188,631 total eligible household members aged 18+. From these, 104,069 eligible household members were sampled according to the algorithm, selecting one or two persons depending on the number of eligible members in the household, and giving higher probabilities of selection to minority members within mixed households. Tables by sex, age, and minority status were produced and checked to see if the weighted counts of sampled persons (weighted by the inverse of the within-household probability of selection) matched the corresponding frame counts. As expected, there was very close alignment. Additionally, in some tests involving 2,000 households from the ACS, the possibility of not getting demographic data for some household members in the screener was simulated by randomly blanking out cases and coding them as refusals, to see if the algorithm would properly take account of these cases. Overall, the algorithm appears to be working as expected.
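The logic of the check described above (weighted counts of sampled persons should reproduce the frame counts) can be sketched as follows. This is a simplified stand-in, not the Blaise screener algorithm: it uses synthetic households rather than ACS micro-data, selects persons with equal probability within the household, and omits the higher selection probabilities given to minority members in mixed households.

```python
import random

def select_persons(members, rng):
    """Equal-probability stand-in for the within-household selection step.

    Selects one person in households with three or fewer eligible members
    and two persons otherwise. Returns (person, weight) pairs, where the
    weight is the inverse of the within-household selection probability.
    """
    n_take = 1 if len(members) <= 3 else 2
    weight = len(members) / n_take
    return [(person, weight) for person in rng.sample(members, n_take)]

rng = random.Random(2011)
frame = {"F": 0, "M": 0}          # counts of all eligible members
weighted = {"F": 0.0, "M": 0.0}   # weighted counts of sampled members

for _ in range(20000):
    # Synthetic household: 1 to 6 eligible adults with a random sex code.
    household = [rng.choice("FM") for _ in range(rng.randint(1, 6))]
    for sex in household:
        frame[sex] += 1
    for sex, w in select_persons(household, rng):
        weighted[sex] += w

for sex in frame:
    print(sex, frame[sex], round(weighted[sex]))
```

In this sketch the weighted total equals the frame total exactly, because the weights within each household sum to the household size; the counts by sex agree only approximately, which mirrors the "very close alignment" reported for the actual test.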
Immediately following the administration of the screener, if the selected sample person is available and has an adequate amount of time to complete the interview, the data collector obtains informed consent (See Attachment 8) and will then attempt to complete the automated CAPI AUDADIS-V instrument. If a sample person is not available or there is insufficient time to complete the interview, the data collector will attempt to schedule an appointment for a return visit or, at a minimum, determine the best time for a return visit.
After obtaining consent, the data collector provides the first half of the incentive ($45) to the respondent (discussed in more detail in Section A.9). The data collector then launches the automated CAPI AUDADIS-V and administers the appropriate items to the respondent. As required throughout the interview, the data collector will use the flashcard booklet to aid the sample person in providing a response. Following the completion of the AUDADIS-V, the second half of the incentive ($45) is provided to the sample person. The data collector will then read the consent document for possible reinterview in the Reliability or Validity Study (See Attachment 8). If the sample person agrees to participate in the reinterview, information such as best time for recontact and sample person contact information are requested. (See Attachment 21 for recontact script.)
Respondents will receive a thank you letter at the completion of the NESARC-III (Attachment 22). A refusal letter will also be sent to respondents who are difficult to contact (Attachment 22).
Any SP who consents to provide a saliva sample will be requested to provide the sample during the home visit. Saliva specimens will be collected following the consent for reinterview using Oragene Model OG-500.005. Data collectors will provide sample persons with a saliva sample collection kit and an instruction sheet, and will review the instructions with the sample person and provide guidance for the self-collection. A brief instructional video on how to use the kit will be loaded on the interviewer’s laptop and shown to each SP (see Attachment 8 for DNA collection procedures and associated script and screenshots). Data collectors will be trained to address questions the sample person may have concerning the collection procedure. Bar coding will be used to identify and track collected specimens to ensure privacy of the SP. Data collectors will package and ship collected specimens to Westat and will document all activities in the Field Management System (FMS). Staff will receive thorough training in collecting, packaging, and shipping samples. Detailed procedures will be provided in the data collector manual.
Immediately after completion of the saliva sample collection, the data collector will document the collection in the automated FMS per established study procedures. Briefly, this includes inputting the barcode on the collection container to assign the specimen to the sample person, recording the date and time of the collection, and preparing the specimen for transport to Westat.
After the visit, the data collector will ship the specimen to Westat on the day it was collected or as soon as possible, per established study procedures and in accordance with 49 CFR 173.134 and IATA regulations 3.6.2.2.3.6 (a-c).
Saliva specimens will be tracked from the point of collection and shipment in the FMS, through to the points of receipt, storage, requisition and processing using Westat’s in-house specimen tracking system (Biological and Environmental Sample Tracking System, or BEST). Specimen collection and shipping data will be uploaded locally to the home office management system on a daily basis and will be electronically integrated into BEST. No subject identifying information will be located in BEST.
Westat will receipt each shipment and individual specimen in BEST daily. Any shipping and receipt problems observed will be noted. Barcoded specimens will be stored at Westat until they are batched for shipment to the laboratory for DNA extraction. Once specimens are shipped to the repository, the shipment and specimens can either be receipted directly into BEST (which is web-accessible) or can be receipted into the laboratory’s own system and a data file uploaded to the NESARC-III secure web portal so that Westat can update BEST with the receipt information. The repository that will perform DNA extraction from the saliva samples will be supported by a separate NIAAA contract.
Westat will review QC reports generated from BEST on a routine basis to determine how many specimens have been received, how many are expected, collection problems (such as low volumes of saliva), and shipping/receipt problems (such as specimens arriving damaged). Westat will use the BEST and FMS systems to determine whether collection and shipping problems can be attributed to specific data collectors, kit production dates, or other systematic factor(s).
The repository will provide Westat with DNA extraction data on a flow basis. This will include parameters associated with DNA extraction such as extraction date and method, DNA mass (and mass unit), concentration (and concentration unit), volume (and volume unit), solvent (e.g., TE, water), starting volume of the specimen from which DNA was extracted, A260/280 and other measures of DNA quality that the DNA extraction laboratory will generate (such as results from Identifiler assays (Applied Biosystems)), daughter vials of DNA created (including sample IDs, original saliva sample ID, DNA mass, DNA concentration, DNA volume), and storage container ID (location in the storage box). Westat will work with the laboratory and NIAAA to verify and, if necessary, correct the data.
Using the unique barcode IDs that have been recorded in the FMS and used by the repository and in BEST, Westat will merge all of the relevant laboratory data with the other NESARC-III study data and incorporate it within the analytic file deliveries.
Once the DNA is extracted, Westat will perform quality control checks to ensure that the data associated with the resulting DNA are of high quality. Examples include checks for duplicate or incorrect specimen IDs, checks for missing or incomplete data, and checks to ensure DNA yields are acceptable.
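The kinds of checks listed above could be expressed along the following lines. The record layout, field names, and the minimum-yield threshold are hypothetical and chosen only for illustration; they do not describe the BEST system or the laboratory's data file.

```python
from collections import Counter

def qc_dna_records(records, known_ids, min_yield_ug=1.0):
    """Flag duplicate or unknown specimen IDs, missing data, and low yields.

    `known_ids` stands for the barcode IDs recorded at collection;
    `min_yield_ug` is an assumed threshold, not a study parameter.
    """
    counts = Counter(r["specimen_id"] for r in records)
    problems = []
    for r in records:
        sid = r["specimen_id"]
        if counts[sid] > 1:
            problems.append((sid, "duplicate ID"))
        if sid not in known_ids:
            problems.append((sid, "ID not in tracking system"))
        if r.get("dna_mass_ug") is None:
            problems.append((sid, "missing DNA mass"))
        elif r["dna_mass_ug"] < min_yield_ug:
            problems.append((sid, "low DNA yield"))
    return problems

# Hypothetical extraction records.
records = [
    {"specimen_id": "S001", "dna_mass_ug": 25.0},
    {"specimen_id": "S002", "dna_mass_ug": 0.4},
    {"specimen_id": "S003", "dna_mass_ug": None},
    {"specimen_id": "S003", "dna_mass_ug": 18.0},
    {"specimen_id": "S999", "dna_mass_ug": 30.0},
]
problems = qc_dna_records(records, known_ids={"S001", "S002", "S003"})
```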
In the Reliability Study or first methodological component of the NESARC-III, 1000 NESARC-III respondents will be re-interviewed with one-half of the AUDADIS-V interview. The purpose of the reliability component is to collect additional information on the replicability or repeatability of the major NESARC-III outcome variables.
The basic design features of the Reliability Study include:
A systematic sample of 1000 respondents participating in the NESARC-III proper will be selected to participate in the retest interview only if consent for this reinterview has been obtained and recorded by the interviewer after the payment of the second AUDADIS-V incentive during the NESARC-III survey proper. This consent documentation remains a permanent part of the interview record and is not associated with any personally identifying information (see Attachment 8).
Each re-interview will consist of one-half of the AUDADIS-V administered in the NESARC-III proper.
All interviews will be conducted with a CAPI instrument.
Prior to administering the retest interview, the interviewer will read an introductory statement to the respondent explaining the purpose of the interview (see Attachment 8).
Interviewers administering the Reliability Study will differ from those administering the AUDADIS-V proper and accordingly will be blind to the respondent’s previous responses.
Reliability Study interviews are to be completed in person, with incentive payments of $100.00 provided at the end of the interview. Interviewers will certify that the incentives were received by responding to a computerized prompt.
All re-interviews are to be conducted within 6 weeks of the original interview.
Data from the Reliability Study will be processed, stripped of all personally identifiable information, and delivered to NIAAA staff, who will prepare recode information for the contractor to incorporate into a file that links major outcome variables from the NESARC-III proper and reliability administrations of the instrument.
Kappa statistics will be used to assess the concordance of interview and re-interview responses to major NESARC-III outcome variables.
The reliability methodological component will follow the same interviewing protocols and monitoring procedures as the NESARC-III proper and be handled with the same security measures and confidentiality strictures of data collected in the main survey. This methodological component will be included also as part of the pilot test.
Informed consent will be obtained for the Reliability Study after the second AUDADIS-V incentive payment is provided to respondents in the NESARC-III survey proper. The consent documents and procedures appear in Attachment 8.
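The kappa statistic named in the design features above can be illustrated for a single categorical outcome. The sketch below computes Cohen's (unweighted) kappa; the choice of the unweighted form and the example responses are assumptions made for illustration only.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Cohen's kappa for paired categorical responses.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) when both interviews give a single,
    identical category for every respondent.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    m1, m2 = Counter(first), Counter(second)
    expected = sum(m1[c] * m2[c] for c in m1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no (1/0) diagnoses at interview and re-interview.
interview = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
reinterview = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
kappa = cohens_kappa(interview, reinterview)
```

In this example the two interviews agree for 8 of 10 respondents, chance agreement is 0.52, and kappa is therefore 0.28/0.48, or about 0.58.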
The second methodological component of the NESARC-III entails a Validation Study in which the concordance between AUDADIS-V major outcome measures and those derived from a clinician-administered interview, the PRISM, is assessed. Similar to the AUDADIS-V, the PRISM has undergone a continuous program of psychometric assessment, with this validation study representing its most current implementation. The PRISM, developed by Dr. Deborah Hasin, New York State Psychiatric Institute (NYSPI), is a semi-structured instrument for assessing the same alcohol use disorders and their associated disabilities as measured in the AUDADIS-V. All training, hiring, instrument development and computerization, data cleaning and algorithm construction will be implemented by Dr. Hasin and her staff through a subcontract with Westat. Westat will be responsible for general task coordination and oversight, sample management, monitoring sample production, administering the $100.00 incentive, data handling and delivery, reporting and Telephone Research Center (TRC) infrastructure.
The basic design features of the Validation Study are:
A sample of 700 respondents to the initial AUDADIS-V in the NESARC-III survey proper will be systematically selected to be re-interviewed with the CATI version of the PRISM only if consent has been obtained and recorded by the data collector after payment of the second AUDADIS-V incentive during the NESARC-III survey proper. This consent documentation remains a permanent part of the interview record and is not associated with any personally identifying information (see Attachment 8).
Respondents completing the PRISM will receive a $100.00 incentive provided at the end of the interview.
PRISM interviews will be conducted over the telephone, using the technical capabilities of Westat’s TRC to host the Validation Study virtually (see Attachment 23 for a description of Westat’s IT security procedures and best practices).
All PRISM interviews will be conducted within 6 weeks of the initial AUDADIS-V interview.
PRISM interviewers will be blind to the respondent’s previous responses to the AUDADIS-V.
PRISM interviews will be recorded for quality assurance purposes once the respondent has consented to the recording. The procedures used to obtain consent from the respondent by the telephone interviewer appear in Attachment 8.
Immediately after determining the respondent’s decision about recording the interview, the interviewer will read the same statement read to the respondent in the Reliability Study that explains the purpose of the study (see Attachment 8).
Data from the Validation Study which are stored at Westat will be stripped of all identifiers and placed in a secure Westat network location and made available for data cleaning and recoding.
Corresponding data from the PRISM and AUDADIS-V will be combined by Westat to assess the concordance between the two instruments on major AUDADIS-V outcome variables, using kappa or intra-class correlation coefficients. The greater the concordance, the greater the validity/utility of the AUDADIS-V survey outcome measures.
Planning and coordination of the validation component will be fully integrated with planning and conducting the NESARC-III, including its incorporation into the pilot testing. The validation study will also adhere to all confidentiality and security policies and practices of the NESARC-III proper outlined in this OMB submission.
Informed consent will be obtained for the Validity Study after the AUDADIS-V interview and the full incentive has been provided to the SP. The consent document and procedures appear in Attachment 8.
The cross-sectional estimates from NESARC-III will be functions of weighted responses for each sampled person, where the weights will consist of eight components: a DU base weight, six multiplicative adjustment factors (an adjustment factor for missed structures, an adjustment factor for hidden DUs, a White household deselection adjustment factor, a household screener non-response adjustment factor, a within household selected person weight, and a selected person non-response adjustment factor) and a final post-stratification ratio adjustment. The details are as follows:
The DU base weight
The dwelling unit base weight is the inverse of the overall probability of selection of a sample housing unit or housing unit equivalent for group quarters. The probability of selection is the product of the conditional probabilities of selection at each stage of sample selection, where the stages are as follows: (a) in “non-certainty strata”, the selection of two non-certainty primary sampling units (PSUs) per stratum using PPS without replacement sampling, and in “certainty strata” the selection of all certainty PSUs with probability equal to one; (b) the selection of 48 segments per PSU using systematic PPS without replacement sampling; and (c) the selection of approximately 9-10 addresses associated with dwelling units per segment using systematic sampling.
For (b), the segment selection probabilities will reflect the fact that segments in strata with high concentrations of minorities will be oversampled at up to two times the rate of segments in the low density minority strata.
The adjustment factor for missed structures
As mentioned earlier, a missed structure procedure is implemented in a subset of sampled segments to account for missed DUs that should have been included in the original listing but were not in the case of the area sample, or are found not to be part of the CDSF list upon initial canvassing in the case of the CDSF sample. In most cases where missed structures are found, they will simply be added to the sample and no adjustment factor to the weights is required (an implied adjustment factor of one is used). However, in rare cases where there is a large number of missed structures found within a segment, subsampling will occur and an adjustment factor equal to the number of missed structures identified divided by the number subsampled will be applied to those selected for the subsampling. This adjustment is the first of the multiplicative factors that will be applied to the DU base weight for all completed screeners.
The adjustment factor for hidden DUs
As mentioned earlier, a hidden DU procedure is implemented in both the area sample and the CDSF sample for sampled DUs within sampled segments, in order to detect potential multiple households within a sampled DU. In most cases, a sampled DU will consist of one household. But in cases where multiple households are found, in general, NESARC-III will conduct the AUDADIS interview in all households associated with the sampled DU. In this case, no adjustment factor to the weights is required (an implied adjustment factor of one is used). However, in some rare cases where there is a large number of households associated with the sampled DU, a subsample of these households will be interviewed and an adjustment factor equal to the total number of households associated with the sampled DU divided by the number of households subsampled for interviewing will be applied to the subsampled households. This adjustment is the second of the multiplicative factors that will be applied to the DU base weight for all completed screeners.
The White household deselection adjustment factor
Because the survey design oversamples Hispanics, Blacks and Asians by sampling high density minority strata at a rate of up to twice the rate of low density minority strata, about 2,200 of the sampled White households in the high minority strata will be de-selected from oversampled segments in order to maintain the prescribed overall sample size of 46,500. Specifically, in the high density segments, up to half the DUs will be flagged for potential deselection, and if during the screening process, the households associated with flagged DUs are found to consist entirely of white members, the entire household will be deselected from the survey. In order to equalize the weights of the retained white households in the segment to the extent possible, an adjustment factor equal to the number of households sampled in the segment divided by the number of households flagged for deselection in the segment will be applied to the non-flagged and retained white households. This adjustment is the third of the multiplicative factors that will be applied to the DU base weight for completed screeners, and affects only the white households in the high density (oversampled) minority strata.
The household screener nonresponse adjustment factor
The household screener nonresponse adjustment will be calculated to account for inability to obtain a completed household roster resulting from a refusal, failure to identify a knowledgeable screener respondent, or inability to locate the housing unit. Adjustment cells will be defined by grouping segments together by MSA status, region, minority status, and other characteristics. For each cell, the ratio of the weighted number of eligible sample households to the weighted number of completed screeners will be computed, where the weight used will be the DU base weight multiplied by the first three multiplicative adjustment factors. This factor will be used to inflate the weights of the screener respondents in the cell to account for the screener nonrespondents. This adjustment is the fourth of the multiplicative factors that will be applied to the DU base weight for all completed screeners.
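The cell-level ratio described above can be sketched as follows. The cell labels, weights, and field names are hypothetical, and the sketch does not address what is done when a cell has no completed screeners (in practice such cells would presumably be combined with others).

```python
from collections import defaultdict

def screener_nr_factors(households):
    """Per-cell ratio: weighted eligible households / weighted completed screeners.

    Each household dict carries its adjustment cell, its weight (DU base
    weight times the first three adjustment factors), and a completion flag.
    """
    eligible = defaultdict(float)
    complete = defaultdict(float)
    for hh in households:
        eligible[hh["cell"]] += hh["weight"]
        if hh["completed"]:
            complete[hh["cell"]] += hh["weight"]
    # Cells without any completed screener are left out of the result.
    return {cell: eligible[cell] / complete[cell] for cell in complete}

# Hypothetical eligible sample households in two adjustment cells.
households = [
    {"cell": "A", "weight": 100.0, "completed": True},
    {"cell": "A", "weight": 100.0, "completed": False},
    {"cell": "A", "weight": 200.0, "completed": True},
    {"cell": "B", "weight": 50.0, "completed": True},
    {"cell": "B", "weight": 50.0, "completed": True},
]
factors = screener_nr_factors(households)
```

After the factor is applied, the weights of the completed screeners in each cell sum to the weighted total of all eligible households in that cell (400 in cell A, 100 in cell B in this example).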
The within-household selected person weight
One eligible person per household will be selected at random from amongst all those in households with three or fewer eligible members and two eligible persons will be selected at random within households with four or more eligible members. In the former case, the sum of the probabilities of selection of all eligible members within such households will add to one, and in the latter case, it will add to two. In households with mixed racial composition, for some household size and composition combinations, minority members of the household will be given higher probabilities of selection relative to the nonminority members, resulting in unequal probabilities of selection within these households. For example, in households of size 3 where there is one minority and two nonminorities, the former will be selected with probability 1/2 and each of the latter with probability 1/4. In other household size and composition combinations, persons will be selected with equal probabilities. For example, in households of size 6 or more (which are relatively few in number), all household members will be selected with equal probabilities regardless of household composition. To do otherwise and give minorities within the household a greater probability of selection would unduly inflate the weights of the nonminorities in the household. A conditional (within-household) person weight corresponding to each selected person will be calculated as the inverse of the probability of selecting the particular person within the household. This is the fifth of the multiplicative factors that will be applied to the DU base weight.
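The within-household person weight for the size-3 example above can be worked through directly. The sketch below uses exact fractions and checks that the selection probabilities sum to the number of persons selected; the function name and person labels are illustrative.

```python
from fractions import Fraction

def within_household_weights(probabilities, n_selected):
    """Inverse-probability weights for the eligible members of one household.

    The selection probabilities must sum to the number of persons selected
    (one in households with three or fewer eligible members, two otherwise).
    """
    if sum(probabilities.values()) != n_selected:
        raise ValueError("selection probabilities must sum to n_selected")
    return {person: 1 / p for person, p in probabilities.items()}

# Household of size 3 with one minority and two nonminority members.
probs = {
    "minority": Fraction(1, 2),
    "nonminority_1": Fraction(1, 4),
    "nonminority_2": Fraction(1, 4),
}
weights = within_household_weights(probs, n_selected=1)
```

Here the minority member, if selected, carries a within-household weight of 2, and either nonminority member, if selected, carries a weight of 4.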
The selected person non-response adjustment factor
The selected person non-response adjustment factor accounts for those persons who were selected for the study, but for whom no AUDADIS interview was obtained. This type of non-interview can result from a refusal by the sample person, the inability to contact the sample person, etc. The factor will be computed using adjustment cells defined by relevant segment-level and screener data, where the weights of interviewed persons in the cells are inflated to account for the non-interviewed persons. This is the sixth of the multiplicative factors that will be applied to the DU base weight.
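The way the DU base weight and the six multiplicative factors combine into the nonresponse-adjusted person weight can be sketched as below. All probabilities and factor values are hypothetical and serve only to show the arithmetic.

```python
import math

def du_base_weight(stage_probabilities):
    """Inverse of the product of the conditional selection probabilities
    (PSU, segment within PSU, address within segment)."""
    return 1.0 / math.prod(stage_probabilities)

def adjusted_person_weight(base_weight, factors):
    """DU base weight multiplied by the six adjustment factors."""
    if len(factors) != 6:
        raise ValueError("expected exactly six adjustment factors")
    return base_weight * math.prod(factors.values())

# Hypothetical selection probabilities at the three stages.
base = du_base_weight([0.5, 0.1, 0.02])

# Hypothetical factor values for one responding sample person.
factors = {
    "missed_structures": 1.0,
    "hidden_dus": 1.0,
    "white_household_deselection": 2.0,
    "screener_nonresponse": 1.25,
    "within_household": 4.0,
    "person_nonresponse": 1.1,
}
weight = adjusted_person_weight(base, factors)
```

With these illustrative values the base weight is 1,000 and the adjusted person weight is 11,000.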
Post-stratification ratio adjustment
This adjustment is designed to ensure that weighted sample counts agree with independent estimates of the number of persons in the civilian, non-institutional population of the United States for selected cross-classifications of region, age, sex, race-ethnicity, and relevant person-level characteristics available from the AUDADIS interview. These cross-classifications are also referred to as “dimensions.” The adjustment will be calculated using an iterative procedure referred to as raking ratio estimation or simply “raking”. The independent estimates to be used in the raking adjustment will be derived from 2011 ACS population estimates. The procedure will start with the overall nonresponse-adjusted person weight (defined to be the dwelling unit base weight multiplied by the six adjustment factors above), and will then iteratively adjust or “rake” these weights until the resulting weighted counts agree with the independent control totals for each of the specified raking dimensions. The weights resulting from the final iteration of this process are the final post-stratified person-level weights. While the raking procedure mainly helps to reduce the component of bias resulting from sampling frame under-coverage, it also frequently reduces sampling variance.
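A minimal sketch of the raking iteration is given below, using two dimensions (sex and age group) with hypothetical records, starting weights, and control totals. The convergence rule and iteration limit are assumptions, and the control totals for the different dimensions are assumed to sum to the same population total.

```python
def rake(records, weights, controls, max_iter=50, tol=1e-8):
    """Iteratively scale weights so weighted counts match each control total.

    records:  list of dicts giving each person's category on every dimension
    weights:  starting (nonresponse-adjusted) person weights
    controls: {dimension: {category: independent control total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, totals in controls.items():
            # Current weighted count in each category of this dimension.
            sums = {}
            for rec, wi in zip(records, w):
                sums[rec[dim]] = sums.get(rec[dim], 0.0) + wi
            # Ratio-adjust every weight to the category's control total.
            for i, rec in enumerate(records):
                factor = totals[rec[dim]] / sums[rec[dim]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return w

records = [
    {"sex": "M", "age": "18-39"},
    {"sex": "M", "age": "40+"},
    {"sex": "F", "age": "18-39"},
    {"sex": "F", "age": "40+"},
]
start = [10.0, 20.0, 30.0, 40.0]
controls = {"sex": {"M": 40.0, "F": 60.0}, "age": {"18-39": 45.0, "40+": 55.0}}
final = rake(records, start, controls)
```

After convergence the weighted counts reproduce both sets of control totals at once, which a single ratio adjustment on one dimension would not achieve.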
Most of the major objectives of the NESARC-III relate to the estimation of the prevalence of various alcohol consumption levels, alcohol use disorders, alcohol-related physical and mental consequences and disabilities, and alcohol treatment utilization, among others.
In order to achieve these objectives, the NESARC-III was designed to produce reliable estimates for these characteristics for various population subgroups. Characteristics of most interest are dichotomous, having “yes” or “no” outcomes. The percentage of “yes” responses is denoted by p and represents the prevalence rate of a particular characteristic (e.g., binge drinking). Based on previous research and the accumulation of clinical experience, the majority of characteristics measured in the NESARC-III are expected to have magnitudes of prevalence exceeding 7 percent, while the expected magnitude of a few characteristics will lie between 5 and 8 percent.
A measure of the precision associated with these prevalence rates is the relative standard error (RSE), which is defined as the standard error divided by the prevalence estimate, expressed as a percentage. More specifically, $\mathrm{RSE}(p) = 100 \times \mathrm{SE}(p) / p$, where the standard error $\mathrm{SE}(p)$ is given by the square root of the variance of the estimate, taking into account the complex sample design for NESARC-III. The impact of the various complex features of the sample design on the variance is reflected through inflation factors called design effects (DEFFs). The extent to which these design effects exceed one indicates the extent to which the variance of an estimate based on the complex sample design is greater than the corresponding variance based on a simple random sample (SRS) design. There are four key features to the NESARC-III sampling design that contribute to the overall design effect.
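The relationship among the prevalence p, the design effect, and the RSE can be illustrated numerically. The sketch below takes the variance under the complex design to be the DEFF multiplied by the simple random sample variance p(1 - p)/n, which follows from the definition of the design effect given above; the sample size and DEFF in the example are hypothetical.

```python
import math

def relative_standard_error(p, n, deff=1.0):
    """RSE (in percent) of a prevalence estimate p based on n respondents.

    The complex-design variance is approximated as deff * p * (1 - p) / n.
    """
    se = math.sqrt(deff * p * (1.0 - p) / n)
    return 100.0 * se / p

# Hypothetical subgroup: prevalence 8 percent, 5,000 respondents, DEFF 1.5.
rse = relative_standard_error(p=0.08, n=5000, deff=1.5)
```

Because the standard error scales with the square root of the DEFF, a design effect of 4 doubles the RSE relative to a simple random sample of the same size.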
The first feature is the clustering at both the PSU and segment levels. In general, for a fixed sample size, the greater the number of units to be sampled per cluster, and the more homogeneous the sampling units are with respect to a characteristic of interest within clusters, the greater the DEFF and hence the inflation in the variance (resulting in decreased precision). The level of homogeneity within a cluster is reflected through two types of intraclass correlations: $\rho_{PSU}$ for PSUs and $\rho_{SEG}$ for segments. Note that $\rho_{PSU}$ and $\rho_{SEG}$ will vary in value for different characteristics of interest. The expected standard errors for prevalence estimates for NESARC-III have been calculated taking into account the contributions due to clustering at both the PSU and segment levels under the assumptions that the intraclass correlations ($\rho_{PSU}$, $\rho_{SEG}$) are (.005, .05). These values were based on estimates taken from various sources in the survey literature. The calculations reflect the fact that “certainty PSUs” are in fact strata not PSUs, so that there is no contribution to the variance from clustering at the PSU level for these PSUs. With 150 PSUs selected, it is estimated that there will be approximately 50 certainties representing 40 percent of the U.S. population.
A second feature of the NESARC-III design that contributes to the sampling variability is the planned oversampling of geographic areas with high concentrations of minority populations described earlier. Under the proposed design, households in high minority areas will be sampled at up to two times the rate of households in low minority areas. The unequal weighting DEFFs resulting from this feature of the sample design are expected to range from 1.04 to 1.07, depending on the minority groups of interest (Hispanic, Black or Asian). For nonminorities, the unequal weighting DEFF will be close to 1.0 because of the de-selection of nonminority households in the oversampled high minority strata. For analyses that combine all the races, the unequal weighting DEFF is expected to be approximately 1.09.
The third feature of the NESARC-III design that contributes to the overall sampling variability is the restriction that no more than two persons be sampled from a participating household. This requirement contributes to the variability of weights because persons in multi-person households will be sampled at lower rates than persons in single-person households. This variability in the weights is mitigated to some degree by the decision of NESARC-III to select two persons in households with four or more eligible persons. The unequal weighting DEFFs due to this feature of the sample design are expected to range from 1.09 to 1.16, depending on the race-ethnic group. For analyses that combine all the races, the unequal weighting DEFF is expected to be approximately 1.17.
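The within-household selection rule can be illustrated as follows. The sketch assumes, based on the description above, that one person is selected in households with one to three eligible persons and two persons in households with four or more, so that each selected person carries a within-household weight factor equal to the number of eligible persons divided by the number selected. The distribution of household sizes is hypothetical, and Kish's approximation is used for the design effect.

```python
from typing import Dict, List


def persons_selected(eligible_persons: int) -> int:
    """Two persons in households with four or more eligible persons, else one."""
    return 2 if eligible_persons >= 4 else 1


def within_household_weight(eligible_persons: int) -> float:
    """Inverse of the within-household selection probability."""
    return eligible_persons / persons_selected(eligible_persons)


def kish_weighting_deff(weights: List[float]) -> float:
    """Kish approximation: DEFF = n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


# Hypothetical count of responding households by number of eligible persons.
households_by_size: Dict[int, int] = {1: 30, 2: 50, 3: 12, 4: 6, 5: 2}

# One weight per selected person.
weights: List[float] = []
for size, n_households in households_by_size.items():
    n_persons = n_households * persons_selected(size)
    weights.extend([within_household_weight(size)] * n_persons)

print(f"unequal weighting DEFF = {kish_weighting_deff(weights):.3f}")
```

Selecting two persons in the largest households halves their weight factor, which is the mitigation described above.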
The fourth and final feature of the NESARC-III design that contributes to the overall variance is the clustering within households where two persons are selected (in households with four or more eligible members). Since many analyses are likely to be stratified by race-ethnicity, age and sex, NESARC-III will select two persons in larger households in such a manner as to ensure that they are as diverse as possible with respect to these characteristics, in order to minimize the potential clustering effect. Regardless, in the calculations of the estimates of precision that follow, the contributions due to clustering assume the worst case scenario where both selected persons in large households are included in a particular subgroup analysis. As with PSUs and segments, the level of homogeneity within a household is reflected through an intraclass correlation. Although this correlation will vary in value for different characteristics of interest, we assumed a relatively large value in our calculations. The clustering DEFFs due to this feature of the sample design are expected to range from 1.01 to 1.06, depending on the race-ethnicity group. For analyses that combine all the races, the clustering DEFF is expected to be approximately 1.02.
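When some households contribute one respondent and others contribute two, a common approximation replaces the cluster size in DEFF = 1 + (m - 1) × rho with the weighted average cluster size Σm² / Σm. The sketch below applies that approximation. The household counts and the within-household intraclass correlation are hypothetical placeholders and are not the values used in the NESARC-III calculations.

```python
def household_clustering_deff(n_one_person: int, n_two_person: int, rho_household: float) -> float:
    """Approximate clustering DEFF when households yield one or two respondents.

    Uses DEFF = 1 + rho * (sum(m^2) / sum(m) - 1), where m is the number of
    respondents per household (1 or 2 here).
    """
    persons = n_one_person + 2 * n_two_person      # sum of m
    sum_sq = n_one_person + 4 * n_two_person       # sum of m^2
    if persons <= 0:
        raise ValueError("at least one respondent is required")
    return 1.0 + rho_household * (sum_sq / persons - 1.0)


# Hypothetical inputs: 92 one-respondent households, 8 two-respondent
# households, and an assumed within-household correlation of 0.3.
print(f"clustering DEFF = {household_clustering_deff(92, 8, 0.3):.3f}")
```

With cluster sizes limited to one or two, the term in parentheses equals the fraction of respondents who share a household with another respondent, which is why the design effect stays close to 1.0 even for a large correlation.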
Note that for analyses of subgroups of race-ethnicity, say by age or sex, the above DEFFs will diminish since there generally will be fewer members of the subgroups contributing to the clustering effect.
Estimates of RSEs for NESARC-III were calculated (taking into account the DEFFs resulting from the four sample design features described above) for the sex, age and race-ethnic categories on which the NESARC-III universe and sample compositions in Tables 2 and 3 were based. However, the sample sizes in some of the age-sex breakdowns for the “Asians” category from Table 3 are expected to be too small to yield sufficient precision for estimates of important survey variables (yielding RSEs of greater than 30 percent). For this reason, summary results presented in this subsection are presented for three collapsed age categories only (18-29, 30-49 and 50+). Note, also, that the “All Other Races” category is not a category of particular analytical interest, and therefore is excluded in the tables that follow.
The estimated RSEs for each cell of a one-way, two-way and three-way classification defined by the sex, age and race-ethnic groups described earlier are presented in Table 4A for a prevalence of 5 percent and in Table 4B for a prevalence of 10 percent. RSEs of greater than 30 percent are suppressed as noted above. Summary results from Tables 4A and 4B are given in the bullet points below. The various cross-classifications are indicated by the varied shading in the tables.
Total Sample (indicated with double outline in Tables 4A and 4B)
From Table 4A (for a 5 percent or more characteristic), for an analysis involving the total sample, an RSE of 3.3 percent is expected.
From Table 4B (for a 10 percent or more characteristic), for an analysis involving the total sample, an RSE of 2.3 percent is expected.
One-way Classifications (indicated with dark shading in Tables 4A and 4B)
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 5 percent is expected for a one-way analysis by sex (for male or female).
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 5 percent is generally expected for a one-way analysis by age (for any of the three age categories).
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 10 percent is generally expected for a one-way analysis by race-ethnicity (for any of the four race-ethnicity categories).
Two-way Cross Classifications (indicated with light shading in Tables 4A and 4B)
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 10 percent is expected for a two-way analysis of age by sex.
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 10 percent is generally expected for a two-way analysis of race-ethnicity by age, for Whites, Hispanics and Blacks (each crossed by age).
From Table 4A (for a 5 percent or more characteristic) and from Table 4B (for a 10 percent or more characteristic), an RSE of less than 10 percent is generally expected for a two-way analysis of race-ethnicity by sex, for Whites, Hispanics and Blacks (each crossed by sex).
Three-way Cross Classifications (indicated without shading in Tables 4A and 4B)
From Table 4A (for a 5 percent or more characteristic), an RSE of less than 15 percent is generally expected for a three-way analysis of race-ethnicity by age by sex for Whites, Hispanics and Blacks (crossed by age and sex).
From Table 4B (for a 10 percent or more characteristic), an RSE of less than 10 percent is generally expected for a three-way analysis of race-ethnicity by age by sex for Whites, Hispanics and Blacks (crossed by age and sex).
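The relationship between prevalence and RSE in the summary above can be sketched with the standard expression for the relative standard error of an estimated proportion, RSE = sqrt(DEFF × (1 - p) / (n × p)). The sketch is illustrative and is not the production calculation. The function names are hypothetical, and sample sizes and design effects are left as inputs because they vary by cell. Holding the sample size and DEFF fixed, rescaling the 3.3 percent total-sample RSE at a prevalence of 5 percent to a prevalence of 10 percent gives roughly 2.3 percent, in line with the total-sample figures quoted above.

```python
import math


def rse_of_proportion(p: float, n: float, deff: float = 1.0) -> float:
    """Relative standard error of an estimated proportion p.

    n is the number of respondents in the cell and deff is the overall
    design effect (both are inputs supplied by the analyst).
    """
    return math.sqrt(deff * (1.0 - p) / (n * p))


def rescale_rse(rse: float, p_from: float, p_to: float) -> float:
    """Convert an RSE at prevalence p_from to prevalence p_to,
    holding the sample size and design effect fixed."""
    factor = ((1.0 - p_to) / p_to) / ((1.0 - p_from) / p_from)
    return rse * math.sqrt(factor)


def format_rse(rse: float, threshold: float = 0.30) -> str:
    """Render an RSE as a percentage, suppressing values above 30 percent."""
    return "*" if rse > threshold else f"{rse:.1%}"


# Total-sample RSE of 3.3 percent at 5 percent prevalence, rescaled to 10 percent.
print(format_rse(rescale_rse(0.033, 0.05, 0.10)))
```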
Attachment 23 describes the sample size requirements for determining genetic main effects and gene-environment and gene-gene interactions given expected minor allele frequencies, prevalences of outcome measures and environmental and genetic risk factors and empirically demonstrated effect size.
Table 4A. Estimated RSEs for a prevalence of 5 percent (columns are age categories)
Sex/Race-Ethnicity | 18+ Years | 18-29 Years | 30-49 Years | 50+ Years |
Total | 3.3% | 5.3% | 4.5% | 4.3% |
Male | 4.1% | 7.1% | 5.8% | 5.7% |
Female | 4.0% | 7.2% | 5.8% | 5.3% |
Hispanic or Latino | | | | |
Total | 6.1% | 10.7% | 8.8% | 11.2% |
Male | 8.2% | 14.7% | 12.0% | 16.2% |
Female | 8.4% | 15.2% | 12.7% | 15.2% |
Not Hispanic or Latino, Black | | | | |
Total | 5.8% | 10.7% | 9.1% | 9.4% |
Male | 8.2% | 15.0% | 13.0% | 14.2% |
Female | 7.7% | 15.1% | 12.4% | 12.3% |
Not Hispanic or Latino, Asians | | | | |
Total | 10.4% | 22.5% | 15.6% | 17.3% |
Male | 15.1% | * | 22.5% | 25.9% |
Female | 14.1% | * | 21.4% | 23.2% |
Not Hispanic or Latino, White | | | | |
Total | 3.6% | 6.7% | 5.4% | 4.6% |
Male | 4.6% | 9.2% | 7.2% | 6.3% |
Female | 4.5% | 9.4% | 7.3% | 5.9% |
* Values greater than 30% suppressed
** Double outline represents the total, dark shading represents all one-way classifications, light shading represents all two-way cross classifications, and no shading represents all three-way cross classifications
Table 4B. Estimated RSEs for a prevalence of 10 percent (columns are age categories)
Sex/Race-Ethnicity | 18+ Years | 18-29 Years | 30-49 Years | 50+ Years |
Total | 2.3% | 3.7% | 3.1% | 2.9% |
Male | 2.8% | 4.9% | 4.0% | 3.9% |
Female | 2.7% | 5.0% | 4.0% | 3.7% |
Hispanic or Latino | | | | |
Total | 4.2% | 7.3% | 6.1% | 7.7% |
Male | 5.7% | 10.1% | 8.3% | 11.1% |
Female | 5.8% | 10.5% | 8.7% | 10.4% |
Not Hispanic or Latino, Black | | | | |
Total | 4.0% | 7.4% | 6.3% | 6.5% |
Male | 5.7% | 10.3% | 9.0% | 9.7% |
Female | 5.3% | 10.4% | 8.5% | 8.5% |
Not Hispanic or Latino, Asians | | | | |
Total | 7.2% | 15.5% | 10.7% | 11.9% |
Male | 10.4% | 22.2% | 15.5% | 17.8% |
Female | 9.7% | 21.5% | 14.7% | 16.0% |
Not Hispanic or Latino, White | | | | |
Total | 2.5% | 4.6% | 3.7% | 3.2% |
Male | 3.2% | 6.3% | 5.0% | 4.3% |
Female | 3.1% | 6.4% | 5.0% | 4.1% |
** Double outline represents the total, dark shading represents all one-way classifications, light shading represents all two-way cross classifications, and no shading represents all three-way cross classifications
NESARC-III will not be immune to the declining response rate trends experienced in recent years across most surveys. Methods to maximize response rates will be planned and implemented both prior to and during the data collection effort.
Westat will recruit a team of experienced data collectors and field supervisors sufficient in size to work all cases thoroughly. These field staff will be strategically located within or in close proximity to PSUs, which will expedite visits to the sample dwelling units and will also ensure that they are familiar with the communities within which the cases are located. Data collectors will also be thoroughly trained in gaining respondent cooperation through refusal aversion and conversion. Field management will ensure that data collection efforts are thoroughly planned down to the data collector level; for example, production goals will be developed which will set a pace for individual data collectors, field supervisor teams, and the nation as a whole.
Several tools and approaches to address nonresponse and maximize response rates will be utilized, in addition to the respondent incentive described in section A.9. Extensive respondent materials will be developed to encourage participation. These will include advance letters to inform selected households of the study prior to in-person contact (Attachment 8). The advance letter will contain assurances of privacy and describe the voluntary nature of the survey and the principal purposes and uses of the survey data. A respondent telephone call line will be established to answer respondents’ questions and to reassure them of the credibility of the study. Tailored letters will be developed for use with reluctant respondents/sample persons and with selected units located in limited-access situations (doorperson buildings, gated communities, etc.), which may be sent via FedEx or priority mail to reinforce the perceived importance of participation. (See Attachment 22 for an example refusal letter.)
A web-based Field Management System (FMS) will allow field supervisors to closely monitor each data collector’s work, which facilitates the development of strategies to address nonresponse. These strategies will include reassigning difficult or reluctant cases among local data collectors and the use of specially trained, traveling data collectors who are highly skilled in refusal conversion.
Lastly, the data collection efforts will implement a phased approach that anticipates refusal conversion efforts. In this approach, new sample segments will be released to the data collectors approximately every quarter. Cases from earlier quarters will not have to be closed out prior to releasing new sample, which will allow additional time to complete challenging cases. Further, the most difficult-to-work segments will be released in the first or second quarters of data collection, thereby giving the data collection staff additional time to work the cases. Front-loading the sample release in this manner allows data collectors the opportunity to implement the full contact strategy, including nonresponse conversion as needed.
To compensate for non-interviews that cannot be converted, the NESARC-III data will be adjusted using the accuracy procedures described in Section B.2. The specific procedure selected will ensure the accuracy of the resulting estimators and the suitability of the compensated data set for addressing the major objectives of the survey.
The overall response rate for the NESARC-III is expected to be 81 percent. (See Section B.1f for evidence to support this expected response rate.) The response rate will be calculated as the number of survey participants divided by the number of eligible sample persons. Ineligible persons include persons under the age of 18 years, respondents whose mental and/or physical impairment precludes participation in the survey, military personnel on active duty and persons who are unable to conduct their interviews in English, Spanish, Mandarin, Vietnamese, Korean or Cantonese.
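The response rate definition above amounts to a simple ratio, sketched below. The function name and the counts are hypothetical and serve only to show the arithmetic; ineligible sample persons are removed from the denominator before the rate is computed.

```python
def response_rate(participants: int, sampled_persons: int, ineligible_persons: int) -> float:
    """Participants divided by eligible sample persons.

    Eligible sample persons are the sampled persons minus those found to be
    ineligible (for example, persons under 18 or on active military duty).
    """
    eligible = sampled_persons - ineligible_persons
    if eligible <= 0:
        raise ValueError("there must be at least one eligible sample person")
    if not 0 <= participants <= eligible:
        raise ValueError("participants must be between 0 and the eligible count")
    return participants / eligible


# Hypothetical counts chosen only to illustrate an 81 percent response rate.
rate = response_rate(participants=8_100, sampled_persons=10_500, ineligible_persons=500)
print(f"response rate = {rate:.0%}")
```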
NIAAA proposes a small pilot study to be conducted approximately two months following OMB approval. The proposed pilot test will be conducted with approximately 50-100 paid ($100) volunteers (18 years and older) and will serve only as a test of the data collection procedures and operations. The pilot study will be conducted approximately 11 months in advance of the main data collection effort, as part of the main study preparations, designed to fine-tune the data collection procedures. Data collectors will obtain consent from pilot study respondents. (See Attachment 8.)
The objectives of this pilot study will be: (1) to test the DNA collection in a household setting (saliva will be collected but not analyzed); (2) to test data collector training procedures and materials; (3) to test data processing and the interface between the DNA repository and Westat; (4) to test interaction and communication flow between NYSPI and Westat’s Telephone Research Center (TRC); (5) to test the incentive procedure; and (6) to test systems and security architectures.
The pilot study will include administering the AUDADIS-V and PRISM reinterviews (associated with Reliability and Validity studies) to no more than 5 of the pilot study respondents to test the related CAPI/CATI programs. No personally identifying information will be collected from the pilot study respondents.
None of the data collected in the pilot study (including the saliva sample) will be analyzed, and all such data will be destroyed within one week after the respondent completes participation in the pilot study. Additionally, no changes to the AUDADIS-V or PRISM instruments will be made as a result of the pilot study. OMB will be notified of any changes to data collection procedures made as a result of the pilot study.
Sampling Design
Graham Kalton, Ph.D.
Senior Sampling Statistician
Adam Chu, Ph.D.
Senior Sampling Statistician
Westat
1600 Research Blvd.
Rockville, MD 20850
Survey Operations
David Maklin, Ph.D.
Survey Director
Martha Berlin
Director of Survey Operations
Michelle Amsbary
Task Leader for Survey Design and Operations
Westat
1600 Research Blvd.
Rockville, MD 20850
IT System
Ron Hirschhorn
IT Systems Director
Westat
1600 Research Blvd.
Rockville, MD 20850
Statistical Analysis
Bridget F. Grant, Ph.D., Ph.D.
NIAAA Project Officer
Chief, Laboratory of Epidemiology and Biometry
Division of Intramural Clinical and Biological Research
National Institute on Alcohol Abuse and Alcoholism
National Institutes of Health
5635 Fishers Lane, MSC 9304
Bethesda, MD 20892-9304
File Type | application/msword |
File Title | National Epidemiologic Survey on Alcohol and Related Conditions - III |
Author | Debra Reames |
Last Modified By | curriem |
File Modified | 2010-12-20 |
File Created | 2010-12-20 |