NSCAW III Phase I and II Site Selection and Sampling
This document provides an overview of Phases I and II of NSCAW III (OMB # 0970-0202), which guided the site selection of child welfare agencies and the sampling of the target population.
For the sake of comparability across cohorts, the National Survey of Child and Adolescent Well-Being (NSCAW III) sample design will mirror the original design used in NSCAW I and replicated in NSCAW II (see footnote 1). However, unlike NSCAW II, which reused the NSCAW I primary sampling units (PSUs), NSCAW III will select a new sample of PSUs using a procedure that maximizes overlap with the earlier PSU sample. The sample design chosen for NSCAW III is based on the lessons learned from NSCAW I and NSCAW II and incorporates enhancements to improve the sampling precision.
Key features of the NSCAW III sampling design are as follows:
Rather than carry over all former PSUs from prior cohorts of NSCAW, a new sample of 83 PSUs will be selected in order to update the probability proportional to size (PPS) selection probabilities for the current distribution of the child welfare population.
A “maximal PSU sampling coordination” approach will be used that maximizes the probability of sampling PSUs (or agencies) in the NSCAW II sample.
Agency first contact (AFC) states—i.e., states having legal statutes requiring the agencies to contact families and obtain written permission to allow their information to be released—will be removed from the sample once they have been identified through the recruitment process.
The plan for sampling PSUs was outlined in detail in Phase I of the study approved by OMB in November 2016 (OMB # 0970-0202). Phase I activities began in November, are ongoing, and include the recruitment of the child welfare agencies and the collection of sample frame files that will be used to select the sample of children involved with the child welfare system (CWS).
For the baseline and 18-month follow-up data collection, the target population for NSCAW III includes all children ages 0-17½ who come into contact with the CWS during the 12-month sampling period. Specifically, the target population includes (1) children who were investigated or assessed for child abuse or neglect and (2) children who entered state legal custody through other pathways (e.g., juvenile justice). Children in the second group, who are placed into legal guardianship, may comprise as much as 20 percent of children in out-of-home placement. The target population for Phase II of NSCAW III is shown in Exhibit 1.1.
Exhibit 1.1. Phase II of NSCAW III Target Population
According to 2014 data from the National Child Abuse and Neglect Data System (NCANDS), an estimated 3.6 million referrals of abuse or neglect, concerning approximately 6.6 million children, were received by child protective services (CPS). Almost 61 percent of those referrals were accepted for investigation or assessment.
NSCAW III will construct a sampling frame consisting of all counties in the U.S. except for (1) very small counties, namely those expected to produce fewer than 55 completed NSCAW III interviews, and (2) counties in AFC states, whose state law prohibits the release of information. The first exclusion was also applied in previous NSCAW samples for cost efficiency. Only about 1-2 percent of the child welfare population resides in these small counties, so their exclusion has a negligible effect on population coverage and estimation bias. The second exclusion (AFC states) is necessary because child welfare agencies in these states are prevented by state law from participating in NSCAW.
The NSCAW I target population represented approximately 94.6 percent of the U.S. population of children investigated or assessed for child abuse or neglect during the sampling period. In NSCAW II, it was approximately 88 percent as a result of the additional AFC states that were dropped from the study. In NSCAW III, population coverage for this same group is likely to be approximately at the NSCAW II level, or perhaps slightly lower if more states have passed AFC legislation necessitating their removal from the target population. Neither NSCAW I nor NSCAW II included children who enter and are served by the CWS without a maltreatment investigation. These children will be included in the NSCAW III target population (see Exhibit 1.1), so the coverage of this more broadly defined population could be greater than that of NSCAW I.
NSCAW III proposes to use a stratified, cluster sample design, similar to prior cohorts of NSCAW.
Phase I: Sampling of Child Welfare Agencies (Previously Approved; 0970-0202, Nov 2016)
The first stage of sampling involved the selection of primary sampling units (PSUs), which for this study are U.S. counties. The following text includes the plan for sampling PSUs as submitted in the Phase I OMB submission approved in November 2016 (OMB # 0970-0202).
The frame PSUs will be ordered by Census region, by state within Census region, and then by urban/rural status to ensure that urban and rural areas within each region and state will be sampled in proportion to their child welfare populations. In a few very large counties such as Los Angeles County, child welfare agencies will be sampled proportionately. A frame of all children in the target population will be developed for each sample agency using lists obtained from the agencies during each month of the sampling period. Then children will be selected disproportionately in each PSU according to their sampling domain to achieve the desired sample size in each domain.
Biemer (2007) determined that a sample size of 55 to 60 completed cases per PSU, per year, is ideal for the general NSCAW design in terms of cost versus error optimization. Thus, for an overall sample size of 4,565 cases, 83 PSUs/cooperating child welfare agencies is optimal.
Using a maximum sampling coordination approach, a sample of 83 PSUs/cooperating child welfare agencies will be selected via PPS sampling using composite size measures that incorporate the population sizes of the selected domains in each PSU. Data from the most recent NCANDS file will supply these population counts. The composite size measure method (Folsom, Potter and Williams, 1987) provides a means to control domain sample sizes that maximizes the efficiency of the design by minimizing weight variation for units within sampling domains. PSUs will be defined essentially as they were in NSCAW II (i.e., geographic areas that encompass the population served by a single child welfare agency). In most cases, these areas correspond to counties or contiguous areas of two or more counties. In larger metropolitan areas with branch offices, the county will be subdivided into areas served by a single agency/office.
As in NSCAW I, the selection of primary sampling units (PSUs) will involve the following four steps:
1. Partition the target population into PSUs (i.e., counties)
2. Compute a size measure for each PSU
3. Stratify the PSU sampling frame
4. Select the sample of PSUs
The activities for carrying out each of these four steps are outlined below:
Step 1: Partitioning the Target Population into PSUs. The administrative structure of the child welfare system varies considerably across the states and even within states. Therefore, a single definition of a PSU is not feasible, since it depends on the administrative structure of the state system as well as the jurisdictions of child welfare agencies within the state. For most areas of the country, the best definition of a PSU is the county, since it corresponds to a clearly defined political entity and a geographic area of manageable size. In other areas, the definition of a PSU is not as straightforward, for example where a single child welfare agency has jurisdiction over several counties. In such instances, the PSU will be defined as part of or the entire area over which the child welfare agency has jurisdiction. Extremely large counties or metropolitan statistical areas (MSAs) have child welfare agencies with many branch offices, each with its own data system. Such PSUs will be divided into smaller units, such as areas delineated by branch office jurisdictions, to create manageable PSUs. For the purpose of the first-stage sampling discussion, counties are referred to as PSUs for simplicity's sake.
Phase II: Sampling of Children (Previously Approved; 0970-0202, July 2017)
The second stage of sampling involved selecting children from each PSU. The following text includes the plan for sampling children as submitted in the Phase II OMB submission approved in July 2017 (OMB # 0970-0202).
Step 2: Compute a Size Measure for Each PSU. The second-stage sampling units will be stratified into nine domains of interest to control the second-stage sample allocation so that domains of interest have sufficient sample sizes. The second-stage NSCAW III domains and the allocation of achieved sample sizes are shown in Exhibit 1.2. Note that there are a total of nine sampling domains defined by columns 1 and 2 of the table. Column 3 is the expected sample size under proportionate sampling. Because some of these sample sizes are inadequate for research purposes, they will be increased to the levels shown in Column 4 (target sample size). Also shown in this table are the unequal weighting effect (UWE), the clustering effect (based upon the expected intracluster correlation), the effective sample size (which is the target sample size divided by the UWE and the clustering effect), and the minimum detectable effect size (MDES), which is the smallest effect size that can be detected with 80 percent power and a type I error of 5 percent.
Exhibit 1.2. Second-Stage Sampling Domains
Age group | Services | Proportionate sample size | Target sample size | UWE | Clustering effect | Effective sample size | MDES
Infant (under 1 year) | Services in home | 109 | 533 | 1.000 | 1.358 | 392 | 0.200
Infant (under 1 year) | Services out of home | 73 | 533 | 1.000 | 1.358 | 392 | 0.200
Infant (under 1 year) | No services | 182 | 533 | 1.000 | 1.358 | 392 | 0.200
Ages 1 to 11 | Services in home | 914 | 397 | 1.000 | 1.250 | 318 | 0.222
Ages 1 to 11 | Services out of home | 291 | 189 | 1.000 | 1.084 | 174 | 0.300
Ages 1 to 11 | No services | 1,799 | 782 | 1.000 | 1.556 | 503 | 0.177
Ages 12 to 17 | Services in home | 366 | 533 | 1.000 | 1.358 | 392 | 0.200
Ages 12 to 17 | Services out of home | 123 | 533 | 1.000 | 1.358 | 392 | 0.200
Ages 12 to 17 | No services | 708 | 533 | 1.000 | 1.358 | 392 | 0.200
Total | | 4,565 | 4,565 | 1.754 | 4.564 | 570 | 0.166
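The design-effect columns of Exhibit 1.2 can be approximately reproduced with a few standard formulas. The sketch below is illustrative only: the source does not state the formulas it used, so the clustering effect 1 + (m - 1) x rho (with m the average number of cases per PSU), the Kish unequal weighting effect, and the MDES expression (z for 5 percent two-sided error plus z for 80 percent power, times the square root of 2 over the effective sample size) are assumptions chosen because they match the published values after rounding. The function names are hypothetical.

```python
from statistics import NormalDist

RHO = 0.066    # correlation assumed in the text (from NSCAW II analysis)
N_PSUS = 83    # number of sampled PSUs

# Proportionate and target sample sizes for the nine domains in Exhibit 1.2.
PROPORTIONATE = [109, 73, 182, 914, 291, 1799, 366, 123, 708]
TARGET = [533, 533, 533, 397, 189, 782, 533, 533, 533]


def clustering_effect(target_n, n_psus=N_PSUS, rho=RHO):
    """Assumed form: 1 + (m - 1) * rho, where m is the average cases per PSU."""
    m = target_n / n_psus
    return 1 + (m - 1) * rho


def effective_n(target_n, uwe, cluster_effect):
    """Target sample size divided by the UWE and the clustering effect."""
    return target_n / (uwe * cluster_effect)


def mdes(n_eff, alpha=0.05, power=0.80):
    """Assumed form: (z_{1-alpha/2} + z_{power}) * sqrt(2 / n_eff)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * (2 / n_eff) ** 0.5


def kish_uwe(proportionate, target):
    """Kish UWE, n * sum(w^2) / (sum w)^2, with domain weights taken as
    proportional to proportionate / target sample size (an assumption)."""
    weights = [p / t for p, t in zip(proportionate, target)]
    n = sum(target)
    sum_w = sum(t * w for t, w in zip(target, weights))
    sum_w2 = sum(t * w * w for t, w in zip(target, weights))
    return n * sum_w2 / sum_w ** 2


for target_n in sorted(set(TARGET)):
    # The exhibit's effective sizes match division by the rounded factor.
    deff = round(clustering_effect(target_n), 3)
    n_eff = round(effective_n(target_n, 1.0, deff))
    print(target_n, deff, n_eff, round(mdes(n_eff), 3))

print("overall UWE:", round(kish_uwe(PROPORTIONATE, TARGET), 3))
```

Note that the nine target sizes listed in the exhibit sum to 4,566 rather than 4,565, presumably because of rounding in the individual domain targets; the difference does not change the overall UWE at three decimal places.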
The composite size measure method, described in Folsom, Potter, and Williams (1987), will provide a method for controlling domain sample sizes while maximizing the efficiency of the design. The composite size measure reflects the size of the sample that would fall into the PSU if a national random sample of children were selected with the desired sampling rates for all domains but without PSU clustering.
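As a rough illustration of the description above, the composite size measure of a PSU can be sketched as the sum, over domains, of the desired domain sampling rate times the PSU's population count in that domain. The domain names, counts, and function names below are hypothetical and are not taken from the NSCAW frame; the two example PSUs are assumed to make up the whole population.

```python
import math


def sampling_rates(target_n, population_n):
    """Desired national sampling rate per domain: target sample / population."""
    return {d: target_n[d] / population_n[d] for d in target_n}


def composite_size(psu_counts, rates):
    """Expected sample falling in a PSU under an unclustered national sample."""
    return sum(rates[d] * count for d, count in psu_counts.items())


# Hypothetical two-domain population split across two PSUs.
target_n = {"infant": 50, "older": 50}
psus = {
    "PSU A": {"infant": 400, "older": 3000},
    "PSU B": {"infant": 600, "older": 7000},
}
population_n = {
    d: sum(counts[d] for counts in psus.values()) for d in target_n
}

rates = sampling_rates(target_n, population_n)
sizes = {name: composite_size(counts, rates) for name, counts in psus.items()}
for name, size in sizes.items():
    print(name, round(size, 2))

# Summed over all PSUs, the composite sizes equal the total target sample.
print(math.isclose(sum(sizes.values()), sum(target_n.values())))
```

Because the measure weights each domain by its own sampling rate, a PSU with many children in a heavily oversampled domain receives a larger size measure than its raw child count alone would suggest.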
After the composite size measures are computed, each of the approximately 3,140 counties on the initial PSU frame will be checked to determine whether it is large enough to support the planned completion of at least 55 valid interviews per PSU during the 12-month data collection period. In NSCAW I, approximately 700 counties with an expected number of 55 or fewer eligible children were deleted from the frame; this accounted for approximately 1 percent of the target population. We expect a similar final coverage rate for NSCAW III.
Step 3: Stratifying the First-Stage Frame. The PSU frame will be implicitly stratified by nine census regions and urbanicity within region prior to sampling. The urbanicity of a PSU will be defined by whether the county is part of an MSA (extremely large county). Stratifying PSUs by region and urbanicity allows for controlled allocation of sample PSUs in these implicit strata.
Step 4: Selecting the PSUs. Given the first-stage stratification and the size measure $S_{hk}$, the selection frequency of the kth PSU in the hth first-stage stratum is calculated as
$$\pi_{hk} = \frac{n_h \, S_{hk}}{S_{h+}} \qquad (2)$$
where $n_h$ is the number of PSUs selected from the hth first-stage stratum and $S_{h+}$ is the total size measure of all PSUs in the hth first-stage stratum.
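A minimal sketch of the selection frequency in (2): each PSU's frequency is the number of PSUs to be drawn from its stratum times the PSU's share of the stratum's total size measure. The stratum contents and the function name are hypothetical.

```python
import math


def selection_frequencies(size_measures, n_selected):
    """Selection frequency per PSU: n_h * S_hk / (sum of S_hk in the stratum)."""
    stratum_total = sum(size_measures.values())
    return {
        psu: n_selected * size / stratum_total
        for psu, size in size_measures.items()
    }


# Hypothetical stratum with three PSUs, two of which are to be selected.
stratum = {"PSU A": 6.0, "PSU B": 9.0, "PSU C": 5.0}
freqs = selection_frequencies(stratum, n_selected=2)
for psu, freq in freqs.items():
    print(psu, round(freq, 3))

# The frequencies in a stratum sum to the number of PSUs selected from it.
print(math.isclose(sum(freqs.values()), 2.0))
```

A frequency above 1 would indicate a PSU large enough to be selected with certainty; how such PSUs are handled is not described in the text and is not covered by this sketch.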
PSUs will be selected using an algorithm that maximizes the expected number of PSUs that will overlap NSCAW I and NSCAW III while ensuring that the unconditional selection probability is as in (2). Given the sample of NSCAW I PSUs in stratum h, the algorithm produces a set of conditional probabilities while preserving the unconditional probabilities specified in (2). PSUs will then be sampled from each stratum using the conditional probabilities produced by the algorithm in Ernst (1995).
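The Ernst (1995) algorithm is not reproduced here. To illustrate the underlying idea of conditional probabilities that favor previously selected PSUs while preserving the new unconditional probabilities, the sketch below uses the simpler, classical Keyfitz procedure, which applies only to the special case of one PSU selected per stratum. It is a substitute technique for illustration, not the method used in NSCAW III, and the probabilities shown are hypothetical.

```python
import math


def keyfitz_conditional(old_p, new_p):
    """Return cond[i][j] = P(new selection is j | old selection was i).

    Assumes one PSU per stratum, old_p[i] > 0 for every unit, and that
    old_p and new_p each sum to 1.
    """
    units = list(old_p)
    gain = {j: max(new_p[j] - old_p[j], 0.0) for j in units}
    total_gain = sum(gain.values())
    cond = {}
    for i in units:
        # Retain the old PSU with probability min(1, new / old).
        keep = min(1.0, new_p[i] / old_p[i])
        row = {j: 0.0 for j in units}
        row[i] = keep
        if keep < 1.0:
            # Otherwise switch to a PSU whose probability increased,
            # proportionally to the size of the increase.
            for j in units:
                if gain[j] > 0:
                    row[j] += (1.0 - keep) * gain[j] / total_gain
        cond[i] = row
    return cond


def unconditional(old_p, cond):
    """Unconditional new-sample probabilities implied by the conditionals."""
    return {j: sum(old_p[i] * cond[i][j] for i in old_p) for j in old_p}


old_p = {"PSU A": 0.5, "PSU B": 0.3, "PSU C": 0.2}
new_p = {"PSU A": 0.4, "PSU B": 0.4, "PSU C": 0.2}

cond = keyfitz_conditional(old_p, new_p)
implied = unconditional(old_p, cond)
overlap = sum(old_p[i] * cond[i][i] for i in old_p)
independent = sum(old_p[i] * new_p[i] for i in old_p)

print({k: round(v, 3) for k, v in implied.items()})
print(round(overlap, 3), round(independent, 3))
```

In this example the implied unconditional probabilities equal the new probabilities, and the chance of retaining the old PSU is the sum of min(old, new) across units, which is well above the chance of overlap under independent reselection.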
After selecting the PSUs for the study, the process of recruiting the child welfare agencies associated with the PSUs will begin. As these agencies are recruited, we will work with them individually to refine our projections of the expected sizes of the domains of analysis for sampling. The nine domains for the study are shown in Exhibit 1.2. As shown in this exhibit, the number of children that will be selected in each domain will be sufficient to achieve a minimum detectable effect size (MDES) of approximately 0.2 in most of the nine domains; the MDES values listed in Exhibit 1.2 range from 0.177 to 0.300. When calculating the necessary sample sizes, we assumed an intracluster correlation of 0.066 based upon an analysis of the NSCAW II key characteristics.
As previously noted, a sample size of about 55 completed interviews per PSU is ideal for NSCAW for cost and error optimization. We will use the data available from both NSCAW I and II to establish initial sample allocations for each domain within PSU. Then we will adjust those sample allocations throughout the data collection process by the following steps:
1. Each month, the contractor (RTI International) will receive files from each child welfare agency containing all children with completed investigations/assessments as well as children entering legal custody through alternative pathways such as the juvenile justice system.
2. These files will be processed and any duplications will be removed.
3. The contractor will compute the number of cases to select in each domain, in each PSU, in any given month using an algorithm we developed in NSCAW II. The algorithm determines the number of cases to select so that target sample sizes are achieved by the end of data collection; it then optimizes the allocation of sample across PSUs so that UWEs are minimized while equalizing interviewer assignments.
4. The sample for each domain in each PSU is selected, reviewed for accuracy, and transmitted to the field. These steps are depicted in Exhibit 1.3.
In prior NSCAW studies, some child welfare agencies were slow to enter cases and their outcomes into their agency data systems. In fact, there could be a lag of up to three months before an investigated case was finally entered into the system. In these agencies, a sampling process that only obtained cases completed in month t would miss cases that were not entered into the system until month t+1 or later. For that reason, we will obtain four files from the agency for each month of sampling: the month t file as of month t, as well as the month t file after it has been updated in months t+1, t+2, and t+3. This will ensure there will be no loss of coverage as a result of delayed data entry.
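One way to picture the four-file approach is as a merge of successive snapshots of the same month's file, keeping the most recent version of each case and flagging cases that first appear in a later snapshot. The sketch below is hypothetical: the record layout, the `case_id` field, and the function names are assumptions, since the agency file formats are not described in this document.

```python
def merge_snapshots(snapshots):
    """Merge snapshots of the month-t file, ordered from month t to t+3.

    Later snapshots override earlier versions of the same case.
    """
    merged = {}
    for records in snapshots:
        for record in records:
            merged[record["case_id"]] = record
    return merged


def late_entries(snapshots):
    """Case IDs absent from the first snapshot but present in a later one."""
    first = {record["case_id"] for record in snapshots[0]}
    return set(merge_snapshots(snapshots)) - first


# Hypothetical month-t file as of months t, t+1, t+2, and t+3.
snapshots = [
    [{"case_id": "c1", "status": "open"}],
    [{"case_id": "c1", "status": "closed"},
     {"case_id": "c2", "status": "closed"}],
    [{"case_id": "c1", "status": "closed"},
     {"case_id": "c2", "status": "closed"}],
    [{"case_id": "c1", "status": "closed"},
     {"case_id": "c2", "status": "closed"},
     {"case_id": "c3", "status": "closed"}],
]

frame = merge_snapshots(snapshots)
print(sorted(frame))
print(sorted(late_entries(snapshots)))
```

Cases flagged as late entries would have been missed by a process that read only the first snapshot, which is the coverage loss the four-file approach is meant to prevent.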
Exhibit 1.3. Flow Diagram of the Sampling Process
The sampling of agencies (or PSUs) during the agency recruitment phase will result in the selection of 83 participating agencies. From the 83 participating agencies, approximately 4,565 CWS-involved children, their caregivers, and their caseworkers will be interviewed. It is not particularly meaningful to specify the statistical power of NSCAW child-level analysis overall because it will include so many different research questions, variables, and subpopulations. However, in Exhibit 1.2 we show the MDES for estimates within the primary domains of analysis. The target sample size for NSCAW III is 4,565 completed cases, where a completed case is defined as a completed interview for the key respondent (defined as the caregiver if the child is under the age of 11, and as the child if the child is 11 or older). Based on experience with the NSCAW II analysis, a sample of this size is adequate for many types of analysis, both cross-sectional and longitudinal.
To determine the number of cases to draw from each PSU, the initial sample need was estimated at 8,695 sampled children to reach a completed sample size of 4,565. The rationale is presented in Exhibit 1.4.
Exhibit 1.4. Calculations for Child-Level Sample for Phase II of NSCAW III
Steps | Number
1. initial sample | 8,695
2. assume 25% of the initial cases will be ineligible due to factors including the following: | .75 x 8,695 = 6,521
3. assume 60% of the cases will cooperate with the initial data collection efforts | .60 x 6,521 = 3,913
4. subject 50% of the remaining nonresponders (n = 2,609) to intensive data collection efforts | .50 x 2,609 = 1,304
5. assume 50% of the nonresponders ultimately participate | .50 x 1,304 = 652
Total number of completed cases | 3,913 + 652 = 4,565
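The chain of calculations in Exhibit 1.4 can be checked with a short script. The sketch below assumes each step is rounded to a whole number of cases before the next step is applied; the function and key names are hypothetical. Under that assumption the number of nonresponders comes out as 2,608 rather than the 2,609 shown in the exhibit, a rounding difference that does not affect the later steps or the total.

```python
def completed_cases(initial_sample, eligible_rate=0.75, initial_coop_rate=0.60,
                    followup_fraction=0.50, followup_coop_rate=0.50):
    """Walk through the Exhibit 1.4 steps, rounding at each step."""
    eligible = round(initial_sample * eligible_rate)
    phase1_completes = round(eligible * initial_coop_rate)
    nonresponders = eligible - phase1_completes
    followed_up = round(nonresponders * followup_fraction)
    phase2_completes = round(followed_up * followup_coop_rate)
    return {
        "eligible": eligible,
        "phase1_completes": phase1_completes,
        "nonresponders": nonresponders,
        "followed_up": followed_up,
        "phase2_completes": phase2_completes,
        "total_completes": phase1_completes + phase2_completes,
    }


result = completed_cases(8695)
for step, value in result.items():
    print(step, value)
```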
Given the target number of completed cases, the average number of completes per PSU, and the oversampled domains, the extensive data available from NSCAW I and II will be used to update response rates by domain and by PSU in order to establish initial sample allocations for each domain.
An important requirement of the NSCAW III design is to maximize both agency and child-level response rates. Obtaining a response rate of 80 percent for the key respondent at baseline is a high priority. The contractor will use a number of response rate enhancement features that will maximize response rates without appreciably increasing data collection costs. Central among these features are the following:
Incorporating both contact and response propensity models in the field work to identify sample members who are either difficult to contact, have a low probability of cooperating when contacted or both,
Using two-phase sampling to select a 50 percent sample of nonrespondents to pursue during the nonresponse followup phase of data collection, and
Using matched substitution for select groups of nonrespondents in order to further reduce nonresponse bias and boost response rates.
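The source does not give a formula for the 80 percent response rate target. One common way to summarize response under a two-phase design is a weighted rate, in which respondents from the followup subsample are weighted up by the inverse of the subsampling fraction. The sketch below applies that assumed definition to the rates and counts in Exhibit 1.4; the function names are hypothetical.

```python
import math


def weighted_response_rate(initial_rate, followup_rate):
    """Assumed two-phase weighted rate: r1 + (1 - r1) * r2."""
    return initial_rate + (1 - initial_rate) * followup_rate


def weighted_rate_from_counts(eligible, phase1_completes, phase2_completes,
                              followup_fraction):
    """Same quantity from counts, weighting phase 2 completes by 1 / fraction."""
    weighted = phase1_completes + phase2_completes / followup_fraction
    return weighted / eligible


rate = weighted_response_rate(0.60, 0.50)
rate_from_counts = weighted_rate_from_counts(6521, 3913, 652, 0.50)
print(round(rate, 3), round(rate_from_counts, 3))
```

Under these assumptions, the 60 percent initial cooperation rate and the 50 percent cooperation rate among followed-up nonresponders correspond to a weighted rate of about 80 percent.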
Biemer, P. P. (2007). “NSCAW II Design Methodology and Recommendations,” internal RTI International design report, May 4, 2007.
Ernst, L.R. (1995). Maximizing and minimizing overlap of ultimate sampling units. In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association. 706-711.
Folsom, R.E., Potter, F.J., and Williams, S.R. (1987). Notes on a composite size measure for self-weighting samples in multiple domains. In Proceedings of the American Statistical Association, Section on Survey Research Methods, 792-796.
U.S. Office of Management and Budget (1999). National Survey of Child and Adolescent Well-Being (NSCAW). OMB Information Collection Request. OMB Control No: 0970-0202, ICR Reference No: 199906-0970-002, Conclusion Date: 8/18/1999.
U.S. Office of Management and Budget (2008). National Survey of Child and Adolescent Well-Being. OMB Information Collection Request. OMB Control No: 0970-0202, ICR Reference No: 200803-0970-002, Conclusion Date: 3/4/2008.
1 The NSCAW II OMB package contains additional detail about the previous sample designs and can be found here: http://www.reginfo.gov/public/do/PRAViewDocument?ref_nbr=200803-0970-002
Author | Keith Smith |
File Created | 2021-09-08 |