Part B: Collection of Information Involving Statistical Methods
Migrant and Seasonal
Head Start Study
New Collection
August 2016
Office of Planning, Research and Evaluation
Administration for Children and Families
U.S. Department of Health and Human Services
330 C Street, SW
Washington, DC 20201
Project Officer:
Wendy DeCourcey
B. STATISTICAL METHODS (USED FOR COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS)
B.1. Respondent Universe and Sampling Methods
B.1.1 Program and Center Component
B.1.2 Classroom, Family, and Child Component
B.2. Procedures for Collecting Information
B.2.1 Sampling and Estimation Procedures
B.2.2 Data Collection Procedures
B.3. Methods to Maximize Response Rates and Data Reliability
Exhibit B.1. Flow chart of sample selection procedures
Exhibit B.2. Sample Frame for Program and Center Component
Exhibit B.3. Sample Frame for Classroom, Family and Child Component
Exhibit B.4. Expected Sample Sizes and Precision for MSHS Program and Center Estimates
Exhibit B.5. Expected Sample Sizes and Precision for MSHS Classroom Estimates, for Design Effects of 1.0, 1.5, and 2.0
Exhibit B.6. Expected Child/Family Sample Sizes and Precision for a Design Effect of 2.0
B. STATISTICAL METHODS (USED FOR COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS)

The Office of Planning, Research and Evaluation (OPRE), Administration for Children and Families (ACF), U.S. Department of Health and Human Services, is proposing data collection for the Migrant and Seasonal Head Start (MSHS) Study. The MSHS Study is a nationally representative study that focuses on describing the characteristics and experiences of the children and families who enroll in MSHS and the practices and services of the MSHS programs that serve them.
The MSHS Study comprises two study components that are based on distinct samples. First, the Program[1] and Center Component will: (1) accurately and completely describe program operations and services across the universe of all MSHS programs and selected centers, documenting up-to-date information and compiling an overall picture that includes a level of detail not previously provided to MSHS, and (2) systematically identify variations in program operations, both within and across programs and centers. This component will include the universe of all MSHS programs as well as a nationally representative sample of centers. For this component, the study team will mail separate surveys to program directors and center directors.
Second, the Classroom, Family and Child Component will yield national estimates of the circumstances and experiences of MSHS families and children (demographics, agricultural livelihoods, food and housing security, social and economic challenges and resources, mental and physical well-being, cultural and linguistic processes); the characteristics of and practices in MSHS classrooms; and families’ service needs and preferences. In addition, relational analyses will estimate the associations among MSHS program and family contexts and children’s language skills and other abilities.
This component will include on-site data collection with a nationally representative sample of MSHS programs, centers, families and children. We will recruit centers, teachers and assistant teachers, and families/children into the sample. Data collection will include parent interviews (with Parent Child Reports), direct child assessments, teacher surveys (with Teacher Child Reports), assistant teacher surveys, and classroom observations.
B.1. Respondent Universe and Sampling Methods

The target population for the MSHS Study is all MSHS programs in the 48 contiguous states in the U.S., their centers and classrooms, and the children and families they serve. The study design involves the selection of two distinct samples, described below and summarized in Exhibit B.1. The sample for the Program and Center Component will include the universe of all MSHS programs as well as a nationally representative, stratified random sample of centers from those programs. The Classroom, Family and Child Component will use a multi-stage cluster design to select a nationally representative sample in four stages: 1) MSHS programs, with programs defined as grantees or delegate agencies providing direct services to children; 2) centers within programs; 3) classrooms within centers; and 4) children (and their families) within classrooms.
Exhibit B.1. Flow chart of sample selection procedures
Information about the universe of programs for creating the sampling frame for each component will be gathered from the most current data available from the Head Start Enterprise System (HSES), the Head Start Program Information Report (PIR), the Head Start program directory website, and administrators from the MSHS Branch of the ACF Office of Head Start (OHS). Based on data from HSES obtained in January 2016, there are 29 grantees and 24 delegate agencies in operation in the 48 contiguous states, for a total of 53 MSHS programs. The programs operate a total of 420 centers. Programs that are under transitional management and those that are (or will soon be) defunded[2] will not be included in either sampling frame because they likely will not be fully operational or stable, and there may not be a program director who can respond to the survey or support the recruitment of centers. As noted above, all programs in the sampling frame will be selected with certainty for the Program and Center Component. For the Classroom, Family and Child Component, a nationally representative sample of programs that provide direct services to children will be randomly selected from the sampling frame of all programs that provide direct services.
For each component, the sampling frames used to sample centers will also be constructed from the most current data available from HSES, PIR, the Head Start program directory website, and administrators from the MSHS Branch of OHS. The center-level sampling frames for each component will include all centers within the programs that are included in each program-level sampling frame, which will differ slightly since the two programs that do not provide direct services will be included in the Program and Center component, but not in the program-level sampling frame for the Classroom, Family and Child component.
For the Classroom, Family and Child component only, the classroom sampling frame will be constructed after the center sample is drawn and a list of classrooms is obtained, including enrollment and predominant age served, from center directors and on-site coordinators (OSCs). The OSC is a designated center staff member who will work with the study team to recruit teachers, assistant teachers, and families; help schedule site visits; and help with obtaining informed consent. Similarly, the child sampling frame will be constructed after the classroom sample is drawn, at which point a roster of children and their ages for each sampled classroom will be obtained. Slots for children in family child care homes (rather than centers) and for Early Head Start-child care partnerships will be excluded from the child sampling frame. All centers, classrooms, and children in study-eligible programs will be included in the respective sampling frames, with one exception. Because sampling within centers will occur at a season of peak enrollment, children may appear on the roster for more than one center, as families migrate throughout the year. To prevent children from being sampled more than once, children who are on a roster and have appeared on a roster from another center that has already undergone child sampling will be withdrawn from the selection pool.
B.1.1 Program and Center Component

The universe of 53 programs will be included in the Program and Center Component (see Exhibit B.2). The advantage of including all MSHS programs is that estimates of program characteristics will have no sampling error; sampling programs would yield a small number of programs and estimates with poor precision. We expect a 100 percent response rate from program directors, based on the 100 percent response rate in FACES 2009, a nationally representative study of Head Start.
Exhibit B.2. Sample Frame for Program and Center Component
| Sample | Universe | Initial Sample: Main | Initial Sample: Reserve | Anticipated Response Rate | Anticipated Achieved Sample | Instrument |
|---|---|---|---|---|---|---|
| Programs | 53 | 53 | n/a | 100% | 53 | Program Director Survey |
| Centers | 420 | 250 | 30 | 70-80% | 200 | Center Director Survey |
Using stratified, systematic random sampling, a nationally representative sample of 280 centers will be selected, with 30 held in reserve, to be released if the response rate drops below 80 percent. Together, the main sample of 250 centers and the reserve sample of 30 centers allow for a 70-80 percent response rate while still obtaining an achieved sample of 200 centers (see Exhibit B.2). A list of centers and their locations will be obtained from the programs. All centers operated by the 53 programs comprise the universe of centers.
Geography. Due to the seasonal nature of farm work, the geographic location of MSHS centers reflects variations in peak operational periods for centers and major migratory streams for farmworkers. When there is variation within an overall population, it can be advantageous to stratify before randomly sampling, to reduce the variance of estimates and ensure a more representative sample. Center addresses will be used to stratify centers by the six geographic regions identified in the National Agricultural Workers Survey (NAWS[3]) to ensure that the sample is representative of the geographic variation in center locations. The NAWS regions were collapsed from the 12 agricultural regions defined by the U.S. Department of Agriculture based on crop patterns and are defined as:
East: North Carolina, Virginia, Kentucky, Tennessee, West Virginia, Connecticut, Maine, Massachusetts, New Hampshire, New York, Rhode Island, Vermont, Delaware, Maryland, New Jersey, Pennsylvania
Southeast: Arkansas, Louisiana, Mississippi, Alabama, Georgia, South Carolina, Florida
Midwest: Illinois, Indiana, Ohio, Iowa, Missouri, Kansas, Nebraska, North Dakota, South Dakota, Michigan, Minnesota, Wisconsin
Southwest: Arizona, New Mexico, Oklahoma, Texas
Northwest: Idaho, Montana, Wyoming, Colorado, Nevada, Utah, Oregon, Washington
California
Centers will be allocated to each region with probability proportional to MSHS enrollment.
Program Characteristics. Within region strata, centers will be sorted by program and center enrollment, and systematic random sampling will be conducted with equal probabilities within each stratum.[4] Systematic random sampling involves selecting units at a fixed interval (e.g., selecting every third center) throughout the stratum after a random start. This controls the distribution of the sample by spreading the selections throughout the stratum at equal intervals, providing implicit stratification based on the variables used to sort the data (here, program and center enrollment).
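For illustration, the sketch below shows equal-probability systematic selection from a sorted stratum in Python; the field names and toy frame are invented, not taken from the study. Sorting by program and enrollment before selecting is what yields the implicit stratification described above.

```python
import random

def systematic_sample(frame, n):
    """Select n units at a fixed interval after a random start.

    Sorting the frame first (e.g., by program and center enrollment)
    spreads selections across the sorted order, giving implicit
    stratification on the sort variables.
    """
    interval = len(frame) / n            # fixed skip interval (may be fractional)
    start = random.uniform(0, interval)  # random start within the first interval
    return [frame[int(start + i * interval)] for i in range(n)]

# Toy frame: centers in one region stratum, sorted by program then enrollment.
centers = sorted(
    [{"center": c, "program": p, "enrollment": e}
     for c, p, e in [(1, "A", 40), (2, "A", 95), (3, "B", 60),
                     (4, "B", 120), (5, "C", 30), (6, "C", 80)]],
    key=lambda r: (r["program"], r["enrollment"]),
)
print(systematic_sample(centers, 3))  # each unit has selection probability 3/6
```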
Finally, we anticipate a response rate of 70-80 percent for the center director survey. Although FACES 2009 achieved a 100 percent response rate, this study may see lower response rates because of the centers' unique operational periods.
B.1.2 Classroom, Family, and Child Component

As seen in Exhibit B.1 and Exhibit B.3, the Classroom, Family and Child Component will employ a multi-stage cluster design to collect information on a national probability sample of MSHS children, their families, and classrooms. Specifically, this design will involve four stages, selecting a nationally representative sample of programs (stage 1), centers from sampled programs (stage 2), classrooms from sampled centers (stage 3), and children and their families from sampled classrooms (stage 4). Recall that the universe of MSHS programs for the Classroom, Family and Child Component will exclude the two programs that do not directly provide services to children, resulting in a sampling frame of 51 programs (i.e., 27 grantees and 24 delegate agencies).
Exhibit B.3. Sample Frame for Classroom, Family and Child Component
| Stage | Universe | Initial Sample: Main | Initial Sample: Reserve | Anticipated Response Rate | Anticipated Achieved Sample | Instrument |
|---|---|---|---|---|---|---|
| 1. Programs | 51 | 24 | 2 | 92% | 24 | n/a |
| 2. Centers | 420 | 66 | 10 | 70-80% | 53 | Classroom Sampling Form |
| 3. Classrooms | All classrooms in sampled centers (TBD after center sampling) | 159 classrooms (3 per center) | n/a | 100% for roster forms, surveys, and observations | 159 | Child Roster Form; Teacher Survey; Teacher Child Report; Assistant Teacher Survey; Classroom Observations |
| 4. Child and Family | All unduplicated children on class rosters from sampled classrooms (TBD after classroom sampling) | 1,272 infants, toddlers, and preschoolers [a] | n/a | 80% | 1,018 | Parent Interview; Parent Child Report |
| 4a. Infants/young toddlers | | 212 | n/a | 80% | 170 | n/a [b] |
| 4b. Older toddlers | | 424 | n/a | 80% | 339 | Older toddler child assessments |
| 4c. Preschoolers | | 636 | n/a | 80% | 509 | Preschool child assessments |

[a] The breakdown of the total child sample is shown in rows 4a-4c.
[b] No direct assessments will be conducted with infants/young toddlers; only Teacher and Parent Child Reports will be collected.
In an effort to obtain a sample of 24 programs, at Stage 1 a stratified, systematic random sample of 26 programs will be selected from the universe of 51 MSHS programs. Two programs will be held in reserve to replace programs from the main sample that are closed or otherwise unable to participate in the study (e.g., due to termination, re-competition, or being newly funded). Prior to sampling, very small programs may be collapsed to create program groups, to prevent a shortfall of sampled children and excessive variability in the final child weights.[5] Programs (or program groups) will be stratified by the six NAWS geographic regions. The 26 programs will be allocated proportionally to region based on the MSHS enrollment in each region. Prior to sampling, programs within each region will be sorted by enrollment to create implicit strata, to help ensure representation of programs by size. Within strata, a systematic random sample will be selected with probability proportional to enrollment. Large programs whose measure of size exceeds the stratum sampling interval will be selected with certainty. Two of the non-certainty programs will be randomly selected with equal probabilities to serve as the reserve programs.
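A minimal sketch of the probability-proportional-to-size (PPS) systematic selection step described above, in Python; the program IDs and enrollments are invented, and the separate handling of certainty selections and program grouping is omitted.

```python
import random

def pps_systematic_sample(frame, n, size_key="enrollment"):
    """Systematic sampling with probability proportional to size (PPS).

    Assumes no unit's size exceeds the sampling interval; units that do
    would be taken as certainty selections in a separate step, as the
    study describes.
    """
    total = sum(u[size_key] for u in frame)
    interval = total / n                 # sampling interval on the cumulative-size scale
    start = random.uniform(0, interval)  # random start
    points = [start + i * interval for i in range(n)]

    sample, cum = [], 0.0
    point_iter = iter(points)
    point = next(point_iter)
    for unit in frame:                   # frame pre-sorted by enrollment (implicit strata)
        cum += unit[size_key]
        while point is not None and point < cum:
            sample.append(unit)          # a selection point falls in this unit's size range
            point = next(point_iter, None)
    return sample

programs = [{"program": i, "enrollment": e}
            for i, e in enumerate([60, 80, 120, 150, 200, 260, 300, 340])]
print(pps_systematic_sample(programs, 3))  # selection probability = 3 * enrollment / 1,510
```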
With the goal of obtaining 53 centers, at Stage 2 a stratified, systematic random sample of 76 centers will be selected: a main sample of 66 centers, which assumes an 80 percent response rate, and a reserve sample of 10 centers, to be released if the participation rate looks as if it will fall short and additional centers are needed to reach the desired final sample size of 53. Within each of the randomly sampled programs, a center frame will be constructed based on data from the administrative databases mentioned above. Very small centers within a program will be grouped by geographic proximity to form center groups with a minimum enrollment size, which will help ensure that there are enough children to sample within each center or center group and prevent a shortfall in the overall sample size of children.
Within each program, centers (and center groups) will be stratified by the six NAWS regions and sorted within program by length of operation (i.e., number of months open and which months) and zip code to create implicit strata. At least one center will be sampled from each of the 24 sampled programs, and the remaining 52 of the 76 centers will be allocated proportionally to strata based upon the number of centers within each stratum; thus, larger programs will be allocated more centers. From each program-by-region stratum, centers will be randomly selected with probability proportional to total center enrollment using systematic sampling. The probabilities of selection will be derived using a Keyfitz procedure to maximize the overlap with the 200 centers sampled from the center universe for the mailed survey within the Program and Center Component.[6] The initial sample of 76 centers will be split into a main sample of 66 centers and a reserve sample of 10 centers by randomly selecting, with equal probabilities, the number of centers allocated for the main sample in each program. The selected centers constitute the main sample; the remaining centers are the reserve sample.
At Stage 3, classrooms will be randomly sampled from each selected center at the time of peak enrollment in the center. In each selected center, all the classrooms serving infants through preschoolers will be listed. Classrooms with fewer than eight children will be combined into a classroom group with at least eight children of the same age group if possible. Based on recent data from the HSES, the median class size is 8 children (range 1 to 21) for classes serving infants/toddlers (aged 0-35 months) and 18 children (range 1 to 25) for classes serving preschool children (aged 36 months and older). Across the universe of MSHS centers, 65 percent of classes serve infants/toddlers and 35 percent serve preschool-age children.
In centers with three or fewer classrooms, all classrooms will be selected with certainty. In centers with fewer than three classrooms, the balance of the three classrooms allocated to the center will be re-allocated to a larger center (i.e., one with more classrooms) within the same program.
In centers with more than three classrooms, classrooms within each center will be stratified by the predominant age group they serve. The age groups are defined as infants or young toddlers (0-23 months), older toddlers (24-35 months), and preschool (age 36 months and older). One classroom (or classroom group) will be sampled with probability proportional to size in each stratum. For example, in centers with classrooms serving two age groups, one classroom will be randomly sampled from the age group with fewer classrooms and two classrooms from the age group with more classrooms. For a sample of 53 centers, this approach will yield a total sample of 159 classrooms, with approximately 53 classrooms serving each age group.
We anticipate achieving a response rate very close to 100 percent for the teacher and assistant teacher surveys, as well as the Teacher Child Reports, using the following procedures: 1) enlisting the support of center directors to encourage selected teachers to participate in the data collection; 2) sending the instruments in advance to allow adequate time for completion prior to or during the data collection week; 3) having both the OSC and the site supervisor (the field person who manages the team of field data collectors) encourage and remind teachers and assistant teachers to complete the survey; 4) asking respondents to return their surveys to the site supervisor by the end of the on-site data collection visit; 5) as necessary, leaving behind pre-paid mailers for the return of any surveys not completed by that deadline; and 6) offering incentives to participants to acknowledge their efforts (please see Part A.9 for more information about incentives). Further, as our data collection partner (Westat) did for the Head Start Impact Study (0970-0229), we expect to be able to observe 100 percent of the sampled classrooms.
At Stage 4, a nationally representative sample of MSHS children will be selected from the randomly sampled centers and classrooms. Our goal is an achieved sample of approximately 1,000 children: roughly 500 preschoolers and 500 infants/toddlers (i.e., 170 infants/young toddlers and 339 older toddlers). With an assumed response rate of 80 percent for parent surveys, Parent Child Reports, and child assessments, 1,272 children (approximately 636 preschoolers, 424 older toddlers, and 212 infants/young toddlers) will be initially sampled to achieve the targeted number of completed child assessments and parent surveys. We expect a response rate of 80 percent for the parent survey based on experience from FACES 2014.
A child sampling frame will be constructed based on rosters of children currently enrolled in each of the 159 sampled classrooms. Rosters will be obtained from each sampled center during the peak operational period. Sampling within centers according to their season of operation can result in children appearing on the roster for more than one center, because families migrate throughout the year. To prevent children from being sampled more than once, the rosters of children will be de-duplicated across centers using an in-house SAS program that performs probabilistic matching. Each child's name, gender, date of birth, and parent name(s) will be used to match children across classroom lists. The roster from each center about to undergo child sampling will be matched to the rosters from centers that have already undergone child sampling, and children who appear on a previous roster will be withdrawn from the selection pool. Should the de-duplication process miss a duplicate, the child-level base weight will be adjusted for the additional chance(s) of selection from the other center list(s). This step will reduce the likelihood of sampling children more than once in different centers over the course of the data collection period, which would result in a loss of sample size.
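A simplified sketch of the roster de-duplication flow: the study's in-house SAS program uses probabilistic matching, whereas this Python illustration substitutes exact matching on normalized fields, and all names and records are invented.

```python
def child_key(child):
    """Build a normalized matching key from roster fields.

    A production matcher scores near-matches (misspellings, swapped or
    partial last names); exact matching on normalized fields is used
    here only to illustrate the flow.
    """
    norm = lambda s: " ".join(s.lower().split())
    return (norm(child["name"]), child["gender"], child["dob"], norm(child["parent"]))

def remove_previously_sampled(roster, prior_rosters):
    """Drop children who already appeared on a roster from a center
    that has already undergone child sampling."""
    seen = {child_key(c) for r in prior_rosters for c in r}
    return [c for c in roster if child_key(c) not in seen]

prior = [[{"name": "Ana Maria Lopez", "gender": "F",
           "dob": "2013-05-02", "parent": "Rosa Lopez"}]]
current = [
    {"name": "Ana Maria  Lopez", "gender": "F",
     "dob": "2013-05-02", "parent": "Rosa Lopez"},   # duplicate: withdrawn
    {"name": "Luis Perez", "gender": "M",
     "dob": "2014-01-15", "parent": "Marta Perez"},  # new child: retained
]
print(remove_previously_sampled(current, prior))
```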
Within each sampled classroom, children will be sorted by age (or birthdate) and randomly sampled with equal probability within each classroom using systematic sampling. From the age-sorted list of children in the classroom, children will be selected at a fixed interval throughout the class list, after a random start. This approach provides implicit stratification based on child age. From classrooms predominantly serving preschoolers, 12 children will be randomly selected. From classrooms predominantly serving older toddlers, 8 children will be randomly selected, and from classrooms predominantly serving infants or young toddlers, 4 children will be randomly selected.
We anticipate that, across all sampled centers, we will achieve the total targeted sample size for each age group; however, if necessary, the number allocated to be sampled from each age group may be modified to help achieve targets. Because not all children in a given classroom may be within the predominant age group, the number sampled in each age group may vary within a given center. For example, a classroom that predominantly serves preschoolers may also serve toddlers: if there are 12 preschoolers and 6 toddlers in the classroom, the 12 randomly sampled children will likely include only about 8 preschoolers, along with about 4 toddlers. For this reason, we will continuously monitor the ages of sampled children and the progress toward sample targets for each age group. If there appears to be a shortfall within one age group, we will revise the number of children allocated to that age group in later-sampled centers, helping to achieve the total sample targets within each age group.
The strategy for sampling children helps ensure that the child sample is representative of the children attending the center at that point in time. By sampling programs, centers, and classrooms with probability proportional to size and allocating programs and centers proportionally to strata, then selecting a fixed sample size of children within classrooms, the variability in the child weights is minimized. This assumes the measure of size used to sample programs and centers is up-to-date and accurately reflects the number of children in each program and center.
B.2. Procedures for Collecting Information

B.2.1 Sampling and Estimation Procedures

The sampling methodology is described under item B.1 above. For the Program and Center Component, all programs in the universe will be selected with certainty. When sampling centers, explicit strata will be formed using NAWS regions. Sample allocation will be proportional to MSHS enrollment in each stratum. The sampling frame will be implicitly stratified (sorted) by program and center enrollment. Within strata, centers will be randomly selected with equal probability.
For the Classroom, Family and Child Component, when sampling programs, explicit strata will be formed using NAWS region. Sample allocation will be proportional to MSHS enrollment. The sample frame will be implicitly stratified (sorted) by enrollment. Within strata, programs will be randomly selected with probability proportional to program enrollment.
When sampling centers, centers will be stratified by program and by NAWS region within program. Sample allocation will be proportional to the number of centers in the program. The sample frame will be implicitly stratified by length of operation (i.e., number of months open and which months open) and zip code. Within strata, centers will be randomly selected with probability proportional to center enrollment.
When sampling classrooms, explicit strata will be formed based on the predominant age group served (infant/young toddler, older toddler, or preschooler). In centers with three or fewer classrooms, all classrooms will be selected with certainty; for centers with fewer than three classrooms, the remaining allocated classrooms (up to three) will be re-allocated to another, larger center in the same program. In centers with more than three classrooms, three classrooms will be sampled from the age-group strata with probability proportional to size.
When sampling children (and their families), explicit strata will be formed using classrooms. After removing children who appear on rosters in previously-sampled centers, the sample frame will be implicitly stratified within classroom strata by children’s age in months. Within classroom, children will be randomly sampled with equal probability – 12 from preschool classrooms, 8 from older toddler classrooms, and 4 from infant/young toddler classrooms. As described above, the allocated number to be sampled from each age group may be modified in later-sampled centers, if necessary, to help achieve sample targets within each age group.
Analysis weights will be constructed to account for variations in the probabilities of selection and variations in the eligibility and response rates among those selected. For each stage of sampling (program, center, class, and child/family) and within each explicit sampling stratum, the probability of selection will be calculated. The inverse probability of selection within stratum at each stage is the sampling or base weight. The sampling weight takes into account the probability proportional to size or equal probability sampling approach, the presence of any certainty selections, and the actual number of cases released. The eligibility status of each sampled unit is treated as known at each stage. Then, at each stage, the sampling weight will be multiplied by the inverse of the weighted response rate within the sampling stratum to obtain the analysis weight. Respondents’ analysis weights account for both the respondents and non-respondents, whereby weighted estimates represent the national MSHS population in the 48 contiguous states.
Analysis weights will be constructed at each stage of sampling, with program-level weights adjusting for the probability of program selection and response; with center-level weights adjusting for the probability of center selection and response; with the class-level weight adjusting for the probability of class selection and response; and with child-level weights adjusting for the probability of child selection and child response (including the probability that various child-level instruments were obtained, such as child assessments, parent reports, teacher reports, and parent surveys). The formulas below represent the various weighting steps for cumulative weights through the stages of selection.
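In standard design-based notation (assumed here, not reproduced from the study's Design Report), with pi the within-stratum selection probability at each stage and RR the weighted response rate within the sampling stratum, the cumulative weights take the form:

```latex
% Cumulative analysis weights through the stages of selection (a sketch
% consistent with the description above; the notation is assumed).
\begin{align*}
W^{\mathrm{prog}}_{i}     &= \frac{1}{\pi^{\mathrm{prog}}_{i}}
                             \times \frac{1}{RR^{\mathrm{prog}}}\\
W^{\mathrm{ctr}}_{ij}     &= W^{\mathrm{prog}}_{i}
                             \times \frac{1}{\pi^{\mathrm{ctr}}_{j \mid i}}
                             \times \frac{1}{RR^{\mathrm{ctr}}}\\
W^{\mathrm{class}}_{ijk}  &= W^{\mathrm{ctr}}_{ij}
                             \times \frac{1}{\pi^{\mathrm{class}}_{k \mid ij}}
                             \times \frac{1}{RR^{\mathrm{class}}}\\
W^{\mathrm{child}}_{ijkl} &= W^{\mathrm{class}}_{ijk}
                             \times \frac{1}{\pi^{\mathrm{child}}_{l \mid ijk}}
                             \times \frac{1}{RR^{\mathrm{child}}}
\end{align*}
```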
Given the complex sampling design, analyses of weighted data require statistical analysis software, such as SAS or Stata survey procedures, that can estimate standard errors accurately. Both software packages are designed to produce appropriate design-based parameter estimates, standard errors, confidence intervals, and design effects. The use of weights and an appropriate variance estimation method such as linearization or a replication method will account for the multi-level data structure (children in classrooms in centers in programs) and stratification in the design, and will produce accurate standard errors. Standard errors that reflect the design are necessary to indicate the precision of the national estimates and for statistical tests.
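As one concrete example of a replication method, the sketch below implements a delete-one-PSU jackknife for a weighted mean in Python; it is a simplified illustration with invented inputs, not the study's specified estimator, and production analyses would use the survey procedures in SAS or Stata.

```python
import numpy as np

def jackknife_se_mean(y, w, stratum, psu):
    """Delete-one-PSU jackknife standard error for a weighted mean.

    y, w: outcome and analysis weight per respondent;
    stratum, psu: design identifiers per respondent.
    """
    y, w = np.asarray(y, float), np.asarray(w, float)
    stratum, psu = np.asarray(stratum), np.asarray(psu)
    theta = np.sum(w * y) / np.sum(w)  # full-sample weighted mean
    var = 0.0
    for h in np.unique(stratum):
        in_h = stratum == h
        psus = np.unique(psu[in_h])
        n_h = len(psus)
        if n_h < 2:
            continue  # certainty or single-PSU stratum: no within-stratum contrast
        for j in psus:
            keep = ~(in_h & (psu == j))            # drop one PSU entirely
            w_rep = w.copy()
            w_rep[in_h & keep] *= n_h / (n_h - 1)  # reweight the stratum's remaining PSUs
            theta_rep = np.sum(w_rep[keep] * y[keep]) / np.sum(w_rep[keep])
            var += (n_h - 1) / n_h * (theta_rep - theta) ** 2
    return float(np.sqrt(var))

# Illustrative call: child scores, analysis weights, region stratum, program PSU.
se = jackknife_se_mean(
    y=[1.2, 0.8, 1.5, 0.9, 1.1, 1.4],
    w=[10, 12, 9, 11, 10, 8],
    stratum=["E", "E", "E", "W", "W", "W"],
    psu=[1, 1, 2, 3, 3, 4],
)
print(se)
```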
The analysis will focus on estimating descriptive statistics with confidence intervals for the full sample at all levels (program, center, classroom, and child/family) as well as for selected subgroups (e.g., region, migrant/seasonal farmworker status, and child age groups). The analysis will not involve comparisons of subgroups using statistical tests for significant differences, which would require larger sample sizes. Descriptive statistics will be estimates of either population means for continuous measures or population percentages for binary or categorical measures. Precision is therefore addressed mainly in terms of 95 percent confidence interval half-widths, measured in standard deviation units for continuous variables or in percentage points for categorical variables, with the latter based on an estimated percentage of P=50 percent.[7]
The expected sample sizes are provided in Exhibit B.4 for programs and centers, in Exhibit B.5 for classrooms, and in Exhibit B.6 for children/families. The expected sample sizes are based on the sampling of programs, centers, classrooms, and children described above. The exhibits also show 95 percent confidence interval half-widths for estimates based on each sample.
In Exhibit B.4, the confidence interval half-width for programs is zero: there will be no sampling error for estimates of MSHS programs because all programs in the universe will be surveyed. For centers, assuming a design effect of 1.0, the half-width is ±0.14 standard deviations for estimates of population means and ±6.9 percentage points for estimates of percentages, assuming a prevalence of 50 percent.
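These half-widths follow from the usual formulas: for a percentage, 1.96 x sqrt(DEFF x p(1-p)/n) in percentage points, and for a mean, 1.96 x sqrt(DEFF/n) in standard deviation units. A quick check against the exhibit values:

```python
from math import sqrt

def half_width_pct(n, deff=1.0, p=0.5, z=1.96):
    """95% CI half-width, in percentage points, for an estimated percentage."""
    return 100 * z * sqrt(deff * p * (1 - p) / n)

def half_width_sd(n, deff=1.0, z=1.96):
    """95% CI half-width for a mean, in standard deviation units."""
    return z * sqrt(deff / n)

# Centers (n = 200 respondents, design effect 1.0), as in Exhibit B.4:
print(round(half_width_pct(200), 1))   # 6.9 percentage points
print(round(half_width_sd(200), 2))    # 0.14 standard deviations
# Full child/family sample (n = 1,018, design effect 2.0), as in Exhibit B.6:
print(round(half_width_pct(1018, deff=2.0), 1))  # 4.3 percentage points
```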
Exhibit B.4. Expected Sample Sizes and Precision for MSHS Program and Center Estimates
| Sample | Expected Number Sampled | Expected Number Responding | 95% CI Half-Width: Continuous Measures (standard deviation units) | 95% CI Half-Width: Binary Measures (percentage points [a]) |
|---|---|---|---|---|
| Programs | 53 | 53 | 0 | 0 |
| Centers [b] | 250 | 200 | ±0.14 | ±6.9 |

[a] For an estimated percentage of 50%.
[b] For an assumed design effect of 1.0.
For classrooms, the design effect is not known; therefore, Exhibit B.5 shows the 95 percent confidence interval half-widths for a range of design effects to illustrate the potential range in the precision of estimates about classrooms. The design effect incorporates the effects of clustering within programs and centers and the variation in the final classroom weights. A design effect of 1.0 means the effective sample size equals the nominal sample size; a design effect of 1.5 means the effective sample size is reduced to two-thirds the nominal sample size; and a design effect of 2.0 means the effective sample size is reduced to half the nominal sample size. Depending on the design effect, the half-width for estimates based on the full sample of classrooms ranges from ±0.15 to ±0.22 standard deviation units for continuous measures and from ±7.8 to ±11.0 percentage points for estimates of percentages, assuming a prevalence of 50 percent. For subgroups of classrooms by predominant age group served, estimates of population means and of proportions near 0.50 will not be very precise, but estimates of proportions near 0.10 or 0.90 will be more precise.
Exhibit B.5. Expected Sample Sizes and Precision for MSHS Classroom Estimates, for Design Effects of 1.0, 1.5, and 2.0
Binary measures (95% confidence interval half-widths, in percentage points), by design effect (DEFF):

| Classroom Sample | Expected Sample Size | Proportion of 0.5: DEFF 1.0 | DEFF 1.5 | DEFF 2.0 | Proportion of 0.1/0.9: DEFF 1.0 | DEFF 1.5 | DEFF 2.0 |
|---|---|---|---|---|---|---|---|
| Full sample: All Classrooms | 159 | ±7.8 | ±9.8 | ±11.0 | ±4.7 | ±5.7 | ±6.6 |
| Subgroup: Infant | 53 | ±13.5 | ±16.5 | ±19.0 | ±8.3 | ±10.1 | ±11.7 |
| Subgroup: Toddler | 53 | ±13.5 | ±16.5 | ±19.0 | ±8.3 | ±10.1 | ±11.7 |
| Subgroup: Preschool | 53 | ±13.5 | ±16.5 | ±19.0 | ±8.3 | ±10.1 | ±11.7 |

Continuous measures (95% confidence interval half-widths, in standard deviation units), by design effect (DEFF):

| Classroom Sample | Expected Sample Size | DEFF 1.0 | DEFF 1.5 | DEFF 2.0 |
|---|---|---|---|---|
| Full sample: All Classrooms | 159 | ±0.15 | ±0.19 | ±0.22 |
| Subgroup: Infant | 53 | ±0.28 | ±0.33 | ±0.39 |
| Subgroup: Toddler | 53 | ±0.28 | ±0.33 | ±0.39 |
| Subgroup: Preschool | 53 | ±0.28 | ±0.33 | ±0.39 |

Note: Subgroups are defined by the predominant age group served.
For child/family measures (Exhibit B.6), we assume a design effect of 2.0.[8] The design effect incorporates the effects of clustering within programs, centers, and classrooms and the variation in the final child weights; a design effect of 2.0 means the effective sample size is reduced to half the nominal sample size. The confidence interval half-width for estimates based on the full sample of children/families is ±0.09 standard deviations for continuous measures (e.g., scores on child assessments) and ±4.3 percentage points for estimates of percentages (e.g., percent below the federal poverty level), assuming a prevalence of 50 percent. For estimates of subgroups of children by age group, half-widths range from ±0.12 to ±0.15 standard deviations and from ±5.9 to ±7.5 percentage points, depending on the age group. Estimates for migrant farmworker families and for subgroups of children/families by region will be less precise, with half-widths ranging from ±9.2 percentage points for children/families in the Northwest to ±13.3 for those in the Southwest, and ±10.6 for migrant farmworker families and their children.
Exhibit B.6. Expected Child/Family Sample Sizes and Precision for a Design Effect of 2.0
| Child/Family Sample | Expected Number Sampled | Expected Number Responding | 95% CI Half-Width: Continuous Measures (standard deviation units) | 95% CI Half-Width: Binary Measures (percentage points [a]) |
|---|---|---|---|---|
| Full sample: All children/families | 1,272 | 1,018 | ±0.09 | ±4.3 |
| Child age: Preschoolers | 636 | 509 | ±0.12 | ±5.9 |
| Child age: Infants/All Toddlers [b] | 636 | 509 | ±0.12 | ±5.9 |
| Child age: Older Toddlers only [c] | 424 | 339 | ±0.15 | ±7.5 |
| Farmworker status: Migrant [d] | 216 | 173 | ±0.21 | ±10.6 |
| Farmworker status: Seasonal | 1,056 | 845 | ±0.10 | ±4.8 |
| Region: California | 269 | 215 | ±0.19 | ±9.4 |
| Region: Northeast | 176 | 140 | ±0.23 | ±11.7 |
| Region: Midwest | 209 | 167 | ±0.21 | ±10.7 |
| Region: Northwest | 282 | 226 | ±0.18 | ±9.2 |
| Region: Southeast | 201 | 161 | ±0.22 | ±10.9 |
| Region: Southwest | 136 | 109 | ±0.27 | ±13.3 |

[a] For an estimated proportion of 50%.
[b] Sample for estimates based on the full sample of infants and toddlers (e.g., based on Teacher Child Reports or Parent Child Reports), including toddlers of any age, i.e., both young toddlers (13-23 months old) and older toddlers (24-35 months old).
[c] Sample for estimates based on direct child assessments of toddlers. This sample is restricted to older toddlers (24-35 months old) because only they receive direct assessments; younger toddlers (13-23 months old), like infants, do not.
[d] Migrant farmworkers constitute 17% of farmworkers (2011-2012 NAWS).
We do not anticipate any unusual problems that require specialized sampling procedures.
We will collect data from each respondent only once.
B.2.2 Data Collection Procedures

We propose to collect data for two study components that are based on distinct samples. First, the Program and Center Component will include mail surveys for the universe of all MSHS programs as well as a nationally representative sample of centers. Second, the Classroom, Family and Child Component will be based on a nationally representative sample of MSHS programs, centers, families and children. Exhibit A.1 (in Part A) shows the instrument components, sample size, and type of administration. Outlined below are the procedures for each of the data collection instruments, separately for each of the two study components.
For the Program and Center Component, the MSHS study team plans to collect data between February 2017 and April 2018. We will mail self-administered paper-and-pencil surveys (Appendix 3) accompanied by letters from the ACF Project Officer and the Abt Study Director to the universe of MSHS program directors (29 grantees and 24 delegate agencies reported in HSES data, obtained in January 2016; see Appendix 2). The data from the survey will be used to answer research questions regarding MSHS program characteristics. The survey is estimated to require approximately 40 minutes to complete. The study team will send the program director surveys as early as possible during the data collection period to allow time to follow up as necessary.[9] The letter from the Abt Study Director will ask program directors to return the completed survey in an enclosed pre-paid envelope by a specific date, usually about two weeks after receipt of the survey. The study team will send email reminders and also conduct phone follow-up as necessary (Appendix 5). Prior to data collection, the MSHS study team will have already conducted outreach to program directors through newsletters, conference presentations and one-on-one encounters, all of which will emphasize the importance of the study and of program directors' participation. In addition, the study will be publicized and supported by the MSHS Branch within the Office of Head Start and by the National MSHS Association. Based on the level of outreach and encouragement from trusted sources and the targeted efforts of the study team, we expect to be successful in obtaining completed surveys from the universe of program directors.
We will mail self-administered paper-and-pencil surveys to a sample of MSHS center directors (Appendix 4) along with the letter from the ACF Project Officer and a cover letter from the Abt Study Director (Appendix 2). In addition, we will send survey packets to any of the 53 centers selected for participation in the Classroom, Family and Child Component that are not selected as part of the original sample of 250 centers.[10] The data from the survey will be used to answer research questions regarding MSHS center characteristics. The survey is estimated to require approximately 40 minutes to complete. The study team will send the center director surveys as early as possible during the data collection period, but on a rolling basis to provide for variable periods of operation for the centers. The Abt Study Director cover letter will ask the center directors to return the completed survey in an enclosed pre-paid envelope by a specific date, usually about two weeks after receipt of the survey.
For the Classroom, Family and Child Component, the MSHS study team proposes to collect data from several sources: MSHS children, their parents, and their teachers and assistant teachers, as well as from classroom observations.
The data collection for this component will occur on-site at the 53 centers (or center groups) between April 2017 and April 2018. We will visit each center for approximately one week. Prior to on-site data collection, we will identify an on-site coordinator (OSC) for each center – a designated center staff member who will work with the study team to recruit teachers, assistant teachers, and families; help schedule site visits; and help with obtaining informed consent. A member of the study team, in conjunction with the OSC, will schedule the on-site data collection week during a peak attendance period (i.e., a week when the highest number of children are expected to be in attendance). Seven field data collection teams will be available for on-site data collection, although no more than five or six teams will be scheduled during a given week to allow remaining team members to fill in when necessary.
Below, procedures are outlined for each of the data collection instruments for the Classroom, Family and Child Component.
Approximately three weeks prior to the scheduled on-site data collection, for each selected center, the study team will send the MSHS Classroom Sampling Form (Appendix 16) to the OSC. The OSC will be asked to enter each teacher’s full name, the classroom type (e.g., AM, PM, full day, or other) and specify the number of MSHS-funded infant, toddler and preschool children by age group in each classroom. Centers may provide this information in various alternative formats including reports from their record systems, hard copy lists, or photocopies of records. Upon receipt of the completed MSHS Classroom Sampling Form or alternative format, the study team will use a sampling program to select the sample of classrooms (expected to be three classrooms per selected center) according to the Sampling and Analysis Plan, and immediately afterwards, notify the OSC of the selected classrooms. The OSC will then begin recruiting the teachers and assistant teachers in the selected classrooms and completing the MSHS Child Roster forms.
The study team will send the OSC the MSHS Child Roster Forms (Appendix 17) and request that, for each of the MSHS-funded children in the selected classrooms, the OSC list the child's full name, date of birth, gender, parent/primary caregiver full name, and full name(s) of the child's MSHS-funded sibling(s) enrolled in each of the sampled classrooms. Again, centers may provide this information in various alternative formats including reports from their record systems, hard copy lists, or photocopies of records. Upon receipt of the completed MSHS Child Roster Form for each classroom, the study team will use a sampling program to select a sample of children (expected to be an average of eight children per classroom) and their families. Next, the study team will send the OSC a listing of the sampled children and their roster information. When the team provides the OSC with the list of selected families and children, it will ask the OSC to verify that the information listed for each child is correct. The study team will take extra care when matching children and their respective parents and when checking for duplicates in the sampling process because some individuals from Spanish-speaking countries have two last names (i.e., paternal then maternal last name). Because traditional English-language data systems allow only one slot for a last name, a child may be listed by only one last name in the center records. Thus, the study's data system will have slots for multiple last names to allow for full name verification, which will be critical to accurate identification of MSHS children and families and avoidance of duplication.
As part of the initial orientation for the OSCs, study staff will apprise each OSC of the two- to three-week turnaround necessary for obtaining the names, selecting the sample, and notifying the OSC of the selected children and their families. We also will need the OSC to begin the process of obtaining informed consent immediately upon receipt of the sample. Westat will send each OSC a schedule with the actual dates for each of these activities for the OSC's center.
To prevent children from being sampled more than once, particularly when families migrate from one center to another during the year, the rosters of children will be unduplicated across centers using a two-pronged approach that includes use of an in-house SAS program that does probabilistic matching and also having research assistants conduct manual reviews of the rosters. That is, the roster from each center about to undergo child sampling will be matched to the rosters from centers that have already undergone child sampling, and children who appear on the earlier roster will be withdrawn from the selection pool. To facilitate the comparison and identification of duplicates, we will provide research assistants with sorted lists of children already sampled. Each child’s name, gender, date of birth, and parent name(s) will be used to match children across classroom lists. This step will reduce the likelihood of sampling children more than once in different centers over the course of the data collection period, which would result in a loss of sample size. Should a duplicated case be missed, the child-level base weight will be adjusted by the additional chance(s) of selection from the other center list(s). The OSC will be asked to notify Westat immediately should a child move prior to data collection. In that case, the child will be considered a non-respondent for that center, but we will continue to check for the child as a previously sampled child when processing subsequent rosters. Should the child be found, the child would be deemed ineligible at the subsequent center.
Before the on-site visit, the site supervisor will discuss with the OSC the best approaches for securing parent participation and informed consent. Site supervisors will have been trained in these approaches as part of their training and will be tasked with preparing the OSC for their role in securing informed consent from the parents.
We have prepared advance letters at an appropriate reading level from the Abt Study Director to invite parents and their children to participate (Appendix 25), a colorful MSHS Study flyer (Appendix 22), a tailored set of Frequently Asked Questions for parents (Appendix 26), and clearly written informed consent letters with details of the study activities for parents and their children (Appendix 27). All of these materials will be available in both Spanish and English. We recognize, however, that for the MSHS parents, a good deal of outreach will need to be face-to-face and communicated orally by trusted people. Therefore, the MSHS study team will work with the OSC to tailor an approach that will work best for each center and their families.
During the scheduled on-site visit, the field data collectors will conduct face-to-face interviews with parents who have agreed to participate (Appendix 29). As part of the one-hour interviews, the MSHS study team will also ask parents to provide ratings for their study child (Appendix 30). The field team will offer afternoon, evening, and weekend interview sessions at the centers, in a nearby public facility such as a community center, library, or local restaurant, or at the parent's home, as appropriate. The field team will include bilingual interviewers, fluent in both Spanish and English, who will conduct the parent interviews in the parent's preferred language. However, some MSHS parents speak languages other than English or Spanish (e.g., an indigenous language of Mexico or Haitian Creole). If the field interviewers do not speak the parent's preferred language, whenever possible, and with the parent's agreement, we will enlist the aid of an interpreter from the MSHS program. Alternate possibilities for interpreters that have proven successful on other studies, such as the National Head Start Impact Study and the Third Grade Follow-up to the Head Start Impact Study, include enlisting the aid of another family member, friend, or neighbor who, with the parent's agreement, can provide interpreter services.
The field data collection team will assess older toddlers (24-35 months) and preschoolers (36 months and older) individually for about 27 minutes and 42 minutes, respectively, during the scheduled on-site visit (see Appendix 34). The site supervisor will work with the OSC to determine the most comfortable and appropriate quiet space for conducting the assessments, which will vary from center to center. Additionally, assessors will complete assessor child ratings, requiring about 7 minutes each for preschoolers and toddlers. We expect that many children will be dual language learners, so field assessors will determine the appropriate language(s) and assess the children in English and/or Spanish. The MSHS study team will train and certify the field assessors in both the toddler and preschool assessments so they can work with both age groups. In addition, information will be collected for infants from parent ratings (part of the Parent Interview) and teacher ratings (Teacher Child Reports). The field assessors will conduct the assessments at the MSHS center.
For all age groups, we will ask the teachers and assistant teachers to introduce the field staff to the children at the start of the data collection week and prior to conducting the assessments. In addition, whenever possible, we will conduct the classroom observation before the child assessments so that children become acquainted with the field staff before being assessed. Assessors also will be trained to establish rapport by talking with individual children to set them at ease before beginning the child assessment. If a child refuses to participate, the study team will not insist on continuing. Field staff also will tell the children that they will receive a gift when they have finished the activities. Particularly for the youngest children, we will invite familiar staff to be present when the children are assessed. The study team will also engage children in a short warm-up activity (e.g., play with puppets) to help put them at ease.
Two weeks prior to the scheduled on-site data collection week for a center, we will mail packets to the OSC to distribute to the teacher and assistant teacher in each of the three sampled classrooms at the center. Each packet will contain a paper-and-pencil survey (see Appendix 18 for teachers and Appendix 19 for assistant teachers) along with a cover letter from the Abt Study Director (Appendix 20), a letter from the ACF Project Officer (Appendix 21), a colorful MSHS Study flyer (Appendix 22), and a set of Frequently Asked Questions tailored to teachers and assistant teachers (Appendix 23). We will ask the OSC to distribute the packets upon receipt to allow the teachers and assistant teachers as much time as possible to complete the instruments and return them to the site supervisor by the end of the on-site visit. This procedure was employed successfully for the Head Start Impact Study. The surveys are expected to require no more than 40 minutes to complete for teachers and 20 minutes for assistant teachers. The paper-and-pencil format will afford respondents the flexibility to complete the surveys in one session or several sessions at time(s) convenient for them. When we encounter teachers and/or assistant teachers with lower literacy levels, either self-identified or identified by the OSC as needing assistance, the site supervisor or a field interviewer will complete the survey as an in-person interview with the respondent during the on-site visit. Should it be necessary, pre-paid mailers will be left behind for the return of any surveys not completed by that deadline. For teachers and assistant teachers who do not submit their instruments on time and who need more time to complete their surveys after the on-site data collection week, we will have bilingual staff available to complete the survey as a phone interview.
Included in the survey packet for teachers will be a Teacher Child Report form for each sampled child in the teacher's classroom (Appendix 24).[11] These report forms will be in the teacher's preferred language (English or Spanish, according to information collected in the Classroom Sampling Form). The cover letter from the Abt Study Director (described above) for the teacher survey will ask the teacher to provide child ratings for each sampled child in their classroom by completing a Teacher Child Report form. Each teacher is expected to have an average of eight sampled children, and each Teacher Child Report form will require approximately 10 minutes to complete. For teachers with lower literacy levels, either self-identified or identified by the OSC as needing assistance, the site supervisor or a field interviewer will complete the Teacher Child Reports as in-person interviews during the on-site visit. The MSHS study team will ask teachers to return their completed Teacher Child Report forms with their completed survey to the site supervisor by the end of the on-site visit. Should it be necessary, pre-paid mailers will be left behind for the return of any Teacher Child Reports not completed by that deadline. For those teachers who still do not submit their Teacher Child Reports using the pre-paid mailers, bilingual staff will be available to complete the Teacher Child Reports as phone interviews.
We will conduct a direct observation of teacher-child interactions that support children’s learning and development in all sampled classrooms using multiple observation instruments (Appendix 31). The site supervisor will work closely with the OSC to schedule the classroom observations. Observations are expected to be conducted during the morning hours for approximately two hours per classroom. These observations do not impose any burden.
B.3. Methods to Maximize Response Rates and Data Reliability

To encourage participation in the study, the team has prepared detailed recruitment materials, including an informative newsletter, colorful flyers, posters, concise FAQs, and clearly written advance letters from ACF leadership, the ACF Project Officer, and the Abt Study Director. Cover and consent letters are written to engage respondents, describe the study, and underscore its importance in helping MSHS better serve the needs of children and families in the future. We will send advance letters and recruitment materials to MSHS staff and parents regarding upcoming surveys and interviews to allow ample time for preparation. We anticipate that many MSHS program staff and families will be motivated to participate if they feel invested in the study and believe their contribution is important to its success; we will communicate these elements to MSHS staff and families. Field staff are trained to be flexible, persistent, and respectful. If programs or centers show any reluctance, site supervisors and, potentially, senior data collection staff, all of whom are trained in refusal conversion strategies for gaining and maintaining cooperation, will use these skills to help secure participation.
While recognizing the many challenges presented by the study, we are confident that high response rates can be obtained from center staff and participating children and families by building on effective strategies and procedures used successfully in previous studies, including the National Head Start Impact Study. These strategies include selecting and developing data collection instruments at appropriate reading levels for ease of completion; offering incentives; using multi-mode data collection approaches, such as administering paper-and-pencil surveys as interviews for those with low literacy levels; and sending multiple and varied types of reminders. We recognize the importance of high response rates for reducing the possibility of nonresponse bias and making study findings more generalizable to the MSHS population.
Many of the scales and items in the proposed parent survey, child assessment, and Teacher Child Reports were successfully administered previously in FACES and Baby FACES. We plan to pilot the updated instruments, adapted for the MSHS context, with fewer than 10 respondents to assess the appropriateness of items and the length of administration. The instruments will also be reviewed by experts familiar with the MSHS population (for more information, see Section A.8.2, Consultation with Experts Outside of the Study).
The study team is led by Wendy DeCourcey, the ACF Project Officer; Dr. Linda Caswell, Project Director; Dr. Erin Bumgarner, Deputy Project Director; Drs. Michael López and Sandra Barrueco, Co-Principal Investigators; and Camilla Heid, data collection lead. The team also consulted with Pam Broene, a senior statistician at Westat; Dr. Anne Wolf, an Associate Scientist at Abt Associates; and Mr. Cris Price, a senior statistician at Abt Associates, on statistical and analytic issues.
[1] We use the term “program” as shorthand to represent both “grantees” and “delegate agencies.”
[2] We will work with OHS to update the list of programs before finalizing the sampling frames. Programs that are known by OHS to have lost their funding or otherwise closed will be removed from the sampling frame, and programs associated with new grants awarded since then will be added.
[3] The NAWS is an employment-based, random sample survey of migrant and seasonal crop workers conducted by the U.S. Department of Labor. Its primary purpose is to monitor the terms and conditions of agricultural employment and assess the conditions of farm workers. Data are collected throughout the year, over three cycles, to reflect the seasonality of agricultural production and employment (http://www.doleta.gov/agworker/naws.cfm).
[4] Note that each center within a stratum has an equal probability of selection; however, the number of centers is allocated proportionally to the MSHS enrollment in that stratum, with more centers selected from strata with higher enrollment and fewer from strata with lower enrollment.
[5] Although the median funded enrollment for MSHS programs is 282 (2013-2014 PIR), 10% of MSHS programs have a funded enrollment of 56 or fewer.
[6] To ensure that the study team has data on all centers that are in the Classroom, Family and Child Component, any center directors within that component's sample who were not sampled for the Center Director mail survey in the Program and Center Component will be asked to complete the mail survey. For burden estimates (Part A), we assume the maximum possible number of respondents for the Center Director mailed survey (200 + 53 = 253).
[7] The variance of a proportion is largest at P=50% and decreases as P approaches 0 or 100% for a given sample size.
[8] The Design Report estimated an average design effect of 1.78 from Head Start FACES 2003, which had a sample design similar to the MSHS Study's, using three outcome variables: 1) Test de Vocabulario en Imagenes Peabody (TVIP), 2) Woodcock-Munoz Letter Word Identification (WMLW), and 3) Woodcock-Munoz Dictation (WMDICT) for Dual Language Learner (DLL) children who did not pass the English language screener. For these three outcomes plus Woodcock-Munoz Applied Problems, the Head Start Impact Study for spring 2003 estimated an average design effect of 1.67 for Spanish-speaking children who attended Head Start centers in Puerto Rico.
[9] MSHS program staff at the Office of Head Start and our expert consultants have assured us that MSHS programs are open year-round; however, in the case that a program is not, our team can be flexible and administer the survey later in the data collection period if necessary.
[10] Sampling will be done to ensure as much overlap as possible between the 250 centers included in the Program and Center Component and those included in the Classroom, Family and Child Component; however, it is assumed that some of the 53 centers will need to be added to the group of centers receiving the survey. Center-level data are needed on each of the 53 centers in the Classroom, Family and Child Component so that the study team can address research questions about associations between center characteristics and family characteristics.
[11] Only teachers, not assistant teachers, will be asked to complete Teacher Child Reports.