The Abt statisticians assigned to NEXT are Dr. K.P. Srinath and Dr. Martin Frankel. Dr. Srinath will oversee the sampling and weighting processes for NEXT, including the development and implementation of imputation procedures. As part of the NEXT team, Dr. Frankel will provide expert technical support to Dr. Srinath in the areas of sampling and weighting. Dr. Srinath and Dr. Frankel were responsible for the sampling and weighting process for the HBSC 2006 study.
Dr. K.P. Srinath currently oversees sampling and estimation procedures for a wide variety of projects, providing guidance regarding construction of sampling frames, stratification, sample size determination, sample allocation and sample selection, and developing detailed weighting specifications for programming staff. Dr. Srinath has contributed to a number of important methodological studies and analyses. Dr. Srinath holds a Ph.D. in biostatistics from the University of California, Los Angeles and is an elected member of the International Statistical Institute.
Dr. Martin R. Frankel, a senior statistical scientist at Abt, has 30 years of experience applying statistical sampling and analysis to social and business issues. He is nationally recognized for his expertise in the design, execution, and analysis of major national sample surveys for a number of government agencies and commercial enterprises. He is also well known for his designs of longitudinal surveys in the field of education and his knowledge of NCES Statistical Standards. Dr. Frankel served on an invited Standards Review Panel for NCES, in which capacity he was asked to provide advice to NCES that will help it make the Standards more effective. Dr. Frankel is the coauthor of two important books—Inference from Survey Samples, and Total Survey Error—and has done pioneering work in the construction of multistage samples for ED. He is one of a small number of statisticians whose work essentially sets standards in the survey industry. Dr. Frankel has a Ph.D. in mathematical sociology from the University of Michigan.
Nonresponse Bias Analysis in NEXT
Bias in a survey estimate due to nonresponse depends on two components. The first is the nonresponse rate and the second is the difference between respondents and nonrespondents in the population parameter that is being estimated. For example, if we estimate a population percentage by selecting a simple random sample and computing the sample percentage, and there is nonresponse, the bias in the sample percentage due to nonresponse is given by
$$\mathrm{Bias}(p_R) = (1 - R)\,(P_R - P_{NR})$$
where $p_R$ is the sample percentage based on respondents, $R$ is the response rate, $P_R$ is the population percentage among the respondents and $P_{NR}$ is the population percentage among the nonrespondents. Therefore, it is important to examine both the response rate and the differences between the responding and nonresponding groups when analyzing bias in the estimates due to nonresponse. We describe below the steps that we intend to follow in analyzing bias due to nonresponse by some of the sampled schools in NEXT. These steps are in accordance with the statistical standards set by the National Center for Education Statistics (NCES) for nonresponse bias analysis (http://nces.ed.gov/StatProg/2002/std4_4.asp).
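As an illustration only, the sketch below evaluates the bias as the nonresponse rate multiplied by the difference between the respondent and nonrespondent population percentages. The function name and the input values are hypothetical and are not NEXT results.

```python
def nonresponse_bias(response_rate, pct_respondents, pct_nonrespondents):
    """Bias of a respondent-based percentage:
    (1 - response rate) * (respondent pct - nonrespondent pct)."""
    if not 0.0 <= response_rate <= 1.0:
        raise ValueError("response_rate must be between 0 and 1")
    return (1.0 - response_rate) * (pct_respondents - pct_nonrespondents)


# Hypothetical inputs: 80% response rate, 18% prevalence among respondents,
# 25% prevalence among nonrespondents.
bias = nonresponse_bias(0.80, 18.0, 25.0)

# A negative bias means the respondent-based percentage understates the
# population percentage.
print(f"bias = {bias:.2f} percentage points")
```

With full response the first factor is zero, and with identical groups the second factor is zero, so either condition alone removes the bias.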
1. Examination of Response Rates
We will examine both the overall response rate and the response rates for various subgroups, as per guideline 4-4-2A of the NCES Statistical Standards. High response rates not only for the entire sample but also for subgroups might indicate that there is no need for further analysis of bias due to nonresponse (Bose, 2001). Large differences in the response rates for subgroups serve as indicators that potential bias may exist (Brick & Bose, 2001). We plan to examine school response rates by: (1) census division; (2) rural and urban location; (3) enrollment (large schools vs. small schools); (4) proportion of minority students; (5) poverty index for schools; and (6) school type (public, Catholic, and private schools). These rates can be examined by subgroup because this information is available on the sampling frame for both respondent and nonrespondent schools. As an example, if the response rates for schools with high-income students (low poverty index) and schools with low-income students (high poverty index) are very different, then any difference in characteristics of interest (such as the percent of students who are obese or who have low physical activity) between these schools would result in a bias in the estimates.
For each of these variables, we plan to examine selected characteristics, such as obesity, low physical activity, and tobacco use, based on the respondents in each group. If group differences are found for both the selected characteristic and response rates, there is reason to believe that there is bias in the estimates. We will also investigate the sampling frame characteristics of these schools in each subgroup. We will make appropriate weighting adjustments to reduce this bias.
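The subgroup response rates described in this step could be tabulated along the following lines. This is a minimal sketch: the field names (`poverty`, `base_weight`, `responded`) and the eight school records are hypothetical, and the rate is computed as a base-weighted ratio of responding to sampled schools.

```python
from collections import defaultdict


def response_rates_by_group(schools, group_key):
    """Base-weighted response rate for each level of one frame variable."""
    sampled = defaultdict(float)
    responding = defaultdict(float)
    for school in schools:
        group = school[group_key]
        sampled[group] += school["base_weight"]
        if school["responded"]:
            responding[group] += school["base_weight"]
    return {group: responding[group] / sampled[group] for group in sampled}


# Hypothetical sampled schools (not NEXT data).
schools = [
    {"poverty": "high", "base_weight": 10.0, "responded": True},
    {"poverty": "high", "base_weight": 10.0, "responded": False},
    {"poverty": "high", "base_weight": 20.0, "responded": False},
    {"poverty": "high", "base_weight": 20.0, "responded": True},
    {"poverty": "low", "base_weight": 10.0, "responded": True},
    {"poverty": "low", "base_weight": 10.0, "responded": True},
    {"poverty": "low", "base_weight": 10.0, "responded": True},
    {"poverty": "low", "base_weight": 10.0, "responded": False},
]

rates = response_rates_by_group(schools, "poverty")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

The same function can be called once per frame variable (census division, enrollment class, school type, and so on) to look for large gaps between subgroups.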
2. Comparison of Sample and Frame Estimates
Per the NCES guideline 4-4-2C, we will use the sampling weight based on the probability of selection of responding schools without any nonresponse adjustment and the data from the responding schools to compute population estimates of some characteristics available (not used for stratification at the time of selection of schools) on the sampling frame. These estimates will be compared with the population values. For example, the total number of weighted students by grade based on the respondents can be compared to the number on the sampling frame. If there are large differences taking into account the sampling error, then this may indicate bias because of nonresponse. We will also get estimates of students in responding schools by race/ethnicity, and compare this to the total computed from the population of schools on the frame to determine whether there is any bias in the estimates.
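A comparison of this kind might be sketched as below for one frame characteristic. The sketch assumes a ratio-type quantity (the minority share of enrollment) rather than a total, and it omits the sampling error calculation, which depends on the sample design. All field names and values are hypothetical.

```python
def weighted_share(schools, numerator_key, denominator_key):
    """Base-weighted ratio estimate: sum(w * numerator) / sum(w * denominator)."""
    numerator = sum(s["base_weight"] * s[numerator_key] for s in schools)
    denominator = sum(s["base_weight"] * s[denominator_key] for s in schools)
    return numerator / denominator


# Hypothetical responding schools with frame variables and base weights
# (no nonresponse adjustment applied).
respondents = [
    {"base_weight": 10.0, "enrollment": 500, "minority": 100},
    {"base_weight": 20.0, "enrollment": 300, "minority": 90},
    {"base_weight": 15.0, "enrollment": 400, "minority": 200},
]

# Hypothetical value computed from every school on the sampling frame.
frame_share = 0.30

estimate = weighted_share(respondents, "minority", "enrollment")
difference = estimate - frame_share
print(f"respondent estimate = {estimate:.3f}, frame = {frame_share:.3f}, "
      f"difference = {difference:+.3f}")
```

Whether a given difference is large has to be judged against the sampling error of the estimate.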
3. Comparison of Estimates Based on Respondents to Estimates from External Sources
Per the NCES guideline 4-4-2C, we will compare NEXT estimates of the prevalence of selected health behaviors with estimates from the 2009 Health Behavior in School-Age Children Survey, the Youth Risk Behavior Survey (YRBS), and the Monitoring the Future Survey to determine whether there are large differences between the survey estimates. A large difference that cannot be attributed to sampling error may indicate a bias in the estimates. This approach is limited because differences may not be due solely to nonresponse.
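One simple way to judge whether a difference exceeds sampling error is a z statistic for two independent estimates, sketched below. The estimates and standard errors are hypothetical placeholders. In practice each standard error would need to reflect the complex design of its survey, and the 1.96 cutoff is only the conventional 5% two-sided threshold.

```python
import math


def z_for_difference(est_a, se_a, est_b, se_b):
    """z statistic for the difference between two independent estimates."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical prevalence estimates and design-based standard errors.
z = z_for_difference(0.20, 0.03, 0.16, 0.04)
large_difference = abs(z) > 1.96

print(f"z = {z:.2f}, exceeds sampling error: {large_difference}")
```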
4. Comparisons of Respondents by Successive Levels of Recruitment Effort
As per the guideline 4-4-2D by NCES, we plan to compare schools that agree to participate in the survey after the first contact with those that agree after several attempts or those that refuse first and then later agree. Estimates of student level characteristics will be computed based on each successive wave of participating schools (i.e., adding respondents in the order of level of effort used to recruit the school) and the sampling weights based on probabilities of selection. If the estimates based on the initial sample and successively larger samples have a trend of either increasing or decreasing, this may be an indication of bias because of nonresponse.
For example, if the percentage of students who are obese increases significantly as the number of responding schools increases, this might indicate that we are underestimating the percent of students who are obese.
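The successive-wave comparison could be sketched as follows: schools are added in order of recruitment effort, and the weighted estimate is recomputed after each wave. The wave coding, field names, and counts are hypothetical, and the monotonic check is only a crude screen for a trend, not a significance test.

```python
def cumulative_estimates(schools):
    """Weighted prevalence after adding each successive recruitment wave."""
    estimates = []
    numerator = denominator = 0.0
    for wave in sorted({s["wave"] for s in schools}):
        for s in schools:
            if s["wave"] == wave:
                numerator += s["weight"] * s["n_obese"]
                denominator += s["weight"] * s["n_students"]
        estimates.append((wave, numerator / denominator))
    return estimates


# Hypothetical schools; wave 1 agreed at first contact, later waves
# required additional recruitment effort.
schools = [
    {"wave": 1, "weight": 10.0, "n_students": 100, "n_obese": 15},
    {"wave": 1, "weight": 10.0, "n_students": 100, "n_obese": 17},
    {"wave": 2, "weight": 10.0, "n_students": 100, "n_obese": 20},
    {"wave": 3, "weight": 10.0, "n_students": 100, "n_obese": 24},
]

estimates = cumulative_estimates(schools)
values = [value for _, value in estimates]
increasing = all(later > earlier for earlier, later in zip(values, values[1:]))

for wave, value in estimates:
    print(f"through wave {wave}: {value:.3f}")
print("steadily increasing:", increasing)
```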
5. Nonresponse Propensity Model
As suggested in NCES guideline 4-4-2B, we will examine the possibility of constructing a model to estimate, for both respondents and nonrespondents, the probability that a school in the sample responds to the survey. This estimated probability is called a propensity score, and it comes from a logistic regression model. The survey statisticians at Abt Associates have experience working with propensity score models for dealing with problems of noncoverage and nonresponse (Srinath et al., 2009). The model will be based on variables that are available for both nonresponding and responding schools. Census division, rural/urban location, enrollment, school type (Catholic/private/public), proportion minority, and poverty index are some of the variables that will be considered. Schools will be grouped using the estimated propensity scores. Within each group we will compare the frame characteristics of responding and nonresponding schools. This may help to determine the survey characteristics of students in schools that do not respond. For example, if nonresponding schools with low propensity scores happen to be rural and low-income schools, then the characteristics of the responding schools in the same group will provide information on the bias due to these nonresponding schools. In addition to helping assess the bias, this grouping will provide a method of forming weighting classes for adjusting the weights to reduce the bias due to nonresponse.
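The propensity approach can be sketched in a few steps: fit a logistic regression of response on frame variables, rank schools by estimated propensity, form classes, and inflate respondent weights within each class. The code below is a minimal standard-library sketch under stated assumptions: two hypothetical binary predictors (rural, high poverty), a plain gradient-ascent fit in place of production statistical software, two classes, and invented data.

```python
import math


def predict(beta, x):
    """Estimated response propensity for one school."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Fit logistic regression by batch gradient ascent.
    Returns [intercept, coefficient_1, ...]."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = yi - predict(beta, xi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def weighting_classes(scores, n_classes):
    """Assign schools to equal-sized classes by rank of propensity score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    classes = [0] * len(scores)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(scores)
    return classes


def adjusted_weights(base_weights, responded, classes):
    """Inflate respondent weights by the inverse weighted response rate
    of their class; nonrespondents receive a weight of zero."""
    total, resp = {}, {}
    for w, r, c in zip(base_weights, responded, classes):
        total[c] = total.get(c, 0.0) + w
        resp[c] = resp.get(c, 0.0) + (w if r else 0.0)
    return [w * total[c] / resp[c] if r else 0.0
            for w, r, c in zip(base_weights, responded, classes)]


# Hypothetical frame data: (rural, high_poverty) and response status.
X = [(0, 0)] * 4 + [(0, 1)] * 3 + [(1, 0)] * 3 + [(1, 1)] * 4
y = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0]
base_weights = [12.0] * 7 + [8.0] * 7

beta = fit_logistic(X, y)
scores = [predict(beta, x) for x in X]
classes = weighting_classes(scores, n_classes=2)
final_weights = adjusted_weights(base_weights, y, classes)

print("coefficients:", [round(b, 3) for b in beta])
print("adjusted weight total:", round(sum(final_weights), 6))
```

Because each class here contains at least one respondent, the adjusted respondent weights sum to the total base weight of all sampled schools. A class with no respondents would have to be collapsed with a neighboring class before adjusting.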
References:
NCES Statistical Standard 4-4. Nonresponse Bias Analysis. http://nces.ed.gov/StatProg/2002/std4_4.asp
Analysis of Potential Nonresponse Bias Analysis. J.M. Brick and J. Bose. Proceedings of the Annual Meeting of the American Statistical Association, 2001.
Nonresponse Bias Analysis at the National Center for Education Statistics. J. Bose. Proceedings of the Statistics Canada Symposium 2001.
Compensating for Noncoverage of Nontelephone Households in Random-Digit-Dialing Surveys: A Comparison of Adjustments Based on Propensity Scores and Interruption in Telephone Service. K.P. Srinath, Martin R. Frankel, David C. Hoaglin, and Michael P. Battaglia. Journal of Official Statistics, Vol.25, No.1, 2009, pp 77-98.
File Type | application/msword |
File Title | Nonresponse Bias Analysis in NEXT |
Author | SrinathK |
Last Modified By | Ronald J. Iannotti |
File Modified | 2009-12-31 |
File Created | 2009-12-31 |