U.S. DEPARTMENT OF EDUCATION
Institute of Education Sciences
National Center for Education Evaluation and Regional Assistance
To: OMB
From: Lauren Angelo
Subject: Update on the Conversion Magnet School Evaluation
Date: 2/1/08
As requested in the Terms of Clearance for the Conversion Magnet Schools Evaluation (OMB# 1850-0832), we are reporting back to OMB on the results of the feasibility phase.
BACKGROUND
Magnet schools and programs were originally designed to help address racial equity issues in American public education. More recently, they have become an important component of public school choice as well as a possible mechanism to improve the achievement of all students, particularly students who are disadvantaged. Magnet schools are typically based in neighborhoods with high concentrations of socio-economically disadvantaged and/or minority students. These schools largely serve students who reside in their attendance zone (resident students). When the schools either shift entirely from a traditional school to a magnet school (a conversion) or begin operating a magnet program, a thematic or specialized approach to instruction is adopted with the purpose of attracting other, usually more advantaged, students to attend the magnet program/school from other neighborhoods and districts. The overall goal is to create a more diverse mix of students in the school, with the expectation that this diversity, combined with a better academic program, will improve academic achievement and reduce minority group isolation particularly for resident students.
In an effort to provide more rigorous information about its program, ED’s Office of Innovation and Improvement (OII) requested that the Institute of Education Sciences (IES) consider evaluating the Department’s Magnet School Assistance Program (MSAP) or one of its program models. The two offices agreed to focus on elementary schools converting to whole-school magnets, because it is the most common approach used among districts funded by the MSAP and is consistent with NCLB efforts to turn around low-performing schools.[1] OII’s greatest policy interest was in the effects on resident students, because they are the largest group of students served by magnet programs and tend to be more disadvantaged. Non-resident students, who must actively apply for admission, make up a much smaller group of magnet students and tend to be more advantaged and higher achieving.
The best available method to assess the relationship between magnet school conversion, minority group isolation, and student achievement for the resident students is a comparative interrupted time series analysis (ITS). The strongest ITS design would compare resident student outcomes with those in matched, non-magnet comparison schools in the same district for years prior to and post conversion. Lottery-based analysis could be conducted for the advantaged students who apply to enroll in the magnet.
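As a rough illustration of the comparative ITS logic described above, the following Python sketch differences each group's post-conversion deviation from its own baseline mean across magnet and comparison schools. The function name and all scores are hypothetical and invented for illustration (they are not study data), and this simplified estimator is an assumption: the memo does not specify the study's actual statistical model, which may differ.

```python
from statistics import mean

def comparative_its_estimate(magnet_pre, magnet_post, comp_pre, comp_post):
    """Post-period deviation from the baseline mean, magnet minus comparison.

    A simplified stand-in for a comparative ITS estimate (hypothetical helper).
    """
    magnet_deviation = mean(magnet_post) - mean(magnet_pre)
    comparison_deviation = mean(comp_post) - mean(comp_pre)
    return magnet_deviation - comparison_deviation

# Hypothetical standardized mean scores, three years before and after conversion.
magnet_pre = [-0.30, -0.28, -0.32]
magnet_post = [-0.20, -0.15, -0.10]
comp_pre = [-0.25, -0.27, -0.23]
comp_post = [-0.24, -0.22, -0.20]

estimate = comparative_its_estimate(magnet_pre, magnet_post, comp_pre, comp_post)
print(f"Illustrative comparative ITS estimate: {estimate:.3f}")
```

Subtracting the comparison schools' deviation is meant to net out district-wide changes that would have affected both groups regardless of the conversion.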
In September 2006, NCEE awarded a contract to AIR/BPA (with Julian Betts as Principal Investigator) to determine if a sufficient number of 2004 and 2007 MSAP grantees had the data necessary to conduct an ITS evaluation (feasibility phase) and to conduct the evaluation if NCEE called for it (the option).
RESULTS OF THE FEASIBILITY PHASE
The contractor team screened both sets of grantees and found approximately 25 grantee schools (in 13 districts, 8 states) that were converting to magnet programs and that had consistent achievement data (2-3 years before and a similar number after the conversion) for an ITS. The contractor determined that there were approximately 50 comparison schools in the same districts that also had sufficient data. This total number of schools that meet the study’s criteria[2] means the study will be able to detect achievement impacts as small as an effect size of 0.17 over the three years, a target we think is reasonable for schools that receive approximately $600,000 per year for each of the three years of their MSAP grant.
As a result, we plan to execute the option to conduct the evaluation and anticipate a quick start. The evaluation design approved by OMB calls for a principal survey to be conducted in the last year of the MSAP grant. Five of the eligible magnet schools and about nine comparison schools are from the 2004 round of MSAP grants, and surveys will be administered to those principals in spring 2008.
IES Responded to Two OMB Questions Concerning the memo: Response to Terms of Clearance for Conversion Magnet Schools Evaluation (1850-0832)
OMB QUESTION 1 - NCEE provided three criteria (response of 6/19/07 to OMB questions) that would help it determine whether the proposed study was feasible. It appears that at most one of these was met. Please provide a more detailed discussion of the feasibility results, specifically in light of those criteria.
IES Response: As shared in the 6/19/07 response to OMB questions, “…the determination of whether or not to implement the evaluation [was] based on the availability of data to support the interrupted times series (ITS).” The response specified that the necessary data for this analysis included 50 magnet schools and 100 non-magnet comparison schools and that (1) each magnet school must be accompanied by one or more non-magnet comparison schools from the same district with similar demographic and achievement profiles, (2) the magnet and comparison schools must have existed and administered the same standardized tests to their students for at least 3 years prior to and 3 years after the magnet conversion date, and (3) the districts must be able and willing to provide longitudinal individual student records data. These criteria were established based on prior power calculations that demonstrated this overall sample (50 magnets, 100 comparison schools) would be sufficient to detect an MDE of .19 for a sub-sample of approximately 20%.
However, subsequent to the submission of that response, we and our contractor refined the power calculations for the ITS and tailored these calculations to focus on (1) estimation of effects on the large group of resident students (ED’s greatest policy interest), rather than smaller subsamples, and (2) the particular schools that are eligible and willing to participate in the study. The original calculations had been overly conservative in their assumptions (about the R-squared, intra-cohort correlation, etc.) because they were based on a limited set of published data that were not particularly aligned with our study parameters. The new power calculations draw on a wider set of information, including published data for the specific sample of magnet schools recruited; these new calculations indicate that we would need substantially fewer schools, 15-16 magnet schools and 32-34 comparison schools (depending on reading or math outcome), to achieve an MDE of 0.20 for the resident student sample, even if not all of the schools have a full 3 years of baseline (pre-grant) achievement data (see appendix).
According to the criteria established earlier:
1) We have identified 23 conversion magnet schools and 48 comparison schools in 13 districts, an average of 2.2 comparison schools per magnet school. That full set of schools will be used for an analysis of math achievement gains, while 21 have the data to conduct the analysis of reading achievement gains.
2) Among the identified schools/districts, we have an average of 2.6 years of baseline data and expect to collect the full three years of post-grant data.
3) The 13 districts in the identified sample have agreed to provide the longitudinal data. We have another 2 districts we believe are eligible for the study, and are pursuing their cooperation; if they are included in the study, the MDE will be reduced.
Overall, with our current sample of schools that meet criteria and are willing to participate, we will be able to detect an MDE of .167 for resident students over the three-year period of the MSAP grant (see power analysis results, Appendix B Table 2). Although our primary analysis will focus on the resident students as a whole, we will still be able to detect effects for subgroups of 30% and likely less. This would allow us the opportunity to conduct analyses for specific grade levels and some minority groups.
OMB QUESTION 2 - In addition, please clarify whether the number of identified schools represents those for which participation (via the districts) has been secured, or merely the universe from which NCEE must secure agreement to participate.
IES Response: All 23 + 48 schools in the 13 districts that we previously identified have been screened, determined to have the necessary data, and are willing to participate. These schools/districts have received MSAP grants through the Office of Innovation and Improvement (OII), and OII has encouraged grantee cooperation (e.g., EDGAR requires grantees to participate in a program evaluation if one is conducted). As noted above, there are two other districts that appear eligible and whose agreement to participate we are seeking.
APPENDIX
REVISED POWER CALCULATIONS
To estimate the number of magnet schools needed to yield an MDES of 0.2 or less for the resident student population and various sub-samples, we assumed that the desired sample would resemble, in number of students tested, average number of years of baseline data, and average number of comparison schools, the average characteristics of the sample of magnet schools in our list of eligible magnet schools. For this sample, we calculated the average number of students tested in each of math and English Language Arts in the most recent year available, for the magnet schools and the provisional sample of comparison schools. We assumed that (1) 80% of the students at each school would be resident, (2) on average, there were 2.5 years of baseline test-score data available before the year of magnet conversion, and (3) on average there were two comparison schools for each magnet. (The actual sample means were slightly larger, at 2.6 and 2.2 respectively, but we wanted to be somewhat conservative in our estimates.)
One particularly important parameter in the power calculation is ρ, defined as the proportion of total test score variance that is between cohorts within schools. Unfortunately, there is very little published data to help guide a choice of ρ. For this parameter, for math we used an estimate of 0.02, which is the median estimate obtained by Bloom (1999) in his study of grade 2 and grade 6 math test scores in Rochester, New York. (He obtained the same estimate for both grades.) For reading, we took a simple average of Bloom’s median estimates for grades 2 and 6 in Rochester, plus six other estimates for grade 2 from six other districts around the country, kindly provided by Michael Garet of AIR (with permission of ED). The average of these was 0.022, which is considerably above the Rochester results of roughly 0.0025. We emphasize that we have used all the estimates of ρ of which we are aware. (We checked with Howard Bloom, for example, and he confirmed that the Rochester estimates in his 1999 paper are the only estimates of which he is aware.)
Another important parameter is the variance across magnet schools in the true effects of converting a school into a magnet, which is defined in Appendix A of the design document for this study (Bloom, Doolittle, Garet, Christenson and Eaton, 2004). The design document, lacking any information on the value of this variance, “guesstimated” a value of 0.01, which is what we have used in our main power calculations. The authors chose this figure on the presumption that a reasonable 95% confidence interval for the true effects of magnets might be -0.05 to 0.35 (centered on a mean effect of 0.15, which, as cited elsewhere in their report, is the effect size of a full year of school on math achievement and the effects estimated in the Tennessee class-size reduction experiment). The 95% confidence interval suggests that the true effect has a standard deviation of 0.1 and a variance of 0.1² = 0.01. This estimate of variance in the true effects is fairly large, in the sense that sometimes a school that becomes a magnet performs slightly worse, and in some cases substantially better (+0.35 effect size). This is a conservative estimate in terms of our power analysis because the number of schools needed to obtain a given MDES rises with this variance.
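The arithmetic behind the 0.01 figure can be written out as follows. Here μ and s are our own labels for the mean and standard deviation of the true effects (the design document's notation may differ), and the 95% interval is taken as roughly the mean plus or minus two standard deviations, an approximation to the usual 1.96.

```latex
\mu \pm 2s = [-0.05,\ 0.35], \qquad \mu = 0.15
\quad\Longrightarrow\quad
s = \frac{0.35 - (-0.05)}{4} = 0.1, \qquad s^{2} = 0.1^{2} = 0.01
```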
Our estimates of the number of schools needed to reach an MDES of 0.2 or lower are probably conservative (that is, on the high side). First, recall from above that we assume 2.5 years of baseline data on average and 2 comparison schools per magnet, both of which are below the means of 2.6 and 2.2 respectively. We also assume that only 80% of students tested will be relevant. This is likely to be true for the magnet schools, when we study only resident students. It is less clear to us that we will want to exclude nonresident students from comparison schools, or even that many of the comparison schools will have any nonresident students to speak of. A less conservative but still reasonable estimate is that 90% of tested students could be included in our analysis. Finally, the design document assumed a value of 0.01 for the variance in true effects across schools, which reflects an interest in estimating the average impact for the population of magnet schools from which the schools in the sample were drawn. If we instead set this variance to 0, we are focusing on estimating the average effect for the particular schools in our sample, which may be more appropriate.
Sample Required for an MDES of 0.2 for Resident Students
To calculate the number of magnet schools required to provide a Minimum Detectable Effect Size (MDES) of 0.2 or less, we drew on data from the sample of magnet schools that met study eligibility requirements. In particular, we based several key assumptions on characteristics of the sample, including the number of students tested in each of math and English Language Arts in the most recent year available; the number of years of baseline achievement data available; and the number of comparison schools available. Based on these calculations, Table 1 shows that, at a minimum, we need 15-16 magnet schools in order to detect an effect size of 0.2 for the overall resident student population (pooled across all grades tested).
Table 1 Number of Magnet Schools Needed to Yield an MDES of 0.2 or Less, Based on Characteristics of Magnet Schools Already in Our Sample
| Subgroup Size as % of Full Student Sample | English Language Arts Minimum Number of Magnet Schools Needed | Mathematics Minimum Number of Magnet Schools Needed | 
| 20 | 25 | 24 | 
| 30 | 21 | 20 | 
| 40 | 19 | 18 | 
| 50 | 18 | 17 | 
| Full Resident Sample | 16 | 15 | 
Notes: Calculations assume subgroups are equally distributed across magnet and comparison schools. MDES based on 80% power and alpha of 0.05. Our calculations are based on the characteristics of all magnet schools and comparison schools in our sample.
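To make the "80% power and alpha of 0.05" convention concrete, the sketch below computes the multiplier that converts a standard error into an MDES. It assumes a two-tailed test and a normal approximation, neither of which the memo states explicitly; with few schools, a t-based multiplier would be somewhat larger. The function names are hypothetical, and the implied standard error is only a back-of-envelope figure derived from the reported MDES of 0.167, not a value taken from the study.

```python
from statistics import NormalDist

def mdes_multiplier(alpha=0.05, power=0.80, two_tailed=True):
    """Sum of the critical value and the power quantile (normal approximation)."""
    standard_normal = NormalDist()
    tail_area = alpha / 2 if two_tailed else alpha
    return standard_normal.inv_cdf(1 - tail_area) + standard_normal.inv_cdf(power)

def mdes(standard_error, alpha=0.05, power=0.80):
    """MDES = multiplier x standard error of the impact estimate (in effect size units)."""
    return mdes_multiplier(alpha, power) * standard_error

multiplier = mdes_multiplier()
print(f"MDES multiplier: {multiplier:.2f}")

# Back-of-envelope: the standard error implied by an MDES of 0.167
# under the assumptions above (illustrative only).
implied_se = 0.167 / multiplier
print(f"Implied standard error: {implied_se:.4f}")
```

Under these assumptions the multiplier is about 2.8, so halving the standard error of the impact estimate halves the MDES.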
Power Calculations Based on Screened and Willing Sample
What is the MDES for the sample of magnet schools that we in fact have recruited? We have identified 23 eligible magnets (compared to the 16 needed for an MDES of 0.20). All of these are able to provide consistent data for analysis of math achievement, while 21 can provide data for analysis of reading achievement.
Table 2 MDES Based on Characteristics of Magnet Schools Eligible for Inclusion in Our Sample
| Subgroup Size as % of Full Resident Student Sample | English Language Arts MDES | Mathematics MDES | 
| 20 | 0.211 | 0.209 | 
| 30 | 0.194 | 0.193 | 
| 40 | 0.185 | 0.184 | 
| 50 | 0.179 | 0.178 | 
| Full Resident Sample | 0.167 | 0.167 | 
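As a quick check on how Table 2 compares with the 0.20 target, the following Python sketch encodes the table's values and lists the subgroup sizes whose MDES falls below 0.20 in both subjects. The variable names are our own; the numbers are copied from Table 2.

```python
# MDES values from Table 2, keyed by subgroup size (share of resident sample).
TABLE_2_MDES = {
    "20%": {"ela": 0.211, "math": 0.209},
    "30%": {"ela": 0.194, "math": 0.193},
    "40%": {"ela": 0.185, "math": 0.184},
    "50%": {"ela": 0.179, "math": 0.178},
    "Full Resident Sample": {"ela": 0.167, "math": 0.167},
}

TARGET_MDES = 0.20

# Keep a subgroup only if both subjects meet the target.
meets_target = [
    label
    for label, values in TABLE_2_MDES.items()
    if max(values.values()) < TARGET_MDES
]

print("Subgroup sizes with MDES below 0.20 in both subjects:")
for label in meets_target:
    print(f"  {label}")
```

The 20% subgroup falls just above the target in both subjects, while every tabulated subgroup of 30% or more falls below it.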
Summary: Feasibility of a Comparative Interrupted Time Series Study of Resident Students
The power analysis suggests that we should be able to detect effect sizes as small as 0.167 when we test for an overall effect on resident students. We obtain an MDES smaller than 0.20 when we have sub-samples of 30% or even less. This finding opens up the strong possibility that we can obtain fairly precise estimates of the effects of magnetization for students in individual grades, rather than pooled across grades. Alternatively, we could obtain estimates for demographic subgroups when we pool across grades. We will almost certainly be able to test for an effect on non-white students and white students separately. Depending on the demographics in our final sub-sample, we may be able to break down the non-white category at least into its larger subgroups.
1 The 2007 MSAP grant competition had a competitive priority to encourage schools “in need of improvement” to apply.
2 To be eligible for the study, schools need to have two or three years of assessment data both pre- and post-conversion using the same test and have stable attendance zones. They also need to have at least one, preferably two, comparison schools from the same district as the conversion magnet school.