ANNEX I
FORMULAS FOR ESTIMATING MEANS AND VARIANCES
Exact formulas for variances using our sampling strategy are not available, since we are using a
systematic random sampling procedure (with a random start) in the first stage.[1] As noted in the
text, the estimated variance based on a simple random sample provides a conservative estimate
for the variance with systematic sampling, so we use the formulas associated with (stratified)
simple random sampling in the first stage. The formulas below were derived for our case of a
two-stage simple random sample without replacement, with stratification in both stages (i.e.,
stratification of regions in the first stage and stratification of respondent groups in the second
stage), and are consistent with the approach and formulas found in Särndal, et al. (2003), chapter
4.[2]
For a population of K potential respondents in our sample universe, a consistent and
asymptotically unbiased estimate of the population mean of response variable y is given by:
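1) $\hat{\bar{y}} = \hat{t} / \hat{K}$,
where $\hat{t}$ is the estimate of the population total of y and $\hat{K}$ is the estimate of K. These estimates
are determined by the following formulas:
2) $\hat{t} = \sum_{h=1}^{H} \frac{N_h}{n_h} \sum_{i=1}^{n_h} \sum_{g=1}^{G} M_{ig} \bar{y}_{ig}$
3) $\hat{K} = \sum_{h=1}^{H} \frac{N_h}{n_h} \sum_{i=1}^{n_h} \sum_{g=1}^{G} M_{ig}$,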
where h is the first stage stratum number, H is the total number of first stage strata (= 6 for the
full population), $N_h$ is the total number of towns in stratum h, $n_h$ is the number of sample towns
in stratum h, $M_{ig}$ is the total number of potential respondents of type g in town i, and
$\bar{y}_{ig}$ is the sample mean of y for respondent group g in town i
($\bar{y}_{ig} = \frac{1}{m_{ig}} \sum_{k=1}^{m_{ig}} y_{igk}$, where $m_{ig}$ is the number
of sample respondents of type g in town i and $y_{igk}$ is the value of y for respondent k of type g in
town i).
[1] See Särndal, et al. (2003), section 3.4.4 for an explanation of this point.
[2] Our exact case is not shown in Särndal, et al. (2003), so the formulas were derived using the same approach.
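The point estimate can be illustrated with a minimal Python sketch. It assumes equations 1) to 3) take the usual expansion form implied by the definitions above: each sample town in stratum h is weighted by N_h/n_h, and each group mean is scaled by M_ig. The function name, the tuple layout of `cells`, and the numbers are hypothetical and are not taken from our survey data.

```python
from collections import defaultdict


def estimate_mean(cells, N):
    """Ratio estimate of the population mean of y.

    cells: list of (h, i, g, M_ig, ys), one entry per sampled town/group cell,
           where ys holds the sampled values y_igk (hypothetical layout).
    N:     dict mapping stratum h to N_h, the number of towns in the stratum.
    Assumes every sample town appears in at least one cell, so that n_h can
    be counted from the data.
    """
    towns = defaultdict(set)
    for h, i, _g, _M, _ys in cells:
        towns[h].add(i)

    t_hat = 0.0  # estimated population total of y
    K_hat = 0.0  # estimated number of potential respondents
    for h, _i, _g, M, ys in cells:
        weight = N[h] / len(towns[h])   # N_h / n_h
        y_bar_ig = sum(ys) / len(ys)    # sample mean within town i, group g
        t_hat += weight * M * y_bar_ig
        K_hat += weight * M
    return t_hat / K_hat, t_hat, K_hat


# Made-up illustration: two strata, three sample towns, two respondent groups.
cells = [
    (1, "A", "g1", 20, [1, 3]),
    (1, "A", "g2", 10, [4]),
    (1, "B", "g1", 30, [2, 2]),
    (2, "C", "g1", 5, [6]),
]
N = {1: 10, 2: 4}
mean, t_hat, K_hat = estimate_mean(cells, N)
print(mean, t_hat, K_hat)  # 2.5625 820.0 320.0
```

Because the estimated K appears in the denominator, the estimate is a ratio of two estimated totals, which is why it is described as consistent and asymptotically unbiased rather than exactly unbiased.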
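A consistent and asymptotically unbiased estimator of the variance of the estimated mean is
4) [equation not recoverable from the source],
where [...] is given by
5) [equation not recoverable from the source],
and where [...], [...], [...], and [...].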
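The variance step can be sketched in Python under a stated assumption: the code below uses the textbook estimator for a stratified two-stage simple random sample without replacement, applied to the linearized residuals e_igk = y_igk minus the estimated mean, and divided by the squared estimate of K. This is a standard stand-in that follows the approach of Särndal, et al. (2003), chapter 4; it is not necessarily identical, term by term, to equations 4) and 5). The function name and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import variance


def mean_and_variance(cells, N):
    """Return (estimated mean, estimated variance of that mean).

    cells: list of (h, i, g, M_ig, ys); N: {h: N_h}. Hypothetical layout.
    Assumes every sample town appears in at least one cell.
    """
    towns = defaultdict(set)
    for h, i, _g, _M, _ys in cells:
        towns[h].add(i)
    n = {h: len(ts) for h, ts in towns.items()}

    t_hat = sum(N[h] / n[h] * M * sum(ys) / len(ys) for h, _i, _g, M, ys in cells)
    K_hat = sum(N[h] / n[h] * M for h, _i, _g, M, _ys in cells)
    y_bar = t_hat / K_hat

    town_total = defaultdict(float)  # estimated town total of the residuals
    town_var = defaultdict(float)    # second-stage variance of that total
    for h, i, _g, M, ys in cells:
        m = len(ys)
        e = [y - y_bar for y in ys]  # linearized residuals
        town_total[(h, i)] += M * sum(e) / m
        if m < M:  # a fully enumerated cell adds no second-stage variance
            if m < 2:
                raise ValueError("need at least 2 respondents per sampled cell")
            town_var[(h, i)] += M**2 * (1 - m / M) * variance(e) / m

    v_total = 0.0
    for h, ts in towns.items():
        if n[h] < N[h]:  # a fully enumerated stratum adds no first-stage variance
            if n[h] < 2:
                raise ValueError("need at least 2 sample towns per stratum")
            totals = [town_total[(h, i)] for i in ts]
            v_total += N[h]**2 * (1 - n[h] / N[h]) * variance(totals) / n[h]
        v_total += N[h] / n[h] * sum(town_var[(h, i)] for i in ts)
    return y_bar, v_total / K_hat**2


# Made-up illustration: one stratum of 4 towns, 2 sampled, one group per town.
cells = [(1, "A", "g", 10, [1, 3]), (1, "B", "g", 10, [5, 7])]
y_bar, v = mean_and_variance(cells, {1: 4})
print(y_bar, v)
```

The first term inside the stratum loop captures variation between town totals and the second captures sampling of respondents within towns. Both need at least two sampled units to be computable, so the sketch raises an error rather than silently returning zero.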
Equations 1) to 5) can also be used to estimate the mean and variance of y for
subpopulations of the respondent universe. For example, to estimate the mean and variance for a
subset of the first stage strata (h), the sums in these equations will be over the selected strata,
rather than all six strata. Similarly, to estimate the mean and variance for a subset of the
respondent groups (g), the sums will be over the selected groups rather than all G groups. We
use this fact to estimate the means and variances for subpopulations discussed in the text,
including communities with vs. without a hospital and communities in different regions.
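Restricting the sums to selected strata or respondent groups can be sketched as follows. This is a minimal illustration with hypothetical names and made-up numbers, again assuming the expansion form of equations 1) to 3). Here n_h is passed explicitly rather than counted from the data, so that dropping a respondent group cannot change the number of sample towns in a stratum.

```python
def subpopulation_mean(cells, N, n, strata=None, groups=None):
    """Estimated mean of y, with sums restricted to chosen strata and groups.

    cells:  list of (h, i, g, M_ig, ys) sampled cells (hypothetical layout).
    N, n:   dicts mapping stratum h to N_h and n_h.
    strata: set of strata to keep, or None for all strata.
    groups: set of respondent groups to keep, or None for all groups.
    """
    t_hat = 0.0
    K_hat = 0.0
    for h, _i, g, M, ys in cells:
        if strata is not None and h not in strata:
            continue
        if groups is not None and g not in groups:
            continue
        weight = N[h] / n[h]
        t_hat += weight * M * sum(ys) / len(ys)
        K_hat += weight * M
    if K_hat == 0:
        raise ValueError("no sampled cells in the requested subpopulation")
    return t_hat / K_hat


cells = [
    (1, "A", "g1", 20, [1, 3]),
    (1, "A", "g2", 10, [4]),
    (1, "B", "g1", 30, [2, 2]),
    (2, "C", "g1", 5, [6]),
]
N = {1: 10, 2: 4}
n = {1: 2, 2: 1}
print(subpopulation_mean(cells, N, n))                 # all strata, all groups
print(subpopulation_mean(cells, N, n, strata={1}))     # stratum 1 only
print(subpopulation_mean(cells, N, n, groups={"g2"}))  # group g2 only
```

The same restriction would apply to the variance calculation: the sums over h and g run only over the selected strata and groups, and the residuals are taken around the subpopulation mean.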
File Modified | 2012-05-23 |
File Created | 2012-05-23 |