Attachment F
MEMORANDUM FOR:
Lisa Clement
Survey Director for CPS & Time Use, Associate Directorate for
Demographic Programs
From:
James Treat
Chief, Demographic Statistical Methods Division
Subject:
Source and Accuracy Statement for the June 2014 CPS Microdata
File on Fertility and Birth Expectation
Attached is the statement on the source of the data and accuracy of the estimates for the June
2014 CPS Microdata File on Fertility and Birth Expectation.
If you have any questions or need additional information, please contact Stephen Clark of the
Demographic Statistical Methods Division via email at dsmd.source.and.accuracy@census.gov.
Attachment
cc:
G. Weyland
T. Marshall
K. Woods
R. Kreider
W. Savino
J. Farber
(ADDP)
(SEHSD)
(ACSD)
(DSMD)
Y. Cheng
T. Kennel
J. Scott
D. Hornick
S. Clark
R. Hoop
Source of the Data and Accuracy of the Estimates for the June 2014 CPS
Microdata File on Fertility and Birth Expectation
Table of Contents
SOURCE OF THE DATA
Basic CPS
June 2014 Supplement
Estimation Procedure
ACCURACY OF THE ESTIMATES
Sampling Error
Nonsampling Error
Nonresponse
Sufficient Partial Interview
Coverage
Comparability of Data
A Nonsampling Error Warning
Standard Errors and Their Use
Estimating Standard Errors
Generalized Variance Parameters
Standard Errors of Estimated Numbers
Standard Errors of Estimated Percentages
Standard Errors of Estimated Differences
Standard Errors of Ratios
Standard Errors of Fertility Ratios
Standard Errors of Quarterly or Yearly Averages
Accuracy of State Estimates
Standard Errors of State Estimates
Standard Errors of Regional Estimates
Standard Errors of Groups of States
Technical Assistance
REFERENCES
Tables
Table 1. CPS Coverage Ratios: June 2014
Table 2. Estimation Groups of Interest and Generalized Variance Parameters
Table 3. Parameters for Computation of Standard Errors for Labor Force Characteristics: June 2014
Table 4. Parameters for Computation of Standard Errors for Fertility and Birth Expectation Characteristics: June 2014
Table 5. Parameters for Computation of Standard Errors for Fertility Ratios: June 2014
Table 6. Factors and Populations for State Standard Errors and Parameters: June 2014
Table 7. Factors and Populations for Regional Standard Errors and Parameters: June 2014
SOURCE OF THE DATA
The data in this microdata file are from the June 2014 Current Population Survey (CPS). The
U.S. Census Bureau conducts the CPS every month, although this file has only June data. The
June survey uses two sets of questions, the basic CPS and a set of supplemental questions. The
CPS, sponsored jointly by the Census Bureau and the U.S. Bureau of Labor Statistics, is the
country’s primary source of labor force statistics for the entire population. The Census Bureau
and the U.S. Bureau of Labor Statistics also jointly sponsor the supplemental questions for June.
Basic CPS. The monthly CPS collects primarily labor force data about the civilian
noninstitutionalized population living in the United States. The institutionalized population,
which is excluded from the population universe, is composed primarily of the population in
correctional institutions and nursing homes (98 percent of the 4.0 million institutionalized people
in Census 2010). Interviewers ask questions concerning labor force participation about each
member 15 years old and over in sample households. Typically, the week containing the
nineteenth of the month is the interview week. The week containing the twelfth is the reference
week (i.e., the week about which the labor force questions are asked).
The CPS uses a multistage probability sample based on the results of the decennial census, with
coverage in all 50 states and the District of Columbia. The sample is continually updated to
account for new residential construction. When files from the most recent decennial census
become available, the Census Bureau gradually introduces a new sample design for the CPS.
Every ten years, the CPS first-stage sample is redesigned¹ to reflect changes based on the most
recent decennial census. In the first stage of the sampling process, primary sampling units
(PSUs)² were selected for sample. In the 2000 design, the United States was divided into 2,025
PSUs. These were then grouped into 824 strata and one PSU was selected for sample from each
stratum. In the 2010 sample design, the United States was divided into 1,987 PSUs. These PSUs
were then grouped into 852 strata. Within each stratum, a single PSU was chosen for the sample,
with its probability of selection proportional to its population as of the most recent decennial
census. In the case of strata consisting of only one PSU, the PSU was chosen with certainty.
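The selection rule described above (one PSU per stratum, chosen with probability proportional to population) can be sketched as follows. This is only an illustration of the idea, not the Census Bureau's production procedure; the function name `select_psu`, the stratum labels, and all population figures are hypothetical.

```python
import random

def select_psu(stratum, rng):
    """Pick one PSU from a stratum with probability proportional to population.

    `stratum` maps a PSU label to its census population (hypothetical data).
    A stratum holding a single PSU returns that PSU with certainty.
    """
    labels = list(stratum)
    populations = [stratum[label] for label in labels]
    return rng.choices(labels, weights=populations, k=1)[0]

rng = random.Random(2014)  # fixed seed so the sketch is reproducible

# Hypothetical strata; the real design groups 1,987 PSUs into 852 strata.
strata = {
    "stratum_1": {"psu_a": 120_000, "psu_b": 80_000, "psu_c": 50_000},
    "stratum_2": {"psu_d": 900_000},  # single-PSU stratum, chosen with certainty
}

sample = {name: select_psu(psus, rng) for name, psus in strata.items()}

# Over many repeated draws, psu_a should be chosen about 120/250 = 48% of the time.
draws = [select_psu(strata["stratum_1"], rng) for _ in range(10_000)]
share_a = draws.count("psu_a") / len(draws)
```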
In April 2014, the Census Bureau began phasing out the 2000 sample and replacing it with the
2010 sample, creating a mixed sampling frame. Two simultaneous changes occur during this
phase-in period. First, within the PSUs selected for both the 2000 and 2010 designs, sample
households from the 2010 design gradually replace sample households selected for the 2000
design. Second, new PSUs selected for only the 2010 design gradually replace outgoing PSUs
selected for only the 2000 design. By July 2015, the new 2010 sample design will be completely
implemented and the sample will come entirely from the 2010 redesigned sample.
¹ For detailed information on the 2010 sample redesign, please see reference [1].
² The PSUs correspond to substate areas (i.e., counties or groups of counties) that are geographically contiguous.
Approximately 73,000 housing units were selected for sample from the sampling frame in June.
Based on eligibility criteria, 11 percent of these housing units were sent directly to computer-assisted
telephone interviewing (CATI). The remaining units were assigned to interviewers for
computer-assisted personal interviewing (CAPI).³ Of all housing units in sample, about 60,000
were determined to be eligible for interview. Interviewers obtained interviews at about 54,000 of
these units. Noninterviews occur when the occupants are not found at home after repeated calls
or are unavailable for some other reason.
June 2014 Supplement. In June 2014, in addition to the basic CPS questions, interviewers
asked supplementary questions on fertility to women 15 to 50 years of age.
Estimation Procedure. This survey’s estimation procedure adjusts weighted sample results to
agree with independently derived population estimates of the civilian noninstitutionalized
population of the United States and each state (including the District of Columbia). These
population estimates, used as controls for the CPS, are prepared monthly to agree with the most
current set of population estimates that are released as part of the Census Bureau’s population
estimates and projections program.
The population controls for the nation are distributed by demographic characteristics in two
ways:
• Age, sex, and race (White alone, Black alone, and all other groups combined).
• Age, sex, and Hispanic origin.
The population controls for the states are distributed by race (Black alone and all other race
groups combined), age (0-15, 16-44, and 45 and over), and sex.
The independent estimates by age, sex, race, and Hispanic origin, and for states by selected age
groups and broad race categories, are developed using the basic demographic accounting formula
whereby the population from the 2010 Census data is updated using data on the components of
population change (births, deaths, and net international migration) with net internal migration as
an additional component in the state population estimates.
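The basic demographic accounting formula mentioned above can be written as a one-line update. The sketch below is a minimal illustration; the function name and every input figure are hypothetical and are not taken from the Census Bureau's population estimates program.

```python
def update_population(base, births, deaths, net_international_migration,
                      net_internal_migration=0):
    """Roll a census base population forward using the components of change.

    National estimates use births, deaths, and net international migration.
    State estimates add net internal migration as a further component.
    """
    return (base + births - deaths
            + net_international_migration + net_internal_migration)

# Hypothetical state-level example (all figures invented for illustration).
state_estimate = update_population(
    base=5_000_000,
    births=60_000,
    deaths=45_000,
    net_international_migration=12_000,
    net_internal_migration=-7_000,
)
```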
The net international migration component in the population estimates includes a combination of
the following:
• Legal migration to the United States.
• Emigration of foreign-born and native people from the United States.
• Net movement between the United States and Puerto Rico.
• Estimates of temporary migration.
• Estimates of net residual foreign-born population, which include unauthorized
  migration.
³ For further information on CATI and CAPI and the eligibility criteria, please see reference [2].
Because the latest available information on these components lags the survey date, it is necessary
to make short-term projections of these components to develop the estimate for the survey date.
ACCURACY OF THE ESTIMATES
A sample survey estimate has two types of error: sampling and nonsampling. The accuracy of an
estimate depends on both types of error. The nature of the sampling error is known given the
survey design; the full extent of the nonsampling error is unknown.
Sampling Error. Since the CPS estimates come from a sample, they may differ from figures
from an enumeration of the entire population using the same questionnaires, instructions, and
enumerators. For a given estimator, the difference between an estimate based on a sample and
the estimate that would result if the sample were to include the entire population is known as
sampling error. Standard errors, as calculated by methods described in “Standard Errors and
Their Use,” are primarily measures of the magnitude of sampling error. However, they may
include some nonsampling error.
Nonsampling Error. For a given estimator, the difference between the estimate that would
result if the sample were to include the entire population and the true population value being
estimated is known as nonsampling error. There are several sources of nonsampling error that
may occur during the development or execution of the survey. It can occur because of
circumstances created by the interviewer, the respondent, the survey instrument, or the way the
data are collected and processed. For example, errors could occur because:
• The interviewer records the wrong answer, the respondent provides incorrect
  information, the respondent estimates the requested information, or an unclear
  survey question is misunderstood by the respondent (measurement error).
• Some individuals that should have been included in the survey frame were missed
  (coverage error).
• Responses are not collected from all those in the sample or the respondent is
  unwilling to provide information (nonresponse error).
• Values are estimated imprecisely for missing data (imputation error).
• Forms may be lost, data may be incorrectly keyed, coded, or recoded, etc.
  (processing error).
To minimize these errors, the Census Bureau applies quality control procedures during all stages
of the production process, including the design of the surveys, the wording of questions, the
review of the work of interviewers and coders, and the statistical review of reports.
Two types of nonsampling error that can be examined to a limited extent are nonresponse and
undercoverage.
Nonresponse. The effect of nonresponse cannot be measured directly, but one indication of its
potential effect is the nonresponse rate. For the June 2014 basic CPS, the household-level
nonresponse rate was 11.12 percent. The person-level nonresponse rate for the fertility
supplement was an additional 6.0 percent.
Since the basic CPS nonresponse rate is a household-level rate and the fertility supplement
nonresponse rate is a person-level rate, we cannot combine these rates to derive an overall
nonresponse rate. Nonresponding households may have fewer persons than interviewed ones, so
combining these rates may lead to an overestimate of the true overall nonresponse rate for
persons for the fertility supplement.
Sufficient Partial Interview. A sufficient partial interview is an incomplete interview in which
the household or person answered enough of the questionnaire for the supplement sponsor to
consider the interview complete. The remaining supplement questions may have been edited or
imputed to fill in missing values. Insufficient partial interviews are considered to be
nonrespondents. Refer to the supplement overview attachment in the technical documentation for
the specific questions deemed critical by the sponsor as necessary to be answered in order to be
considered a sufficient partial interview.
Coverage. The concept of coverage in the survey sampling process is the extent to which the
total population that could be selected for sample “covers” the survey’s target population.
Missed housing units and missed people within sample households create undercoverage in the
CPS. Overall CPS undercoverage for June 2014 is estimated to be about 13 percent. CPS
coverage varies with age, sex, and race. Generally, coverage is larger for females than for males
and larger for non-Blacks than for Blacks. This differential coverage is a general problem for
most household-based surveys.
The CPS weighting procedure partially corrects for bias from undercoverage, but biases may still
be present when people who are missed by the survey differ from those interviewed in ways
other than age, race, sex, Hispanic origin, and state of residence. How this weighting procedure
affects other variables in the survey is not precisely known. All of these considerations affect
comparisons across different surveys or data sources.
A common measure of survey coverage is the coverage ratio, calculated as the estimated
population before poststratification divided by the independent population control. Table 1
shows June 2014 CPS coverage ratios by age and sex for certain race and Hispanic groups. The
CPS coverage ratios can exhibit some variability from month to month.
Table 1. CPS Coverage Ratios: June 2014
Age      Total                       White only      Black only      Residual race   Hispanic
group    All people  Male    Female  Male    Female  Male    Female  Male    Female  Male    Female
0-15     0.86        0.86    0.87    0.90    0.91    0.75    0.74    0.76    0.81    0.84    0.87
16-19    0.87        0.87    0.86    0.89    0.88    0.81    0.82    0.81    0.77    0.85    0.85
20-24    0.73        0.72    0.74    0.74    0.78    0.65    0.65    0.63    0.63    0.74    0.78
25-34    0.82        0.79    0.84    0.83    0.88    0.64    0.73    0.71    0.74    0.74    0.86
35-44    0.86        0.84    0.89    0.88    0.91    0.70    0.83    0.72    0.76    0.80    0.88
45-54    0.89        0.88    0.91    0.89    0.92    0.84    0.85    0.88    0.86    0.76    0.86
55-64    0.90        0.88    0.92    0.91    0.94    0.77    0.84    0.77    0.79    0.87    0.89
65+      0.93        0.93    0.92    0.93    0.92    0.92    0.96    0.95    0.88    0.90    0.90
15+      0.87        0.85    0.88    0.88    0.90    0.76    0.81    0.77    0.78    0.80    0.87
0+       0.87        0.85    0.88    0.88    0.90    0.75    0.80    0.77    0.79    0.81    0.86
Notes: (1) The Residual race group includes cases indicating a single race other than White or Black,
           and cases indicating two or more races.
       (2) Hispanics may be any race. For a more detailed discussion on the use of parameters for
           race and ethnicity, please see the “Generalized Variance Parameters” section.
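The coverage ratio defined in this section is a simple quotient. A minimal sketch, assuming a hypothetical function name and invented input figures (they are not CPS data):

```python
def coverage_ratio(estimate_before_poststratification, population_control):
    """Coverage ratio: weighted estimate before poststratification
    divided by the independent population control."""
    if population_control <= 0:
        raise ValueError("population control must be positive")
    return estimate_before_poststratification / population_control

# Hypothetical cell: the survey weights sum to 8.7 million people before
# poststratification, against an independent control of 10.0 million.
ratio = coverage_ratio(8_700_000, 10_000_000)
undercoverage_percent = (1 - ratio) * 100  # share of the control not covered
```

A ratio below 1.0, as in every cell of Table 1, indicates undercoverage for that group.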
Comparability of Data. Data obtained from the CPS and other sources are not entirely
comparable. This results from differences in interviewer training and experience and in differing
survey processes. This is an example of nonsampling variability not reflected in the standard
errors. Therefore, caution should be used when comparing results from different sources.
Data users should be careful when comparing the data from this microdata file, which reflects
Census 2010-based controls, with microdata files from January 2003 through December 2011,
which reflect 2000 census-based controls. Ideally, the same population controls should be used
when comparing any estimates. In reality, the use of the same population controls is not
practical when comparing trend data over a period of 10 to 20 years. Thus, when it is necessary
to combine or compare data based on different controls or different designs, data users should be
aware that changes in weighting controls or weighting procedures can create small differences
between estimates. See the discussion following for information on comparing estimates derived
from different controls or different sample designs.
Microdata files from previous years reflect the latest available census-based controls. Although
the most recent change in population controls had relatively little impact on summary measures
such as averages, medians, and percentage distributions, it did have a significant impact on
levels. For example, use of Census 2010-based controls results in about a 0.2 percent increase
from the 2000 census-based controls in the civilian noninstitutionalized population and in the
number of families and households. Thus, estimates of levels for data collected in 2012 and later
years will differ from those for earlier years by more than what could be attributed to actual
changes in the population. These differences could be disproportionately greater for certain
population subgroups than for the total population.
Users should also exercise caution because of changes caused by the phase-in of the Census
2000 files (see “Basic CPS”).⁴ During this time period, CPS data were collected from sample
designs based on different censuses. Three features of the new CPS design have the potential of
affecting published estimates: (1) the temporary disruption of the rotation pattern from August
2004 through June 2005 for a comparatively small portion of the sample, (2) the change in
sample areas, and (3) the introduction of the new Core-Based Statistical Areas (formerly called
metropolitan areas). Most of the known effect on estimates during and after the sample redesign
will be the result of changing from 1990 to 2000 geographic definitions. Research has shown
that the national-level estimates of the metropolitan and nonmetropolitan populations should not
change appreciably because of the new sample design. However, users should still exercise
caution when comparing metropolitan and nonmetropolitan estimates across years with a design
change, especially at the state level.
Caution should also be used when comparing Hispanic estimates over time. No independent
population control totals for people of Hispanic origin were used before 1985.
A Nonsampling Error Warning. Since the full extent of the nonsampling error is unknown,
one should be particularly careful when interpreting results based on small differences between
estimates. The Census Bureau recommends that data users incorporate information about
nonsampling errors into their analyses, as nonsampling error could impact the conclusions drawn
from the results. Caution should also be used when interpreting results based on a relatively
small number of cases. Summary measures (such as medians and percentage distributions)
probably do not reveal useful information when computed on a subpopulation smaller than
75,000.
For additional information on nonsampling error including the possible impact on CPS
data when known, refer to references [2] and [3].
Standard Errors and Their Use. The sample estimate and its standard error enable one to
construct a confidence interval. A confidence interval is a range about a given estimate that has
a specified probability of containing the average result of all possible samples. For example, if
all possible samples were surveyed under essentially the same general conditions and using the
same sample design, and if an estimate and its standard error were calculated from each sample,
then approximately 90 percent of the intervals from 1.645 standard errors below the estimate to
1.645 standard errors above the estimate would include the average result of all possible samples.
A particular confidence interval may or may not contain the average estimate derived from all
possible samples, but one can say with specified confidence that the interval includes the average
estimate calculated from all possible samples.
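The 90-percent interval described above runs from 1.645 standard errors below the estimate to 1.645 standard errors above it. A minimal sketch, with a hypothetical function name and invented inputs:

```python
Z_90 = 1.645  # multiplier for a 90-percent confidence interval, as used in this statement

def confidence_interval(estimate, standard_error, z=Z_90):
    """Return (lower, upper) bounds: estimate +/- z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Hypothetical estimate of 50.0 percent with a standard error of 2.0 points.
lower, upper = confidence_interval(50.0, 2.0)
```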
Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing
between population parameters using sample estimates. The most common type of hypothesis is
that the population parameters are different. An example of this would be comparing the
percentage of men who were part-time workers to the percentage of women who were part-time
workers.
⁴ The phase-in process using the 2010 Census files began in April 2014.
Tests may be performed at various levels of significance. A significance level is the probability
of concluding that the characteristics are different when, in fact, they are the same. For example,
to conclude that two characteristics are different at the 0.10 level of significance, the absolute
value of the estimated difference between characteristics must be greater than or equal to 1.645
times the standard error of the difference.
The Census Bureau uses 90-percent confidence intervals and 0.10 levels of significance to
determine statistical validity. Consult standard statistical textbooks for alternative criteria.
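The decision rule at the 0.10 level of significance can be sketched as a single comparison. The function name and the example values are hypothetical; the standard error of the difference is assumed to have been computed separately.

```python
def is_significant(estimate_1, estimate_2, se_difference, z=1.645):
    """True when the absolute difference is at least z standard errors,
    i.e. the two estimates differ at the 0.10 level when z = 1.645."""
    return abs(estimate_1 - estimate_2) >= z * se_difference

# Hypothetical percentages with a standard error of the difference of 2.0 points.
# The threshold is 1.645 * 2.0 = 3.29 points.
small_gap = is_significant(40.0, 37.0, 2.0)  # 3.0 < 3.29
large_gap = is_significant(40.0, 36.0, 2.0)  # 4.0 >= 3.29
```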
Estimating Standard Errors. The Census Bureau uses replication methods to estimate the
standard errors of CPS estimates. These methods primarily measure the magnitude of sampling
error. However, they do measure some effects of nonsampling error as well. They do not
measure systematic biases in the data associated with nonsampling error. Bias is the average
over all possible samples of the differences between the sample estimates and the true value.
Generalized Variance Parameters. While it is possible to compute and present an estimate of
the standard error based on the survey data for each estimate in a report, there are a number of
reasons why this is not done. A presentation of the individual standard errors would be of
limited use, since one could not possibly predict all of the combinations of results that may be of
interest to data users. Additionally, data users have access to CPS microdata files, and it is
impossible to compute in advance the standard error for every estimate one might obtain from
those data sets. Moreover, variance estimates are based on sample data and have variances of
their own. Therefore, some methods of stabilizing these estimates of variance, for example, by
generalizing or averaging over time, may be used to improve their reliability.
Experience has shown that certain groups of estimates have similar relationships between their
variances and expected values. Modeling or generalizing may provide more stable variance
estimates by taking advantage of these similarities. The generalized variance function is a
simple model that expresses the variance as a function of the expected value of the survey
estimate. The parameters of the generalized variance function are estimated using direct
replicate variances. These generalized variance parameters provide a relatively easy method to
obtain approximate standard errors for numerous characteristics. In this source and accuracy
statement, Table 3 provides the generalized variance parameters for labor force estimates, and
Tables 4 and 5 provide generalized variance parameters for characteristics from the June 2014
supplement. Tables 6 and 7 provide factors and population controls to derive U.S. state and
regional parameters.
The basic CPS questionnaire records the race and ethnicity of each respondent. With respect to
race, a respondent can be White, Black, Asian, American Indian and Alaska Native (AIAN),
Native Hawaiian and Other Pacific Islander (NHOPI), or combinations of two or more of the
preceding. A respondent’s ethnicity can be Hispanic or non-Hispanic, regardless of race.
The generalized variance parameters to use in computing standard errors are dependent upon the
race/ethnicity group of interest. The following table summarizes the relationship between the
race/ethnicity group of interest and the generalized variance parameters to use in standard error
calculations.
Table 2. Estimation Groups of Interest and Generalized Variance Parameters
Race/ethnicity group of interest                              Generalized variance parameters to
                                                              use in standard error calculations
Total population                                              Total or White
White alone, White AOIC, or White non-Hispanic population     Total or White
Black alone, Black AOIC, or Black non-Hispanic population     Black
Asian alone, Asian AOIC, or Asian non-Hispanic population     Asian, AIAN, NHOPI
AIAN alone, AIAN AOIC, or AIAN non-Hispanic population        Asian, AIAN, NHOPI
NHOPI alone, NHOPI AOIC, or NHOPI non-Hispanic population     Asian, AIAN, NHOPI
Populations from other race groups                            Asian, AIAN, NHOPI
Hispanic population                                           Hispanic
Two or more races – employment/unemployment and
  educational attainment characteristics                      Black
Two or more races – all other characteristics                 Asian, AIAN, NHOPI
Notes: (1) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other
Pacific Islander.
(2) AOIC is an abbreviation for alone or in combination. The AOIC population for a race group
of interest includes people reporting only the race group of interest (alone) and people
reporting multiple race categories including the race group of interest (in combination).
(3) Hispanics may be any race.
(4) Two or more races refers to the group of cases self-classified as having two or more races.
Standard Errors of Estimated Numbers. The approximate standard error, s_x, of an estimated
number from this microdata file can be obtained by using the formula:
$$ s_x = \sqrt{a x^2 + b x} \tag{1} $$
Here x is the size of the estimate and a and b are the parameters in Table 3 or 4 associated with
the particular type of characteristic. When calculating standard errors from cross-tabulations
involving different characteristics, use the set of parameters for the characteristic that will give
the largest standard error.
Illustration 1
Suppose there were 3,246,000 unemployed women of ages 15 to 44 in the civilian labor force.
Use the appropriate parameters from Table 3 and Formula (1) to get
Number of unemployed women ages 15 to
44 in the civilian labor force (x)          3,246,000
a parameter (a)                             -0.000028
b parameter (b)                             2,788
Standard error                              94,000
90-percent confidence interval              3,091,000 to 3,401,000
The standard error is calculated as
$$ s_x = \sqrt{-0.000028 \times 3{,}246{,}000^2 + 2{,}788 \times 3{,}246{,}000} \approx 94{,}000 $$
The 90-percent confidence interval is calculated as 3,246,000 ± 1.645 × 94,000.
A conclusion that the average estimate derived from all possible samples lies within a range
computed in this way would be correct for roughly 90 percent of all possible samples.
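The arithmetic of Illustration 1 can be checked with a short script. This is a sketch only; the function name is hypothetical, and the a and b values are the ones quoted in the illustration.

```python
import math

def se_estimated_number(x, a, b):
    """Formula (1): s_x = sqrt(a * x^2 + b * x)."""
    return math.sqrt(a * x**2 + b * x)

x = 3_246_000  # unemployed women ages 15 to 44 in the civilian labor force
se = se_estimated_number(x, a=-0.000028, b=2_788)

# The illustration rounds the standard error to the nearest thousand
# before building the 90-percent confidence interval.
se_rounded = round(se, -3)
lower = x - 1.645 * se_rounded
upper = x + 1.645 * se_rounded
```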
Standard Errors of Estimated Percentages. The reliability of an estimated percentage,
computed using sample data for both numerator and denominator, depends on both the size of
the percentage and its base. Estimated percentages are relatively more reliable than the
corresponding estimates of the numerators of the percentages, particularly if the percentages are
50 percent or more. When the numerator and denominator of the percentage are in different
categories, use the parameter from Table 3 or 4 as indicated by the numerator.
The approximate standard error, s_{y,p}, of an estimated percentage can be obtained by using the
formula:
$$ s_{y,p} = \sqrt{\frac{b}{y}\, p\,(100 - p)} \tag{2} $$
Here y is the total number of people, families, households, or unrelated individuals in the base of
the percentage, p is the percentage 100*x/y (0 ≤ p ≤ 100), and b is the parameter in Table 3 or 4
associated with the characteristic in the numerator of the percentage.
Illustration 2
Suppose that 31.2 percent of the 62,683,000 women 15 to 44 years old were married when the
first child was born. Use the appropriate parameter from Table 4 and Formula (2) to get
Percentage of women aged 15-44 who were
married when the first child was born (p)   31.2
Base (y)                                    62,683,000
b parameter (b)                             2,016
Standard error                              0.26
90-percent confidence interval              30.8 to 31.6
The standard error is calculated as
$$ s_{y,p} = \sqrt{\frac{2{,}016}{62{,}683{,}000} \times 31.2 \times (100.0 - 31.2)} \approx 0.26 $$
The 90-percent confidence interval for the estimated percentage of women aged 15 to 44 who
were married when the first child was born is from 30.8 to 31.6 percent (i.e., 31.2 ± 1.645 ×
0.26).
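Illustration 2 can be reproduced the same way. The function name below is hypothetical; the inputs are the figures quoted in the illustration.

```python
import math

def se_estimated_percentage(p, y, b):
    """Formula (2): s_{y,p} = sqrt((b / y) * p * (100 - p))."""
    return math.sqrt((b / y) * p * (100.0 - p))

p = 31.2          # percentage married when the first child was born
y = 62_683_000    # base: women 15 to 44 years old
se = se_estimated_percentage(p, y, b=2_016)

# The illustration rounds the standard error to two decimals first.
se_rounded = round(se, 2)
lower = p - 1.645 * se_rounded
upper = p + 1.645 * se_rounded
```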
Standard Errors of Estimated Differences. The standard error of the difference between two
sample estimates is approximately equal to
$$ s_{x_1 - x_2} = \sqrt{s_{x_1}^2 + s_{x_2}^2} \tag{3} $$
where s_{x_1} and s_{x_2} are the standard errors of the estimates x_1 and x_2. The estimates can be
numbers, percentages, ratios, etc. Formula (3) gives an accurate estimate of the standard error of
the difference between estimates of the same characteristic in two different areas, or of the
difference between separate and uncorrelated characteristics in the same area. However, if there is a high positive (negative)
correlation between the two characteristics, the formula will overestimate (underestimate) the
true standard error.
Illustration 3
Suppose that of the 6,709,000 women in 2014 between 20-29 years of age who were ever
married, 65.7 percent were in the labor force, and of the 6,949,000 women in 2012 between 20-29
years of age who were ever married, 66.0 percent were in the labor force. Use the appropriate
parameters from Table 3 and Formulas (2) and (3) to get
                                    2014 (x1)       2012 (x2)       Difference
Percentage of women aged 20-29
ever married in the labor force (p) 65.7            66.0            -0.3
Base                                6,709,000       6,949,000
b parameter (b)                     2,788           2,788
Standard error                      0.97            0.95            1.36
90-percent confidence interval      64.1 to 67.3    64.4 to 67.6    -2.5 to 1.9
The standard error of the difference is calculated as
$$ s_{x_1 - x_2} = \sqrt{0.97^2 + 0.95^2} \approx 1.36 $$
The 90-percent confidence interval around the difference is calculated as -0.3 ± 1.645 × 1.36.
Since this interval includes zero, we cannot conclude with 90 percent confidence that the
percentage of ever-married women 20 to 29 years of age who were in the labor force in 2014 is
less than the corresponding percentage in 2012.
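The comparison in Illustration 3 can be sketched as follows. The function name se_difference is hypothetical, and the two input standard errors are the rounded values from the illustration, not recomputed figures.

```python
import math


def se_difference(se_x1, se_x2):
    """Formula (3): standard error of the difference between two estimates.

    Assumes the two estimates are uncorrelated, as Formula (3) does.
    """
    return math.sqrt(se_x1 ** 2 + se_x2 ** 2)


# Illustration 3: 65.7 percent (2014) versus 66.0 percent (2012).
se_diff = round(se_difference(0.97, 0.95), 2)
diff = 65.7 - 66.0
low = diff - 1.645 * se_diff
high = diff + 1.645 * se_diff

# If the interval contains zero, the difference is not significant
# at the 90-percent confidence level.
includes_zero = low <= 0.0 <= high
print(se_diff, round(low, 1), round(high, 1), includes_zero)
```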
Standard Errors of Ratios. Certain estimates may be calculated as the ratio of two numbers.
The standard error of a ratio, x/y, may be computed using
$$ s_{x/y} = \frac{x}{y} \sqrt{\left(\frac{s_x}{x}\right)^2 + \left(\frac{s_y}{y}\right)^2 - 2r\,\frac{s_x s_y}{x\,y}} \qquad (4) $$
The standard error of the numerator, s_x, and that of the denominator, s_y, may be calculated using
formulas described earlier. In Formula (4), r represents the correlation between the numerator
and the denominator of the estimate.
For one type of ratio, the denominator is a count of families or households and the numerator is a
count of persons in those families or households with a certain characteristic. If there is at least
one person with the characteristic in every family or household, use 0.7 as an estimate of r. An
example of this type is the mean number of children per family with children.
For all other types of ratios, r is assumed to be zero. Examples are the average number of
children per family and the family poverty rate. If r is actually positive (negative), then this
procedure will provide an overestimate (underestimate) of the standard error of the ratio.
Note: For estimates expressed as the ratio of x per 100 y or x per 1,000 y, multiply Formula (4)
by 100 or 1,000, respectively, to obtain the standard error.
Illustration 4
Suppose there were 30,797,000 ever-married women 15-44 years old and 31,886,000
never-married women 15-44 years old. The ratio of ever-married women, x, to never-married women,
y, is 0.97. Use the appropriate parameters from Table 4 and Formulas (1) and (4) to get
                                 Ever-married (x)    Never-married (y)   Ratio
Women 15-44                      30,797,000          31,886,000          0.97
a parameter (a)                  -0.000019           -0.000019           -
b parameter (b)                  4,687               4,687               -
Standard error                   355,000             361,000             0.016
90-percent confidence interval   30,213,000 to       31,292,000 to       0.94 to 0.99
                                 31,381,000          32,480,000
Using Formula (4) with r = 0, the estimate of the standard error is
$$ s_{x/y} = \frac{30{,}797{,}000}{31{,}886{,}000} \sqrt{\left(\frac{355{,}000}{30{,}797{,}000}\right)^2 + \left(\frac{361{,}000}{31{,}886{,}000}\right)^2} = 0.016 $$
The 90-percent confidence interval is calculated as 0.97 ± 1.645 × 0.016.
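Illustration 4 can be reproduced with the sketch below. Formula (1) is not restated in this part of the statement, so the form sqrt(a·x² + b·x) is an assumption here; it does reproduce the tabulated standard errors of 355,000 and 361,000. The function names are hypothetical.

```python
import math


def se_total(x, a, b):
    """Assumed form of Formula (1): standard error of an estimated number x."""
    return math.sqrt(a * x * x + b * x)


def se_ratio(x, y, sx, sy, r=0.0):
    """Formula (4): standard error of the ratio x/y.

    r is the correlation between numerator and denominator
    (0.7 or 0 under the rules given in this statement).
    """
    rel_var = (sx / x) ** 2 + (sy / y) ** 2 - 2.0 * r * (sx * sy) / (x * y)
    return (x / y) * math.sqrt(rel_var)


# Illustration 4: marital status parameters from Table 4.
a, b = -0.000019, 4_687
x, y = 30_797_000, 31_886_000

sx = round(se_total(x, a, b), -3)  # rounded to the nearest thousand
sy = round(se_total(y, a, b), -3)
se = se_ratio(x, y, sx, sy, r=0.0)
print(sx, sy, round(se, 3))
```

Passing a positive r gives a smaller standard error than r = 0, which is why assuming r = 0 overstates the standard error when the true correlation is positive.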
Standard Errors of Fertility Ratios. The standard error of a fertility ratio is a function of the
number of children ever born per 1,000 women and the number of women in a given category.
The formula for the standard error of a fertility ratio is
$$ s_{x,y} = x \sqrt{a + \frac{b}{x\,y} + \frac{c}{1{,}000\,y}} \qquad (5) $$
where a, b, and c are the parameters from Table 5, x is the number of children ever born or
expected per 1,000 women and y is the number of women in thousands. This formula should be
used when calculating standard errors for estimates involving the possibility of more than one
event per woman, i.e., number of children ever born. For data involving at most one event per
woman, convert the ratio to a percentage and use Formula (2) and the parameters in Table 3 or 4
to calculate the standard errors.
Illustration 5
Suppose that 8,725,000 women 40-44 years old had 2,012 children ever born per 1,000 women.
Use Formula (5) and the parameters in Table 5 to get
Children ever born (x)              2,012
Base (y) in Thousands               8,725
a parameter (a)                     +0.0000013
b parameter (b)                     810
c parameter (c)                     1,479
Standard error                      30
90-percent confidence interval      1,963 to 2,061
The standard error is calculated as
$$ s_{x,y} = 2{,}012 \sqrt{0.0000013 + \frac{810}{2{,}012 \times 8{,}725} + \frac{1{,}479}{1{,}000 \times 8{,}725}} = 30 $$
The 90-percent confidence interval is from 1,963 to 2,061 children ever born per 1,000 women
(i.e., 2,012 ± 1.645 × 30). A conclusion that the average estimate derived from all possible
samples lies within a range computed in this way would be correct for roughly 90 percent of all
possible samples.
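A minimal sketch of the Illustration 5 calculation follows. The function name se_fertility_ratio is hypothetical, and the placement of b over x·y and c over 1,000·y is read from the layout of Formula (5); it reproduces the published standard error of 30.

```python
import math


def se_fertility_ratio(x, y, a, b, c):
    """Formula (5): standard error of a fertility ratio.

    x is children ever born per 1,000 women; y is the number of women
    in thousands; a, b, and c come from Table 5.
    """
    return x * math.sqrt(a + b / (x * y) + c / (1000.0 * y))


# Illustration 5: 8,725 thousand women 40-44 with 2,012 children per 1,000 women.
se = round(se_fertility_ratio(2_012, 8_725, 0.0000013, 810, 1_479))
low = 2_012 - 1.645 * se
high = 2_012 + 1.645 * se
print(se, round(low), round(high))
```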
Standard Errors of Quarterly or Yearly Averages. For information on calculating standard
errors for labor force data from the CPS which involve quarterly or yearly averages, please see
the “Explanatory Notes and Estimates of Error: Household Data” section in Employment and
Earnings, a monthly report published by the U.S. Bureau of Labor Statistics.
Accuracy of State Estimates. The redesign of the CPS following the 1980 census provided an
opportunity to increase efficiency and accuracy of state data. All strata are now defined within
state boundaries. The sample is allocated among the states to produce state and national
estimates with the required accuracy while keeping total sample size to a minimum. Improved
accuracy of state data was achieved with about the same sample size as in the 1970 design.
Since the CPS is designed to produce both state and national estimates, the proportion of the total
population sampled and the sampling rates differ among the states. In general, the smaller the
population of the state, the larger the sampling proportion. For example, in Vermont
approximately 1 in every 250 households is sampled each month, while in New York the sample is
about 1 in every 2,000 households. Nevertheless, the size of the sample in New York is four
times larger than in Vermont because New York has a larger population.
Standard Errors of State Estimates. The standard error for a state may be obtained by
determining new state-level a and b parameters and then using these adjusted parameters in the
standard error formulas mentioned previously. To determine a new state-level b parameter
(b_state), multiply the b parameter from Table 3 or 4 by the state factor from Table 6. To
determine a new state-level a parameter (a_state), use the following:
(1) If the a parameter from Table 3 or 4 is positive, multiply it by the state factor
    from Table 6.
(2) If the a parameter in Table 3 or 4 is negative, calculate the new state-level a
    parameter as follows:
$$ a_{state} = \frac{-b_{state}}{POP_{state}} \qquad (6) $$
where POP_state is the state population found in Table 6.
To determine state-level parameters for the fertility ratio parameters found in Table 5, multiply
all parameters by the state factor from Table 6.
Note:
The Census Bureau recommends the use of 3-year averages to compare estimates across
states and 2-year averages to evaluate changes in state estimates over time.
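The state-level adjustment can be sketched as below. The function name state_parameters is hypothetical, and the choice of Vermont with the Table 4 fertility parameters is only an example and does not come from the statement. A zero a parameter is left at zero, which is an assumption since the rules above cover only positive and negative values.

```python
def state_parameters(a, b, state_factor, state_population):
    """Return (a_state, b_state) following the two rules above.

    b is always multiplied by the state factor. A negative a is replaced
    by -b_state / POP_state (Formula (6)); otherwise a is multiplied by
    the state factor.
    """
    b_state = b * state_factor
    if a < 0:
        a_state = -b_state / state_population
    else:
        a_state = a * state_factor
    return a_state, b_state


# Example only: Vermont (factor 0.19, population 620,304 in Table 6) with the
# Table 4 FERTILITY parameters for Total or White (a = -0.000032, b = 2,016).
a_state, b_state = state_parameters(-0.000032, 2_016, 0.19, 620_304)
print(a_state, b_state)
```

The resulting a_state and b_state are then used in place of a and b in the standard error formulas.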
Standard Errors of Regional Estimates. To compute standard errors for regional estimates,
follow the steps for computing standard errors for state estimates found in “Standard Errors of
State Estimates” using the regional factors found in Table 7.
Illustration 6
Suppose that of 23,647,000 women 15-44 years old in the South, 45.0 percent remain childless.
Use Formula (2) and the appropriate parameter and factor from Tables 4 and 7 to get:
Percent of childless women in South (p)     45.0
Base (x)                                    23,647,000
b parameter (b)                             2,016
South regional factor                       1.07
Regional b parameter (b_region)             2,157
Standard error                              0.48
90-percent confidence interval              44.2 to 45.8
Obtain the region-level b parameter by multiplying the b parameter in Table 4 by the regional
factor in Table 7. This gives b_region = 2,016 × 1.07 = 2,157. The standard error of the estimate of
the percentage of women 15-44 years old in the South who are childless can then be found by
using Formula (2) and the new region-level b parameter. The standard error is calculated as
$$ s_{x,p} = \sqrt{\frac{2{,}157}{23{,}647{,}000} \times 45.0 \times (100 - 45.0)} = 0.48 $$
and the 90-percent confidence interval for the percentage of women 15-44 years old in the South
who are childless is calculated as 45.0 ± 1.645 × 0.48.
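The regional calculation in Illustration 6 can be sketched in a few lines. Variable names are hypothetical, and the regional b parameter is rounded to a whole number before use, as in the illustration.

```python
import math

b_national = 2_016      # FERTILITY b parameter, Total or White (Table 4)
south_factor = 1.07     # regional factor for the South (Table 7)
base = 23_647_000       # women 15-44 years old in the South
p = 45.0                # percent childless

# Adjust the b parameter for the region, then apply Formula (2).
b_region = round(b_national * south_factor)
se = round(math.sqrt((b_region / base) * p * (100.0 - p)), 2)

low = p - 1.645 * se
high = p + 1.645 * se
print(b_region, se, round(low, 1), round(high, 1))
```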
Standard Errors of Groups of States. The standard error calculation for a group of states is
similar to the standard error calculation for a single state. First, calculate a new state group
factor for the group of states. Then, determine new state group a and b parameters. Finally, use
these adjusted parameters in the standard error formulas mentioned previously.
Use the following formula to determine a new state group factor:
$$ \text{state group factor} = \frac{\sum_{i=1}^{n} (POP_i \times \text{state factor}_i)}{\sum_{i=1}^{n} POP_i} \qquad (7) $$
where POP_i and state factor_i are the population and factor for state i from Table 6. To obtain a
new state group b parameter (b_state group), multiply the b parameter from Table 3 or 4 by the state
group factor obtained by Formula (7). To determine a new state group a parameter (a_state group),
use the following:
(1) If the a parameter from Table 3 or 4 is positive, multiply it by the state group
    factor determined by Formula (7).
(2) If the a parameter from Table 3 or 4 is negative, calculate the new state group a
    parameter as follows:
$$ a_{\text{state group}} = \frac{-b_{\text{state group}}}{\sum_{i=1}^{n} POP_i} \qquad (8) $$
To determine state group-level parameters for the fertility ratio parameters found in Table 5,
multiply all parameters by the state group factor calculated by Formula (7).
Illustration 7
Suppose the state group factor for the state group Illinois-Indiana-Michigan was required. The
appropriate factor would be
$$ \text{state group factor} = \frac{12{,}713{,}378 \times 1.13 + 6{,}504{,}674 \times 1.11 + 9{,}800{,}619 \times 1.13}{12{,}713{,}378 + 6{,}504{,}674 + 9{,}800{,}619} = 1.13 $$
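The state group calculation can be sketched as follows. The function names are hypothetical, and pairing the group with the Table 4 fertility parameters is only an example. The unrounded factor is carried into the parameters here, whereas the illustration reports it rounded to 1.13.

```python
def state_group_factor(states):
    """Formula (7): population-weighted average of the state factors.

    `states` is a list of (population, factor) pairs from Table 6.
    """
    total_population = sum(pop for pop, _ in states)
    weighted_sum = sum(pop * factor for pop, factor in states)
    return weighted_sum / total_population


def state_group_parameters(a, b, states):
    """Return (a_group, b_group) using Formula (7) and, for negative a, Formula (8)."""
    factor = state_group_factor(states)
    b_group = b * factor
    if a < 0:
        a_group = -b_group / sum(pop for pop, _ in states)
    else:
        a_group = a * factor
    return a_group, b_group


# Illustration 7: Illinois, Indiana, and Michigan.
group = [(12_713_378, 1.13), (6_504_674, 1.11), (9_800_619, 1.13)]
factor = state_group_factor(group)
print(round(factor, 2))

# Example only: Table 4 FERTILITY parameters for Total or White.
a_group, b_group = state_group_parameters(-0.000032, 2_016, group)
print(a_group, b_group)
```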
Technical Assistance. If you require assistance or additional information, please contact the
Demographic Statistical Methods Division via e-mail at dsmd.source.and.accuracy@census.gov.
Table 3. Parameters for Computation of Standard Errors for Labor Force
Characteristics: June 2014
Characteristic                                        a            b

Total or White
  Civilian labor force, employed                      -0.000013    2,481
  Not in labor force                                  -0.000017    3,244
  Unemployed                                          -0.000013    2,432
  Civilian labor force, employed, not in labor
  force, and unemployed
    Men                                               -0.000031    2,947
    Women                                             -0.000028    2,788
    Both sexes, 16 to 19 years                        -0.000261    3,244

Black
  Civilian labor force, employed, not in labor
  force, and unemployed
    Total                                             -0.000117    3,601
    Men                                               -0.000249    3,465
    Women                                             -0.000191    3,191
    Both sexes, 16 to 19 years                        -0.001425    3,601

Hispanic
  Civilian labor force, employed, not in labor
  force, and unemployed
    Total                                             -0.000245    3,311
    Men                                               -0.000537    3,397
    Women                                             -0.000399    2,864
    Both sexes, 16 to 19 years                        -0.004078    3,311

Asian, AIAN, NHOPI
  Civilian labor force, employed, not in labor
  force, and unemployed
    Total                                             -0.000087    3,316
    Men                                               -0.000172    3,276
    Women                                             -0.000158    3,001
    Both sexes, 16 to 19 years                        -0.000909    3,316

Notes: (1) These parameters are to be applied to basic CPS monthly labor force estimates.
(2) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other
Pacific Islander.
(3) Hispanics may be any race. For a more detailed discussion on the use of parameters for race
and ethnicity, please see the “Generalized Variance Parameters” section.
(4) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both
alone and in combination race group estimates.
(5) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the
characteristic of interest is total state population, not subtotaled by race or ethnicity, the a and
b parameters are zero.
(6) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters
should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen
characteristics for Black, Hispanic, and Asian, AIAN, NHOPI parameters.
(7) For the groups self-classified as having two or more races, use the Asian, AIAN, NHOPI
parameters for all employment characteristics.
Table 4. Parameters for Computation of Standard Errors for Fertility and Birth
Expectation Characteristics: June 2014
                                              Persons                  Households, etc.
Characteristic                                a            b           a            b

FERTILITY
  Total or White                              -0.000032    2,016       (X)          (X)
  Black                                       -0.000123    2,016       (X)          (X)
  Hispanic                                    -0.000229    3,397       (X)          (X)
  Asian, AIAN, NHOPI and two or more races    -0.000287    2,016       (X)          (X)

NUMBER OF BIRTHS
  Total or White                              -0.000058    3,676       (X)          (X)
  Black                                       -0.000225    3,670       (X)          (X)
  Hispanic                                    -0.000417    6,186       (X)          (X)
  Asian, AIAN, NHOPI and two or more races    -0.000522    3,670       (X)          (X)

MARITAL STATUS, HOUSEHOLD & FAMILY CHARACTERISTICS
  Total or White                              -0.000019    4,687       -0.000007    1,860
  Black                                       -0.000125    6,733       -0.000031    1,683
  Hispanic                                    -0.000257    11,347      -0.000064    2,836
  Asian, AIAN, NHOPI and two or more races    -0.000300    6,733       -0.000075    1,683

INCOME
  Total or White                              -0.000009    2,207       -0.000008    2,016
  Black                                       -0.000047    2,527       -0.000041    2,201
  Hispanic                                    -0.000097    4,259       -0.000084    3,709
  Asian, AIAN, NHOPI and two or more races    -0.000112    2,527       -0.000098    2,201

EDUCATIONAL ATTAINMENT
  Total or White                              -0.000008    2,131       -0.000007    1,860
  Black and two or more races                 -0.000045    2,410       -0.000031    1,683
  Hispanic                                    -0.000062    2,745       -0.000064    2,836
  Asian, AIAN, NHOPI                          -0.000107    2,410       -0.000075    1,683

NATIVITY – Born in:
  Mexico, other N. America, S. America        -0.000032    9,942       (X)          (X)
  Europe                                      -0.000018    5,712       (X)          (X)
  Asia, Africa, Oceania                       -0.000030    9,310       (X)          (X)
  United States                               -0.000016    4,997       (X)          (X)

Notes: (1) These parameters are to be applied to the June 2014 Fertility and Birth Expectation
Supplement data.
(2) Fertility includes number of women by number of children ever born, percent childless, and
number of women who have had a child in the last year.
(3) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other
Pacific Islander.
(4) Hispanics may be any race. For a more detailed discussion on the use of parameters for race
and ethnicity, please see the “Generalized Variance Parameters” section.
(5) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both
alone and in combination race group estimates.
(6) For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the
characteristic of interest is total state population, not subtotaled by race or ethnicity, the a and
b parameters are zero.
(7) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters
should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen
characteristics for Black, Hispanic, and Asian, AIAN, NHOPI parameters.
(8) For the group self-classified as having two or more races, use the Asian, AIAN, NHOPI
parameters for all characteristics except employment, unemployment, and educational
attainment, in which case use Black parameters.
Table 5. Parameters for Computation of Standard Errors for
Fertility Ratios: June 2014
a             b      c
0.0000013     810    1,479

NOTE: Multiply the parameters by 1.3 to get foreign-born parameters.
Table 6. Populations and Factors for State Parameters and Standard Errors: June 2014
State                   Factor    Population
Alabama                 1.09      4,768,511
Alaska                  0.18      706,520
Arizona                 1.13      6,575,191
Arkansas                0.70      2,917,228
California              1.14      38,081,322
Colorado                1.14      5,230,515
Connecticut             0.91      3,551,105
Delaware                0.23      916,537
District of Columbia    0.18      648,493
Florida                 1.10      19,434,897
Georgia                 1.11      9,843,697
Hawaii                  0.31      1,355,803
Idaho                   0.35      1,603,310
Illinois                1.13      12,713,378
Indiana                 1.11      6,504,674
Iowa                    0.79      3,064,783
Kansas                  0.74      2,838,269
Kentucky                1.11      4,319,568
Louisiana               1.09      4,539,871
Maine                   0.42      1,313,927
Maryland                1.16      5,868,932
Massachusetts           1.11      6,657,997
Michigan                1.13      9,800,619
Minnesota               1.11      5,398,251
Mississippi             0.73      2,922,784
Missouri                1.15      5,953,505
Montana                 0.25      1,007,183
Nebraska                0.47      1,852,180
Nevada                  0.65      2,778,145
New Hampshire           0.37      1,308,504
New Jersey              1.14      8,831,892
New Mexico              0.51      2,046,481
New York                1.16      19,487,766
North Carolina          1.13      9,706,688
North Dakota            0.17      727,129
Ohio                    1.13      11,425,436
Oklahoma                0.94      3,802,470
Oregon                  1.00      3,915,028
Pennsylvania            1.13      12,599,853
Rhode Island            0.30      1,037,106
South Carolina          1.11      4,709,547
South Dakota            0.18      836,711
Tennessee               1.12      6,427,597
Texas                   1.14      26,272,104
Utah                    0.54      2,911,607
Vermont                 0.19      620,304
Virginia                1.12      8,103,839
Washington              1.15      6,919,544
West Virginia           0.41      1,824,404
Wisconsin               1.13      5,686,633
Wyoming                 0.15      577,492
NOTES: (1) The state population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters
should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen
characteristics for Black, Asian, AIAN, NHOPI, and Hispanic.
Table 7. Populations and Factors for Regional
Parameters and Standard Errors: June 2014
Region       Factor    Population
Midwest      1.06      66,801,568
Northeast    1.06      55,408,454
South        1.07      117,027,167
West         1.02      73,708,141
NOTES: (1) The state population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a and b parameters
should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen
characteristics for Black, Asian, AIAN, NHOPI, and Hispanic.
REFERENCES
[1]
Bureau of Labor Statistics, April 2014, “Redesign of the Sample for the Current
Population Survey.” http://www.bls.gov/cps/sample_redesign_2014.pdf.
[2]
U.S. Census Bureau. 2006. Current Population Survey: Design and Methodology.
Technical Paper 66. Washington, DC: Government Printing Office.
(http://www.census.gov/prod/2006pubs/tp-66.pdf)
[3]
Brooks, C.A. and Bailar, B.A. 1978. Statistical Policy Working Paper 3 - An Error
Profile: Employment as Measured by the Current Population Survey. Subcommittee on
Nonsampling Errors, Federal Committee on Statistical Methodology, U.S. Department of
Commerce, Washington, DC. (http://www.fcsm.gov/working-papers/spp.html)