Attachment C
Source of the Data and Accuracy of the Estimates for the
2017 Annual Social and Economic Supplement Microdata File
SOURCE OF THE DATA
The data in this microdata file are from the 2017 Annual Social and Economic Supplement
(ASEC) of the Current Population Survey (CPS). The U.S. Census Bureau conducts the CPS
ASEC over a 3-month period in February, March, and April, with most of the data collection
occurring in the month of March. The CPS ASEC uses two sets of questions, the basic CPS and
a set of supplemental questions. The CPS, sponsored jointly by the Census Bureau and the U.S.
Bureau of Labor Statistics, is the country’s primary source of labor force statistics for the entire
population. The Census Bureau and the Bureau of Labor Statistics also jointly sponsor the CPS
ASEC.
Basic CPS. The monthly CPS collects primarily labor force data about the civilian
noninstitutionalized population living in the United States. The institutionalized population,
which is excluded from the population universe, is composed primarily of the population in
correctional institutions and nursing homes (98 percent of the 4 million institutionalized people
in the 2010 Census). Interviewers ask questions concerning labor force participation about each
member 15 years old and over in sample households. Typically, the week containing the
nineteenth of the month is the interview week. The week containing the twelfth is the reference
week (i.e., the week about which the labor force questions are asked).
The CPS uses a multistage probability sample based on the results of the decennial census, with
coverage in all 50 states and the District of Columbia. The sample is continually updated to
account for new residential construction. When files from the most recent decennial census
become available, the Census Bureau gradually introduces a new sample design for the CPS.
Every ten years, the CPS first-stage sample is redesigned¹ to reflect changes based on the most
recent decennial census. In the first stage of the sampling process, primary sampling units
(PSUs)² were selected for sample. In the 2000 design, the United States was divided into 2,025
PSUs. These were then grouped into 824 strata and one PSU was selected for sample from each
stratum. In the 2010 sample design, the United States was divided into 1,987 PSUs. These PSUs
were then grouped into 852 strata. Within each stratum, a single PSU was chosen for the sample,
with its probability of selection proportional to its population as of the most recent decennial
census. In the case of strata consisting of only one PSU, the PSU was chosen with certainty.
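The within-stratum selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Census Bureau's production procedure: the stratum names, PSU identifiers, population counts, and the function name select_psu are all hypothetical and introduced only for the example.

```python
import random

def select_psu(stratum, rng):
    """Pick one PSU with probability proportional to its census population.

    `stratum` maps a hypothetical PSU identifier to its population count.
    """
    psu_ids = list(stratum)
    populations = [stratum[psu_id] for psu_id in psu_ids]
    # random.Random.choices treats `weights` as relative selection weights.
    return rng.choices(psu_ids, weights=populations, k=1)[0]

# Hypothetical strata; identifiers and populations are made up.
strata = {
    "stratum_A": {"psu_1": 120_000, "psu_2": 60_000, "psu_3": 20_000},
    "stratum_B": {"psu_4": 950_000},  # single-PSU stratum: chosen with certainty
}

rng = random.Random(2017)  # fixed seed so the sketch is repeatable
sample = {name: select_psu(psus, rng) for name, psus in strata.items()}
print(sample)
```

In this sketch a stratum with a single PSU always returns that PSU, which mirrors the "chosen with certainty" case in the text.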
In April 2014, the Census Bureau began phasing out the 2000 sample and replacing it with the
2010 sample, creating a mixed sampling frame. Two simultaneous changes occur during this
phase-in period. First, within the PSUs selected for both the 2000 and 2010 designs, sample
households from the 2010 design gradually replace sample households selected for the 2000
design. Second, new PSUs selected for only the 2010 design gradually replace outgoing PSUs
selected for only the 2000 design. By July 2015, the new 2010 sample design was completely
implemented and the sample came entirely from the 2010 redesigned sample.
¹ For detailed information on the 2010 sample redesign, please see reference [1].
² The PSUs correspond to substate areas (i.e., counties or groups of counties) that are geographically contiguous.
Approximately 74,500 housing units were selected for sample from the sampling frame for the
basic CPS. Based on eligibility criteria, 10 percent of these housing units were sent directly to
computer-assisted telephone interviewing (CATI). The remaining units were assigned to
interviewers for computer-assisted personal interviewing (CAPI).³ Of all housing units in
sample, about 61,700 were determined to be eligible for interview. Interviewers obtained
interviews at about 52,400 of these units. Noninterviews occur when the occupants are not
found at home after repeated calls or are unavailable for some other reason. Table 1 summarizes
historical changes in the CPS design.
The 2017 Annual Social and Economic Supplement. In addition to the basic CPS questions,
interviewers asked supplementary questions for the CPS ASEC. They asked these questions of
the civilian noninstitutional population and also of military personnel who live in households
with at least one other civilian adult. The additional questions covered the following topics:
• Household and family characteristics
• Marital status
• Geographic mobility
• Foreign-born population
• Income from the previous calendar year
• Poverty
• Work status/occupation
• Health insurance coverage
• Program participation
• Educational attainment
Including the basic CPS sample, approximately 95,000 housing units were in sample for the CPS
ASEC. About 80,900 housing units were determined to be eligible for interview, and about
70,000 interviews were obtained (see Table 1).
The additional sample for the CPS ASEC provides more reliable data for Hispanic households,
non-Hispanic minority households, and non-Hispanic White households with children 18 years
or younger. These households were identified for sample from previous months and the
following April. For more information about the households eligible for the CPS ASEC, please
refer to reference [2].
³ For further information on CATI and CAPI and the eligibility criteria, please see reference [2].
Table 1. Description of the March Basic CPS and CPS ASEC Sample Cases

                                 Basic CPS housing              Total (CPS ASEC/ADS¹ + basic CPS)
                     Number of   units eligible                 housing units eligible
Time period        sample PSUs   Interviewed  Not interviewed   Interviewed  Not interviewed
2017                       852        52,400            9,300        70,000           10,900
2016                       852        52,000            9,100        69,500           10,600
2015                       852        52,900            8,200        74,300           10,300
2014 Redesign              824        17,200            2,200        22,700            2,600
2014 Traditional           824        35,500            4,600        51,500            5,800
2014                       824        52,700            6,800             -                -
2013                       824        52,900            6,400        75,500            7,700
2012                       824        53,300            5,800        75,100            7,200
2011                       824        53,400            5,300        75,900            6,500
2010                       824        54,100            4,600        77,000            5,700
2009                       824        54,100            4,600        76,200            5,700
2008                       824        53,800            5,100        75,900            6,400
2007                       824        53,700            5,600        75,500            7,100
2006                       824        54,000            5,400        76,000            7,100
2005                  ²754/824        54,400            5,700        76,500            7,500
2004                       754        55,000            5,200        77,700            7,000
2003                       754        55,500            4,500        78,300            6,800
2002                       754        55,500            4,500        78,300            6,600
2001                       754        46,800            3,200        49,600            4,300
2000                       754        46,800            3,200        51,000            3,700
1999                       754        46,800            3,200        50,800            4,300
1998                       754        46,800            3,200        50,400            5,200
1997                       754        46,800            3,200        50,300            3,900
1996                       754        46,800            3,200        49,700            4,100
1995                       792        56,700            3,300        59,200            3,800
1990 to 1994               729        57,400            2,600        59,900            3,100
1989                       729        53,600            2,500        56,100            3,000
1986 to 1988               729        57,000            2,500        59,500            3,000
1985                  ³629/729        57,000            2,500        59,500            3,000
1982 to 1984               629        59,000            2,500        61,500            3,000
1980 to 1981               629        65,500            3,000        68,000            3,500
1977 to 1979               614        55,000            3,000        58,000            3,500
1976                       624        46,500            2,500        49,000            3,000
1973 to 1975               461        46,500            2,500        49,000            3,000
1972                  ⁴449/461        45,000            2,000        45,000            2,000
1967 to 1971               449        48,000            2,000        48,000            2,000
1963 to 1966               357        33,400            1,200        33,400            1,200
1960 to 1962               333        33,400            1,200        33,400            1,200
1959                       330        33,400            1,200        33,400            1,200

¹ The CPS ASEC was referred to as the Annual Demographic Survey (ADS) until 2002.
² The Census Bureau redesigned the CPS following the Census 2000. During phase-in of the new design,
  housing units from the new and old designs were in the sample.
³ The Census Bureau redesigned the CPS following the 1980 Decennial Census of Population and Housing.
⁴ The Census Bureau redesigned the CPS following the 1970 Decennial Census of Population and Housing.
Estimation Procedure. This survey’s estimation procedure adjusts weighted sample results to
agree with independently derived population estimates of the civilian noninstitutionalized
population of the United States and each state (including the District of Columbia). These
population estimates, used as controls for the CPS, are prepared monthly to agree with the most
current set of population estimates that are released as part of the Census Bureau’s population
estimates and projections program.
The population controls for the nation are distributed by demographic characteristics in two
ways:
• Age, sex, and race (White alone, Black alone, and all other groups combined).
• Age, sex, and Hispanic origin.
The population controls for the states are distributed by race (Black alone and all other race
groups combined), age (0-15, 16-44, and 45 and over), and sex.
The independent estimates by age, sex, race, and Hispanic origin, and for states by selected age
groups and broad race categories, are developed using the basic demographic accounting formula
whereby the population from the 2010 Decennial Census data is updated using data on the
components of population change (births, deaths, and net international migration) with net
internal migration as an additional component in the state population estimates.
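The demographic accounting formula described above can be written as a one-line update. The sketch below is illustrative only: the function name update_population and every input figure are hypothetical, and the actual controls are produced by the Census Bureau's population estimates program with far more demographic detail.

```python
def update_population(base, births, deaths, net_international, net_internal=0):
    """Carry a census base population forward using the components of change.

    `net_internal` (net internal migration) applies to state estimates only;
    it nets to zero at the national level, so it defaults to 0.
    """
    return base + births - deaths + net_international + net_internal

# Hypothetical figures, chosen only to show the arithmetic.
national = update_population(base=1_000_000, births=12_000, deaths=8_000,
                             net_international=3_000)
state = update_population(base=100_000, births=1_200, deaths=800,
                          net_international=300, net_internal=-150)
print(national, state)
```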
The net international migration component in the population estimates includes a combination of
the following:
• Legal migration to the United States.
• Emigration of foreign-born and native people from the United States.
• Net movement between the United States and Puerto Rico.
• Estimates of temporary migration.
• Estimates of net residual foreign-born population, which include unauthorized
  migration.
Because the latest available information on these components lags the survey date, it is necessary
to make short-term projections of these components to develop the estimate for the survey date.
ACCURACY OF THE ESTIMATES
A sample survey estimate has two types of error: sampling and nonsampling. The accuracy of an
estimate depends on both types of error. The nature of the sampling error is known given the
survey design; the full extent of the nonsampling error is unknown.
Sampling Error. Since the CPS estimates come from a sample, they may differ from figures
from an enumeration of the entire population using the same questionnaires, instructions, and
enumerators. For a given estimator, the difference between an estimate based on a sample and
the estimate that would result if the sample were to include the entire population is known as
sampling error. Standard errors, as calculated by methods described in “Standard Errors and
Their Use,” are primarily measures of the magnitude of sampling error. However, they may
include some nonsampling error.
Nonsampling Error. For a given estimator, the difference between the estimate that would
result if the sample were to include the entire population and the true population value being
estimated is known as nonsampling error. There are several sources of nonsampling error that
may occur during the development or execution of the survey. It can occur because of
circumstances created by the interviewer, the respondent, the survey instrument, or the way the
data are collected and processed. For example, errors could occur because:
• The interviewer records the wrong answer, the respondent provides incorrect
  information, the respondent estimates the requested information, or an unclear
  survey question is misunderstood by the respondent (measurement error).
• Some individuals who should have been included in the survey frame were
  missed (coverage error).
• Responses are not collected from all those in the sample or the respondent is
  unwilling to provide information (nonresponse error).
• Values are estimated imprecisely for missing data (imputation error).
• Forms may be lost, data may be incorrectly keyed, coded, or recoded, etc.
  (processing error).
To minimize these errors, the Census Bureau applies quality control procedures during all stages
of the production process, including the design of the survey, the wording of questions, the
review of the work of interviewers and coders, and the statistical review of reports.
Two types of nonsampling error that can be examined to a limited extent are nonresponse and
undercoverage.
Nonresponse. The effect of nonresponse cannot be measured directly, but one indication of its
potential effect is the nonresponse rate. For the cases eligible for the 2017 ASEC, the basic CPS
household-level nonresponse rate was 13.5 percent. The household-level nonresponse rate for
the ASEC was an additional 14.0 percent. These two nonresponse rates lead to a combined
supplement nonresponse rate of 25.6 percent.
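The combined rate can be checked with a short calculation. The text does not state how the two rates are combined; the sketch below assumes a household responds to the supplement only if it responds at both stages, so the response rates multiply. That assumption reproduces the published 25.6 percent.

```python
basic_cps_nonresponse = 0.135  # basic CPS household-level nonresponse rate
asec_nonresponse = 0.140       # additional ASEC household-level nonresponse rate

# Assumption: responding to the supplement requires responding at both stages.
combined_response = (1 - basic_cps_nonresponse) * (1 - asec_nonresponse)
combined_nonresponse = 1 - combined_response

print(f"{combined_nonresponse:.1%}")  # 25.6%
```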
As part of the nonsampling error analysis, the item response rates, item refusal rates, and edits
are reviewed. For the CPS ASEC, the item refusal rates range from 0.0 percent to 12.1 percent.
The item allocation rates range from 2.1 percent to 55.2 percent.
In accordance with Census Bureau and Office of Management and Budget Quality Standards, the
Census Bureau will conduct a nonresponse bias analysis to assess nonresponse bias in the 2017
ASEC.
Sufficient Partial Interview. A sufficient partial interview is an incomplete interview in which
the household or person answered enough of the questionnaire for the supplement sponsor to
consider the interview complete. The remaining supplement questions may have been edited or
imputed to fill in missing values. Insufficient partial interviews are considered to be
nonrespondents. Refer to the supplement overview attachment in the technical documentation for
the specific questions deemed critical by the sponsor as necessary to be answered in order to be
considered a sufficient partial interview.
Undercoverage. The concept of coverage in the survey sampling process is the extent to which
the total population that could be selected for sample “covers” the survey’s target population.
Missed housing units and missed people within sample households create undercoverage in the
CPS. Overall CPS undercoverage for March 2017 is estimated to be about 12 percent. CPS
coverage varies with age, sex, and race. Generally, coverage is larger for females than for males
and larger for non-Blacks than for Blacks. This differential coverage is a general problem for
most household-based surveys.
The CPS weighting procedure partially corrects for bias from undercoverage, but biases may still
be present when people who are missed by the survey differ from those interviewed in ways
other than age, race, sex, Hispanic origin, and state of residence. How this weighting procedure
affects other variables in the survey is not precisely known. All of these considerations affect
comparisons across different surveys or data sources.
A common measure of survey coverage is the coverage ratio, calculated as the estimated
population before poststratification divided by the independent population control. Table 2
shows March 2017 CPS coverage ratios by age and sex for certain race and Hispanic groups.
The CPS coverage ratios can exhibit some variability from month to month.
Table 2. CPS Coverage Ratios: March 2017

           Total                         White only      Black only      Residual race   Hispanic
Age group  All people    Male  Female    Male  Female    Male  Female    Male  Female    Male  Female
0-15             0.86    0.86    0.86    0.91    0.92    0.73    0.66    0.78    0.77    0.84    0.85
16-19            0.85    0.87    0.83    0.90    0.87    0.77    0.69    0.82    0.77    0.84    0.87
20-24            0.77    0.75    0.79    0.78    0.82    0.59    0.71    0.76    0.74    0.72    0.81
25-34            0.82    0.79    0.85    0.83    0.89    0.65    0.69    0.71    0.78    0.74    0.82
35-44            0.90    0.88    0.91    0.92    0.94    0.70    0.78    0.81    0.84    0.80    0.88
45-54            0.91    0.89    0.93    0.92    0.96    0.73    0.80    0.81    0.87    0.86    0.87
55-64            0.92    0.91    0.93    0.91    0.94    0.91    0.88    0.86    0.84    0.81    0.85
65+              0.97    0.97    0.97    0.98    0.98    0.89    0.93    0.92    0.88    0.85    0.90
15+              0.89    0.88    0.90    0.90    0.93    0.75    0.79    0.80    0.82    0.80    0.86
0+               0.88    0.87    0.90    0.90    0.93    0.74    0.76    0.80    0.81    0.81    0.85

Notes: (1) The Residual race group includes cases indicating a single race other than White or Black,
and cases indicating two or more races.
(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for
race and ethnicity, please see the “Generalized Variance Parameters” section.
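The coverage ratio defined above is a simple quotient. As a minimal sketch, with hypothetical population figures and a function name introduced only for illustration:

```python
def coverage_ratio(estimate_before_poststratification, population_control):
    """Estimated population before poststratification divided by the control."""
    return estimate_before_poststratification / population_control

# Hypothetical figures: a weighted estimate of 8.8 million people
# against an independent population control of 10 million.
ratio = coverage_ratio(8_800_000, 10_000_000)
undercoverage = 1 - ratio

print(f"coverage ratio {ratio:.2f}, undercoverage {undercoverage:.0%}")
```

A ratio of 0.88 corresponds to undercoverage of about 12 percent, the same relationship as between the overall figure quoted in the text and the all-people "0+" entry in Table 2.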
Comparability of Data. Data obtained from the CPS and other sources are not entirely
comparable. This results from differences in interviewer training and experience and in differing
survey processes. This is an example of nonsampling variability not reflected in the standard
errors. Therefore, caution should be used when comparing results from different sources.
Data users should be careful when comparing estimates for 2015 in Income and Poverty in the
United States: 2015 and Health Insurance Coverage in the United States: 2015 (which reflect
2010 Census-based controls) with estimates for 1999 to 2010 (from March 2000 CPS to March
2011 CPS), which reflect Census 2000-based controls, and with estimates for 1992 to 1998 (from March 1993
CPS to March 1999 CPS), which reflect 1990 Census-based controls. Ideally, the same
population controls should be used when comparing any estimates. In reality, the use of the same
population controls is not practical when comparing trend data over a period of 10 to 20 years.
Thus, when it is necessary to combine or compare data based on different controls or different
designs, data users should be aware that changes in weighting controls or weighting procedures
could create small differences between estimates. See the following discussion for information
on comparing estimates derived from different controls or different sample designs.
Data users should be careful when comparing the data from this microdata file, which reflects
2010 Census-based controls, with microdata files from January 2003 through December 2011,
which reflect Census 2000-based controls. Ideally, the same population controls should be used
when comparing any estimates. In reality, the use of the same population controls is not
practical when comparing trend data over a period of 10 to 20 years. Thus, when it is necessary
to combine or compare data based on different controls or different designs, data users should be
aware that changes in weighting controls or weighting procedures can create small differences
between estimates. See the following discussion for information on comparing estimates derived
from different controls or different sample designs.
Microdata files from previous years reflect the latest available census-based controls. Although
the most recent change in population controls had relatively little impact on summary measures
such as averages, medians, and percentage distributions, it did have a significant impact on
levels. For example, use of 2010 Census-based controls results in about a 0.2 percent increase
from the Census 2000-based controls in the civilian noninstitutionalized population and in the
number of families and households. Thus, estimates of levels for data collected in 2012 and later
years will differ from those for earlier years by more than what could be attributed to actual
changes in the population. These differences could be disproportionately greater for certain
population subgroups than for the total population.
Users should also exercise caution because of changes caused by the phase-in of the 2010
Census files (see “Basic CPS”).⁴ During this time period, CPS data were collected from sample
designs based on different censuses. Two features of the new CPS design have the potential of
affecting published estimates: (1) the temporary disruption of the rotation pattern from August
2014 through June 2015 for a comparatively small portion of the sample and (2) the change in
sample areas. Most of the known effect on estimates during and after the sample redesign will
be the result of changing from 2000 to 2010 geographic definitions. Research has shown that the
national-level estimates of the metropolitan and nonmetropolitan populations should not change
appreciably because of the new sample design. However, users should still exercise caution
when comparing metropolitan and nonmetropolitan estimates across years with a design change,
especially at the state level.
⁴ The phase-in process using the 2010 Census files began in April 2014.
Caution should also be used when comparing Hispanic estimates over time. No independent
population control totals for people of Hispanic origin were used before 1985.
A Nonsampling Error Warning. Since the full extent of the nonsampling error is unknown,
one should be particularly careful when interpreting results based on small differences between
estimates. The Census Bureau recommends that data users incorporate information about
nonsampling errors into their analyses, as nonsampling error could impact the conclusions drawn
from the results. Caution should also be used when interpreting results based on a relatively
small number of cases. Summary measures (such as medians and percentage distributions)
probably do not reveal useful information when computed on a subpopulation smaller than
75,000.
For additional information on nonsampling error including the possible impact on CPS
data when known, refer to references [2] and [3].
Estimation of Median Incomes. The Census Bureau has changed the methodology for
computing median income over time. The Census Bureau has computed medians using either
Pareto or linear interpolation, depending on the width of the income interval: Pareto
for intervals wider than $2,500, linear otherwise. Currently, we are using linear
interpolation to estimate all medians. Pareto interpolation assumes a decreasing density of
population within an income interval, whereas linear interpolation assumes a constant density of
population within an income interval. The Census Bureau calculated estimates of median
income and associated standard errors for 1979 through 1987 using Pareto interpolation if the
estimate was larger than $20,000 for people or $40,000 for families and households.
We calculated estimates of median income and associated standard errors for 1976, 1977, and
1978 using Pareto interpolation if the estimate was larger than $12,000 for people or $18,000 for
families and households. This is because the width of the income interval containing the
estimate is greater than $1,000. All other estimates of median income and associated standard
errors for 1976 through 2014 (2015 CPS ASEC) and almost all of the estimates of median
income and associated standard errors for 1975 and earlier were calculated using linear
interpolation.
Thus, use caution when comparing median incomes above $12,000 for people or $18,000 for
families and households for different years. Median incomes below those levels are more
comparable from year to year since they have always been calculated using linear interpolation.
For an indication of the comparability of medians calculated using Pareto interpolation with
medians calculated using linear interpolation, see reference [5].
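The difference between the two interpolation methods can be illustrated with a small sketch. The text does not give the Census Bureau's exact formulas, so the code below assumes the textbook forms: a constant density for linear interpolation and a Pareto tail, P(X > x) proportional to x^(-theta), for Pareto interpolation. The function names, interval bounds, and cumulative shares are hypothetical.

```python
import math

def linear_median(lower, upper, cum_below_lower, cum_below_upper):
    """Median assuming a constant population density within [lower, upper)."""
    share = (0.5 - cum_below_lower) / (cum_below_upper - cum_below_lower)
    return lower + share * (upper - lower)

def pareto_median(lower, upper, cum_below_lower, cum_below_upper):
    """Median assuming a Pareto (decreasing) density within [lower, upper)."""
    above_lower = 1.0 - cum_below_lower
    above_upper = 1.0 - cum_below_upper
    # Fit the Pareto tail exponent to the shares above the two interval bounds.
    theta = math.log(above_lower / above_upper) / math.log(upper / lower)
    return lower * (above_lower / 0.5) ** (1.0 / theta)

# Hypothetical interval: 45 percent of units fall below $40,000
# and 55 percent fall below $50,000, so the median lies in between.
linear = linear_median(40_000, 50_000, 0.45, 0.55)
pareto = pareto_median(40_000, 50_000, 0.45, 0.55)
print(round(linear), round(pareto))
```

Because the Pareto form places more of the interval's population near its lower bound, it yields a lower median than linear interpolation for the same interval, which is why medians computed under the two methods are not strictly comparable.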
Standard Errors and Their Use. The sample estimate and its standard error enable one to
construct a confidence interval. A confidence interval is a range about a given estimate that has
a specified probability of containing the average result of all possible samples. For example, if
all possible samples were surveyed under essentially the same general conditions and using the
same sample design, and if an estimate and its standard error were calculated from each sample,
then approximately 90 percent of the intervals from 1.645 standard errors below the estimate to
1.645 standard errors above the estimate would include the average result of all possible samples.
A particular confidence interval may or may not contain the average estimate derived from all
possible samples, but one can say with specified confidence that the interval includes the average
estimate calculated from all possible samples.
Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing
between population parameters using sample estimates. The most common type of hypothesis is
that the population parameters are different. An example of this would be comparing the
percentage of men who were part-time workers to the percentage of women who were part-time
workers.
Tests may be performed at various levels of significance. A significance level is the probability
of concluding that the characteristics are different when, in fact, they are the same. For example,
to conclude that two characteristics are different at the 0.10 level of significance, the absolute
value of the estimated difference between characteristics must be greater than or equal to 1.645
times the standard error of the difference.
The Census Bureau uses 90-percent confidence intervals and 0.10 levels of significance to
determine statistical validity. Consult standard statistical textbooks for alternative criteria.
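The decision rule at the 0.10 level of significance can be expressed as a short check. This is a sketch under stated assumptions: the function name is introduced here for illustration, the percentages and standard errors are hypothetical, and the standard error of the difference is taken as given.

```python
def differs_at_10_percent(estimate_1, estimate_2, se_difference, z=1.645):
    """True if the absolute difference is at least z standard errors."""
    return abs(estimate_1 - estimate_2) >= z * se_difference

# Hypothetical part-time percentages for men and women.
men, women = 12.0, 15.0

print(differs_at_10_percent(men, women, se_difference=1.0))  # 3.0 >= 1.645
print(differs_at_10_percent(men, women, se_difference=2.0))  # 3.0 <  3.29
```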
Estimating Standard Errors. The Census Bureau uses replication methods to estimate the
standard errors of CPS estimates. These methods primarily measure the magnitude of sampling
error. However, they do measure some effects of nonsampling error as well. They do not
measure systematic biases in the data associated with nonsampling error. Bias is the average
over all possible samples of the differences between the sample estimates and the true value.
There are two ways to calculate standard errors for the 2017 CPS ASEC microdata file. They
are:
•
•
Direct estimates created from replicate weighting methods;
Generalized variance estimates created from generalized variance function
parameters a and b.
While replicate weighting methods provide the most accurate variance estimates, this approach
requires more computing resources and more expertise on the part of the user. The Generalized
Variance Function (GVF) parameters provide a method of balancing accuracy with resource
usage as well as a smoothing effect on standard error estimates across time. For more
information on calculating direct estimates, see reference [6]. For more information on GVF
estimates refer to the “Generalized Variance Parameters” section.
Generalized Variance Parameters. While it is possible to compute and present an estimate of
the standard error based on the survey data for each estimate in a report, there are a number of
reasons why this is not done. A presentation of the individual standard errors would be of
limited use, since one could not possibly predict all of the combinations of results that may be of
interest to data users. Additionally, data users have access to CPS microdata files, and it is
impossible to compute in advance the standard error for every estimate one might obtain from
those data sets. Moreover, variance estimates are based on sample data and have variances of
their own. Therefore, some methods of stabilizing these estimates of variance, for example, by
generalizing or averaging over time, may be used to improve their reliability.
Experience has shown that certain groups of estimates have similar relationships between their
variances and expected values. Modeling or generalizing may provide more stable variance
estimates by taking advantage of these similarities. The GVF is a simple model that expresses
the variance as a function of the expected value of the survey estimate. The parameters of the
GVF are estimated using direct replicate variances. These GVF parameters provide a relatively
easy method to obtain approximate standard errors for numerous characteristics.
The GVF parameters to use in computing standard errors are dependent upon the race/ethnicity
group of interest. Table 3 summarizes the relationship between the race/ethnicity group of
interest and the GVF parameters to use in standard error calculations.
In this source and accuracy statement, Table 4 provides the GVF parameters for labor force
estimates, and Table 5 provides GVF parameters for characteristics from the 2017 CPS ASEC
supplement. Also, tables are provided that allow the calculation of parameters for prior years
and parameters for states and regions. Tables 6 and 7 contain correlation coefficients for
comparing estimates from consecutive years. Tables 8 and 9 provide factors and population
controls to derive state and regional parameters.
The basic CPS questionnaire records the race and ethnicity of each respondent. With respect to
race, a respondent can be White, Black, Asian, American Indian and Alaska Native (AIAN),
Native Hawaiian and Other Pacific Islander (NHOPI), or combinations of two or more of the
preceding. A respondent’s ethnicity can be Hispanic or non-Hispanic, regardless of race.
Table 3. Estimation Groups of Interest and Generalized Variance Parameters

Race/ethnicity group of interest                               GVF parameters to use in
                                                               standard error calculations
Total population                                               Total or White
White alone, White AOIC, or White non-Hispanic population      Total or White
Black alone, Black AOIC, or Black non-Hispanic population      Black
Asian alone, Asian AOIC, or Asian non-Hispanic population      Asian, AIAN, NHOPI
AIAN alone, AIAN AOIC, or AIAN non-Hispanic population         Asian, AIAN, NHOPI
NHOPI alone, NHOPI AOIC, or NHOPI non-Hispanic population      Asian, AIAN, NHOPI
Populations from other race groups                             Asian, AIAN, NHOPI
Hispanic population                                            Hispanic
Two or more races – employment/unemployment and                Black
  educational attainment characteristics
Two or more races – all other characteristics                  Asian, AIAN, NHOPI

Notes: (1) AIAN is American Indian and Alaska Native and NHOPI is Native Hawaiian and Other
Pacific Islander.
(2) AOIC is an abbreviation for alone or in combination. The AOIC population for a race group
of interest includes people reporting only the race group of interest (alone) and people
reporting multiple race categories including the race group of interest (in combination).
(3) Hispanics may be any race.
(4) Two or more races refers to the group of cases self-classified as having two or more races.
Standard Errors of Estimated Numbers. The approximate standard error, s_x, of an estimated
number from this microdata file can be obtained by using the formula:
$$s_x = \sqrt{ax^2 + bx} \qquad (1)$$
Here x is the size of the estimate and a and b are the parameters in Table 4 or 5 associated with
the particular type of characteristic. When calculating standard errors from cross-tabulations
involving different characteristics, use the set of parameters for the characteristic that will give
the largest standard error.
Illustration 1
Suppose there were 3,274,000 unemployed females in the civilian labor force. Use Formula (1)
and the appropriate parameters from Table 4 to get
Number of unemployed females in the civilian labor force (x)   3,274,000
a-parameter (a)                                                -0.000028
b-parameter (b)                                                2,788
Standard error                                                 94,000
90-percent confidence interval                                 3,119,000 to 3,429,000
The standard error is calculated as
$$s_x = \sqrt{-0.000028 \times 3{,}274{,}000^2 + 2{,}788 \times 3{,}274{,}000} = 94{,}000$$
and the 90-percent confidence interval is calculated as 3,274,000 ± 1.645 × 94,000.
A conclusion that the average estimate derived from all possible samples lies within a range
computed in this way would be correct for roughly 90 percent of all possible samples.
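Formula (1) and the confidence interval in Illustration 1 can be reproduced with a few lines of Python. This is a sketch: the function names are introduced here for illustration, and the standard error is rounded to the nearest thousand before the interval is built, as in the illustration.

```python
import math

def se_estimated_number(x, a, b):
    """Formula (1): approximate standard error of an estimated number."""
    return math.sqrt(a * x**2 + b * x)

def confidence_interval_90(estimate, standard_error, z=1.645):
    """90-percent confidence interval: estimate plus or minus 1.645 standard errors."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Values from Illustration 1 (unemployed females in the civilian labor force).
x = 3_274_000
se = se_estimated_number(x, a=-0.000028, b=2_788)
se_rounded = round(se, -3)  # nearest thousand, as published
low, high = confidence_interval_90(x, se_rounded)

print(se_rounded, round(low, -3), round(high, -3))
```

The same two functions reproduce Illustration 2 when called with x = 60,804,000, a = -0.000005, and b = 1,285.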
Illustration 2
Suppose there were 60,804,000 married-couple family households. Use Formula (1) and the
appropriate parameters from Table 5 to get
Number of married-couple family households (x)                 60,804,000
a-parameter (a)                                                -0.000005
b-parameter (b)                                                1,285
Standard error                                                 244,000
90-percent confidence interval                                 60,403,000 to 61,205,000
The standard error is calculated as
$$s_x = \sqrt{-0.000005 \times 60{,}804{,}000^2 + 1{,}285 \times 60{,}804{,}000} = 244{,}000$$
and the 90-percent confidence interval is calculated as 60,804,000 ± 1.645 × 244,000.
A conclusion that the average estimate derived from all possible samples lies within a range
computed in this way would be correct for roughly 90 percent of all possible samples.
Standard Errors of Estimated Percentages. The reliability of an estimated percentage,
computed using sample data for both numerator and denominator, depends on both the size of
the percentage and its base. Estimated percentages are relatively more reliable than the
corresponding estimates of the numerators of the percentages, particularly if the percentages are
50 percent or more. When the numerator and denominator of the percentage are in different
categories, use the parameter from Table 4 or 5 as indicated by the numerator.
The approximate standard error, sy,p, of an estimated percentage can be obtained by using the
formula:
\[ s_{y,p} = \sqrt{\frac{b}{y}\, p(100 - p)} \tag{2} \]
Here y is the total number of people, families, households, or unrelated individuals in the base or
denominator of the percentage, p is the percentage 100*x/y (0 ≤ p ≤ 100), and b is the parameter
in Table 4 or 5 associated with the characteristic in the numerator of the percentage.
Illustration 3
Suppose there were 219,804,000 out of 246,325,000 adults (aged 18 and older), or 89.2 percent,
who graduated from high school. Use Formula (2) and the appropriate parameter from Table 5
to get
Illustration 3
Percentage of adults who are high school graduates (p)    89.2
Base (y)                                                  246,325,000
b-parameter (b)                                           1,473
Standard error                                            0.08
90-percent confidence interval                            89.1 to 89.3
The standard error is calculated as
\[ s_{y,p} = \sqrt{\frac{1,473}{246,325,000} \times 89.2 \times (100 - 89.2)} = 0.08 \]
The 90-percent confidence interval of the percentage of adults who graduated from high school
is calculated as 89.2 ± 1.645 × 0.08.
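As a quick check of Formula (2), the sketch below reproduces Illustration 3. The function name `se_percentage` is hypothetical, and rounding the standard error to two decimals simply follows the illustration.

```python
import math

def se_percentage(p, y, b):
    """Formula (2): approximate standard error of a percentage p (0 to 100) on base y."""
    return math.sqrt((b / y) * p * (100 - p))

# Illustration 3: 89.2 percent of 246,325,000 adults, with b = 1,473.
p = 89.2
se = round(se_percentage(p, y=246_325_000, b=1_473), 2)
margin = 1.645 * se
print(f"SE = {se:.2f}; 90% CI = {p - margin:.1f} to {p + margin:.1f}")
```

Only the b-parameter enters this formula; the a-parameter is not used for percentages.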
Standard Errors of Estimated Differences. The standard error of the difference between two
sample estimates is approximately equal to
\[ s_{x_1 - x_2} = \sqrt{s_{x_1}^2 + s_{x_2}^2 - 2 r s_{x_1} s_{x_2}} \tag{3} \]
where sx1 and sx2 are the standard errors of the estimates, x1 and x2. The estimates can be
numbers, percentages, ratios, etc. Tables 6 and 7 contain the correlation coefficient, r, for CPS
year-to-year comparisons. The correlations were derived for income, poverty, and health
insurance estimates, but they can be used for other types of estimates where the year-to-year
correlation between identical households is high. For making other comparisons, assume that r
equals zero. Making this assumption will result in accurate estimates of standard errors for the
difference between two estimates of the same characteristic in two different areas, or for the
difference between separate and uncorrelated characteristics in the same area. However, if there
is a high positive (negative) correlation between the two characteristics, the formula will
overestimate (underestimate) the true standard error.
Illustration 4
Suppose there were 23,780,000 men over age 24 who were never married and 10,829,000 men
over age 24 who were divorced. The apparent difference is 12,951,000. Use Formulas (1) and
(3) with r = 0 and the appropriate parameters from Table 5 to get
Illustration 4
                               Never married (x1)    Divorced (x2)         Difference
Number of males over age 24    23,780,000            10,829,000            12,951,000
a-parameter (a)                -0.000010             -0.000010             -
b-parameter (b)                3,240                 3,240                 -
Standard error                 267,000               184,000               324,000
90-percent confidence          23,341,000 to         10,526,000 to         12,418,000 to
  interval                     24,219,000            11,132,000            13,484,000
The standard error of the difference is calculated as
\[ s_{x_1 - x_2} = \sqrt{267,000^2 + 184,000^2} = 324,000 \]
The 90-percent confidence interval around the difference is calculated as 12,951,000 ± 1.645 ×
324,000. Since this interval does not include zero, we can conclude with 90-percent confidence
that the number of never married men over age 24 was higher than the number of divorced men
over age 24.
Illustration 5
Suppose that the percentage of children in poverty in 2016 was 18.0 percent out of 73,586,000
children, and the percentage of children in poverty in 2015 was 19.7 percent out of 73,647,000
children. The apparent difference is 1.7 percentage points. Use Formulas (2) and (3) and the appropriate
parameter and correlation coefficient from Tables 5 and 7 to get
Illustration 5
                                         2015 (x1)       2016 (x2)       Difference
Percentage of children in poverty (p)    19.7            18.0            1.7
Base                                     73,647,000      73,586,000      -
b-parameter (b)                          4,974           4,974           -
Correlation coefficient (r)              -               -               0.45
Standard error                           0.33            0.32            0.34
90-percent confidence interval           19.2 to 20.2    17.5 to 18.5    1.1 to 2.3
The standard error of the difference is calculated as
\[ s_{x_1 - x_2} = \sqrt{0.33^2 + 0.32^2 - 2 \times 0.45 \times 0.33 \times 0.32} = 0.34 \]
and the 90-percent confidence interval around the difference is calculated as 1.7 ± 1.645 × 0.34.
Since this interval does not include zero, we can conclude with 90-percent confidence that the
percentage of children in poverty in 2016 is lower than the percentage of children in poverty in
2015.
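Illustrations 4 and 5 can be reproduced with a short sketch of Formula (3). The helper names `se_difference` and `differs_from_zero` are hypothetical; the second helper only encodes the "interval does not include zero" check described in the text.

```python
import math

def se_difference(se1, se2, r=0.0):
    """Formula (3): standard error of the difference between two estimates."""
    return math.sqrt(se1**2 + se2**2 - 2 * r * se1 * se2)

def differs_from_zero(diff, se, z=1.645):
    """True when the confidence interval around diff excludes zero (z = 1.645 for 90 percent)."""
    low, high = diff - z * se, diff + z * se
    return not (low <= 0 <= high)

# Illustration 4: uncorrelated estimates (r = 0).
se4 = se_difference(267_000, 184_000)
# Illustration 5: year-to-year comparison with r = 0.45.
se5 = se_difference(0.33, 0.32, r=0.45)
print(round(se4, -3), differs_from_zero(12_951_000, round(se4, -3)))
print(round(se5, 2), differs_from_zero(1.7, round(se5, 2)))
```

Leaving `r` at its default of zero corresponds to the document's guidance for comparisons other than year-to-year ones.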
Standard Errors of Estimated Ratios. Certain estimates may be calculated as the ratio of two
numbers. Compute the standard error of a ratio, x/y, using
\[ s_{x/y} = \frac{x}{y}\sqrt{\left(\frac{s_x}{x}\right)^2 + \left(\frac{s_y}{y}\right)^2 - 2r\,\frac{s_x s_y}{xy}} \tag{4} \]
The standard error of the numerator, sx, and that of the denominator, sy, may be calculated using
formulas described earlier. In Formula (4), r represents the correlation between the numerator
and the denominator of the estimate.
For one type of ratio, the denominator is a count of families or households and the numerator is a
count of people in those families or households with a certain characteristic. If there is at least
one person with the characteristic in every family or household, use 0.7 as an estimate of r. An
example of this type is the average number of children per family with children.
For all other types of ratios, r is assumed to be zero. Examples are the average number of
children per family and the family poverty rate. If r is actually positive (negative), then this
procedure will provide an overestimate (underestimate) of the standard error of the ratio.
Note: For estimates expressed as the ratio of x per 100 y or x per 1,000 y, multiply Formula (4)
by 100 or 1,000, respectively, to obtain the standard error.
Illustration 6
Suppose there were 10,818,000 males working part-time and 18,221,000 females working part-time. The ratio of males working part-time to females working part-time would be 0.594, or 59.4
percent. Use Formulas (1) and (4) with r = 0 and the appropriate parameters from Table 4 to get
Illustration 6
                              Males (x)         Females (y)       Ratio
Number who work part-time     10,818,000        18,221,000        0.594
a-parameter (a)               -0.000031         -0.000028         -
b-parameter (b)               2,947             2,788             -
Standard error                168,000           204,000           0.011
90-percent confidence         10,542,000 to     17,885,000 to     0.576 to 0.612
  interval                    11,094,000        18,557,000
The standard error is calculated as
\[ s_{x/y} = \frac{10,818,000}{18,221,000}\sqrt{\left(\frac{168,000}{10,818,000}\right)^2 + \left(\frac{204,000}{18,221,000}\right)^2} = 0.011 \]
and the 90-percent confidence interval is calculated as 0.594 ± 1.645 × 0.011.
Illustration 7
Suppose that the number of families below the poverty level was 8,081,000 and the total number
of families was 82,854,000. The ratio of families below the poverty level to the total number of
families would be 0.098 or 9.8 percent. Use the appropriate parameters from Table 5 and
Formulas (1) and (4) with r = 0 to get
Illustration 7
                              In poverty (x)    Total (y)         Ratio (in percent)
Number of families            8,081,000         82,854,000        9.8
a-parameter (a)               0.000052          -0.000005         -
b-parameter (b)               1,518             1,285             -
Standard error                125,000           269,000           0.15
90-percent confidence         7,875,000 to      82,411,000 to     9.6 to 10.0
  interval                    8,287,000         83,297,000
The standard error is calculated as
\[ s_{x/y} = \frac{8,081,000}{82,854,000}\sqrt{\left(\frac{125,000}{8,081,000}\right)^2 + \left(\frac{269,000}{82,854,000}\right)^2} = 0.0015 = 0.15\% \]
and the 90-percent confidence interval of the percentage is calculated as 9.8 ± 1.645 × 0.15.
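Formula (4) and the "per 100" note can be sketched as below. The function name `se_ratio` and its `scale` argument are hypothetical conveniences; `scale=100` corresponds to a ratio expressed per 100 y, as in Illustration 7.

```python
import math

def se_ratio(x, y, se_x, se_y, r=0.0, scale=1):
    """Formula (4): standard error of the ratio x/y, optionally scaled per 100 or per 1,000."""
    relative_variance = (se_x / x) ** 2 + (se_y / y) ** 2 - 2 * r * se_x * se_y / (x * y)
    return scale * (x / y) * math.sqrt(relative_variance)

# Illustration 6: males to females working part-time, r = 0.
print(round(se_ratio(10_818_000, 18_221_000, 168_000, 204_000), 3))
# Illustration 7: family poverty rate in percent, r = 0.
print(round(se_ratio(8_081_000, 82_854_000, 125_000, 269_000, scale=100), 2))
```

Passing `r=0.7` would cover the special case described above, where every family or household contains at least one person with the characteristic.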
Standard Errors of Estimated Medians. The sampling variability of an estimated median
depends on the form of the distribution and the size of the base. One can approximate the
reliability of an estimated median by determining a confidence interval about it. (See “Standard
Errors and Their Use” for a general discussion of confidence intervals.)
Estimate the 68-percent confidence limits of a median based on sample data using the following
procedure:
1. Using Formula (2) and the base of the distribution, calculate the standard error of 50
   percent.
2. Add to and subtract from 50 percent the standard error determined in step 1. These two
   numbers are the percentage limits corresponding to the 68-percent confidence interval
   about the estimated median.
3. Using the distribution of the characteristic, determine upper and lower limits of the
   68-percent confidence interval by calculating values corresponding to the two points
   established in step 2.
   Note: The percentage limits found in step 2 may or may not fall in the same
   characteristic distribution interval.
   Use the following formula to calculate the upper and lower limits:
   \[ X_p = \frac{pN - N_1}{N_2 - N_1}(A_2 - A_1) + A_1 \tag{5} \]
   where
   Xp     = estimated upper and lower bounds for the confidence interval
            (0 ≤ p ≤ 1). For purposes of calculating the confidence interval, p
            takes on the values determined in step 2. Note that Xp estimates
            the median when p = 0.50.
   N      = for distribution of numbers: the total number of units (people,
            households, etc.) for the characteristic in the distribution.
          = for distribution of percentages: the value 100.
   p      = the values obtained in step 2, expressed as proportions (for example,
            0.4983 for 49.83 percent).
   A1, A2 = the lower and upper bounds, respectively, of the interval
            containing Xp.
   N1, N2 = for distribution of numbers: the estimated number of units
            (people, households, etc.) with values of the characteristic less than
            or equal to A1 and A2, respectively.
          = for distribution of percentages: the estimated percentage of units
            (people, households, etc.) having values of the characteristic less
            than or equal to A1 and A2, respectively.
4. Divide the difference between the two points determined in step 3 by 2 to obtain the
   standard error of the median.
Note: Median incomes and their standard errors calculated as below may differ from those in
published tables and reports showing income, since narrower income intervals were used
in those calculations.
Illustration 8
Suppose there were 126,224,000 households in 2017, and their income was distributed in the
following way:
Illustration 8
                        Number of       Cumulative number    Cumulative percent
Income level            households      of households        of households
Under $5,000            4,138,000       4,138,000            3.28%
$5,000 to $9,999        3,878,000       8,016,000            6.35%
$10,000 to $14,999      6,122,000       14,138,000           11.20%
$15,000 to $24,999      12,083,000      26,221,000           20.77%
$25,000 to $34,999      11,857,000      38,078,000           30.17%
$35,000 to $49,999      16,303,000      54,381,000           43.08%
$50,000 to $74,999      21,405,000      75,786,000           60.04%
$75,000 to $99,999      15,473,000      91,259,000           72.30%
$100,000 and over       34,963,000      126,222,000          100.00%
*There may be a difference due to rounding.
1. Using Formula (2) with b = 1,393, the standard error of 50 percent on a base of
   126,224,000 is about 0.17 percent.
2. To obtain a 68-percent confidence interval on an estimated median, add to and subtract
   from 50 percent the standard error found in step 1. This yields percentage limits of 49.83
   and 50.17.
3. The lower and upper limits for the interval in which the percentage limits fall are
   $50,000 and $75,000, respectively.
   Then the estimated numbers of households with an income less than or equal to $50,000
   and $75,000 are 54,381,000 and 75,786,000, respectively.
   Using Formula (5), the lower limit for the confidence interval of the median is found to
   be about
   \[ X_{0.4983} = \frac{0.4983 \times 126,224,000 - 54,381,000}{75,786,000 - 54,381,000}(75,000 - 50,000) + 50,000 = 59,947 \]
   Similarly, the upper limit is found to be about
   \[ X_{0.5017} = \frac{0.5017 \times 126,224,000 - 54,381,000}{75,786,000 - 54,381,000}(75,000 - 50,000) + 50,000 = 60,448 \]
   Thus, a 68-percent confidence interval for the median income for households is from
   $59,947 to $60,448.
4. The standard error of the median is, therefore,
   \[ \frac{60,448 - 59,947}{2} = 250.5 \]
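The four-step median procedure in Illustration 8 can be sketched as follows. This is an illustrative sketch under stated assumptions, not Census Bureau code: the function names are hypothetical, the first income interval is assumed to start at zero, and the standard error of 50 percent is rounded to two decimals so that the results match the rounded figures in the illustration.

```python
import bisect
import math

def se_percentage(p, y, b):
    """Formula (2): standard error of a percentage p (0 to 100) on base y."""
    return math.sqrt((b / y) * p * (100 - p))

def interpolate(p, n_total, n1, n2, a1, a2):
    """Formula (5): value at proportion p (0 to 1) by linear interpolation."""
    return (p * n_total - n1) / (n2 - n1) * (a2 - a1) + a1

def median_limits(base, b, upper_bounds, cumulative):
    """Steps 1 to 3: 68-percent confidence limits for a median from grouped data.

    upper_bounds[i] is the upper boundary of closed group i and cumulative[i]
    is the number of units at or below that boundary.
    """
    se50 = round(se_percentage(50, base, b), 2)  # step 1 (rounded as in the text)
    limits = []
    for pct in (50 - se50, 50 + se50):           # step 2
        p = pct / 100
        i = bisect.bisect_left(cumulative, p * base)
        if i >= len(cumulative):
            raise ValueError("limit falls in the open-ended top interval")
        a1 = upper_bounds[i - 1] if i > 0 else 0  # assumption: first group starts at 0
        n1 = cumulative[i - 1] if i > 0 else 0
        limits.append(interpolate(p, base, n1, cumulative[i], a1, upper_bounds[i]))  # step 3
    return limits[0], limits[1]

upper_bounds = [5_000, 10_000, 15_000, 25_000, 35_000, 50_000, 75_000, 100_000]
cumulative = [4_138_000, 8_016_000, 14_138_000, 26_221_000,
              38_078_000, 54_381_000, 75_786_000, 91_259_000]
lower, upper = median_limits(126_224_000, 1_393, upper_bounds, cumulative)
se_median = (round(upper) - round(lower)) / 2    # step 4
print(round(lower), round(upper), se_median)
```

The open-ended top group is left out of the lookup lists because Formula (5) needs both an upper and a lower boundary for the interval containing the limit.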
Standard Errors of Averages for Grouped Data. The formula used to estimate the standard
error of an average for grouped data is
\[ s_{\bar{x}} = \sqrt{\frac{b}{y} S^2} \tag{6} \]
In this formula, y is the size of the base of the distribution and b is the parameter from Table 4 or
5. The variance, S², is given by the following formula:
\[ S^2 = \sum_{i=1}^{c} p_i x_i^2 - \bar{x}^2 \tag{7} \]
where x̄, the average of the distribution, is estimated by
\[ \bar{x} = \sum_{i=1}^{c} p_i x_i \tag{8} \]
where
c = the number of groups; i indicates a specific group, thus taking on values 1
through c.
pi = estimated proportion of households, families, or people whose values for the
characteristic being considered fall in group i.
xi = (ZLi + ZUi)/2 where ZLi and ZUi are the lower and upper interval boundaries,
respectively, for group i. xi is assumed to be the most representative value for
the characteristic of households, families, or people in group i. If group c is
open-ended, i.e., no upper interval boundary exists, use a group approximate average
value of
\[ x_c = \frac{3}{2} Z_{Lc} \tag{9} \]
Illustration 9
Suppose that there were 8,081,000 families in poverty and that the distribution of the income
deficit (the difference between their family income and poverty threshold) for all families in
poverty was
                        Number of families    Percentage of families    Average income
Income deficit          in poverty            in poverty (pi)           deficit (xi)
Under $1,000            424,000               5.2%                      500
$1,000 to $2,499        679,000               8.4%                      1,750
$2,500 to $4,999        1,204,000             14.9%                     3,750
$5,000 to $7,499        1,109,000             13.7%                     6,250
$7,500 to $9,999        828,000               10.2%                     8,750
$10,000 to $12,499      725,000               9.0%                      11,250
$12,500 to $14,999      817,000               10.1%                     13,750
$15,000 and over        2,295,000             28.4%                     22,500
Total                   8,081,000             100%
*There may be a difference due to rounding
Using Formula (8),
\[ \bar{x} = (0.052 \times 500) + (0.084 \times 1,750) + (0.149 \times 3,750) + (0.137 \times 6,250) + (0.102 \times 8,750) + (0.090 \times 11,250) + (0.101 \times 13,750) + (0.284 \times 22,500) = 11,272 \]
and Formula (7),
\[ S^2 = (0.052 \times 500^2) + (0.084 \times 1,750^2) + (0.149 \times 3,750^2) + (0.137 \times 6,250^2) + (0.102 \times 8,750^2) + (0.090 \times 11,250^2) + (0.101 \times 13,750^2) + (0.284 \times 22,500^2) - 11,272^2 = 62,729,000 \]
Use the appropriate parameter from Table 5 and Formula (6) to get
Illustration 9
Average income deficit for families in poverty (x̄)    $11,272
Variance (S²)                                          62,729,000
Base (y)                                               8,081,000
b-parameter (b)                                        1,518
Standard error                                         $109
90-percent confidence interval                         $11,093 to $11,451
The standard error is calculated as
\[ s_{\bar{x}} = \sqrt{\frac{1,518}{8,081,000}(62,729,000)} = 109 \]
and the 90-percent confidence interval is calculated as $11,272 ± 1.645 × $109.
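Formulas (6) through (9) can be sketched together as below, using the Illustration 9 distribution. The function names are hypothetical, and the group boundaries are assumed to be the round values implied by the midpoints in the table (for example, $1,000 to $2,500 for the second group). Because the code does not round the mean before squaring it, the variance it produces differs slightly from the 62,729,000 shown in the illustration, while the standard error still rounds to $109.

```python
import math

def group_midpoints(bounds):
    """Midpoint of each closed group; 1.5 times the lower bound of an open-ended group (Formula 9)."""
    return [1.5 * lo if hi is None else (lo + hi) / 2 for lo, hi in bounds]

def grouped_mean_se(proportions, midpoints, y, b):
    """Return the mean (Formula 8), variance (Formula 7), and standard error (Formula 6)."""
    mean = sum(p * x for p, x in zip(proportions, midpoints))
    variance = sum(p * x**2 for p, x in zip(proportions, midpoints)) - mean**2
    return mean, variance, math.sqrt((b / y) * variance)

# Illustration 9: income deficit of families in poverty; None marks the open-ended group.
bounds = [(0, 1_000), (1_000, 2_500), (2_500, 5_000), (5_000, 7_500),
          (7_500, 10_000), (10_000, 12_500), (12_500, 15_000), (15_000, None)]
shares = [0.052, 0.084, 0.149, 0.137, 0.102, 0.090, 0.101, 0.284]
mean, variance, se = grouped_mean_se(shares, group_midpoints(bounds), y=8_081_000, b=1_518)
print(round(mean), round(se))
```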
Standard Errors of Estimated Per Capita Deficits. Certain average values in reports
associated with the CPS ASEC data represent the per capita deficit for households of a certain
class. The average per capita deficit is approximately equal to
\[ x = \frac{hm}{p} \tag{10} \]
where
h = number of households in the class.
m = average deficit for households in the class.
p = number of people in households in the class.
x = average per capita deficit of people in households in the class.
To approximate standard errors for these averages, use the formula
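\[ s_x = \frac{hm}{p}\sqrt{\left(\frac{s_m}{m}\right)^2 + \left(\frac{s_p}{p}\right)^2 + \left(\frac{s_h}{h}\right)^2 - 2r\left(\frac{s_p}{p}\right)\left(\frac{s_h}{h}\right)} \tag{11} \]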
In Formula (11), r represents the correlation between p and h.
For one type of average, the class represents households containing a fixed number of people.
For example, h could be the number of 3-person households. In this case, there is an exact
correlation between the number of people in households and the number of households.
Therefore, r = 1 for such households. For other types of averages, the class represents
households of other demographic types, for example, households in distinct regions, households
in which the householder is of a certain age group, and owner-occupied and tenant-occupied
households. In this and other cases in which the correlation between p and h is not perfect, use
0.7 as an estimate of r.
Illustration 10
Suppose there were 27,762,000 people living in families in poverty, and 8,081,000 families in
poverty, with an average income deficit for families in poverty of $11,272 with a standard error
of $109 (from Illustration 9). Use Formulas (1), (10), and (11) and the appropriate parameters
from Table 5 and r = 0.7 to get
Illustration 10
                                  Number (h)       Number of        Average income    Average per capita
                                                   people (p)       deficit (m)       deficit (x)
Value for families in poverty     8,081,000        27,762,000       $11,272           $3,281
a-parameter (a)                   0.000052         -0.000020        -                 -
b-parameter (b)                   1,518            6,452            -                 -
Correlation (r)                   -                -                -                 0.7
Standard error                    125,000          405,000          $109              $50
90-percent confidence             7,875,000 to     27,096,000 to    $11,093 to        $3,199 to
  interval                        8,287,000        28,428,000       $11,451           $3,363
The estimate of the average per capita deficit is calculated as
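\[ x = \frac{8,081,000 \times 11,272}{27,762,000} = 3,281 \]
and the standard error is calculated as
\[ s_x = \frac{8,081,000 \times 11,272}{27,762,000}\sqrt{\left(\frac{109}{11,272}\right)^2 + \left(\frac{405,000}{27,762,000}\right)^2 + \left(\frac{125,000}{8,081,000}\right)^2 - 2 \times 0.7 \times \left(\frac{405,000}{27,762,000}\right)\left(\frac{125,000}{8,081,000}\right)} = 50 \]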
The 90-percent confidence interval is calculated as $3,281 ± 1.645 × $50.
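Formulas (10) and (11) can be checked against Illustration 10 with the sketch below. The function names are hypothetical, and the inputs are the rounded standard errors quoted in the illustration.

```python
import math

def per_capita_deficit(h, m, p):
    """Formula (10): average per capita deficit x = h * m / p."""
    return h * m / p

def se_per_capita_deficit(h, m, p, se_h, se_m, se_p, r):
    """Formula (11): standard error of the average per capita deficit."""
    relative_variance = ((se_m / m) ** 2 + (se_p / p) ** 2 + (se_h / h) ** 2
                         - 2 * r * (se_p / p) * (se_h / h))
    return per_capita_deficit(h, m, p) * math.sqrt(relative_variance)

# Illustration 10: families in poverty, with r = 0.7 between people and households.
x = per_capita_deficit(h=8_081_000, m=11_272, p=27_762_000)
se = se_per_capita_deficit(h=8_081_000, m=11_272, p=27_762_000,
                           se_h=125_000, se_m=109, se_p=405_000, r=0.7)
print(round(x), round(se))
```

Setting `r=1` would cover classes made up of households with a fixed number of people, as described above.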
Accuracy of State Estimates. The redesign of the CPS following the 1980 census provided an
opportunity to increase efficiency and accuracy of state data. All strata are now defined within
state boundaries. The sample is allocated among the states to produce state and national
estimates with the required accuracy while keeping total sample size to a minimum. Improved
accuracy of state data was achieved with about the same sample size as in the 1970 design.
Since the CPS is designed to produce both state and national estimates, the proportion of the total
population sampled and the sampling rates differ among the states. In general, the smaller the
population of the state the larger the sampling proportion. For example, in Vermont
approximately 1 in every 250 households is sampled each month. In New York the sample is
about 1 in every 2,000 households. Nevertheless, the size of the sample in New York is four
times larger than in Vermont because New York has a larger population.
Note: The Census Bureau recommends the use of 3-year averages to compare estimates across
states and 2-year averages to evaluate changes in state estimates over time. See
“Standard Errors of Data for Combined Years” and “Standard Errors of Differences of 2-Year Averages.” The Census Bureau also recommends the American Community Survey
microdata file as the preferred source for income and poverty state data in years 2006
(2005 estimates) to the present.
Standard Errors for State Estimates. The standard error for a state may be obtained by
determining new state-level a- and b-parameters and then using these adjusted parameters in the
standard error formulas mentioned previously. To determine a new state-level b-parameter
(bstate), multiply the b-parameter from Table 4 or 5 by the state factor from Table 8. To
determine a new state-level a-parameter (astate), use the following:
(1) If the a-parameter from Table 4 or 5 is positive, multiply it by the state factor
    from Table 8.
(2) If the a-parameter in Table 4 or 5 is negative, calculate the new state-level
    a-parameter as follows:
    \[ a_{state} = \frac{-b_{state}}{POP_{state}} \tag{12} \]
where POPstate is the state population found in Table 8.
Illustration 11
Suppose there were 14,731,000 people living in New York state who were born in the United
States. Use Formulas (1) and (12) and the appropriate parameter, factor, and population from
Tables 5 and 8 to get
Illustration 11
Number of people in NY who were born in the U.S. (x)    14,731,000
b-parameter (b)                                         3,240
New York state factor                                   1.19
State population                                        19,521,914
State a-parameter (astate)                              -0.000198
State b-parameter (bstate)                              3,856
Standard error                                          118,000
90-percent confidence interval                          14,537,000 to 14,925,000
Obtain the state-level b-parameter by multiplying the b-parameter, 3,240, by the state factor,
1.19. This gives bstate = 3,240 × 1.19 = 3,856. Obtain the needed state-level a-parameter by
\[ a_{state} = \frac{-3,856}{19,521,914} = -0.000198 \]
The standard error of the estimate of the number of people in New York state who were born in
the United States can then be found by using Formula (1) and the new state-level a- and
b-parameters, -0.000198 and 3,856, respectively. The standard error is given by
\[ s_x = \sqrt{-0.000198 \times 14,731,000^2 + 3,856 \times 14,731,000} = 118,000 \]
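The state-level parameter adjustment can be sketched as below, using the Illustration 11 inputs. The function name `state_parameters` is hypothetical. The text covers only positive and negative a-parameters; treating an a-parameter of exactly zero like the positive case is an assumption of this sketch. The code keeps unrounded parameters, so intermediate values differ slightly from the rounded ones in the illustration.

```python
import math

def state_parameters(a, b, factor, population):
    """Adjust national a- and b-parameters to a state (or region) using its factor and population."""
    b_state = b * factor
    if a < 0:
        a_state = -b_state / population   # Formula (12)
    else:
        a_state = a * factor              # positive case; zero handled here by assumption
    return a_state, b_state

# Illustration 11: people in New York state who were born in the United States.
a_state, b_state = state_parameters(a=-0.000010, b=3_240, factor=1.19, population=19_521_914)
x = 14_731_000
se = math.sqrt(a_state * x**2 + b_state * x)   # Formula (1) with the state-level parameters
print(round(b_state), round(se, -3))
```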
Standard Errors of Regional Estimates. To compute standard errors for regional estimates,
follow the steps for computing standard errors for state estimates found in “Standard Errors for
State Estimates” using the regional factors and populations found in Table 9.
Illustration 12
Suppose there were 17,028,000 of 120,855,591 people, or 14.1 percent, living in poverty in the
South. Use Formulas (2) and (12) and the appropriate parameter, factor, and population from
Tables 5 and 9 to get
Illustration 12
Poverty rate in the South (p)       14.1
Base (y)                            120,855,591
b-parameter (b)                     6,452
South regional factor               1.13
Regional b-parameter (bregion)      7,291
Standard error                      0.27
90-percent confidence interval      13.7 to 14.5
Obtain the region-level b-parameter by multiplying the b-parameter, 6,452, by the South regional
factor, 1.13. This gives bregion = 6,452 × 1.13 = 7,291.
The standard error of the estimate of the poverty rate for people living in the South can then be
found by using Formula (2) and the new region-level b-parameter, 7,291. The standard error is
given by
\[ s_{y,p} = \sqrt{\frac{7,291}{120,855,591} \times 14.1 \times (100 - 14.1)} = 0.27 \]
and the 90-percent confidence interval of the poverty rate for people living in the South is
calculated as 14.1 ± 1.645 × 0.27.
Standard Errors of Groups of States. The standard error calculation for a group of states is
similar to the standard error calculation for a single state. First, calculate a new state group
factor for the group of states. Then, determine new state group a- and b-parameters. Finally, use
these adjusted parameters in the standard error formulas mentioned previously.
Use the following formula to determine a new state group factor:
\[ \text{state group factor} = \frac{\sum_{i=1}^{n} POP_i \times \text{state factor}_i}{\sum_{i=1}^{n} POP_i} \tag{13} \]
where POPi and state factori are the population and factor for state i from Table 8. To obtain a
new state group b-parameter (bstate group), multiply the b-parameter from Table 4 or 5 by the state
group factor obtained by Formula (13). To determine a new state group a-parameter (astate group), use the
following:
(1) If the a-parameter from Table 4 or 5 is positive, multiply it by the state group
    factor determined by Formula (13).
(2) If the a-parameter in Table 4 or 5 is negative, calculate the new state group
    a-parameter as follows:
    \[ a_{state\ group} = \frac{-b_{state\ group}}{\sum_{i=1}^{n} POP_i} \tag{14} \]
Illustration 13
Suppose the state group factor for the state group Illinois-Indiana-Michigan was required. The
appropriate factor would be
\[ \text{state group factor} = \frac{12,595,529 \times 1.17 + 6,553,089 \times 1.11 + 9,829,697 \times 1.11}{12,595,529 + 6,553,089 + 9,829,697} = 1.14 \]
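Formula (13) is a population-weighted average of the state factors, which the following sketch makes explicit with the Illustration 13 inputs. The function name `state_group_factor` is hypothetical.

```python
def state_group_factor(populations, factors):
    """Formula (13): population-weighted average of the state factors."""
    weighted = sum(pop * factor for pop, factor in zip(populations, factors))
    return weighted / sum(populations)

# Illustration 13: Illinois, Indiana, and Michigan.
populations = [12_595_529, 6_553_089, 9_829_697]
factors = [1.17, 1.11, 1.11]
group_factor = state_group_factor(populations, factors)
print(round(group_factor, 2))
```

The resulting factor and the summed population then play the same roles for the group that the state factor and state population play in Formula (12).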
Standard Errors of Data for Combined Years. Sometimes estimates for multiple years are
combined to improve precision. For example, suppose x̄ is an average derived from n
consecutive years’ data, i.e.,
\[ \bar{x} = \sum_{i=1}^{n} \frac{x_i}{n} \]
where the xi are the estimates for the individual years.
Use the formulas described previously to estimate the standard error, sxi, of each year’s estimate.
Then the standard error of x̄ is
\[ s_{\bar{x}} = \frac{s_x}{n} \tag{15} \]
where
\[ s_x = \sqrt{\sum_{i=1}^{n} s_{x_i}^2 + 2r\sum_{i=1}^{n-1} s_{x_i} s_{x_{i+1}}} \tag{16} \]
and sxi are the standard errors of the estimates xi. Tables 6 and 7 contain the correlation
coefficients, r, for the correlation between consecutive years i and i+1. Correlation between
nonconsecutive years is zero. The correlations were derived for income and poverty estimates,
but they can be used for other types of estimates where the year-to-year correlation between
identical households is high.
The Census Bureau recommends the use of 3-year average estimates for certain small population
subgroups⁵ (see also “Accuracy of State Estimates”). Two-year moving averages are
recommended for these small population subgroups for comparisons across adjacent years (see
“Standard Errors of Differences of 2-Year Averages”).
Illustration 14
Suppose the 2014-2016 3-year average percentage of families with female householder, no
husband present (FFH), in poverty was 28.5. Suppose the percentages and bases for 2014, 2015,
and 2016 were 30.6, 28.2, and 26.6 percent and 15,553,000, 15,630,000, and 15,581,000,
respectively. Use the appropriate parameters and correlation coefficients from Tables 5 and 7
and Formulas (2), (15), and (16) to get
Illustration 14
                                      2014            2015            2016            2014-2016 avg
Percentage of families with female
householder, no husband present
(FFH), in poverty (p)                 30.6            28.2            26.6            28.5
Base (y)                              15,553,000      15,630,000      15,581,000      -
b-parameter (b)                       1,518           1,518           1,518           -
Correlation (r)                       -               -               -               0.35, 0.35
Standard error                        0.46            0.44            0.44            0.31
90-percent confidence interval        29.8 to 31.4    27.5 to 28.9    25.9 to 27.3    28.0 to 29.0
The standard error of the 3-year average is calculated as
\[ s_{\bar{x}} = \frac{0.94}{3} = 0.31 \]
where
\[ s_x = \sqrt{0.46^2 + 0.44^2 + 0.44^2 + (2 \times 0.35 \times 0.46 \times 0.44) + (2 \times 0.35 \times 0.44 \times 0.44)} = 0.94 \]
The 90-percent confidence interval for the 3-year average percentage of families with a female
householder, no husband present, in poverty is 28.5 ± 1.645 × 0.31.
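Formulas (15) and (16) can be sketched as below with the Illustration 14 standard errors. The function name `se_multiyear_average` is hypothetical, and a single correlation coefficient is assumed for every pair of consecutive years, as in Formula (16).

```python
import math

def se_multiyear_average(yearly_ses, r):
    """Formulas (15) and (16): standard error of an average over consecutive years."""
    n = len(yearly_ses)
    squares = sum(s**2 for s in yearly_ses)
    # Only consecutive years are correlated; nonconsecutive pairs contribute nothing.
    adjacent = sum(yearly_ses[i] * yearly_ses[i + 1] for i in range(n - 1))
    return math.sqrt(squares + 2 * r * adjacent) / n

# Illustration 14: 2014, 2015, and 2016 standard errors with r = 0.35.
se_avg = se_multiyear_average([0.46, 0.44, 0.44], r=0.35)
print(round(se_avg, 2))
```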
Standard Errors of Quarterly or Yearly Averages. For information on calculating standard
errors for labor force data from the CPS which involve quarterly or yearly averages, please see
the “Explanatory Notes and Estimates of Error: Household Data” section in Employment and
Earnings, a monthly report published by the U.S. Bureau of Labor Statistics.
Year-to-Year Factors. In past years, the Census Bureau published a table of year factors for the
CPS ASEC Supplement in the Source and Accuracy Statement. User demand for these factors
has diminished with the introduction of replicate weights. Data users producing estimates from
prior years should consult the Source and Accuracy Statements covering the years of their
analysis to estimate standard errors.
⁵ Estimates of characteristics of the American Indian and Alaska Native (AIAN) and Native Hawaiian and Other
Pacific Islander (NHOPI) populations based on a single-year sample would be unreliable due to the small size of the
sample that can be drawn from either population. Accordingly, such estimates are based on multiyear averages.
Technical Assistance. If you require assistance or additional information, please contact the
Demographic Statistical Methods Division via e-mail at dsmd.source.and.accuracy@census.gov.
Table 4. Parameters for Computation of Standard Errors for Labor Force Characteristics:
March 2017
Characteristic                                                          a            b
Total or White
  Civilian labor force, employed                                        -0.000013    2,481
  Not in labor force                                                    -0.000013    2,432
  Unemployed                                                            -0.000017    3,244
  Civilian labor force, employed, not in labor force, and unemployed
    Men                                                                 -0.000031    2,947
    Women                                                               -0.000028    2,788
    Both sexes, 16 to 19 years                                          -0.000261    3,244
Black
  Civilian labor force, employed, not in labor force, and unemployed    -0.000117    3,601
    Men                                                                 -0.000249    3,465
    Women                                                               -0.000190    3,191
    Both sexes, 16 to 19 years                                          -0.001425    3,601
Asian, American Indian and Alaska Native, Native Hawaiian and
Other Pacific Islander
  Civilian labor force, employed, not in labor force, and unemployed    -0.000245    3,311
    Men                                                                 -0.000537    3,397
    Women                                                               -0.000399    2,874
    Both sexes, 16 to 19 years                                          -0.004078    3,311
Hispanic, may be of any race
  Civilian labor force, employed, not in labor force, and unemployed    -0.000087    3,316
    Men                                                                 -0.000172    3,276
    Women                                                               -0.000158    3,001
    Both sexes, 16 to 19 years                                          -0.000909    3,316
NOTES: (1) These parameters are to be applied to basic CPS monthly labor force estimates.
(2) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in
combination race group estimates.
(3) For nonmetropolitan characteristics, multiply the a- and b-parameters by 1.5. If the characteristic
of interest is total state population, not subtotaled by race or ethnicity, the a- and b-parameters are
zero.
(4) For foreign-born and noncitizen characteristics for Total and White, the a- and b-parameters
should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen
characteristics for Black, Hispanic, and Asian, AIAN, NHOPI parameters.
(5) For the groups self-classified as having two or more races, use the Asian, AIAN, NHOPI
parameters for all employment characteristics.
Table 5. Parameters for Computation of Standard Errors for People and Families: 2017 CPS ASEC
                                            Total or White        Black                 Asian, AIAN, & NHOPI  Hispanic
Characteristics                             a          b          a          b          a          b          a          b
PEOPLE
Educational attainment                      -0.000005  1,473      -0.000023  1,666      -0.000042  1,345      -0.000020  1,126
Employment                                  -0.000012  3,748      -0.000057  4,220      -0.000122  3,907      -0.000073  4,220
People by family income                     -0.000010  3,047      -0.000047  3,488      -0.000109  3,488      -0.000061  3,488
Income characteristics
  Total                                     -0.000005  1,526      -0.000024  1,747      -0.000055  1,747      -0.000030  1,747
  Male                                      -0.000010  1,526      -0.000050  1,747      -0.000113  1,747      -0.000061  1,747
  Female                                    -0.000009  1,526      -0.000045  1,747      -0.000106  1,747      -0.000061  1,747
  Age
    15 to 24                                -0.000036  1,526      -0.000150  1,747      -0.000330  1,747      -0.000120  1,747
    25 to 44                                -0.000018  1,526      -0.000083  1,747      -0.000184  1,747      -0.000102  1,747
    45 to 64                                -0.000018  1,526      -0.000102  1,747      -0.000257  1,747      -0.000154  1,747
    65 and over                             -0.000031  1,526      -0.000230  1,747      -0.000572  1,747      -0.000430  1,747
Health insurance                            -0.000010  3,240      -0.000063  4,653      -0.000146  4,653      -0.000081  4,653
Marital status, household and family
  Some household members                    -0.000010  3,240      -0.000063  4,653      -0.000146  4,653      -0.000081  4,653
  All household members                     -0.000012  3,936      -0.000093  6,861      -0.000215  6,861      -0.000119  6,861
Mobility (movers)
  Educational attainment, labor force,
  Marital status, HH, family, and income    -0.000006  1,783      -0.000024  1,783      -0.000056  1,783      -0.000031  1,783
  US, county, state, region, or MSA         -0.000015  4,843      -0.000066  4,843      -0.000152  4,843      -0.000084  4,843
Below poverty
  Total                                     -0.000020  6,452      -0.000087  6,452      -0.000202  6,452      -0.000112  6,452
  Male                                      -0.000041  6,452      -0.000184  6,452      -0.000418  6,452      -0.000224  6,452
  Female                                    -0.000040  6,452      -0.000166  6,452      -0.000391  6,452      -0.000224  6,452
  Age
    Under 15                                -0.000081  4,974      -0.000285  4,974      -0.000603  4,974      -0.000302  4,974
    Under 18                                -0.000061  4,974      -0.000221  4,974      -0.000490  4,974      -0.000245  4,974
    15 and over                             -0.000025  6,452      -0.000112  6,452      -0.000262  6,452      -0.000137  6,452
    15 to 24                                -0.000057  2,441      -0.000209  2,441      -0.000462  2,441      -0.000167  2,441
    25 to 44                                -0.000029  2,441      -0.000116  2,441      -0.000257  2,441      -0.000142  2,441
    45 to 64                                -0.000029  2,441      -0.000143  2,441      -0.000359  2,441      -0.000215  2,441
    65 and over                             -0.000050  2,441      -0.000321  2,441      -0.000799  2,441      -0.000601  2,441
Unemployment                                -0.000012  3,782      -0.000057  4,220      -0.000122  3,907      -0.000073  4,220
FAMILIES, HOUSEHOLDS, OR UNRELATED INDIVIDUALS
Income                                      -0.000006  1,393      -0.000028  1,521      -0.000068  1,521      -0.000028  1,521
Marital status, HH and family, educational
  attainment, population by age/sex         -0.000005  1,285      -0.000022  1,163      -0.000052  1,163      -0.000022  1,163
Poverty                                     0.000052   1,518      0.000052   1,518      0.000052   1,518      0.000052   1,518
NOTES: (1) These parameters are to be applied to the 2017 Annual Social and Economic Supplement data.
(2) AIAN, NHOPI are American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander, respectively.
(3) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and ethnicity,
please see the “Generalized Variance Parameters” section.
(4) The Total or White, Black, and Asian, AIAN, NHOPI parameters are to be used for both alone and in-combination
race group estimates.
(5) For nonmetropolitan characteristics, multiply the a- and b-parameters by 1.5. If the characteristic of interest is total
state population, not subtotaled by race or ancestry, the a- and b-parameters are zero.
(6) For foreign-born and noncitizen characteristics for Total and White, the a- and b-parameters should be multiplied by
1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Black, Asian, AIAN, NHOPI,
and Hispanic.
(7) For the group self-classified as having two or more races, use the Asian, AIAN, NHOPI parameters for all
characteristics except employment, unemployment, and educational attainment, in which case use Black parameters.
Table 6. CPS Year-to-Year Correlation Coefficients for Income and
Health Insurance Characteristics: 1961 to 2017
                   1961-2001 (basic) or      2000 (basic)-
                   2001 (expanded)-2017      2001 (expanded)
Characteristics    People    Families        People    Families
Total              0.30      0.35            0.19      0.22
White              0.30      0.35            0.20      0.23
Black              0.30      0.35            0.15      0.18
Other              0.30      0.35            0.15      0.17
Hispanic           0.45      0.55            0.36      0.28
NOTES:
(1) Correlation coefficients are not available for income data before 1961.
(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and
ethnicity, please see the “Generalized Variance Parameters” section.
(3) These correlation coefficients are for comparisons of consecutive years. For comparisons of
nonconsecutive years, assume the correlation is zero.
(4) For households and unrelated individuals, use the correlation coefficient for families.
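As a rough sketch of how these coefficients are typically applied, the Python below looks up a Table 6 correlation and uses it in the standard error of a year-to-year difference. The formula sqrt(s_x^2 + s_y^2 - 2*r*s_x*s_y) is assumed here as the usual expression for the standard error of a difference between correlated estimates. The function names and the dictionary layout are hypothetical, and only the "1961-2001 (basic) or 2001 (expanded)-2017" columns are encoded.

```python
import math

# Table 6, columns "1961-2001 (basic) or 2001 (expanded)-2017".
INCOME_CORRELATION = {
    "Total":    {"people": 0.30, "families": 0.35},
    "White":    {"people": 0.30, "families": 0.35},
    "Black":    {"people": 0.30, "families": 0.35},
    "Other":    {"people": 0.30, "families": 0.35},
    "Hispanic": {"people": 0.45, "families": 0.55},
}


def correlation(group, unit, consecutive=True):
    """Return r for a year-to-year comparison, following notes (3) and (4)."""
    if not consecutive:
        return 0.0  # note (3): assume zero for nonconsecutive years
    if unit in ("households", "unrelated individuals"):
        unit = "families"  # note (4): use the family coefficient
    return INCOME_CORRELATION[group][unit]


def se_of_difference(se_x, se_y, r):
    """Assumed formula: sqrt(s_x^2 + s_y^2 - 2*r*s_x*s_y)."""
    return math.sqrt(se_x ** 2 + se_y ** 2 - 2 * r * se_x * se_y)


# Hypothetical standard errors for two consecutive-year household estimates.
r = correlation("Hispanic", "households")
print(se_of_difference(120_000, 115_000, r))
```

A positive correlation reduces the standard error of the difference relative to treating the two years as independent, which is why note (3) matters for nonconsecutive comparisons.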
Table 7. CPS Year-to-Year Correlation Coefficients for Poverty Characteristics: 1971 to 2017
                   1973-84, 1985-2001 (basic)    2000 (basic)-
                   or 2001 (expanded)-2017       2001 (expanded)       1984-1985
Characteristics    People    Families            People    Families    People    Families
Total              0.45      0.35                0.29      0.22        0.39      0.30
White              0.35      0.30                0.23      0.20        0.30      0.26
Black              0.45      0.35                0.23      0.18        0.39      0.30
Other              0.45      0.35                0.22      0.17        0.30      0.30
Hispanic           0.65      0.55                0.52      0.40        0.56      0.47
                   1972-1973                     1971-1972
Characteristics    People    Families            People    Families
Total              0.15      0.14                0.31      0.28
White              0.14      0.13                0.28      0.25
Black              0.17      0.16                0.35      0.32
Other              0.17      0.16                0.35      0.32
Hispanic           0.17      0.16                0.35      0.32
NOTES:
(1) Correlation coefficients are not available for income data before 1961.
(2) Hispanics may be any race. For a more detailed discussion on the use of parameters for race and
ethnicity, please see the “Generalized Variance Parameters” section.
(3) These correlation coefficients are for comparisons of consecutive years. For comparisons of
nonconsecutive years, assume the correlation is zero.
(4) For households and unrelated individuals, use the correlation coefficient for families.
Table 8. Factors and Populations for State Standard Errors and Parameters: 2017 CPS ASEC
State                   Factor    Population
Alabama                 1.11       4,791,427
Alaska                  0.18         713,673
Arizona                 1.25       6,889,336
Arkansas                0.73       2,944,169
California              1.28      38,897,317
Colorado                1.22       5,501,951
Connecticut             0.86       3,525,339
Delaware                0.22         941,311
District of Columbia    0.17         677,845
Florida                 1.14      20,519,984
Georgia                 1.15      10,180,747
Hawaii                  0.32       1,371,120
Idaho                   0.41       1,679,708
Illinois                1.17      12,595,529
Indiana                 1.11       6,553,089
Iowa                    0.77       3,102,418
Kansas                  0.82       2,847,512
Kentucky                1.13       4,362,460
Louisiana               1.01       4,583,832
Maine                   0.39       1,319,057
Maryland                1.15       5,933,998
Massachusetts           1.10       6,757,918
Michigan                1.11       9,829,697
Minnesota               1.13       5,486,356
Mississippi             0.69       2,922,688
Missouri                1.13       5,998,199
Montana                 0.21       1,032,607
Nebraska                0.52       1,885,322
Nevada                  0.77       2,933,967
New Hampshire           0.33       1,319,871
New Jersey              1.15       8,853,596
New Mexico              0.51       2,042,836
New York                1.19      19,521,914
North Carolina          1.18       9,996,224
North Dakota            0.17         742,066
Ohio                    1.10      11,454,580
Oklahoma                1.06       3,856,275
Oregon                  1.07       4,095,432
Pennsylvania            1.11      12,589,262
Rhode Island            0.28       1,041,010
South Carolina          1.07       4,899,206
South Dakota            0.22         851,907
Tennessee               1.10       6,585,858
Texas                   1.32      27,634,076
Utah                    0.53       3,062,284
Vermont                 0.18         617,927
Virginia                1.19       8,227,165
Washington              1.18       7,250,909
West Virginia           0.48       1,798,326
Wisconsin               1.13       5,715,406
Wyoming                 0.16         574,177
NOTES: (1) The state population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a- and b-parameters should
be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for
Black, Asian, AIAN, NHOPI, and Hispanic.
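The following Python sketch shows one way the Table 8 factors and populations might be combined with a national b-parameter. It assumes the relationship b_state = b × factor and a_state = -b_state / POP_state, which should be confirmed against the state standard error instructions elsewhere in this document before use. The function name, the three-state excerpt, and the national b value of 1,000 are hypothetical.

```python
# Excerpt of Table 8: state -> (factor, population).
STATE_TABLE = {
    "California": (1.28, 38_897_317),
    "Texas":      (1.32, 27_634_076),
    "Wyoming":    (0.16, 574_177),
}


def state_parameters(national_b, state):
    """Derive state a- and b-parameters from a national b-parameter.

    Assumed relationship (verify against the state standard error
    instructions): b_state = national_b * factor and
    a_state = -b_state / state population.
    """
    factor, population = STATE_TABLE[state]
    b_state = national_b * factor
    a_state = -b_state / population
    return a_state, b_state


# Hypothetical national b-parameter, for illustration only.
a_wy, b_wy = state_parameters(1000.0, "Wyoming")
print(a_wy, b_wy)
```

Under this assumption the variance a*x^2 + b*x falls to zero when the estimate x equals the total state population, which is consistent with the note that the parameters are zero for total state population.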
Table 9. Factors and Populations for Regional Standard
Errors and Parameters: 2017 CPS ASEC
Region       Factor    Population
Midwest      1.06       67,062,081
Northeast    1.07       55,545,894
South        1.13      120,855,591
West         1.12       76,045,317
NOTES: (1) The state population counts in this table are for the 0+ population.
(2) For foreign-born and noncitizen characteristics for Total and White, the a- and b-parameters should
be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for
Black, Asian, AIAN, NHOPI, and Hispanic.
References
[1] Bureau of Labor Statistics, April 2014, “Redesign of the Sample for the Current
Population Survey.” http://www.bls.gov/cps/sample_redesign_2014.pdf
[2] U.S. Census Bureau. 2006. Current Population Survey: Design and Methodology.
Technical Paper 66. Washington, DC: Government Printing Office.
http://www.census.gov/prod/2006pubs/tp-66.pdf
[3] Brooks, C.A. and Bailar, B.A. 1978. Statistical Policy Working Paper 3 - An Error
Profile: Employment as Measured by the Current Population Survey. Subcommittee on
Nonsampling Errors, Federal Committee on Statistical Methodology, U.S. Department of
Commerce, Washington, DC.
https://s3.amazonaws.com/sitesusa/wp-content/uploads/sites/242/2014/04/spwp3.pdf
[4] U.S. Census Bureau. 1993. Money Income of Households, Families, and Persons in the
United States: 1992. Current Population Reports, P60-184. Washington, DC:
Government Printing Office. http://www2.census.gov/prod2/popscan/p60-184.pdf
[5] U.S. Census Bureau. 1978. Money Income in 1976 of Families and Persons in the
United States. Current Population Reports, P60-114. Washington, DC: Government
Printing Office. http://www2.census.gov/prod2/popscan/p60-114.pdf
[6] U.S. Census Bureau, July 15, 2009, “Estimating ASEC Variances with Replicate Weights
Part I: Instructions for Using the ASEC Public Use Replicate Weight File to Create
ASEC Variance Estimates.”
http://www.bls.census.gov/pub/cps/march/Use_of_the_Public_Use_Replicate_Weight_File_final_PR.doc
All online references accessed September 1, 2017.
File Type | application/pdf |
Author | caste311 |
File Modified | 2018-05-23 |
File Created | 2018-05-23 |