PART B OF THE SUPPORTING STATEMENT
FOR THE INFORMATION COLLECTION REQUEST PACKAGE THAT WILL
SUPPORT DEVELOPMENT OF STEAM ELECTRIC POWER GENERATING
EFFLUENT GUIDELINES
U.S. ENVIRONMENTAL PROTECTION AGENCY
February 22, 2010
PART B OF THE SUPPORTING STATEMENT
1. QUESTIONNAIRE OBJECTIVES, KEY VARIABLES, AND OTHER
PRELIMINARIES
1(a) Questionnaire Objectives
The primary objective of this Information Collection Request (ICR) is to gather and
update information about fossil- and nuclear-fueled steam electric power plants to support
technical analyses that EPA will use to develop revised effluent guidelines for the steam electric
power generating industry. More specifically, through this ICR, EPA will administer a
questionnaire to power plants in order to obtain sufficient information to develop an industry
profile and to assess the pollutant reductions, costs, and economic achievability of candidate
pollution control technologies, management practices, and process changes. This information is
necessary to fulfill requirements established by the Clean Water Act and to inform Agency
decision making about the appropriate course of regulatory action to address the pollutants
present in wastewater discharges from steam electric power plants. While EPA has obtained
certain information through a recently completed study of the steam electric power generating
industry and through public sources, it has determined that there are no existing sources for the
sufficiently detailed plant- and generator-level data that will be collected through this ICR
(questionnaire).
1(b) Key Variables
EPA will use the information collected from the questionnaire to gain knowledge of
current steam electric processes, characterize and quantify the pollutants currently discharged by
the industry, assess technical and economic achievability of available controls, and assess
potential environmental impacts. The information collected will provide EPA with
owner/operator-level, plant-level, and/or generating unit-level data on processes, production, and
costs that complement, and go beyond, data that are available from public sources. EPA is
investigating the following key variables as they relate to the potential technology options for
this industrial sector:
• Proper Characterization and Classification of the Steam Electric Industry;
• Baseline Estimates of Pollutants Discharged in Steam Electric Process
Wastewater;
• BMPs and Pollution Prevention Opportunities that can Reduce Pollutant
Discharges;
• Best Available and Best Demonstrated Technologies for Effluent Limitations
Guidelines and Standards;
• Potential Environmental Impacts of Steam Electric Process Wastewater; and
• Costs and Economic Impacts of Candidate Pollutant Control Technologies and
Practices.
Please see Part A, Section 4(b)(i) of this ICR Supporting Statement for detailed
information on the data to be collected by the questionnaire.
1(c) Terminology
Within the sample frame (defined in Section 2(b) below), EPA has classified the fuel
usage for each plant within the scope of the ICR. This plant fuel classification is based on the
types of fuels used by each of the generating units operating at the plant. The plant fuel
classifications are determined in a hierarchical manner as follows:
First, all coal plants are identified:
• Coal: one or more units uses coal as its primary or secondary fuel. This includes coal
units using integrated gasification combined cycle (IGCC);
Second, all petroleum coke plants are identified:
• Petroleum Coke: one or more units uses petroleum coke as its primary or secondary
fuel (and the plant is not already classified as a coal plant);
Third, plants for which all the units have the same primary fuel type are classified as follows:
• Gas (all units have gas as the primary fuel type, but do not use combined cycle steam
turbines);
• Gas-Combined Cycle (all units have gas as the primary fuel type and use combined
cycle steam turbines);
• Oil (all units have oil as the primary fuel type);
• Nuclear (all units are nuclear);
Finally, the remaining plants with units having different fuel types are classified as
combinations as follows:
• Combination (Gas and Gas-Combined Cycle);
• Combination (Gas and Oil);
• Combination (Gas-Combined Cycle and Oil);
• Combination (Gas and Gas-Combined Cycle and Oil); and
• Combination (Gas-Combined Cycle and Nuclear and Oil).
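The hierarchy above can be sketched in code. The following Python fragment is an illustration only, not EPA's actual procedure: the Unit record, its field names, and the lowercase fuel labels are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    # Hypothetical record for one generating unit (field names are assumptions).
    primary_fuel: str                     # "coal", "petroleum coke", "gas", "oil", "nuclear"
    secondary_fuel: Optional[str] = None
    combined_cycle: bool = False          # only meaningful for gas units


def unit_type(unit: Unit) -> str:
    """Label a unit by primary fuel, splitting gas by combined-cycle status."""
    if unit.primary_fuel == "gas":
        return "Gas-Combined Cycle" if unit.combined_cycle else "Gas"
    return unit.primary_fuel.title()


def classify_plant(units: List[Unit]) -> str:
    """Apply the four steps in order: coal, petroleum coke, single type, combination."""
    if not units:
        raise ValueError("a plant needs at least one generating unit")
    fuels = {f for u in units for f in (u.primary_fuel, u.secondary_fuel) if f}
    if "coal" in fuels:                   # step 1: any coal use, primary or secondary
        return "Coal"
    if "petroleum coke" in fuels:         # step 2: any petroleum coke use
        return "Petroleum Coke"
    types = sorted({unit_type(u) for u in units})
    if len(types) == 1:                   # step 3: all units share one type
        return types[0]
    return "Combination (" + " and ".join(types) + ")"   # step 4


print(classify_plant([Unit("gas"), Unit("oil", secondary_fuel="coal")]))
print(classify_plant([Unit("gas", combined_cycle=True), Unit("oil")]))
```

Checking coal and petroleum coke first mirrors the ordering in the text: a plant with even one coal unit is a coal plant regardless of its other units.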
2. STATISTICAL APPROACH FOR THE SURVEY
The statistical approach considers the survey’s target population, the available
information in the sample frame, the sample design, and sources of error. The following sections
describe each component in detail.
2(a) Target Population
The principal task in the development of a sample survey design is establishing a clear,
concise description of the survey’s target population. The target population consists of the entire
set of elements on which inferences will be made from the collected survey data. The definition
should clearly identify every element of the target population so that all non-population elements
can be excluded. For this survey, EPA has defined the target population to be all fossil- and
nuclear-fueled steam electric power plants in the U.S. that report as operating under North
American Industry Classification System (NAICS) code 22. In addition to plant-level
information, EPA intends to collect information about all units at each plant. EPA will develop
both plant-level and unit-level estimates from the data.
2(b) Sample Frame
A sample frame is a list or set of procedures for identifying all elements of a target
population. In addition to listing population elements, sample frames contain such additional
information as addresses and key characteristics of the population that will be used to draw
samples. Sample frames are essential to the quality of surveys because sample elements are
drawn from them.
Because the target population consists of all fossil- and nuclear-fueled steam electric
power plants in the U.S. that report as operating under NAICS code 22 and their corresponding
generating units, EPA has created a sample frame of all such plants and generating units. This
frame consists of information obtained from databases that are maintained by the Energy
Information Administration (EIA), a statistical agency of the U.S. Department of Energy (DOE)
that collects information on existing electric generating plants and associated equipment to
evaluate the current status and potential trends in the industry. The information
comes primarily from the 2007 Electric Generator Report (Form EIA-860) and is supplemented
by information found in Form EIA-923 and a survey conducted by EPA’s Office of Resource
Conservation and Recovery (ORCR). In addition, EPA identified two facilities that started
operations after 2007 and has obtained information about them.
Collectively, the data sources provided key information for each steam electric plant with
a NAICS code of 22 such as county, state, North American Electric Reliability Council (NERC)
region, business size (small or non-small), and regulatory status (i.e., regulated by public service
commission). Also included is the number of each type of generating unit operated at the plant,
an identifier that specifies the fuel classification of each generating unit for the plant, and an
identifier for the plant fuel classification that is based on the fuel classifications of the generating
units (e.g., coal, gas-combined cycle, nuclear). In addition, the ORCR survey results and the
EIA-923 data set provide information on the presence of ponds and landfills at the plant along
with the materials that are stored or disposed in the pond/landfill. The sample frame also includes
details on each generating unit reported in the EIA-860 data set classified as a steam electric
generating unit based on the primary NAICS code, such as prime mover and fuel (fossil or
nuclear), nameplate capacity (in megawatts), unit fuel classification, and the plant where the
generating unit is housed. EPA considers that it has created a sample frame for this ICR that is
complete relative to the defined target population. The sample frame contains information on
1,203 plants containing 2,589 generating units that are within scope of the data collection
objectives of the questionnaire.
2(c) Sample Design
The steam electric questionnaire consists of multiple parts. Some parts of the
questionnaire focus on gathering general information and financial data from each plant selected
to receive the questionnaire. Other questionnaire parts are structured to collect technical
information about certain aspects of power plant operations and will be distributed to a subset of
the plants receiving the questionnaire. For example, Parts B and C collect information about ash
handling and FGD operations at coal, oil and petroleum coke plants, while Part H collects
information about operations at nuclear plants. See Part A, Section 2(b)(i) of this Supporting
Statement for details on the questionnaire parts and the specific types of plants that would
receive each part.
Through this ICR, EPA intends to collect information from the industry using a stratified
cluster-based sample design. The stratification feature of the design is discussed further below.
Within each stratum, a cluster sample will be collected. A cluster sample is a probability sample
in which each sampling unit is a collection, or cluster, of elements from the target population.
Here the sampling unit is a plant, which comprises a collection of generating units
(elements). EPA intends to send the questionnaire to a sample (subset) of plants that is generated
by implementing the sample design. Each selected plant would be required to complete the
questionnaire for every generating unit at the plant. By requiring only a subset of plants to
respond to the questionnaire, the survey burden would be greatly reduced. EPA would select
plants in a manner that would be statistically representative of all plants in the target population.
The following subsections describe how the sample frame will be stratified, the
population sizes for each stratum, the sample selection process, and the precision estimates used
to derive the estimated sample size.
(i) Stratification
The sample design will stratify plants primarily by their plant fuel classification and
regulatory status. Stratification involves selecting one or more characteristics of interest and
dividing the members of the population into strata based on those characteristics. Stratified
sampling consists of selecting a sample from within each stratum, then combining them to
constitute the total sample. There are several benefits that result from a stratified sample design,
including:
• Ensuring that the sample contains representatives from every stratum;
• Improving the precision of parameter estimates;
• Allowing important parameters to be estimated at the stratum level; and
• Allowing certain subpopulations of particular interest to be sampled at a greater
rate than others.
To select plants that would receive questionnaires, EPA intends to use the following stratification
variables:
All Plants:
• Plant Fuel Classification. EPA’s sample frame identifies the plant fuel
classification for each plant. EPA intends to stratify by plant fuel classification to
ensure that the sample design selects plants in each fuel type, because there are
inherent differences due to the type of fuel used at the plant. For those plants
classified as “Combination”, the sample frame further delineates them by the
specific fuels that comprise the combination. The plant fuel classification strata
are as follows:
o Coal;
o Petroleum Coke;
o Gas;
o Gas-Combined Cycle;
o Oil;
o Nuclear;
o Combination (Gas and Gas-Combined Cycle);
o Combination (Gas and Oil);
o Combination (Gas-Combined Cycle and Oil);
o Combination (Gas and Gas-Combined Cycle and Oil); and
o Combination (Gas-Combined Cycle and Nuclear and Oil).
• Regulatory Status. EPA’s sample frame identifies whether the plant is regulated
or non-regulated. EPA intends to stratify by regulatory status to ensure that the
sample design selects both regulated and non-regulated plants, due to inherent
differences between them. Each stratum defined by plant fuel classification will
be segregated into two strata (regulated and non-regulated).
Coal and Petroleum Coke Plants:
EPA’s sample frame includes additional information about coal plants, which will be
used to select a subset of coal plants that will receive additional questions (i.e., Parts E, F and G
of the questionnaire). To minimize burden to small entities, EPA will only collect this additional
information from plants that are not operated by small entities. EPA intends to collect
information from plants that have ponds and/or landfills containing coal combustion residues
(i.e., coal ash or flue gas desulfurization (FGD) wastes). The pond/landfill strata are as follows:
• Small entities;
• Non-small entities:
o Pond only – FGD: Contains all coal plants identified in the sample frame as
having a pond with FGD waste as one of its contents, but not a landfill;
o Pond only – no FGD: Contains all coal plants that have an ash pond that
does not also receive FGD waste, but have no landfill;
o Landfill only – FGD: Contains all coal plants that have a landfill with FGD
waste as one of its contents, but no pond containing coal combustion
residues (CCR);
o Landfill only – no FGD: Contains all coal plants that have a landfill that
contains ash but no FGD wastes, and did not report operating a CCR
pond;
o Ponds and landfill: Contains all plants that have both ponds and landfills
that contain CCR (either ash or FGD wastes). To minimize the number of
strata, no distinction in plants here is made by FGD status. EPA believes a
sufficient number of plants with ponds (both with and without FGD waste)
and landfills (both with and without FGD waste) will be selected from the
previous four non-small entity strata so that a further breakdown of those
plants that have both a pond and a landfill is unnecessary; and
o No ponds or landfills: Contains all plants that did not report storing or
disposing of ash or FGD wastes in a pond or landfill.
In addition to coal plants, petroleum coke plants that are not identified as small entities
will also receive Parts E, F, and G of the questionnaire. However, insufficient information was
available to determine which petroleum coke plants had ponds and/or landfills so they will not be
stratified by pond/landfill status.
Accounting for NERC Region:
EPA is also interested in capturing regional differences in the industry as measured by the
NERC region of the plant. However, rather than stratify the sample frame explicitly by NERC
region, EPA will assure geographic diversity in the sample by using a systematic sampling
approach within each stratum. Stratum members will be sorted by NERC region and every kth
member of the stratum will be selected systematically (where k is the ratio of the stratum
population size to the stratum sample size, rounded to the nearest integer). As a result, the
representation of each NERC region in the sample should be proportional to the size of the
population.
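As an illustration of the systematic selection described above, the following Python sketch sorts a stratum by NERC region and takes every kth plant. The random starting position, the (plant_id, nerc_region) tuples, and the function name are assumptions for the example; the text does not specify how the first plant is chosen.

```python
import random


def systematic_sample(plants, sample_size, key, seed=None):
    """Sort the stratum by `key`, then take every kth member from a random start."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    ordered = sorted(plants, key=key)
    # k = population size / sample size, rounded to the nearest integer.
    # Python's round() sends exact .5 ties to the even integer.
    k = max(1, round(len(ordered) / sample_size))
    start = random.Random(seed).randrange(k)   # assumed: random start in [0, k)
    return ordered[start::k]


# Hypothetical stratum of 20 plants spread over four NERC regions.
regions = ["WECC", "SERC", "RFC", "MRO"]
stratum = [(plant_id, regions[plant_id % 4]) for plant_id in range(20)]

sample = systematic_sample(stratum, sample_size=5, key=lambda p: p[1], seed=1)
print(sample)
```

Because k is rounded, the number of plants returned can differ slightly from the target sample size when the stratum size is not a multiple of k.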
In addition, EPA is interested in capturing information from a representative sample of
plants in terms of plant capacity (as defined by the nameplate capacity of generating units within
the plant). EPA considered nameplate capacity as a stratification variable in the design but
decided not to stratify in this manner as preliminary analysis showed that the systematic
sampling of plants within each stratum described above will provide a distribution of plants in
terms of capacity that meets EPA’s objectives.
(ii) Population Sizes for Each Stratum
Table B-1 displays the number of plants within the sample frame, as well as the number
of generating units within these plants, for each type of plant fuel classification and regulatory
status. For those plants labeled as “Combination,” a further breakdown was done by the specific
fuels used in the plant.
Table B-1. Numbers of Plants and Generating Units by Plant Fuel Classification and
Regulatory Status in Sample Frame

Plant Fuel Classification                 Regulatory      Number of Plants  Number of Generating
                                          Status          (Sample Frame)    Units at these Plants (a)
Coal                                      Regulated       343               966
Coal                                      Non-Regulated   150               339
Coal                                      Unknown         2                 2
Gas                                       Regulated       131               318
Gas                                       Non-Regulated   57                142
Gas-Combined Cycle                        Regulated       95                126
Gas-Combined Cycle                        Non-Regulated   278               379
Nuclear                                   Regulated       31                52
Nuclear                                   Non-Regulated   32                48
Oil                                       Regulated       23                55
Oil                                       Non-Regulated   20                38
Petroleum Coke                            Regulated       0                 0
Petroleum Coke                            Non-Regulated   9                 9
Combination: Gas-CC and Nuclear and Oil   Regulated       1                 5
Combination: Gas-CC and Nuclear and Oil   Non-Regulated   0                 0
Combination: Gas-CC and Oil               Regulated       2                 6
Combination: Gas-CC and Oil               Non-Regulated   0                 0
Combination: Gas and Gas-CC               Regulated       20                69
Combination: Gas and Gas-CC               Non-Regulated   6                 23
Combination: Gas and Gas-CC and Oil       Regulated       1                 4
Combination: Gas and Gas-CC and Oil       Non-Regulated   0                 0
Combination: Gas and Oil                  Regulated       1                 4
Combination: Gas and Oil                  Non-Regulated   1                 4
Total                                                     1,203             2,589

(a) Plants can have multiple generating units with different fuel types. For example, of the 1,307
(966+339+2) generating units at coal plants, 103 use something other than coal as their primary or
secondary fuel.

Table B-1 indicates that 1,203 fossil- and nuclear-fueled steam electric power plants and
2,589 generating units exist within the sampling frame. There are 495 coal plants that contain a
total of 1,307 generating units. Coal plants comprise about 41 percent of all plants, and their
generating units make up half of all generating units at all plants. Of the 495 coal plants, 303
(with 886 generating units) contain a pond and/or landfill containing coal combustion residues.
Table B-2 displays the number of coal plants and corresponding number of generating units by
each stratum associated with pond and/or landfill status defined in the previous section.
Fifty-five coal plants are classified as being operated by a small business. These plants, in addition to
the 143 coal plants that did not report storage or disposal of ash or FGD wastes in a pond and/or
landfill, are not part of the target population for coal plants that will receive Parts E, F, and G of
the questionnaire. In addition, EPA has excluded two coal plants containing a total of nine
generating units from the sample frame for Parts E, F, and G of the questionnaire to limit the
potential burden imposed on these plants recognizing that they may be required to provide
additional data not covered by this ICR.
Table B-2. Numbers of Coal Plants (and Corresponding Generating Units) by Stratum
in Sample Frame

Business Size   Pond/Landfill            Number of      Number of Generating
                                         Coal Plants    Units at these Plants
Small           All                      55             121
Non-Small       Pond Only – FGD          39             122
Non-Small       Pond Only – No FGD       99             313
Non-Small       Landfill Only – FGD      18             34
Non-Small       Landfill Only – No FGD   55             142
Non-Small       Both Pond and Landfill   84             256
Non-Small       No Pond or Landfill      143            310
Total                                    493 (a)        1,298 (a)

(a) EPA excluded two coal plants (containing nine total generating units) from the sample frame for Parts E, F,
and G of the questionnaire.

(iii) Sample Selection
Most parts of the questionnaire focus on gathering information from all coal and
petroleum coke-fired power plants. Thus, all plants with a fuel classification of “Coal” or
“Petroleum Coke” will be selected with certainty (i.e., probability of selection equal to one),
except for Parts E, F, and G. In addition, for strata with ten or fewer plants, EPA assumed that
the sample would include all plants, while the minimum number of plants sampled would be ten
within strata containing more than ten plants. As such, all regulated and non-regulated plants
with a plant fuel classification of “Combination” (except gas and gas-combined cycle) will also
be selected with certainty. For the remaining non-regulated and regulated plants with plant fuel
classifications of gas, gas-combined cycle, oil, nuclear, and combination (gas and gas-combined
cycle), 30 percent of the plants will be randomly selected to receive the questionnaire while
adhering to the ten plant minimum per stratum. Based on this sampling design, a total of 734
plants would be selected to receive the questionnaire. EPA assumes a 10 percent non-response
rate, based upon EPA’s experience in administering similar effluent guidelines questionnaires,
which would result in a net sample size of 661. Table B-3 shows the number of plants that
would be selected to receive the questionnaire by the strata formed by plant fuel classification
and regulatory status. EPA has determined that the sample size resulting from this design is
sufficient to meet the objectives of this survey.
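The allocation rules above can be expressed as a short calculation. This Python sketch is illustrative: the function names are hypothetical, and rounding halves upward is an assumption inferred from the stratum counts (for example, 30 percent of 95 plants is 28.5, and 29 plants are to be sampled).

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round a Decimal to the nearest integer, with .5 going up (assumed rule)."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def stratum_sample_size(population, certainty=False,
                        rate=Decimal("0.30"), minimum=10):
    """Number of plants to draw from one stratum.

    Certainty strata and strata with `minimum` or fewer plants are taken whole;
    otherwise 30 percent is drawn, but never fewer than `minimum` plants.
    """
    if certainty or population <= minimum:
        return population
    return max(minimum, round_half_up(population * rate))


def expected_responses(sample_size, nonresponse=Decimal("0.10")):
    """Net sample size after the assumed non-response rate."""
    return round_half_up(sample_size * (1 - nonresponse))


print(stratum_sample_size(343, certainty=True))  # regulated coal: 343
print(stratum_sample_size(131))                  # regulated gas: 39
print(stratum_sample_size(31))                   # regulated nuclear: 10 (minimum applies)
print(stratum_sample_size(6))                    # small stratum taken whole: 6
print(expected_responses(734))                   # 661
```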
Table B-3. Number of Plants in Sample Frame and Number of Plants to be Selected for
Questionnaire, by Plant Fuel Classification and Regulatory Status

Plant Fuel Classification                     Regulatory          Number of Plants  Number of Plants
                                              Status              (Sample Frame)    to Sample
Coal (a)                                      Regulated (a)       343               343
Coal (a)                                      Non-Regulated (a)   150               150
Coal (a)                                      Unknown (a)         2                 2
Gas                                           Regulated           131               39
Gas                                           Non-Regulated       57                17
Gas-Combined Cycle                            Regulated           95                29
Gas-Combined Cycle                            Non-Regulated       278               83
Nuclear                                       Regulated           31                10
Nuclear                                       Non-Regulated       32                10
Oil                                           Regulated           23                10
Oil                                           Non-Regulated       20                10
Petroleum Coke (a)                            Regulated (a)       0                 0
Petroleum Coke (a)                            Non-Regulated (a)   9                 9
Combination: Gas-CC and Nuclear and Oil (a)   Regulated (a)       1                 1
Combination: Gas-CC and Nuclear and Oil (a)   Non-Regulated (a)   0                 0
Combination: Gas-CC and Oil (a)               Regulated (a)       2                 2
Combination: Gas-CC and Oil (a)               Non-Regulated (a)   0                 0
Combination: Gas and Gas-CC                   Regulated           20                10
Combination: Gas and Gas-CC                   Non-Regulated (a)   6                 6
Combination: Gas and Gas-CC and Oil (a)       Regulated (a)       1                 1
Combination: Gas and Gas-CC and Oil (a)       Non-Regulated (a)   0                 0
Combination: Gas and Oil (a)                  Regulated (a)       1                 1
Combination: Gas and Oil (a)                  Non-Regulated (a)   1                 1
Total                                                             1,203             734

(a) All plants in stratum are to be selected with certainty (probability equal to one).

The exact number of generating units selected for the sample will not be known prior to
the sample draw and must be estimated because the number of generating units varies by plant
and thus will depend on the specific set of plants selected. Generating units within plants that are
selected with certainty will automatically be included in the sample. For those plants in a
stratum that will be sampled at 30 percent, EPA estimated the corresponding number of
generating units as 30 percent of the total number of generating units in the stratum. Table B-4
displays the number of generating units for each unit fuel classification. For example, of the 144
oil units in the sample frame, 51 of them are at plants that will be selected with certainty and an
additional 93 oil units are at plants that are to be sampled at 30 percent. Thus, the number of oil
units estimated to be selected for the questionnaire is 51 + 0.30 * 93 = 79. Overall, information
from approximately 1,714 generating units would be included in the survey based on 734 plants
selected for the survey.
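The oil-unit arithmetic above generalizes to any unit fuel classification. A minimal Python sketch follows; the function name is hypothetical, and the 30 percent rate is the one stated in the text.

```python
def estimate_selected_units(certainty_units, sampled_units, rate=0.30):
    """Units at certainty plants count in full; units at sampled plants
    contribute in proportion to the plant sampling rate."""
    return certainty_units + rate * sampled_units


# Oil units: 51 at certainty plants, 93 at plants sampled at 30 percent.
oil_estimate = estimate_selected_units(51, 93)
print(round(oil_estimate))  # 79
```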
Table B-4. Number of Generating Units in Sample Frame and Approximate Number to
be Selected for Questionnaire, by Unit Fuel Classification

Unit Fuel Classification   Number of Units   Estimated Number of Units
                           (Sample Frame)    to be Selected for Survey
Coal (a)                   1,206             1,206
Gas                        557               200
Gas-CC                     567               184
Nuclear                    104               34
Oil                        144               79
Petroleum Coke (a)         11                11
Total                      2,589             1,714

(a) All generating units in these unit fuel classifications are at plants selected with certainty
(probability equal to one).

Parts E, F, and G of the questionnaire will be sent to all petroleum coke plants operated
by non-small entities and to a subset of the coal plants operated by non-small entities that
contain a pond and/or a landfill containing coal combustion residues (i.e., coal ash or FGD waste).
(Insufficient information was available to determine which petroleum coke plants had ponds
and/or landfills, so all three non-small petroleum coke plants will receive Parts E, F, and G.) For
those strata of coal plants listed in Table B-2 where information is to be collected, 30 percent of
the plants will be randomly selected to receive Parts E, F, and G of the questionnaire. In
addition, EPA has identified seven coal plants within these strata that have leachate collection
systems. EPA has decided to sample these seven plants with certainty to ensure information on
leachate collection systems is captured in the questionnaire responses. These seven plants will
be included in the sample selection process so that a total of 30 percent of coal plants will be
selected. Based on these calculations, a total of 94 coal plants would be selected to receive Parts
E, F, and G of the questionnaire. Table B-5 shows the number of coal plants that would be
selected to receive Parts E, F, and G of the questionnaire by stratum. Assuming a 10 percent
non-response rate within each stratum, EPA expects a net sample size of 85 plants. As a result of
these calculations and assumptions, EPA estimated that having 94 coal plants receive Parts E, F,
and G of the questionnaire would meet its objectives.
The exact number of generating units selected for the sample will not be known prior to
the sample draw and must be estimated as the number of generating units varies by plant and
thus depends on the specific plants selected. Within each stratum, the ratio of generating units to
plants in the sample frame was multiplied by the number of sampled plants to estimate the
number of generating units selected for the sample. For example, on average, there are
122/39=3.13 generating units per coal plant for pond only coal plants with FGD wastes in the
pond. Thus, the estimated number of generating units in the sample for this stratum would be
3.13 * 12 sampled plants = 38 generating units. The final two columns of Table B-5 display the
number of generating units in the sample frame from coal plants as well as the approximate
number of generating units from coal plants for parts E, F, and G of the questionnaire. As a
result of these calculations, approximately 272 generating units from coal plants would be
represented in parts E, F, and G of the questionnaire.
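The ratio calculation described above can be reproduced for each stratum. The Python sketch below uses the stratum counts reported in Table B-5; the dictionary layout and function name are assumptions for the example.

```python
def estimate_sampled_units(frame_units, frame_plants, sampled_plants):
    """Average units per plant in the frame, times the number of sampled plants."""
    if frame_plants == 0:
        return 0.0
    return frame_units / frame_plants * sampled_plants


# (plants in frame, plants sampled, units in frame) for the sampled coal strata
strata = {
    "Pond Only - FGD":        (39, 12, 122),
    "Pond Only - No FGD":     (99, 30, 313),
    "Landfill Only - FGD":    (18, 10, 34),
    "Landfill Only - No FGD": (55, 17, 142),
    "Both Pond and Landfill": (84, 25, 256),
}

estimates = {
    name: round(estimate_sampled_units(units, plants, sampled))
    for name, (plants, sampled, units) in strata.items()
}
print(estimates["Pond Only - FGD"])  # 38
print(sum(estimates.values()))       # 272
```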
Table B-5. Numbers of Coal Plants (and Corresponding Generating Units) by Stratum
in the Sample Frame as well as to be Selected for Parts E, F, and G of the
Questionnaire

Business    Pond/Landfill            Number of Coal   Number of Coal Plants   Number of Generating   Estimated Number of
Size                                 Plants           Selected for Parts E,   Units at the Coal      Generating Units
                                     (Sample Frame)   F, and G of             Plants in Sample       in Sample
                                                      Questionnaire           Frame
Small       All                      55               0                       121                    0
Non-Small   Pond Only – FGD          39               12                      122                    38
Non-Small   Pond Only – No FGD       99               30                      313                    95
Non-Small   Landfill Only – FGD      18               10                      34                     19
Non-Small   Landfill Only – No FGD   55               17                      142                    44
Non-Small   Both Pond and Landfill   84               25                      256                    76
Non-Small   No Pond or Landfill      143              0                       310                    0
Total                                493              94                      1,298                  272

Note that Table B-5 only contains information on coal plants. Also, there are three
petroleum coke plants operated by non-small entities (each having one generating unit) that will
be sent parts E, F, and G of the questionnaire. In addition, Part E (but not Parts F and G) will be
sent to the 230 oil, gas, nuclear, gas-combined cycle, and combination plants identified in the
final column of Table B-3 that would be selected for the overall questionnaire.
Table B-6 below provides the summary of the design.
Table B-6. Summary of Population, Sample and Subsample Sizes

                                          Number of Plants in:
Stratum                                   Population   Sample   Subsample
Coal – Small business                     55                    0
Coal – Pond Only – FGD                    39                    12
Coal – Pond Only – No FGD                 99                    30
Coal – Landfill Only – FGD                18                    10
Coal – Landfill Only – No FGD             55                    17
Coal – Both Pond and Landfill             84                    25
Coal – No Pond and Landfill               143                   0
Gas                                       188          56       -
Gas-Combined Cycle (CC)                   373          112      -
Nuclear                                   63           20       -
Oil                                       43           20       -
Pet Coke                                  9            9        -
Combination: Gas-CC and Nuclear and Oil   1            1        -
Combination: Gas-CC and Oil               2            2        -
Combination: Gas and Gas-CC               26           16       -
Combination: Gas and Gas-CC and Oil       1            1        -
Combination: Gas and Oil                  2            2        -
Total                                     1,201        239      94

The population size is the number of plants in the stratum. From the population, we selected a
sample to receive the questionnaire. The subsample will be required to respond to some
additional parts of the questionnaires.
(iv) Precision Estimates for the Given Sample Sizes
Because a sample of plants will be given the questionnaire rather than all plants within
the target population, it follows that some degree of uncertainty will be associated with estimates
made from the data collected from the questionnaires. The precision of these estimates depends
on both the sample design and the sample size – that is, the number of plants that would be
selected. One measure of precision is the width of the confidence interval for the estimate.
Confidence intervals provide a range of values for a particular estimate that would be likely if the
study were repeated an infinite number of times. Thus, when using 95 percent confidence
intervals, 95 percent of such intervals would include the true value, if we could take an infinite
number of samples. The binomial distribution is often used as the basis of sample designs and
can be used to estimate precision. The binomial distribution applies to situations where there are
only two outcomes (yes or no) to a dichotomous question such as “Are residues generated by
cleaning operations?”
The presence or absence of the attribute for a particular plant is a dichotomous, or binary,
variable. The binomial distribution models these data based on the notion of obtaining national
estimates of the percentage or proportion of plants in the target population (or a subset of the
target population) that have a particular attribute. The binomial distribution also provides
estimates of the variance that is used to calculate the confidence intervals. Because a proportion
of 0.5 (or 50 percent) results in the largest possible variance for the binomial distribution, EPA
assumed that the probability of one outcome would be 0.5 (e.g., cleaning operation residues are
generated at 50 percent of the plants). In other words, if the population value is any value other
than 50 percent, the survey estimate will be more precise – in statistical expectation – than it
would be if the population value is 50 percent.
Because EPA is developing a national rule, it is primarily concerned with the precision of
the overall plant-level and generating unit-level estimates, rather than estimates made within
strata. Consequently, in estimating the overall sample size, EPA assumed more stringent
requirements for overall plant-level and generating unit-level estimates than estimates for
specific plant fuel or unit fuel classifications. Given a sample size of 734 plants described in the
previous section, EPA would expect to receive completed questionnaires for 661 plants
(assuming a 10 percent non-response rate). Under these assumptions, the sample would be
expected, with 95 percent confidence, to yield sufficient data to estimate the value of an
unknown proportion to within +/- 3.5 percent of its true value for the target population (i.e.,
plants). For example, if 30 percent of the responding plants indicate they generate cleaning
residues, then there is 95 percent confidence that 30 percent +/- 3.5 percent (i.e., between 26.5
percent and 33.5 percent) of the entire population of plants generate cleaning residues.
Based on the estimated sample of 1,714 generating units, the sample (similarly accounting
for plant non-response) would be expected, with 95 percent confidence, to yield sufficient data to
estimate the value of an unknown proportion to within +/- 2.5 percent of its true value for the
target population (i.e., generating units). For example, if 30 percent of the generating units
generate cleaning residues, then there is 95 percent confidence that 30 percent +/- 2.5 percent
(i.e., between 27.5 percent and 32.5 percent) of the entire population of generating units generate
cleaning residues. The precision estimates associated with national plant-level and unit-level
estimates are calculated based on proportion estimates from Part I of the questionnaire, a part
sent to all sampled plants. Other questionnaire parts (e.g., Parts B through H) will be sent to only
a subset of the plant fuel classifications and thus will have a smaller sample size (and, as a result,
less precision).
Based on the sample size of 94 coal plants with a pond and/or landfill for Parts E, F, and
G of the questionnaire, EPA would expect to receive completed questionnaires for 85 of these plants
(assuming a 10 percent non-response rate). Under these assumptions, the sample would be
expected, with 95 percent confidence, to yield sufficient data to estimate the value of an
unknown proportion to within +/- 9.2 percentage points of its true value for the target population (i.e., coal
plants with ponds/landfills). For example, if 30 percent of the responding coal plants collect landfill
leachate, then there is 95 percent confidence that 30 percent +/- 9.2 percentage points (i.e., between 20.8
percent and 39.2 percent) of the entire population of coal plants with a pond or landfill collect
landfill leachate.
All of these precision targets will hold when the proportion’s true (unknown) value is
equal to 0.5, and even greater precision is expected when the true value of the proportion is not
equal to 0.5.
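The worst-case role of a proportion of 0.5 can be illustrated with the standard normal-approximation margin of error for a proportion. The sketch below assumes simple random sampling with no finite population correction and no design effect, so it is not expected to reproduce the +/- 3.5, 2.5, or 9.2 figures quoted above, which reflect EPA's stratified design. The function name `margin_of_error` and the 1.96 critical value are illustrative choices, not taken from the supporting statement.

```python
import math

Z_95 = 1.96  # approximate two-sided 95 percent critical value of the standard normal


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion.

    Assumption of this sketch: simple random sampling from a large population
    (no finite population correction, no design effect).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 661  # expected number of responding plants
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        half_width = 100 * margin_of_error(p, n)
        print(f"p = {p:.1f}: +/- {half_width:.2f} percentage points")
```

Because p(1 - p) is largest at p = 0.5, the half-width printed for 0.5 is the widest of the set, which is why a target that holds at 0.5 also holds for any other true proportion.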
2(d) Sources of Error
In developing the sample design, as described previously, EPA considered the estimated
precision targets for data collected from the target population. EPA also considered potential
error that could be associated with estimates calculated from the collected data, due to sources
associated with sampling, such as response rates, as well as non-sampling sources of error, such
as processing error.
(i) Response Rates
In developing the sample design, EPA considered both unit (questionnaire) and item
(question) non-response. EPA expects that the response rate would be relatively high for this
mandatory survey effort, which would be conducted under the authority of Section 308 of the
Clean Water Act. The cover letters and instructions for the questionnaire would explain the legal
authority, responsibility to respond, reasons for the questionnaire, and penalty for non-response.
EPA would use reminder letters and/or telephone calls to remind respondents of the duty to
respond under Section 308 of the Clean Water Act. If possible, EPA would seek the
endorsement of the major trade associations, which would be expected to increase the response
rate from their members. EPA recognizes that some non-response is unavoidable; in past
survey efforts, EPA has waived the duty to respond in rare, extreme cases (e.g., natural
disasters), which might also occur for this survey effort.
The sample size estimates presented in Section 2(c)(iii) have been adjusted to help ensure
that the effective sample sizes (i.e., respondents) would be sufficient for precision requirements
when the non-response rate equals 10 percent. (This assumption of a 10 percent non-response
rate is based upon a typical effluent guidelines questionnaire.) In addition to increasing the
initial sample size, EPA would strive to improve the response rate by reminder letters and/or
telephone calls. Furthermore, after receiving the responses, EPA intends to adjust the
questionnaire weights for any non-response and to review publicly available information in order
to determine if non-respondents appear to have different characteristics than respondents. EPA
would examine these characteristics both for the entire industry and for subgroups in the
analyses. For any differences, EPA intends to determine the major causes, and to incorporate
appropriate adjustments for bias. (Bias is the difference between the expected value of an
estimate and the true value of a parameter or quantity being estimated. If the data collection
process generates estimates that are consistently (or on average) above or consistently below the
true value, the data collection process is biased.)
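The effect of the assumed 10 percent non-response rate on the effective sample size can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `expected_respondents` is hypothetical, and rounding to the nearest whole plant is an assumption, chosen because it matches the 661 and 85 respondents quoted in Section 2(c).

```python
def expected_respondents(sample_size: int, nonresponse_rate: float) -> int:
    """Expected number of completed questionnaires for a given initial sample.

    Rounding to the nearest whole respondent is an assumption of this sketch.
    """
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    if not 0.0 <= nonresponse_rate < 1.0:
        raise ValueError("nonresponse_rate must be in [0, 1)")
    return round(sample_size * (1.0 - nonresponse_rate))


if __name__ == "__main__":
    # Initial sample sizes taken from Section 2(c); 10 percent non-response assumed.
    for label, n in (("all plants", 734), ("coal plants with pond/landfill", 94)):
        print(f"{label}: {n} sampled, {expected_respondents(n, 0.10)} expected to respond")
```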
To minimize item non-response, EPA’s subject matter experts have worked closely with
industry to develop questions that would be easy to understand, use clearly defined and familiar
terms, follow a logical sequence, and request data that are readily available
within the industry. In this manner, EPA expects to minimize the inaccurate or incomplete responses
that can occur when respondents misunderstand or misinterpret questions or
unintentionally skip them. Additionally, EPA would operate an e-mail
helpline to assist respondents with the questionnaire. After receipt of the completed
questionnaires, EPA intends to conduct extensive follow-up with respondents for any item
non-response. If necessary, EPA would impute responses to key questions in its analyses.
(ii) Processing Errors
Processing errors can occur when questionnaire responses are coded, edited, and entered into the
database. The design and implementation of the questionnaire database would employ a number
of quality assurance techniques to reduce the frequency of such errors. These techniques would
include the following:
• Electronic questionnaires that transfer responses directly into a survey database;
• Computerized comparison of selected responses to detect inconsistencies and
  illogical responses;
• Computerized analyses to screen for out-of-range and inconsistent numerical
  values; and
• Computerized analyses to detect missing numerical data and missing units.
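The computerized checks listed above could take a form like the following minimal sketch. Every field name (`capacity_mw`, `flow_rate`, `flow_rate_units`, `has_fgd_system`, `fgd_flow_rate`), every range limit, and the consistency rule are hypothetical placeholders chosen for illustration; the supporting statement does not specify the actual database fields or screening rules.

```python
from typing import Any, Dict, List

# Hypothetical plausibility limits (assumed for illustration only).
RANGE_LIMITS = {
    "capacity_mw": (0.0, 5000.0),
    "flow_rate": (0.0, 1.0e9),
}
# Hypothetical mapping from a numeric field to the field that holds its units.
UNIT_FIELDS = {"flow_rate": "flow_rate_units"}


def screen_response(response: Dict[str, Any]) -> List[str]:
    """Return a list of flags for one questionnaire response (empty if clean)."""
    flags: List[str] = []
    for field, (low, high) in RANGE_LIMITS.items():
        value = response.get(field)
        if value is None:
            # Missing numerical data
            flags.append(f"{field}: missing numerical value")
            continue
        if not low <= value <= high:
            # Out-of-range numerical value
            flags.append(f"{field}: value {value} outside {low} to {high}")
        unit_field = UNIT_FIELDS.get(field)
        if unit_field is not None and not response.get(unit_field):
            # Value reported without its units
            flags.append(f"{field}: missing units")
    # Illustrative consistency rule: no FGD system reported, yet an FGD flow is given.
    if response.get("has_fgd_system") is False and response.get("fgd_flow_rate"):
        flags.append("fgd_flow_rate: reported although has_fgd_system is False")
    return flags
```

A response that produces any flags would be routed to follow-up with the respondent rather than corrected automatically.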
3. PRETESTS AND PILOT TESTS
EPA does not intend to pre-test the questionnaire. For more than 30 years, EPA’s
Engineering and Analysis Division has conducted surveys of numerous industrial sectors to
collect information to support regulation development activities in the effluent guidelines
program. While EPA develops different questionnaires for each industry, there are common
elements for all industries. The questionnaires collect the same basic data such as information
about processes, treatment, and financial status. Thus, when EPA develops a questionnaire for a
particular industry, it generally tailors the questions for specific terms and processes used by that
industry. In past years, EPA has relied predominantly on active participation by trade groups in
reviewing the questionnaires. In EPA’s experience, such collaboration tends to reflect
the industry at large better than pre-tests do. For this reason, EPA considers additional review
through the pre-test process to be unnecessary for this industry.
4. COLLECTION METHODS AND FOLLOW-UP
Please see Part A, Section 5(b) of this ICR for this information.
5. DATA PREPARATION AND ANALYSIS
5(a) Data Preparation
Upon receipt of completed questionnaires, EPA and its contractors would review the
questionnaires for completeness and confidential business information (CBI) claims. All questionnaires will also be reviewed for
consistency and reasonableness, and follow-up calls will be conducted as needed to clarify
inconsistencies found in the responses. Reviewed questionnaire files will then be uploaded into
the questionnaire database. Once the data are uploaded into the database, numerous electronic
QA activities would be performed and the results would be provided to engineering and
economic staff for further resolution and documentation. This database would then be used to
perform data analyses.
5(b) Analysis
In support of making decisions that will affect rulemaking, EPA will use the responses
from this questionnaire to help answer the following questions:
• What pollutants are currently discharged by power plants, and what
  technologies/practices are being used to treat power plant wastestreams?
• Are process changes, best management practices (BMPs), treatment technologies, or
  recycle/reuse processes feasible options to reduce pollutant discharges from steam
  electric power plants?
• Is subcategorization of the steam electric power generating industry appropriate,
  and if so, what is the correct subcategorization?
• What are the available technologies for reducing pollutant discharges associated
  with wastestreams of concern, such as FGD scrubber purge, ash handling,
  process equipment cleaning wastes, and other wastewaters?
• What pollutant reductions and other benefits would be realized by implementing
  candidate regulatory options?
• What are the costs for operators to implement technologies, process changes, and
  best management practices to reduce pollutant discharges? How do these costs
  impact the economics of the steam electric industry?
The objectives of the information collection would be achieved by the statistically
designed sample survey because the resulting inferences and analyses would be as statistically
unbiased and as precise as is practicable. EPA intends to apply sample weights, derived from the
statistical sample design and adjusted for non-response, to the data during statistical analysis.
Weighting the data would allow inferences to be made about all eligible plants and generating
units, including those that did not respond to the questionnaires. Another advantage is that
weighted estimates would have smaller variances than unweighted estimates. EPA also intends
to evaluate whether estimates could be improved by post-stratifying the detailed questionnaire
data using the data collected from the survey. EPA would use accepted statistical methods for
survey statistics, such as those described in Sampling Techniques (Cochran, 1977) and Survey
Sampling (Kish, 1965).
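A minimal sketch of how design weights adjusted for non-response might be applied is shown below. It assumes a weighting-class adjustment within strata, which is one common textbook approach (see Cochran, 1977) and not necessarily the method EPA would use. The class `Stratum`, the function names, and all counts in the example are hypothetical and purely illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stratum:
    name: str
    population: int        # eligible plants in the stratum
    sampled: int           # plants sent the questionnaire
    responses: List[int]   # 1 = "yes", 0 = "no", one entry per responding plant


def adjusted_weight(stratum: Stratum) -> float:
    """Design weight (population / sampled) scaled up for non-response."""
    respondents = len(stratum.responses)
    if respondents == 0:
        raise ValueError(f"no respondents in stratum {stratum.name!r}")
    base_weight = stratum.population / stratum.sampled
    nonresponse_factor = stratum.sampled / respondents
    return base_weight * nonresponse_factor


def weighted_proportion(strata: List[Stratum]) -> float:
    """Weighted estimate of the population proportion answering "yes"."""
    weighted_yes = sum(adjusted_weight(s) * sum(s.responses) for s in strata)
    weighted_total = sum(adjusted_weight(s) * len(s.responses) for s in strata)
    return weighted_yes / weighted_total


if __name__ == "__main__":
    # Hypothetical strata and counts, not EPA data.
    strata = [
        Stratum("coal", population=400, sampled=100, responses=[1] * 45 + [0] * 45),
        Stratum("gas", population=600, sampled=60, responses=[1] * 10 + [0] * 40),
    ]
    print(round(weighted_proportion(strata), 3))
```

In this made-up example each respondent stands in for the plants it represents in its stratum, so the weighted estimate (about 0.32) differs from the unweighted share of "yes" answers among the 140 respondents (55 of 140), which over-represents the more heavily sampled stratum.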
See Part A, Section 2(b) of this Information Collection Request for a detailed discussion
of the technical and economic analyses.
EPA intends to use the following contractors to assist in conducting this survey:
Sample Frame Preparation and Analysis:
Eastern Research Group
14555 Avion Parkway
Suite 200
Chantilly, VA 20151
Questionnaire Development and Analysis Support (Financial/Economic Data):
Abt Associates Inc.
55 Wheeler Street
Cambridge, MA 02138
Statistical Design and Analysis:
Battelle
505 King Avenue
Columbus, OH 43201
6. REFERENCES
Cochran, W.G. (1977). Sampling Techniques. New York: Wiley.
Kish, L. (1965). Survey Sampling. New York: Wiley.