Description of Statistical Design

Attachment 15 -Description of statistical design.pdf

Willingness to Pay for Improved Water Quality in the Chesapeake Bay (Revised)

OMB: 2010-0043

Attachment 15: Description of Statistical Survey Design

The following represents the anticipated experimental design for survey implementation, along
with the associated number of completed surveys that will be required. Part B of this supporting
statement provides detail on the sampling design. The proposed design and sampling plan are based on
standard design and sampling theory for choice experiments and population surveys, as outlined by
Louviere et al. (2000), Kuhfeld (2009), and Dillman (2000). EPA notes that the anticipated experimental
design described here is preliminary and may be refined during design evaluations to account for
issues such as dominant or dominated pairs and ecological feasibility, and to remove attribute
combinations that do not provide information for estimation.

The purpose of the Chesapeake Bay survey is to estimate average per-household parameters
(e.g., willingness to pay and choice probabilities) within a given survey population. Additional analysis
that differentiates per-household parameters may be conducted within groups of households that use
or do not use the Chesapeake Bay.

Experimental design for the choice experiments
Based on focus groups and pretests, and guided by realistic ranges of attribute outcomes, the
anticipated experimental design includes a fixed status quo or “no policy” option (Option A) and two
multi-attribute choice options or alternatives, Options B and C. Options B and C are each
characterized by three potential levels for each environmental attribute and six different levels
of annual household cost. The study design will employ two split-sample treatments. The first consists
of three different baselines, where Option A (the status quo choice) represents either a constant,
declining, or improving baseline in the environmental attributes, relative to current levels. The second
treatment involves two different reference years for predictions of the attribute levels (2025 or 2040).

Each of these split-sample experiments will be conducted in each of the three geographic divisions:
1. Bay States: Maryland, Virginia, District of Columbia
2. Watershed States: Delaware, New York, Pennsylvania, and West Virginia
3. Other East Coast States: Vermont, New Hampshire, New Jersey, Massachusetts,
Connecticut, Rhode Island, Maine, North Carolina, South Carolina, Georgia, and Florida

Within these geographic divisions, the split-sample experiment cells will collect more detailed
information in the Bay states, as stipulated in Table A15-1. Based on the three geographic strata, three
baseline versions, and two reference years for environmental conditions, the study will consist of the
following 18-cell design:
Table A15-1. Split-sample design cells.
Cell      Geographic division   Baseline factor   Reference year for environmental conditions
Cell 1    Bay States            Declining         2025
Cell 2    Bay States            Constant          2025
Cell 3    Bay States            Improving         2025
Cell 4    Watershed             Declining         2025
Cell 5    Watershed             Constant          2025
Cell 6    Watershed             Improving         2025
Cell 7    Other East Coast      Declining         2025
Cell 8    Other East Coast      Constant          2025
Cell 9    Other East Coast      Improving         2025
Cell 10   Bay States            Declining         2040
Cell 11   Bay States            Constant          2040
Cell 12   Bay States            Improving         2040
Cell 13   Watershed             Declining         2040
Cell 14   Watershed             Constant          2040
Cell 15   Watershed             Improving         2040
Cell 16   Other East Coast      Declining         2040
Cell 17   Other East Coast      Constant          2040
Cell 18   Other East Coast      Improving         2040
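
The 18 cells are the full crossing of the three factors. As a minimal sketch (the variable names are illustrative and not part of the survey documentation), the cell numbering in Table A15-1 can be reproduced by letting the reference year vary slowest and the baseline vary fastest:

```python
from itertools import product

# Factor levels as listed in Table A15-1.
reference_years = [2025, 2040]
divisions = ["Bay States", "Watershed", "Other East Coast"]
baselines = ["Declining", "Constant", "Improving"]

# Reference year varies slowest and baseline fastest, matching the table order.
cells = {
    number: {"division": d, "baseline": b, "reference_year": y}
    for number, (y, d, b) in enumerate(
        product(reference_years, divisions, baselines), start=1
    )
}

for number, cell in cells.items():
    print(f"Cell {number}: {cell['division']}, {cell['baseline']}, {cell['reference_year']}")
```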

Options B and C, the alternatives to the status quo baseline (Option A), are characterized by
different levels for the following six attributes, including cost. The number of levels corresponding to
each of these attributes is listed below:
1. Change in water clarity in Options B and C (x1B; x1C) – 3 possible levels
2. Change in adult blue crab abundance in Options B and C (x2B; x2C) – 3 possible levels
3. Change in adult striped bass abundance in Options B and C (x3B; x3C) – 3 possible levels
4. Change in oyster abundance in Options B and C (x4B; x4C) – 3 possible levels
5. Change in lake condition in Options B and C (x5B; x5C) – 3 possible levels
6. Cost in Options B and C (x6B; x6C) – 6 possible levels

This implies an experimental design characterized by three levels for each of the five environmental
attributes and six levels for cost [3^5 × 6] for each alternative option, or [3^10 × 6^2] for Options B
and C together.
To construct a preliminary main effects design with 72 profiles that is sufficiently flexible to
estimate alternative-specific main effects and response patterns (i.e., a non-generic design), we began
with a 3^5 × 6 orthogonal fractional factorial design with 144 profiles. We then combined the elements of
this design into pairs that reflect trade-offs at the margin (i.e., improvements in some attributes that
are attained at the cost of a decrease in other environmental attributes and/or an increase in the overall
cost of the program). Finally, these pairs were blocked (see footnote 1) in such a way that variability of
the environmental and cost attributes within a block would be maximized (and hence the main effects would
not be confounded with the block effects). The result is a design with 72 profiles, with attributes labeled
following the above notation, and levels indicated by integers 1...N, where N for each attribute is the
number of levels identified above.
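
To make the size of the full factorial concrete, the short calculation below multiplies out the level counts listed above. It is only an illustrative sketch; the dictionary keys are shorthand for attributes x1 through x6 and are not taken from any EPA software.

```python
from math import prod

# Number of levels per attribute for a single alternative (x1..x5 environmental, x6 cost).
levels = {"x1": 3, "x2": 3, "x3": 3, "x4": 3, "x5": 3, "x6": 6}

# Full factorial for one alternative: 3^5 * 6.
profiles_one_alternative = prod(levels.values())

# Full factorial for Options B and C jointly: 3^10 * 6^2.
profiles_both_alternatives = profiles_one_alternative ** 2

print(profiles_one_alternative)    # 1458
print(profiles_both_alternatives)  # 2125764
```

Against roughly 2.1 million possible B/C combinations, the 72-profile design is a very small fraction, which is why a fractional factorial is used.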

Following common practice in the environmental economics literature, we anticipate three
choice questions per survey. This allows the 72 profiles to be included (orthogonally blocked) in 24
unique survey booklets, as illustrated in Table A15-2. The attribute levels applied within surveys are
summarized in Table A15-3. Monte Carlo evidence suggests that 6 to 12 completed responses are
required for each profile in order to achieve large-sample statistical properties for choice experiments
(Louviere et al. 2000, p. 104, citing Bunch and Batsell 1989). Following this guidance, the above design
will require 24×12 = 288 completed surveys per cell, or 12 completed surveys for each unique survey
booklet. This will provide a total of 288×3 = 864 profile responses per cell.
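
The arithmetic behind these figures can be laid out step by step. The sketch below uses only numbers stated in this attachment; the variable names are illustrative.

```python
profiles = 72                 # design profiles (choice questions) per cell
questions_per_booklet = 3     # choice questions per survey
responses_per_profile = 12    # upper end of the 6-12 range cited from Louviere et al.
cells = 18                    # split-sample cells in Table A15-1

booklets = profiles // questions_per_booklet                   # unique survey booklets
surveys_per_cell = booklets * responses_per_profile            # completed surveys per cell
profile_responses_per_cell = surveys_per_cell * questions_per_booklet
surveys_total = surveys_per_cell * cells                       # completed surveys overall

print(booklets, surveys_per_cell, profile_responses_per_cell, surveys_total)
# 24 288 864 5184
```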
Table A15-2. The set of 72 design profiles within each geographic division
and reference year by baseline cell.
Booklet  X1B  X2B  X3B  X4B  X5B  X6B  X1C  X2C  X3C  X4C  X5C  X6C
1        1    2    1    2    2    6    2    1    1    1    2    5
1        2    1    2    3    3    3    3    1    2    3    1    1
1        3    3    3    1    2    2    3    3    2    2    2    1
2        1    1    2    2    2    4    1    1    3    1    2    5
2        2    2    1    3    1    5    1    2    1    3    2    6
2        3    3    3    2    3    3    3    3    2    3    2    2
3        2    2    2    1    2    2    1    2    1    3    2    5
3        2    3    1    2    3    6    1    1    1    2    3    2
3        3    1    3    3    1    1    2    1    3    3    3    2
4        1    1    2    2    3    3    1    3    3    2    1    4
4        2    3    3    3    1    1    2    2    3    3    2    1
4        3    1    2    2    3    5    3    1    2    1    2    2
5        1    1    3    1    1    4    1    2    1    1    2    4
5        2    3    3    2    3    6    1    3    3    2    1    2
5        3    2    3    2    3    3    3    3    2    3    2    3
6        1    1    2    1    3    3    2    2    3    1    3    4
6        2    1    1    2    2    3    2    3    1    2    1    5
6        3    1    3    3    1    6    3    2    3    2    1    5
7        1    1    2    2    3    2    2    1    3    2    1    4
7        1    3    3    1    2    1    1    2    3    3    2    2
7        2    2    1    3    1    3    3    2    1    1    1    1
8        2    1    1    1    3    2    2    1    3    1    2    4
8        2    2    2    2    3    1    2    2    2    3    2    1
8        3    3    2    3    1    4    1    3    1    3    1    3
9        1    1    2    3    3    5    1    3    2    3    2    6
9        1    2    2    1    1    6    1    1    1    1    1    3
9        3    1    1    2    2    1    2    1    3    2    2    3
10       1    2    1    3    1    1    1    3    1    3    2    3
10       1    3    3    2    3    3    2    3    3    3    3    5
10       3    1    2    1    2    6    1    1    2    1    1    3
11       2    1    3    1    3    1    2    2    2    1    3    1
11       2    3    1    2    2    6    1    3    2    2    2    5
11       3    2    3    1    3    3    3    2    3    2    2    4
12       2    2    1    1    1    3    2    1    1    3    1    4
12       3    1    2    2    2    1    2    2    2    2    2    1
12       3    3    2    3    3    2    3    2    3    3    3    1
13       1    3    1    1    3    1    3    3    1    2    3    4
13       2    2    2    1    3    6    2    2    2    2    1    4
13       2    3    1    3    2    5    3    1    1    3    2    3
14       1    3    1    2    3    1    2    2    1    2    3    2
14       2    1    2    1    2    4    2    1    2    3    1    6
14       3    2    2    2    1    6    3    1    3    2    1    5
15       1    2    1    3    3    5    3    2    1    1    3    4
15       2    3    2    3    2    3    2    3    3    2    2    4
15       3    1    1    2    1    6    3    1    1    1    2    6
16       1    2    3    1    1    5    1    3    2    1    1    6
16       2    2    3    3    3    6    3    2    3    1    3    5
16       3    1    1    2    3    1    3    3    1    1    3    2
17       1    1    3    2    2    5    1    2    3    2    1    6
17       2    3    2    1    3    6    1    3    3    1    3    6
17       3    1    1    3    1    6    2    1    1    1    1    4
18       1    2    2    2    3    3    1    1    1    2    3    1
18       2    3    3    1    2    2    2    2    3    2    2    3
18       3    2    2    2    1    2    3    2    3    3    1    3
19       1    1    3    3    2    1    3    1    3    3    1    2
19       1    3    2    3    1    4    1    2    2    2    1    2
19       3    2    2    1    3    5    3    3    1    1    3    6
20       1    2    2    3    2    2    1    2    1    3    3    4
20       2    3    2    2    1    5    2    2    1    2    1    2
20       3    3    3    3    3    4    2    3    3    3    1    2
21       2    3    2    1    1    1    1    3    3    1    1    1
21       3    2    3    2    2    6    3    2    3    1    1    3
21       3    3    1    3    3    4    3    2    1    3    2    2
22       1    3    1    1    1    1    2    3    1    1    2    3
22       2    1    3    3    3    6    1    1    3    1    3    2
22       2    3    2    3    3    5    1    2    2    3    3    4
23       2    1    2    2    1    5    2    1    1    1    1    2
23       3    1    1    3    3    5    1    1    2    3    3    4
23       3    2    2    1    2    4    3    3    2    1    1    3
24       1    2    3    1    2    5    1    1    3    3    2    6
24       1    3    1    2    1    2    3    3    1    2    2    4
24       3    3    2    1    1    5    3    3    1    1    2    5

Footnote 1. EPA assigned each profile to an independent subset, or “block” of profiles. Blocking reduces
the number of profiles each respondent sees, thus reducing respondent burden.

Table A15-3: Attribute Levels Included in Each Survey Version.
                                           Attribute Levels
Attribute                       Baseline   1       2       3       4      5      6
Declining Baseline
Water Clarity (feet)            2          3       3.5     4.5
Adult Striped Bass (millions)   21         24      30      36
Adult Blue Crab (millions)      235        250     285     328
Oysters (tons)                  2,800      3,300   5,500   10,000
Low Algae Level Lakes           2,300      2,900   3,300   3,850
Annual Household Cost           -          $20     $40     $60     $180   $250   $500
Constant Baseline
Water Clarity (feet)            3          3       3.5     4.5
Adult Striped Bass (millions)   24         24      30      36
Adult Blue Crab (millions)      250        250     285     328
Oysters (tons)                  3,300      3,300   5,500   10,000
Low Algae Level Lakes           2,900      2,900   3,300   3,850
Annual Household Cost           -          $20     $40     $60     $180   $250   $500
Improving Baseline
Water Clarity (feet)            3.3        3.3     3.5     4.5
Adult Striped Bass (millions)   26         26      30      35
Adult Blue Crab (millions)      260        260     312     340
Oysters (tons)                  4,300      4,300   5,500   10,000
Low Algae Level Lakes           3,100      3,350   3,600   3,850
Annual Household Cost           -          $20     $40     $60     $180   $250   $500

Realized Sample Sizes for Maximum Acceptable Sampling Error
The goal of the choice experiment is to estimate regression coefficients from mixed or
conditional logit models that may be used to estimate willingness to pay (WTP) for multi-attribute policy
alternatives, or the likelihood of choosing a given multi-attribute alternative, following standard random
utility modeling procedures (Haab and McConnell 2002). Hence, the sample size requirements are
determined by the accuracy of the parameter estimates in the WTP models.
The resulting sample design will be a single-stage stratified sample. No clustering (multiple
stages of selection) will be necessary. Probabilities of selection will differ across the
geographic divisions defined in Part B, Section 2 “Survey design”, and will lead to varying sampling
weights, as demonstrated in Table A15-4 (assuming that the design contains 9 cells). Due to these
varying weights, under assumptions of constant response rate and fixed sample size, the expected design
effect due to differential baseline weights is 1.75. The realized design effect will likely be higher
because non-response adjustments add variability to the weights within cells.

Table A15-4. Sample size and accuracy projections.
Geographic         Population    Expected      Expected   Standard error,   Standard error,
division           size          sample size   weights    50% incidence     10% incidence
Bay States         5,479,176     1,728         3,171      0.017             0.010
Watershed          13,442,787    1,728         7,779      0.017             0.010
Other East Coast   25,431,478    1,728         14,717     0.017             0.010
Overall            44,353,441    5,184         8,556      0.011             0.007

Source: The household population size for each region was obtained from U.S. Census Bureau
(2012). 2010 Census Summary File 1. Retrieved May 31, 2012 from http://factfinder2.census.gov/.
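
The expected weights in Table A15-4 are consistent with dividing each division's household population by its expected sample size (six cells of 288 completes per division). The sketch below checks that relationship; it makes no attempt to reproduce the standard errors or the design effect, whose exact formulas are not given here.

```python
# Household population and expected sample size per division, from Table A15-4.
divisions = {
    "Bay States": (5_479_176, 1_728),
    "Watershed": (13_442_787, 1_728),
    "Other East Coast": (25_431_478, 1_728),
}

# Base sampling weight: households represented by each completed survey.
weights = {name: pop / n for name, (pop, n) in divisions.items()}

total_population = sum(pop for pop, _ in divisions.values())
total_sample = sum(n for _, n in divisions.values())
overall_weight = total_population / total_sample

for name, w in weights.items():
    print(f"{name}: {w:,.0f}")
print(f"Overall: {overall_weight:,.0f}")
```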


The maximum acceptable sampling error for predicting response probabilities (the likelihood of
choosing a given alternative) in the present case is ±10%, assuming a true response probability of 50%
associated with a utility indifference point. Given the survey population size, this level of precision
requires a minimum sample size of approximately 96 observations. The number of observations
(completed surveys) required to obtain large-sample properties for the choice experiment design provides
more than sufficient observations to obtain this required precision for population parameters.
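
The figure of approximately 96 observations matches the standard sample size formula for a proportion, n = z^2 p(1-p) / e^2. The sketch below assumes a 95% confidence level (z = 1.96), which this attachment does not state explicitly, and ignores the finite population correction because the household population is very large.

```python
z = 1.96          # assumed 95% confidence level (not stated in the text)
p = 0.50          # true response probability at the utility indifference point
margin = 0.10     # maximum acceptable sampling error (+/- 10%)

# Minimum sample size for estimating a proportion with the given margin of error.
n_min = z**2 * p * (1 - p) / margin**2

print(round(n_min, 2))  # 96.04
```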

Projected sample sizes given the potential non-response
Survey non-response is a common phenomenon, and the sample design must account for it in
advance. Based on recent experience with surveys of a similar nature, EPA
expects the response rate for the Chesapeake Bay survey to be close to 30%. Additionally, the expected
eligibility rate for a mail survey is 92%, which accounts for vacant, seasonal, non-existent, and otherwise
ineligible units. The projected number of required mailings is given in Table A15-5 for different
scenarios (response rates of 20% and 30%) and different sample size determination methods (expected
number of mailings vs. the number of mailings that ensures 90% probability of reaching the cell target
sample size).

Table A15-5. Required sample size.
Target cell size: n=288
                       r=20% response rate              r=30% response rate
                       Mean projection,  90% prob to    Mean projection,  90% prob to
                       n/r               achieve cell   n/r               achieve cell
                                         size                             size
Required cell size     1,565             1,672          1,043             1,111
District of Columbia   458               488            304               324
Maryland               3,696             3,948          2,462             2,622
Virginia               5,238             5,596          3,490             3,714
Delaware               240               256            160               170
New York               5,112             5,462          3,406             3,626
Pennsylvania           3,506             3,746          2,336             2,486
West Virginia          534               570            356               378
Connecticut            506               540            340               360
Florida                2,740             2,930          1,826             1,946
Georgia                1,324             1,414          882               940
Maine                  206               220            138               146
Massachusetts          940               1,004          626               668
New Hampshire          192               204            128               136
New Jersey             1,186             1,268          790               842
North Carolina         1,382             1,478          922               982
Rhode Island           152               164            102               108
South Carolina         666               710            444               472
Vermont                94                102            64                68
Total:                 28,172            30,100         18,776            19,988

The sample size required for 90% probability of achieving the cell size is computed as the 90th
percentile of the negative binomial distribution with success probability equal to the response rate and
the number of successes equal to the target cell size.
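
The following is a minimal sketch of this calculation, not EPA's actual code. It rests on two assumptions that the text does not spell out: the success probability is taken as the response rate multiplied by the 92% eligibility rate (the "Required cell size" row appears to be consistent with this, since 288 / (0.20 × 0.92) ≈ 1,565), and the negative binomial is parameterized as the total number of mailings needed to reach the target number of completes. The function name `mailings_for_target` is hypothetical.

```python
from math import exp, lgamma, log

def mailings_for_target(target, p, prob=0.90):
    """Smallest number of mailings N with P(target-th complete arrives by N) >= prob."""
    log_p, log_q = log(p), log(1.0 - p)
    cumulative = 0.0
    trials = target  # at least `target` mailings are needed for `target` completes
    while True:
        # Log-probability that the target-th success occurs exactly on this mailing:
        # C(trials-1, target-1) * p^target * (1-p)^(trials-target)
        log_pmf = (lgamma(trials) - lgamma(target) - lgamma(trials - target + 1)
                   + target * log_p + (trials - target) * log_q)
        cumulative += exp(log_pmf)
        if cumulative >= prob:
            return trials
        trials += 1

TARGET = 288         # target completes per cell
ELIGIBILITY = 0.92   # expected eligibility rate for a mail survey

for response_rate in (0.20, 0.30):
    p = response_rate * ELIGIBILITY          # assumed per-mailing success probability
    mean_projection = TARGET / p             # expected number of mailings
    q90 = mailings_for_target(TARGET, p)     # 90th percentile of mailings needed
    print(f"r={response_rate:.0%}: mean {mean_projection:.0f}, 90% quantile {q90}")
```

Under these assumptions the results should land close to the "Required cell size" row of Table A15-5; small differences are possible depending on rounding and on the exact parameterization used in the original computation.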



File Type: application/pdf
Author: Ann Speers
File Modified: 2013-09-17
File Created: 2013-08-16
