Web Diary Feasibility Study Overview
The current Consumer Expenditure Diary Survey (CED) uses a pencil-and-paper instrument (PAPI) to collect expenditure information from respondents. PAPI diary data collection has a number of inherent drawbacks, such as limiting entry to a single individual in a single location, and requiring that the main diary keeper carry the diary with them throughout the day. A web diary has the potential to address these limitations, and over time, the feasibility and potential benefits of using a web diary have grown. There is evidence to suggest that accuracy of reporting may be improved through the use of a web diary (Couper, 2008). A web mode also has the potential to improve unit and item response rates, and would allow for easier access across consumer unit (CU) members since respondents can enter expenditure data from any internet-enabled location and would not be tied to a single instrument kept in one location. Finally, a web component has potential cost savings over PAPI due to reduced or eliminated materials, scanning, and data entry expenses. This study will test the feasibility and impact of using a web diary to collect CE diary expenditures, relying on a prototype developed by the Census Bureau.
This study will attempt to answer the following research questions:
1. What are the operational issues related to implementing a web mode?
   - Debrief with field representatives (FRs) to determine:
     - What issues if any did respondents have with the website and login information provided?
     - What technical issues if any were raised by the respondents?
     - How should respondent materials be revised?
   - Debrief with the staff that handle respondent help requests to determine:
     - What kinds of questions did respondents ask?
     - How should respondent materials be revised?
2. How do high-level expenditure reporting rates and data quality differ by mode?
   - Expenditures by test/control for the “Food” category
   - Item-missingness by test/control
   - Expenditures and item-missingness by double-placed diary controls/test
3. How do acceptance and completion rates differ by mode?
   - Week 1 acceptance rate by test/control
   - Week 1 completion rates by test/control
   - Week 2 completion rates by test/control
4. What are web respondents’ specific data entry patterns?
   - Number of log-ins per case
   - Start/stop time stamps by diary day (to determine multiple times per day versus only data entry at the end of the week)
Study Design
Field period
The field period is scheduled for January – March, 2013. (The cases won’t close out, however, until late April due to placement dates.) This allows flexibility due to Census workload and regional office (RO) consolidation plans.
Respondents
The Research sample will serve as the test group. The control group will come from the Production sample for whatever quarter is used for fielding.
Modes and diary changes
The control group will use the current CED methodology. The test group will use multiple modes: an in-person first visit with some CAPI data collection; a telephone reminder at day 3; a telephone reminder at day 8 instead of visit 2; completion of the diary via web; and an in-person third visit with some CAPI data collection.
Changes to the web diary format and content were based on two rounds of usability tests conducted by the Center for Survey Measurement (CSM) at the Census Bureau.¹ (See Attachments H and I.) BLS has added the OMB number and expiration date to the web survey instrument. (See Attachment E for screen shots.) Additionally, a pilot was conducted with 11 BLS staff to gather feedback on the web diary. Copies of those reports can be forwarded as necessary to OMB.
CAPI changes
The CAPI survey instrument will have to be modified to add two questions for both the Research and Production samples: one on internet access and one on how the internet is accessed. (See Attachment F for screening questions.) The first question will allow for matching between Research and Production samples since households/consumer units with internet access are different from those without internet access.² The second question, on how the internet is accessed, will allow BLS to screen out those households that only access the internet through mobile phones.
There will also be questions added into CAPI to debrief both the respondents and FRs. (See Attachment F for debriefing questions.) These will gauge respondents’ general comments, use of records, and burden as well as FRs’ assessments of the respondents and their general comments.
Finally, current questions will be modified to ask whether the respondent made entries in the web diary, to provide FR instructions to probe the respondent for recall, and to add a tab to enter receipts/recall. (See Attachment D for CAPI changes.)
Methodology
The control group will consist of production sample cases from the study’s field period, and will be administered the current diary using standard procedures. The control group will be matched on socio-demographic characteristics to the test group after the field period concludes, as mentioned above.
Test Group
The test group will consist of research sample cases, and will be administered the web diary using modified procedures. The contact schedule for the Test Group is proposed as follows:
Visit 1 - All respondents randomly assigned to the treatment group will be asked the internet access screening question and a follow-up question on how the internet is accessed. The web diary (URL) would be placed in person. Respondents would be given materials describing the web diary and instructions on how to log in and complete the diary. If the CU has no internet connection, or has an internet connection only through mobile telephones, then the CU is screened out. Any CAPI questions would be asked as currently scheduled:
Front section: http://www.bls.gov/cex/ced/2011/csxfront.htm
Household Characteristics: http://www.bls.gov/cex/ced/2011/csxsection1.htm
CU characteristics: http://www.bls.gov/cex/ced/2011/csxsection2.htm
Back section (not asked, but FR-completed): http://www.bls.gov/cex/ced/2011/csxback.htm
Non-English households in the Research group will be screened out as Type C-language barrier (“WD Language Barrier”). (Type Cs are ineligible cases for the FRs’ response rates.) There will not be a Spanish web survey, but this will be revisited if the web survey is put into Production.
Spawned cases (multiple CUs found in one address) will also be coded out as Type Cs.
Visit 2 - The second-week visit and recall option will not occur; instead, CUs will be reminded about the diary via telephone on day 3 and day 8. The FRs will not have access to the respondents’ web diaries. Therefore, the FR question will be modified to ask respondents directly: “Did you record any expenses in the Web Diary during the first week?” FRs will also be instructed to probe for any purchases not entered in the web diary.
The following sections of CAPI questions would be asked as currently scheduled:
Coverage (not asked, but FR-completed): http://www.bls.gov/cex/ced/2011/csxcoverage.htm
Back section (not asked, but FR-completed): http://www.bls.gov/cex/ced/2011/csxback.htm#WK2_ST2
FRs would not visit week 1 non-contacts.
Visit 3 - The CUs would be visited in-person at the end of the diary period to thank them for their participation. Any CAPI questions would be asked as currently scheduled:
Work experience and income: http://www.bls.gov/cex/ced/2011/csxsection4a.htm
Work experience and income for the CU: http://www.bls.gov/cex/ced/2011/csxsection4b.htm
Coverage (not asked, but FR-completed): http://www.bls.gov/cex/ced/2011/csxcoverage.htm
Back section (not asked, but FR-completed): http://www.bls.gov/cex/ced/2011/csxback.htm#WK2_ST2
FRs will also collect receipts if provided by respondents. FRs would enter any receipts into a modified CAPI instrument. (The CAPI instrument will be modified to include a new tab that mimics the web survey instrument. FRs will be trained on the new web survey. There will also be instructions in CAPI on how to access the new tab.)
At the end of the last visit, both respondents and FRs will be debriefed using the CAPI instrument.
There will be no reinterviews for Research cases.
Sampling and sample size
A sample size of 1,000 completed diaries was chosen for two reasons: to allow a statistically significant difference in completion rates between the production sample and the treatment group to be detected, and to obtain a sample robust enough to examine trends in expenditures.
Completion rate analyses
To detect a statistically significant difference in completion rates between the production sample and the treatment group, the sample size for the treatment group must be 900. The sample size is calculated using the following criteria:
90% of CUs that agree to fill out a diary actually do so (the current rate)
2012Q4 production sample is used with n=3,200
(See Table 1: Estimated sample sizes below.)
Note that the completion rate is defined as the percent of CUs that fill out (or “complete”) a diary given that they agreed to do so:
Completion Rate = Prob(CU fills out a diary | CU agreed to fill out a diary)
                = Prob(CU completes a diary | a diary was placed)
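As a rough illustration of what these criteria imply, the sketch below computes the standard error of the difference between two completion rates for a production sample of 3,200 and a treatment sample of 900, both near 90%. The source does not state a significance level or power; the two-sided 5% level and 80% power used here are assumptions, as is the normal-approximation two-proportion comparison itself.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(p, n_control, n_treatment, alpha=0.05, power=0.80):
    """Return (standard error, approximate detectable difference).

    Uses the normal approximation for a difference of two proportions.
    alpha and power are illustrative assumptions, not values from the study.
    """
    se = sqrt(p * (1 - p) * (1 / n_control + 1 / n_treatment))
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return se, (z_alpha + z_power) * se


se, mde = detectable_difference(p=0.90, n_control=3200, n_treatment=900)
print(f"standard error of the difference: {se:.4f}")
print(f"approximate detectable difference: {mde:.3f}")
```

Under these assumed settings, a difference in completion rates of roughly three percentage points would be detectable.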
Table 1: Estimated sample sizes.
 | Research sample size for 1,000 total completed diaries |
Starting sample addresses | 1,313 |
Occupied housing units (occupancy rate = 80%) | 1,050 |
Eligible CUs (screening rate = 68%) | 714 |
Completed diaries for week 1 (CED response rate = 70%) | 500 |
Completed diaries for week 2 (CED attrition rate = 0%) | 500 |
Total completed diaries (weeks 1 and 2) | 1,000 |
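The arithmetic behind Table 1 can be sketched as a simple cascade of rates applied to the starting addresses. The function and key names below are illustrative, and rounding to the nearest whole case at each step is an assumption that reproduces the table's figures.

```python
def sample_cascade(addresses, occupancy=0.80, screening=0.68,
                   response=0.70, attrition=0.0):
    """Apply the Table 1 rates step by step, rounding at each stage."""
    occupied = round(addresses * occupancy)      # occupied housing units
    eligible = round(occupied * screening)       # CUs passing the internet screener
    week1 = round(eligible * response)           # completed week 1 diaries
    week2 = round(week1 * (1 - attrition))       # completed week 2 diaries
    return {
        "occupied": occupied,
        "eligible": eligible,
        "week1": week1,
        "week2": week2,
        "total": week1 + week2,
    }


cascade = sample_cascade(1313)
print(cascade)
```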
Expenditure Analysis
The field test is expected to span a 3-month period, and the corresponding Diary Production data will be used as the control group. The control group's sample size is therefore the total number of usable weekly diaries in one quarter of diary data. Accordingly, the 2009 CE Phase III Diary Quarter 1 data are used as source data, and the number of weekly diaries in the resulting database is used as the control group’s sample size (n = 3,596 weekly diaries).
The table below shows a few of the item categories/major groups from the Diary publication stub. Although the team will not achieve statistical significance for every comparison at every significance level, it may still be able to identify practically significant differences by examining trends.
For the most part, the treatment group needs hundreds or thousands of weekly cases for the effect of collecting data via a web diary to be statistically significant. For example, at least 3,345 weekly cases must be assigned to the treatment group for a 10% difference between the reported expenditures of the treatment and control groups to be statistically significant at the average weekly expenditures level. Detecting a 25% difference requires at least 413 weekly cases, and detecting only a 50% difference requires at least 100 weekly cases. The table also gives the minimum number of household (HH) addresses, computed as 50% of the weekly cases, needed to collect the minimum number of weekly cases.
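The source does not give the formula behind these minimums. One reading that is numerically consistent with the usable-diary figures in the table is a two-sample comparison against a fixed control group of 3,596, in which the treatment size n satisfies 1/n + 1/3,596 = k·d² for a relative difference d and a category-specific constant k. The sketch below calibrates k from one table entry and predicts the others; the relationship itself and the function names are assumptions, not the authors' stated method.

```python
import math

N_CONTROL = 3596  # usable weekly diaries in the control quarter (from the text)


def calibrate(n_known, diff_known, n_control=N_CONTROL):
    """Back out the category constant k from one known (n, difference) pair."""
    return (1 / n_known + 1 / n_control) / diff_known ** 2


def min_treatment_n(k, diff, n_control=N_CONTROL):
    """Treatment size needed for a given difference; infinite if unattainable."""
    gap = k * diff ** 2 - 1 / n_control
    return math.inf if gap <= 0 else 1 / gap


# Average weekly expenditures: calibrate on the 10% entry (1,271 usable diaries).
k_avg = calibrate(1271, 0.10)
print(round(min_treatment_n(k_avg, 0.25)), round(min_treatment_n(k_avg, 0.50)))

# Alcoholic beverages: calibrate on the 25% entry (1,680 usable diaries).
k_alc = calibrate(1680, 0.25)
print(min_treatment_n(k_alc, 0.10), round(min_treatment_n(k_alc, 0.50)))
```

Under this reading, an entry of +∞ arises when the fixed control group alone is too small to support the comparison, so no treatment sample size would suffice.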
The Treatment Group’s Minimum Sample Size Needed For Differences to be Statistically Significant
Item category | 10% difference: # Usable Diaries | 10% difference: # Weekly Cases | 10% difference: # HH Addresses | 25% difference: # Usable Diaries | 25% difference: # Weekly Cases | 25% difference: # HH Addresses | 50% difference: # Usable Diaries | 50% difference: # Weekly Cases | 50% difference: # HH Addresses |
Average weekly expenditures | 1,271 | 3,345 | 1,673 | 157 | 413 | 207 | 38 | 100 | 50 |
Food | 874 | 2,300 | 1,150 | 116 | 305 | 153 | 28 | 74 | 37 |
Food at Home | 926 | 2,437 | 1,219 | 122 | 321 | 161 | 30 | 79 | 40 |
Food Away from Home | 1,797 | 4,729 | 2,365 | 203 | 534 | 267 | 49 | 129 | 65 |
Alcoholic Beverages | +∞ | +∞ | +∞ | 1,680 | 4,421 | 2,211 | 311 | 818 | 409 |
Apparel and Services | +∞ | +∞ | +∞ | 3,722 | 9,795 | 4,898 | 524 | 1,379 | 690 |
Transportation | 13,935 | 36,671 | 18,336 | 524 | 1,379 | 690 | 118 | 311 | 156 |
Healthcare | +∞ | +∞ | +∞ | 1,125 | 2,961 | 1,480 | 228 | 600 | 300 |
Entertainment | 10,568 | 27,811 | 13,905 | 487 | 1,282 | 641 | 111 | 292 | 146 |
Personal Care Products and Services | 51,395 | 135,250 | 67,625 | 632 | 1,663 | 832 | 140 | 368 | 184 |
(Note: A 38% participation rate is assumed for converting usable diaries into weekly cases, where a case is defined as a weekly diary.)
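The conversion described in the note can be sketched as follows: usable diaries are divided by the 38% participation rate to get weekly cases, and household addresses are taken as half the weekly cases. Rounding weekly cases to the nearest integer and rounding addresses up are assumptions; a few address entries in the table differ by one, presumably because they were computed from unrounded intermediate values.

```python
import math

PARTICIPATION_RATE = 0.38  # from the table note


def weekly_cases(usable_diaries):
    """Weekly cases to assign so that the expected usable diaries are met."""
    return round(usable_diaries / PARTICIPATION_RATE)


def hh_addresses(cases):
    """Household addresses, at two weekly diaries per address (rounded up)."""
    return math.ceil(cases / 2)


for usable in (1271, 157, 38):  # average weekly expenditures row
    cases = weekly_cases(usable)
    print(usable, cases, hh_addresses(cases))
```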
Table 1 shows the assumptions for the starting sample given a targeted number of completes and screening and response rates. The Production sample will serve as the control group (meaning not offered the web diary option). The research sample will be randomly selected. (See Attachment G – Sampling Plan.)
III. Burden Hour Estimate
BLS estimates that this study will require 534 burden hours. We base this estimate upon the following assumptions:
 | Mins. | Sample | Total Hours |
Estimated time to be screened for Research sample | 1 | 1,313 | 22 |
Estimated time to complete diary (first week) | 20³ | 500 | 167 |
Estimated time to complete diary (second week) | 20 | 500 | 167 |
Additional time for CAPI question during different visits | 15 | 714 | 179 |
TOTAL | | | 534 |
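The burden estimate can be checked with a short calculation: minutes per respondent times sample size, divided by 60. The sketch below uses 20 minutes for each diary week, per footnote 3; the row labels and the half-up rounding helper are illustrative choices.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(x + 0.5)


# (activity, minutes per respondent, sample size)
rows = [
    ("Screening for Research sample", 1, 1313),
    ("Diary, first week", 20, 500),
    ("Diary, second week", 20, 500),
    ("Additional CAPI questions", 15, 714),
]

hours = [minutes * sample / 60 for _, minutes, sample in rows]
row_hours = [round_half_up(h) for h in hours]
total_hours = round_half_up(sum(hours))

for (label, _, _), h in zip(rows, row_hours):
    print(f"{label}: {h}")
print(f"TOTAL: {total_hours}")
```

The individually rounded rows sum to 535, while rounding the unrounded total gives the stated 534 hours.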
1 Usability testing by CSM was covered by their generic clearance and they submitted their own clearance packages.
2 Consumer units will also be matched by other variables, such as geography and income.
3 BLS estimates that the web diary will take 20 minutes to complete per week, based on results of the BLS internal pilot.
File Type | application/msword |
Author | Scott Fricker |
Last Modified By | Murphy_P |
File Modified | 2012-10-09 |
File Created | 2012-10-09 |