OMB #15XX-NEW
1. Describe (including a numerical estimate) the potential respondent universe and any sampling or other respondent selection methods to be used.
The potential respondent universe is composed of wage and investment and self-employed taxpayers living in the United States. These taxpayers file a Form 1040, Form 1040A, or Form 1040EZ (as well as supporting forms and schedules). For this survey, we are focusing on tax year 2010. The sample frame will be developed using the IRS Individual Returns Transaction File (IRTF). Individuals who filed a tax year 2010 tax return will be sampled. Taxpayers who filed more than one tax return (e.g., an original and a superseding return) will be sampled based on the most recent tax return available at the time of sampling and will have only one chance of being sampled. Some populations will be explicitly excluded from the survey population: taxpayers who are minors, deceased taxpayers, and taxpayers with international addresses, including active duty military serving overseas.
When subpopulations vary considerably, it is advantageous to sample each subpopulation (stratum) independently. Stratification is the process of grouping members of the population into relatively homogeneous subgroups before sampling. The strata should be:
Mutually Exclusive. Members must be assigned to only one stratum; and
Collectively Exhaustive. No members can be excluded.
Then, random or systematic sampling can be applied within each stratum. Stratification often improves the representativeness of the sample by reducing sampling error. It also tends to produce a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population. For these reasons, the proposed sample design for this study is a stratified random sample.
The sampling approach has been designed to ensure that key taxpayer subgroups are adequately represented in the study findings. The stratification includes two main criteria:
Preparation method. The method by which the taxpayer prepared his or her return.
Prepared by a paid professional (paid)
Prepared using tax preparation software (soft)
Prepared by hand (self)
Differential burden. A variable reflecting the type of activities performed by taxpayers to meet their federal tax obligations. Taxpayers are assigned the burden level corresponding to the highest burden item reported on their tax forms.
Low
Low-Medium
Medium
Medium-High
High
Differential burden is summarized in the following table.
Strata | Definition |
Low | Wage income; Interest income; Unemployment income; Withholding; Earned income tax credit (with no qualifying children) or advance EIC; Does not meet any of the conditions for higher levels of differential burden |
Low-Medium | Capital gain income (includes capital gains distributions and undistributed capital gains); Dividend income; Earned income tax credit (with qualifying children); Estimated tax payments; Retirement income (includes SS benefits, IRA distributions, or pensions and annuities); Any non-refundable credit (includes child and dependent care expenses, education credits, child tax credit, elderly or disabled credit); Household employees; Non-business adjustments; Does not meet any of the conditions for higher levels of differential burden |
Medium | Itemized deductions (includes mortgage interest, interest paid to financial institutions, charitable contributions, and medical expenses); Foreign income, expense, tax, credit, or payment; Moving expenses; Simple Schedule C or C-EZ; General business credit; Does not meet any of the conditions for higher levels of differential burden |
Medium-High | Farm income as reported on Schedule F; Owns rental property as reported on Schedule E, including farm rental and low-income housing; Estate or trust income as reported on Schedule E; Employee business expense deductions; Files AMT without AMT preference items; Prior year alternative minimum tax credit; Investment interest expense deduction; Net loss as reported on Schedule C; Depreciation or amortization as reported on Schedule C; Expenses for business use of home as reported on Schedule C; Does not meet any of the conditions for higher levels of differential burden |
High | Cost of goods sold as reported on Schedule C; Partnership or S-Corp income as reported on Schedule E; Files AMT with AMT preference items |
These variables were chosen for stratification because of their importance to the modeling of taxpayer burden and behavioral activities. The differential burden variable is included to ensure that different tax concepts, tax provisions, and tax characteristics with differential recordkeeping and reporting requirements are included. The tax preparation method variable ensures both a proper balance and an adequate representation of paid preparers, software preparers, and self preparers, allowing us to reflect the role of technology and services in meeting recordkeeping and reporting requirements.
The sample design specifications were developed to balance three main considerations. First, the sample must be distributed efficiently so that estimates are reliable (i.e., meet confidence interval range requirements); specifically, the aim is a coefficient of variation under 2%. Second, there must be a sufficient number of cases for the modeling tool to identify the determinants of burden within and across strata. Third, the design should facilitate comparisons between the Individual Taxpayer Burden tax year 2010 survey and the previous tax year 2007 survey.
To make the tax year 2010 survey comparable with the tax year 2007 survey, we continue to use the same design variable (total monetized burden), the same stratified random sampling approach, and the same stratification variables as in the tax year 2007 survey. In the 2007 survey, the Neyman allocation method was used to determine the sample size for each stratum, subject to the total sample size of 15,000. It aimed to minimize the variance of estimated mean burden; however, it limited the sophistication of the modeling of certain thin populations of interest.
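For reference, the unadjusted Neyman allocation assigns sample to each stratum in proportion to the product of its population count and the standard deviation of the design variable. The sketch below is a minimal illustration of that textbook formula only. The function name, the total of 1,000 cases, and the choice of three strata (with inputs taken from the sample allocation table) are assumptions made for the example, and the output is not the allocation used in either survey.

```python
def neyman_allocation(pop_counts, std_devs, total_n):
    """Textbook Neyman allocation: n_h = n * N_h * S_h / sum(N_h * S_h)."""
    products = [n_h * s_h for n_h, s_h in zip(pop_counts, std_devs)]
    total = sum(products)
    return [total_n * p / total for p in products]


# Illustrative inputs only: strata 11 (paid, low), 21 (self, low), 35 (soft, high).
labels = ["11", "21", "35"]
pop_counts = [9_822_075, 3_503_015, 1_639_707]
std_devs = [241.53, 115.25, 1615.97]

allocation = neyman_allocation(pop_counts, std_devs, total_n=1000)
for label, n_h in zip(labels, allocation):
    print(f"stratum {label}: {n_h:.1f} cases")
```

In this illustration the small but highly variable stratum 35 receives more sample than the much larger stratum 11, while the small, low-variance stratum 21 receives very little, which is the kind of thin allocation that can constrain modeling.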
To address this problem, in the tax year 2010 survey we adjusted the Neyman allocation by requiring a minimum number of observations per stratum. The minimum number of observations was defined by applying a common rule of thumb, which states that a sample must include at least 10 or 15 observations per independent variable in a regression model (Stevens, 2002; Bartlett et al., 2001). To be conservative, we chose 15. Given that the expected number of independent variables is 15, the minimum desired number of complete responses for modeling each stratum is 225.
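The arithmetic behind the minimum can be sketched as follows. The code computes the 225-complete minimum and, for the strata whose expected number of respondents equals that minimum, the number of cases that would need to be sampled at the estimated response rates listed in the sample allocation table. Dividing by the response rate and rounding to the nearest whole case is an assumption about how the minimum was converted to a sample size, not a documented step; the variable names are illustrative.

```python
OBS_PER_VARIABLE = 15     # conservative end of the 10-to-15 rule of thumb
EXPECTED_VARIABLES = 15   # expected number of independent variables

min_completes = OBS_PER_VARIABLE * EXPECTED_VARIABLES  # 15 * 15 = 225

# Estimated response rates for strata held at the 225-complete minimum.
response_rates = {
    "11 paid, low": 0.2558,
    "21 self, low": 0.3594,
    "22 self, low-medium": 0.3436,
    "23 self, medium": 0.4355,
    "24 self, medium-high": 0.4046,
    "25 self, high": 0.4119,
    "31 soft, low": 0.3058,
    "35 soft, high": 0.4772,
}

# Assumed conversion: cases to sample = minimum completes / response rate.
required_sample = {
    stratum: round(min_completes / rate)
    for stratum, rate in response_rates.items()
}

for stratum, n_h in required_sample.items():
    print(f"{stratum}: sample {n_h} to expect {min_completes} completes")
```

Under these assumptions the results agree with the sample allocations shown in the table for these strata (for example, 880 cases for stratum 11).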
Our objective is to minimize the variance of estimated mean burden subject to this minimum sample size for modeling, taking expected response rates into account. We start with the same total sample size of 15,000 as in the tax year 2007 study, treating this as our base sample. We then calculate the coefficient of variation, given the target minimum of 225 completes per stratum. Because the coefficient of variation is too large at a sample size of 15,000, we increase the sample size to 20,000 and recalculate. A sample size of 20,000 results in a coefficient of variation of 1.62%, which meets our requirement of under 2%. Because we will be using a new data collection protocol, the larger sample also gives us additional confidence that we will achieve the desired number and mix of complete responses.
A summary of the final sample design is shown in the table below.
Sample allocation for ITB TY2010 survey
Monetized Burden Strata | Projected Pop Count | Est. Mean | Est. Std. Dev. | Est. Response Rate | Sample Allocation | Expected Number of Respondents |
11 paid, low | 9,822,075 | 190.46 | 241.53 | 0.2558 | 880 | 225 |
12 paid, low-medium | 26,114,402 | 295.10 | 370.49 | 0.3213 | 1,644 | 528 |
13 paid, medium | 15,940,360 | 619.92 | 980.87 | 0.3916 | 2,656 | 1,040 |
14 paid, medium-high | 15,732,824 | 946.43 | 1,157.12 | 0.3970 | 3,092 | 1,228 |
15 paid, high | 10,685,596 | 1,837.13 | 2,524.26 | 0.3894 | 4,582 | 1,784 |
21 self, low | 3,503,015 | 85.97 | 115.25 | 0.3594 | 626 | 225 |
22 self, low-medium | 2,707,918 | 157.75 | 225.08 | 0.3436 | 655 | 225 |
23 self, medium | 1,695,808 | 499.83 | 709.51 | 0.4355 | 517 | 225 |
24 self, medium-high | 770,422 | 715.88 | 876.97 | 0.4046 | 556 | 225 |
25 self, high | 288,597 | 923.48 | 881.83 | 0.4119 | 546 | 225 |
31 soft, low | 10,478,344 | 116.18 | 159.24 | 0.3058 | 736 | 225 |
32 soft, low-medium | 15,971,640 | 185.25 | 228.28 | 0.3678 | 619 | 228 |
33 soft, medium | 10,942,941 | 518.45 | 713.67 | 0.4620 | 1,327 | 613 |
34 soft, medium-high | 6,336,666 | 769.97 | 1,015.50 | 0.4396 | 1,093 | 480 |
35 soft, high | 1,639,707 | 1,278.71 | 1,615.97 | 0.4772 | 472 | 225 |
Total | 132,630,316 | 551.90 | | | 20,000 | 7,701 |
Overall CV | | | | | | 1.62% |
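The overall CV can be approximated from the columns of this table. The sketch below assumes the usual variance formula for a stratified mean, takes the expected number of respondents as the effective sample size in each stratum, and ignores the finite population correction. These are assumptions made for illustration, not a statement of the exact computation used for the design.

```python
import math

# (stratum, projected population, est. mean, est. std. dev., expected respondents)
STRATA = [
    ("11", 9_822_075, 190.46, 241.53, 225),
    ("12", 26_114_402, 295.10, 370.49, 528),
    ("13", 15_940_360, 619.92, 980.87, 1040),
    ("14", 15_732_824, 946.43, 1157.12, 1228),
    ("15", 10_685_596, 1837.13, 2524.26, 1784),
    ("21", 3_503_015, 85.97, 115.25, 225),
    ("22", 2_707_918, 157.75, 225.08, 225),
    ("23", 1_695_808, 499.83, 709.51, 225),
    ("24", 770_422, 715.88, 876.97, 225),
    ("25", 288_597, 923.48, 881.83, 225),
    ("31", 10_478_344, 116.18, 159.24, 225),
    ("32", 15_971_640, 185.25, 228.28, 228),
    ("33", 10_942_941, 518.45, 713.67, 613),
    ("34", 6_336_666, 769.97, 1015.50, 480),
    ("35", 1_639_707, 1278.71, 1615.97, 225),
]


def stratified_cv(strata):
    """Return (weighted mean, CV of the stratified mean), ignoring the FPC."""
    total_pop = sum(pop for _, pop, _, _, _ in strata)
    mean = sum(pop / total_pop * m for _, pop, m, _, _ in strata)
    variance = sum(
        (pop / total_pop) ** 2 * sd ** 2 / n for _, pop, _, sd, n in strata
    )
    return mean, math.sqrt(variance) / mean


mean, cv = stratified_cv(STRATA)
print(f"weighted mean: {mean:.2f}, CV: {cv:.2%}")
```

Under these assumptions the weighted mean comes out close to the 551.90 shown in the Total row, and the CV comes out near 1.6%, in line with the reported figure.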
References:
Bartlett, J. E., Kotrlik, J. W., & Higgins, C. C., “Organizational research: Determining appropriate sample size in survey research”, Information Technology, Learning, and Performance Journal, 19: 43-50, (2001).
Stevens, J. P., Applied Multivariate Statistics for the Social Sciences (4th ed.), Mahwah, N.J.: Lawrence Erlbaum Associates, p. 143, (2002).
2. Describe the procedures for the collection of information.
We have two objectives in the design of this protocol. The first is the efficient collection of the current sample; the second is to inform the design of future studies. Variation in how taxpayers are contacted is part of an experimental design, based on current research, that is intended to improve our understanding of the trade-offs and efficiencies available in minimizing the combination of sample size, respondent burden per response, and government costs while meeting our data collection objectives.
Recruitment of respondents will follow a step-wise progression, with four contact stages:
an initial invitation to take the survey,
a reminder communication to the entire sample,
a mail survey sent to non-respondents, and
a final contact urging non-respondents to complete the survey.
The exact form of each of these contacts will vary somewhat, depending upon several factors. First, it will depend upon whether the contractor is able to obtain a telephone number for sampled taxpayers. For those respondents able to be matched with phone numbers, communication will take place via both mail and telephone. For respondents without phone numbers, all communication will necessarily take place via mail. The anticipated success rate for matching the sample to telephone numbers is about 50%.
A $2 incentive will be included in the first mailing for approximately half of the respondents to see if this improves response rates. This incentive will be distributed evenly through both the phone and no-phone groups.
In addition, there will be two different tracks or “cohorts,” in which the order of the offered modes varies. Increasingly, survey researchers have found that in multi-mode surveys, it is best to offer respondents different modes sequentially, rather than concurrently, in order to avoid “mode confusion.” For example, it is better to offer respondents an option to take a survey first by one mode (e.g., mail) and later by another mode (e.g., web) than to initially offer them a choice of mail or web (Schneider et al., 2005).
Because it is not clear from previous research which mode should be offered first, we are dividing the sample into two cohorts: one that will receive an invitation to the web survey first (the web-first group) and one that will receive the paper survey first (the mail group). During all stages of contact, respondents in the web-first group who do not have access to the internet will be able to request a paper survey instrument. To minimize mode confusion, respondents in the mail group will not be offered information about web access until the second contact.
Finally, in an effort to determine the best means of increasing response rates, at the second and fourth contact, the success of mail-based and telephone-based prompting will be compared.
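As an illustration of how the crossed treatments described above could be assigned, the sketch below randomizes each sampled case to a mode cohort and an incentive arm separately within the phone-matched and unmatched groups. It is a hypothetical procedure, not the contractor's: the function name, field names, seed, the equal split between cohorts, and the shuffle-then-cycle scheme are all assumptions.

```python
import random
from itertools import cycle, product


def assign_treatments(cases, seed=2010):
    """Assign each case a cohort and incentive arm, balanced within phone group.

    cases: list of dicts with keys 'id' and 'has_phone' (hypothetical fields).
    Returns a dict mapping case id to its assigned treatments.
    """
    rng = random.Random(seed)
    # Four cells: (web-first, $2), (web-first, none), (mail, $2), (mail, none).
    cells = list(product(["web-first", "mail"], [True, False]))
    assignment = {}
    for has_phone in (True, False):
        group = [c["id"] for c in cases if c["has_phone"] == has_phone]
        rng.shuffle(group)  # random order, then deal cases into cells in turn
        for case_id, (cohort, incentive) in zip(group, cycle(cells)):
            assignment[case_id] = {
                "has_phone": has_phone,
                "cohort": cohort,
                "incentive": incentive,
            }
    return assignment


# Toy sample of 40 cases, half of them matched to a telephone number.
cases = [{"id": i, "has_phone": i % 2 == 0} for i in range(40)]
assignment = assign_treatments(cases)
with_incentive = sum(a["incentive"] for a in assignment.values())
print(f"{with_incentive} of {len(assignment)} cases receive the $2 incentive")
```

Dealing shuffled cases into the four cells in turn keeps the incentive evenly distributed across both the phone and no-phone groups, as the protocol requires, while leaving the assignment of any individual case random.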
Overall, this experimental design is intended to maximize response rates in a cost-effective manner. The most recent ITB survey, conducted by telephone, achieved a response rate of 48% and included a $25 incentive upon completion. Our data collection approach is grounded in the recent survey literature (Dillman et al., 2008) and in Westat’s experience conducting household surveys. In recent years, conducting telephone surveys has become increasingly expensive. In addition, response to telephone surveys has declined at a greater rate than response to other modes, mostly because many households are replacing landlines with cellular devices and using caller ID to screen out unsolicited phone calls. Consequently, we chose web and mail with the expectation that these two modes would provide the highest initial rate of response. In addition, the choice to use primarily visual modes reduces the potential for mode effects. Our hierarchical approach to data collection encourages responses using the most cost-effective technology without detrimentally affecting response rates. The current experiment should therefore shed light on which approach is most cost-effective in increasing survey response. We anticipate that the combination of multiple modes and methods will ultimately yield a response rate of approximately 50%.
A detailed outline of each of the four contact stages follows.
For those respondents who are matched with a telephone number:
|
Web-First Cohort |
Mail Cohort |
Step 1: Invitation to the survey. |
A hardcopy letter will be sent to the targeted respondent inviting him or her to go to the website URL to complete the online survey. The invitation will include information about the survey, assurances that there is no risk associated with participation, and web access information. In addition, respondents will be given directions on how to obtain a paper survey if they do not have access to the web or would prefer a hard copy. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. |
A paper questionnaire will be mailed to the targeted respondent. The paper-and-pencil mail survey will include a postage-paid return envelope. The survey package will provide information about the survey and assurances that there is no risk associated with participation. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. |
Step 2: Thank you/reminder prompt. |
Approximately 10 business days after the initial mailing, a thank you/reminder postcard will be mailed to all respondents. The postcard will thank those who have already submitted a completed web survey and ask those who have not to please do so. The postcard will include access information for the web survey. |
Approximately 10 business days after the initial mailing, all respondents will receive a thank you/reminder prompt. Half will receive a mailed postcard prompt, while the other half will receive an automated telephone prompt. Both prompts will thank those who have already submitted a completed survey and ask those who have not to please do so. Respondents will also be informed of the option to complete the survey online. |
Step 3: Mail survey sent. |
A paper questionnaire will be mailed to those households who have not responded to either the initial letter invitation or the reminder prompt. The paper-and-pencil mail survey will include a postage-paid return envelope. Materials sent at this stage will also inform respondents that a web survey option is available upon request. |
|
Step 4: Non-response follow-up. |
If no completed survey is received either by mail or web, non-response follow-up will begin. Non-response follow-up will take one of three forms: an express mailing, an automated telephone prompt, or a live telephone prompt. Respondents will be randomly assigned to one of these three options, with approximately 40% of non-respondents assigned to the express mailing, 20% to the automated telephone prompt, and 40% to the live telephone prompt. |
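The 40/20/40 split among phone-matched non-respondents could be implemented by shuffling the list of non-respondents and slicing it. The following is a sketch only; the function name, treatment labels, and seed are illustrative assumptions.

```python
import random


def assign_follow_up(non_respondent_ids, seed=4):
    """Randomly split non-respondents roughly 40/20/40 across three follow-ups."""
    rng = random.Random(seed)
    ids = list(non_respondent_ids)
    rng.shuffle(ids)
    n = len(ids)
    cut_express = round(0.40 * n)
    cut_automated = cut_express + round(0.20 * n)
    return {
        "express_mailing": ids[:cut_express],
        "automated_telephone_prompt": ids[cut_express:cut_automated],
        "live_telephone_prompt": ids[cut_automated:],  # remainder, about 40%
    }


groups = assign_follow_up(range(1000))
for name, members in groups.items():
    print(f"{name}: {len(members)} cases")
```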
For those respondents who cannot be matched with a telephone number, the same dichotomous approach will be used, but (for obvious reasons) without any telephone prompting. As mentioned above, half the respondents in each of the cohorts will receive a $2 incentive in the first contact (i.e., invitation to the survey).
|
Web-First Cohort |
Mail Cohort |
Step 1: Invitation to the survey. |
A hardcopy letter will be sent to the targeted respondent inviting him or her to go to the website URL to complete the online survey. The invitation will include information about the survey, assurances that there is no risk associated with participation, and web access information. In addition, respondents will be given directions on how to obtain a paper survey if they do not have access to the web or would prefer a hardcopy. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. |
A paper questionnaire will be mailed to the targeted respondent. The paper-and-pencil mail survey will include a postage-paid return envelope. The survey package will include information about the survey and assurances that there is no risk or costs associated with participation. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. |
Step 2: Thank you/reminder prompt |
Approximately 10 business days after the initial mailing, a thank you/reminder postcard will be mailed to all respondents. The postcard will thank those who have already submitted a completed web survey and ask those who have not to please do so. The postcard will include access information for the web survey. |
Approximately 10 business days after the initial mailing, all respondents will receive a thank you/reminder postcard. The postcard will thank those who have already submitted a completed survey and ask those who have not to please do so. Respondents will also be informed of the option to complete the survey online. |
Step 3: Mail survey sent |
A paper questionnaire will be mailed to those households who have not responded to either the initial letter invitation or the reminder prompt. Materials sent at this stage will also inform respondents that a web survey option is available upon request. The paper-and-pencil mail survey will include an envelope for returning the completed survey to Westat. |
|
Step 4: Non-response follow-up. |
If no completed survey is received either by mail or web, respondents will receive an additional copy of the paper survey, sent via express mail. A postage-paid return envelope will be included. |
If an initial survey invitation is returned by the post office as undeliverable, the respondent will be removed from this four-step contact approach. Instead, because of logistical constraints, we will use a modified approach that does not divide respondents into cohorts (web-first and mail). In addition, this modified approach will be shortened to two or three steps because it may take a significant amount of time for the post office to notify us of undeliverable mail.
In the case of undeliverable mail, there are two possible scenarios: (1) a new address is available and supplied by the post office and (2) no new address is available. For each scenario, a detailed outline of the modified contact approach follows.
|
New Address Available |
No New Address Available |
Step 1 |
A paper questionnaire will be mailed to the targeted respondent at the new address. The paper-and-pencil mail survey will include a postage-paid return envelope. The survey package will include information about the survey and assurances that there is no risk associated with participation. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. Because of time constraints, this package will be sent via express mail. |
If a phone number is available for the respondent, an interviewer will contact the respondent to request an updated mailing address. The interviewer will briefly introduce the survey and will be able to answer questions about the survey and the process. In addition, the interviewer will administer the interview over the telephone if the respondent wishes. |
Step 2 |
If no completed survey is received, respondents will receive a reminder prompt from a telephone interviewer asking respondents to complete the survey. Telephone interviewers will be prepared at this stage to administer the interview over the telephone if the respondent wishes.
If no phone number is available, respondents will receive a reminder postcard asking the respondent to please complete the survey. Respondents will also be informed of the option to complete the survey online. |
If a respondent is willing to provide an updated mailing address, a survey package will be mailed via express mail. The paper-and-pencil mail survey will include a postage-paid return envelope. The survey package will include information about the survey and assurances that there is no risk associated with participation. This mailing will also include a letter from the IRS Director of Research, Analysis, and Statistics endorsing the survey and emphasizing the importance of the data collection effort. |
Step 3 |
|
If no completed survey is received, respondents will receive a reminder prompt from the same telephone interviewer who contacted them in Step 1. This interviewer will be prepared to administer the interview over the telephone if the respondent wishes. Respondents will also be informed of the option to complete the survey online. |
The secure web survey will be posted online using a proprietary web survey delivery system developed by our contractor, Westat. The software easily accommodates different question formats, including open-ended response fields. It also allows participants to skip questions and complete the survey in more than one session (i.e., the respondent can leave the web survey and come back to finish it at a later time). Development and testing of the web survey will follow well-established, documented best practices.
The paper-and-pencil mail survey will be designed to be user-friendly and easy to navigate, with clear and simple instructions. The survey will be created using TeleForm technology, a software system for intelligent data capture and image processing. The software extracts indexing information automatically from any document type through the use of multiple recognition engines. TeleForm reads hand print, machine print, optical marks, bar codes, and signatures.
Response data will be stored and tracked in a response database, which can then be uploaded into the Individual Taxpayer Burden Model (ITBM). In addition, a tailored Survey Management System will track cases throughout all modes of contact, including mail, telephone, and interactive voice response (IVR).
References:
Dillman, D., Smyth, J., Christian, L. Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method. Hoboken, NJ: Wiley, (2008).
Schneider, S.J., Cantor, D., Malakhoff, L., Arieira, C., Segel, P., Nguyen, L., and Guarino Tancreto, J. “Telephone, Internet and paper data collection modes for the Census 2000 short form”, Journal of Official Statistics, 21: 89-101 (2005).
3. Describe methods to maximize response rates and to deal with issues of non-response.
Upon completion of the survey protocol, we plan to conduct a non-response bias analysis. This analysis will follow the approach used for the tax year 2007 survey, which resulted in the use of a raking technique to control for bias in a multivariate setting. The process is further outlined in the paper “Response Mode and Bias Analysis in the IRS’ Individual Taxpayer Burden Survey” (Brick et al., 2010).
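Raking (iterative proportional fitting) adjusts respondent weights so that weighted totals match known population margins on several variables at once. The sketch below is a generic, minimal version for illustration; the field names, categories, target margins, and tolerance are hypothetical and are not taken from the tax year 2007 analysis.

```python
def rake(records, margins, max_iter=100, tol=1e-8):
    """Return weights whose totals match each margin in turn until stable.

    records: list of dicts of categorical fields (hypothetical respondents).
    margins: {field: {category: target weighted total}}.
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for field, targets in margins.items():
            # Current weighted total in each category of this field.
            totals = {}
            for rec, w in zip(records, weights):
                totals[rec[field]] = totals.get(rec[field], 0.0) + w
            # Scale every record so its category hits the target total.
            for i, rec in enumerate(records):
                factor = targets[rec[field]] / totals[rec[field]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Hypothetical respondents and population margins (both margins sum to 100).
records = [
    {"prep": "paid", "efile": "yes"},
    {"prep": "paid", "efile": "no"},
    {"prep": "self", "efile": "yes"},
    {"prep": "self", "efile": "no"},
    {"prep": "paid", "efile": "yes"},
]
margins = {
    "prep": {"paid": 60.0, "self": 40.0},
    "efile": {"yes": 70.0, "no": 30.0},
}
weights = rake(records, margins)
paid_total = sum(w for r, w in zip(records, weights) if r["prep"] == "paid")
print(f"weighted total for paid preparers: {paid_total:.2f}")
```

Because each pass rescales one variable at a time, the margins for the other variables drift slightly and are corrected on the next pass; the loop stops once no adjustment factor differs meaningfully from 1.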
To analyze the various experiments embedded within the survey administration protocol, the following analysis approach will be undertaken:
The modes chosen for the survey are primarily visual ones (paper and Web). This choice was made specifically to reduce the potential for mode effects associated with mixing interviewer-administered and self-administered surveys. The effects of visual and aural survey methods and the effects arising from interviewer-administered and self-administered approaches are relatively consistent. However, there is little research to suggest that differences will be generated when two visual, self-administered surveys are implemented using consistent formatting and design principles, as planned in this collection. This research is summarized in Dillman et al. (2009).
As a result, we are primarily interested in analyzing differences due to differential response rates and the cost-effectiveness of the survey methods. Our plans do consider the possibility of mode effects associated with the Web and paper data collections, but we do not expect that analysis to be very fruitful. More details on the nature of the planned analysis are given below.
Analysis of incentives – The sample is split into two equal halves, with one half sent a $2 incentive and the other sent no incentive. The main objective of the analysis is to identify the most cost-effective way to survey this population in the future while minimizing non-response error. The analysis will focus on overall response rate, cost-effectiveness (the percent that responded at each contact stage), and a couple of key outcome variables such as taxpayer time and expenses (to evaluate the potential for differential non-response bias). Even though the sample is balanced with respect to the other experimental factors, such as Web-first or paper-first and telephone availability, these variables will be included in the initial analysis as explanatory variables to assess whether the effect of the incentive differs across the other treatments. In addition, other known characteristics of the filers, such as whether they filed electronically or used a paid preparer, will also be considered as explanatory variables.
Analysis of mode cohorts – This analysis will continue a line of research that is examining the future of Web data collection that is especially appropriate for this target population. The main goal is to identify subgroups that respond to either the Web or paper for the survey. The analysis will be done separately by telephone availability since the treatments are different for these two subsets of the population and there is considerable research that indicates the response rates are very different for these subsets. As with the incentive analysis we will examine overall response rate, cost-effectiveness, and a couple of key outcome variables such as taxpayer time and expenses. One type of analysis that may be conducted is a tree-analysis to identify categorical variables that best predict which filers will be most likely to respond positively to the Web or paper offers.
Analysis of prompts – This analysis will examine whether mail or telephone prompting is more effective at increasing response rates. At the second contact, the thank you/prompt is primarily used to increase response at a low cost. For this analysis, the only group assessed will be those with telephone numbers available since those without telephones will only get the mail prompt and they have different response propensities. The immediate boost in the response will be the primary focus of the analysis (typically the increase in response is important but may not be large enough to study other characteristics such as non-response bias). At the fourth contact, the analysis will again be focused on the groups with telephones available. The analysis will explore the additional responses attained by the three treatments (express mailings/telephone prompt/telephone interviewer) to inform future follow-up designs.
References:
Brick, M., Contos, G., Masken, K., Nord, R. “Response Mode and Bias Analysis in the IRS Individual Taxpayer Burden Survey”, Survey Practice. (2010).
Dillman, D., Phelps G., Tortora, R., Swift, K., Kohrell, J., Berck, J., Messer, B. “Response rate and measurement differences in mixed-mode surveys using mail, telephone, interactive voice response (IVR) and the Internet”, Social Science Research, 38: 1–18. (2009).
4. Describe any tests of procedures or methods to be undertaken.
To ensure that the collection of information is not burdensome and that the questions are clearly written and will produce accurate and valid results, the IRS conducted two rounds of cognitive testing. Cognitive testing is a well-established qualitative research method intended to identify problems respondents have with comprehension of survey questions (Willis, 2005). The testing was conducted with taxpayers in the Washington, D.C. area. Respondents were recruited according to specific criteria (e.g., filing status, complexity of return, and filing method). Efforts were made to recruit respondents who were demographically representative of the population being surveyed. As a result of the cognitive testing, the IRS and its contractor made significant changes to the question wording, ordering, and survey length.
In addition, at the outset as well as after each iteration of testing, the instrument underwent extensive review by the IRS, the contractor, and stakeholders.
References:
Willis, G. Cognitive Interviewing. Thousand Oaks, CA: Sage Publications. (2005).
5. Provide the names and telephone numbers of individuals consulted on statistical aspects of the design and the name of the agency unit, contractor(s), grantee(s), or other person(s) who will actually collect and/or analyze the information for the agency.
IRS Office of Research
Statistical Design:
Michael Sebastiani, 202-874-0831
Wei Liu, 202-874-0575
Data Collection and Analysis:
John Guyton, Research, Analysis & Statistics
Melissa Vigil, Research, Analysis & Statistics
George Contos, Research, Analysis & Statistics
Westat
Statistical Design:
Mike Brick, Statistician
Data Collection and Analysis:
Jennifer O’Brien, Project Director
Kerry Levin, Project Manager
Jocelyn Newsome, Research Analyst
Department of Treasury
Data Collection and Analysis:
Allen Lerman, Office of Tax Policy
Susan Nelson, Office of Tax Policy
File Type | application/msword |
File Title | The potential respondent universe is comprised of Wage & Investment and Self Employed taxpayers living in the United States |
Author | wtdcb |
Last Modified By | z8dmb |
File Modified | 2011-06-16 |
File Created | 2011-06-13 |