American Community Survey Research and Evaluation Program
May 5, 2021
ACS Research & Evaluation Analysis Plan (REAP)
2021 Strategic Framework
Mail Materials Test
Version | Date | Description
1 | 12/16/20 | Initial Draft for Feedback
2 | 4/27/21 | Second Draft for Approval
3 | 5/5/21 | Final REAP
1. INTRODUCTION
2. BACKGROUND
2.1 Strategic Framework Project
2.1.1 Phase 1—Research Best Practices
2.1.2 Phase 2—Assess Current Materials
2.1.3 Phase 3—Develop New Materials
2.1.4 Phase 4—Cognitively Test New Materials
2.2 2021 American Community Survey Mail Contact Strategy
3. RESEARCH-BASED DEVELOPMENT OF MATERIALS
3.1 Themes of Messages in Each Mailing
3.1.1 First Mailing – Establish Legitimacy and Trust
3.1.2 Second Mailing – Convey Local, Tangible Benefits
3.1.3 Third Mailing – Reduce the Sense of Burden of Responding
3.1.4 Fourth Mailing – Restate Appeals and Express Gratitude
3.1.5 Fifth Mailing – Heighten the Sense of Urgency
3.2 Plain Language Principles Used in the Materials
3.2.1 Elements that Improve Readability
3.2.2 Elements that Improve Visual Appeal
3.3 Other Research-Based Design Elements
4. METHODOLOGY AND RESEARCH QUESTIONS
4.1 Sample Design
4.2 Experimental Design
4.2.1 Treatment 1 (Icon Treatment)
4.2.2 Treatment 2 (Column and Header Treatment)
4.2.3 Treatment 3 (Sidebar Treatment)
4.2.4 Treatment 4 (Minimalist Treatment)
4.2.5 Treatment 5 (Production, Sorted Separately)
4.3 Research Questions
4.4 Analysis Metrics
4.4.1 Unit Response Analysis
4.4.2 Item Response Analysis
4.4.3 Cost Analysis
4.5 Research Question Analysis
4.5.1 Questions Involving Unit-level Response Analysis
4.5.2 Questions Involving Item-level Response Analysis
4.5.3 Question Involving Cost Analysis
5. DECISION CRITERIA FOR PICKING A “WINNING” TREATMENT
6. ASSUMPTIONS AND LIMITATIONS
6.1 Assumptions
6.2 Limitations
7. TABLE SHELLS
7.1 Self-Response Return Rates
7.2 Final Response Rates
7.3 Item and Section Nonresponse
7.4 Form Completeness
7.5 Response Distributions of Respondent Demographics
8. POTENTIAL CHANGES TO ACS
9. REFERENCES
Appendix A. Materials for the Experiment
Appendix B. 2021 Mailing Descriptions and Schedule for the 2021 September Production Panel
Appendix C. Future Testing Ideas
In 2017, the U.S. Census Bureau began the Strategic Framework Project—a long-term, multi-phase project to update the messaging in the American Community Survey (ACS) mail materials. The goals of the project are to improve communication with potential respondents, increase self-response to the survey, reduce program costs, and reduce respondent burden. The project includes research of best practices in messaging to gain survey cooperation, development of new materials based on the research, and testing (qualitative and quantitative) of the new materials.
After conducting research on best practices in communications in a variety of disciplines, the Census Bureau’s Strategic Framework Project team made recommendations for messaging in ACS mail materials in two reports (Oliver, Heimel, and Schreiner, 2017; Schreiner, Oliver, and Poehler, 2020). Following the recommendations, new materials were designed holistically, resulting in four sets of materials. The Census Bureau’s Center for Behavioral Science Methods (CBSM) tested the materials in three rounds of cognitive testing. The materials are now ready to be field-tested.
The purpose of the 2021 Strategic Framework Mail Materials Test is to identify which treatment is most successful at increasing self-response. This test will not be able to isolate the effectiveness of individual messages or ideas incorporated into the overall design of any given treatment.
The Census Bureau began the Strategic Framework Project following a 2016 workshop where a panel of experts offered advice on how to improve the ACS mail materials (NAS, 2016). Panelist Nancy Mathiowetz suggested that the Census Bureau develop a strategic plan—grounded in communication theory as well as theories from survey methodology—for messaging in the ACS materials (Mathiowetz, 2016). Mathiowetz thought that having a strategic plan would allow the Census Bureau to judge how expert recommendations fit with the strategic plan. Her suggestion led to the creation of the multi-phase Strategic Framework Project that began in 2017. The five phases of this project are depicted in Figure 1.
Figure 1. Phases of the Strategic Framework Project
The first phase involved research of best practices for messaging to increase survey cooperation. To guide the research, we focused on answering the following questions: (1) What are the demographics of the target audience of the ACS mail materials? (2) What are the best practices in survey messaging to use in the materials to obtain a survey response? (3) What are the best ways to convey those messages? The research and the resulting recommendations for ACS mail messaging are recorded in “Strategic Framework for Messaging in the American Community Survey Mail Materials,” referred to as the Strategic Framework Report in this document (Oliver, Heimel, and Schreiner, 2017).
The second phase of the project involved reviewing and assessing messaging in the ACS mail materials considering the best practices outlined in the Strategic Framework Report. For this phase, we sought to answer the following questions: (1) Do the messages in the current materials meet the recommendations? (2) If so, in what ways? (3) If not, how can the messaging be improved? The findings from the current messaging assessment are recorded in the report, “Assessment of Messaging in the 2018 American Community Survey Mail Contact Materials,” referred to as the Messaging Assessment Report in this document (Schreiner, Oliver, and Poehler, 2020).1
For the third phase, we developed mail materials, incorporating the recommended messaging and design ideas from the first two phases of the project. While the designs of the materials are new, they adhere to the current ACS mail contact strategy (see Section 2.2), which includes the type of mailers (package, pressure seal, or postcard) and the number and timing of mailings that are sent. Each set of materials was designed holistically so that the messaging and look-and-feel within and across the five mailings are interconnected. The Census Bureau designed three sets of updated ACS mail materials and a team of researchers outside of the Census Bureau designed a fourth set.
The fourth phase involved cognitive testing and an expert review of the newly designed materials. The materials were cognitively tested in three iterative rounds by Census Bureau researchers in CBSM (Martinez et al., forthcoming). We also received feedback from a panel of survey methodology experts at a National Academy of Sciences (NAS) meeting in 2019 before the third round of testing. Suggestions for improvements from the NAS meeting experts were incorporated into the materials for the third round of cognitive testing. After the third round, the materials were modified according to suggestions from CBSM researchers. The resulting materials are now ready for field testing.
The current ACS mail contact strategy is detailed below to provide context for the field test. The test materials were designed using the types of mailers and the timing of the mailings in the current strategy. We will compare the test materials to production materials.
The first two mailings are sent to all mailable addresses in the monthly sample. The first mailing is a package that includes a letter, a multilingual brochure, and a card with instructions on how to respond via the internet. The letter contains an invitation to participate in the ACS online, with more information in a frequently asked questions format on the back of the letter. A week later, the same addresses are sent a second mailing (a reminder letter in a pressure seal mailer).
Responding addresses are removed from the address file after the second mailing to create a new mailing universe of nonrespondents; these addresses are sent the third and fourth mailings.2 The third mailing is a package that includes a letter, a paper questionnaire, and a business reply envelope. Four days later, these addresses are sent a fourth mailing (reminder postcard) which encourages them to respond.
After the fourth mailing, responding addresses are again removed from the address file to create a new mailing universe of nonrespondents. The remaining sample addresses are sent the fifth mailing (a more urgent final reminder letter with a due date in a pressure seal mailer).
Two to three weeks later, responding addresses are removed and the unmailable and undeliverable addresses (from the initial sample) are added to create the universe of addresses eligible for the Computer-Assisted Personal Interview (CAPI) nonresponse followup operation.3 Of this universe, a subsample is chosen to be included in the CAPI operation. Census Bureau field representatives (FRs) first attempt to interview those selected for CAPI by phone.4 If the FR is unable to complete a phone interview, they visit the address to conduct an in-person interview.
Additional information can be found in the ACS Design and Methodology Report (U.S. Census Bureau, 2014).
The complete literature review for the concepts used to design the mail materials in this test is contained in two reports: “Strategic Framework for Messaging in the American Community Survey Mail Materials” and “Assessment of Messaging in the 2018 American Community Survey Mail Contact Materials” (Oliver, Heimel, and Schreiner, 2017; Schreiner, Oliver, and Poehler, 2020).
High-level recommendations from the two reports were to limit the number of messages in each mailing; reduce repetitious messaging; use new appeals; use messages justified by research; and make a clearer, more prominent connection to the well-known Census Bureau brand.5 However, the reports also recognized that, for those recommendations to be successful, we had to use plain language writing to lower the reading level of the letters and plain language design principles (such as white space, organization of letter text, graphics, and color) to make the letters easier to read. (See Section 6 of the Messaging Assessment Report for more details.)
While the test materials adhere closely to the high-level recommendations, this section highlights some specific ways those recommendations were implemented.
The Strategic Framework Report (Section 5) recommended using messages designed to resonate with the cynical and distrusting segments of the population, as an increase in their self-response has the most potential to increase overall self-response (Oliver, Heimel, and Schreiner, 2017).6 The report recommended thematic messages for the first four mailings only. The elements for the fifth mailing were developed after we conducted subsequent research to learn about the demographics of ACS nonrespondents receiving the fifth mailing (Berkley, 2018).
The first mailing focuses on building trust with the respondent through messages that legitimize the survey and connect the survey to the Census Bureau, a known and trusted organization. In social exchange theory, building trust is the most important aspect of survey messaging (Groves et al., 2012; Dillman, Smyth, and Christian, 2014). With increased trust, subsequent statements such as the benefits of survey participation are more likely to be believed.
The second mailing focuses on communicating how ACS data has tangible benefits to communities (Reingold, 2014b). This mailing communicates local-level survey benefits because research has shown that prospective respondents are more interested in potential benefits for their own neighborhood than for the nation, state, or city (Reingold, 2014b).
This mailing also uses benefit statements to show that responding to the survey may directly help other people.7 Some people feel a sense of accomplishment when completing a task for someone else and generally feel a sense of reward when they feel they have helped others. For some, this sense of accomplishment is heightened when the action provides no personal benefit aside from helping someone else (Homans, 1961; Blau, 1964; Dillman, Smyth, and Christian, 2014).
The third mailing focuses on messages that reduce the sense of burden associated with responding to the ACS. The three main burden-reducing messages used are:
Providing a choice in response mode—This mailing reduces the burden for respondents who are unwilling or unable to respond by internet. Offering a choice of response mode can have a positive effect on response (Gentry and Good, 2008; Smyth et al., 2010; Millar and Dillman, 2011; Olson, Smyth, and Wood, 2012).
Explaining that response to the ACS is a normal activity regularly completed by others in the community—Knowing others have responded may help make it more comfortable to respond for those who are hesitant to do so (Cialdini, 1984; Hallsworth et al., 2014).
Linking ACS response to civic duty or responsibility—Some feel a sense of pride as they fulfill their civic obligations and feel a sense of reward when they fulfill a patriotic duty that helps their country (Groves, Singer, and Corning, 2000; Reingold, 2014b).
The fourth mailing primarily summarizes messages from the first three mailings by restating the appeals to trust, benefits, and burden reduction in a different way. We did not want to introduce new concepts in this mailing and overwhelm the recipients of our mailings. So, we repackaged the previous important messages in one mailing using different wording, to avoid repetition (Dillman, Smyth, and Christian, 2014).
This mailing also prominently includes a “thank you” statement, which is good communication practice when asking someone to complete a task. It is natural to thank people for their time and effort.
The fifth mailing is the last opportunity to obtain a self-response through mail contact before the start of the CAPI nonresponse followup operation. Thus, we used the following strategies to heighten the sense of urgency and to make it even easier to self-respond:
The tone of the letter is more formal and urgent than the previous mailings. The opening salutation “An important message from the U.S. Census Bureau:” replaces the salutation used in the previous mailings, “Dear Resident.” The opening sentence “Time is running out.” is short and direct. The letter contains only one paragraph, and then presents the response options—much shorter and to the point than previous mailings.
A due date is prominently displayed three times in the mailing to convey a sense of urgency. Providing a deadline or a due date reduces burden on the respondent by giving clear instructions on when a task is due, which fits into a respondent’s mail prioritization process (Dillman, 2016). A recent ACS mail messaging test showed an increase in self-response when a due date was used on the outside and the inside of the fifth mailing pressure seal mailer (Risley and Oliver, forthcoming in 2021).
A new response option is provided (phone response) to increase the likelihood of a participant being able to respond in their preferred mode. We know that a substantial portion of individuals who have not yet responded have lower reading levels. These individuals may be more likely to respond via a telephone interview. We also know that a sizable portion of individuals who receive the fifth mailing do not speak English proficiently. So, we added Spanish text to the bottom of the letter to highlight that respondents have the option to complete the survey by phone in Spanish.
A commitment device is used in this mailing to get recipients to commit to responding. This commitment device asks the potential respondent which response option he or she will use to respond to the survey by the due date. Asking for a potential respondent’s commitment to an action can increase the chance that an action is taken (Milkman et al., 2011; Feygina, Foster, and Hopkins, 2015; Shephard and Bowers, 2016).
In addition to the above recommended messaging themes and strategies, the Messaging Assessment Report also recommended applying plain language principles when developing the new mail materials. Plain language involves wording choices and visual design techniques to make things easier to read.
Section 6.5 of the Messaging Assessment Report outlines the reading level assessment of the current ACS materials. Both the choice of text and the amount of text on each mailing piece caused the materials to rank at high reading levels. The new designs reduce text and use words that are easier to read and understand. The visual design elements mentioned in this section were ideas that developed during the design process through cognitive testing observations, further research, and expert recommendations for improving the materials.
According to the Plain Writing Act of 2010, all government documents issued to the public must be written clearly so that people can find the information they need, understand what they find, and use the information they find to meet their needs. In general, the text and layout in the new materials adhere to the official writing guidelines proposed on plainlanguage.gov. Here are a few other specific ways that we implemented plain language principles in the experimental treatments:
Instruction Card: We redesigned the front of the instruction card in the first mailing with a simpler design and more precise text about how to respond to the survey online. We also changed the font size and style of the user ID (embedded in the address label) to make it easier to locate on the card (Martinez et al., forthcoming).
All mail pieces, where applicable: We eliminated the “https://” from the URL for the online response website (https://respond.census.gov/acs) to reduce the burden of typing extra characters. A simplified URL is also more visually pleasing and creates more white space.8
Questionnaire: We redesigned the front cover of the questionnaire by reducing text and updating the icons for a cleaner appearance and ease of reading. (See Appendix A for an image of the front cover.)
Letters: We moved the Census Bureau address to the upper right-hand corner, instead of beneath the logo, to allow more space for the body of the letter.
Research has shown that, for government surveys, some color and graphic elements can be used in the letters to catch the eye and draw attention to important information, as long as the letters still appear official and “governmental” (Dillman et al., 1996; Leslie, 1997; Whitcomb and Porter, 2004; Hagedorn, Panek, and Green, 2014; Reingold, 2014a). Where possible, all treatments developed by the Census Bureau incorporated the color of the ACS stateside housing unit questionnaire (“ACS green”) into the redesigned mail pieces.
The treatments use different visual design elements to test which design best resonates with mail recipients. Since we do not know which visual appeal will work the best at increasing overall response, we are experimenting with different designs.
The Icon Treatment uses icons, or symbolic pictures, to quickly convey messages through pictures rather than with text. Icons, which are signs or images that represent the objects that they signify, are known to draw attention and increase readability (Mertz, 2012). Icons could be beneficial for visual learners, readers with low literacy levels, or for those who are not fluent in English.
The Column and Header Treatment organizes text in a way that makes the letters easier to navigate and thus easier to read. Plain language principles suggest using headers, columns, and short sentences and paragraphs.9 For people who prefer scanning text to obtain information quickly, this may be the most beneficial letter design.
The Sidebar Treatment uses a graphic element to draw attention to details not found in the letter. Research has shown that we remember visual images more easily and better than words (Kouyoumdjian, 2012). This design style may appeal to readers who desire more details before responding to the survey.
Within each treatment, the same visual “look and feel” of the design is used throughout the mailings to maintain the cohesive nature of the continuing conversation concept. Research suggests that messages sent across multiple mail contacts, as well as the overall design of graphics, need to look and feel as if they came from the same place and should feel like a continuous conversation (Whitcomb and Porter, 2004; Dillman, Smyth, and Christian, 2014; Hagedorn, Panek, and Green, 2014; Reingold, 2014a).
This section describes some design features that came about through recommendations from the two reports and some design features that came about from ideas that surfaced during the design phase.
Multi-Lingual Brochure: We eliminated the multilingual brochure in the first mailing to reduce the volume of messages as suggested in the Messaging Assessment Report.
The multilingual brochure is written in six languages: English, Spanish, Chinese, Vietnamese, Korean, and Russian. The brochure informs people that they can respond to the ACS via telephone in the languages other than English and Spanish.10 A 2010 study showed that including the brochure in a mailing significantly increased response for the five non-English languages in the brochures (Joshipura, 2010).
In the past five years, an FAQ brochure, an instruction booklet, and an instruction card have been removed from ACS mailing packages due to positive results from field tests (Clark, 2015a; Clark, 2015b; Risley and Berkley, forthcoming). Removing the materials reduced costs and respondent burden and did not negatively affect survey response.
Because of past test results with brochures, we feel confident that removing the multilingual brochure will not decrease overall survey response. However, to mitigate the possibility of losing response in non-English languages, we included a sentence in the non-English languages on the back of the Instruction Card sent in the first mailing. The sentence instructs the household that the survey can be answered by phone in Chinese, Vietnamese, Korean, and Russian. A different phone number for each language is given.11 Spanish speakers can also respond to the survey by internet, so the Spanish text on the instruction card mentions responding by internet or calling a number to speak to a representative in Spanish.
Letters: We included Spanish language text on the back of the third letter and the bottom of the fifth letter to remove language barriers that may impede response. One of the recommendations from previous focus group testing of ACS materials was to tailor the materials to acknowledge cultural nuances and make response options readily apparent in Spanish and other languages (Reingold, 2014b).
In past ACS production materials, the paper questionnaire mailing included more Spanish text to help people respond in Spanish. We have eliminated the instruction card in that mailing, which had Spanish text on the back. Also, an older design of the front cover of the paper questionnaire included more Spanish text than is currently found on the cover.
We hope that including Spanish on the letters, something new for ACS materials, will increase response for Spanish speakers.
Letters: Wherever text in the mail materials refers to responding online, we added the words “using your computer, smartphone, or tablet.” In some of the materials, we used an icon with the three computing devices as a visual reminder.
One of the researchers at the NAS conference said that his study found that if people start a survey on a smartphone, they are more likely to finish it, whether on the same device or a different one. In short, once people get started they are more likely to finish, and most people have their smartphones with them all the time.
Participants in cognitive testing noticed that they could respond on a smartphone, and they acknowledged that this was helpful and good to know (Martinez et al., forthcoming).
Envelopes: To increase the likelihood that the recipient would recognize that the letter came from the government and to increase the likelihood of the envelope being opened, we included the phrase “Official U.S. Government Mail” on the outside of the first and third mailing envelopes. A study mentioned in the Messaging Assessment Report (Section 6.4.2) showed that people first look at their own address on an envelope. Thus, we placed “Official U.S. Government Mail” directly above the recipient’s mailing address, so that it could be readily seen. We were also building on the fact that government-sponsored survey requests receive the highest response rates among all types of surveys (Presser, Blair, and Triplett, 1992).
The 2021 Strategic Framework Mail Materials Test will be conducted using the September 2021 ACS production sample. The monthly ACS production sample consists of approximately 295,000 housing unit addresses and is divided into 24 nationally representative groups (referred to as methods panel groups) of approximately 12,000 addresses each. The sample for each of the four experimental treatments in this test will consist of two randomly assigned methods panel groups (approximately 24,000 mailing addresses per treatment). The sample for the control treatment will also consist of two randomly assigned methods panel groups. The control treatment will receive production ACS materials, but will be sorted and mailed separately from production.12 All remaining methods panel groups will receive production ACS materials.
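To make the group-to-treatment assignment concrete, here is a minimal sketch in Python, assuming the 24 methods panel groups are simply numbered 1 through 24; the seed and labels are illustrative, not the production assignment procedure.

```python
import random

random.seed(2021)  # illustrative seed, not a production value

# 24 nationally representative methods panel groups in the monthly sample
groups = list(range(1, 25))
random.shuffle(groups)

# Two randomly assigned groups (~24,000 addresses) per treatment:
# T1-T4 are the experimental treatments, T5 is the separately sorted control.
assignment = {f"T{t}": sorted(groups[2 * (t - 1):2 * t]) for t in range(1, 6)}

# The remaining 14 groups receive regular production ACS materials.
assignment["production"] = sorted(groups[10:])

print(assignment)
```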
The treatments will adhere to the current ACS mailing strategy (the number of mailings, types of mailings, and timing of mailings) detailed in Section 2.2 of this report. Images of each mailing piece can be found in Appendix A.
The Strategic Framework Report did not make recommendations of a specific messaging theme for the fifth mailing. We conducted research on the fifth-mailing universe to identify characteristics of nonrespondents under the current mailing strategy so that messaging could target the appropriate audience; however, those characteristics may change when the first four mailings change (Berkley, 2018).13 For this reason, while each treatment employs a distinctive design for the first four mailings, the designs and messaging of the fifth mailing letters vary only slightly among the four experimental treatments.
A brief description of each of the five treatments in this test is provided below.
Treatment 1 uses icons as its distinctive design feature. Icons are symbols used to replace words or to draw attention to key text. The icons break the monotony of text, segment the letter content into different parts, and make the content more interesting to read. Effective use of icons improves content readability. The letters in this treatment are written in a traditional letter format but incorporate icons in the body of the letter. The “ACS green” color, found on the paper questionnaire, is used throughout the five mailings to create visual appeal and to build cohesiveness among the mail pieces and mailings.
Treatment 2 uses columns and headers in green font as distinctive design features. Using columns and headers to segment the content makes it easier to read and navigate to the most important information on the page. The letters in this treatment minimize text and get to the point in a more direct way than a traditional letter format. Like the Icon Treatment, the “ACS green” color is used throughout this treatment.
Treatment 3 uses images as its distinctive design feature. The images are embedded in a sidebar that is either green or grey in color, depending on the mailing.14 Response option icons are used in some of the letters. The sidebar look is common among flyers and infographics, so it is a recognized look. The sidebar also provides a unique space to add more information about the survey that is not found in the letter. The “ACS green” color is also used throughout this treatment.
Treatment 4 was designed with a minimalist approach, using as few words as possible to convey the most important information needed to respond to the survey. This treatment maintains more of a “governmental” look and feel than the other treatments; no color or graphics are used in the letters. While this treatment does not include the recommended thematic messaging from the Strategic Framework Report, it does employ some plain language principles to make the letters easier to read.
Treatment 5 will have materials identical to production, but the mailings will be sorted separately from production. Previous ACS testing found that smaller volumes of mail arrive later than larger volumes of mail. Thus, we sort the control treatment separately to ensure mail delivery timing consistency with the experimental treatments (Heimel, 2016).
How do the treatments affect self-response to the survey before CAPI?
If a treatment affects self-response before CAPI, how does it affect overall response to the survey?
How do the treatments affect Spanish language self-responses?
How do the treatments affect hard-to-count areas?
How does adding visual design elements and messaging affect self-response to the survey before CAPI, compared to the minimalist approach?
How does the redesigned front cover of the questionnaire affect item nonresponse for the questions on the front cover?
How do the treatments affect overall form completeness?
How do the treatments affect the demographics of early respondents? Late respondents? Overall respondents before CAPI?
How would the treatments affect the costs of data collection if implemented in production ACS?
All self-response analyses, except for the cost analysis, will be weighted using the ACS base sampling weight (the inverse of the probability of selection). Cases in the CAPI subsample will have their weight multiplied by a CAPI subsampling factor unless they are self-responses. The sample size will be able to detect differences of approximately 1.25 percentage points between the self-response return rates of the experimental treatments (with 80 percent power and α=0.1). Detectable differences for the analysis of item-level data (such as item nonresponse rates) vary depending on the item, with housing-level items having minimum detectable differences up to 1.6 percentage points. We will use a significance level of α=0.1 when determining significant differences between treatments. For analyses that involve multiple comparisons, we will control the familywise Type I error rate using the Hochberg method (Hochberg, 1988).
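As a rough check on the quoted detectable difference, the sketch below approximates the minimum detectable difference between two treatments under simplifying assumptions (approximately 24,000 addresses per treatment, a 50 percent baseline return rate, and no design-effect adjustment for the weights); ignoring the design effect is why it comes in slightly below the 1.25-point figure above.

```python
import numpy as np
from statsmodels.stats.power import NormalIndPower

# Illustrative assumptions: ~24,000 mailable addresses per treatment, a
# 50 percent baseline self-response return rate, no design-effect adjustment.
n_per_treatment = 24_000
baseline = 0.50

# Smallest detectable effect size (Cohen's h) for a two-sample test of
# proportions with 80 percent power at alpha = 0.1, two-sided.
h = NormalIndPower().solve_power(effect_size=None, nobs1=n_per_treatment,
                                 alpha=0.1, power=0.8, ratio=1.0,
                                 alternative='two-sided')

# Invert the arcsine transform h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2))
# around the baseline rate to express h as a percentage-point difference.
p1 = np.sin(np.arcsin(np.sqrt(baseline)) + h / 2) ** 2
print(f"minimum detectable difference: {100 * (p1 - baseline):.2f} points")
```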
To determine the effect of each treatment on self-response, we will calculate the self-response return rates at selected points in time in the data collection cycle. The selected points in time reflect the dates of additional mailings or the end of the self-response data collection period. An increase in self-response presents a cost savings for each subsequent phase of the mailing process by decreasing the number of mailing pieces that need to be sent out. A significant increase in self-response before CAPI decreases the number of costly interviews that need to be conducted. Calculating the self-response return rates at different points in the data collection cycle gives us an idea of how the experimental treatments would affect operational and mailing costs if they were implemented into a full ACS production year.
To determine whether the experimental changes affect the final self-response and CAPI response by the end of the data collection period, we will calculate final response rates and how each response mode contributes to the total final response.
Self-response return rates will be calculated for total self-response combined and separately for internet, mail, and TQA responses. If no significant differences in TQA rates are detected, we may combine mail and TQA rates.
The return rates will be calculated using the following formula:
Self-Response Return Rate = (Number of mailable and deliverable sample addresses that provided a response by mail, Telephone Questionnaire Assistance (TQA), or internet ÷ Total number of mailable and deliverable sample addresses) × 100
To determine the effect of the experimental treatments on overall response to the survey, we will calculate final overall response rates and how each response mode contributes to the overall final response rate. The final response rates will be calculated using the following formula:
Final Response Rate = (Number of mailable and deliverable sample addresses that provided a response by mail, Telephone Questionnaire Assistance (TQA), internet, or CAPI ÷ Total number of mailable and deliverable sample addresses in the universe that were eligible to respond to the survey) × 100
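Both rates above reduce to weighted proportions, so they could be computed from a case-level file as in the following sketch; the file name and column names (treatment, self_resp, final_resp, base_wt) are hypothetical, and production estimates would also require design-based standard errors.

```python
import numpy as np
import pandas as pd

def weighted_rate(g: pd.DataFrame, flag: str, weight: str) -> float:
    """Weighted proportion of responding addresses among all eligible
    mailable and deliverable addresses, expressed in percent."""
    return 100 * np.average(g[flag], weights=g[weight])

# Hypothetical case-level file: one row per eligible sample address with
# 0/1 response flags and the ACS base sampling weight (column names assumed).
cases = pd.read_csv("acs_test_cases.csv")  # treatment, self_resp, final_resp, base_wt

for flag in ("self_resp", "final_resp"):
    rates = cases.groupby("treatment").apply(weighted_rate, flag=flag, weight="base_wt")
    print(flag, rates.round(1).to_dict())
```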
Although the questionnaire cover redesign proposed in this test does not differ substantially from the production design, we would still like to see if the redesign influences response to the questions on the cover.
We will calculate item nonresponse rates for each item on the front cover individually and for all the items combined to determine the effect of the redesigned questionnaire cover on the survey questions that appear on the front cover of the paper questionnaire. We want to determine if the new design of the cover affects response to the questions for Last Name, First Name, Middle Initial, Phone Number, and the number of people living or staying at the address.
The formula for the item nonresponse rate is:
Item Nonresponse Rate = (Number of nonresponses to the questions on the front cover ÷ Number of nonblank mail responses) × 100
We have no reason to believe that any part of this experiment will affect overall form completeness, but we would like to verify that assumption.
To determine the effect of the treatments on the quantity of survey questions answered by each household, we will calculate and compare form completeness rates. Form completeness measures the number of questions on the form that were answered among those that should have been answered, based on questionnaire skip patterns and respondent answers.
We will only calculate and compare form completeness rates for mail and internet responses, because with phone or in-person interviews form completeness can vary depending on the interviewer. For internet responses, the “form” is the internet instrument; for mail responses, the “form” is the paper questionnaire. The following formula will be used:
Overall Form Completeness Rate = (Number of questions answered ÷ Number of questions that should have been answered) × 100
This analysis will not be conducted for treatments that have lower self-response return rates or lower final response rates than the production treatment. Any treatment that performs worse than current production will not be considered as a candidate for replacing production materials. This analysis is only concerned with the success of any treatment that may be used in production to increase self-response among populations that are considered hard-to-count.
The letters being tested were designed to have a broad appeal, but they were also designed with a lower reading level and with added Spanish text. Part of the reason these features were used was to gain more self-response from demographic groups that typically respond by personal interview in the CAPI phase of data collection. We hope to convert late CAPI respondents to earlier self-respondents. To see if we were successful in reaching the target audience, we will compare respondent demographics before the third and fifth mailings and before CAPI.
We will calculate and compare the distributions of all non-blank self-responses for the following demographic and housing categories: age, educational attainment, Hispanic origin, race, sex, building type, and tenure.
Proportion estimates will be calculated using the following formula:

Proportion = (Weighted number of responses in a given category ÷ Weighted number of valid in-scope responses) × 100
Valid in-scope responses will be included in the analysis. The demographic characteristics will be for the respondent, or Person 1, in the survey. We will use uncoded data for the race and Hispanic origin analysis.
This analysis will only use self-responses. We will calculate distributions for combined self-response and separately by mode: mail, internet, and TQA.16 We will use Rao-Scott chi-squared tests of independence to determine whether the response distributions are statistically different at the α=0.1 level (Rao and Scott, 1987). If the distributions are significantly different, we will perform t-tests on the differences for each subcategory.
To control for the overall Type I error rate for a set of hypotheses tested simultaneously, we will perform multiple-comparison procedures using the Hochberg method (Hochberg, 1988). The overall Type I error rate is called the familywise error rate and is the probability of making one or more Type I errors among all hypotheses tested simultaneously. A family for our analysis will be the list of p-values obtained from comparisons of the overall characteristic categories (age, educational attainment, Hispanic origin, race, sex, building type, and tenure). If the response distributions differ significantly for a specific topic, the list of p-values for subcategory comparisons will be used as a family for multiple comparisons.
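The sketch below illustrates the two-stage testing plan with placeholder numbers: a simplified first-order Rao-Scott correction (deflating the Pearson chi-square statistic by an average design effect) for the omnibus test, followed by Hochberg’s step-up adjustment of the subcategory p-values, available in statsmodels as 'simes-hochberg'. The weighted counts, design effect, and subcategory p-values are all illustrative; the production analysis would use design-based estimates.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency
from statsmodels.stats.multitest import multipletests

# Weighted counts for one characteristic (rows: categories; columns:
# treatment, control). All numbers below are placeholders.
weighted_table = np.array([[5200.0, 4900.0],
                           [3100.0, 3300.0],
                           [1700.0, 1800.0]])

pearson_x2 = chi2_contingency(weighted_table)[0]

# First-order Rao-Scott correction: deflate the Pearson statistic by an
# average design effect estimated from the sample design (placeholder here).
mean_deff = 1.4
dof = (weighted_table.shape[0] - 1) * (weighted_table.shape[1] - 1)
p_omnibus = chi2.sf(pearson_x2 / mean_deff, dof)

if p_omnibus < 0.1:
    # p-values from the subcategory t-tests form one family (placeholders);
    # 'simes-hochberg' is Hochberg's step-up procedure.
    family = [0.012, 0.048, 0.300]
    reject, p_adjusted, _, _ = multipletests(family, alpha=0.1,
                                             method='simes-hochberg')
    print(p_omnibus, reject, p_adjusted)
```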
Excluding the multilingual brochure from the first mailing presents a cost savings. Aside from that savings, we will use estimated workloads to determine any other effects on the cost of implementing any of the experimental treatments into full production. We will only perform the cost analysis for treatments that show significant increases or decreases in self-response before CAPI. We will compare each experimental treatment to Treatment 5 (production materials) before the file creation for the second mailout phase (third and fourth mailings), before the file creation of the third mailout phase (the fifth mailing), and before the creation of the CAPI sample.
In addition to changes in workloads for each mailing, the cost analysis will consider any differences associated with the mailing materials including printing, assembly, and postage costs for potential cost savings if we reduce the mailing workload for the paper questionnaire package mailing. Cost differences associated with CAPI will account for any significant changes in the CAPI workload due to a significant increase or decrease in self-response before CAPI for each experimental treatment compared to production.
RQ1. How do the treatments affect self-response to the survey before CAPI? This analysis will evaluate the effect of each treatment on self-response to the survey. We will calculate and compare self-response return rates of the initial mailing universe for all treatments vs. production materials (Treatment 5). Since an increase in self-response will decrease the cost of subsequent phases of the data collection cycle targeting nonresponders, we will compare self-response return rates just before the third mailing, before the fifth mailing, and before the start of CAPI. We will compare return rates by response mode and overall (modes combined). We will make each comparison using a two-tailed hypothesis test, for a total of four comparisons. Each null hypothesis will be H0: T5 = Ti and each alternative hypothesis will be HA: T5 ≠ Ti where i = 1, 2, 3, and 4 and T5 = control treatment.
RQ2. If a treatment affects self-response before CAPI, how does it affect overall response to the survey? This analysis will be performed if a treatment is statistically different from the production treatment before the start of CAPI. To evaluate the effect of an experimental treatment on overall response to the survey, we will calculate final overall response rates and how each response mode contributed to the overall final response rate. These rates will be compared with the production treatment. We will make each comparison using a two-tailed hypothesis test, for up to four comparisons. Each null hypothesis will be H0: T5 = Ti and each alternative hypothesis will be HA: T5 ≠ Ti where i = 1, 2, 3, and 4 and T5 = control treatment.
RQ3. How do the treatments affect Spanish language self-response? We will calculate and compare final self-response rates for Spanish self-responses.17 We will calculate a combined rate for all experimental treatments and compare the rate with production using a two-tailed hypothesis test. The null hypothesis will be H0: T5 + Production = T1 + T2 + T3 + T4 and the alternative hypothesis will be HA: T5 + Production ≠ T1 + T2 + T3 + T4.
RQ4. How do the treatments affect areas with hard-to-count populations? We will calculate and compare final self-response rates and final overall response rates by designated high and low response areas. The areas will be defined at the tract level using the low response score (LRS) on the Census Bureau’s planning database.18 The LRS is a modeled variable derived from ACS data for “hard to count” populations and 2010 Census mail response. Some characteristics of “hard to count” populations used to create the LRS are the following: ages 5 and under, ages 18-24, don’t speak English very well, foreign born, renters, low education level, below poverty level, and no internet access.
Defining high and low response areas: A low LRS indicates a high response tract. The tracts with the lowest 75 percent of LRS scores will constitute the high response areas in our analysis; the remaining tracts will be the low response areas.19
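A minimal sketch of this tract classification, assuming a tract-level extract of the planning database; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical tract-level extract of the planning database with the low
# response score (LRS); the file and column names are assumed.
tracts = pd.read_csv("planning_db_tracts.csv")  # columns: tract_id, lrs

# Tracts at or below the 75th percentile of LRS form the high response area;
# the remaining (highest-LRS) quarter of tracts form the low response area.
cutoff = tracts["lrs"].quantile(0.75)
tracts["response_area"] = (tracts["lrs"] <= cutoff).map(
    {True: "high response", False: "low response"})

print(tracts["response_area"].value_counts())
```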
The rates will be compared with the production treatment. We will make each comparison using a two-tailed hypothesis test, for up to four comparisons. Each null hypothesis will be H0: T5 = Ti and each alternative hypothesis will be HA: T5 ≠ Ti where i = 1, 2, 3, and 4 and T5 = control treatment.
RQ5. How does adding visual design elements and strategic messaging affect self-response before CAPI, compared to the minimalist approach? The visual design elements refer to the layout or how things are arranged on a page and the colors and imagery. Strategic messaging refers to the words chosen to convey the thematic messages used in the letters suggested by the Strategic Framework Report. Treatment 4 did not use color, graphics, or any of the recommended thematic messaging from the Strategic Framework Report. However, most other elements found in Treatments 1, 2, and 3 are also in Treatment 4. By comparing the first three treatments separately to the minimalist treatment, we can see how the layout, visual design elements, and strategic messaging of the Icon, Column and Header, and Sidebar Treatments affect self-response as compared to a treatment that does not use the targeted themes, color, or images and only uses minimal wording throughout. We will calculate and compare self-response return rates for treatments 1, 2, and 3 versus treatment 4. We will make each comparison using a two-tailed hypothesis test, for a total of three comparisons. Each null hypothesis will be H0: T4 = Ti and each alternative hypothesis will be HA: T4 ≠ Ti where i = 1, 2, and 3 and T4 = Minimalist Treatment.
Item-level response analysis is for occupied housing units only.
RQ6. How does the redesigned front cover of the questionnaire affect item nonresponse for the questions on the front cover? We will calculate and compare item nonresponse rates for each question individually and for the front cover overall. We will make each comparison using a two-tailed hypothesis test, for a total of four comparisons. Each null hypothesis will be H0: T5 = Ti and each alternative hypothesis will be HA: T5 ≠ Ti where i = 1, 2, 3, and 4 and T5 = control treatment. The universe for calculations will be all mail responses to the survey.
RQ7. How do the treatments affect overall form completeness? We will calculate and compare form completeness rates. We will make each comparison using a two-tailed hypothesis test, for a total of four comparisons. Each null hypothesis will be H0: T5 = Ti and each alternative hypothesis will be HA: T5 ≠ Ti where i = 1, 2, 3, and 4 and T5 = control treatment.
The universe for calculations will be all internet and mail responses.
RQ8. How do the treatments affect the demographics of early respondents? Late respondents? Overall respondents before CAPI? We will calculate and compare distributions of responses to questions about the following demographic and housing categories: age, educational attainment, Hispanic origin, race, sex, building type, and tenure. We will compare each treatment to the production treatment (T1, T2, T3, T4 vs. T5). For early respondents we will use internet, mail, and TQA responses received before the third mailing is sent. For late respondents we will use internet, mail, and TQA responses received before the fifth mailing is sent. For the overall calculations, we will use all self-responses received before the CAPI sample is created.
RQ9. How would the treatments affect the costs of data collection? Apart from the cost savings from the removal of the multilingual brochure, we will only assess cost impacts on treatments that perform better than the production treatment. To assess impacts on costs we will calculate the annual expected cost of implementing each experimental treatment into a full ACS production year and compare it to the production costs. A confidence interval for the cost that accounts for sampling error in the workload estimates will also be calculated. Cost differences will be calculated as described in Section 4.4.3.
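As an illustration of how such a confidence interval could be constructed, the sketch below propagates workload sampling error through fixed unit costs, assuming independent workload estimates; every figure is a placeholder, not a program cost.

```python
import numpy as np

z = 1.645  # critical value for a 90 percent confidence interval (alpha = 0.1)

# Placeholder inputs, not program figures: estimated annual workloads for the
# second mailout phase, third mailout phase, and CAPI; their standard errors
# (sampling error only); and assumed fixed unit costs.
workloads = np.array([3_100_000.0, 1_900_000.0, 95_000.0])
se_workloads = np.array([25_000.0, 21_000.0, 2_500.0])
unit_costs = np.array([0.68, 1.45, 210.00])  # dollars per mail piece / interview

# With fixed unit costs, total cost is linear in the workload estimates, so
# sampling error propagates directly (workloads assumed independent here).
cost = np.dot(workloads, unit_costs)
se_cost = np.sqrt(np.sum((unit_costs * se_workloads) ** 2))
print(f"annual cost: ${cost:,.0f} +/- ${z * se_cost:,.0f}")
```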
An abbreviated version of the cost analysis will appear in the final report. We will prepare a separate, more detailed, cost report for internal use, if necessary.
Each treatment in this test exhibits clear benefits when compared to current production materials. Those benefits are: (1) reduction in costs from eliminating the multilingual brochure and (2) reduction in respondent burden by improving the visual appeal and readability of the letters. Having said that, the goal of this test is to choose one experimental treatment as the best among all treatments. To do that we will adhere to the following decision criteria, listed in Table 1 by order of importance.
Table 1. Decision Criteria for “Winning” Treatment
Priority Order | Decision Criteria
1 | Self-Response Return Rates
2 | Response in Low Response vs. High Response Areas
3 | Response Distributions
4 | Form Completeness Rates
5 | Final Response Rates
As stated in the introduction, the purpose of this field test is to identify which experimental treatment is most successful at increasing self-response when compared to the control treatment (production materials). Self-response is the most important metric, because increasing self-response reduces data collection costs. We will look at self-response throughout the data collection period. The earlier we receive self-response, the more money we save. If we do not see a “winner” in the self-response category, or if some of the treatments are not statistically different, we will continue with the list of decision criteria to determine a “winner”.
We have also stated that a very important goal of the new materials is to convert households with the characteristics of typical CAPI respondents into self-responding households. We have implemented many new features in the materials that we hope will reach this goal. We expect this strategy to increase overall self-response, but if it does not, the next best metric to look at will be response in Low Response Areas. If we do not see any differences in this category, we will look for changes in the demographic makeup of early responders, which we hope to see in our response distribution analysis.
Note that we will consider the magnitude of differences as we make comparisons. For example, if Treatment 1 performs nominally better than Treatment 2 in self-response but significantly worse than Treatment 2 in response in low response areas, we will take that into consideration.
We hope to come to a decision without having to use criteria 4 and 5 on the list, which are the lowest priorities. We do not expect these categories to be significantly different for any of the treatments, but if two treatments are tied on criteria 1 through 3 and one of them performs extremely well on criteria 4 and 5, we will take that into consideration when deciding on the winning treatment.
If, using the above criteria, a single treatment cannot be identified as the “best” among all treatments, then additional field testing will be conducted. The additional field testing may include a subset of treatments, updated materials, or a combination of the current treatments. (See Appendix C for some ideas for future testing.)
A single ACS monthly sample is representative of an entire year (twelve panels) and of the entire sampling frame, with respect to both response rates and cost, as designed.
A single methods panel group (1/24 of the full monthly sample) is representative of the full monthly sample, as designed.
We assume that there is no difference between treatments in mail delivery timing or subsequent response time. The treatments had the same sample size and used the same postal sort and mailout procedures. Previous research indicated that postal procedures alone could cause a difference in response rates at a given point in time between experimental treatments of different sizes, with response for the smaller treatments lagging (Heimel, 2016).
Group quarters and sample housing unit addresses from remote Alaska and Puerto Rico are not included in the sample for the test. Any conclusions from the test can only be made for housing units like those in the test sample.
The cost analysis uses estimates to make cost projections. These estimates do not account for monthly variability in production costs such as changes in staffing, production rates, or printing price adjustments.
Each treatment was designed holistically and, as such, if differences in response are detected we will not be able to identify the specific elements in each treatment that caused differences to occur.
Table 1. Sample Table for Total Self-Response Return Rates: Comparison of Treatment vs. Control
Point in Data Collection Cycle | Treatment | Control | Difference | P-Value
Before the Third Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before the Fifth Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before CAPI, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 2. Sample Table for Mail and TQA Return Rates: Comparison of Treatment vs. Control
Point in Data Collection Cycle | Treatment | Control | Difference | P-Value
Before the Third Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before the Fifth Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before CAPI, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
NOTE 1: If significant differences in TQA rates are detected, we will report mail and TQA rates separately.
NOTE 2: The table shells contain hidden text (in white 1-point font) retained for 508 compliance from when the by-mode tables were combined into one table; this hidden text is the source of the underline that may appear in some cells.
Table 3. Sample Table for Internet Return Rates: Comparison of Treatment vs. Control
Point in Data Collection Cycle | Treatment | Control | Difference | P-Value
Before the Third Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before the Fifth Mailing, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Before CAPI, All Self-response Modes | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 4. Sample Table for Final Response Rates and Response Distributions by Mode
Response Mode | Treatment | Control | Difference | P-Value
Overall Self-Response | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Internet | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Mail and TQA | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
CAPI | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 6. Sample Table for Final Overall Response Rates by Designated High and Low Response Areas
Response Area | Test | Control | Test Minus Control | P-Value
High Response Area | | | |
Low Response Area | | | |
Difference | | | |
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 7. Sample Table for Item and Section Nonresponse Rates for Mail Responses – Treatment X vs. Treatment Y
Item | Treatment X | Treatment Y | Difference | P-Value
Section Nonresponse | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Name | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Telephone Number | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Number of Persons in Household | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 8. Sample Table for Form Completeness by Response Mode, Treatment vs. Control
Response Mode | Treatment | Control | Difference | P-Value
All Self-Response | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Mail | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Internet | NN.N (N.N) | NN.N (N.N) | N.N (N.N) | N.NN
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Significance was tested based on a two-tailed t-test at the α=0.1 level.
Table 9. Sample Table for Response Distributions for <insert characteristic> by Response Mode and Treatment, Treatment vs. Control
Characteristic Group and Response Mode |
Treatment |
Control |
P-Value |
Total Self-Response |
100.0 |
100.0 |
N.NN |
All Modes: Category 1 |
NN.N (N.N) |
NN.N (N.N) |
|
All Modes: Category 2 |
NN.N (N.N) |
NN.N (N.N) |
|
All Modes: Category 3 |
NN.N (N.N) |
NN.N (N.N) |
|
100.0 |
100.0 |
N.NN |
|
Mail: Category 1 |
NN.N (N.N) |
NN.N (N.N) |
|
Mail: Category 2 |
NN.N (N.N) |
NN.N (N.N) |
|
Mail: Category 3 |
NN.N (N.N) |
NN.N (N.N) |
|
Internet |
100.0 |
100.0 |
N.NN |
Internet and TQA: Category 1 |
NN.N (N.N) |
NN.N (N.N) |
|
Internet and TQA: Category 2 |
NN.N (N.N) |
NN.N (N.N) |
|
Internet and TQA: Category 3 |
NN.N (N.N) |
NN.N (N.N) |
|
Source: U.S. Census Bureau, American Community Survey, <Insert Test Name>; DRB Approval Number: <insert approval #>
Note: Minor additive discrepancies are due to rounding. Standard errors are in parentheses. An asterisk (*) indicates a statistically significant result. Statistical significance was determined by comparing distributions using a Rao-Scott chi-square test at the α=0.1 level.
Note: To be 508 compliant, the categories in this table are preceded by hidden words in white text. Before each category group, type "All Modes," "Mail," "Internet," etc.
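For the distributional comparisons in Table 9, the note above calls for a Rao-Scott chi-square test. The sketch below illustrates a simplified first-order Rao-Scott correction, in which an ordinary Pearson chi-square statistic is deflated by an average design effect. The category counts and the design effect are placeholders; the production analysis would estimate design effects from the survey's replicate weights rather than assuming a single value.

```python
# Simplified first-order Rao-Scott adjusted chi-square test comparing a
# response distribution between treatment and control. Counts and the
# design effect below are placeholders, not real results.
import numpy as np
from scipy import stats

treatment = np.array([520, 310, 170])  # placeholder weighted counts, categories 1-3
control = np.array([480, 340, 180])

chi2, _, dof, _ = stats.chi2_contingency(np.vstack([treatment, control]))

deff = 1.8              # placeholder average design effect
chi2_rs = chi2 / deff   # first-order Rao-Scott adjustment
p_value = stats.chi2.sf(chi2_rs, dof)

print(f"Rao-Scott adjusted chi-square = {chi2_rs:.2f}, "
      f"dof = {dof}, p = {p_value:.3f}")
```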
Based on the results of this research, the Census Bureau may consider revising the ACS production mail materials.
Berkley, J. (2018). Respondent Demographics Presentation and Report. Internal U.S. Census Bureau Report: unpublished.
Blau, P. (1964). Exchange and Power in Social Life. New York, NY: Wiley.
Cialdini, R. (1984). Influence: The New Psychology of Modern Persuasion. New York, NY: Quill.
Clark, S. (2015a). “2015 Replacement Mail Questionnaire Package Test,” 2015 American Community Survey Research and Evaluation Report Memorandum Series #ACS15-RER-18. Retrieved on May 12, 2020 from https://www.census.gov/library/working‑papers/2015/acs/2015_Clark_02.html
Clark, S. (2015b). “2015 Mail Contact Strategy Modification Test,” 2015 American Community Survey Research and Evaluation Report Memorandum Series #ACS15-RER-19. Retrieved on May 12, 2020 from https://www.census.gov/library/working‑papers/2015/acs/2015_Clark_03.html
Dillman, D., Jenkins, C., Martin, B. and DeMaio, T. (1996). “Cognitive and Motivational Properties of Three Proposed Decennial Census Forms.” Center for Survey Methods Research Memorandum Series #2006-4, U.S. Bureau of the Census, Washington, D.C. Retrieved on May 27, 2020 from https://www.census.gov/srd/papers/pdf/ssm2006‑04.pdf
Dillman, D. (2016). “Part 2: What Theory Can Tell Us About Why People Do or Do Not Answer Survey Requests.” U.S. Census Bureau Summer at Census Workshop, July 11, 2016. Suitland, MD.
Dillman, D., Smyth, J., and Christian, L. (2014). Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method (4th ed.). Hoboken, NJ: Wiley.
Dillman, D. and Greenberg, P. (2017). “Making Mail Communications More Effective: A Test of Presuasion vs. Social Exchange Theories.” Presented to the 7th Conference of the European Survey Research Association, July 21, 2017. Lisbon, Portugal.
Feygina, I., Foster, L., and Hopkins, D. (2015). “Re: Potential Pilot Interventions to Increase Response Rates to Census Surveys.” Received by Ted Johnson, September 10, 2015.
Fishbein, M. and Ajzen, I. (2011). Predicting and Changing Behavior: The Reasoned Action Approach. Oxford, United Kingdom: Taylor & Francis.
Gentry, R. and Good, C. (2008). “Offering Respondents a Choice of Survey Mode: Use Patterns of an Internet Response Option in a Mail Survey”. Presented to the Annual Conference of The American Association for Public Opinion Research, May 5, 2008. New Orleans, LA.
Groves, R., Singer, E., and Corning, A. (2000). “Leverage-Saliency Theory of Survey Participation.” Public Opinion Quarterly, 64(3): 299–308.
Groves, R., Presser, S., Tourangeau, R., West, B.T., Couper, M.P., Singer, E. and Toppe, C. (2012). “Support for the Survey Sponsor and Nonresponse Bias.” Public Opinion Quarterly, 76(3): 512–524.
Hagedorn, S., Panek, M., and Green, R. (2014). “American Community Survey Mail Package Research: Online Visual Testing.” Washington, D.C.: U.S. Census Bureau. Retrieved on May 12, 2020 from https://www.census.gov/content/dam/Census/library/working-papers/2014/acs/2014_Hagedorn_04.pdf
Hallsworth, M., List, J. A., Metcalf, R. D., and Vlaev, I. (2014). “The Behavioralist as Tax Collector: Using Natural Field Experiments to Enhance Tax Compliance.” NBER Working Paper No. 20007.
Heimel, S. (2016). “Postal Tracking Research on the May 2015 ACS Panel.” 2016 American Community Survey Research and Evaluation Report Memorandum Series #ACS16-RER-01, April 1, 2016.
Hochberg, Y. (1988). “A Sharper Bonferroni Procedure for Multiple Tests of Significance,” Biometrika, 75 (4), 800-802. Retrieved on July 23, 2020 from http://www.jstor.org/stable/2336325?seq=1#page_scan_tab_contents
Homans, G. C. (1961). Social Behavior: Its Elementary Forms. New York, NY: Harcourt, Brace & World.
Joshipura, M. (2010). “Evaluating the Effects of a Multilingual Brochure in the American Community Survey.” Washington, D.C.: U.S. Census Bureau. Retrieved on May 26, 2020 from https://www.census.gov/content/dam/Census/library/working-papers/2010/acs/2010_Joshipura_01.pdf
Kouyoumdjian, H. (2012). “Learning Through Visuals.” Retrieved on June 4, 2020 from https://www.psychologytoday.com/us/blog/get-psyched/201207/learning-through-visuals
Leslie, T.F. (1997) “Comparing Two Approaches to Questionnaire Design: Official Government versus Public Information Design.” Proceedings of the Survey Research Section of the American Statistical Association. Retrieved on May 27, 2020 from http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C5DDD16163DDF8AB4A36DB7B236E588B?doi=10.1.1.631.6688&rep=rep1&type=pdf
Martinez, M., Katz, J., Luck, J., and Kephart, K. (forthcoming) “Cognitive Testing of 2020 American Community Survey Strategic Framework Mailing Materials.” Washington, D.C.: U.S. Census Bureau.
Mathiowetz, N. (2016). Post-meeting thoughts. Received by Todd Hughes, June 3, 2016.
Mertz, N. (2012). “How and Why Icons Improve Your Web Design.” Retrieved on June 4, 2020 from https://usabilla.com/blog/how-and-why-icons-improve-you-web-design/
Milkman, K. L., Beshears, J., Choi, J. J., Laibson, D., and Madrian, B. (2011). “Using Implementation Intentions Prompts to Enhance Influenza Vaccination Rates.” Proceedings of the National Academy of Sciences, 108(26): 10415–10420.
Millar, M. and Dillman, D. (2011). “Improving Response to Web and Mixed-Mode Surveys.” Public Opinion Quarterly, 75(2): 249–269.
National Academies of Sciences, Engineering, and Medicine. (2016). Reducing Response Burden in the American Community Survey: Proceedings of a Workshop. Washington, DC: The National Academies Press. Retrieved on May 6, 2020 from https://doi.org/10.17226/23639
National Academies of Sciences, Engineering, and Medicine. (2019). Improving the American Community Survey: Proceedings of a Workshop. Washington, DC: The National Academies Press. Retrieved on May 12, 2020 from https://doi.org/10.17226/25387
Olson, K., Smyth, J., and Wood, H. (2012). “Does Giving People their Preferred Survey Mode Actually Increase Survey Participation?” Public Opinion Quarterly, 76(4), 611–635.
Oliver, B., Heimel, S., and Schreiner, J. (2017). “Strategic Framework for Messaging in the American Community Survey Mail Materials.” Washington, DC: U.S. Census Bureau. Retrieved on May 6, 2020 from https://www.census.gov/content/dam/Census/library/working-papers/2017/acs/2017_Oliver_01.pdf
Plain Writing Act of 2010. (2010). Pub. L. No. 111-274, 124 Stat. 2861–2863.
Presser, S., Blair, J., and Triplett, T. (1992). “Survey Sponsorship, Response Rates, and Response Effects.” Social Science Quarterly, 73(3), 699–702.
Reingold, Inc. (2014a). “ACS Mail Package Research: Focus Groups and Interviews.” 2014 American Community Survey Research and Evaluation Report Memorandum Series #ACS14-RER-25.
Reingold, Inc. (2014b). “ACS Messaging Research: Cumulative Findings.” Retrieved on May 11, 2020 from https://www.census.gov/content/dam/Census/library/working‑papers/2014/acs/2014_Walker_01.pdf
Risley, M., and Berkley, J. (2020). “2018 ACS Mail Materials Test,” 2021 American Community Survey Research and Evaluation Report Memorandum Series #ACS21-RER-01. Retrieved on January 25, 2021 from https://www.census.gov/library/working‑papers/2020/acs/2021_Risley_01.html
Risley, M., and Oliver, B. (forthcoming). “2019 ACS Due Dates Test,” 2021 American Community Survey Research and Evaluation Report Memorandum Series #ACS21-RER-##.
Schreiner, J., Oliver, B., and Poehler, E. (2020). “Assessment of Messaging in the 2018 American Community Survey Mail Contact Materials.” Washington, DC: U.S. Census Bureau. Retrieved on May 6, 2020 from https://www.census.gov/content/dam/Census/library/working‑papers/2020/acs/2020_Schreiner_01.pdf
Shephard, D. and Bowers, J. (2016). Suggestions for ACS 2016 Softened Revised Design (v1). Received by Ted Johnson, May 6, 2016.
Smyth, J., Dillman, D., Christian, L., and O’Neill, A. (2010). “Using the Internet to Survey Small Towns and Communities: Limitations and Possibilities in the Early 21st Century.” American Behavioral Scientist, 53: 1423–1448.
U.S. Census Bureau (2014). “American Community Survey Design and Methodology.” Retrieved on May 11, 2020 from https://www.census.gov/programs-surveys/acs/methodology/design-and-methodology.html
Whitcomb, M. and Porter, S. (2004). “E-mail Contacts: A Test of Complex Graphical Designs in Survey Research.” Social Science Computer Review, 22(3), 370–376.
Appendix A. Materials for the Experiment
Materials in Each Mailing
Mailing | Materials
First Mailing | Letter: ACS-13(L); Instruction Card: ACS-34IM (same for all experimental treatments); Envelope: ACS-46IM (same for all experimental treatments); Multilingual Brochure: ACS-9 (production only)
Second Mailing | Pressure Seal Mailer: ACS-20
Third Mailing | Letter: ACS-14(L) (back of letter same for all experimental treatments); Return Envelope: 6385-47 (same for all experimental treatments); Questionnaire: ACS-1 (same for all experimental treatments); Envelope: ACS-46 (same for all experimental treatments)
Fourth Mailing | Postcard: ACS-29
Fifth Mailing | Pressure Seal Mailer: ACS-23
Icon Treatment: ACS-13(L) (Front of Letter)
Icon and Sidebar Treatments: ACS-13(L) (Back of Letter)
All Treatments: ACS-34IM (Front and Back of Instruction Card)
Translations of the non-English text below:
Spanish: HOW TO RESPOND: Go to respond.census.gov/acs to complete the survey in Spanish by computer, smartphone, or laptop. Need help or have questions? Call 1-877-833-5625.
Chinese (simplified), Vietnamese, Russian, and Korean: If you have questions about the survey call [number] to speak to one of our employees who speaks [language]. The employee can answer your questions, or you can complete the survey over the phone.
All Treatments: ACS-46IM (Initial Mailing Envelope)
Icon Treatment: ACS-20 (Inside of Mailer)
All Treatments: ACS-20 (Outside of Mailer) [The only difference between all treatments is the form number found in the lower right-hand corner.]
Icon Treatment: ACS-14(L) (Front of Letter)
All Treatments: ACS-14(L) (Back of Letter)
All Treatments: 6385-47 (Front of Postage Paid Return Envelope for Questionnaire)
All Treatments: ACS-1 (Front Cover of Questionnaire)
All Treatments: ACS-1 (Back Cover of Questionnaire)
All Treatments: ACS-46(EX) (Front of Envelope – Paper Questionnaire Mailing)
All Treatments: ACS-46(EX) (Back of Envelope - Paper Questionnaire Mailing)
Icon Treatment: ACS-29 (Front and Back of Postcard)
Icon Treatment: ACS-23 (Inside of Pressure Seal Mailer) [Sample variable data: the date, the phrase “Wednesday, September 30, 2020?”, and the user ID.]
All Treatments: ACS-23 (Outside Pressure Seal Mailer) [The only difference between all treatments is the form number found in the lower right-hand corner. Sample variable data: the due date.]
Sidebar Treatment: ACS-13(L) (Front of Letter)
Sidebar Treatment: ACS-20 (Inside of Pressure Seal Mailer)
Sidebar Treatment: ACS-14(L) (Front of Letter)
Sidebar Treatment: ACS-29 (Front and Back of Postcard)
Sidebar Treatment: ACS-23 (Inside of Pressure Seal Mailer) [Sample variable data: the date, the phrase “Wednesday, September 30, 2020?”, and the user ID.]
Column and Header Treatment: ACS-13(L) (Front of Letter)
Column and Header Treatment: ACS-13(L) (Back of Letter)
Column and Header Treatment: ACS-20 (Inside of Pressure Seal Mailer)
Column and Header Treatment: ACS-14(L) (Front of Letter)
Column and Header Treatment: ACS-29 (Front and Back of Postcard)
Column and Header Treatment: ACS-23 (Inside of Pressure Seal Mailer) [Sample variable data: the date, the phrase “Wednesday, September 30, 2020?”, and the user ID.]
Minimalist Treatment: ACS-13(L) (Front of Letter)
Minimalist Treatment: ACS-13(L) (Back of Letter)
Minimalist Treatment: ACS-20 (Inside of Pressure Seal Mailer)
Minimalist Treatment: ACS-14(L) (Front of Letter)
Minimalist Treatment: ACS-29 (Front and Back of Postcard)
Minimalist Treatment: ACS-23 (Inside of Pressure Seal Mailer) [Sample variable data: the date, the phrase “Wednesday, September 30, 2020?”, and the user ID.]
Production: ACS-13(L) (Front of Letter)
Production: ACS-13(L) (Back of Letter)
Production: ACS-34IM (Front and Back of Instruction Card)
Production: ACS-46IM (Front and Back of Envelope)
Production: ACS-9 (Multilingual Brochure)
Production: ACS-20 (Inside of Pressure Seal Mailer)
Production: ACS-20 (Outside of Pressure Seal Mailer)
Production: ACS-14(L) (Front of Letter)
Production: ACS-14(L) (Back of Letter)
Production: 6385-47 (Front of Postage Paid Return Envelope for the Questionnaire)
Production: ACS-1 (Front Page of the Questionnaire)
Production: ACS-1 (Back Page of the Questionnaire)
Production: ACS-46 (Front of the Envelope)
Production: ACS-46 (Back of the Envelope)
Production: ACS-29 (Front and Back of Postcard)
Production: ACS-23 (Inside of Pressure Seal Mailer) [Sample variable data: The due date, the user ID, and the sentence “Respond by <insert date> to be removed from our schedule for a visit.”]
Production: ACS-23 (Outside of Pressure Seal Mailer) [Sample variable data: the date]
Appendix B. 2021 Mailing Descriptions and Schedule for the 2021 September Production Panel
Mailing | Description of Materials | Mailout Date
Initial Mailing Package | A package of materials containing the following: Introduction Letter, Frequently Asked Questions (FAQ) Brochure, Multilingual Informational Brochure, and Internet Instruction Card. This mailing urges housing units to respond via the internet. | 8/26/2021
Reminder Letter | A reminder letter sent to all addresses that were sent the Initial Mailing Package, reiterating the request to respond. [Pressure seal mailer] | 9/2/2021
Paper Questionnaire Package | A package of materials sent to addresses that have not responded. Contains the following: Introduction Letter, Paper Questionnaire, Return Envelope, Internet Instruction Card, and FAQ Brochure. | 9/16/2021
Reminder Postcard | A reminder postcard sent to all addresses that were sent the Paper Questionnaire Package, reiterating the request to respond. | 9/20/2021
Additional Postcard | An additional reminder postcard sent to addresses that have not yet responded and are ineligible for telephone follow-up. [Pressure seal mailer] | 10/13/2021
CAPI Letter† | A letter sent to all mailable addresses in the CAPI sample that includes an internet user ID. This letter informs households that a Census Bureau interviewer will contact them soon if they do not respond to the survey. | 11/3/2021
†NPC mails the CAPI Letter on a flow basis. The “Mailout Date” is the date by which all CAPI Letters have been mailed from NPC.
Appendix C. Future Testing Ideas
Some ideas we have for possible future testing are listed below.
Testing a due date in the second mailing (“Responding by __________ avoids additional mailings.”)
Using a date on the letters
Testing different types of benefit messages
Testing to understand the effect of specific messaging or design elements of the winning treatment
Testing ideas for the fifth mailing:
Different messaging on the outside of the pressure seal mailer
Pressure seal mailer with tailored messaging targeted for specific areas
An infographic-style letter in place of a pressure seal mailer
A letter with a card stock graphical insert in place of a pressure seal mailer
Bilingual mailer or letter for targeted geographical areas
1 The 2018 production materials were used for the assessment. For this test, the experimental treatments will be tested against new materials that went into production starting in 2020.
2 Addresses deemed “undeliverable as addressed” (UAA) by the United States Postal Service are also removed from the address files for subsequent mailings.
3 CAPI interviews start at the beginning of the month following the fifth mailing.
4 A pressure seal reminder letter, which includes a user ID needed for an internet response, is also sent to all mailable addresses sampled in CAPI at the start of the interviewing month. This began in October of 2020.
5 The recommendations were at the forefront of the design process for the materials in all test treatments except the Minimalist Treatment. That treatment was designed to keep text to a minimum while maintaining a “governmental feel” in the materials.
6 One of the treatments, the Minimalist Treatment, does not use the recommended themes for each mailing.
7 For example, one benefit mentioned is “services to help the elderly, veterans, and the disabled.”
8 Participants in cognitive testing were able to successfully navigate to the ACS response landing page using the abbreviated URL.
9 See plainlanguage.gov for more ideas on visual design and text layout to improve readability.
10 In English and Spanish, a number is given to call if a respondent has questions about the survey, but the text does not directly state that the survey can be completed over the phone.
11 We acknowledge the risk this change poses to survey response in non-English languages, and we will monitor the situation, possibly isolating the experimental change in a future test. We may, however, see an increase in response because the text is easier to see on the instruction card than inside the trifold brochure.
12 See Appendix B for dates of the mailout schedule for the September 2021 panel.
13 We are hopeful that the results from this test will give us insight into the best design to use for the first four mailings and insight into the likely (perhaps new) target audience for the fifth mailing.
14 The pressure seal mailers are only printed with black and grey ink.
15 We will remove addresses deemed to be Undeliverable as Addressed by the U.S. Postal Service if no response is received.
16 We may combine mail and TQA responses if there is no significant difference between treatments in TQA responses.
17 We do not expect there to be enough phone responses in Chinese, Korean, and Russian to detect significant differences, but we will monitor the number of responses in those languages to see if the text on the new Instruction Card appears to have an effect on response in those languages.
18 The Census Bureau Planning Database can be found at https://www.census.gov/topics/research/guidance/planning-databases.html
19 For ACS tracts that do not appear in the Census Planning Database, we will create county-level LRS scores based on the weighted averages of tract-level low response scores.
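As an illustration of the aggregation described in footnote 19, the sketch below computes county-level LRS values as weighted averages of tract-level scores. The column names and the use of occupied housing units as the weighting variable are assumptions for illustration only; the actual weighting variable would come from the Planning Database.

```python
# Illustrative sketch: county-level low response scores (LRS) as weighted
# averages of tract-level scores. All data and column names are placeholders.
import pandas as pd

tracts = pd.DataFrame({
    "county": ["001", "001", "001", "003"],
    "tract_lrs": [22.5, 18.0, 27.3, 15.1],     # tract-level low response scores
    "occupied_units": [1200, 800, 400, 2000],  # assumed weighting variable
})

# Weight each tract's score, then divide county totals by county weights.
tracts["weighted"] = tracts["tract_lrs"] * tracts["occupied_units"]
county = tracts.groupby("county")[["weighted", "occupied_units"]].sum()
county["county_lrs"] = county["weighted"] / county["occupied_units"]

print(county["county_lrs"])
```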