Generic Information Collection Request for ACES Machine Learning Debriefings
Request: The Census Bureau plans to conduct additional research under the generic clearance for questionnaire pretesting research (OMB number 0607-0725). We plan to conduct debriefing interviews on a proposed machine learning functionality for the Annual Capital Expenditures Survey (ACES) conducted by the Census Bureau.
The ACES is a mandatory annual collection that gathers data on business investment for new and used structures and equipment. This survey is conducted under the authority of Title 13, United States Code (U.S.C.). It is a sample survey of approximately 70,000 companies. Basic data for each year include all expenditures during the year for both new and used structures and equipment chargeable to asset accounts for which depreciation or amortization accounts are ordinarily maintained. For reference years ending in “2” and “7,” detailed data by types of structures and types of equipment are collected from companies with employees.
These data are critical to evaluate productivity growth, the ability of U.S. business to compete with foreign business, changes in industrial capacity, and measures of overall economic performance. Industry analysts use the data for market analysis, economic forecasting, identifying business opportunities, and developing strategic plans. Government agencies use the data to improve estimates for investment indicators for monetary policy, improve estimates for capital stocks for productivity analysis, monitor and evaluate the healthcare industry, and analyze depreciation. Other users of the data include private companies, organizations, educators, students, and economic researchers.
Further information regarding ACES and its uses can be found at this website: https://www.census.gov/programs-surveys/aces/about.html
In an effort to improve data quality and minimize the prevalence of missing expenditures, the Annual Capital Expenditures Survey (ACES) includes an “Other” specify item allowing respondents to enter expenditures that are not prelisted. In the past, these expenditures were manually reviewed by subject matter analysts and reclassified as spending for structures or equipment or not applicable. For the 2016 survey cycle, the presence of a value in “Other” accounted for 30 percent of total edit failures.
In 2017, ACES incorporated a machine learning component that automatically codes the write-ins for reclassification based on a series of keywords. Survey respondents are prompted to reclassify an item for a given expenditure if the prediction for the write-in is either “Structures” or “Equipment” and the probability associated with that prediction is 80 percent or higher. Although this new functionality is intended to enhance the reporting process for providing expenditures, it is unclear how respondents interacted with this functionality, whether it was well understood and utilized, and whether it aided in reporting. Debriefing interviews will be used to assess the functionality of the machine learning component from the respondents’ perspective and to determine how improvements can be made to its current design/functionality.
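For illustration only, the prompting rule described above can be sketched as follows. This is a minimal sketch and not the production ACES implementation: the names `Prediction`, `PROMPT_LABELS`, `PROMPT_THRESHOLD`, and `should_prompt_reclassification` are hypothetical, and the keyword-based model that produces the prediction is assumed to exist elsewhere.

```python
from dataclasses import dataclass

# Assumed label names and cutoff, taken from the rule as described in this
# request: prompt only for "Structures" or "Equipment" at 80 percent or higher.
PROMPT_LABELS = {"Structures", "Equipment"}
PROMPT_THRESHOLD = 0.80


@dataclass
class Prediction:
    """Hypothetical container for one write-in's predicted class."""
    label: str          # e.g. "Structures", "Equipment", "Not applicable"
    probability: float  # model probability for that label, between 0 and 1


def should_prompt_reclassification(pred: Prediction) -> bool:
    """Return True when the respondent would be prompted to reclassify."""
    return pred.label in PROMPT_LABELS and pred.probability >= PROMPT_THRESHOLD
```

Under this sketch, a write-in predicted as “Equipment” with probability 0.80 would trigger the prompt, while the same prediction at 0.79, or any “Not applicable” prediction, would not.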
Purpose
These debriefing interviews will be conducted to gain a better understanding of respondents’ reporting of expenditures using the new machine learning functionality. We will use the company’s write-in responses to Items 3a/3b (Other New/Used Capital Expenditures) to frame our points of discussion. During these interviews, we will seek the following information:
Respondents’ experience using the write-ins and interactions with the machine learning component for reclassification of entries
Respondents’ ability to resolve errors associated with the classification of expenditures
Respondents’ ability to reclassify entries when prompted
Areas of improvement for the design/functionality of the machine learning component
Staff from the Data Collection Methodology & Research Branch within the Economic Statistical Methods Division (ESMD) of the Census Bureau will be conducting debriefing interviews for this testing.
Language
Testing will be conducted in English only.
Method
The method of research will be respondent debriefing interviews, which are interviews aimed at understanding how a respondent recently reported to a survey. Paradata may also be used to inform debriefing questions. All interviews will be conducted over the telephone. The interviews will follow a semi-structured interview protocol (Attachment A). For illustrative purposes and to aid in recall, respondents (with consent) will be emailed screenshots of Item 3 illustrating the machine learning functionality and error messages prompting reclassification. This may also be illustrated via a video recording if necessary. All data provided in the illustrations will be fictitious with no identifying information (see Attachment B).
Subject area specialists from the Census Bureau will listen in on some of the debriefing interviews.
Sample: We plan to conduct a total of 30 interviews. This number of interviews was selected because it is a manageable number of interviews for the time period allotted, and should be large enough to provide reactions to the questions that are representative of the survey population. We plan to conduct interviews with a variety of sizes and types (i.e., industries) of businesses. The frame of respondents will be recent respondents to the 2018 ACES, with a particular focus on respondents who provided write-in responses that triggered the machine learning functionality (e.g., those who received an error message prompting them to reclassify their expenditure).
Recruitment: Participants will be recruited using a list of respondents from the 2018 ACES. Before beginning the interviews, we will inform participants that their response is voluntary and that the information they provide is confidential under Title 13. The interviews may be recorded (with consent), to facilitate summarization.
Protocol: A copy of a draft interview protocol and screenshots for testing purposes are enclosed. Respondent debriefings will be conducted via telephone. Respondents will be sent visuals of the survey item and the machine learning component via email.
Timeline:
Recruiting for these debriefing interviews will begin as early as May 2019 and continue through July 2019.
Length of interview: For respondent debriefings, we expect that each interview will last no more than 30 minutes (30 cases x 30 minutes per case = 15 hours). Additionally, to recruit respondents we expect to make up to 5 phone contacts per completed case. The recruiting calls are expected to last on average 3 minutes per call (5 calls per completed case x 30 cases x 3 minutes per call = 7.5 hours). Thus, the estimated burden is 22.5 hours (15 hours for interviews + 7.5 hours for recruiting).
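The burden arithmetic above can be checked with a short calculation. This is a sketch for verification only; the function name `burden_hours` and its parameter names are hypothetical, and the inputs are the figures stated in this request (30 cases, 30 minutes per interview, up to 5 recruiting calls per completed case, 3 minutes per call).

```python
def burden_hours(cases, interview_minutes, calls_per_case, minutes_per_call):
    """Return (interview, recruiting, total) respondent burden in hours."""
    # Interview burden: one interview per completed case.
    interview = cases * interview_minutes / 60
    # Recruiting burden: every completed case may take several short calls.
    recruiting = cases * calls_per_case * minutes_per_call / 60
    return interview, recruiting, interview + recruiting


interview, recruiting, total = burden_hours(
    cases=30, interview_minutes=30, calls_per_case=5, minutes_per_call=3
)
print(f"Interviews: {interview} h, recruiting: {recruiting} h, total: {total} h")
```

With the stated inputs this reproduces the 15 hours for interviews, 7.5 hours for recruiting, and 22.5 hours in total.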
Enclosures: Below is a list of materials to be used in the current study:
Debriefing protocol (Attachment A)
2018 ACES screenshots illustrating Item 3 machine learning component to be evaluated (Attachment B)
Contact: The contact person for questions regarding data collection and statistical aspects of the design of this research is listed below:
Temika Holland
Data Collection Methodology & Research Branch
Economic Statistical Methods Division
U.S. Census Bureau
Washington, D.C. 20233
(301) 763-5241
Temika.Holland@census.gov
Enclosures
Cc:
Nick Orsini (ADEP) with enclosure
Carol Caldwell (ESMD) with enclosure
Diane Willimack (ESMD) with enclosure
Amy Anderson Riemer (ESMD) with enclosure
Kristin Stettler (ESMD) with enclosure
Kimberly Moore (EWD) with enclosure
Anne Sigda Russell (EWD) with enclosure
Valerie Mastalski (EWD) with enclosure
Catherine Buffington (ADEP) with enclosure
Jennifer Hunter Childs (ADRM) with enclosure
Jasmine Luck (ADRM) with enclosure
Danielle Norman (PCO) with enclosure
Mary Lenaiyasa (PCO) with enclosure