APPENDIX H
2021 NSCG Adaptive Survey Design
Experiment Goals, Interventions, Monitoring Metrics, and Potential Intervention Points
2021 NSCG Adaptive Design Experiment Goals, Interventions, and Monitoring Metrics
The 2021 NSCG Adaptive Design Experiment (“2021 Experiment”) will be structured largely the same as the 2017 and 2019 NSCG Adaptive Design Experiments. Just as in those years, we will have experimental groups for the new sample cases (8,000) and the returning sample cases (10,000) with control groups identified for comparative purposes. Improvements will come from two directions for the 2021 Experiment:
First, cases will be identified for interventions based on their ability to reduce data collection costs while minimizing the increase in the root mean squared error (RMSE) of several key NSCG variables. Additionally, we will expand the data monitoring metrics implemented during data collection to include evaluating the stability of survey estimates.
Second, we will automate both the identification and selection of cases for interventions and the delivery of the intervention file directly to the data collection modes. This will reduce the number of handoffs required to enact an intervention, making the implementation of adaptive design more efficient.
Previously, NCSES and the Census Bureau worked to develop flow processing capabilities for the entire survey, with editing, weighting, and imputation occurring at set points during the data collection period rather than after data collection ends. For the 2021 Experiment, we will implement simplified versions of flow processing that allow us to examine differences between the treatment and control groups not only with respect to representativeness and response rate, but also with respect to the stability of estimates and the effect of our nonresponse adjustment. These metrics will be considered as contributing factors in our decisions to make interventions.
Additionally, we will use past rounds of the NSCG to impute responses for nonrespondents throughout data collection and to model the propensity to respond given the application of particular data collection features, along with the cost of those features. These simulations will allow us to determine which features are most effective at reducing the RMSE of key estimates while understanding their effect on response rates and budget. We can also use these simulations to evaluate whether the effects of data collection features in the 2021 NSCG are relatively stable over time. This continues work started in the 2019 adaptive survey design experiment to incorporate predictions of actual survey estimates into the adaptive design decision-making process.
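To make the mechanics of this simulation approach concrete, the minimal sketch below computes the RMSE of a key estimate (a mean salary) under alternative data collection features, where RMSE is the square root of the mean squared deviation of the simulated estimates from the imputed full-sample value. All data, feature names, propensities, and costs are synthetic placeholders, not the production NSCG models:

import numpy as np

rng = np.random.default_rng(12345)

# Synthetic frame: an imputed salary (e.g., modeled from prior-round
# responses) for every sample case. The full-sample mean of the imputed
# values serves as the proxy "truth" against which RMSE is measured.
n = 5_000
imputed_salary = rng.lognormal(mean=11.1, sigma=0.5, size=n)
proxy_truth = imputed_salary.mean()

# Hypothetical feature-specific response propensities and per-case costs.
features = {
    "web_invite":         (rng.uniform(0.05, 0.35, size=n), 1.0),
    "mail_questionnaire": (rng.uniform(0.10, 0.40, size=n), 8.0),
    "early_cati_call":    (rng.uniform(0.15, 0.50, size=n), 25.0),
}

def simulate_feature(propensity, cost_per_case, n_sims=200):
    """Simulate who responds under one feature; return RMSE of the mean and cost."""
    estimates = np.empty(n_sims)
    for s in range(n_sims):
        responded = rng.random(n) < propensity
        estimates[s] = imputed_salary[responded].mean()
    rmse = np.sqrt(np.mean((estimates - proxy_truth) ** 2))
    return rmse, cost_per_case * n  # cost of applying the feature to every case

for name, (propensity, cost) in features.items():
    rmse, total_cost = simulate_feature(propensity, cost)
    print(f"{name:>18}: RMSE = {rmse:7.1f}, total cost = ${total_cost:>9,.0f}")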
The second improvement will continue the automation of the data analytic and business rule execution that was ad hoc in nature in the adaptive design experiments of earlier cycles. While some monitoring metrics, including R-indicators, were run on an automated basis, specific decisions about when and where interventions should actually occur were the result of extended conversations and incremental data analysis. While these steps were important in the early stages of adaptive design, and for understanding how large interventions would need to be, adaptive design cannot be implemented in a standardized, repeatable production setting while maintaining such a hands-on approach. For the 2021 Experiment, we will review the lessons learned from past adaptive design experiments in order to automate informative analyses in conjunction with intervention files.
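As a rough illustration of what such automation might look like, the sketch below applies a simple business rule to a weekly monitoring extract and writes an intervention file. The column names, rule, and file format are hypothetical stand-ins, not the actual NSCG production interface:

import pandas as pd

def select_interventions(cases: pd.DataFrame, week: int) -> pd.DataFrame:
    """Apply an illustrative business rule to a weekly monitoring extract
    and return the cases selected for one intervention type."""
    # Hypothetical columns: predicted RMSE reduction if the case is sent to
    # CATI early, the predicted cost increase, and a per-case cost limit.
    rule = (cases["pred_rmse_reduction"] > 0) & (
        cases["pred_cost_increase"] <= cases["cost_limit"]
    )
    selected = cases.loc[rule, ["case_id"]].copy()
    selected["intervention"] = "activate_cati_early"
    selected["week"] = week
    return selected

# Weekly run: read the monitoring extract, apply the rule, and deliver the
# intervention file directly to the data collection modes.
cases = pd.DataFrame({
    "case_id": [1, 2, 3, 4],
    "pred_rmse_reduction": [0.8, -0.1, 0.3, 0.0],
    "pred_cost_increase": [5.0, 2.0, 9.0, 1.0],
    "cost_limit": [10.0, 10.0, 6.0, 10.0],
})
intervention_file = select_interventions(cases, week=8)
intervention_file.to_csv("interventions_week08.csv", index=False)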
In a general sense, the goal of the 2021 Experiment is to expand the 2019 experiment in order to identify cases for interventions based on several outcome variables versus a single variable, expand usage of and access to data monitoring metrics, and develop a baseline level of comfort with automated interventions for adaptive design in a production setting.
The remainder of this appendix discusses several reasonable adaptive design goals, the interventions that would allow the NSCG to achieve those goals, and the monitoring metrics that would inform those interventions. As noted earlier, the 2021 Experiment will be structured largely the same as the earlier experiments, so the goals listed below are consistent with goals identified for earlier experiments. The major difference is that, instead of focusing on R-indicators, which only require frame data and response indicators, the selection criteria for interventions in the 2021 NSCG will utilize historical and current response data to intervene on cases that will reduce the RMSE of key survey estimates. However, both R-indicators and the RMSE of key estimates can be used to reduce the risk of nonresponse bias in estimates and balance cost, so this change represents an expanded evaluation of monitoring metrics, without losing sight of our main adaptive design goals.
Goal 1: Balance Sample / Reduce Nonresponse Bias
Sample balancing and reducing nonresponse bias both relate to maintaining data quality in the face of shrinking budgets and falling response rates. Nonresponse bias arises when the outcomes of interest (the survey estimates) for respondents differ from those of nonrespondents. This difference results in a bias because the resulting estimates represent only a portion of the total target population. Surveys often try to correct for this after data collection using weighting, post-stratification, or other adjustments. Adaptive design interventions attempt to correct for nonresponse bias during data collection by actually changing the respondent population to be more balanced on frame characteristics related to response and outcome measures.
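One standard way to formalize this risk (a well-known approximation under a stochastic response model, consistent with the propensity framework of [2]) writes the bias of the unadjusted respondent mean as

\operatorname{Bias}(\bar{y}_r) \;\approx\; \frac{\operatorname{Cov}(\rho, y)}{\bar{\rho}} \;=\; \frac{R_{\rho y}\, S_{\rho}\, S_{y}}{\bar{\rho}},

where \rho is the response propensity, y the survey outcome, \bar{\rho} the mean propensity, R_{\rho y} the propensity-outcome correlation, and S_{\rho} and S_{y} the corresponding standard deviations. The bias vanishes when propensity and outcome are uncorrelated, which is exactly the condition toward which balancing interventions push the respondent pool.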
While discussing R-indicators, Schouten et al. provide reasons why balancing on variables related to response status and outcome variables is desirable: “In fact, we view the R-indicator as a lack-of-association measure. The weaker the association the better, as this implies that there is no evidence that nonresponse has affected the composition of the observed data.” [3] This suggests that “selective forces…are absent in the selection of respondents” out of the sample population [2], and so nonresponse approaches missing at random, reducing the risk of nonresponse bias.
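A minimal, unweighted sketch of the R-indicator computation, R(ρ) = 1 − 2·S(ρ̂), is shown below with synthetic frame data standing in for the NSCG's validated propensity models; the production version would incorporate design weights and the agreed model specification:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)

# Synthetic frame variables standing in for NSCG frame data, plus a response flag.
n = 4_000
X = np.column_stack([
    rng.integers(0, 2, n),    # e.g., an advanced-degree indicator
    rng.normal(45, 12, n),    # e.g., age
])
true_propensity = 1 / (1 + np.exp(-(-1.0 + 0.8 * X[:, 0] + 0.02 * (X[:, 1] - 45))))
responded = rng.random(n) < true_propensity

# Estimate response propensities from the frame variables, as in [2] and [3].
model = LogisticRegression().fit(X, responded)
rho_hat = model.predict_proba(X)[:, 1]

# R-indicator: R(rho) = 1 - 2 * S(rho). Values near 1 indicate balanced
# response; values near 0 indicate strongly selective response.
r_indicator = 1 - 2 * rho_hat.std(ddof=1)
print(f"Estimated R-indicator: {r_indicator:.3f}")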
Interventions: Interventions are used to change the type or quantity of contacts targeted at specific subgroups or individuals. Interventions that will be considered for inclusion in the 2021 Experiment include:
Sending an unscheduled mailing to sample persons;
Sending cases to computer-assisted telephone interviewing (CATI) prior to the start of production CATI nonresponse follow-up (NRFU), to target cases with an interviewer-assisted mode rather than limiting contacts to self-response modes;
Putting CATI cases on hold, to reduce contacts in interviewer-assisted modes, while still requesting response in self-response modes;
Withholding paper questionnaires while continuing to encourage response in the web mode to reduce the operational and processing costs associated with certain groups of cases;
Withholding web invites to discourage response in certain groups of cases, while still allowing these cases to respond using previous invitations;
Sending paper questionnaires to web nonrespondents earlier than the scheduled mail date to provide two modes of self-response rather than one; and
Changing the CATI call time prioritization to increase or decrease the probability a case is called during a specific time.
The interventions ultimately selected for use during data collection may be limited by production data collection decisions. For example, if CATI is not available until Week 8, we cannot send a case to the CATI operation prior to Week 8.
Monitoring Methods:
Root Mean Squared Error of Key Estimates;
R-indicators [2], [3], [4];
Mahalanobis Distance or other distance measure [5];
Response influence [6]; and
Uncertainty/influence of imputed y-values [7].
We used R-indicators in the 2013 and 2015 Experiments and used a modified version of an R-indicator, an individual balancing propensity score, in the 2017 effort. In 2019, we used the R-indicator to compare how the treatment interventions affected representativeness versus the control group. As a metric, R-indicators are useful for measuring response balance and served their purpose as a proof of concept for data monitoring. However, employing more metrics during data collection allows us to assess the usefulness of each monitoring metric and provides more confidence that data collection interventions are targeted in the most efficient way possible. That is, if R-indicators identify subgroups that should be targeted to increase response balance, and another metric (e.g., balancing propensity, response influence, Mahalanobis distance, etc.) identifies specific cases in those subgroups that also are likely to have an effect on nonresponse bias, then we have more confidence that those identified cases are the optimal cases for intervention, from both a response balance and a nonresponse bias perspective.
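As an illustration of case-level targeting with a distance measure, the sketch below ranks nonrespondents by their Mahalanobis distance from the respondent pool on two continuous frame variables. The data are synthetic, and realistic NSCG frame variables (mixed categorical and continuous) would require the generalized distance of [5]:

import numpy as np

rng = np.random.default_rng(11)

# Synthetic continuous frame variables for respondents and nonrespondents.
respondents = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(800, 2))
nonrespondents = rng.normal(loc=[0.7, -0.4], scale=1.2, size=(600, 2))

# Mahalanobis distance of each nonrespondent from the respondent centroid,
# scaled by the respondent covariance matrix.
center = respondents.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(respondents, rowvar=False))
diffs = nonrespondents - center
distances = np.sqrt(np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs))

# Intervene on the nonrespondents farthest from the respondent pool, since
# converting them does the most to rebalance the respondent composition.
targets = np.argsort(distances)[::-1][:100]
print("Ten most imbalanced cases:", targets[:10])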
Goal 2: Increase Timeliness of Data Collection
Analysts and other data users who need relevant, up-to-date information to build models, investigate trends, and write policy statements rely on timely survey data.
Interventions: Interventions will either encourage response to the NSCG earlier than the standard data collection pathway or stop data collection if new respondents are not changing key estimates. This could be achieved by introducing modes earlier than the standard data collection pathway, sending reminders that elicit response more quickly, or stopping data collection for all or a portion of cases and reallocating resources. Possible interventions include:
Sending cases to CATI prior to the start of production CATI NRFU, to target cases with an interviewer-assisted mode rather than limiting contacts to self-response modes;
Sending paper questionnaires to web nonrespondents earlier than the scheduled mail date to provide two modes of self-response rather than one;
Sending email reminders earlier than the scheduled dates in data collection; and
Stopping data collection for the sample or for subgroups given a sufficient level of data quality. For example, we could stop data collection if:
key estimates have stabilized, and standard errors fall within acceptable ranges, or
the coverage ratio for a subgroup of interest reaches a pre-determined threshold.
The interventions ultimately selected for use during data collection may be limited by production data collection decisions. For example, if CATI is not available until Week 8, we cannot send a case to the CATI operation prior to Week 8.
Monitoring Methods:
Propensity to Respond by Modes [8];
Change Point Analysis [9];
Stability of Estimates [10]; and
Coverage Ratios.
Ongoing NSCG research conducted by Chandra Erdman and Stephanie Coffey [8] could inform appropriate times to introduce new modes to cases ahead of the standard data collection schedule. Another possibility involves change point analysis: if the number of respondents per day declines in a given mode, there may be cause to introduce a new mode ahead of schedule. In addition, we will be able to calculate key estimates on a weekly or semi-weekly basis. As a result, we will be able to track the stability of estimates during data collection and identify points when the data collection strategy has peaked, yielding few new responses or responses that largely duplicate information already collected.
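The sketch below illustrates both ideas on synthetic data: a stability check that flags when recent week-over-week changes in a key estimate fall below a tolerance, and a naive single change point search on daily respondent counts. The production analysis would use the richer methods of [9] and [10]; the thresholds here are assumptions:

import numpy as np

rng = np.random.default_rng(3)

# Synthetic weekly values of a key estimate as respondents accumulate.
weekly_estimates = np.array(
    [98_400, 97_100, 96_200, 95_800, 95_450, 95_400, 95_350, 95_340], dtype=float
)

def is_stable(estimates, k=3, tol=0.001):
    """Flag stability when each of the last k week-over-week relative
    changes is below tol (0.1% here; the threshold is tunable)."""
    window = estimates[-(k + 1):]
    rel_changes = np.abs(np.diff(window) / window[:-1])
    return bool(np.all(rel_changes < tol))

print("Key estimate stable:", is_stable(weekly_estimates))

# Naive single change point on daily respondent counts: choose the split that
# minimizes the pooled squared error around the two segment means.
daily_responses = np.concatenate([rng.poisson(40, 20), rng.poisson(12, 20)])
costs = [
    ((daily_responses[:t] - daily_responses[:t].mean()) ** 2).sum()
    + ((daily_responses[t:] - daily_responses[t:].mean()) ** 2).sum()
    for t in range(2, len(daily_responses) - 2)
]
print("Estimated change point at day:", int(np.argmin(costs)) + 2)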
Goal 3: Reduce Cost
Controlling costs is always a survey management goal. More recently, however, “the growing reluctance of the household population to survey requests has increased the effort that is required to obtain interviews and, thereby, the costs of data collection…[which] has threatened survey field budgets with increased risk of cost overruns” [10]. As a result, controlling cost is an important part of adaptive design. By allowing survey practitioners to reallocate resources during the data collection period, surveys can make tradeoffs to prioritize cost savings over other goals.
Interventions: Interventions will be used to encourage survey response via the web while discouraging response in more expensive modes (mail, CATI), or to eliminate contacts that may be ineffective. Possible interventions include:
Putting CATI cases on hold, to reduce contacts in interviewer-assisted modes, while still requesting response in self-response modes;
Withholding paper questionnaires while continuing to encourage response by web to reduce the operational and processing costs associated with certain groups of cases;
Withholding web invites to discourage response from certain groups of cases, while still allowing these cases to respond using previous invitations;
Prioritizing or deprioritizing cases in CATI during certain call times to increase or decrease the probability a case is called during a specific time frame without having to stop calling the case entirely; and
Stopping data collection for the sample or for subgroups if key estimates and their standard errors have stabilized.
The interventions ultimately selected for use during data collection may be limited by production data collection decisions. For example, if CATI is not available until Week 8, we cannot send a case to the CATI operation prior to Week 8.
Monitoring Methods:
Root Mean Squared Error of Key Estimates;
R-indicators;
Mahalanobis Distance or other distance measure;
Response influence;
Uncertainty/influence of imputed y-values;
Stability of estimates; and
Numbers of trips to locating.
The same indicators that are valuable for monitoring data quality also could measure survey cost reduction. If cases are in over-represented subgroups, or have low response influence, we may want to reduce or eliminate contacts on those cases.
In addition, the estimate-stability metrics valuable for increasing timeliness are also valuable for controlling cost. When estimates stabilize and their standard errors fall within acceptable limits for subgroups or the entire survey, new respondents are providing information similar to what we have already collected. If continuing data collection would have little effect on estimates and their standard errors, stopping data collection for all or subgroups of cases would be an efficient way to control costs.
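A minimal stopping-rule check along these lines is sketched below; the subgroups, estimates, and thresholds are all hypothetical placeholders for values NCSES and the Census Bureau would agree upon:

# Hypothetical weekly monitoring snapshot per subgroup: the current estimate,
# its standard error, and its relative change from the prior week.
snapshot = {
    "engineering":   (101_200.0,  950.0, 0.0006),
    "life_sciences": ( 78_400.0, 1400.0, 0.0041),
    "education":     ( 61_800.0,  800.0, 0.0009),
}

SE_LIMIT = 0.015    # SE must fall below 1.5% of the estimate (assumed limit)
CHANGE_TOL = 0.001  # week-over-week relative change below 0.1% (assumed)

for group, (estimate, se, change) in snapshot.items():
    stop = (se / estimate) < SE_LIMIT and change < CHANGE_TOL
    action = "stop data collection" if stop else "continue"
    print(f"{group:>14}: SE ratio {se / estimate:.4f}, change {change:.4f} -> {action}")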
Another potential cost-saving intervention would be to limit the number of times a case can be sent to locating. If we have no contact information for a case, or previously attempted contact information has not been useful for obtaining contact, a case is sent to locating, where researchers attempt to identify new, more up-to-date contact information. This operation can be time intensive, especially for cases repeatedly sent to locating. We could track the number of times a case is sent to interactive locating, or the length of time it spends in locating. Cases repeatedly sent to locating, and cases that spend a large amount of time being researched, may never become productive cases. Reallocating effort from these cases to cases that have been in locating fewer times may be a sensible cost-saving measure, allowing us to attempt contact on more cases rather than spending large amounts of time (and money) on the same cases.
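A simple tracking rule of this kind is sketched below using a hypothetical locating log; the caps on trips and cumulative days are illustrative, not established business rules:

import pandas as pd

# Hypothetical locating log: one row per trip a case makes to locating.
log = pd.DataFrame({
    "case_id": [101, 101, 101, 102, 103, 103, 104, 101],
    "days_in_locating": [4, 6, 9, 2, 3, 5, 1, 11],
})

MAX_TRIPS = 3        # assumed cap on trips to locating
MAX_TOTAL_DAYS = 21  # assumed cap on cumulative locating time

per_case = log.groupby("case_id")["days_in_locating"].agg(
    trips="count", total_days="sum"
)
flagged = per_case[
    (per_case["trips"] > MAX_TRIPS) | (per_case["total_days"] > MAX_TOTAL_DAYS)
]
print(flagged)  # cases whose remaining locating effort would be reallocated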
Adaptive Design Data Collection Intervention Schedule and Intervention Criteria
To provide insight on the way that adaptive design criteria will be applied in the determination of interventions for the 2021 NSCG adaptive design experiment, NCSES is submitting a table documenting the adaptive design intervention schedule and criteria (Table H.1.).
All sample cases will be monitored beginning at week 0. Adaptive interventions will be reviewed and implemented as needed at weeks 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 23, and 24 of the data collection period. As part of the adaptive design experiment, we have identified certain adaptive interventions that may be implemented, depending upon the case monitoring results, to help the NSCG meet its data collection goals. The decision to implement an adaptive intervention will be based on the evaluation of specific criteria associated with the data collection metrics. These criteria are described generally below, and the specifics are provided in Table H.1.
The approach for the 2021 NSCG adaptive design experiment is to use predictive models that estimate the RMSE of several survey outcome variables, so that interventions can focus on reducing cost for a relatively stable RMSE. Thus, these interventions focus directly on the quality of key survey outcomes rather than balancing across frame variables, which is the goal of R-indicators. Key survey estimates beyond self-reported salary, which was the target estimate used in 2019, will be agreed upon by Census and NCSES, and then models to estimate those additional key outcomes will be developed. Models will be finalized by December 2020 to ensure that the intervention methods selected have been jointly reviewed and agreed upon before data collection begins.
At the same time, we do not want to ignore R-indicators or their underlying propensity models, as they have been valuable monitoring tools in the NSCG. The propensity models underlying the NSCG R-indicators have been validated over the past three data collection cycles and include variables that are highly correlated with survey outcomes. This use of the propensity models in multiple NSCG cycles gives us confidence that they provide some context about the data quality of the respondent population, even if the metric itself is only a proxy for nonresponse bias. Therefore, R-indicators will still be used to evaluate the effect of RMSE-based interventions on overall sample balance. In other words, while we will not be using the R-indicator models to make interventions, as we have in the past, they will serve as a second piece of information about the data quality of the treatment group.
The interventions considered for a given week are designed to result in the largest cost savings while keeping the RMSE of key survey outcomes relatively low (i.e., small increases). This means that, generally, we do not want to apply an expensive data collection feature (like telephone calls) to a case unless we predict the case is more likely to respond to the more expensive feature than a less expensive feature (like a web invite). At each intervention point, we will be examining both cost and response properties of different data collection features (like sending cases to CATI early or withholding mailed reminders). However, because the NSCG has a sequential design, there are also overarching cost and response properties that will be kept in mind.
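The sketch below illustrates this decision rule with hypothetical feature-specific response propensities. The condition stated above (escalate only when the predicted propensity under the expensive feature exceeds that under the cheaper one) is shown, along with an optional cost-aware margin that is purely an assumption:

import numpy as np

rng = np.random.default_rng(21)

# Hypothetical per-case predicted response propensities under two features.
n = 1_000
p_web = rng.uniform(0.02, 0.30, n)   # propensity under another web invite (cheap)
p_cati = rng.uniform(0.05, 0.45, n)  # propensity under early CATI calling (expensive)

# Condition stated in the text: apply the expensive feature only when the
# case is predicted to be more likely to respond to it than to the cheap one.
escalate = p_cati > p_web

# An optional cost-aware variant (an assumption, not the stated rule):
# require the propensity gain to justify the extra cost via a fixed margin.
margin = 0.10
escalate_strict = p_cati > p_web + margin

print(f"Escalated to early CATI: {escalate.sum()} of {n}")
print(f"Escalated under the cost-aware margin: {escalate_strict.sum()} of {n}")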
The list of potential interventions for each week is shown in Table H.1., which includes information about metrics and criteria used. Additionally, a flowchart view of the potential data collection interventions illustrates which interventions are available each week.
References:
[1] Coffey, S. (2014, April). “Report for the 2013 National Survey of College Graduates Methodological Research Adaptive Design Experiment.” Census Bureau Memorandum for NCSES.
[2] Schouten, B., Cobben, F., Bethlehem, J. (2009, June). “Indicators for representativeness of survey response.” Survey Methodology, 35.1, 101-113.
[3] Schouten, B., Shlomo, N., Skinner, C. (2011). “Indicators for monitoring and improving representativeness of response.” Journal of Official Statistics, 27.2, 231-253.
[4] Coffey, S., Reist, B., White, M. (2013). “Monitoring Methods for Adaptive Design in the National Survey of College Graduates (NSCG).” 2013 Joint Statistical Meeting Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association.
[5] de Leon A.R., Carriere K.C. (2005). “A generalized Mahalanobis distance for mixed data.” Journal of Multivariate Analysis, 92, 174-185.
[6] Särndal, C., Lundström, S. (2008). “Assessing auxiliary vectors for control of nonresponse bias in the calibration estimator.” Journal of Official Statistics, 24, 167-191.
[7] Wagner, J. (2014). “Limiting the Risk of Nonresponse Bias by Using Regression Diagnostics as a Guide to Data Collection.” Presentation at the 2014 Joint Statistical Meetings. August 2014.
[8] Erdman, C., Coffey, S. (2014). “Predicting Response Mode During Data Collection in the NSCG.” Presentation at the 2014 Joint Statistical Meetings. August 2014.
[9] Killick, R., Eckley, I. (2014). “Changepoint: An R Package for Changepoint Analysis.” Downloaded from http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf on August 8, 2014.
[10] Groves, R.M., Heeringa, S. (2006). “Responsive design for household surveys: tools for actively controlling survey errors and costs.” Journal of the Royal Statistical Society Series A: Statistics in Society, 169, 439-457.
Table H.1. Potential Intervention Points
Week | Production Operation Description | Adaptive Design Interventions | How to determine whether to intervene using RMSE as the quality metric | Other contributing factors
1 | Week 1 Web Invite, Incentives (If Appropriate) | No interventions. | N/A | N/A
2 | Week 2 Reminder, Questionnaire Mailing (If Mail Preference) | No interventions. | N/A | N/A
4-23 | Production operation varies depending on the data collection week | Activating cases in CATI early or taking cases off hold in CATI | If simulations show that sending a case to CATI early will reduce the RMSE without increasing the cost beyond predefined limits. | If the number of cases selected for this intervention is very large and we do not want to move the full set of cases to CATI early, examine response propensities for these cases and move over the cases with a higher CATI-specific response propensity.
4-23 | Production operation varies depending on the data collection week | Putting cases in CATI on hold | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
4-23 | Production operation varies depending on the data collection week | Sending an off-production-path questionnaire | If simulations show that sending questionnaires to a subset of cases will reduce the RMSE without increasing the cost beyond predefined limits. | If the number of cases selected for this intervention is very large and we do not want to send an additional questionnaire to a large group of cases, examine response propensities for these cases and move over the cases with a higher mail-specific response propensity.
5, 6, 12 | Weeks 5, 6, 12 Reminder Email | Withhold email reminder | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
5 | Week 5 Reminder Letter | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
6 | Week 6 Reminder Postcard | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
8 | Week 8 Questionnaire with Web Invite | Replace questionnaire with web invite | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
10 | Week 10 Reminder Email | Withhold email reminder | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
12 | Week 12 Pressure-Sealed, Perforated Reminder (Start of CATI) | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
16 | Week 16 Postcard Reminder | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
18 | Week 18 Web Invite (Prior Round Respondents) | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
18 | Week 18 Questionnaire with Web Invite (Prior Round Nonrespondents) | Replace questionnaire with web invite | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
20 | Week 20 Web Invite, New Sample, Priority Envelope, Questionnaire | Replace questionnaire with web invite | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
23 | Week 23 Last Chance Email | Withhold email reminder | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.
23 | Week 23 Web Invite | Withhold reminder contact | If simulations show that cost savings can be obtained without increasing the RMSE beyond predefined limits. | If predictions of key estimates of interest have not stabilized in the experimental group, or if reducing effort will not save costs at week X, we may not use this intervention.