OMB Control No. 0925-0499
Supporting Statement B
Title of Information Collection:
The Second National Survey to Evaluate the Outcomes of the NIH SBIR Program (OD)
This information collection will employ statistical methods. The following sections describe the statistical methods and the reasons for their use.
The unit of study for this survey is the awardee small business named in the Notice of Grant Award. The award is a single NIH/SBIR grant that will be referenced at the top of the questionnaire. The awardee is the small business that received the grant. The research and development undertaken and supported by the award will be considered the project. The study period for this evaluation is fiscal years 2002 through 2006, inclusive. The response rate goal for the information collection is 80 percent.
The awardee companies that received NIH/SBIR awards during the study period will constitute the universe, or sampling frame (the list or database used in the first stage of sampling), from which the sample of awards will be selected (the second stage of sampling). For awardee companies that received more than one Phase II award during the study period, the survey contractor will randomly select one award to serve as the reference for the survey. This order of selection (as opposed to simply selecting awards at random) ensures that each awardee company has an equal, quantifiable probability of selection, namely a probability of one. Each awardee appears in the sampling frame only once, so every company is represented exactly once in the resultant sample. In effect, this procedure is a census of all awardee companies, with a random selection of awards only for those awardees that received more than one award during the study period.
During fiscal years 2002 through 2006, NIH awarded 1,770 Phase II SBIR awards to 1,037 unique awardee small businesses. Thus, the sampling frame contains 1,037 sample members or awardee companies (for the first stage of sampling). From each of the 329 companies with multiple awards, the contractor will select a random reference award—one award per awardee (for the second stage of sampling).
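The selection logic is straightforward to express in code. The minimal Python sketch below assumes the frame is a list of (company, award) pairs drawn from the NIH/SBIR award database; the field names and example identifiers are illustrative, not the contractor's actual schema.

```python
import random

def select_reference_awards(frame, seed=None):
    """Census of awardee companies (first stage) with one randomly
    chosen reference award per company (second stage)."""
    rng = random.Random(seed)
    awards_by_company = {}
    for company, award in frame:
        awards_by_company.setdefault(company, []).append(award)
    # Every company enters the sample exactly once (selection probability of one);
    # companies with multiple Phase II awards get a single random reference award.
    return {company: rng.choice(awards)
            for company, awards in awards_by_company.items()}

# Illustrative frame: one company with two awards, one with a single award
sample = select_reference_awards(
    [("SB-001", "award-A"), ("SB-001", "award-B"), ("SB-002", "award-C")],
    seed=42)
print(sample)  # exactly one reference award per company
```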
The principal investigator (PI), the company official who signed the application for the NIH/SBIR award, or another qualified individual at each selected company will serve as the survey spokesperson for the awardee respondent. The advance survey email will be addressed to the principal investigator and will provide guidance on selecting the most appropriate spokesperson if the PI is no longer that individual.
Because the sampling frame contains 1,037 sample members, we plan to conduct a census of all awardees in the study period, with sampling used only to select a single Phase II award as the focus of the survey for awardees that received more than one Phase II award. We anticipate that the achieved sample (those awardees that participate in the survey) will number about 704. A sample of roughly 1,000 is a standard size for national surveys; at that size, the margin of error is about 3 percent at the 95 percent confidence level.
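As a check on the quoted figure, the standard simple-random-sample formula for a proportion (at its worst case, p = 0.5) can be computed directly. Note that this simple formula ignores the finite-population correction, which would shrink the margin further for a near-census of 1,037 companies.

```python
import math

# MOE = 1.96 * sqrt(p * (1 - p) / n) at the 95% confidence level, worst case p = 0.5
for n in (1000, 704):
    moe = 1.96 * math.sqrt(0.25 / n)
    print(f"n = {n}: margin of error ~ {moe:.1%}")
# n = 1000: ~3.1% (the quoted figure); n = 704 (anticipated achieved sample): ~3.7%
```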
The rationale for focusing on only one award per awardee company is to minimize respondent burden. Preliminary analyses of the sampling frame show that 68% of the awardees have received a single award during the study time period, and 32% have received more than one award. Thus, approximately one-third of the awardee respondents would have to complete multiple surveys (one for each Phase II award) if the sampling procedure did not restrict the number of awards per awardee to one.
The sampling plan is to randomly select the Phase II award that would serve as the focus of the survey for those awardees with multiple awards. Because of random selection, the achieved sample of awards should be representative of all Phase II awards granted during the study period.
It is customary to define usability and eligibility criteria for the sampling units to ensure a well-defined and uniform group of respondents. Usability typically refers to the workability of sample contact information. Eligibility refers to the criteria for inclusion in the sample.
Usable sample units are defined in terms of the ability to make contact with a sample member. For this survey, the expectation (based on the first survey) is that it will be possible to contact about 85% of the awardee companies, although doing so may sometimes require extensive tracking efforts. If any awardees are no longer in business, they will be deemed unusable sample units.
Eligible sample units are defined in terms of characteristics of the awardee companies:
Small business companies winning NIH SBIR Phase II awards during fiscal years 2002 through 2006 inclusive
Companies currently located within the U.S.A. (50 states)
Eligible spokespersons for the sample units are defined in terms of the people who will complete the survey:
Respondents within companies who are the principal investigator on the NIH/SBIR award application, the company official on the NIH/SBIR award application, or another individual qualified to be the respondent (such as a current investigator or current company official who has replaced the one named on the application)
Respondents residing within the U.S.A.
Respondents capable of responding in English (which may be known a priori or determined at a later time)
In this survey, the universe of awardee companies includes 1,037 small businesses. To estimate the size of the achieved sample, we need to make some reasonable assumptions about the usability and eligibility of the universe of awardee companies. The best assumptions are based on the findings from the 2002 survey.
14.8% of the sampling units may not be usable (that is, may be out of business or non-contactable even with extensive tracking)
0.4% of the usable sampling units may not be eligible to participate in the survey (that is, there may be no eligible small business or no eligible respondent; see the earlier paragraphs on usability and eligibility)
20% of the usable and eligible sample members will refuse to respond to the survey, despite repeated contact efforts
The first two of these items are assumptions about the sample universe and are based on the findings in 2002. The third item is a goal—we anticipate that, with multiple contact attempts and efforts, we will motivate 80% of the sample members to cooperate with the survey.
Using these assumptions and the response rate goal, we can calculate the anticipated size of the achieved sample (a short arithmetic sketch follows the figures below):
Initial sample: 1,037
Usable sample (85.2% of initial): 884
Eligible sample (99.6% of usable): 880
Responsive sample (80% of eligible): 704
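The arithmetic behind this projection is a simple chain of proportions; the following minimal Python sketch reproduces the figures from the stated assumptions.

```python
initial = 1037                         # universe of awardee companies
usable = round(initial * 0.852)        # 85.2% usable            -> 884
eligible = round(usable * 0.996)       # 99.6% of usable         -> 880
responsive = round(eligible * 0.80)    # 80% response rate goal  -> 704
print(usable, eligible, responsive)    # 884 880 704
```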
Thus, we anticipate that 704 awardees will participate in this survey. This expectation is based on sampling statistics from 2002, which could have changed slightly over time. It also assumes fielding conditions similar to those of the 2002 survey, such as (1) scheduling the fielding at optimal times (fall or winter/spring), (2) the assistance of individuals who can promote awardee cooperation (such as IC SBIR Program directors and similar others), and (3) comparable environmental, economic, and social conditions (no markedly atypical national events).
Response Rate
The response rate goal for this survey is 80%. This means that we anticipate that approximately 704 sample members will complete the survey. The numerator in the response rate calculation is the 704 respondents. The denominator is the approximately 880 usable and eligible sample members (see the earlier discussion of usability and eligibility).
Each sample unit can be accurately classified only during the survey field period; its status is not known ahead of time. For example, it may not be possible to classify a non-responding awardee business until the telephone follow-up with nonresponders. Only then might we learn that the company is no longer in business (unusable sample), that the company did not win an NIH/SBIR Phase II award during fiscal years 2002 through 2006 (ineligible sample, perhaps because of an inaccuracy in the NIH/SBIR database), or that the principal investigator does not want to participate in the survey (non-responding sample unit).
Sometimes, we learn some of this information earlier in the field period, when (for example) the advance email is returned with an explanation. Subsequent tracking efforts may find that the PI's email address is incorrect (usable sample, once we track down the correct email address), that the company is out of business (unusable sample), or that the company has moved (usable sample, once we learn the new address). Ongoing survey procedures attempt to achieve a final disposition (classification) for each sample unit. Only then is it possible to calculate the final response rate, using the standard procedures codified by CASRO (the Council of American Survey Research Organizations).
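As we understand the CASRO convention, sample units of unknown eligibility are allocated in proportion to the eligibility rate observed among units whose status is known. The sketch below illustrates the calculation; the counts are illustrative, chosen to match the projections above, and do not prejudge the final dispositions.

```python
def casro_response_rate(completes, eligible_nonrespondents, ineligible, unknown):
    """CASRO rate: completes over known eligibles plus the estimated
    eligible share of units whose eligibility is still unknown."""
    known_eligible = completes + eligible_nonrespondents
    eligibility_rate = known_eligible / (known_eligible + ineligible)
    return completes / (known_eligible + eligibility_rate * unknown)

# 704 completes, 176 eligible nonrespondents, 4 ineligibles, no unknowns
print(f"{casro_response_rate(704, 176, 4, 0):.0%}")  # 80%
```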
The survey contractor will collect data (using the survey and from NIH/SBIR databases) to serve as background variables, sometimes called explanatory or independent variables, and analysis variables, also called dependent variables. Differences in background variables among awardees may explain differences in analysis variables (award outcomes).
The survey will include questions to capture various types of award outcomes, including products, services, processes, publications, patents, and collaborations related to the detection, diagnosis, treatment, and prevention of disease (such as drugs, drug delivery systems, medical devices, assays, research tools, behavioral and other interventions, changes in number of employees, and educational materials).
The basic hypothesis in this study is that NIH/SBIR awards to small business companies produce positive outcomes in the detection, diagnosis, treatment, and prevention of disease (products, processes, and/or services) whose success may be measured as increases in technological innovation, commercialization, medical and societal benefits, intellectual property and the knowledge base, growth of small businesses, and furthering of the overall NIH mission. The null hypothesis is that the NIH/SBIR awards do not produce positive outcomes.
Secondary hypotheses may be postulated to explain any differences found among award outcomes. For example, the sponsoring institute might be an explanatory factor for differences in outcomes among awardees. Alternatively, the type of product, process, or service, and any related FDA requirements, might be an explanatory factor, as might the date or amount of the NIH/SBIR award.
The following sections explain the steps in the information collection. We have carefully planned these procedures to maximize the response rate for this survey, with the goal of obtaining an 80% response rate. Achieving such a high response rate requires a multi-faceted survey approach and strong follow-up efforts with nonresponders.
Securing Missing Contact Information
NIH will employ researchers to secure and/or confirm missing or outdated contact information for any potential respondents (principal investigators or business officials) for whom accurate email addresses and/or telephone numbers are not on file. They will use online and criss-cross directories and/or national telephone company information services to secure current, valid email addresses and telephone numbers.
Sending the Advance Email Letter
NIH and the survey contractor will finalize an advance email letter that advises awardees that they have been selected for participation in the survey and that elicits their cooperation. The contractor will personalize each email letter. The NIH SBIR/STTR Program Coordinator, Ms. Jo Anne Goodnight, will sign the emails using a digitized signature.
The advance email letter will ask potential respondents to review their current or preferred business email address and telephone number and, if any of the contact information needs updating, to email or call the survey contractor. (Please see Attachment C.2, Introductory and Follow-up Emails to Respondents.)
Tracking Bad Email Addresses
Beginning on Survey Day 8 (business days), NIH will employ researchers to locate current, valid email addresses and telephone numbers for all potential respondents whose advance emails have been returned (“bounced”). If the researchers’ efforts are unsuccessful, they will call the awardee business to attempt to (1) learn the identity of the eligible spokesperson, (2) secure the awardee spokesperson’s current business email address, and (3) determine the status of any “lost” awardee companies, such as those that are no longer in business.
Email Cover Letters, Reminders
On Survey Day 15, the survey contractor will email an initial cover letter (message) providing background information and instructions, the link to the online survey, and the awardees’ unique usernames and passwords for accessing the survey. On Survey Day 22, the contractor will send email reminder/thank-you messages to all potential respondents.
On Survey Days 22 and 35, the survey contractor will send follow-up emails containing a reminder and explanatory information similar to that in the first email. The messages will provide an 800 telephone number for respondents who prefer to call for additional explanations or for technical assistance with the log-on process. (Please see Attachment C.2, Introductory and Follow-up Emails to Respondents.)
Online Survey
The survey contractor will oversee programming and installation of the online implementation of the survey. The implementation will use SSL encryption technology and password access to provide security. The survey will use the same text—questions and response categories—as the successful 2002 survey, with only minor modifications suggested by respondents to that survey. (Please see Attachment C.1, Data Collection Instrument.)
The computer program will determine the flow of questions that each respondent reads and answers based on his or her earlier responses. The online survey will automatically perform edit checks, such as checking responses for permitted ranges and logical consistency. The text boxes for “Other (Specify):” and open-ended questions will shrink and grow as necessary. Respondents will be able to stop and resume the survey later.
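As one illustration of the kind of edit checks described above, the sketch below validates a hypothetical set of responses for a permitted range and for logical consistency; the field names and limits are invented for illustration and are not taken from the actual instrument.

```python
def validate(responses):
    """Return a list of edit-check failures (an empty list means the responses pass)."""
    errors = []
    # Range check: employee counts must fall within a permitted range
    if not 0 <= responses.get("employees_now", 0) <= 10_000:
        errors.append("employee count outside permitted range")
    # Consistency check: reported sales require a product on the market
    if responses.get("sales_to_date", 0) > 0 and responses.get("product_on_market") != "yes":
        errors.append("sales reported but no product on the market")
    return errors

print(validate({"employees_now": 12, "product_on_market": "yes",
                "sales_to_date": 250_000}))  # [] -- passes both checks
```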
NIH and the survey contractor will test the online implementation of the survey before it is put into use. They will look for “bugs,” such as incorrect skip patterns, entry of unallowed values, and similar errors. The survey contractor’s technical personnel will also test the survey, specifically with regard to “hacking” (unauthorized access) and similar security aspects of the implementation.
Telephone Follow-up for Nonresponders
NIH will employ a professional telephone contractor to conduct telephone follow-up with nonresponders. The survey contractor will write and produce briefing and training materials specific to this survey and, if desired, in combination with NIH, will conduct briefing and training sessions for the telephone interviewers and other personnel working on this survey. The survey contractor will monitor actual interviews from its facilities or remotely via telephone. Interviewers who do not perform acceptably will be removed from work on the survey.
The telephone interviewers will attempt to contact all potential respondents who have not yet responded to the survey. After contact is made, the interviewers will attempt to elicit a commitment from the potential respondents to complete the online survey.
Multiple procedures and methods contribute to maximizing response to this information collection and dealing with nonresponse. They fall into the general categories of (1) survey implementation, (2) security and confidentiality practices, and (3) multi-faceted survey methodology.
Prior experience confirms that an online survey appeals to scientific respondents because of its convenience, ease and immediacy of response, and minimal burden. (Please see Section A.3, Use of Information Technology and Burden Reduction.) Because of this and the extremely successful survey response in 2002, NIH selected this method of survey implementation to enhance response in this second collection.
Numerous assurances of security and confidentiality practices also help maximize response. When respondents believe that the information they supply will be secure and held confidential to the full extent permitted by law, they tend to feel more confident and are more cooperative. Respondents will see various assurances of security and confidentiality throughout the survey process.
They will see the OMB control number and instructions about response requirements. Next, they will note the password access to the online survey implementation, and they will see the “padlock” symbol (representing SSL encryption technology) at the bottom right of their computer screen when they access the survey.
It is rarely possible to achieve a high response rate to a survey using only a single approach. There usually need to be successive follow-up efforts with nonresponders. Thus, NIH has planned a survey implementation that uses both multiple and different approaches to eliciting cooperation. (Please also see Section B.2, Procedures for the Collection of Information.) The survey implementation includes:
Extensive efforts to confirm or obtain current contact information for all potential respondents
Advance personalized emails advising awardees of the upcoming survey and of the reasons for and importance of the data collection, and eliciting confirmation of email addresses
Names and telephone numbers of NIH personnel and the survey research coordinator to contact for additional information, confirmation of authenticity, and technical assistance
Cover email letters supplying the location of the online survey and containing unique access usernames and passwords
Email reminders to nonresponders, emphasizing the importance of the survey and the ultimate usefulness to them of the data and any concomitant changes
Telephone follow-up contacts with nonresponders to elicit promises of response to the online survey
Email thank-you messages
NIH anticipates a high response because of these extensive efforts and the saliency of this survey for small businesses engaged in federally funded research.
There are two aspects to dealing with nonresponse. The first is to maximize response so that a nonresponse problem never arises. When a survey achieves a high response rate, there is little reason to suspect bias in the data due to nonresponse. Conversely, when a survey achieves a low response rate, one may wonder whether the data fully represent the universe that was surveyed or that the sample represents. For example, were this data collection to achieve a low response rate, NIH might be concerned that the data do not fully represent those small business respondents who have not commercialized the results of their funded project, or who have not produced any noncommercial successes (such as technological innovations, other medical or societal benefits, or contributions to the knowledge base). Thus, it is important to maximize the response rate to this survey.
The second aspect of dealing with nonresponse is what one must do when the response rate to a survey is not high. One must attempt to evaluate the data for nonresponse bias, and then to adjust or correct the data using sample balancing (weighting) procedures. This second approach to nonresponse is obviously less desirable since it introduces approximations into the data. (In general, weighting techniques assume that the answers of a particular subgroup of respondents who have not responded adequately to the survey are similar to those of the corresponding subgroup of respondents who have responded.) Thus, it is again apparent that maximizing the response rate is the desired course of action. The previous section (Section B.3, Methods to Maximize Response Rates and Deal with Nonresponse) describes the multifaceted approach of the survey implementation that is targeted at achieving an 80 percent response rate.
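One simple form of sample balancing is cell weighting: within each subgroup, respondents are weighted up to the subgroup's share of the eligible sample. The sketch below uses invented subgroup labels and counts purely for illustration.

```python
# Eligible and responding counts per subgroup (illustrative numbers only)
eligible_counts = {"institute A": 220, "institute B": 180, "all others": 480}
respondent_counts = {"institute A": 180, "institute B": 140, "all others": 384}

weights = {group: eligible_counts[group] / respondent_counts[group]
           for group in eligible_counts}
for group, w in weights.items():
    # Underrepresented subgroups receive weights above 1, so the weighted
    # respondent pool mirrors the eligible sample's composition.
    print(f"{group}: weight {w:.3f}")
```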
The analysis procedures for this data collection have the following goals.
Evaluate the data for possible response bias and for erroneous or anomalous data values, and, as necessary, adjust the data to correct for them
Examine data displays to determine possible answers to the study questions that this data collection seeks to answer
Analyze data to look for associations and correlations, explanatory variables, possible trends, and multivariate relationships that explain the answers to the study questions
The following sections explain analysis procedures, methods, and statistical tests that may be used in meeting these goals. The specific statistical procedures, methods, and tests employed depend, of course, on the characteristics and quality of the collected data.
The survey contractor’s statistician will produce univariate frequency distributions and bivariate cross tabulations to examine and evaluate the data. First, the statistician will look for possible response bias by evaluating whether the group of respondents differs in significant ways from the universe of all eligible potential respondents. For example, she or he might look for disproportionate response between small businesses funded by different institutes, or possibly between those located in western and eastern regions of the country. Achieving a high response rate in this data collection will greatly reduce the likelihood of response bias and make this step less important or possibly unnecessary.
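One common way to operationalize this check is a chi-square goodness-of-fit test of the respondents' distribution across a background variable against the distribution in the eligible universe. A minimal sketch, with invented proportions and counts:

```python
from scipy.stats import chisquare

universe_shares = [0.25, 0.20, 0.55]   # eligible-sample shares across three subgroups
respondents = [190, 130, 384]          # observed respondent counts (total 704)

expected = [share * sum(respondents) for share in universe_shares]
stat, p = chisquare(respondents, f_exp=expected)
# A small p-value would suggest disproportionate response across subgroups
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```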
Second, the statistician will examine the frequency distributions and cross tabulations for outliers—erroneous or anomalous data values. Such values can result from entry errors (a respondent enters “55” instead of “5”), misinformation (a respondent remembers sales of about $5,000 instead of the actual amount of $8,000), misunderstanding (a respondent enters a monthly amount instead of an annual total, as instructed), and other similar actions. The statistician can sometimes find these kinds of errors if the data values are outliers (values at the extremes of the data distributions), or if the data values are inconsistent with other data for that investigator. Often, a respondent can be contacted again, and he or she can confirm or correct the outlying or inconsistent value. Other times, this is not possible, and the statistician may elect to delete an outlier, replace it with its subgroup’s mean value, and/or exclude the outlier from specific calculations.
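A common first pass at flagging such values is the interquartile-range rule sketched below; the statistician may of course apply different criteria or inspect the distributions visually.

```python
import numpy as np

def flag_outliers(values, k=1.5):
    """Flag values more than k interquartile ranges outside the middle half."""
    q1, q3 = np.percentile(values, [25, 75])
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]

print(flag_outliers([5, 6, 4, 7, 5, 55]))  # [55] -- the "55 instead of 5" entry error
```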
Next, the statistician will produce data displays with and without subgroups, create or transform variable values, or in other ways show the data in meaningful displays. (Please see Section A.16, Plans for Tabulation and Publication and Project Time Schedule, for the types of data displays likely to be used.) The goal is to look for possible answers to the study questions that this data collection is intended to answer. (Please see Section A.2, Purpose and Use of the Information, to note the types of information to be collected and the study questions that this information should answer.) For example, the statistician may display many variables grouped by institute to learn if awardees funded by different institutes have comparable levels of success commercializing the results of their projects, or she or he may display them grouped by year of award to see if awardees’ relative successes are time dependent.
The statistician may display variables grouped by the amount of the award or by the number of related SBIR awards that the small business awardee has received to discern if differences in commercialization exist among subgroups of small businesses. For example, ranked displays grouped by research fields may indicate what types of successes are most common in different fields.
After finding possible answers to the study questions, the statistician will analyze the data to attempt to locate meaningful explanations for some findings. For example, if there are differences in contributions to intellectual property, the statistician will probe further to learn if those differences are functions of the small business’s field of business, or if they are explained in part by other variables, such as the amount of the SBIR award, the number of elapsed years since receipt of the award, and other similar possible explanatory variables.
The statistician will use various procedures, methods, and statistical tests to locate meaningful explanations for different findings. These include the following (a brief illustrative sketch appears after the list).
Comparisons of means, proportions, and variation, with tests and calculations (using the t, normal, chi-square, and F distributions, as appropriate) for significant differences, standard errors, and confidence intervals
Calculations of correlations for continuous and rank-ordered data (Pearson’s r, Spearman’s rho, phi, Kendall’s tau, W, and K, as appropriate) and tests for significant correlations (t, normal, chi-square, McNemar, and Fisher’s z and p distributions, as appropriate)
Bivariate and multivariate regression analysis and analysis of variance to predict variables that depend on one or more other variables, and to partition the variance among subgroups into parts assignable to other variables or to interactions among variables
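The sketch below illustrates, on invented data, a few of the tests and calculations listed above using standard SciPy routines; it is a demonstration of the techniques, not the contractor's analysis code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
award_amount = rng.uniform(500, 1500, size=50)            # award size in $K (invented)
sales = 0.8 * award_amount + rng.normal(0, 200, size=50)  # outcome variable (invented)
group_a, group_b = sales[:25], sales[25:]

t_stat, t_p = stats.ttest_ind(group_a, group_b)    # comparison of means (t test)
r, r_p = stats.pearsonr(award_amount, sales)       # Pearson's r for continuous data
rho, rho_p = stats.spearmanr(award_amount, sales)  # Spearman's rho for ranked data
reg = stats.linregress(award_amount, sales)        # bivariate regression

print(f"t = {t_stat:.2f} (p = {t_p:.3f}); r = {r:.2f}; rho = {rho:.2f}; "
      f"slope = {reg.slope:.2f} (p = {reg.pvalue:.3g})")
```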
NIH consulted with Humanitas, Inc. on the design and implementation of the first and second surveys and the analysis of the survey data. Humanitas implemented the 2002 survey and oversaw all aspects of that information collection and evaluation process. Ms. Stephanie Karsten, Vice President of Humanitas, is Officer in Charge and Project Director. Ms. Lynne Firester, Survey Specialist and Statistician, designed the survey implementation and analysis plan. To contact these people, please telephone Humanitas offices at 301-608-3290.