SF-83 SUPPORTING STATEMENT
FOR
SURVEY OF GRADUATE STUDENTS AND POSTDOCTORATES IN SCIENCE, ENGINEERING AND HEALTH
Survey years fall 2008 - fall 2010
LIST OF ATTACHMENTS
Attachment 1: 2007 GSS Web Survey Screen Shots
1.1 Part 1: Pages 1-10 (NSF Form 811)
1.2 Part 2: Pages 11-18 (NSF Form 812)
1.3 Page 19: Changes for 2007
1.4 Pages 20-27: Complete List of GSS Eligible Fields and Codes
1.5 Pages 28-31: Revisions to the 2007 GSS Codes and Programs
1.6 Pages 32-42: GSS Survey CIP–GSS Code Crosswalk by CIP
1.7 Pages 43-58: Glossary
Attachment 2: Paper worksheet. Part 2 - 2008 Graduate Student Survey (NSF Form 812) in Portrait Format
Attachment 3: 2008 Federally Funded Research and Development Centers (FFRDCs) Survey
Attachment 4: Summary of Changes made to the GSS since the last (2005) OMB clearance
Attachment 5: First Federal Register Notice - 2008 GSS-2010 Graduate Student Survey
Attachment 6: Tests of Procedures Used (references in Section B.4)
6.1 Identifying the Intended Navigational Path of an Establishment Survey
6.2 GSS Usability and Cognitive Interviews
6.3 Summary of GSS User Meetings
6.4 2008 GSS Pilot Studies
6.5 Process for Determining GSS-Relevant CIP Codes
Attachment 7: Background on Response Burden Calculations
Attachment 8: GSS Project Schedule
Supporting Statement
Survey of Graduate Students and Postdoctorates in Science and Engineering
2008 through 2010
A. Justification
This submission requests a three-year reinstatement of the previously approved OMB clearance for the National Science Foundation’s and National Institutes of Health’s Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS). The GSS is an annual survey last conducted in fall 2007. The OMB clearance for the GSS will expire on July 31, 2008. With this clearance package, NSF requests approval to collect these data for the 2008-2010 survey cycles.
The GSS is the only annual national survey that collects information on the characteristics of graduate science, engineering and health (SEH) enrollment for specific disciplines at the departmental level. It also collects information on race/ethnicity, citizenship, sex, sources of support, and type of support for graduate enrollment; information on postdoctorates by citizenship, sex, and sources of support; and counts of nonfaculty research staff with doctorates. The GSS has been conducted by the NSF’s Division of Science Resources Statistics (SRS) annually since 1972. Additional financial support is provided for the GSS by the NIH and the U.S. Department of Energy (DOE).
The GSS is a census of all eligible academic institutions and all departments in science, engineering and health fields in the U.S. with post-baccalaureate programs. To improve coverage of postdoctorates, in 2008 the GSS will also survey Federally Funded Research and Development Centers (FFRDCs) to gather information on the race/ethnicity, sex, citizenship, and source of support of postdoctoral appointees, along with counts of postdoctoral appointments. (See Attachment 3 for the FFRDC questionnaire.)
The GSS consists of two parts. Part 1, NSF Form 811, is a prepopulated Web listing of eligible “organizational units,” defined as departments, programs, research centers and health-care facilities known to exist in the previous GSS survey cycle. The School Coordinators are asked to verify the list in preparation for sending out the second part of the GSS. Part 2, NSF Form 812, is the data collection worksheet asking for the counts of graduate students, postdocs, and nonfaculty researchers with doctorates in each GSS-eligible unit.
Since April 2002, SRS has been conducting extensive research and methodological testing to reduce the respondent burden, improve data quality, reduce survey costs, and improve processes that will result in more rapid release of the data to the public. The 2008 GSS reflects changes made to date as a result of the research and testing. The changes being requested here from the preceding version (2006) are itemized in section B.4 and Attachment 4.
1. Need for Data Collection and Legislative Authorization
The authority for the collection of the information on the GSS is established under the National Science Foundation Act of 1950, as amended, Public Law 507 (42 U.S.C. 1862), Section 3(a) (6), which directs NSF “...to provide a central clearinghouse for the collection, interpretation, and analysis of data on scientific and engineering resources and to provide a source of information for policy formation by other agencies of the Federal Government...” Furthermore, Executive Order 10521 (March 17, 1954) states: “The Foundation shall continue to make comprehensive studies and recommendations regarding the Nation’s scientific research effort and its resources for scientific activities, including facilities and scientific personnel, and its foreseeable scientific needs, with particular attention to the extent of the Federal Government’s activities and the resulting effects upon trained scientific personnel.”
The GSS provides a critical piece of the Foundation’s information used to meet its responsibilities under the Act and the Executive Order.
2. How, By Whom, and for What Purpose the Information Is to Be Used
A. Federal Uses
Information on the number and characteristics of students currently enrolled in graduate SEH programs and engaged in postdoctoral programs is extensively used by NSF, NIH, and the DOE to assess future supplies of trained science and engineering (S&E) and health (SEH) personnel. A variety of more general information needs are met through the annual release of data in paper and electronic format. NSF publishes a short InfoBrief and a set of detailed statistical tables in the on-line report, Graduate Students and Postdoctorates in Science and Engineering, available on the SRS Website. A public release file is also available on this site.
Data from the GSS are also available on the Web through the WebCASPAR (Computer Aided Science Policy Analysis and Research) system. The URL for WebCASPAR is http://caspar.nsf.gov/webcaspar. WebCASPAR is an institution-based data system. It contains institutional and summary data from all NSF academic sector surveys for all institutions offering graduate-level instruction and/or maintaining R&D activity in SEH fields. Other data included in this data system are those compiled from the Department of Education’s National Center for Education Statistics (NCES) Integrated Postsecondary Education Data System (IPEDS) surveys of Completions, Fall Enrollment, and Finance, and the NSF Survey of Earned Doctorates. This on-line database is used routinely by SRS as well as by program offices in many of the NSF research Directorates. Primary uses made of the data include review of changing enrollment levels to assess the effects of NSF initiatives, to track student support patterns, and to analyze participation in SEH fields by targeted groups for all disciplines or for selected disciplines and for selected groups of institutions. Program officers check departmental and institutional records, including data from the graduate student survey and NCES IPEDS surveys, to determine department eligibility for NSF programs targeted to special populations or instructional programs.
1. NSF Uses
Special tabulations from the GSS data constitute a key resource in meeting policy and program information needs of the Foundation. Major examples of use of the GSS data are in the Foundation’s two Congressionally-mandated biennial reports, Science and Engineering Indicators and Women, Minorities and Persons with Disabilities in Science and Engineering.
The GSS is one of three NSF surveys whose micro data are combined into an integrated database to produce the publication Academic Institutional Profiles. The other two surveys are: (1) the Survey of Research and Development Expenditures at Universities and Colleges and (2) the Survey of Federal Science and Engineering Support to Universities, Colleges, and Nonprofit Institutions. As explained in the next paragraph, these data are further integrated with institutional data from other NSF surveys and with surveys conducted by the Department of Education. Together these data provide policymakers with information on the role of higher education in the context of the national R&D effort.
2. Other Federal Uses
Data derived from the graduate student survey are routinely provided to Congress and to various agencies of the Executive Branch. Recent uses of the data include:
Data on graduate SEH enrollment are provided annually to the Department of Education’s National Center for Education Statistics for comparison purposes and are published in the Digest of Education Statistics.
Trend data on graduate SEH enrollments are published by the Bureau of the Census, Department of Commerce, in the Statistical Abstract of the United States.
DOE and NIH use specially prepared tabulations from the GSS tailored to answer specific questions to help their agencies prepare budgets and conduct program evaluation studies.
B. Academic Institution Uses
The surveyed institutions themselves are major users of the data collected in the GSS. Requests for the data are received from numerous individual institutions, as well as from national academic organizations. The SRS has been cooperating with the Association of American Universities’ Data Exchange Group to provide them with comprehensive data from the GSS.
The value of the GSS to academic institutions, higher education organizations such as The Council of Graduate Schools, and policymakers interested in higher education issues, is that it is part of a larger set of surveys, all of which provide annual census data on academic institutions. SRS combines data from these annual census surveys in two ways: Academic Institutional Profiles and WebCASPAR, to provide the research and policy community important information resources about higher education.
GSS is one of three NSF annual censuses whose micro data are combined into an integrated database to produce the publication Academic Institutional Profiles. The other two surveys are: (1) the Survey of Research and Development Expenditures at Universities and Colleges and (2) the Survey of Federal Science and Engineering Support to Universities, Colleges, and Nonprofit Institutions.
These data are further integrated with institutional data from other NSF surveys, including the Survey of Earned Doctorates, and surveys conducted by the Department of Education (IPEDS) in the WebCASPAR (Computer Aided Science Policy Analysis and Research) system. WebCASPAR is heavily used by the research, S&E policy, and academic communities.
Institutions use the NSF data to study selected groups of peer institutions for planning and comparative purposes using either the GSS data reports or the WebCASPAR system. They combine the NSF data with information from State and local governments on institutions in their geographic areas. Institutions use the comparative data to review the strength of their own programs on the basis of such factors as support of students by various Federal agencies and progress in reaching special target populations.
C. Professional Societies and Foundation Uses
Representative data users in this category include: the Association of American Universities Data Exchange, the Federation of American Societies for Experimental Biology, the Council of Graduate Schools, the Commission on Professionals in Science and Technology, the American Institute of Physics, the American Society for Engineering Education, the American Chemical Society, the American Council on Education, the Carnegie Foundation, the American Association of Colleges of Nursing, the Association of American Medical Colleges, the National Postdoctoral Association, the American Geological Society and the Computing Research Association.
D. Media Uses
Enrollment of graduate students in science and engineering fields, particularly of those holding temporary visas, is well reported by the press, including CNN, the Chronicle of Higher Education, the Washington Post and other major national newspapers.
3. Consideration of Using Improved Technology
Since the fall 1999 survey, GSS respondents have had the option to submit the data by either paper form or through the World-Wide Web. Approximately 98 percent of the academic institutions chose the Web option in the fall 2007 survey. The majority of respondents have welcomed the Web version of the GSS for ease of submission and error resolution capabilities.
Reporting burden is stable or potentially reduced when the survey forms and questions are stable and do not vary from year to year. Most of the academic institutions have been in the GSS for many years. Many of them have established automated systems for assembling the requested data. Most of the data GSS collects is required by the academic institutions themselves for other reporting requirements and for planning and evaluation purposes.
The Web version of the survey has a real-time monitoring system allowing NSF to monitor data, response status, system problems and comments from respondents. From the perspective of the respondents, the Web version is more convenient and simplifies the survey (e.g., by automatically checking totals). NSF benefits from the Web version by receiving the data faster and with better quality.
GSS includes a file upload option for providing the count data for those institutions that choose to do so. In 2007, 35 of the responding institutions completely supplied their data via file upload.
4. Identification of Duplication
NSF staff consults regularly with other Federal agencies and private organizations to prevent duplication of data collection activities and to stay abreast of changes in other surveys. Such consultations take place with the Department of Education’s National Center for Education Statistics (NCES), the Council of Graduate Schools (CGS), and others. Specific surveys conducted by these groups will be discussed below.
In addition, staff of the SRS participates in a variety of NCES-related activities including serving on Technical Review Panels and serving on the CIP 2010 Working Group.
The routine uses made by the Federal agencies described in Section 2 above have largely determined the content of the questionnaire. Only the GSS collects the following information at the level of detailed SEH fields of study:
For full-time graduate students, aggregate counts at the discipline level by:
Sources of major financial support (Federal agencies, institutions, self-support, etc.)
Mechanisms of major financial support (fellowships, teaching assistantships, etc.)
Gender
Citizenship
Level of study (first year or beyond first year)
Race/ethnicity background of U.S. citizens
For part-time graduate students, aggregate counts at the discipline level by:
Gender
Citizenship
Race/ethnicity background of U.S. citizens
For postdoctorates, aggregate counts at the discipline level by:
Sources of financial support
Gender
Citizenship
Holders of first-professional medical degrees
Because the data are collected from all eligible institutions with graduate science, engineering and health departments, data are available at detailed field of study levels and also by institutional characteristics such as highest degree granted, geographical distribution, type of control (public or private), or any other special grouping (medical schools, historically black colleges and universities, land-grant institutions, etc.) as well as by rankings on various characteristics (foreign enrollment, minority enrollment, enrollment in a specific field, etc.)
Some graduate enrollment data are collected by other organizations, either Federal or private, but none of the other data collection efforts contains the detailed field distribution required for the analyses for which the data are needed by the National Science Foundation, the National Institutes of Health and the Department of Energy, and no other survey collects data on Federal agencies’ support of graduate students.
The IPEDS, for example, collects race/ethnicity data every two years for 9 selected fields, of which 4 are within the NSF definition of science and engineering (and at a more general level than is collected for GSS). The IPEDS annual fall enrollment data collected by race/ethnicity category are not reported by field and hence do not provide a viable substitute for the race/ethnicity data collected in the GSS. No data are collected on source of support or on postdoctorates and nonfaculty researchers. Note: The IPEDS has issued revised race/ethnicity guidelines to be used by institutions beginning with the 2010 IPEDS. The categories used on the GSS are already in compliance with the OMB race/ethnicity guidelines.
The Council of Graduate Schools (CGS) conducts an annual survey of graduate enrollment in cooperation with the Graduate Record Examinations Board (GRE), surveying 764 institutions that are members of the Council of Graduate Schools or one of the four regional graduate school associations: the Conference of Southern Graduate Schools, the Midwestern Association of Graduate Schools, the Northeastern Association of Graduate Schools, and the Western Association of Graduate Schools. The survey collects data by 9 broad fields of study using the GRE discipline codes as its taxonomy, type of institutional control, and highest level of degree offered, but has no data on source of financial support. It also collects information on post-baccalaureate and post-master’s certificates and applications to graduate schools.
Only the GSS maintains detailed data on all science, engineering and health fields at all eligible institutions and institution-provided data on source of financial support.
The following table compares the taxonomies used in the various multi-field enrollment surveys:
Table A-4. Comparison of Graduate Enrollment Surveys
Survey | Number of Fields of Study
NSF (GSS) | 84 (61 in S&E, 23 in health)
NCES (IPEDS) | 9 (4 in S&E, 1 medical)
CGS/GRE | 49 (24 in SEH)
A number of surveys are conducted by other professional societies or by groups of institutions, and are limited to a single field or group of related fields, or to institutions that are members of the organization. These may collect far more detailed data on the fields of interest to the organization conducting the survey, and may even collect data on topics not covered by the Graduate Student Survey (for example, on undergraduate enrollment), but they do not provide compatible data on all SEH fields, nor do they often address the issue of types and sources of financial support for graduate students. Among the surveys of this type are those conducted by the American Institute of Physics (AIP), the Engineering Workforce Commission (EWC), and the American Mathematical Society/Mathematical Association of America. Data published by AIP cover only total enrollment in physics with no subfield breakdown; EWC statistics include 23 subfields within engineering plus pre-engineering, and 21 engineering technology subfields. The American Mathematical Society/Mathematical Association of America surveys collect detailed data for 45 subfields in mathematics, 10 in statistics, and 22 in computer science.
For the past several years SRS has conducted an initiative to determine the feasibility of collecting data on the number and characteristics of postdoctorates (postdocs) in the United States. Because the GSS presently collects information on the number of postdoctorates by citizenship, sex and field of study, this portion of the GSS may take on greater future importance. For the foreseeable future the GSS will continue to collect the postdoc information for the academic sector.
5. Small Business Involvement
The survey universe consists entirely of U.S. universities and colleges that enroll graduate students and have postdoctoral appointments, together with FFRDCs.
6. Consequences of Less Frequent Surveying
The GSS data are collected to serve several purposes and a variety of users. One purpose is to produce consistent national estimates over time for the variables in the survey. For this purpose, a sample survey collected less frequently than on an annual basis might be satisfactory. However, another major purpose of the GSS is to provide similar estimates at the individual institution level. The surveyed academic institutions themselves are major users of these data. They utilize GSS data heavily for internal administrative purposes, such as planning and evaluation. Furthermore, they use the data extensively for the purpose of making peer comparisons, with similar institutions, often in conjunction with other data on all academic institutions, such as from IPEDS. The value of the GSS data to the academic institutions would be far less if the data were not collected annually and for all institutions. In addition, less than annual data would make the data less valuable for the internal administrative uses. Data only from sampled institutions would severely limit, if not eliminate, the value of the GSS as the basis for peer comparisons among institutions. If a sample of institutions were surveyed, it is unlikely that NSF would release the data at the individual institution level. If the data were less useful to the academic institutions providing the data, their willingness to participate in the survey may be undermined.
Collecting the GSS annually also increases the value of the data for monitoring trends, particularly the effects of dramatic changes in the larger context. Recent examples are changes in enrollment in response to the dot-com boom and bust and to September 11. Less than annual data may not capture such changes or the inflection point at which a trend changes direction. For the past few years, the release of the GSS fall enrollment data has been eagerly awaited to see the post-9/11 trends in SEH graduate enrollment among foreign visa-holders. That enrollment did not drop immediately, i.e., in 2001, and the trends differed by several years for first-time enrollment and total enrollment. Those nuances would have been lost if the data had not been collected every year.
Experience with a change in the survey in the 1980s also suggests the difficulties in survey operations that could result from conducting the GSS less frequently than every year. Establishing and maintaining relationships with GSS respondents is an important component in obtaining high response rates. In 1983 an attempt was made to reduce response burden by sampling master’s granting institutions. “When institutions which had not been included in the sample were surveyed again in 1988 for the first time in 4 years, the contractor found that reestablishing contact and obtaining results from these institutions required four or five mailings and considerable telephone time. The average cost of survey processing was higher for those institutions in the one year of recontact than the total of four years for those institutions with whom contact had not been broken.” See SF-83, page 83, November 1, 1993.
Most colleges and universities have automated record keeping systems, facilitating their ability to respond to the GSS on an annual cycle. These automated record systems reduce considerably the time required to assemble and report information needed for the GSS concerning graduate enrollment by field, postdoctoral enrollment, sources and mechanisms of support, etc. Thus, collecting consistent data annually considerably reduces respondent burden for academic institutions with automated data systems, because the database and software are retained, kept current and easily accessed.
Annual collection also contributes to the continuity of contacts with the School Coordinators within institutions. Having this continuity helps the School Coordinators maintain their databases and therefore maintain the quality of the data.
7. Special Circumstances
There are no special circumstances.
8. Federal Register Notice and Consultations with Persons Outside the Agency
The Federal Register notice was published on May 5, 2008, p. 24615 (Attachment 5). Comments on this notice were due July 6, 2008. In response to the announcement, NSF received one public comment, from S. Bartholomew-Palmer, via email on May 6, 2008. The commenter did not object to the information collection but offered her services to make improvements to the survey.
NSF regularly consults with the Department of Education’s National Center for Education Statistics (NCES) and other federal agencies such as NIH and the Department of Energy, professional societies and university staff. NSF staff members maintain frequent contact with members of the data-using community as well as with major academic data providers through attendance at professional society meetings and consultation with institutional and agency officials. The GSS survey manager held sessions on proposed changes to the GSS at the Association for Institutional Research (AIR) Annual Forum each year from 2005 through 2008 to obtain respondent input.
As part of the redesign effort, NSF has conducted site visits at academic institutions using a cognitive interviewing methodology to talk with actual respondents to the GSS. The interviews and results are described in section B.4 and in Attachment 6.
NSF has also conducted meetings with users of the GSS data and asked for their feedback on proposed survey design changes and for user input on future data needs. See Attachment 6.3 for more information about these meetings.
9. Payment or Gifts to Respondents
Not applicable. There are no payments to GSS respondents.
10. Assurance of Confidentiality
No pledge of confidentiality is given to institutions providing data to the GSS. Data collected in the GSS are aggregate counts of students, postdocs, and nonfaculty researchers with doctorates. Data are published only at the departmental summary level.
11. Sensitive Questions
The survey questionnaire does not contain any questions of a sensitive nature.
12. Estimated Response Burden
During the fall 2007 GSS, respondents were asked to report how long it took them to complete Part 1 and Part 2 of the GSS. The following burden estimates are based on their responses. (See Attachment 7 for more detailed information on how the 2007 burden hours were calculated.)
Part 1 burden was reported by School Coordinators (SCs), who are responsible for coordinating the GSS at the academic institution, identifying the units that are within the scope of the GSS, identifying the unit respondents, and administering the survey. The Part 1 burden reported by respondents was 12 minutes, or 0.2 hours, per organizational unit. [Note: An organizational unit can be a department, program, research center or health care facility.]
Based on the burden estimates provided by Part 2 respondents, the average respondent burden for Part 2 was 2.3 hours per organizational unit. So the combined respondent burden for both Parts 1 and 2 is 2.5 hours per organizational unit.
For 2008, SRS has assumed that there would be a 5 percent increase in the total number of organizational units over the 2007 GSS figure, from 12,622 to 13,253 units. Two pilot studies are included in the 2008 burden figures. The first is a pilot study of 80 new institutions, with an estimated 10 new organizational units per institution. Because this is the first time these institutions have seen the GSS, SRS increased the burden estimate by 1 hour for these 800 units. The second pilot study will look at possible undercoverage in the reporting of GSS-eligible units at 40 institutions. SRS is assuming an average of 5 additional organizational units per institution for a total of 200 additional units. SRS estimates that these new units will require an additional 30 minutes each to respond. Finally, the FFRDC study of 40 units at 4.7 hours per unit will be included. All these efforts add up to a total of 36,721 burden hours for the 2008 GSS.
Table A-12.1. Burden Estimates for the 2008 GSS
Category | # of Units | Burden/Unit (Hours) | Total Burden (Hours)
Existing institutions | 13,253 | 2.5 | 33,133
Pilot of newly eligible institutions (80 institutions) | 800 | 3.5 | 2,800
Undercoverage pilot (40 institutions) | 200 | 3.0 | 600
FFRDC study | 40 | 4.7 | 188
Total burden for 2008 | | | 36,721
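The arithmetic behind Table A-12.1 can be checked with a short script. The following is an illustrative sketch only and is not part of the SRS estimation procedure: the function and variable names are hypothetical, the unit counts and per-unit hours are copied from the table, and rounding each row product half up to whole hours is an assumption that happens to reproduce the published 2008 figures.

```python
from decimal import Decimal, ROUND_HALF_UP


def row_burden(units, hours_per_unit):
    """Return units x hours per unit, rounded half up to whole hours.

    The rounding rule is an assumption; the statement does not say
    how row totals were rounded.
    """
    product = Decimal(units) * Decimal(hours_per_unit)
    return int(product.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# (category, number of units, burden per unit in hours), from Table A-12.1
ROWS_2008 = [
    ("Existing institutions", 13253, "2.5"),
    ("Pilot of newly eligible institutions", 800, "3.5"),
    ("Undercoverage pilot", 200, "3.0"),
    ("FFRDC study", 40, "4.7"),
]


def total_burden(rows):
    """Sum the rounded row burdens for one survey cycle."""
    return sum(row_burden(units, hours) for _, units, hours in rows)


if __name__ == "__main__":
    for name, units, hours in ROWS_2008:
        print(f"{name}: {row_burden(units, hours):,} hours")
    print(f"Total burden for 2008: {total_burden(ROWS_2008):,} hours")
```

Per-unit hours are passed as strings so that Decimal represents values such as 4.7 exactly rather than as binary floating-point approximations.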
For the 2009 GSS, the total respondent burden is estimated at 44,095 hours. The major reason for this increase is the addition of up to 500 new institutions located during the GSS frame review. SRS is assuming an average of only 5 additional organizational units per new institution due to the fact that these are likely to be smaller schools. SRS again is assuming that the new units will require an additional hour in reporting burden because they are new to GSS. Also new in 2009 is extended research related to postdocs at 24 organizational units for an additional hour each. Completing the respondent burden estimates are the FFRDC study and the organizational units from the 2008 GSS plus the additional units added from the 2008 pilot study of new institutions.
Table A-12.2. Burden Estimates for the 2009 GSS
Category | # of Units | Burden/Unit (Hours) | Total Burden (Hours)
2008 existing institutions + 2008 pilot institutions | 14,053 | 2.5 | 35,133
Add 500 new institutions | 2,500 | 3.5 | 8,750
FFRDC study | 40 | 4.7 | 188
Postdocs testing | 24 | 1.0 | 24
Total burden 2009 | | | 44,095
The 2010 burden estimates include all the organizational units from the 2009 GSS (including the new institutions) and a projected 5 percent increase in the number of organizational units, the FFRDC study, and the additional postdoc-related questions which would require an extra 30 minutes in 30 percent (5,214) of the organizational units. These yield a total estimated burden of 46,247 hours.
Table A-12.3. Burden Estimates for the 2010 GSS
Category | # of Units | Burden/Unit (Hours) | Total Burden (Hours)
2009 existing institutions + 500 new institutions | 17,381 | 2.5 | 43,452
FFRDC study | 40 | 4.7 | 188
Additional postdoc questions | 5,214 | 0.5 | 2,607
Total burden 2010 | | | 46,247
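The unit counts in Tables A-12.1 through A-12.3 follow from the 2007 base of 12,622 organizational units and the growth assumptions stated in the text. The sketch below traces that chain under one reading of the tables: the 800 pilot units carry forward into 2009, the 200 undercoverage-pilot units do not, and fractional units are rounded half up. All names are hypothetical, and the rounding rule is an assumption rather than a documented SRS procedure.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round a Decimal to the nearest whole number, with halves rounded up."""
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def grow(units, percent):
    """Increase a unit count by a whole-number percentage."""
    return round_half_up(Decimal(units) * (100 + percent) / 100)


def project_units(units_2007=12622):
    """Trace organizational-unit counts from the 2007 base through 2010."""
    existing_2008 = grow(units_2007, 5)        # 5 percent increase over 2007
    pilot_2008 = 80 * 10                       # 80 new institutions x 10 units
    base_2009 = existing_2008 + pilot_2008     # carried into the 2009 cycle
    new_2009 = 500 * 5                         # 500 new institutions x 5 units
    base_2010 = grow(base_2009 + new_2009, 5)  # 5 percent increase over 2009
    # 30 percent of 2010 units answer the additional postdoc questions
    postdoc_2010 = round_half_up(Decimal(base_2010) * 30 / 100)
    return {
        "existing_2008": existing_2008,
        "base_2009": base_2009,
        "new_2009": new_2009,
        "base_2010": base_2010,
        "postdoc_2010": postdoc_2010,
    }


if __name__ == "__main__":
    for label, count in project_units().items():
        print(f"{label}: {count:,}")
```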
In addition, SRS is requesting 360 burden hours over the 3 years for future testing needs. Adding the burden estimates for the 3 survey cycles and the testing, the total respondent burden is 127,423 hours or an average of 42,474 hours per year. Table A-12.4 summarizes the burden estimates for the next 3 years of the GSS.
Category | Total Burden (Hours)
Total burden for 2008 | 36,721
Total burden for 2009 | 44,095
Total burden for 2010 | 46,247
Future testing (across all 3 years) | 360
Estimated burden for all 3 years (2008, 2009, 2010, + testing) | 127,423
Estimated average annual burden | 42,474
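The three-year total and the annual average can be reproduced from the annual figures above. This is a minimal sketch with hypothetical names; the inputs are the published annual totals and the 360 testing hours, and rounding the average to the nearest whole hour is an assumption.

```python
# Annual burden totals (hours) as published in Tables A-12.1 through A-12.3
ANNUAL_BURDEN = {2008: 36721, 2009: 44095, 2010: 46247}

# Hours requested for future testing across all three years
FUTURE_TESTING_HOURS = 360


def three_year_burden(annual, testing_hours):
    """Sum the annual burden totals and add the testing allowance."""
    return sum(annual.values()) + testing_hours


def average_annual_burden(total_hours, years):
    """Average burden per year, rounded to the nearest whole hour."""
    return round(total_hours / years)


if __name__ == "__main__":
    total = three_year_burden(ANNUAL_BURDEN, FUTURE_TESTING_HOURS)
    print(f"Three-year burden: {total:,} hours")
    print(f"Average annual burden: "
          f"{average_annual_burden(total, len(ANNUAL_BURDEN)):,} hours")
```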
13. Cost to Respondents
This survey does not require the purchase of equipment, software, or services beyond those normally used in universities as part of customary and usual business.
14. Cost to the Federal Government
NSF is in the third year of a five-year contract with a survey research firm to collect the GSS data. The total value of that contract is $6,799,386. The estimated cost of the Graduate Student Survey to the Federal Government is $2,796,458 for the fall 2008 survey cycle. See Table A-14 below for how this estimate was derived.
Table A-14. Annual GSS Survey Federal Government Costs
Data collection and processing contract | $2,465,908
GSS survey manager (1.0 person-years) | 125,000
Other SRS staff (program manager, statistician, editor, etc.) | 205,000
InfoBrief printing and mailing costs (estimated) | 550
Total | $2,796,458
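The total in Table A-14 is the sum of its four cost components. The check below is a sketch with hypothetical names; the dollar amounts are copied from the table.

```python
# Cost components (dollars) for the fall 2008 survey cycle, from Table A-14
COSTS_2008 = {
    "Data collection and processing contract": 2_465_908,
    "GSS survey manager (1.0 person-years)": 125_000,
    "Other SRS staff": 205_000,
    "InfoBrief printing and mailing (estimated)": 550,
}


def total_cost(costs):
    """Sum the itemized costs to the Federal Government."""
    return sum(costs.values())


if __name__ == "__main__":
    print(f"Total: ${total_cost(COSTS_2008):,}")
```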
For the 2007 GSS, the National Institutes of Health contributed $335,077 (13.5%) and the Department of Energy provided $40,000 (1.6%) of the annual contract costs. It is assumed that both agencies will continue that level of support. The National Science Foundation funds the remainder of the annual costs to the Federal Government.
15. Changes in Burden
Numbers provided by the respondents indicate that burden for GSS respondents has increased between the 2005 and 2008 OMB packages by an annual average of 3,239 hours. SRS believes the 2007-based estimates more accurately reflect the burden for the current, redesigned instruments. In addition, the number of organizational units covered by the GSS has been increasing over the years. The current estimates also include burden for 2008 pilot studies and for three survey cycles of the FFRDC survey. See Section 12 and Attachment 7 for additional information.
16. Project Schedule for Information Collection and Publication
The project schedule (Attachment 8) covers the entire project, from questionnaire design to final publication, and is similar each year. Mail-out of survey materials occurs each year in October, with a final closeout date in May of the following year. An InfoBrief is published in the late fall of that year. Detailed statistical data tables with a description of the survey methodology are posted on the SRS Web site (http://www.nsf.gov/statistics/gradpostdoc/) approximately 3 months later, in the winter of the following year. There are no complex analytical issues, except imputations for nonresponse (see Section B.3).
17. Displaying the OMB Expiration Date
The OMB expiration date appears on the GSS worksheet and on the Web survey login page.
18. Exceptions in Item 19 on Form 83-I
Not applicable. There are no exceptions.
B. Collection of Information Employing Statistical Methods
1. Respondent Universe and Sampling Procedure
The GSS is an annual census of all eligible institutions. The universe is intended to cover all U.S. academic institutions that offer graduate (master’s and PhD, or equivalent) degree-credit programs in science and engineering, as defined by NSF, as well as those offering graduate programs in health fields, as defined by NIH.
Discussion of institutional frame
An institution is considered eligible, or in scope, for the GSS if it meets at least one of the following criteria:
Grants at least one master’s or doctoral degree in at least one program listed in the selected NCES CIP codes (see the crosswalk in Attachment 1.6 for a list of the GSS-eligible CIP codes).
Has at least one postdoctoral appointee1 or non-faculty research staff member2 conducting research in at least one of the previously mentioned programs.
Prior to the start of each survey cycle, the GSS undergoes a population review to identify any new institutions that should be added to the list of eligible institutions.
For previous survey cycles, a comprehensive population review was undertaken in an effort to ensure that all eligible (in-scope) schools were surveyed. The previous GSS population was compared with the sources listed below:
Carnegie Classification: Master’s colleges and universities I and II, doctoral/research universities extensive and intensive, and specialized institutions (http://www.carnegiefoundation.org/classification/)
NCES IPEDS database. All 4-year institutions that offer programs in engineering, engineering technologies/technicians, biological and biomedical sciences, mathematics and statistics, physical sciences, psychology, social sciences or health professions and related clinical sciences (http://nces.ed.gov/ipeds/cool/)
Council of Graduate Schools membership directory (http://www.cgsnet.org/MembersAndFriends/directory.cfm)
Association of American Medical Colleges list of accredited U.S. M.D.-granting medical schools (http://www.aamc.org/members/listings/msalphaae.htm)
Association of American Universities list of member institutions (http://www.aau.edu/aau/members.html)
NSF Survey of Research and Development Expenditures at Universities and Colleges
NSF Survey of Science and Engineering Research Facilities
The data collection contractor identified and evaluated schools not in the previous GSS population to determine if they were in the scope of the GSS.
If an institution is deemed in scope, NSF mails the president a letter of invitation asking the president to name a School Coordinator for the survey and to verify the institution’s eligibility. Institutions that do not respond to the letter receive follow-up by telephone and e-mail.
Each survey cycle, institutions are asked to maintain their list of eligible units. If an institution drops all of its eligible units, it may be removed from the survey, depending upon the circumstances.
In 2007, the frame updating was expanded.
See section B.4 for more details on the 2007 frame activities.
2. Description of Survey Methodology
The GSS is a multi-mode survey with both paper and Web components. In the fall 2007 survey, 92.5 percent of the respondents provided data via the Web; 2 percent provided their data in part or completely via paper; and 5.5 percent provided their data in part or completely via data files.
The GSS consists of two parts. Part 1, NSF Form 811, is a prepopulated Web listing of eligible “organizational units” defined as departments, programs, research centers and health-care facilities known to exist in the previous GSS survey cycle. Part 2, NSF Form 812, is the data collection worksheet asking for the counts of graduate students, postdocs, and nonfaculty research staff with doctorates in each GSS-eligible unit. Each fall at the launch of the survey, information on both parts is sent to a designated respondent (School Coordinator), appointed by the academic institution. In all cases, the School Coordinator (SC) is responsible for completing Part 1; the SC may complete or delegate all or portions of Part 2 to the organizational units for completion. The SC serves as the point of contact at the institution for all internal and external communications about the GSS. It is the responsibility of the SC to notify the units of their assignments and ensure that the unit respondent submits the completed data by the due date.
Each GSS survey cycle begins with a pre-data-collection e-mail to the School Coordinator from the previous survey cycle to determine if he/she is still the appropriate SC for the upcoming cycle. The e-mail is sent in early October with telephone follow-up if e-mail confirmation is not received. Once the School Coordinator is confirmed/updated, data collection commences. The primary mode of data collection is Web-based; other options include data file upload and paper submission. Data collection begins in October with an e-mail and FedEx package providing the School Coordinator with user-id and access information as well as information about the GSS-relevant programs of study.
Data collection procedures for upcoming survey cycles are expected to be similar to those used in the 2007 GSS.
3. Methods Used To Maximize Response Rate
Because the GSS is designed to produce estimates for all academic institutions that offer graduate degree-credit programs in SEH fields, care is taken to maximize response rates and to impute for nonresponse.
Survey techniques proven successful in past surveys will again be used to maximize the response rate for the GSS. These techniques include:
Early pre-data-collection confirmation of the School Coordinator.
Two-part data collection to ensure early notification of unit respondents of their assignments.
The first part entails a review/update by the School Coordinator of the GSS-relevant programs and notification of unit respondents to begin their data reporting assignments. The second part is the unit-level reporting of counts of graduate students, postdocs, and non-faculty researchers.
Separate due dates for each of the two parts of the GSS to help identify at the earliest juncture those institutions that might be potential non-respondents.
Targeted e-mails and telephone follow-up based on response status.
Availability of knowledgeable contractor staff to provide assistance to School Coordinators as well as unit respondents.
Multiple modes of data collection allowed (Web, data file uploads, paper).
Help desk personnel available to respond to telephone and e-mail questions and concerns raised by institution personnel.
Presentations at Association for Institutional Research (AIR) and Council of Graduate Schools (CGS) meetings demonstrating and encouraging the use of the Web-based data collection system.
The inclusion of cover letters to School Coordinators and to unit respondents explaining the consistent format and the uses that are made of the data provided.
The inclusion of an acknowledgement postcard to ensure that the survey package has reached the proper School Coordinator within each institution.
The inclusion in the survey package of a “Crosswalk” listing the fields of study for which data are requested for the GSS along with the Department of Education’s codes for these fields as published in A Classification of Instructional Programs (2000), for the convenience of those institutions using CIP codes in reporting their enrollment and degrees to the IPEDS system.
Enlistment of others at the institution, as appropriate, to gain cooperation.
The organizational unit response rate is calculated using AAPOR3 Response Rate #3. This response rate includes complete plus partial responses in the numerator, divided by all eligible reporting units:
\[ RR = \frac{I + P}{(I + P) + (R + NC + O) + e\,U} \]
where, using AAPOR notation:
RR = Response rate
I = Complete interview
P = Partial interview
R = Refusal
NC = Non-contact
O = Other
U = Unknown
e = Estimated proportion of cases of unknown eligibility that are eligible
The institutional response rate also uses AAPOR Response Rate #3. This response rate also counts both completes and partials as responses and estimates what proportion of cases of unknown eligibility are eligible.
The responding unit response rate is likewise calculated using AAPOR Response Rate #3, counting both completes and partials as responses and estimating what proportion of cases of unknown eligibility are eligible.
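The response rate described above (completes plus partials in the numerator; all eligible cases, including the estimated-eligible share of unknown cases, in the denominator) can be sketched in Python as follows. This is a minimal illustration, not the contractor's production code: the function name `response_rate` and the example counts are hypothetical.

```python
def response_rate(i, p, r, nc, o, u, e):
    """Response rate counting completes and partials as responses.

    i, p, r, nc, o and u are case counts in the notation above; e is the
    estimated proportion of unknown-eligibility cases that are eligible.
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must be a proportion between 0 and 1")
    responses = i + p
    eligible = responses + (r + nc + o) + e * u
    if eligible == 0:
        raise ValueError("no eligible cases")
    return responses / eligible


# Hypothetical counts, for illustration only (not GSS results).
rate = response_rate(i=900, p=50, r=10, nc=20, o=5, u=30, e=0.5)
print(f"{rate:.3f}")  # 0.950
```

Setting e to 1.0 treats every case of unknown eligibility as eligible and therefore gives the most conservative (lowest) rate.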
The GSS has used the following general procedure for imputation for the past several years.
At the conclusion of the survey data collection and data editing phases, all organizational units are classified into one of the following three categories:
· fully respondent organizational units
· partially respondent organizational units
· non-respondent organizational units
For the fully respondent and the partially respondent organizational units, four key variables are drawn from the current and previous year:
· total full-time students
· total part-time students
· total postdoctorates
· total other non-faculty research staff
Data for the four variables for all organizational units that reported the variable in both the current and prior year are aggregated into cells defined by institutional highest degree level (doctorate and master's) and field (e.g., aeronautical engineering, mechanical engineering, etc.).
For each of the defined cells, an inflator/deflator factor is computed by dividing the current-year key variable total by the corresponding previous-year total. A total of 632 inflator/deflator factors were computed (4 key variables by 2 institutional degree levels by 79 fields). Factors less than 0.5 or greater than 2.0 are set to 1.0; such extreme factors often occur for variables such as postdoctorates or other nonfaculty research staff in department types that normally report very low values.
The current year key variable values for nonrespondent organizational units are computed by applying the inflator/deflator factor appropriate for each unit's institutional level and field to that unit's most recent previous year key variable value. For completely nonrespondent organizational units, all four key variables are computed. For partly nonrespondent organizational units, only the missing key variables, if any, are computed.
Mathematically, the computation takes the following form:
\[ \hat{K}_{i,t} = K_{i,t-1} \times \frac{\sum_{j \in r} K_{j,t}}{\sum_{j \in r} K_{j,t-1}} \]
where $\hat{K}_{i,t}$ is the estimated value of the key variable for department i for year t,
$K_{i,t-1}$ is the value (reported or previously estimated) of the key variable for department i for year t-1, and
r is the set of departments in the same institutional degree level and departmental type peer group as department i that provided the key variable in both years t and t-1.
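The ratio imputation step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the production imputation program: the function names are hypothetical, and the handling of a zero previous-year peer total (falling back to a factor of 1.0) is an assumption that the source does not address.

```python
def inflator(current_total, previous_total, low=0.5, high=2.0):
    """Cell-level inflator/deflator factor; out-of-range factors are reset to 1.0."""
    if previous_total == 0:
        # Assumption: with no usable ratio, carry the prior value forward.
        return 1.0
    factor = current_total / previous_total
    return factor if low <= factor <= high else 1.0


def impute_key_variable(prev_value, peers):
    """Impute a current-year key variable for a nonrespondent unit.

    peers holds (previous_year, current_year) pairs for units in the same
    degree-level and field cell that reported the variable in both years.
    """
    prev_sum = sum(prev for prev, _ in peers)
    curr_sum = sum(curr for _, curr in peers)
    return prev_value * inflator(curr_sum, prev_sum)


# Hypothetical peer data: the cell grew from 160 to 200 (factor 1.25).
print(impute_key_variable(40, [(100, 125), (60, 75)]))  # 50.0
# A factor of 2.5 is outside the 0.5 to 2.0 range, so the prior value is kept.
print(impute_key_variable(3, [(2, 5)]))  # 3.0
```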
This ratio imputation technique is used to impute key variables. Using the key variables, either reported or imputed, the imputation program then allocates these key variables among the various detail cells. For example, for full-time students, the program will distribute the total by sources and mechanisms of support. This operation is done by allocating the total among the detail cells using the same proportions as reported by that department in the previous year.
Mathematically, the non-key variables are derived from their associated key variables using the formula:
\[ \hat{N}_{i,t} = \hat{K}_{i,t} \times \frac{N_{i,t-1}}{K_{i,t-1}} \]
where $\hat{N}_{i,t}$ is the estimated value of the non-key variable for department i for year t,
$\hat{K}_{i,t}$ is the estimated (or actual) value of the key variable for department i for year t,
$N_{i,t-1}$ is the value of the non-key variable for department i for year t-1, and
$K_{i,t-1}$ is the value of the key variable for department i for year t-1.
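The allocation of a key variable among its detail cells can be sketched as follows. This is an illustrative Python example only: the function name and the detail cell labels are hypothetical, the zero-total fallback is an assumption, and the source does not say whether or how allocated values are rounded, so the sketch leaves them fractional.

```python
def allocate_detail(key_current, key_previous, details_previous):
    """Split the current-year key value across detail cells using the
    proportions the same department reported in the previous year."""
    if key_previous == 0:
        # Assumption: with no prior-year total there are no proportions to reuse.
        return {name: 0.0 for name in details_previous}
    return {
        name: key_current * value / key_previous
        for name, value in details_previous.items()
    }


# Hypothetical department: 40 full-time students last year, 50 (reported or
# imputed) this year, with last year's breakdown by support mechanism.
previous = {"fellowships": 10, "research assistantships": 20, "self-support": 10}
allocated = allocate_detail(50, 40, previous)
print(allocated)
# {'fellowships': 12.5, 'research assistantships': 25.0, 'self-support': 12.5}
```

Because every detail cell is scaled by the same ratio, the allocated cells sum to the current-year key value.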
All imputed cells are marked by an "i" status code, and imputation rate reports are generated for all cells and all major institutional and departmental categories. Since the GSS has traditionally managed to achieve a high survey response rate, NSF expects that the imputation rates will remain relatively low. For the 2006 survey, the latest survey for which there are complete data, unit nonresponse at the department (organizational unit) level occurred in 329 of 12,320 eligible departments, or 3 percent of the department total. Item nonresponse in one or more data cells occurred for 1,177 departments, or approximately 10 percent.
Proposed New Imputation Procedures for Unit Nonresponse
Starting with survey year 2007, the nearest neighbor method is being used to impute the missing data of an institution with no reported data in the past 5 years. In this method, instead of using data for all variables from the remote past of the same institution, current data from a similar institution are used for imputation. Research is ongoing.
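The source does not specify how similarity between institutions is measured. As a purely illustrative sketch, the Python example below assumes that each institution has a hypothetical `degree_level` and a hypothetical auxiliary `size` measure, and picks as donor the responding institution at the same degree level whose size is closest. Neither field, nor this distance rule, is taken from the GSS documentation.

```python
def nearest_neighbor_donor(recipient, donors):
    """Return the donor at the same degree level closest in `size` to the
    recipient, or None if no donor shares the recipient's degree level."""
    candidates = [d for d in donors if d["degree_level"] == recipient["degree_level"]]
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs(d["size"] - recipient["size"]))


# Hypothetical institutions, for illustration only.
donors = [
    {"id": "X", "degree_level": "doctorate", "size": 1200},
    {"id": "Y", "degree_level": "doctorate", "size": 300},
    {"id": "Z", "degree_level": "master's", "size": 350},
]
recipient = {"id": "R", "degree_level": "doctorate", "size": 350}

donor = nearest_neighbor_donor(recipient, donors)
print(donor["id"])  # Y
```

The donor's current-year data would then stand in for the recipient's missing values.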
4. Tests of Procedures Used
Tests of Survey Procedures
A. Survey Due Date
For years, the due date for the GSS has been the last work day in January. The 2007 GSS split the survey into two distinct parts with separate due dates: November 30 for the organizational unit update work (Part 1) and February 28, one month later than the previous due date, for the unit-by-unit data (Part 2). A comparison was made between the response rates for the 2006 and 2007 GSS. At similar points in the survey (the week after the due date), more 2007 respondents had submitted their surveys than 2006 respondents had. Even though the 2007 GSS due date was 1 month later, by week 22 of each cycle (about 1 month after the 2007 GSS due date and about 2 months after the 2006 GSS due date), the 2007 GSS response rate was somewhat higher than the 2006 rate. By week 28, the difference was even more pronounced.
Data Collection Post Due Date-Week 22
2006 (March 24, 2007) - 73% had submitted the survey, 94% had logged in
2007 (April 2, 2008) - 79% had submitted the survey, 96% had logged in; 89% had submitted Part 1
Data Collection Post Due Date-Week 28
2006 (May 5, 2007) - 78% had submitted the survey, 96% had logged in
2007 (May 14, 2008) - 88% had submitted the survey, 98% had logged in; 92% had submitted Part 1
The higher response rate is attributed in part to earlier contact and identification of School Coordinators, more frequent contacts, tailored communications, and due dates that were realistic and achievable.
B. Individual Unit Confirmation versus Block Unit Confirmation Test
In the 2007 GSS, an experiment was conducted examining two methods for presenting the unit listing task in Part 1 to respondents. In Group A, respondents were asked to update their listing as needed and then check a box at the bottom of the page to confirm they had updated their listing. In Group B, respondents confirmed or deleted each unit individually (line by line), similar to a forced-choice format. It was hypothesized that the forced-choice format would promote more focused processing by respondents, yielding more revisions and a more accurate listing. The analysis showed that approximately the same number of units was added in the two groups, but that there were more deletions in Group B. Further analysis of the deleted units found that the units were deleted for valid reasons, such as not containing graduate students or postdocs or not being GSS-eligible. There was no evidence that respondents in Group B deleted units erroneously. As a result, it is recommended that the Group B format (line-by-line confirmation) continue to be used to encourage a more accurate unit listing. It is important to note that overall there were significantly more additions and deletions in 2007 than in 2006, which suggests that the unit listings were reviewed more thoroughly in 2007 than in the past, perhaps facilitated by separating the two tasks in terms of the due dates (see the preceding section on the survey due date).
Tests of Survey Content and Format
In developing the survey materials and procedures for the 2007 and 2008 GSS, SRS built upon work started in 2002. NSF worked with Dr. Don Dillman of Washington State University to develop the 2006 survey materials and procedures, holding day-long face-to-face meetings approximately every 6 weeks to further develop the survey design, discuss issues and research results, and plan the work for the upcoming weeks.
In the early stages of the research, NSF learned that respondents frequently overlooked the NSF Form 811. School Coordinators had been using the NSF Form 811 essentially to conduct the second (data collection) stage. Over the remaining cognitive interviews that constituted this investigative stage, a prototype of a new visual design for the NSF Form 811 was developed. This prototype had a different architecture and wording, developed to attract respondents’ attention and to improve their understanding of the form’s tasks. The results from the cognitive interviews and the implications of the research were presented by Cleo Redline in a paper (see Attachment 6.1) entitled “Identifying the Intended Navigational Path of an Establishment Survey” at the February 2005 Federal Committee on Statistical Methodology Conference (http://www.fcsm.gov/events/papers05.html).
This early research established the visual design and architecture. The work to implement the redesigned GSS focused on extending the prototype NSF Form 811 (Part 1) developed in the earlier research. With assistance from RTI, SRS has taken the look and feel developed for Part 1, improved on it, and extended it to the entire survey, in both the paper and Web forms.
To date, SRS has conducted three rounds of usability testing. A fourth round is currently underway. All of these efforts have been conducted under the National Science Foundation’s Division of Science Resources Statistics generic clearance for survey improvement projects (OMB Number 3145-0174).
Rounds 1 and 2 of the GSS usability testing focused on ways to draw the respondents’ attention to preparing a more accurate list of organizational units. For Round 1, two variations of NSF Form 811 in the 2006 GSS instrument were tested and evaluated. Following the first round of testing, a single data collection form was developed, incorporating the results of the first round. Round 2 testing was conducted onsite at participating postsecondary institutions to ensure that the suggested revisions improved usability and reduced respondent burden.
Based on the results of the second round of usability testing, the data collection instrument was further revised and implemented in the 2007 GSS data collection instrument. The majority of the changes were made to Part 1 (NSF Form 811) with only minor changes made to Part 2 (NSF Form 812), the data collection worksheet.
A third round of testing was then launched in the spring of 2008 to evaluate respondents’ reactions to the 2007 GSS survey and to gather insight on proposed changes to the data collection worksheet (NSF Form 812) that are being considered for future GSS surveys, such as splitting out master’s degree students from PhD students in departments.
In the fourth round of testing, an alternative layout to the current format of Part 2 is being tested in both paper and Web format. A landscape layout has been used for many years on the GSS. It is an awkward arrangement, with the directions and definitions on top of the form itself. A portrait layout has been developed and is being tested against the landscape version. The data collected are the same in each format and, to the extent possible, the language is the same in both formats. Once the results of the test have been analyzed, NSF will provide OMB with the results and a recommendation for the preferred layout to be used for the 2008 GSS.
SRS has conducted (and in some areas is continuing to conduct) research in such areas as understanding respondent record keeping practices; ability to report NCES Classification of Instructional Programs (CIP) Codes; and respondents’ understanding of the survey’s instructions and items/questions, both in the paper and Web versions of the survey and across all sections of the survey instrument to ensure that respondents navigate through tasks and information as prescribed. The detailed paradata that can be collected from the Web system provides NSF with excellent feedback on respondents’ navigational issues. For example, feedback on the error messages from the Web system has shown that some messages need improvement. As a result, in 2008 some error messages are being eliminated entirely and some error messages are being added or reworded.
Research has also been undertaken into ways to make the GSS lists of programs clearer and more comprehensive and improve the accuracy of the reporting of the fields of study. In 2007, the crosswalk between the GSS Codes and CIP Codes was redesigned based on research performed by Dr. Dillman.
Review of the GSS Discipline Codes and Updating the GSS Crosswalk between GSS Discipline Codes and the 2000 NCES Classification of Instructional Programs (CIP).
As part of the 2007 GSS, SRS conducted a detailed and comprehensive review of the GSS discipline codes and the NCES CIP codes of interest to the GSS, SRS, and NSF as a whole (see Attachment 6.5). It was decided that the review should look through the entire CIP code list for research degrees in disciplines of interest to the GSS. Five fields of study were added in the 2007 survey; other fields were deleted and some fields were restricted to PhD only (see Attachment 4).
Adding Additional Schools Pilot
In the 2008 GSS cycle, a pilot study (see Attachment 6.4) will examine potential undercoverage of GSS-relevant units in a subset of existing GSS institutions, i.e., the extent to which existing institutions do not report GSS-eligible organizational units. Another part of this pilot will test the feasibility of adding schools of business, social work and education to the covered population of the GSS. The burden hours for this pilot study are included in the 2008 burden estimates in section A.12.
GSS Frame Updating
For 2007, to update the GSS frame expeditiously, the National Center for Education Statistics’ (NCES) Integrated Postsecondary Education Data System (IPEDS) Completions survey was selected because institutions that receive Title IV federal funding are required to report data to IPEDS. Using the NCES Classification of Instructional Programs (CIP) taxonomy, NSF partitioned 452 CIP codes from the full set of NCES CIP codes based on the fields of study that NSF, in collaboration with the survey contractor and NIH, deemed in scope for the GSS. This partitioned list was used to filter the IPEDS Completions survey data, and the resulting institutions were compared against the 2006 GSS data. A total of 537 institutions not currently part of the GSS were identified as potentially eligible for the GSS based on IPEDS completions data.
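The filtering step described above amounts to keeping completions records whose CIP code is in scope and whose institution is not already on the frame. The Python sketch below illustrates that logic under stated assumptions: the function name, the record layout (`institution_id`, `cip`), the institution identifiers, and the choice of which CIP codes are treated as in scope are all hypothetical and do not reproduce the actual 452-code list.

```python
def potential_additions(completions, in_scope_cips, current_frame_ids):
    """IDs of institutions reporting in-scope completions but absent from the frame."""
    return sorted({
        rec["institution_id"]
        for rec in completions
        if rec["cip"] in in_scope_cips
        and rec["institution_id"] not in current_frame_ids
    })


# Hypothetical completions records, for illustration only.
completions = [
    {"institution_id": "A", "cip": "26.0101"},
    {"institution_id": "B", "cip": "52.0201"},
    {"institution_id": "C", "cip": "14.1901"},
    {"institution_id": "D", "cip": "26.0101"},
]
in_scope_cips = {"26.0101", "14.1901"}  # assumed in scope for this example
current_frame_ids = {"A"}               # already on the (hypothetical) frame

print(potential_additions(completions, in_scope_cips, current_frame_ids))
# ['C', 'D']
```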
In addition to IPEDS, other sources were examined to identify potential frame additions. Only newly identified schools (i.e., those not included in any previous source) were considered. To ensure that national laboratories and institutions with postdoctoral appointees or non-faculty research staff members are not missing from the frame, the Master Government List of Federally Funded R&D Centers was checked for additional research centers (national laboratories), and the National Postdoctoral Association list was checked as well.
The online institutions portion of the frame updating activity included four different sources of information: e-learners, directory of online schools, online degrees directory, and the 2006 work of JPSM intern Tiffany Olsen, specifically, her paper Report of Online Institutions for Consideration in the 2007 Survey Frame, and especially appendix D of her report. Using the online institution list in appendix D, further research was conducted with the four previously mentioned sources which generated a larger listing of traditional brick and mortar institutions with an online component or online-only institutions that are in scope for the GSS.
Table B.4 presents the results of the frame updating from all the sources.
Table B.4 Results of GSS Frame Review
Source | Newly Identified Schools |
NCES Integrated Postsecondary Education Data System | 537 |
Carnegie Classification of Institutions of Higher Education | 0 |
Higher Education Directory Publication (HEP) | 7 |
Online review | 20 |
NSF Survey of Research and Development Expenditures at Universities and Colleges | 2 |
NSF Survey of Science and Engineering Research Facilities | 4 |
Council of Graduate Schools | 0 |
Selected membership lists* | 8 |
Master Government List of Federally Funded R&D Centers | 15 |
The National Postdoctoral Association List | 11 |
*Includes the following: Association of American Medical Colleges (AAMC), American Association of Colleges of Osteopathic Medicine (AACOM), Association of Schools of Public Health (ASPH), Association of American Veterinary Medical Colleges (AAVMC), American Association of Colleges of Nursing (AACN), and the American Dental Education Association (ADEA).
In summary, the frame updating activity produced a total of 604 newly identified institutions that were not on the 2006 GSS frame and are possibly eligible for GSS based on the list of GSS-relevant CIP codes. Of the 604 newly identified institutions, 537 institutions were identified via the 2006 IPEDS Completions survey and 67 additional institutions were identified from the remaining sources.
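The totals in this summary follow directly from the counts in Table B.4. The short Python sketch below re-derives them; the dictionary and variable names are illustrative only.

```python
# Counts of newly identified schools by source, copied from Table B.4.
newly_identified = {
    "NCES Integrated Postsecondary Education Data System": 537,
    "Carnegie Classification of Institutions of Higher Education": 0,
    "Higher Education Directory Publication (HEP)": 7,
    "Online review": 20,
    "NSF R&D expenditures survey": 2,
    "NSF research facilities survey": 4,
    "Council of Graduate Schools": 0,
    "Selected membership lists": 8,
    "Master Government List of Federally Funded R&D Centers": 15,
    "The National Postdoctoral Association List": 11,
}

ipeds = newly_identified["NCES Integrated Postsecondary Education Data System"]
total = sum(newly_identified.values())
other_sources = total - ipeds

print(total)          # 604
print(other_sources)  # 67
```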
Newly Eligible Institutions Pilot
For the 2008 GSS survey cycle, the plan is to select 80 of these identified institutions to contact, determine their eligibility, and include them in the 2008 survey if they are found to be GSS-eligible. This pilot will allow for careful planning and implementation of adding these new institutions; the procedures may then be refined as the rest of the newly eligible institutions are included (see Attachment 6.4). Because these institutions are new to the GSS, reporting is expected to require more initial effort from respondents, so the burden is estimated at 3.5 hours per unit for these institutions (about 1 hour higher per unit than the estimate for current GSS institutions). The burden for this pilot is included in the burden estimates in section A.12.
Testing additional questions about postdocs
In the 2009 survey, the GSS will field test the feasibility of collecting additional information on race/ethnicity, citizenship, and sources and mechanisms of support for postdocs. Both the questions and the field procedures will be tested. If successful, NSF will submit them to OMB for clearance to add these questions to the 2010 GSS.
5. Names and Telephone Numbers of Individuals Consulted
The individuals consulted on technical and statistical issues related to the GSS are listed below.
Name | Affiliation | Telephone Number |
Ms. Julia D. Oliver, GSS Survey Manager | National Science Foundation, SRS, Arlington, VA | 703-292-7809 |
Ms. Emilda Rivers, Postdoc Data Project Manager | National Science Foundation, SRS, Arlington, VA | 703-292-7773 |
Ms. Jeri M. Mulrow, Senior Mathematical Statistician | National Science Foundation, SRS, Arlington, VA | 703-292-4784 |
Ms. Jennifer Sutton, Research Training Coordinator | National Institutes of Health, Bethesda, MD | 301-435-2686 |
Mr. Daniel J. Pratt, Project Director | RTI International, Research Triangle Park, NC | 919-541-6615 |
Ms. Laura Burns (reporting and instrumentation) | RTI International, Research Triangle Park, NC | 919-990-8318 |
Ms. Jamie Friedman (data collection) | RTI International, Chicago, IL | 312-456-5262 |
Dr. Pat Green (design) | RTI International, Chicago, IL | 312-456-5260 |
Ms. Emily McFarlane (survey methodology) | RTI International, Research Triangle Park, NC | 919-541-6566 |
Mr. Jim Rogers (data delivery) | RTI International, Research Triangle Park, NC | 919-541-7291 |
Mr. Bob Steele (systems development) | RTI International, Research Triangle Park, NC | 919-316-3836 |
Dr. Shying Wu (imputation and tables production) | RTI International, Research Triangle Park, NC | 919-541-7303 |
1 Defined in the GSS as individuals receiving research training through the department or program that meet the following characteristics: the appointee holds a Ph.D. or equivalent degree; the doctorate was awarded recently; the appointment is for a limited term; the appointment is primarily for the purpose of training in research or scholarship; and the appointee works under the supervision of a senior scholar in a department or research unit affiliated with the university.
2 Defined in the GSS as all doctoral scientists and engineers who are involved principally in research activities but are not considered either postdoctoral appointees or members of the regular faculty.
3 American Association for Public Opinion Research (AAPOR), Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 2004.
File Type | application/msword |
File Title | Supporting Statement |
Author | mjames |
Last Modified By | nsfuser |
File Modified | 2008-10-01 |
File Created | 2008-10-01 |