Survey of Graduate Students and Postdoctorates in Science and Engineering

OMB: 3145-0062

B. Collection of Information Employing Statistical Methods

B.1 Respondent Universe and Sampling Procedure

The GSS is an annual census of all eligible institutions. The universe is intended to cover all U.S. academic institutions that offer graduate (master’s and PhD or equivalent) degree-credit programs in science and engineering, as defined by NSF, and graduate programs in health fields, as defined by NIH.

B.1.1 Discussion of Institutional Frame

An institution is considered eligible, or in scope, for the GSS if it meets the following criterion:

  • Grants at least one master’s or doctoral degree in at least one program in a GSS-eligible field (see Attachment 4 for the list of GSS fields).

The data collection contractor is currently identifying and evaluating schools not in the previous GSS frame to determine whether they are in scope for the GSS. These schools will be added to the 2011 GSS; approximately 250 schools and 500 units are expected to be found newly eligible.

In subsequent cycles, beginning with the 2012 GSS, the GSS will undergo a population review to identify any new institutions that should be added to the list of eligible institutions.

This review will be conducted in a manner consistent with earlier reviews to ensure that all eligible (in-scope) schools are surveyed. The population review is conducted by consulting external sources, such as those listed in Exhibit 9, to identify potentially eligible schools that are not currently included in the GSS frame.

B.2 Description of Survey Methodology and Statistical Procedures

The GSS is primarily a Web-based survey. In the fall 2007 survey, 92.5% of respondents provided data via the Web; 2% provided their data in part or completely via paper; and 5.5% provided their data in part or completely via data files. In 2009, all institutions submitted electronically: 93% of responding institutions submitted using the Web instrument, and 7% of the responding institutions used the data upload feature. Paper versions of the questionnaire are no longer routinely offered, but are made available when requested by an institution.

Each institution has one or more coordinators who manage data collection activities. Before the start of each cycle, NSF mails the president a letter of invitation asking the president to name a coordinator or coordinators for the survey and to verify the institution’s eligibility. Institutions that do not respond to the letter are followed up by phone and e‑mail. Some institutions have separate coordinators for the graduate enrollment section and the postdoc section, and some have separate coordinators for the graduate and medical schools.

The GSS instrument comprises two parts. In Part 1, the coordinator updates a list of all eligible units (departments, programs, research centers, and health care facilities) in the school and classifies each unit by its GSS code (field). A crosswalk between GSS codes and CIP codes is provided to respondents (see Attachment 5; this crosswalk will be updated with 2010 CIP codes for the 2011 GSS). For established GSS schools, this activity involves verifying the eligibility of units pre-populated from the previous survey round, confirming GSS codes, adding any newly eligible units, and deleting defunct units. All Part 1 activities are completed by the coordinator.

In Part 2, data for each unit are entered or uploaded by the coordinator or by designated unit respondents (URs), whom the coordinator may assign as needed. Part 2 requests details about graduate students, postdocs, and doctorate-holding nonfaculty researchers (NFRs) in each GSS-eligible unit. A paper worksheet corresponding to Part 2 is provided with the survey materials sent to each coordinator; information can be compiled on this worksheet in preparation for entry into the Web survey. Once a UR has completed Part 2, the UR submits the data to the coordinator, who reviews and revises them as needed. At that point the UR can no longer change the data but can still view them. Once all units are ready, the coordinator submits the school’s data to NSF, after which the coordinator can only view the data. If the coordinator needs to make a revision, the data collection contractor can roll back the submission for resubmission before the survey close-out date.

The school coordinator (SC) serves as the point of contact at the institution for all internal and external communications about the GSS. It is the responsibility of the SC to notify the URs of their assignments and to ensure that each UR submits completed data by the established due date.

Each GSS survey cycle begins with a pre-data collection e‑mail to the previous cycle’s SC to determine whether that person is still the appropriate contact for the upcoming cycle. The e‑mail is typically sent in early September, with telephone follow-up if confirmation is not received. Once the SC is confirmed or updated, data collection commences. The primary mode of data collection is Web-based; other options include data file upload and paper submission. Data collection begins in October with an e‑mail and a FedEx package providing the SC with Web access information and information about the GSS-eligible degree programs.

Data collection procedures for upcoming survey cycles are expected to be similar to those used in the 2010 GSS. The data collection plan is included in Attachment 6.

B.2.1 Imputation for Item Nonresponse in the GSS

Imputation is used when missing data are present for any item. There were 216 data items collected in four separate questions in the 2009 GSS. All missing data were imputed. The imputation rates for these variables ranged from 1.0% to 7.1%, with a mean of 4.2%. The imputation procedures used for the 2011 cycle will remain the same as the procedures used in 2009 and 2010. A simplified summary of the imputation methods used in 2009 follows.

The imputation procedure used for a given question for a given unit depended on whether data were provided in any prior survey cycle and whether totals were provided in 2009. The method used under each of four conditions is shown in Exhibit 6.

Exhibit 6. Imputation Methods Used, by Conditions: 2009

Conditions                                      | Totals provided in 2009    | No data provided in 2009
Prior survey cycle data available (extant unit) | Carry forward details only | Carry forward total and details
Prior survey cycle data unavailable (new unit)  | Nearest neighbor (NNG)     | Adjusted enrollment (AE) for graduate student data; NNG for postdoc/doctorate-holding nonfaculty researcher (NFR) data


When the 2009 total was reported without complete data for subtotals, but the subtotals were reported by the unit in a previous survey cycle, the details were imputed using a carry-forward method, whereby the prior year’s distribution of the total over the details was applied to the 2009 total.
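As a rough illustration, the carry-forward allocation of details described above can be sketched as follows. All values and function names here are hypothetical; the actual GSS processing system is not reproduced in this document.

```python
def carry_forward_details(current_total, prior_details):
    """Distribute a reported current-year total over detail cells in
    proportion to the unit's prior-cycle distribution of those details."""
    prior_total = sum(prior_details)
    # Scale each prior detail cell so the imputed details sum to the
    # reported current-year total.
    return [current_total * d / prior_total for d in prior_details]

# Hypothetical unit: 60 graduate students reported in 2009 with no
# detail; the prior cycle's detail cells were 20, 15, 10, and 5.
imputed = carry_forward_details(60, [20, 15, 10, 5])
# imputed -> [24.0, 18.0, 12.0, 6.0]
```

By construction, the imputed details always sum to the reported total, so this step cannot introduce inconsistencies between totals and details.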

When details needed to be imputed but a previous year’s data were not available, a nearest neighbor (NNG) was identified from the set of units that responded in the 2009 GSS survey cycle. When graduate student details were being imputed, the nearest neighbor selected had full-time and part-time graduate enrollments that were most similar to the imputee’s enrollments.1 When postdoc and NFR details were being imputed, the total number of postdocs was used to choose the nearest neighbor. In either case, the details were imputed by distributing the total according to the nearest neighbor’s distribution.
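A minimal sketch of this nearest-neighbor step follows, assuming a squared-distance similarity measure over full-time and part-time enrollments; the text says only "most similar," so the exact metric, along with all names and numbers, is an assumption for illustration.

```python
def nearest_neighbor_impute(imputee_total, imputee_ft, imputee_pt, donors):
    """Impute detail cells for a unit by borrowing the detail
    distribution of the responding unit (donor) whose full-time and
    part-time enrollments are closest to the imputee's."""
    donor = min(
        donors,
        key=lambda d: (d["ft"] - imputee_ft) ** 2 + (d["pt"] - imputee_pt) ** 2,
    )
    donor_total = sum(donor["details"])
    # Spread the imputee's total according to the donor's distribution.
    return [imputee_total * x / donor_total for x in donor["details"]]

# Hypothetical donor pool of 2009 respondents:
donors = [
    {"ft": 100, "pt": 20, "details": [60, 40, 15, 5]},
    {"ft": 10, "pt": 5, "details": [6, 4, 3, 2]},
]
imputed = nearest_neighbor_impute(30, 12, 6, donors)
# The second donor is nearest; imputed -> [12.0, 8.0, 6.0, 4.0]
```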

When no data for a question were provided in 2009, total imputation by a carry-forward method was employed if data from a previous survey cycle were available. First, the total was imputed by multiplying a prior year’s total by an inflation factor to account for year-to-year change. The details were then imputed by applying the prior year’s distribution to the imputed total. The same procedure was used in the 2008 imputations. In 2007, the carry-forward method was used only if the unit had reported data within the previous 5 years. This condition was lifted in 2008 because simulations using the 2007 data revealed that the carry-forward method performed better than other methods even when the previous data were reported more than 20 years earlier.
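The total-plus-details carry-forward can be sketched similarly. The 3% inflation factor below is purely illustrative; the actual year-to-year adjustment factors are computed from GSS data and are not reproduced here.

```python
def carry_forward_total_and_details(prior_total, prior_details, inflation):
    """Impute a missing current-year total by inflating the prior
    total, then allocate it over details per the prior distribution."""
    total = prior_total * inflation
    prior_sum = sum(prior_details)
    details = [total * d / prior_sum for d in prior_details]
    return total, details

# Hypothetical unit that skipped 2009 but reported 100 students
# (split 50/30/20) in a prior cycle, with an assumed 3% growth factor:
total, details = carry_forward_total_and_details(100, [50, 30, 20], 1.03)
# total -> 103.0; the details sum back to the imputed total
```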

In rare instances where neither current-year totals nor data from a previous year were available, a method called adjusted enrollment (AE) was used for imputation of graduate student data, as in the 2008 cycle. This method, proposed by NSF, replaced the nearest-neighbor imputation used before 2008 because the aforementioned imputation research demonstrated that the AE method was superior. Unlike the carry-forward and NNG methods, which use only GSS data, the AE method uses IPEDS data to estimate the graduate student totals and details. In this method, for each gender category, the full-time and part-time institutional graduate enrollment totals were obtained from the IPEDS Fall Enrollment survey. Using the crosswalk between GSS codes and CIP codes, each institutional total was allocated between missing and nonmissing units according to the IPEDS distributions over the CIP codes within the same category. The portion attributed to missing units was then divided evenly among those units, and each unit’s total was further distributed to detailed cells using the NNG method.
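Under simplifying assumptions (a single gender/attendance category, with IPEDS enrollment shares already aggregated by CIP code), the AE allocation step might look like the sketch below. All names, CIP codes, and numbers are hypothetical.

```python
def adjusted_enrollment_per_unit(ipeds_total, cip_shares, missing_cips, n_missing_units):
    """Allocate an institution's IPEDS enrollment total to its
    nonresponding GSS units for one gender/attendance category.

    ipeds_total: institutional enrollment from the IPEDS Fall
        Enrollment survey for this category.
    cip_shares: mapping of CIP code -> share of institutional
        enrollment (linked to GSS units via the GSS-to-CIP crosswalk).
    missing_cips: CIP codes associated with the nonresponding units.
    n_missing_units: number of nonresponding units.
    """
    # Share of institutional enrollment attributable to missing units.
    missing_share = sum(cip_shares[c] for c in missing_cips)
    # Divide that portion evenly among the missing units; each unit's
    # total is then spread to detail cells by the NNG method.
    return ipeds_total * missing_share / n_missing_units

# Hypothetical institution: 1,000 students enrolled; its two missing
# units map to CIP codes holding 10% and 5% of enrollment.
per_unit = adjusted_enrollment_per_unit(
    1000, {"14.0101": 0.10, "26.0101": 0.05, "27.0101": 0.85}, ["14.0101", "26.0101"], 2
)
# per_unit is approximately 75 students per missing unit
```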

Since the IPEDS data do not include counts of postdocs or NFRs, GSS had to apply a different method when these data were missing and no previous data were available. The unit’s full-time and part-time graduate student enrollment figures, as reported or imputed for the 2009 GSS, were used to identify a nearest neighbor donor from the pool of GSS units. The donor’s postdoc and NFR data were then used to impute the missing data.

The 2009 GSS survey frame contains 13,268 units. Of these units, 7.5% required imputation of graduate student totals or their details; 7.8% required imputation of postdoc totals, NFR totals, or their details. Exhibit 7 summarizes the number of units imputed by each method.

Exhibit 7. Units Requiring Imputation, by Method Used and Type of Data Imputed: 2009

Method                                    | Graduate student data: Number | Percent of all units | Postdoc/NFR data: Number | Percent of all units
Units imputed                             |   995 | 7.5 | 1,031 | 7.8
Units imputed by carry forward (CF)       |   965 | 7.3 |   935 | 7.1
Units imputed by nearest neighbor (NNG)   |    28 | 0.2 |    95 | 0.7
Units imputed by adjusted enrollment (AE) |     2 | 0.0 |     1 | 0.0

B.3 Methods Used To Maximize Response Rate

Because the GSS is designed to produce estimates for all academic institutions that offer graduate degree programs in SEH fields, care is taken to maximize response rates. The GSS contractor staff work closely with the coordinators at institutions and strive to ensure that all contacts are positive. One of the primary goals of the GSS staff is to build strong working relationships with all participating institutions.

Survey techniques proven successful in past surveys will again be used to maximize the response rate for the GSS. These techniques include

  • Early pre-data collection confirmation of the school coordinator

  • Two-part data collection to ensure early notification of unit respondents of their assignments. The first part entails a review/update by the coordinator of the GSS-relevant programs and notification of unit respondents to begin their data reporting assignments. The second part is the unit-level reporting of counts of graduate students, postdocs, and NFRs

  • Separate due dates for each of the two parts of the GSS to help identify at the earliest juncture those institutions that might be potential nonrespondents

  • Targeted e‑mails and telephone follow-up based on response status

  • Availability of knowledgeable GSS contractor staff to provide assistance to the coordinators and unit respondents

  • Multiple modes of data collection allowed (Web, data file uploads, etc.)

  • GSS Help desk staff available to respond to telephone and e‑mail questions and concerns raised by institution staff

  • Presentations at AIR and CGS meetings demonstrating the Web-based data collection system and discussing any proposed changes

  • The inclusion of cover letters explaining how the provided data are used

  • The inclusion in the survey package and in the GSS Web survey of a “crosswalk” listing the fields of study for which data are requested for the GSS along with the NCES CIP codes for these fields as published in A Classification of Instructional Programs. This crosswalk is for the convenience of those institutions using CIP codes in reporting their enrollment and degrees to the IPEDS system

  • Enlistment of others at the institution, as appropriate, to gain cooperation.

These methods have proven successful in the past, as evidenced by steadily increasing response rates. Exhibit 8 displays institution, school, and unit response rates for the 2007, 2008, and 2009 survey cycles. At every level, response rates have improved with each successive cycle.

Exhibit 8. Institution, School, and Unit Response Rates: 2007–09

Level       | Year | Complete respondents | Partial respondents | Nonrespondents
Institution | 2007 | 94.9% (n = 552)      | 1.7% (n = 10)       | 3.4% (n = 20)
Institution | 2008 | 97.8% (n = 566)      | 1.0% (n = 6)        | 1.2% (n = 7)a
Institution | 2009 | 99.1% (n = 568)b     | 0.2% (n = 1)b       | 0.7% (n = 4)b
School      | 2007 | 95.4% (n = 668)      | 1.4% (n = 10)       | 3.1% (n = 22)
School      | 2008 | 97.9% (n = 693)      | 0.9% (n = 6)        | 1.3% (n = 9)
School      | 2009 | 99.0% (n = 694)      | 0.3% (n = 2)        | 0.7% (n = 5)
Unit        | 2007 | 87.3% (n = 11,020)   | 10.2% (n = 1,290)   | 2.5% (n = 319)
Unit        | 2008 | 87.8% (n = 11,560)   | 11.0% (n = 1,450)   | 1.2% (n = 156)
Unit        | 2009 | 88.1% (n = 11,684)   | 11.2% (n = 1,486)   | 0.7% (n = 98)

a Six of these institutions did not have any unit respondent. New York Medical College provided data for 8 of its 17 units.

b Three of these institutions did not have any unit respondent. University of Nebraska Medical Center provided data for 1 of its 32 units.

B.4 Testing of Procedures

NSF has sponsored methodological research during every cycle to help improve the GSS. This section reviews the methodological investigations undertaken since the last OMB clearance submission in 2008.

B.4.1 Tests of Survey Procedures

Initial GSS Frame Update Investigation

An initial review was undertaken to update the GSS frame using NCES’s IPEDS Completions survey, because institutions that receive Title IV federal funding are required to report data to IPEDS. Using the NCES CIP taxonomy, NSF selected 452 CIP codes from the full set based on the fields of study that NSF, in collaboration with the survey contractor and NIH, deemed in scope for the GSS. This list was used to subset the IPEDS Completions survey data, which were then compared against the 2006 GSS data. A total of 537 institutions not currently part of the GSS were identified as potentially eligible for the GSS based on IPEDS Completions data.

In addition to IPEDS, other sources were examined to identify potential frame additions; only newly identified schools (i.e., those not included in any previous source) were considered. To ensure that national laboratories and institutions with postdocs or NFRs were not missed, the Master Government List of Federally Funded R&D Centers was checked for additional research centers (national laboratories), and the National Postdoctoral Association list was checked as well.

The online-institutions portion of the frame updating activity drew on four sources of information: e‑learners, a directory of online schools, an online degrees directory, and the 2006 work of JPSM intern Tiffany Olsen, specifically her paper Report of Online Institutions for Consideration in the 2007 Survey Frame, especially its Appendix D. Starting from the online institution list in Appendix D, further research using the four sources generated a larger list of in-scope institutions, comprising both traditional brick-and-mortar institutions with an online component and online-only institutions. Exhibit 9 presents the results of the frame updating from all sources.

Exhibit 9. Results of GSS Frame Review

Source                                                                           | Newly identified schools
NCES Integrated Postsecondary Education Data System (IPEDS)                      | 537
Carnegie Classification of Institutions of Higher Education                      | 0
Higher Education Directory Publication (HEP)                                     | 7
Online review                                                                    | 20
NSF Survey of Research and Development Expenditures at Universities and Colleges | 2
NSF Survey of Science and Engineering Research Facilities                        | 4
Council of Graduate Schools (CGS)                                                | 0
Selected membership lists*                                                       | 8
Master Government List of Federally Funded R&D Centers                           | 15
The National Postdoctoral Association List                                       | 11

*Includes the following: Association of American Medical Colleges (AAMC), American Association of Colleges of Osteopathic Medicine (AACOM), Association of Schools of Public Health (ASPH), Association of American Veterinary Medical Colleges (AAVMC), American Association of Colleges of Nursing (AACN), and the American Dental Education Association (ADEA)

In summary, the frame updating activity produced a total of 604 newly identified institutions that were not on the 2006 GSS frame and were possibly eligible for GSS based on the list of GSS-relevant CIP codes. Of the 604 newly identified institutions, 537 institutions were identified via the 2006 IPEDS Completions survey, and 67 additional institutions were identified from remaining sources.

Newly Eligible Institutions Pilot

In 2008, a sample of 80 of the 604 new institutions identified in the frame review was selected to participate in a pilot study designed to examine ways to incorporate new institutions into the GSS. The new institution pilot study evaluated procedures for confirming the eligibility of potentially eligible institutions and for recruiting and incorporating those deemed eligible into the GSS. Eligibility was determined by speaking with institution personnel in presidents’ offices, institutional research offices, deans’ offices, and the departments where potentially GSS-eligible programs were thought to exist. The pilot study found that approximately one third of these institutions were GSS eligible.

The potentially eligible institutions that were not part of the 2008 new institution pilot study are being contacted during a frame expansion eligibility screening study to determine their eligibility more definitively. The 2011 survey cycle will include the new institutions found GSS eligible in the 2008 pilot study and in the ongoing eligibility screening study, as well as any additional institutions that may have become eligible since the pilot study was completed.

Recordkeeping Study

At the end of the 2009 survey, a sample of institutions was contacted and asked to complete a short Web survey about recordkeeping procedures at their institution. The purpose of the Recordkeeping Study (RKS) was to evaluate data reporting for two key sections of the GSS: to understand how institutions track and record certain data for the GSS and how they then pull the information together for reporting. The first focus of the study was the recordkeeping practices institutions use for tracking financial data for graduate students; the second was recordkeeping practices for NFRs.

After conducting cognitive interviews at 11 institutions, the RKS questionnaire was revised and programmed. A sample of 128 institutions was drawn and coordinators at those institutions were asked to complete the Web-based study instrument. A response rate of over 65% was targeted and achieved.

Overall, respondents reported that completing the graduate student financial data was neither easy nor hard. The primary sources of difficulty are that students have multiple sources and mechanisms of financial support and that the information comes from multiple offices. Respondents had the most difficulty determining funding that was not processed by the university (note that most federal funds are paid through the university) and funding for multidisciplinary/interdisciplinary students (or students funded by a different department than the one granting their degree). Self-support data were also among the least available information (self-support is often estimated by subtracting known sources of support from total cost) and were often overlooked when completing the financial support grid.

Most respondents could report the mechanism of support for most students, although they had more difficulty determining fellowships and traineeships than determining research or teaching assistantships. The financial data that were easiest for schools to report (i.e., federal funding, institutional funding, and mechanism of support) were more likely to be centrally available.

For the most part, respondents found the GSS definitions and categories to be clear, and these definitions and categories were frequently used by respondents to complete the GSS. However, the GSS definitions and categories did present difficulties when the school did not track the data using those categories.

The second part of the survey included questions about NFRs. Respondents reported that NFRs are primarily determined by their job title/position. The most common title was research associate/scientist/fellow. The other common title was postdoctoral associate/scholar/fellow/ researcher. Almost all respondents indicated that they could distinguish between postdocs and NFRs.

Although job titles were the most common method for identifying NFRs, 45% of respondents reported some may be missed because the job titles do not include all NFRs at the school. Some 41% of respondents also indicated that they may be missing entire departments in which NFRs work. Most respondents can determine if their staff has PhDs, if they primarily perform research, and if they are not faculty. These were also the three most common characteristics listed for NFRs. Of the respondents who did not report any NFRs at their school, just over one third indicated that they think their school might have NFRs.

The overarching finding from the RKS is that schools are inconsistent in what information they track for the GSS, what information they could track in the future, how they track it, and the sources they use to gather it. To address this, the data collection contractor recommended providing GSS respondents with more detailed information and guidelines that they can use to report data more consistently. The specific recommendations follow.

1. Improve definitions for mechanisms of support. Almost 75% of participants indicated that they use the GSS definitions when completing the GSS. Yet participants also indicated that they have difficulty determining who has a fellowship and who has a traineeship because the distinction between the two is somewhat unclear. Improving the definitions and providing examples may help improve the consistency in reporting.

2. Provide a crosswalk for mechanisms of support. Like improving the definitions, providing a crosswalk between the most common graduate student awards and the GSS mechanism-of-support categories (e.g., the NIH T32 award is a traineeship) could facilitate reporting of financial data. The recommendation is to first conduct exploratory research by contacting a sample of schools and asking for the titles and descriptions of the awards their students receive. These can then be mapped to the GSS categories to create the crosswalk, which can be implemented experimentally in the GSS to evaluate its effects.

3. Provide examples of job titles for NFRs. To help clarify the definition of NFRs, the GSS could provide examples of the most common job titles used, such as research associate, research scientist, and visiting scholar, although job functions may vary across institutions with the same job title.

4. Include a “Don’t Know” category for source of financial support. Participants indicated that it is often difficult to know the source of support if it is not processed through their institution. Therefore, they cannot determine if the outside funding is federal or nonfederal and consequently are forced to choose one of the categories. It is possible that respondents use the “self-support” category as a catchall for when they do not know what the source of support is. The use of a “Don’t Know” option would prevent overinflating some categories and allow for more informed imputation. To prevent respondents from abusing the “Don’t Know” option, respondents would be required to enter the details they do have about these students in the comments box, so that the data collection contractor can help the respondent classify these students.

B.4.2 Tests of Survey Content and Format

Testing Additional Questions about Postdocs

In 2010, new questions concerning postdocs were added to the GSS, based on the findings of a postdoc pilot study that was conducted concurrently with the 2009 survey cycle. This pilot study tested the feasibility of collecting data beyond what was then collected in the GSS. The additional information included data on the postdoc definition, race/ethnicity, citizenship, sources and mechanisms of support, doctorate degree type, and origin of doctorate for postdocs. All but two of the expanded postdoc questions were substantively similar to the items asked about graduate students; the exceptions were the question about the unit’s definition of a postdoc position and the question on the origin of the doctoral degree, neither of which is applicable for graduate students. Additional goals were to develop an operational definition of postdocs, determine the best methods for identifying a person to serve as a postdoc coordinator (PC), and implement methods to reduce underreporting of postdoc data in the GSS. The pilot study included 74 schools, of which 26 were “large” schools (10 or more units with postdocs and NFRs) and 48 were “small” schools (having fewer than 10 such units).

The SCs for the large schools were asked to complete the main GSS in addition to the postdoc pilot study. Thus, the large schools participated in the main GSS, while the small schools participated in the postdoc pilot data collection only. Although the graduate student grids in the pilot study survey were identical to the grids in the main GSS, the postdoc and NFR grids were expanded. In addition, SCs were given the option of appointing a PC; PCs were given more time to complete the Part 1 data than SCs. Coordinators received worksheets and Web survey views appropriate for their roles (e.g., PCs were only allowed to view the postdoc section).

The results of the pilot study suggested that, for the most part, schools could answer the new postdoc questions. The demographic items (Question A) appear to be the easiest to complete and have the least amount of item nonresponse; 96.7% of reporting units were able to supply these data. The financial support items (Question B) were answered by 90.8% of reporting units. On doctoral degree–type items (Questions C1 and C2), it was difficult to distinguish between true zeros and missing values given the format of the questions in the pilot study, which was similar to the format historically used in the GSS for such questions. Respondents may leave a cell blank rather than typing a zero. These two items have been modified from the pilot format to a format similar to origin of the doctoral degree item (Question C3) to ensure the ability to distinguish between true zeros and missing data.

The results of the pilot also suggest that providing the option of appointing a separate PC was a valuable addition, especially for schools with large numbers of postdocs. Exactly half of the 74 pilot schools chose to designate a PC. From the debriefing interviews, SCs and PCs seemed pleased with the arrangements. None of the schools that had designated a PC asked to change back to having only an SC.

PCs had higher response rates (97.3%) to the postdoc section than the SCs did (86.5%). Having a PC may also result in more comprehensive coverage of postdocs in an institution. For small schools, designating a PC was associated with a larger net increase in units per school that reported having postdocs (2.5) compared with a net increase of about 0.5 per school if postdocs were reported by SCs. Similarly, small schools with PCs reported 30% more postdocs than in the previous year, compared with a 12% increase for SCs.

Based on the results of the postdoc pilot study, the 2010 GSS incorporated the expanded postdoc questions from the postdoc pilot study. In advance of the 2010 GSS data collection start, the school presidents were sent a letter advising them of the newly expanded questions regarding postdocs and asking them whether they would like to appoint a separate PC to assist with their school’s reporting.

B.4.3 Changes to the 2011 GSS

Attachment 8 contains a listing of the changes to be made to the 2011 GSS, based on the methodological work described here and experience gained in conducting prior survey rounds.

B.5 Names and Telephone Numbers of Individuals Consulted

The individuals consulted on GSS technical and statistical issues are listed in Exhibit 10.

Exhibit 10. Individuals Consulted on GSS Technical and Statistical Issues

Name                                                            | Affiliation                                          | Telephone number
Ms. Kelly Kang, GSS Survey Manager                              | National Science Foundation, NCSES, Arlington, VA    | 703-292-7796
Ms. Emilda Rivers, HRS Program Director                         | National Science Foundation, NCSES, Arlington, VA    | 703-292-7773
Mr. Stephen Cohen, Chief Statistician                           | National Science Foundation, NCSES, Arlington, VA    | 703-292-7769
Ms. Jennifer Sutton, Research Training Coordinator              | National Institutes of Health, Bethesda, MD          | 301-435-2686
Dr. Patricia Green, Project Director                            | RTI International, Chicago, IL                       | 312-456-5260
Ms. Laura Burns, Methodology and Reporting Task Leader          | RTI International, Research Triangle Park, NC        | 919-990-8318
Mr. Peter Einaudi, Publications and Data Dissemination Task Leader | RTI International, Research Triangle Park, NC     | 919-541-8765
Ms. Jamie Friedman, Data Collection Task Leader                 | RTI International, Chicago, IL                       | 312-456-5262
Dr. Jean Lennon, FFRDC Survey Task Leader                       | RTI International, Research Triangle Park, NC        | 919-485-2654
Ms. Emily McFarlane Geisen, Survey Methodologist                | RTI International, Research Triangle Park, NC        | 919-541-6566
Mr. Jim Rogers, Data Delivery Task Leader                       | RTI International, Research Triangle Park, NC        | 919-541-7291
Mr. Bob Steele, Systems Development Task Leader                 | RTI International, Research Triangle Park, NC        | 919-316-3836
Dr. Shiying Wu, Mathematical Statistical Task Leader            | RTI International, Research Triangle Park, NC        | 919-541-7303

1 In special cases where the unit to be imputed had provided only total enrollment (full-time and part-time combined), the total was split into full-time and part-time enrollment before the imputation process using the carry-forward or NNG method.
