Survey of Occupational Injuries and Illnesses
1220-0045
August 2016
SUPPORTING STATEMENT, Part B
B. Collection of information employing statistical methods.
The statistical methods used in the sample design of the survey are described in this section. The documents listed below are attached or available at the hyperlink provided. These documents are either referenced in this section or provide additional information.
Overview of the Survey of Occupational Injuries and Illnesses Sample Design and Estimation Methodology – Presented at the 2008 Joint Statistical Meetings (10/27/08)-- http://www.bls.gov/osmr/pdf/st080120.pdf
Deriving Inputs for the Allocation of State Samples (05/01/13)
The growth in cases with Restricted Activity or Job Transfer (08/2011)
Methods Used To Calculate the Variances of the OSHS Case and Demographic Estimates (2/22/02)
Variance Estimation Requirements for Summary Totals and Rates for the Annual Survey of Occupational Injuries and Illnesses (6/23/05)
BLS Handbook of Methods – Occupational Safety and Health Statistics (September 2008) -- http://www.bls.gov/opub/hom/pdf/homch9.pdf
Nonresponse Bias in the Survey of Occupational Injuries and Illnesses (August, 2013) -- http://www.bls.gov/osmr/pdf/st130170.pdf
Sample Allocation to Increase the Expected Number of Publishable Cells in the Survey of Occupational Injuries and Illnesses (August, 2015) -- http://www.bls.gov/osmr/pdf/st150070.pdf
1. Description of universe and sample.
Universe
The main source for the SOII sampling frame is the BLS Quarterly Census of Employment and Wages (QCEW) (BLS Handbook of Methods, Chapter 5 from http://www.bls.gov/opub/hom/homch5_a.htm). The QCEW is a quarterly near-census of employers that collects employment and wages by ownership, county, and six-digit North American Industry Classification System (NAICS) code. States have the option to either use the QCEW or supply their own public sector sampling frames for State and local government units. A private sector frame is supplied only for Guam, where data are not available in the QCEW. The number of states providing their own frames is shown in Table 1:
Table 1: Number of states providing frames by ownership type
| Year | State Frame | Local Frame | Private Frame | 
| 2014 | 6 | 5 | 1 | 
| 2015 | 4 | 3 | 1 | 
| 2016 | 4 | 3 | 1 | 
The potential number of respondents (establishments) covered by the scope of the survey is approximately 8.4 million, although only about 800,000 employers keep records on a routine basis. The remainder either fall under OSHA recordkeeping exemptions for employers in low-hazard industries and employers with fewer than 11 employees, or have no recordable cases. The occupational injury and illness data reported through the annual survey are based on records that employers in the following North American Industry Classification System (NAICS) industries maintain under the Occupational Safety and Health Act:
| Sector | Description | 
| 11 | Agriculture, Forestry, Fishing, and Hunting | 
| 21 | Mining, Quarrying, and Oil and Gas Extraction | 
| 22 | Utilities | 
| 23 | Construction | 
| 31, 32, 33 | Manufacturing | 
| 42 | Wholesale Trade | 
| 44,45 | Retail Trade | 
| 48,49 | Transportation and Warehousing | 
| 51 | Information | 
| 52 | Finance and Insurance | 
| 53 | Real Estate and Rental and Leasing | 
| 54 | Professional, Scientific, and Technical Services | 
| 55 | Management of Companies and Enterprises | 
| 56 | Administrative and Support and Waste Management and Remediation Services | 
| 61 | Educational Services | 
| 62 | Health Care and Social Assistance | 
| 71 | Arts, Entertainment, and Recreation | 
| 72 | Accommodation and Food Services | 
| 81 | Other Services (except Public Administration) | 
Excluded from the national survey collection are:
Self-employed individuals;
Farms with fewer than 11 employees (Sector 11);
Employers regulated by other Federal safety and health laws;
United States Postal Service; and
Federal government agencies.
Mining and Railroad industries are not covered as part of the sampling process. The injury and illness data from these industries are furnished directly from the Mine Safety and Health Administration and the Federal Railroad Administration, respectively, and are used to produce State and national level estimates.
Data collected for reference year 2008 and published in calendar year 2009 marked the first time state and local government agency data were collected for all states and published for all states and the nation as a whole.
The SOII is a Federal-State cooperative program, in which the Federal government and participating states share the costs of participating state data collection activities. State participation in the survey may vary by year. Sample sizes are determined by the participating states based on budget constraints and independent samples are selected for each state annually. Data are collected by BLS regional offices for non-participating states.
For the 2016 survey, 41 states plus the District of Columbia plan to participate in the survey. For the remaining nine states, which are referred to as Non-State Grantees (NSG), a smaller sample is selected to provide data that contribute to national estimates only.
The nine NSG States for 2016 are:
| Colorado | Florida | Idaho | 
| Mississippi | New Hampshire | North Dakota | 
| Oklahoma | Rhode Island | South Dakota | 
Additionally, estimates are tabulated for three U.S. territories (Guam, Puerto Rico, and the Virgin Islands), but data from these territories are not included in the tabulation of national estimates.
Sample
The SOII utilizes a stratified probability sample design with strata defined by state, ownership, industry, and size class. The first characteristic enables all the State grantees participating in the survey to produce estimates at the State level. Ownership is divided into three categories: state government, local government, and private industry. The degree of industry stratification varies by State. This is desirable because some industries are more prevalent in some states than in others. Also, some industries can be relatively small in employment but have high injury and illness rates, which makes them likely to be designated for estimation. Thus, states determine which industries are most important in terms of publication, and the extent of industry stratification is set independently within each state. BLS sets some minimal levels of desired industry publication to ensure sufficient coverage for national estimates, so the state levels can only be set at an industry detail that is more specific than those set by BLS. These industry classifications are defined using the North American Industry Classification System (NAICS, http://www.census.gov/eos/www/naics/) and are referred to as Target Estimation Industries (TEI). The industry classifications set by the national office are referred to as NTEI, and are not used as sampling strata.
Finally, establishments are classified into five size classes based on average annual employment and defined as follows:
| Size Class | Average Annual Employment | 
| 1 | 10 or less | 
| 2 | 11-49 | 
| 3 | 50-249 | 
| 4 | 250-999 | 
| 5 | 1000 or greater | 
After each establishment is assigned to its respective stratum, a systematic selection with equal probability is used to select a sample from each sampling cell (stratum). As mentioned earlier, a sampling cell is defined as state/ownership/TEI/size class. Prior to sample selection, units within a sampling cell are sorted by employment and then by Reporting Unit number (a unique identifier assigned to each reporting unit on the QCEW) to ensure a consistent representation of all employments in each stratum. Full details of the survey design are provided in Section 2.
For survey year 2016, the sample size will be approximately 240,000 or 2.9 percent of the total 8.4 million establishments in state, local, and private ownerships.
Response rate. The survey is a mandatory survey, with the exception of State and local government units in the States listed below:
| Alabama | Arkansas | Colorado | Delaware | 
| District of Columbia | Florida | Georgia | Idaho | 
| Illinois | Kansas | Louisiana | Mississippi | 
| Missouri | Montana | Nebraska | New Hampshire | 
| North Dakota | Ohio | Pennsylvania | Rhode Island | 
| South Dakota | Texas | | | 
Each year, respondents in the SOII are notified of their requirement to participate via mail. All non-respondents are sent up to two non-response mailings as a follow-up to the initial mailing. Some states choose to send a third or fourth non-response mailing to non-respondents late in the collection period. For Survey Year 2014, approximately half of the states sent an optional third non-response mailing to a majority of the non-respondents at that point in time, and less than five percent of the states sent a fourth non-response mailing. In addition, states may contact respondents via telephone for additional non-response follow-up. No systematic establishment level data on the number of telephone non-response follow-up contacts is captured.
As mentioned earlier, public sector establishments were included in the 2008 survey for all states, including those from which no public sector data had been collected in the past. In these states, public sector establishments have no mandate to provide data to the SOII; their participation is voluntary. For SY 2008, the rates for both state and local government decreased, primarily due to the addition of the voluntary state and local government establishments.
In 2010 an in-depth response rate analysis was undertaken. Aggregate response rates in the SOII were shown to be above 90% due to the mandatory nature of the survey and the excellent efforts to obtain survey data by BLS state and regional partners. However, the analysis also showed that state and local government units had low response rates in states where their reporting is voluntary. In subsequent years, this study was updated to continually monitor item and establishment non-response. As of the most recent update, there have been no significant changes.
The table below illustrates the establishment level response rates from 2003-2014:
 
Although response rates for the SOII program have historically been high, the expansion of public sector collection in voluntary states resulted in a response rate of 75 percent in state government in 2008. Per OMB statistical guidelines, a nonresponse bias study was initiated and completed in 2013 (See “Nonresponse Bias in the Survey of Occupational Injuries and Illnesses” in the supporting documents). This work concluded that in states where participation is voluntary, there is statistically significant evidence to suggest that counts for establishments identified by a model as being ‘likely’ to respond are lower than establishments that were identified as ‘unlikely’ to respond. Similarly, the mean case rates for establishments identified by a model as being ‘likely’ to respond were higher than those identified as being ‘unlikely’ to respond. This apparent contradiction between the biases in the measures was explained by the changes in the estimates of the hours worked that are included in the rate estimate. Given these voluntary state/local units comprised 1.3% of the total survey, efforts to address these observed biases were deferred due to resource constraints.
Additional response efforts are being conducted to analyze response rates for several key data elements collected for each establishment in the survey. Data elements for NAICS industry, SOC occupation, source, nature, part, and event for each case with days away from work are coded by BLS regional staff and/or state partners. As such, these fields are always available for collected data. Other data elements such as ethnicity, whether the event occurred before/during/after the work shift, the time of the event, and the time the employee began work may be missing from collected data. BLS has initiated a response analysis effort for these other data elements to identify specific response rates and the characteristics of respondents versus non-respondents for these variables.
Regional offices are also working with States on collection practices to improve response for voluntary units.
BLS will continue to monitor the response rates in the next three years for all segments of the survey scope. BLS will update the analysis each year and make recommendations for improvements in the data collection process based on the results of the analysis. If response rates at the establishment level remain below 80% for any group of establishments, BLS will conduct additional non-response bias studies. If response rates for any specific data element within establishments are below 70%, BLS will also implement additional non-response bias studies. Details for these studies will be documented as the studies begin.
2. Statistical methodology.
Survey design. The survey is based on probability survey design theory and methodology at both the national and state levels. This methodology provides a statistical foundation for drawing inference to the full universe being studied.
Research was done to determine what measure of size was most appropriate for the allocation module. Discussion with Occupational Safety and Health Statistics (OSHS) program management narrowed the choices to the rates for Total Recordable Cases (TRC); Cases with Days Away from Work (DAFW); and Cases with Days Away from Work, Job Transfer, or Restriction (DART).
Rates from the 2003 SOII were studied for all 1251 TEIs for each of the above case categories. The average case rate, standard deviation (SD), and coefficient of variation (CV) for each set of rates were calculated. The CV is the standard deviation divided by the estimate, which is commonly used to compare estimates in relative terms. The results are shown below:
| Description | Average Rate | SD | CV | 
| DAFW | 1.5540 | 1.078 | 0.69 | 
| DART | 3.0479 | 2.000 | 0.66 | 
| TRC | 5.5300 | 3.229 | 0.58 | 
Based on this information it was recommended that the TRC rate be used as the measure of size for the sample allocation process for the survey. The lower CV indicates that it is the most stable indicator.
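The comparison can be checked with a few lines of arithmetic. The sketch below recomputes the CV for each case category from the average rates and standard deviations reported above; the variable and function names are illustrative only.

```python
# Average rate and standard deviation for each case category,
# as reported for the 2003 SOII across all TEIs.
rates = {
    "DAFW": (1.5540, 1.078),
    "DART": (3.0479, 2.000),
    "TRC": (5.5300, 3.229),
}


def coefficient_of_variation(mean, sd):
    """CV = standard deviation divided by the estimate (here, the mean rate)."""
    return sd / mean


cvs = {name: round(coefficient_of_variation(mean, sd), 2)
       for name, (mean, sd) in rates.items()}

# The category with the smallest CV is the most stable in relative terms.
most_stable = min(cvs, key=cvs.get)

print(cvs)          # {'DAFW': 0.69, 'DART': 0.66, 'TRC': 0.58}
print(most_stable)  # TRC
```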
Additionally, to fulfill the needs of users of the survey statistics, the sample provides industry estimates. A list of the industries for which estimates are required is compiled by the BLS after consultation with the principal Federal users. The sample is currently designed to generate national data for all targeted NAICS levels that meet publication standards.
Allocation procedure. The principal feature of the survey’s probability sample design is its use of stratified random sampling with Neyman allocation. The characteristics used to stratify the units are state, ownership (whether private or state or local government), industry code, and employment size class. Since these characteristics are highly correlated with the characteristics that the survey measures, stratified sampling provides a gain in precision and thus results in a smaller sample size.
Using Neyman allocation, optimal sample sizes are determined for each stratum within each State. Historical case data are applied to compute sampling errors used in the allocation process. Details about this process can be found in Deriving Inputs for the Allocation of State Samples (05/01/13).
The first simplifying assumption for allocation is that for each TEI size class stratum h, the employment in each establishment is the same, which is denoted by $e_h$. BLS also ignores weighting adjustments. In addition, BLS assumes that the sampling of establishments in each stratum is simple random sampling with replacement. (It is actually without replacement, of course, but this is a common assumption to simplify the formulas.)
One consequence of these assumptions is that the estimate of the overall employment is constant and as a result the estimated incidence rate of recordable cases in the universe is the estimated number of recordable cases divided by this constant. Therefore, the optimal allocation for the total number of recordable cases and the incidence rate of recordable cases are the same. BLS will only consider the optimal allocation for the total number of recordable cases.
BLS introduces the following notation. For sampling stratum h let:
$N_h$ denote the number of frame units
$n_h$ denote the number of sample units
$w_h = N_h / n_h$ denote the sample weight
$E_h = N_h e_h$ denote the total employment in stratum h
$r_h$ denote the incidence rate for total recordable cases
$t_h$ denote the unweighted sample number of recordable cases
Also let:
$T$ denote the estimated number of recordable cases in the entire universe.
Then
$$T = \sum_h w_h t_h \tag{1}$$
$$V(T) = \sum_h w_h^2 V(t_h) = \sum_h \frac{N_h^2}{n_h^2} V(t_h) \tag{2}$$
where V denotes variance.
Now BLS will obtain $V(t_h)$ under two different assumptions. Assumption (a) is:
(a) All employees in stratum h have either 0 or 1 recordable cases and the probability that an employee has a recordable case is $r_h$.
In this case $t_h$ can be considered to have a binomial distribution with $n_h e_h$ trials and $r_h$ the probability of success in each trial and consequently
$$V(t_h) = n_h e_h r_h (1 - r_h) \tag{3}$$
Assumption (b) is:
(b) The total recordable case rate for the $n_h$ sample establishments in stratum h has a binomial distribution with $n_h$ trials and $r_h$ the probability of success in each trial. In that case
$$V(t_h) = n_h e_h^2 r_h (1 - r_h) \tag{4}$$
Although BLS will derive the optimal allocations under both assumptions, BLS prefers assumption (b), since BLS believes that under assumption (a) the variance of the recordable case rate among establishments in stratum h will be unrealistically small, particularly for strata with large $e_h$.
To derive the optimal allocation under assumption (a) BLS substitutes (3) into (2) obtaining
$$V(T) = \sum_h \frac{N_h^2 e_h r_h (1 - r_h)}{n_h} \tag{5}$$
Viewing (5) as a function of the variables $n_h$ and minimizing (5) with respect to these variables by means of the method of Lagrange multipliers from advanced calculus, BLS obtains that (5) is minimized when the $n_h$ are proportional to
$$N_h \sqrt{e_h r_h (1 - r_h)} \tag{6}$$
As for the preferred assumption (b), to derive the optimal allocation, BLS similarly substitutes (4) into (2) obtaining
$$V(T) = \sum_h \frac{N_h^2 e_h^2 r_h (1 - r_h)}{n_h} \tag{7}$$
Minimizing (7) as BLS minimized (5), BLS obtains that (7) is minimized when the $n_h$ are proportional to
$$N_h e_h \sqrt{r_h (1 - r_h)} = E_h \sqrt{r_h (1 - r_h)} \tag{8}$$
which is the preferred allocation.
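As a rough illustration of the preferred allocation, the sketch below distributes a fixed total sample across strata in proportion to total stratum employment times the square root of r(1 - r), which is the form that minimizing the variance under assumption (b) is understood to yield. The function name, the dictionary keys, the rounding rule, and the cap at the number of frame units are assumptions made for illustration and are not taken from the BLS allocation module. Rates are expressed as cases per employee (a proportion) rather than per 100 full-time workers.

```python
import math


def allocate_sample(strata, total_sample):
    """Allocate total_sample across strata proportionally to E_h * sqrt(r_h * (1 - r_h)).

    Each stratum is a dict with hypothetical keys:
      'employment'  - total employment in the stratum
      'rate'        - TRC rate as a proportion (cases per employee)
      'frame_units' - number of establishments on the frame
    """
    sizes = [s["employment"] * math.sqrt(s["rate"] * (1 - s["rate"]))
             for s in strata]
    total_size = sum(sizes)
    allocation = []
    for stratum, size in zip(strata, sizes):
        n = round(total_sample * size / total_size)
        # Assumed practical rule: at least one unit, never more than the frame holds.
        n = max(1, min(n, stratum["frame_units"]))
        allocation.append(n)
    return allocation


# Hypothetical strata, chosen so the proportional shares are easy to verify by hand.
example_strata = [
    {"employment": 1000, "rate": 0.5, "frame_units": 400},  # size 1000 * 0.5 = 500
    {"employment": 1000, "rate": 0.1, "frame_units": 300},  # size 1000 * 0.3 = 300
    {"employment": 500, "rate": 0.2, "frame_units": 100},   # size  500 * 0.4 = 200
]
example_allocation = allocate_sample(example_strata, total_sample=100)
print(example_allocation)  # [50, 30, 20]
```

Because of rounding and the cap, the allocated sizes need not sum exactly to the requested total; a production allocation would need a rule for redistributing any remainder.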
Sample procedure. Once the sample is allocated, the specific units are selected by applying a systematic selection with equal probability independently within each sampling cell. Because the frame is stratified by employment size within each TEI before sample selection, equal probability sampling was judged more appropriate than probability proportional to size (PPS) selection. PPS selection is often applied to frames that are not stratified by size, so in this case no additional value was expected from selecting the sample by PPS.
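A minimal sketch of equal-probability systematic selection within one sampling cell follows. It assumes each unit is represented as an (employment, reporting unit number) pair, sorts on those two keys as described in Section 1, and uses a fractional sampling interval with a random start. The function name, the tuple layout, and the fractional-interval variant are illustrative assumptions, not a description of the production selection program.

```python
import random


def systematic_sample(units, n, seed=None):
    """Select n units from one sampling cell by systematic sampling.

    units: list of (employment, reporting_unit_number) tuples (assumed layout).
    Every unit has selection probability n / len(units).
    """
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units in the cell")
    # Sort by employment, then by reporting unit number.
    ordered = sorted(units)
    interval = len(ordered) / n          # fractional sampling interval (>= 1)
    rng = random.Random(seed)
    start = rng.random() * interval      # random start in [0, interval)
    last = len(ordered) - 1
    # Guard against floating-point overshoot on the final index.
    return [ordered[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical sampling cell with six establishments.
cell = [(5, 101), (3, 102), (8, 103), (3, 100), (12, 104), (7, 105)]
picked = systematic_sample(cell, 3, seed=1)
print(picked)
```

Sorting by employment before selection spreads the sample across the range of establishment sizes in the cell, which is the stated purpose of the sort.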
The survey is conducted by mail questionnaire through the BLS-Washington and Regional Offices and participating state statistical grant agencies. Respondents are able to provide responses to the survey via the internet, an Adobe fillable form, or a paper questionnaire. In a limited number of cases, data is collected by participating State statistical grant agencies or BLS Regional Office employees through telephone conversations with respondents. Starting with survey year 2016, the survey will use email both to notify respondents of their responsibility to participate and for data collection, in accordance with BLS policy on the use of email for data collection.
Estimation procedure. The survey's estimates of the number of injuries and illnesses for the population are based on the Horvitz-Thompson estimator, which is an unbiased estimator. The estimates of the incidence of injuries or illnesses per 100 full-time workers are computed using a ratio estimator. The estimates of the incidence rates are calculated as
$$\text{Incidence rate} = \frac{C}{EH} \times 200{,}000$$
where:
C = number of injuries and illnesses
EH = total hours worked by all employees during a calendar year
200,000 = base for 100 full-time equivalent workers (working 40 hours per week, 50 weeks per year).
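A short worked example of the incidence rate calculation is sketched below. The input figures are hypothetical, and the function name is illustrative; the 200,000 base is derived from 100 workers at 40 hours per week for 50 weeks.

```python
# Base: 100 full-time equivalent workers x 40 hours/week x 50 weeks/year.
BASE_HOURS = 100 * 40 * 50  # 200,000


def incidence_rate(cases, hours_worked, base=BASE_HOURS):
    """Cases per 100 full-time equivalent workers."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return cases * base / hours_worked


# Hypothetical estimate: 12 cases over 480,000 hours worked.
example_rate = incidence_rate(12, 480_000)
print(example_rate)  # 5.0
```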
The estimation system has several major components that are used to generate summary estimates. The first four components generate factors that are applied to each unit’s original weight in order to determine a final weight for the unit. These factors were developed to handle various data collection issues. The original weight that each unit is assigned at the time the sample is drawn is multiplied by each of the factors calculated by the estimation system to obtain the final weight for each establishment. The following is a synopsis of these four components.
When a unit cannot be collected as assigned, it is assigned a Reaggregation factor. For example, if XYZ Company exists on the sample with 1,000 employees but the respondent reports for only one of two locations with 500 employees each, it is treated as a reaggregation situation. The Reaggregation factor is equal to the target (or sampled) employment for the establishment divided by the reported employment for collected establishments. It is calculated for each individual establishment.
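The XYZ Company example above can be worked through directly. In this sketch the function name is illustrative; the calculation is the ratio described in the paragraph.

```python
def reaggregation_factor(target_employment, reported_employment):
    """Target (sampled) employment divided by reported employment for one establishment."""
    if reported_employment <= 0:
        raise ValueError("reported_employment must be positive")
    return target_employment / reported_employment


# XYZ Company: sampled with 1,000 employees, but data reported for one
# location with 500 employees.
xyz_factor = reaggregation_factor(1000, 500)
print(xyz_factor)  # 2.0
```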
In cases where a sampled unit is within scope of the survey but does not provide data, it is treated as a nonrespondent. Units within scope are considered viable units. This would include collected units as well as nonrespondents. The Nonresponse adjustment factor is the sum of the weighted viable employment within the sampling stratum divided by the sum of the weighted usable employment for an entire sampling stratum. The nonresponse adjustment factor is applied to each unit in a stratum.
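The nonresponse adjustment can be sketched as follows, under the reading that "usable" units are the viable units that actually provided data. The record layout, key names, and sample figures are hypothetical.

```python
def nonresponse_adjustment_factor(units):
    """Weighted viable employment divided by weighted usable employment in one stratum.

    units: list of dicts with assumed keys 'weight', 'employment',
    'viable' (in scope) and 'responded' (provided usable data).
    """
    viable = sum(u["weight"] * u["employment"] for u in units if u["viable"])
    usable = sum(u["weight"] * u["employment"]
                 for u in units if u["viable"] and u["responded"])
    if usable == 0:
        raise ValueError("no usable units in stratum")
    return viable / usable


# Hypothetical stratum: three respondents, one in-scope nonrespondent,
# and one out-of-scope unit that is excluded from both sums.
stratum = [
    {"weight": 10, "employment": 20, "viable": True, "responded": True},
    {"weight": 10, "employment": 30, "viable": True, "responded": True},
    {"weight": 10, "employment": 50, "viable": True, "responded": True},
    {"weight": 10, "employment": 25, "viable": True, "responded": False},
    {"weight": 10, "employment": 40, "viable": False, "responded": False},
]
nraf = nonresponse_adjustment_factor(stratum)
print(nraf)  # 1.25
```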
In some cases, collected data is so extreme that it stands apart from the rest of the observations. For example, suppose in a dental office (which is historically a low incidence industry for injuries and illnesses), poisonous gas gets in the ventilation system which causes several employees to miss work for several days. This is a highly unusual circumstance for that industry. This situation would be deemed an outlier for estimation purposes and handled with the outlier adjustment. If any outliers are identified and approved by the national office, the system calculates an Outlier adjustment factor so that the outlier represents only itself. In addition, the system calculates outlier adjustment factors for all other non-outlier units in the sampling stratum. This ensures that the re-assigned weight is distributed equally amongst all other units in the stratum.
Benchmarking is done in an effort to account for the time lapse between the sampling frame used for selecting the sample and the latest available frame information. Thus, a factor is computed by dividing the target employment (latest available employment) for the sampling frame by the weighted reported employment for collected units.
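A small sketch of the benchmark factor follows. It assumes the weight applied to each collected unit is the weight carried into this step; the text does not specify which prior adjustments are included, so the key names and figures are hypothetical.

```python
def benchmark_factor(target_employment, collected_units):
    """Latest available frame employment divided by weighted reported employment.

    collected_units: list of dicts with assumed keys 'weight' and
    'reported_employment'.
    """
    weighted_reported = sum(u["weight"] * u["reported_employment"]
                            for u in collected_units)
    if weighted_reported == 0:
        raise ValueError("weighted reported employment is zero")
    return target_employment / weighted_reported


# Hypothetical cell: weighted reported employment is 10,000 while the
# latest frame shows 12,000 employees.
collected = [
    {"weight": 10, "reported_employment": 500},
    {"weight": 10, "reported_employment": 450},
    {"weight": 5, "reported_employment": 100},
]
bmf = benchmark_factor(12_000, collected)
print(bmf)  # 1.2
```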
The system calculates a final weight for each unit. The final weight is a product of the original weight and all four of the factors. All estimates are the sum of the weighted (final weight) characteristic of all the units in a stratum.
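Putting the pieces together, the sketch below multiplies an original weight by the four factors and sums the weighted characteristic over units. The record layout and the numbers are hypothetical, and the factor order shown (reaggregation, nonresponse, outlier, benchmark) is only for readability since the product does not depend on order.

```python
import math


def final_weight(original_weight, factors):
    """Original sampling weight times the product of the adjustment factors."""
    return original_weight * math.prod(factors)


def weighted_total(units, characteristic):
    """Sum of final weight times the reported characteristic over all units."""
    return sum(final_weight(u["original_weight"], u["factors"]) * u[characteristic]
               for u in units)


# Hypothetical units; factors are (reaggregation, nonresponse, outlier, benchmark).
units = [
    {"original_weight": 10, "factors": (2.0, 1.25, 1.0, 1.0), "cases": 3},  # 25 * 3
    {"original_weight": 4, "factors": (1.0, 1.25, 1.0, 0.8), "cases": 5},   # 4 * 5
]
estimated_cases = weighted_total(units, "cases")
print(round(estimated_cases, 6))  # 95.0
```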
In 2010 a pilot study to measure rates of Days of Job Transfer or Restriction (DJTR) for selected industries was begun using data from the 2011 survey reference year. The first public release of the case and circumstances data for DJTR cases from this pilot occurred on April 25, 2013. BLS is analyzing the results of this test to determine the value of the information and is looking at how best to implement the collection of these data as well as days away from work cases in future survey years. Updates to this DJTR pilot study are continuing by changing the industries of interest. See the testing section below for details.
3. Statistical reliability.
Survey sampling errors.
The survey utilizes a full probability survey design that makes it possible to determine the reliability of the survey estimates. Standard errors are produced for all injury and illness counts and case and demographic data as well as for all data directly collected by the survey.
The variance estimation procedures are described in detail in the attached documents mentioned earlier:
Methods Used To Calculate the Variances of the OSHS Case and Demographic Estimates (2/22/02)
Variance Estimation Requirements for Summary Totals and Rates for the Annual Survey of Occupational Injuries and Illnesses (6/23/05)
4. Testing procedures.
The survey was first undertaken in 1972 with a sample size of approximately 650,000. Since then the BLS has made significant progress toward reducing respondent burden by employing various statistical survey design techniques; the present sample size is approximately 240,000. The BLS is continually researching methods that will reduce the respondent burden without jeopardizing the reliability of the estimates.
Responding to concerns of data users and recommendations of the National Academy of Sciences, in 1989, the BLS initiated its efforts to redesign the survey by conducting a series of pilot surveys to test alternative data collection forms and procedures. Successive phases of pilot testing continued through 1990 and 1991. Cognitive testing of that survey questionnaire with sample respondents was conducted at that time. The objective of these tests was to help develop forms and questions that respondents easily understand and can readily answer.
In survey year 2006, the SOII program conducted a one-year quality assurance (QA) study that focused primarily on the magnitude of employer error in recording data from their OSHA forms to the different types of BLS collection forms and methods. The results showed no systematic under-reporting or over-reporting by employers. There was no strong dependence between error rates and collection methods.
Beginning in survey year 2007, the QA program introduced in 2006 was extended and modified to evaluate the quality of the data collected in terms of proper collection methods with the goal of minimizing curbstoning and collector adjustments without respondent contact. If improper collection methods or procedures were uncovered, they were corrected. A byproduct of this program was that each data collector would know that any form they have processed could be selected for the program.
In 2003, the BLS introduced the Internet Data Collection Facility (IDCF) as an alternative to paper collection of data. This system has edits built in which help minimize coding errors. The system is updated annually to incorporate improvements as a result of experience from previous years.
In 2008, extensive cognitive testing was completed on the IDCF collection system. In addition to being an overall review, this testing also provided detailed analysis of the site’s usability and eye-tracking. The summary (Summary of Expert Review of SOII IDCF Web Pages) provided extensive feedback, as well as a rating system that addressed “short-term” (wording changes), “mid-term” (changes that affect the order of pages (flow), but seemed simple to execute), and “long-term” (changes with skip patterns, or associated buttons that appear to be more complex and would require more testing). The implementation of these changes went through a prioritization process that took into account the BLS staff resources available to implement them.
In 2009, extensive cognitive testing was completed on the IDCF Adobe Fillable Form. Recommendations were provided (OSMR Review of the Revised SOII Adobe Form), and efforts were made to incorporate them in a timely manner.
In 2012, extensive follow-up cognitive testing was completed on the IDCF collection system. This testing showed (Results of the SOII Edits Usability Test) a vast improvement over previous studies, and noted limited issues in three main areas:
Respondents showed difficulty in understanding what they are supposed to enter in the 'total hours worked by all employees' field, and in using the optional worksheet that accompanies this field.
Respondents can be confused and/or frustrated by the way the information about the average hours worked per employee is derived and presented on the screen.
Respondents can miss, or react negatively to, the error message that appears on the detailed “cases with days away from work” reporting page.
Currently these issues are being prioritized for future implementation based on the level of perceived need and available resources.
In 2015, an option was added to the IDCF collection system that would allow users to ‘opt-in’ to receive future communications with BLS via email. Extensive cognitive testing was performed on this additional module to ensure understanding and ease of use.
Current plans call for technical improvements to several systems to be in place by 2017. These improvements will allow contact via email with respondents who have agreed to receive correspondence via email.
Since 2008, BLS has been conducting research concerning the completeness of estimates from the SOII. This multiyear research effort provided results in 2012 which were used to guide the selection of further research.
During an examination into the causes of the high incidence of ‘unpublishable’ estimates (i.e. estimates that for various reasons were deemed to be too volatile, or in violation of confidentiality agreements), it was discovered that some sampling strata exhibit a high degree of ‘sampling inefficiency’ (i.e. items sampled not being useable for estimation for any number of reasons). In 2013, a research project began to determine if it would be feasible to ‘oversample’ these strata in a way that would minimally impact the optimal sizes produced by the Neyman allocation. This research is currently ongoing and is showing promising results (see: “Sample Allocation to Increase the Expected Number of Publishable Cells in the Survey of Occupational Injuries and Illnesses”).
The BLS also utilizes statistical quality control techniques to maintain the system's high level of reliability.
Undercount Research
The Bureau of Labor Statistics (BLS) is conducting ongoing research to investigate the completeness of the injury and illness counts from the Survey of Occupational Injuries and Illnesses (SOII). The purpose of this research is to better understand a potential undercount of occupational injuries and illnesses reported by the SOII and to investigate possible reasons behind it. Several articles and papers describing this research are available at http://www.bls.gov/iif/undercount.htm.
The BLS continues to evaluate the results of the undercount research completed. These efforts include evaluating reporting practices employed by establishments and testing the feasibility of collection of injury and illness data directly from workers.
Employer reporting practices are being investigated by conducting a follow-back study of a subsample of respondents to the 2013 SOII. The results of this study should be released to the public within the next year.
The feasibility of collecting injury and illness data directly from workers is being evaluated through an incumbent survey. BLS plans to conduct a pilot test of a worker/incumbent survey in 2016-2017. The test will be a large-scale, nationally representative household pilot survey that will allow BLS to test the collection of information over one calendar year and also to produce broad industry and occupation estimates comparable to the SOII. These tests will continue BLS research into ways to improve completeness of injury and illness measures. A nonsubstantive change with further details will be submitted prior to the start of this test.
Computer Assisted Coding
BLS is constantly looking for ways to upgrade data collection that will minimize the impact of human error. Because much of the occupational data are provided in narrative form, BLS and its state partners must manually translate these narratives into codes. While BLS has incrementally developed rules for identifying coding errors, consistency remains a concern. In 2012, BLS began researching the concept of using computer learning algorithms to “autocode” free-form written case narratives from survey respondents. The initial results proved promising and indicated that computer-assisted coding would be feasible.
Currently, BLS is using the research output as part of the annual review of the codes state coders have assigned to occupation and case circumstances for more than a quarter million nonfatal injuries and illnesses. BLS will continue to develop and evaluate computer-assisted coding with the twin goals of improving consistency and freeing personnel for more complex assignments where staff expertise is critically needed.
For the 2014 SOII, BLS began automatically assigning occupation codes. BLS found that it could successfully assign codes automatically to about one-quarter of 2014 SOII cases. With the 2015 SOII, autocoding was expanded to include nature of injury or illness and part of body affected. With this expansion, SOII anticipates autocoding about 500,000 codes. A small portion of the autocoded values will be withheld from the coders, and those cases will be manually coded. The manually assigned codes will be compared to the autocoder-assigned values for quality assurance measurement purposes.
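The quality assurance comparison described above can be sketched as a simple agreement rate. The following is a minimal illustration under stated assumptions, not the BLS procedure: the function names, the withholding fraction, and the use of a simple random draw to select withheld cases are all hypothetical.

```python
import random

def withhold_for_qa(case_ids, fraction, seed=0):
    """Select a random subset of autocoded cases to be coded manually.

    The simple random draw and the fraction are assumptions made for
    illustration; the source says only that a small portion is withheld.
    """
    rng = random.Random(seed)
    ordered = sorted(case_ids)
    sample_size = max(1, round(len(ordered) * fraction))
    return set(rng.sample(ordered, sample_size))

def agreement_rate(auto_codes, manual_codes):
    """Share of cases where the manual code matches the autocoder's code.

    Both arguments map a case identifier to an assigned code. Only cases
    present in both mappings are compared.
    """
    shared = auto_codes.keys() & manual_codes.keys()
    if not shared:
        raise ValueError("no cases were coded by both methods")
    matches = sum(auto_codes[c] == manual_codes[c] for c in shared)
    return matches / len(shared)

# Illustrative codes only; these are not real SOII classifications.
auto = {1: "A", 2: "B", 3: "C", 4: "D"}
manual = {1: "A", 2: "B", 3: "X", 4: "D"}
rate = agreement_rate(auto, manual)  # 3 of 4 codes agree
```

A single agreement rate treats every disagreement alike. A fuller review would likely break results out by code type (occupation, nature of injury or illness, part of body affected), since accuracy may differ among them.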
Days of Job Transfer or Restriction Testing
In the 2011 survey year, BLS began testing the collection of case and demographic data for injury and illness cases that require only days of job transfer or restriction. The purpose of this ongoing pilot study is to evaluate collection of these cases and to learn more about occupational injuries and illnesses that resulted in days of job transfer or work restriction. The first three years of collection were successful and demonstrated that these data could be collected and processed accurately for a limited set of industries. The most recent results from the DJTR study are available at http://www.bls.gov/iif/days-of-job-transfer-or-restriction.htm.
BLS is analyzing the results of this test to determine the value of the resulting information and is looking at how best to implement the collection of these data as well as days away from work cases in future survey years. BLS regards the collection of these cases with only job transfer or restriction as significant in its coverage of the American workforce.
To retain the level of case and demographic characteristics estimates published currently for cases with days away from work and publish similar estimates for cases with job transfer or restriction, a greater number of cases will need to be collected from employers. BLS has maintained the subsampling process for employers to limit to 15 the number of cases each employer needs to submit. BLS is continuing to examine this issue to determine an optimal number of cases to collect for each type of case while limiting the burden on the employer and the burden on the participating State agencies.
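The effect of capping the number of cases an employer submits can be illustrated with a short sketch. Only the limit of 15 comes from the text above. The simple random selection and the weight adjustment (reported cases divided by submitted cases) are assumptions made for illustration and should not be read as the SOII subsampling procedure.

```python
import random

MAX_CASES_PER_EMPLOYER = 15  # limit stated in the text

def subsample_cases(cases, max_cases=MAX_CASES_PER_EMPLOYER, seed=0):
    """Return (selected_cases, weight_adjustment) for one employer.

    If the employer has no more than max_cases cases, all are kept and
    the adjustment is 1.0. Otherwise max_cases are drawn at random
    (an assumed selection rule) and each selected case represents
    len(cases) / max_cases cases.
    """
    cases = list(cases)
    if len(cases) <= max_cases:
        return cases, 1.0
    rng = random.Random(seed)
    selected = rng.sample(cases, max_cases)
    return selected, len(cases) / max_cases

# Hypothetical employer with 45 recordable cases.
selected, adjustment = subsample_cases(range(45))
```

Under these assumptions, an employer with 45 cases submits 15, and each submitted case carries an adjustment of 3.0. Collecting two case types within the same limit would leave fewer cases of each type per employer, which is the tradeoff between estimate detail and respondent burden that the paragraph above describes.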
OSHA Electronic Recordkeeping
The Occupational Safety and Health Administration (OSHA) requires large establishments in manufacturing and in selected high-risk industries outside of manufacturing to record and retain data similar to those collected by the BLS injury and illness survey. OSHA requires establishment-specific data to target interventions such as inspections, consultations, and technical assistance.
OSHA recently amended its recordkeeping regulations to add requirements for the electronic submission of certain injury and illness information employers are already required to keep under OSHA’s regulations. The amended rule does not add to or change any employer’s obligation to complete and retain injury and illness records under OSHA’s regulations for recording and reporting occupational injuries and illnesses. The amended rule modifies employers’ obligations to transmit information from these records to OSHA or OSHA’s designee. The amended rule does not change any employer’s obligation to complete the SOII. BLS will form a working group with OSHA to assess data quality, including timeliness, accuracy, and public use of the collected data, and to align the collection with the SOII.
5. Statistical responsibility.
The Chief of the Statistical Methods Group, Gwyn Ferguson, is responsible for the sample design, which includes selection and estimation. The sample design of the survey conforms to professional statistical standards and to OMB Circular No. A46.
	
	