QUALITY OF THE DMI FILE AS A BUSINESS SAMPLING FRAME
David A. Marker and W. Sherman Edwards, Westat, Inc.
David Marker, Westat, Inc., 1650 Research Blvd., Rockville, MD 20850
Key Words: Establishment surveys, enterprise surveys

1. Introduction

The U.S. Bureau of the Census and the Bureau of Labor Statistics (BLS) each maintain business registers for use as sampling frames for their surveys. Due to confidentiality and data restrictions on Federal business registers, other government agencies needing to survey businesses are not able to access the registers maintained by the Census and BLS. The most complete privately-maintained register is the Duns Market Identifiers (DMI) register maintained by Dun & Bradstreet (D&B).

The Small Business Administration (SBA) was an early user of the DMI register as a sampling frame. While they determined that this register provided the best frame available to them, they did report a number of difficulties with frame coverage (Phillips, 1993). Among the difficulties were: not all firms reported every branch to D&B; total firm employment did not necessarily equal the sum of employment at all of the branches; there were lags in recording company births and new branches; and there were delays in cleaning out firms and establishments that were no longer in business, sometimes as long as four years in the late 1980s.

Westat has used the DMI register in the last five years to conduct two large-scale surveys of employer-provided health insurance. This has provided us with a great amount of detailed knowledge concerning the strengths and weaknesses of using the DMI register as a sampling frame. We first used the register as a sampling frame for surveys in eight states conducted for a private foundation. We then used the DMI register for the National Employer Health Insurance Survey (NEHIS) conducted for the National Center for Health Statistics, Health Care Financing Administration, and Agency for Health Care Policy and Research. NEHIS involved the selection of over 100,000 private-sector establishments from 750 strata. This paper summarizes our findings on the suitability of the DMI register to serve as a sampling frame for future government surveys.

2. Description of the DMI Register

2.1 Frame Basics

The DMI register is compiled by D&B through a review of public records from bankruptcy and district courts, secretaries of state, departments of motor vehicles, and unemployment insurance agencies, as well as newspapers, yellow pages, credit reports, and records of businesses dealing with governments. All establishments are then requested (by telephone or in person) to update their information (including name, address, and size) annually. As of June 1997 there are approximately 10.8 million establishments on the DMI register. When the DMI register was used as a sampling frame for the NEHIS at the end of 1993, it contained 10.1 million establishments.

The DMI register contains extremely detailed information on the type of business conducted at each establishment. Each establishment has associated with it up to six four-digit Standard Industrial Classification (SIC) codes. Each of these codes then has up to four four-digit extensions that provide even greater detail than the standard BLS coding scheme. For example, this allows one to separate elementary and secondary schools (SIC code 8211) into public, Catholic, or other private; and elementary, junior, or high school. This provides a potential for identifying 24 different types of business at each establishment.

No establishments are purposely excluded from the DMI register. Both the public sector and private sector are on the frame, as are some self-employed.
2.2 Measures of Size
The DMI register contains a large number of
variables that might be of use as measures of size for
stratifying and/or selecting samples with unequal
probabilities. These variables include the number of
employees at the establishment, the number that report
to that headquarters, and the total in the firm; the
percentage employee growth in the total firm over the
last three or five years; and annual sales volume and
percentage sales growth over the last three or five
years. One limitation is that the number of employees
at the establishment is missing approximately 13
percent of the time. The authors did not use the sales
data and therefore cannot attest to their response rates.
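
Because the establishment employment field is missing for roughly 13 percent of records, any size stratification built on it needs an explicit class for unknown size. The following is a minimal sketch in Python, assuming the establishment size classes that appear in the tables of this paper. The function name size_stratum and the use of None for a missing value are illustrative assumptions, not part of the DMI file layout.

```python
from typing import Optional

# Establishment size classes as they appear in Tables 3, 4 and 6.
# Each entry is (label, inclusive lower bound, inclusive upper bound or None).
SIZE_CLASSES = [
    ("1-5", 1, 5),
    ("6-24", 6, 24),
    ("25-49", 25, 49),
    ("50-249", 50, 249),
    ("250-999", 250, 999),
    ("1,000+", 1000, None),
]


def size_stratum(employees: Optional[int]) -> str:
    """Return the size-class label for a frame record.

    A missing or non-positive employee count goes to "Unknown" so that
    such records can be allocated to a stratum by a separate rule.
    """
    if employees is None or employees < 1:
        return "Unknown"
    for label, low, high in SIZE_CLASSES:
        if employees >= low and (high is None or employees <= high):
            return label
    return "Unknown"
```

Keeping "Unknown" as its own label, rather than imputing a size at this step, leaves the later allocation decision visible in the design.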
2.3 Definitions

DMI defines an establishment as one location of one firm. Multiple locations of a firm are theoretically reported separately, but are defined by respondents. Multiple establishments (of different firms) can be at the same location. Field employees (e.g., sales reps, interviewers) are treated as branch offices if they have an office, even if it is in their home.

All intermediate headquarters and ultimate headquarters are identified by DMI, as are which branches and lower-level headquarters report to which higher-level headquarters. There are procedures for identifying these connections, although they must be specially requested.

DMI's instructions are that owners/proprietors are to be included in the number of employees, as are part-time employees. A holding company whose only employees are shared with another company has its officers listed as its employees. These instructions are different from those used by BLS, for example, where owners/proprietors are not included in the count of employees.

2.4 Other Available Information

DMI attempts to list the names of the business principal, all corporate officers or partners, and any non-officer individuals responsible for major business segments (e.g., sales & marketing, data processing). They provide secondary names for many establishments. They also have indicators for when establishments are foreign owned, SBA small businesses, female owned, or minority owned, as well as an identifier of each establishment's census tract.

3. Survey Findings

3.1 Coverage

Estimates of the number of private establishments vary considerably due primarily to differences in definitions, including such factors as whether the self-employed are included and the identification of the establishment as a location or as an identifiable function, such as management of the payroll. After deleting known government primary SIC codes from the October 1993 DMI register, the file contained 9,900,000 "location" records (while DMI provides a definition of "establishment" to all respondents, the frame is based on the list of establishments identified by each respondent). Some scattered government activities still remained in the DMI file, but their numbers were fairly small and could be further reduced by identifying government codes among secondary SIC codes.

The DMI count compares with 6,176,000 establishments included in the 1990 Census Bureau estimates (Statistical Abstract, 1993, Table 859). This Census Bureau estimate excludes the self-employed and farms. Both of these classes are covered by DMI, although the extent of their coverage is uncertain. Census also includes establishments in business any time during the calendar year; the impact of this definition is uncertain in validating a comparison with DMI. Because of the differences in definitions (e.g., differences in time period, as of a specific date versus any time during the year, temporary hiring of employees by otherwise self-employed with no employees, etc.), one cannot draw conclusions about the coverage of establishments (or locations) by the DMI compared to the Census coverage.

One facet of coverage is the extent to which new establishments are covered. This is a problem with government statistics as well as the DMI. Westat asked D&B to provide the age distribution of establishments in the eight states included in the private foundation study (Table 1). "Age" reflects either the year the firm (not necessarily the establishment) started or the year in which there was a reorganization of the firm.

Table 1. Age of establishments on the DMI frame for eight states

Year started    Number of locations    Percent
0-1988                      445,888      83.64
1989                         23,031       4.32
1990                         21,792       4.09
1991                         20,526       3.85
1992                         16,300       3.06
1993                          5,569       1.04
Total                       533,106     100.00

The D&B report is dated October 27, 1993, and it may be presumed that several months typically elapse before a new establishment (location) is entered into the file. As a result, assuming that the number of new establishments is approximately constant, the 1993 data may represent only about one-third of the year-to-date's new establishments. It is also likely that some 1992 new locations may not have been entered at the time the tabulation was prepared. One may also presume that the records for (say) 1990, 1991, and 1992 include numerous new small businesses that failed or were consolidated or dissolved and whose records were still in the file in 1993. Taking these factors into account, the estimated number of new establishments not in the file as of October 1993 is approximately 1 or 2 percent of the total file and may be approximately one-half of the businesses less than 1 year old as of that time.

Definitions of employees also vary, but not as much as definitions of establishments. BLS reported approximately 89,858,000 private sector employees in nonfarm establishments in 1992 (May issue of Employment and Earnings, reported in the Statistical Abstract, 1993, Table 661). BLS uses both a household survey and a survey of establishments to report on employment, and the latter survey is the basis for the data reported above. Number of employees was estimated from the DMI frame by summing (across a sample of 100,000 establishments selected for the NEHIS) the products of the sampling weight and the average number of employees in each employment size stratum for each state. This estimation process yielded approximately 103,486,000 private sector employees. The differences between this estimate and the estimates by the Census Bureau and BLS are due to DMI's inclusion of some farms and self-employed persons, differences in the time periods, and the fact that a nontrivial part of the DMI file covers out-of-business locations whose records have not been purged from the frame. Final NEHIS survey responses were post-stratified to adjusted BLS unpublished totals that attempted to take account of those structurally not included in BLS publications, in particular the more than 2,000,000 employees not covered by unemployment insurance. The post-stratified number of employees was approximately 98,000,000.

Table 2 shows, by state, the comparison of the estimates of number of employees from three data sources. It should be noted that if the out-of-scope locations are in the neighborhood of 18 percent, the NEHIS sample is close to the other two estimates. Comparisons of the NEHIS estimates with the two government sources are shown in the second and third to last columns, and a comparison of the two government sources is given in the final column. The unweighted standard deviations of these last three columns are 7.1 percent, 8.1 percent, and 6.2 percent, respectively. Thus, the BLS and Census state estimates are almost as variable from each other as the NEHIS estimates are from either of them. Column totals in Table 2 differ from the sum of state estimates because of differences in the estimation procedures for the states and the national totals.

The coverage of smaller and newer establishments on DMI's register (or on either government register) is not as complete as it is for other establishments. However, on the basis of the above analysis, there is little reason to believe that there is serious undercoverage in the general DMI register other than self-employed with no employees (SENE). It was decided that an alternative frame was needed for the SENEs.

At the request of the government two potential register supplements were examined further. First, D&B has a separate file containing 2.6 million incomplete records. It was thought that this might contain records for new establishments that had not yet been introduced into the DMI frame. Second, the accuracy of DMI's identification of the self-employed with no employees was examined. These should be identified as having one employee on DMI (the instructions for DMI are to include the owner in the number of employees) and not be part of a multi-establishment firm. Samples of 304 and 198 records were selected from each of these types, respectively, and phone calls were made to determine the status of the establishments.
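
The frame-based employment estimate described earlier in this section, which underlies the sample-design column of Table 2, is a weighted sum: each sampled establishment contributes its sampling weight times the average employment of its state-by-size stratum. The sketch below illustrates that arithmetic in Python. The function name, the record layout, and all numeric values are hypothetical and are not taken from the NEHIS files.

```python
from typing import Dict, Iterable, Tuple

# (state, size stratum) -> average employees per establishment in that cell.
StratumMeans = Dict[Tuple[str, str], float]


def estimate_employment(
    sample: Iterable[Tuple[str, str, float]],
    stratum_means: StratumMeans,
) -> float:
    """Sum of sampling weight x average stratum employment over the sample.

    Each sample record is (state, size_stratum, sampling_weight).
    """
    total = 0.0
    for state, stratum, weight in sample:
        total += weight * stratum_means[(state, stratum)]
    return total


# Made-up illustrative values, not NEHIS data.
means = {("MD", "1-5"): 3.0, ("MD", "6-24"): 12.0}
sample = [("MD", "1-5", 100.0), ("MD", "1-5", 100.0), ("MD", "6-24", 40.0)]
print(estimate_employment(sample, means))
```

Because the stratum average is taken from the frame rather than from survey responses, an estimate of this kind inherits any out-of-business records still on the frame, which is consistent with the overestimate relative to BLS and Census discussed above.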
Table 2. NEHIS frame compared with BLS and Census frames (employment in thousands)

                                BLS                     Estimates
           BLS        All       excluding   Estimates   from       Sample   Sample     BLS
State      Total*     govern-   govern-     from        sample     minus    minus      minus
                      ment*     ment*       Census**    design     BLS(%)   Census(%)  Census(%)
AK            247        73        174         158         166      -4.6      5.1       10.1
AL          1,673       338      1,335       1,341       1,530      14.6     14.1       -0.4
AR            963       168        795         751         890      11.9     18.5        5.9
AZ          1,520       278      1,242       1,236       1,436      15.6     16.2        0.5
CA         12,140     2,091     10,049      11,317      11,825      17.7      4.5      -11.2
CO          1,593       291      1,302       1,248       1,505      15.6     20.6        4.3
CT          1,522       205      1,317       1,482       1,675      27.2     13.0      -11.1
DC            677       287        390         427         518      32.8     21.3       -8.7
DE            344        49        295         311         364      23.4     17.0       -5.1
FL          5,339       869      4,470       4,608       4,869       8.9      5.7       -3.0
GA          2,982       535      2,447       2,499       2,609       6.6      4.4       -2.1
HI            541       111        430         433         429      -0.2     -0.9       -0.7
IA          1,251       222      1,029       1,008       1,258      22.3     24.8        2.1
ID            416        88        328         300         384      17.1     28.0        9.3
IL          5,205       768      4,437       4,647       5,274      18.9     13.5       -4.5
IN          2,538       387      2,151       2,150       2,332       8.4      8.5        0.0
KS          1,115       226        889         894       1,040      17.0     16.3       -0.6
KY          1,511       274      1,237       1,186       1,289       4.2      8.7        4.3
LA          1,625       338      1,287       1,271       1,449      12.6     14.0        1.3
MA          2,778       378      2,400       2,773       2,903      21.0      4.7      -13.5
MD          2,079       414      1,665       1,811       1,853      11.3      2.3       -8.1
ME            512        96        416         424         501      20.4     18.2       -1.9
MI          3,917       641      3,276       3,411       3,865      18.0     13.3       -4.0
MN          2,186       347      1,839       1,832       1,999       8.7      9.1        0.4
MO          2,320       372      1,948       2,013       2,285      17.3     13.5       -3.2
MS            962       209        753         725         856      13.7     18.1        3.9
MT            317        74        243         222         283      16.5     27.5        9.5
NC          3,133       510      2,623       2,675       2,826       7.7      5.6       -1.9
ND            277        67        210         196         242      15.2     23.5        7.1
NE            747       148        599         587         693      15.7     18.1        2.0
NH            485        72        413         441         469      13.6      6.3       -6.3
NJ          3,441       569      2,872       3,220       3,618      26.0     12.4      -10.8
NM            598       156        442         418         560      26.7     34.0        5.7
NV            641        86        555         537         609       9.7     13.4        3.4
NY          7,728     1,428      6,300       7,074       7,811      24.0     10.4      -10.9
OH          4,842       734      4,108       4,246       4,515       9.9      6.3       -3.3
OK          1,210       270        940         941       1,185      26.1     25.9       -0.1
OR          1,271       231      1,040       1,017       1,188      14.2     16.8        2.3
PA          5,071       698      4,373       4,599       4,868      11.3      5.8       -4.9
RI            421        61        360         393         424      17.8      7.9       -8.4
SC          1,529       292      1,237       1,266       1,337       8.1      5.6       -2.3
SD            307        65        242         215         273      12.8     27.0       12.6
TN          2,232       355      1,877       1,869       2,061       9.8     10.3        0.4
TX          7,271     1,334      5,937       5,865       7,182      21.0     22.5        1.2
UT            768       157        611         571         690      12.9     20.8        7.0
VA          2,840       589      2,251       2,321       2,507      11.4      8.0       -3.0
VT            249        43        206         215         235      14.1      9.3       -4.2
WA          2,216       423      1,793       1,761       1,875       4.6      6.5        1.8
WI          2,349       356      1,993       1,949       2,193      10.0     12.5        2.3
WV            639       132        507         483         566      11.6     17.2        5.0
WY            205        57        148         132         172      16.2     30.3       12.1
Totals†   108,743    18,962     89,858      93,476     103,486      15.2     10.7       -3.9

* U.S. Bureau of Labor Statistics, Bulletin 2320; and Employment and Earnings, monthly, May issue; Statistical Abstract, 1993.
** U.S. Bureau of the Census, employees on board March 12, 1990.
† States do not add to totals due to differences in methods of estimation.
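
The percent-difference columns of Table 2 follow directly from the employment columns. The sketch below, in Python, recomputes them for three rows of the table and then takes an unweighted standard deviation across states. It assumes that "Sample minus BLS" is measured against BLS employment excluding government, and it uses the sample (n - 1) standard deviation; the paper does not state which divisor was used, so that choice is an assumption.

```python
from statistics import stdev


def pct_diff(a: float, b: float) -> float:
    """Percent by which a exceeds b, i.e. 100 * (a - b) / b."""
    return 100.0 * (a - b) / b


# state: (BLS excluding government, Census, sample design), in thousands,
# copied from three rows of Table 2.
rows = {
    "AK": (174, 158, 166),
    "AL": (1335, 1341, 1530),
    "AR": (795, 751, 890),
}

sample_vs_bls = {s: pct_diff(smp, bls) for s, (bls, cen, smp) in rows.items()}
sample_vs_census = {s: pct_diff(smp, cen) for s, (bls, cen, smp) in rows.items()}
bls_vs_census = {s: pct_diff(bls, cen) for s, (bls, cen, smp) in rows.items()}

# Unweighted spread across states: every state counts equally,
# regardless of its employment.
spread = stdev(list(sample_vs_bls.values()))
```

Running the same calculation over all 51 rows would give the state-level spread for each comparison; only three rows are used here to keep the example short.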
Of the 304 incomplete records, only 64 were identified as current business establishments. A similar number (68) were clearly out-of-scope, and the remaining were not successfully contacted despite repeated attempts. Forty-three of the current business establishments stated that they had more than one employee. These were recontacted to try to determine whether this frame contained many establishments that had not yet made it into the DMI frame (since the 2.6 million records do not have DUNS numbers they cannot be directly matched against the DMI frame).

Of the ten establishments that had stated that they were part of multi-establishment firms, six participated in the call back. These included locations of well-known national corporations. All six had existed for at least five years, offered health insurance, and had from 5 to 300 employees.

Twenty-seven of the 33 single location firms participated. One had only been in business six months and two others between one and two years. The majority (at least 19) had been in business more than five years. Only two had more than ten employees and eight offered health insurance.

After discussions with the government it was agreed that this 2.6 million record frame did not contain a high number of new establishments. Most of these records that did correspond to establishments may also be found on the regular DMI frame. Thus it was decided not to include any sample from the 2.6 million incomplete D&B record frame in the NEHIS.

Sixty percent (120) of the sample of records on DMI reporting one employee responded that they were indeed establishments, but only 74 of the 198 were self-employed with no employees. The SENE population of inference was therefore not covered by the sample from the DMI frame. Sampled DMI cases determined to be SENEs were considered out-of-scope for the NEHIS. It was estimated that between one-fourth and one-third of the 1.2 million DMI establishments with one employee on the frame are likely to actually have more than one employee at the time of the survey (in addition to the 46 such establishments that were identified out of the 198, there were 38 nonrespondents, some of whom also are likely to have more than one employee). Therefore, it was decided to include a sample of such establishments in NEHIS. If they responded that they were either part of a multi-establishment firm or had more than one employee they were eligible for the survey.

Table 3 shows the national distribution of private sector establishments on the DMI frame by firm and establishment sizes.

Table 3. DMI sample frame by firm and establishment size

                                              Establishment size
Firm size     Unknown   1 no other        1-5       6-24     25-49    50-249   250-999    1000+       Total
1-49        1,197,959    1,105,384  4,884,932  1,539,707   210,621         0         0        0   8,938,603
50-999         26,094           22    105,387    147,541    61,946   190,392    16,991        0     548,373
1000+          29,504           29     83,353    147,134    51,041    77,550    18,825    7,478     414,914
Total       1,253,557    1,105,435  5,073,672  1,834,382   323,608   267,942    35,816    7,478   9,901,890

An abstract file should be purchased from D&B to improve the accuracy of the resulting sample. The abstract file contained size, corporate structure, and other information for every establishment on the DMI frame but did not include establishment names and street addresses. Using the abstract file, Westat was able to examine alternative designs for producing a sample that would result in more accurate estimates than would have been possible under the original plan. This approach offers the advantages outlined below.

• It facilitated the definition of "firm."

• It provided a fixed reference frame that remained constant during the survey, thus facilitating comparison between frame and sample results.

• It facilitated the comparison of sample allocations and the selection of more nearly optimal designs than could have been achieved otherwise.

• It permitted the examination of sample sizes per detailed stratum and facilitated the collapsing of detailed strata to increase the likelihood of obtaining sufficient complete interviews for variance computations.

• It allowed for improved quality control over the selection of the sample of establishments, because the selection was performed at Westat rather than by D&B.

3.2 In-scope and Eligibility

One weakness of the DMI register is that it contains a lot of listings that are not current establishments. Such listings can be as frequent as 40 percent for very small establishments of small firms (see first few lines of Table 4). Many of these represent temporary business locations, e.g., construction offices, that are no longer used. The effect of this when using the DMI register as a sampling frame is that it increases screening costs, especially for surveys that emphasize very small establishments. It does not, however, introduce any biases into the estimators.

Table 5 shows similar results by establishment size from the earlier survey conducted in eight states. That survey did not take firm size into consideration. Establishments with only one employee were excluded and those of unknown size were sampled with the 5-9 size class. The eligibility rules were also somewhat different from those used for the NEHIS.

A separate issue is the accuracy of the DMI register with respect to the number of employees in establishments. These estimates were used to create one of the principal stratifications for sample allocation and selection for both health insurance studies. Table 6 compares the classification of number of employees in the DMI abstract frame with the same classification of number of employees found during interviewing.

The frequencies are heavily loaded along the principal diagonal (boxed), which indicates that, for the most part, the DMI estimates of the number of employees were reasonably accurate. In some cases, however, the estimates of size were quite different from what was reported during data collection. Weight trimming in some cases can avoid the domination of the estimates by a few establishments with extremely large weights. Such cases are infrequent enough that they do not destroy the effectiveness of the DMI estimates for stratification purposes. For the NEHIS the "unknowns" were classified into the 1-5 employee size stratum for firms with fewer than 50 employees and into the 6-24 size stratum for larger firms. This proved to be the correct classification for capturing the modal group, as shown in the table. Unfortunately, the "unknowns" ranged widely in size and are as variable as the rest of the data. In summary, it appears that the DMI data were effective in classifying the frame into size strata.
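
One way to put a number on how heavily the frequencies load on the diagonal is the share of sampled establishments whose size class found during data collection matches their DMI size stratum. The Python sketch below does this for the "All" firm-size rows of Table 6, restricted to the five DMI strata from 1-5 through 250-999. This share is an illustrative summary computed here from the table counts; it is not a statistic reported by the authors, and the function name diagonal_share is hypothetical.

```python
# Size classes found during data collection, in Table 6 column order.
CLASSES = ["1-5", "6-24", "25-49", "50-249", "250-999", "1,000+"]

# "All" firm-size rows of Table 6, keyed by DMI size stratum.
# Each list holds the counts by size class found, in CLASSES order.
ALL_ROWS = {
    "1-5": [7692, 1732, 111, 100, 30, 6],
    "6-24": [1776, 6534, 516, 176, 21, 10],
    "25-49": [101, 618, 1591, 426, 30, 12],
    "50-249": [157, 376, 965, 5548, 267, 41],
    "250-999": [26, 36, 39, 422, 1638, 97],
}


def diagonal_share(rows, classes):
    """Fraction of records whose found size class equals the DMI stratum."""
    on_diagonal = sum(counts[classes.index(stratum)]
                      for stratum, counts in rows.items())
    total = sum(sum(counts) for counts in rows.values())
    return on_diagonal / total


share = diagonal_share(ALL_ROWS, CLASSES)
print(f"{share:.1%} of these records stayed in their DMI size class")
```

A row-by-row version of the same ratio would show which strata are most prone to misclassification, which is the information needed when deciding where weight trimming might matter.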
Table 4. In-scope and eligibility rates from NEHIS

                                   Located by     In scope                     Eligibility rate
Firm       Establishment           telephone      rate (is in    Screener      (private sector
size       size                    interviewers   business)      response rate with employees)
<50        unknown                    65.0%          75.0%          82.0%          57.0%
<50        1 no other location        80.0%          84.0%          81.0%          22.0%
<50        1-5                        80.0%          86.0%          80.0%          66.0%
<50        6-24                       92.0%          94.0%          80.0%          91.0%
<50        25-49                      96.0%          96.0%          79.0%          89.0%
50-999     unknown                    98.0%          90.0%          80.0%          86.0%
50-999     1-5                        97.0%          92.0%          80.0%          89.0%
50-999     6-24                       98.0%          95.0%          76.0%          94.0%
50-999     25-49                      99.0%          97.0%          80.0%          84.0%
50-999     50-249                     98.0%          96.0%          79.0%          88.0%
50-999     250-999                    99.0%          97.0%          78.0%          90.0%
1,000+     unknown                   100.0%          90.0%          70.0%          79.0%
1,000+     1-5                       100.0%          90.0%          77.0%          89.0%
1,000+     6-24                      100.0%          93.0%          72.0%          92.0%
1,000+     25-49                     100.0%          94.0%          72.0%          86.0%
1,000+     50-249                    100.0%          96.0%          72.0%          84.0%
1,000+     250-999                   100.0%          96.0%          75.0%          85.0%
1,000+     1,000+                    100.0%          95.0%          76.0%          83.0%
Total                                 88.0%          90.8%          78.1%          76.2%

Table 5. In-scope and eligibility rates from earlier survey

                  Located by                          Eligibility rate
Establishment     telephone       In scope rate       (private sector
size              interviewers    (is in business)    with employees)
2-4                  78.6%            95.4%               72.8%
5-9                  88.2%            97.6%               89.5%
10-24                93.1%            98.1%               94.7%
25+                  96.5%            97.8%               90.5%
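
The screening burden implied by rates like those in Table 4 can be approximated by chaining them. The Python sketch below assumes that each rate is conditional on the previous stage (located, then in scope, then responding to the screener, then eligible), which the paper does not state explicitly; under that assumption the product is the expected fraction of sampled records that end up as screened eligible establishments. The function names are hypothetical.

```python
def screening_yield(located: float, in_scope: float,
                    response: float, eligible: float) -> float:
    """Expected eligible screened cases per sampled frame record.

    All arguments are proportions between 0 and 1, each assumed to be
    conditional on passing the previous stage.
    """
    return located * in_scope * response * eligible


def records_per_eligible(located: float, in_scope: float,
                         response: float, eligible: float) -> float:
    """Sampled frame records needed, on average, per eligible screened case."""
    return 1.0 / screening_yield(located, in_scope, response, eligible)


# Rates taken from the Total row and the "<50, 1 no other location" row
# of Table 4, converted from percentages to proportions.
overall = screening_yield(0.88, 0.908, 0.781, 0.762)
one_person = screening_yield(0.80, 0.84, 0.81, 0.22)
print(round(overall, 3), round(one_person, 3))
```

The contrast between the two rows illustrates the cost point made in Section 3.2: the extra screening falls mainly on the smallest establishments, while the estimators themselves are not biased by it.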
Table 6. Effectiveness of the DMI employment data for stratification

Firm size   DMI size               Employee size class found during data collection
stratum     stratum           1-5     6-24    25-49   50-249  250-999   1,000+     Total
<50         Unknown           943      548      105       84       23       11     1,714
<50         1, No other       735       75        8        2        2        0       822
<50         1-5             6,923    1,470       75       50       19        3     8,540
<50         6-24            1,471    4,667      281       62        2        1     6,484
<50         25-49              60      329      803      138                       1,336
50-999      Unknown            50       56       29       34        5        0       174
50-999      1-5               387      141       20       28        5        2       583
50-999      6-24              122      902      113       68        9        3     1,217
50-999      25-49              18      150      418      171                         769
50-999      50-249            130      289      708    3,861      146        9     5,143
50-999      250-999            16       22       23      178      716       30       985
1,000+      Unknown            32       61       29       52       17       14       205
1,000+      1-5               382      121       16       22        6        1       548
1,000+      6-24              183      965      122       46       10        6     1,332
1,000+      25-49              23      139      370      117       16        8       673
1,000+      50-249             27       87      257    1,687      121       32     2,211
1,000+      250-999            10       14       16      244      922       67     1,273
1,000+      1,000+              4       14        3       38      150      698       907
All         Unknown         1,025      665      163      170       45       25     2,093
All         1, No other       735       75        8        2        2        0       822
All         1-5             7,692    1,732      111      100       30        6     9,671
All         6-24            1,776    6,534      516      176       21       10     9,033
All         25-49             101      618    1,591      426       30       12     2,778
All         50-249            157      376      965    5,548      267       41     7,354
All         250-999            26       36       39      422    1,638       97     2,258
All         1,000+              4       14        3       38      150      698       907
Totals                     11,516   10,050    3,396    6,882    2,183      889    34,916

3.3 Linkages and Firm Size Comparison Between Frame and Survey
If explicitly requested, DMI can provide what they refer to as DIAS and hierarchy codes with their sample abstract file. These codes can be used to identify the establishment structure within firms. The DIAS code provides a sort order that groups together all establishments in a firm, from ultimate headquarters down to each branch. The hierarchy code then identifies how many levels of headquarters are between the establishment and its ultimate firm headquarters. Combining this information with other DMI information, such as the field that reports the DUNS number of the headquarters to which each establishment reports, allows users to understand the structure of the entire firm. It can also allow users to subdivide firms to suit their analytic needs, for example to separate General Motors' major subsidiaries from its automotive core.

Based upon both the NEHIS pre-screening, in which alphabetic matches with DMI-based multi-establishment firms were checked for inclusion in corporate families, and on comparison of survey responses and frame linkage information of whether establishments were part of multi-establishment firms or not, the corporate linkage information in the DMI file appears to be reasonably reliable. Only four percent of alphabetic matches (that were not otherwise identified by DMI) were found to be actually in the corporate structure with which they were matched, and 13.8 percent of establishments were classified differently as to inclusion in a larger firm between the frame and the survey. Particularly considering the difference in time between the frame data and the survey data and the likelihood of some response error in each source, the Kappa value of 0.71 indicates very good agreement between the DMI frame and respondent reports of whether or not an establishment was part of a multi-establishment firm.

To summarize, very few establishments that DMI says are unattached responded that they were really part of a larger firm. The reverse is hard to measure since in large firms with subfirms it is not always clear to respondents which "firm" they belong to. Thus, it appears that the DMI frame may be used with some confidence in drawing samples for firm-level estimates.

The agreement between survey and frame information on firm size was more problematic. While more than 50 percent of establishments in the survey sample had relative agreement (reports with ratios between 2/3 and 3/2) in reports of firm size, there were many cases with large proportional disagreements. While some of these disagreements may be trivial (the difference between firms of size two and of size four, for example), the fact that the distribution of ratios was skewed as much or more for multi-establishment firms as for single-establishment firms indicates that the discrepancies may not easily be ignored.

There are many different definitions of "firm," and the definition chosen should be driven by the analysis being conducted. In the NEHIS, the concern was with the operational organization of health benefits, which varied considerably among large firms. Some were very centrally organized, while others had separate benefits administration for subsidiaries or for regions or divisions. Generally, the firm size reported in the survey seemed to fall somewhere in between the entire-firm and headquarters sizes shown on the DMI register. However, the NEHIS data do not support determination of, for example, how closely any of the firm size figures match with the operational definition of "health benefit groups." Further, in modeling policy changes using NEHIS data (or other survey data including firm size), one should consider the effects of policy changes on how corporations define their own structures. For example, the Clinton health plan would have exempted firms over a certain size from some regulations. If firms felt that it was in their interest to be exempt, they would perhaps organize to maximize their firm size.

4. Conclusions

In general the DMI register works well as a sampling frame for high quality establishment-based surveys. The coverage of establishments appears to be near 98 or 99 percent (based on a study in eight states). Coverage of family farms and the self-employed is lower, although most of those with employees are probably covered. Other missing establishments are likely to be new small establishments. Approximately one-half of new establishments get on the list within the first year (from the same study of eight states). Coverage of all employees is probably higher than the coverage of establishments since it is much more likely for large establishments to be included in the register.

These weaknesses are similar to those of the Census and BLS frames. The Census frame is probably at least as likely to miss small new establishments, especially nonmanufacturing establishments in small firms. Self-employed are problematic for all business registers but are excluded from the Census frame. The BLS frame also misses many new establishments in new firms since it depends on establishments beginning to pay unemployment insurance to be identified for their register. The BLS frame also does not include employees who are not eligible for unemployment insurance, such as employees of many religious institutions and railroads. BLS estimates these employees to number around 2,000,000.
The main weakness of the DMI register is its inclusion of many small establishment listings that are no longer in business. This requires extra screening, especially in surveys that emphasize small businesses. This weakness, however, does not cause bias, only an increase in costs over what would result from a cleaner list. Use of the DMI abstract file allows for higher quality control on sample selection and for the implementation of more complicated sample designs. Overall, careful use of the DMI register can result in surveys that meet the standards expected of high quality government data collection.

Reference

Phillips, Bruce D. (1993). Perspectives on Small Business Sampling Frames. Proceedings of the International Conference on Establishment Surveys, American Statistical Association, Alexandria, VA, pp. 177-184.
File Type | application/pdf |
File Title | 1997: QUALITY OF THE DMI FILE AS A BUSINESS SAMPLING FRAME |
File Modified | 2002-09-14 |
File Created | 2002-08-02 |