Information Collection Request
Revision
National Program of Cancer Registries Cancer Surveillance System
OMB No. 0920-0469
Supporting Statement: Part A and Part B
Program Officials
Cheryll C. Thomas, MSPH Epidemiologist Cancer Surveillance Branch Centers for Disease Control and Prevention Atlanta, Georgia 770-488-3254 (office) 770-488-4759 (fax)
Reda J. Wilson, MPH, RHIT, CTR Program Consultant Cancer Surveillance Branch Centers for Disease Control and Prevention Atlanta, Georgia 770-488-3245 (office) 770-488-4759 (fax)
October 2009
TABLE OF CONTENTS
A. JUSTIFICATION
A1. Circumstances Making the Collection of Information Necessary
A2. Purpose and Use of the Information Collection
A3. Use of Improved Information Technology and Burden Reduction
A4. Efforts to Identify Duplication and Use of Similar Information
A5. Impact on Small Businesses or Other Small Entities
A6. Consequences of Collecting the Information Less Frequently
A7. Special Circumstances Relating to the Guidelines of 5 CFR 1320.5
A8. Comments in Response to the Federal Register Notice and Efforts to Consult Outside the Agency
A9. Explanation of Any Payment or Gifts to Respondents
A10. Assurance of Confidentiality Provided to Respondents
A11. Justification for Sensitive Questions
A12. Estimation of Annualized Burden Hours and Costs
A13. Estimates of Other Total Annual Cost Burden to Respondents or Record Keepers
A14. Annualized Cost to the Federal Government
A15. Explanation for Program Changes or Adjustments
A16. Plans for Tabulations and Publication and Project Time Schedule
A17. Reason(s) Display of OMB Expiration Date is Inappropriate
A18. Exceptions to Certification for Paperwork Reduction Act Submissions
B. COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS
B1. Respondent Universe and Sampling Methods
B2. Procedures for the Collection of Information
B3. Methods to Maximize Response Rates and Deal with Nonresponse
B4. Test of Procedures or Methods to be Undertaken
B5. Individuals Consulted on Statistical Aspects and Individuals Collecting and/or Analyzing Data
ATTACHMENTS
1a Cancer Registries Amendment Act, Public Law 102-515
1b Section 306 of the Public Health Service Act [42 U.S.C. 242k]
2a Federal Register Notice
2b Summary of Public Comments and CDC Response
3 Data Collection and Data Flow Process for NPCR CSS
4a NPCR CSS Submission Specifications
4b OMB Burden Statement on NPCR CSS Web Site
5 NPCR CSS Data Release Policy
6 Participants in Consultation outside the Agency
7 308(d) Assurance of Confidentiality Certificate Approval
8 Institutional Review Board Approval Notification
9 Cases Reported to CDC in Year 2009 Submission
10 Sample of NPCR CSS Data Evaluation Report
ABSTRACT
CDC requests OMB approval to continue collecting cancer incidence information through the National Program of Cancer Registries Cancer Surveillance System (NPCR CSS) (OMB no. 0920-0469, exp. 1/31/2010). The NPCR CSS is the primary source of information for United States Cancer Statistics (USCS), which CDC has published annually since 2002. Information is collected electronically once per year. Changes to be implemented with this Revision include: (1) updated data definitions to reflect changes in national standards for cancer diagnosis and coding, and (2) a decrease in the number of respondents. The estimated burden per response will not change. Respondents will be 45 state-based central cancer registries (CCR), the CCR of the District of Columbia, the CCR of Puerto Rico, and the CCR responsible for aggregating information from 10 flag territories and freely associated states in the Pacific Islands. The adjusted number of respondents will result in a reduction in the total estimated burden hours for the NPCR CSS. OMB approval is requested for a period of three years.
At the HHS Comparative Effectiveness Research (CER) Federal Coordinating Council’s request, CDC’s Division of Cancer Prevention and Control (DCPC) developed a proposal to enhance data collection for CCRs. Upon approval of the entire project’s plan and receipt of additional funding, this ICR may be amended to incorporate the collection of data items necessary for comparative effectiveness research.
National Program of Cancer Registries Cancer Surveillance System
JUSTIFICATION
Background
The Centers for Disease Control and Prevention (CDC) is requesting OMB approval for the revision of the National Program of Cancer Registries Cancer Surveillance System (NPCR CSS) (OMB Control No. 0920-0469, exp. 1/31/2010). The NPCR CSS is an electronic data management system that produces comprehensive national statistics on cancer by compiling information collected through central cancer registries located in 45 states, the District of Columbia, Puerto Rico, and the U.S. Pacific Island Jurisdictions; the central cancer registries in the remaining 5 states are funded by the National Cancer Institute (NCI). Information from the central cancer registries funded by CDC is reported annually, and each respondent receives a feedback report on its annual submission.
Recognizing the public health value of comprehensive cancer surveillance at the state and national level, Congress mandated the National Program of Cancer Registries (NPCR) in 1992 by enacting the Cancer Registries Amendment Act, Public Law 102-515 (Attachment 1a), later incorporated into the Public Health Service (PHS) Act [42 U.S.C. 242k]. This legislation authorizes the CDC to provide funds to states and territories to: 1) improve existing cancer registries; 2) plan and implement registries where none existed; 3) develop model legislation and regulations for states to enhance the viability of registry operations; 4) set standards for data completeness, timeliness, and quality; 5) provide training for registry personnel; and 6) help establish a computerized reporting and data-processing system.
Information is collected and maintained at CDC under Section 306 of the Public Health Service (PHS) Act [42 U.S.C. 242k] (Attachment 1b).
Since the last clearance, the number of respondents used to estimate burden has changed from 63 to 48. In fiscal year 2008, CDC awarded $46 million to fund 45 states, the District of Columbia, Puerto Rico, and the Pacific Island Jurisdictions for central cancer registry operations. Ten territories, commonwealths, and freely associated states in the Pacific Region combined their resources and applied as the Pacific Region Central Cancer Registry. While the five states funded by the NCI are eligible for funding from CDC, including them in the burden estimate artificially increases the burden. The U.S. Virgin Islands was eligible to apply for funds from CDC, but did not do so for the most current funding cycle. These changes have also been noted in the 60-Day Federal Register Notice (Attachment 2a).
Since 2000, state and territorial cancer registries have been collecting and reporting cancer incidence data to CDC. Cancer registration within the United States involves multiple partners including physician offices, hospitals, state and local health departments, and professional organizations. In addition, the NPCR works with partners to facilitate the development and implementation of national data standards, which change over time to reflect changes in cancer diagnosis and treatment.
Cancer is the second leading cause of death in the United States, second only to heart disease. In 2005, the most recent year for which complete information is available, more than 500,000 people died of cancer and more than 1.34 million were diagnosed with cancer.(1) In addition to the personal impact of cancer, the financial burden is also substantial. The direct treatment costs of cancer in 2008 have been estimated at $93.2 billion, with additional indirect costs of $134.9 billion in lost productivity due to illness and premature death.(2) It is estimated that 9.4 million Americans are currently alive with a history of cancer.(3)
There are several effective primary and secondary prevention measures that could substantially reduce the number of new cancer cases and prevent many cancer-related deaths. To reduce the nation's cancer burden, behavioral and environmental factors that increase cancer risk must be reduced, and high-quality screening services and evidence-based treatments must be available and accessible, particularly to medically underserved populations. (4,5) The impact of primary and secondary prevention measures may be monitored through state cancer registries since they are designed to monitor cancer trends over time, determine cancer patterns in various populations, and guide planning and evaluation of cancer control programs.
The CDC and the states face the challenge of reducing cancer morbidity and mortality through prevention and early detection. Within CDC, the DCPC plans, directs, and supports cancer control efforts through collaboration with prevention partners in state health agencies; federal agencies; academic institutions; and national, voluntary and private sector organizations. To obtain a firm basis for such programs, DCPC is actively involved in surveillance and applied research.
Privacy Impact Assessment
The NPCR CSS has not been previously assessed. An overview of the data collection system and a listing of the items of information to be collected are provided in the subsequent sections.
Overview of the Data Collection System
NPCR-funded grantees collect and aggregate data for local public health purposes, and retain primary responsibility for information collection procedures. As depicted in Attachment 3, the first step occurs when a physician makes a diagnosis of cancer. Once a definitive diagnosis has been established and treatment planned, the data are entered into a local computer, usually with a commercial software package that includes quality control measures to assure high-quality data (step 2). NPCR has established a goal of no more than six months between the diagnosis and computerization of cancer data. Step 3 on the flow chart occurs when reporting facilities (hospitals, physicians' offices, radiation facilities, freestanding surgical centers, and pathology laboratories) perform additional quality control measures over and above what is performed at data entry. The data are sent to the central cancer registry (step 4). Quarterly submissions to the central registry are common, but larger facilities may report more often, and smaller facilities, less frequently.
After the central cancer registry receives the data, each incoming case must be checked against the existing database to ascertain if it is a new case or has been reported previously. At the same time additional quality control measures are applied (step 5). Based on this processing, the central cancer registry may return data to the reporting facility for clarification (step 6). Central cancer registries must link state incidence data with state mortality data to obtain cases that are first diagnosed at death (death certificate only cases). In addition to work that is done within state boundaries, central cancer registries are funded for inter-state data exchange to obtain cancer data on residents who travel to other states for diagnosis or treatment. Once quality control standards are met and the data are complete, they are ready for use and dissemination by the state and submission to CDC (steps 7 and 8). This process usually takes 12 to 18 months after the close of the year in which the cancer is diagnosed.
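The check of incoming cases against the existing database (step 5) can be illustrated with a minimal Python sketch. The record fields and the matching key below are hypothetical and chosen only for illustration; central cancer registries typically rely on richer linkage rules and manual review than an exact-match key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseReport:
    patient_id: str      # hypothetical registry-assigned patient identifier
    primary_site: str    # hypothetical primary site code
    diagnosis_year: int
    facility: str        # reporting facility that sent the report


def match_key(report):
    """Illustrative matching key (an assumption, not a registry standard)."""
    return (report.patient_id, report.primary_site, report.diagnosis_year)


def split_new_and_reported(existing, incoming):
    """Separate incoming reports into new cases and previously reported cases."""
    seen = {match_key(r) for r in existing}
    new_cases, already_reported = [], []
    for report in incoming:
        key = match_key(report)
        if key in seen:
            already_reported.append(report)
        else:
            seen.add(key)  # a later report of the same case counts as a duplicate
            new_cases.append(report)
    return new_cases, already_reported
```

In this sketch, a second report of the same cancer from a different facility is set aside as previously reported, so that each case is counted only once.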
After CDC receives the data from the individual central cancer registries, they are processed and data evaluation reports are generated (step 9). The data evaluation reports include an evaluation of state data against the standards for completeness of case ascertainment and data quality adopted by NPCR for program goals, and a report detailing each state’s submission, including edit errors.
Items of Information to be Collected
Once a year, CDC requests cumulative data from central cancer registries beginning with their reference year for NPCR (1995 for most programs) to the close of the most current diagnosis year (e.g., diagnosis 1995-2007 data were received in calendar year 2008, but named the 2009 submission since this is the calendar year that the data products will be released). CDC updates its longitudinal database each year with data from the most recent diagnosis year from the central cancer registries. The data items for the annual submission are based upon the North American Association of Central Cancer Registries (NAACCR) Standards for Cancer Registries, Volume II, which is a comprehensive reference to ensure uniform data collection, to reduce the need for redundant coding and data recording between agencies, and to facilitate the collection of comparable data among groups. To meet the needs of standard-setting organizations, central cancer registries, software vendors, and reporting facilities, NAACCR developed guidelines for major changes to be implemented on a three-year cycle (calendar years 2009 and 2012) and minor changes to be implemented on an annual cycle (calendar years 2010 and 2011).
Attachment 4a is a copy of the submission specifications that were sent to NPCR grantees in August 2008 providing instructions for the reporting of cancer incidence data to CDC in January 2009. Attachment 2 of the document contains a list of data items for the annual data submission. This table is updated annually based upon any changes outlined in the NAACCR Standards for Cancer Registries, Volume II.
Based upon this table, information in identifiable form (IIF) is collected by NPCR CSS. Specifically, date of birth and medical information about the types of cancer that occur (histology, morphology, and behavior), the anatomic location, the extent of disease at the time of diagnosis, the kinds of treatment received by cancer patients, and the outcomes of treatment and clinical management are collected.
Identification of Website(s) and Website Content Directed at Children Under 13 Years of Age
Data are compiled at the state central cancer registry, are transmitted to NPCR CSS via a secure Website and are encrypted during transmission. The encryption is accomplished via Secure Sockets Layer (SSL) strong encryption, the same level of protection used by e-commerce sites to protect financial transactions. This Website for data collection has no content directed at children under 13 years of age.
The NPCR CSS is designed to provide cancer incidence data that meet CDC’s responsibilities for public health surveillance while enhancing the quality, completeness, and timeliness of state cancer incidence data and monitoring progress toward the NPCR program objectives.
Grantees have been funded to improve the completeness, timeliness, and quality of population-based central cancer registry data. Under the current National Cancer Prevention and Control program announcement (CDC program announcement #DP07-703), grantees are requested to submit annual cancer incidence data to CDC.
As stated in Public Law 102-515 (Attachment 1a), state central cancer registries must collect each form of invasive cancer (with the exception of basal cell and squamous cell carcinoma of the skin). The central cancer registry routinely receives a standard set of data items on all cancer patients diagnosed in the state from hospitals, pathology labs, clinics and private physicians. Based on negotiations with the state, larger facilities may report monthly to the central cancer registry and smaller facilities less frequently. NPCR has established a goal of no more than six months between the diagnosis of cancer and receipt by the central registry. The cancer registries maintain these data items permanently in longitudinal databases that are used for public health surveillance, program planning and evaluation, and research.
Once a year, CDC requests cumulative data from central cancer registries beginning with their reference year for NPCR (1995 for most programs) to one year after the close of the most current diagnosis year (e.g., diagnosis 1995-2007 data in the calendar year 2008). Attachment 4a is a copy of the submission specifications that were sent to NPCR grantees in August 2008 providing instructions for the reporting of cancer incidence data to CDC in January 2009. CDC updates its longitudinal database each year with data from the most recent diagnosis year from the states.
A data contractor, ICF Macro, has been retained to assist with data management and analysis. Based on annual CSS submissions, standardized reports are generated by ICF Macro for the grantees and the CDC. These reports allow the program to monitor and evaluate the grantees’ performance with respect to the quality and completeness of their data. CDC will use the data for program planning and improvement, will provide regular feedback to grantees based on their data submissions, and will tailor technical assistance as indicated. In particular, CDC monitors the ability of each grantee to reach data standards with respect to the completeness, timeliness and quality of the data.
Within 24 months past the close of the most recent diagnosis year, each NPCR grantee is expected to have registry data that include at least 95% of the expected, unduplicated cases, where the expected cases are estimated by using methods developed by NAACCR (6). Because some cancer patients receive diagnostic or treatment services at more than one reporting facility, cancer registries perform a procedure known as unduplication to ensure that each cancer case is counted only once (7). Within 12 months past the close of the diagnosis year, grantees are expected to have registry data that include at least 90% of expected cases.
Within 24 months past the close of the diagnosis year, no more than 3% of cases are to have been ascertained solely on the basis of a death certificate. The proportion of cases ascertained solely on the basis of a death certificate, with no other information on the case available after the registry has completed a routine procedure known as “death clearance and follow back” (8), is an approximate measure of the completeness of case ascertainment.
Within 24 months past the close of the most recent diagnosis year, each NPCR grantee is expected to have registry data with no more than 2% of cases having missing information on sex; no more than 2% of cases having missing information on age; no more than 3% of cases having missing information on race; and no more than 2% of cases having missing information on county of residence.
Within 24 months past the close of the most recent diagnosis year, each NPCR grantee is expected to have registry data where at least 99% of the registry’s records passed a set of single-field and inter-field computerized edits. Computerized edits are computer programs that test the validity and logic of data components. Within 12 months past the close of the diagnosis year, grantees are expected to have data where at least 97% of the records pass edits.
These performance indicators may be modified or changed over time to more accurately reflect program priorities and areas of concern. These performance indicators will also be used for reporting to CDC officials, Congress and other national stakeholders.
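The 24-month standards listed above can be summarized as a simple threshold check. The following Python sketch is illustrative only: the threshold values come from the text, but the metric names, the dictionary layout, and the function names are assumptions and do not describe the actual NPCR CSS evaluation software.

```python
# 24-month data standards as stated in the text (values are percentages).
# Keys ending in "_min" are lower bounds; keys ending in "_max" are upper bounds.
STANDARDS_24_MONTH = {
    "completeness_min": 95.0,    # expected, unduplicated cases ascertained
    "dco_max": 3.0,              # death-certificate-only cases
    "missing_sex_max": 2.0,
    "missing_age_max": 2.0,
    "missing_race_max": 3.0,
    "missing_county_max": 2.0,
    "edits_passed_min": 99.0,    # records passing computerized edits
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def evaluate_submission(metrics, standards=STANDARDS_24_MONTH):
    """Return a dict mapping each metric name to True if the standard is met."""
    results = {}
    for name, threshold in standards.items():
        metric, kind = name.rsplit("_", 1)
        value = metrics[metric]
        if kind == "min":
            results[metric] = value >= threshold
        else:
            results[metric] = value <= threshold
    return results
```

As a usage example under these assumptions, a registry with 9,600 of 10,000 expected cases has a completeness of 96% and meets the 95% standard, whereas a registry with 3.5% of cases missing race does not meet the 3% standard for that item.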
When standards of completeness and quality have been met, CDC will aggregate the data, which may be used for the following purposes:
Cancer Surveillance: The CDC and the states face the challenge of reducing cancer morbidity and mortality through prevention and early detection. Effective cancer control requires the regular, ongoing collection and analysis of health-related data to monitor the frequency and distribution of the disease in the population. The NPCR CSS will help CDC continue to meet its public health responsibilities by providing routine surveillance reports on the national cancer burden by demographic characteristics, tumor characteristics, survival time, and other items of interest to the public health agencies responsible for the design, implementation, and evaluation of cancer prevention and control activities. CDC’s prevention efforts will be enhanced by the ability to target areas with high rates of cancer with appropriate screening such as mammography, Pap smears, and colorectal cancer screening. The Agency for Healthcare Research and Quality (AHRQ) (http://www.ahrq.gov/) includes measures for effectiveness of care in cancer. The AHRQ Healthcare Quality Report includes rates of advanced stage female breast and colorectal cancer and all invasive cervical cancer by state.
Since 2002, CDC and the NCI, in collaboration with NAACCR, have published United States Cancer Statistics (USCS) (http://www.cdc.gov/cancer/uscs). The USCS report contains a set of official federal cancer incidence statistics from each state that had high quality registry data. For cancer cases diagnosed in 2005, the most recent year for which federal data are available, 44 statewide population-based cancer registries and the District of Columbia, covering 96% of the U.S. population, met the eligibility criteria for inclusion in this report. Data for selected cancer sites are also available as pre-calculated counts and rates on the NCI/CDC State Cancer Profiles Website (http://statecancerprofiles.cancer.gov/) and on the CDC’s WONDER Website (http://wonder.cdc.gov/CancerIncidence.html).
The Council of State and Territorial Epidemiologists (CSTE) has voted to include cancer as part of the chronic disease indicators of the National Public Health Surveillance System (NPHSS) (9). The NPCR CSS continues to work to make timely data available for the NPHSS and publication in the Morbidity and Mortality Weekly Report.
Program Planning and Evaluation: CDC sponsors and supports a wide variety of public health programs in the U.S. designed to monitor and reduce morbidity and mortality from cancer such as the National Comprehensive Cancer Control Program, National Tobacco Control Program, the National Breast and Cervical Cancer Early Detection Program, the National Colorectal Cancer Roundtable, prostate cancer control initiatives, and the National Skin Cancer Prevention Education Program. Increasingly, there is Congressional and public demand for federal agency documentation and accountability of achievement of program objectives and outcomes (e.g., the Government Performance and Results Act of 1993).
Cancer information collected under NPCR CSS will be very important to evaluate the success and remaining challenges in meeting CDC program goals and objectives, as well as to identify areas that could benefit from education and training, technical assistance, and other resources.
Research: When all NPCR-funded cancer registries meet the data criteria for publication in United States Cancer Statistics, the registries will provide geographic coverage for 96% of the U.S. population. State registries, with the exception of large densely populated states, lack the number of cases to permit calculation of stable rates for special populations and in some cases the general population. Currently available data are frequently inadequate for the surveillance of cancer in special populations such as racial and ethnic minorities, medically under-served groups, and populations at high risk for selected cancers that may not be identifiable in statewide databases because of small numbers or other special circumstances.
Public use and restricted access datasets are in development (Attachment 5) that will provide a statistical basis for analyzing the cancer burden on a regional and national level (http://www.cdc.gov/cancer/npcr/datarelease.htm).
Privacy Impact Assessment Information
Cancer registration is the fundamental method in the United States by which information is systematically collected about the occurrence of cancer, about the types of cancer that occur (histology, morphology, and behavior), the anatomic location, the extent of disease at the time of diagnosis, the kinds of treatment received by cancer patients, and the outcomes of treatment and clinical management. With the specificity of the medical information that is collected and the geographic coverage, NPCR CSS is able to derive more accurate and stable estimates of cancer incidence for population groups including racial and ethnic minorities, medically underserved groups, and other subpopulations; to conduct regional and national analyses to more accurately identify geographic variability in cancer treatment practices; and to promote greater access to cancer data for the general public, scientists, and policymakers.
Prior to the use of data for cancer surveillance, program planning and evaluation, or research, data standards for completeness and quality must be met. Date of birth is required of all central cancer registries because some of the computerized edits used to check the data for its quality are written using the entire date of birth. Computerized edits are also used for the data items associated with the occurrence of cancer, about the types of cancer that occur (histology, morphology, and behavior), the anatomic location, the extent of disease at the time of diagnosis, the kinds of treatment received by cancer patients, and the outcomes of treatment and clinical management.
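The distinction between single-field and inter-field computerized edits, and the reason a full date of birth is needed for some of them, can be sketched in Python as follows. The field names, the code set for sex, and the two edits shown are hypothetical illustrations; they are not the edits actually applied to NPCR CSS submissions.

```python
from datetime import date

# Hypothetical set of allowed codes, used only for this illustration.
VALID_SEX_CODES = {"1", "2", "9"}


def edit_sex_code(record):
    """Single-field edit: the sex code must be one of the allowed values."""
    return record.get("sex") in VALID_SEX_CODES


def edit_birth_not_after_diagnosis(record):
    """Inter-field edit: date of birth must not be later than date of diagnosis."""
    birth = record.get("date_of_birth")
    diagnosis = record.get("date_of_diagnosis")
    if birth is None or diagnosis is None:
        return False
    return birth <= diagnosis


EDITS = {
    "sex_code": edit_sex_code,
    "birth_not_after_diagnosis": edit_birth_not_after_diagnosis,
}


def failed_edits(record):
    """Return the names of the edits that the record fails."""
    return [name for name, check in EDITS.items() if not check(record)]


def percent_passing(records):
    """Percentage of records that pass every edit."""
    if not records:
        raise ValueError("no records to evaluate")
    passed = sum(1 for record in records if not failed_edits(record))
    return 100.0 * passed / len(records)
```

The inter-field edit compares complete dates, which is why an edit of this kind cannot be evaluated reliably when only the month and year of birth are available.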
When the data have met the standards for completeness and quality, the data can be used. Attachment 5 outlines two types of data sets: public-use data sets (PUDS) and restricted-access data sets (RADS). Current users of the NPCR CSS data must sign a data release agreement as outlined in Attachment 5, which is updated annually.
PUDS are defined as data sets that are comprised of aggregated data (i.e., not individual case-specific data) that have been modified as needed, according to accepted procedures, to block breaches of confidentiality and prevent disclosure of the patient’s confidential information (10-16). PUDS will not contain information that is identifiable or potentially identifiable according to currently accepted procedures for reducing disclosure risk (10-16).
RADS are defined as versions of the full NPCR-CSS analytic data set (i.e., individual case-specific data) that have been modified as needed to minimize (but may not remove entirely) the potential for disclosure of confidential information. RADS will not contain personal identifiers such as a patient’s name, street address, or social security number, as this information is not transmitted by central cancer registries to CDC as part of their annual data submission. However, they may contain information that is potentially identifiable, especially when linked with other data sets, such as the occurrence of a rare cancer in a person of a certain age or racial or ethnic group. Only the month and year (and not the full date of birth) are provided in this data set. Because restricted-access data sets may potentially contain identifiable information, states will have the option to not have their data included in RADS.
Since the information received by CDC or its contractor as part of NPCR CSS could lead to direct or indirect identification of cancer patients, CDC applied for and received 308(d) confidentiality protection approval in May 2000 (see Attachment 7). In addition, any data published from NPCR CSS in surveillance reports, either in printed copy or on the Internet, will be scrutinized to assure that the confidentiality of the individual is protected.
The central cancer registries send their data to the CDC electronically via Secure Sockets Layer (SSL) encryption using standard data definitions and record layouts. Central cancer registries complete the accompanying data submission forms electronically.
NAACCR standards (17) help reduce errors, and the electronic transmission of data is efficient and minimizes the reporting burden on the states.
At the national level, cancer incidence data are available through the NCI’s SEER Program, which represents 9-26% of the population of the U.S. (http://seer.cancer.gov). These four states receive joint funding from the two federal programs and report their data to both federal agencies. SEER data are of high quality and are used to analyze long term trends in cancer incidence, patient survival, and for many other research purposes. While the SEER data are appropriate for analyses of major cancers in large population subgroups, they are not always adequate for analysis of U.S. regions, minority populations and rare cancer analyses. These data are not useful for most states for program planning and evaluation. When all NPCR-funded registries meet NPCR data standards, the NPCR CSS will cover 96% of the United States and will complement the SEER data to provide 100% coverage of the U.S. population. In the states where the SEER program covers a part of the state (Alaska, California, Georgia, Michigan, Washington) and the state participates in the NPCR, there is no duplication of effort. The SEER program reports data from their catchment area to the NPCR-funded state central cancer registry. Four additional states (California, Kentucky, Louisiana, and New Jersey) receive funding from NCI beginning in 2002 to enhance the representativeness of the SEER program.
NAACCR plays a leadership role in setting standards for the collection of cancer data and currently publishes population-based state cancer incidence data and aggregated state data yearly in Cancer Incidence in North America (CINA) (18). The submission of data to NAACCR is voluntary and varies from year to year. No public use data set is available to meet both public health surveillance needs and NPCR needs for program planning and evaluation.
The National Cancer Data Base (NCDB) from the American College of Surgeons (ACoS) contains data items required by the Commission on Cancer Approvals Program. NCDB is based on approximately 1,400 participating hospitals. The program was started in 1989 and approximately 75% of all U.S. cancer cases are collected annually. The data are not population-based since NCDB does not collect all cancer cases in a defined geographic area and cannot be used to calculate incidence rates.
While there are a number of cancer registration activities in the U.S., it is clear that the resulting data do not meet the public health need for a national cancer surveillance system. The NPCR CSS is unique in meeting the national need for a population-based dataset with adequate numbers of rare cancers, representation of minority populations, and state-based data for program planning and evaluation.
No small businesses will be involved in this study.
The NPCR CSS data aggregation occurs on an annual basis in November in place of a quarterly written report. The ability of CDC to monitor and improve program effectiveness would be compromised if data were collected less frequently. It is essential that CDC and State program managers evaluate program strengths and weaknesses on an annual basis and make adjustments. It is also important to provide annual information on the national cancer burden to CDC officials, Congress, constituents, and other Federal, State, and local agencies.
There are no legal obstacles to reduce the burden.
This request fully complies with the regulation 5 CFR 1320.5.
A. A 60-Day Federal Register Notice was published on July 27, 2009 in Volume 74, No. 142 pp. 37038-37039 (Attachment 2a). One public response was received; see Attachment 2b for the original inquiry and response. No changes were made to the proposed project.
B. Attachment 6 contains a list of experts in cancer registration that met with NPCR staff on August 12, 1998 to provide expert advice on data aggregation. These experts include representatives from grantees, NAACCR, NCI, and the American Cancer Society. There were no major problems to be resolved.
In May 1999, NPCR distributed a Rationale and Approach Paper for NPCR CSS to states and national partners (e.g., ACS, NCI, NAACCR) and comments were solicited. The most frequently asked question was about confidentiality of data. Some states have legislation that restricts the exchange of data and some states have policies that discourage the practice. The CDC respects state laws governing data release, and will work with states on this issue. In response to these concerns, CDC applied for and received a Confidentiality Assurance. CDC has based its approach to confidentiality for NPCR CSS on that of the National Center for Health Statistics (NCHS). The NCHS has been successful in protecting confidential health data for more than ten years. An NCHS Confidentiality Expert has reviewed our data release policy.
Since the inception of NPCR CSS, CDC staff have received continuous feedback from grantees on the annual data submission packet (Attachment 4a) and the data release policy (Attachment 5) through regularly scheduled conference calls of the NPCR Central Cancer Registry Council and the NPCR Scientific Working Group. Both groups are facilitated by CDC, and their members include a subset of the NPCR grantees.
No payment will be made to respondents (grantees) to submit NPCR data to CDC.
Confidentiality and privacy are of paramount concern to the NPCR because of the confidentiality concerns of the grantees, the private nature of medical data in a cancer surveillance database, and the potential for direct and deductive identification of an individual in the NPCR CSS. After extensive discussions with the CDC Privacy Officer, CDC obtained an Assurance of Confidentiality (308(d)) on June 7, 2000 (Attachment 7). The proposed new data collection will be covered under an extension of the 308(d).
The risk of direct identification of an individual in NPCR CSS data is remote because personal identifying data (name, social security number and street address) will not be reported to the CDC. However, a unique identifier assigned by the state to each individual cancer patient will be reported to CDC. While each record constitutes a single primary cancer, it is necessary to identify multiple primary cancers in an individual. The grantees maintain the linkage information between the unique codes and the personal identifiers in their database in order to respond to and follow up on data queries from CDC. Since multiple primary cancers are a matter of research interest, the public use files must also contain a unique identifier.
Of greater concern is the geographic data (e.g., county, census tract, zip code) that will be reported to CDC and the potential for deductive identification. Geographic data could be combined with other publicly available information and potentially be a threat to confidentiality. Because surveillance and analysis of cancer by county are of public health interest, NPCR proposes to make these data available, but to limit access, require a signed data release agreement, and provide guidelines for data use. CDC will create multiple datasets of increasing sensitivity with respect to geographic data (Attachment 5). In the first tier of data (the least confidential), state would be the smallest geographic unit released. The second, more sensitive dataset would contain county-level data, and the user would have to describe the need for county-level data. In data tiers one and two, other potential identifiers include date and place of birth, race, vital status, date of last contact and rare primary sites. These data will be examined prior to release and, if necessary, recoded to protect small population subsets. For example, only the month and year will be provided for potentially identifying dates such as date of birth, diagnosis and death. Once tier one and two data have been examined and recoded, we believe that they will not pose a significant risk to confidentiality.
A third and most sensitive dataset would contain census tract and zip code in addition to the variables in the first and second datasets. This dataset is the most likely to create opportunities for deductive identification and as such, CDC intends to guard this dataset very carefully. To provide data, CDC would need a research protocol, local IRB approval, and a plan to assure confidentiality. If CDC staff were co-investigators with states on the analysis, an IRB protocol would be submitted to a CDC IRB for review. Data would be provided to meet specific needs and data items would be collapsed when necessary to protect confidentiality. Only a limited number of tier three analyses would be approved each year. The NPCR CSS data use agreement is based on the NCHS model.
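The three data tiers described above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the field names (`patient_id`, `primary_site`, `census_tract`, and so on) and the function `release_view` are hypothetical and are not taken from Attachment 4a or Attachment 5, and recoding dates to month and year in every tier is an assumption made here for simplicity.

```python
from datetime import date

# Hypothetical field names, for illustration only; the actual data items
# are listed in Attachment 4a.
TIER_GEOGRAPHY = {
    1: ["state"],                                        # least sensitive
    2: ["state", "county"],                              # adds county
    3: ["state", "county", "census_tract", "zip_code"],  # most sensitive
}
DATE_FIELDS = ["date_of_birth", "date_of_diagnosis", "date_of_death"]


def release_view(record, tier):
    """Return the subset of a record that a given data tier would expose."""
    if tier not in TIER_GEOGRAPHY:
        raise ValueError("tier must be 1, 2, or 3")
    view = {
        "patient_id": record["patient_id"],      # state-assigned unique identifier
        "primary_site": record["primary_site"],
    }
    # Only the geographic units allowed for this tier are copied.
    for field in TIER_GEOGRAPHY[tier]:
        view[field] = record.get(field)
    # Assumption: dates are reduced to year and month in all tiers.
    for field in DATE_FIELDS:
        value = record.get(field)
        view[field] = value.strftime("%Y-%m") if value is not None else None
    return view


example = {
    "patient_id": "GA-000123",
    "primary_site": "C50.9",
    "state": "GA",
    "county": "Fulton",
    "census_tract": "0010.01",
    "zip_code": "30301",
    "date_of_birth": date(1950, 3, 14),
    "date_of_diagnosis": date(2006, 5, 2),
    "date_of_death": None,
}
tier_one = release_view(example, 1)
```

In this sketch the tier-one view carries only the state, while census tract and zip code appear only when tier three is requested.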
To address the issue of deductive identification of an individual because of small numbers (e.g., in a census tract), guidelines from the NCHS Staff Manual on Confidentiality will be used (19). NCHS has one set of guidelines for published data and another for micro-data files or public-use files. The guidelines for published data include: 1) “In no table should all cases of any line or column be found in a single cell”, 2) “In no case should the total figure for a line or column of a cross-tabulation be less than five unweighted cases”, and 3) “In no case should a quantity figure be based upon fewer than five unweighted cases.” The guidance for avoiding inadvertent disclosures through the release of micro data tapes includes 1) “The tape must not contain any detailed information about the subject that could facilitate identification and that is not essential for research purposes (e.g., exact date of the subject’s birth)” and 2) “Geographic places that have fewer than 100,000 people are not to be identified on the tape.” These guidelines from NCHS will serve as a model for NPCR CSS as confidentiality procedures are established. In addition, the program will need to be attentive to changes in the environment that may impact efforts to maintain confidentiality.
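A minimal sketch of how the first two published-data guidelines and the 100,000-person geographic threshold quoted above might be checked is shown here. The function names `check_table` and `place_may_be_identified` are hypothetical, the table is assumed to be a rectangular list of unweighted case counts, and the third published-data guideline (quantity figures) is not covered.

```python
def check_table(counts, minimum=5):
    """Flag rows and columns of a cross-tabulation that break the quoted rules.

    counts is a rectangular list of rows of unweighted case counts.
    """
    problems = []
    columns = [list(column) for column in zip(*counts)]
    for label, lines in (("row", counts), ("column", columns)):
        for index, line in enumerate(lines):
            total = sum(line)
            # Guideline 2: a line or column total below five unweighted cases.
            if total < minimum:
                problems.append(f"{label} {index}: total {total} is below {minimum}")
            # Guideline 1: all cases of a line or column in a single cell.
            if total > 0 and len(line) > 1 and max(line) == total:
                problems.append(f"{label} {index}: all cases fall in a single cell")
    return problems


def place_may_be_identified(population, threshold=100_000):
    """Micro-data guideline 2: places under the threshold are not identified."""
    return population >= threshold


table = [
    [12, 0, 0],   # every case in one cell
    [7, 9, 4],
    [1, 2, 1],    # row total below five
]
issues = check_table(table)
```

For the sample table, the first row is flagged because all of its cases sit in one cell, and the third row is flagged because its total is four.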
The study protocol (#2594) for CSS has been reviewed and approved by a CDC Institutional Review Board (IRB). The most current notice of approval (September 23, 2008) is attached (Attachment 8). The Division of Cancer Prevention and Control maintains IRB approval through the annual continuation process.
Privacy Impact Assessment Information
A. This submission has been reviewed by ICRO, who determined that the Privacy Act does not apply. Although grantees have access to personally identifiable information, only de-identified records are transmitted to CDC. Additional information on privacy safeguards applicable to data collection, de-identification, coding, transmission, storage, and reporting appears below.
B. The NPCR CSS data are secured by technical, physical, and administrative safeguards. A data contractor, ICF Macro, in Bethesda, Maryland, has been retained to assist with data management and analysis. The safeguards are outlined below:
Technical
The NPCR CSS project is undergoing the required Security Certification and Accreditation process managed by CDC’s Chief Information Security Officer.
The NPCR CSS project data reside on a dedicated server on ICF Macro’s local area network (LAN), behind the contractor’s firewall; the server is password protected on its own security domain. Access to the NPCR CSS server is limited to the contractor’s authorized project staff. No non-project staff are allowed access to the NPCR CSS. All of the contractor’s project staff are required to sign a confidentiality agreement before passwords and keys are assigned. All staff must pass background checks appropriate to their responsibilities for a public trust position.
NPCR CSS data that are submitted electronically are encrypted during transmission from the states. They arrive on a document server behind the data collection contractor’s firewall. Each State has its own directory location so no State has access to another State’s data. The data are moved automatically from the document server to the NPCR CSS server.
Once the data have been compiled by the contractor and delivered to CDC via the document server behind the firewall, all NPCR CSS datasets are maintained for restricted access on CDC’s secure LAN server.
Physical
The contractor’s NPCR CSS server is housed in a secure, guarded facility. All contractor staff are issued identification badges. Elevator and stairwell access is controlled by key cards.
Receipt and processing logs are maintained to document data receipt, file processing, and report production. All reports and electronic storage media containing NPCR CSS data will be stored under lock and key when not in use and will be destroyed when no longer needed.
Once the data have been compiled by the data collection contractor and delivered to CDC, all NPCR CSS datasets are maintained for restricted access on a secure LAN server, which is housed in a secure facility. All CDC staff are issued identification badges and access to the building is controlled by key cards.
Administrative
CDC staff and the contract staff have developed a security plan to ensure that the data are kept secure and confidential. Periodic review and update of the data collection contractor’s security processes is conducted to adjust for rapid changes in computer technology and to incorporate advances in security approaches. The security plan will be amended as needed to maintain the continued security and confidentiality of NPCR CSS data.
All project staff receive annual security awareness training covering security procedures. The contractor’s project security team oversees operations to prevent unauthorized disclosure of the NPCR CSS data.
Once the data have been delivered to CDC, access to these datasets is only granted when appropriate confidentiality release forms have been signed and returned to the NPCR CSS Data Security Steward.
C. The respondents for the NPCR CSS are central cancer registries, not individuals. Each central cancer registry is responsible for working with local sources of cancer information to comply with applicable local requirements relevant to informing patients about the intended uses of the information collection and any plans for sharing the information.
D. NPCR-funded central cancer registries are required to report patient-level information to CDC annually. Confidentiality and privacy are of paramount concern to the NPCR because of the confidentiality concerns of the grantees, the private nature of medical data in a cancer surveillance database, and the potential for direct and deductive identification of an individual in the NPCR CSS. After extensive discussions with the CDC Privacy Officer, CDC obtained an Assurance of Confidentiality (308(d)) on June 7, 2000 (Attachment 7). The proposed new data collection will be covered under an extension of the 308(d).
The threat of direct identification of an individual in NPCR CSS data is remote because personal identifying data (name, social security number and street address) will not be reported to the CDC. However, a unique identifier assigned by the state to each individual cancer patient will be reported to CDC. While each record constitutes a single primary cancer, it is necessary to identify multiple primary cancers in an individual. The grantees maintain the linkage information between the unique codes and the personal identifiers in their database in order to respond to and follow up on data queries from CDC. Since multiple primary cancers are a matter of research interest, the public use files must also contain a unique identifier.
This data collection includes sensitive information about cancer diagnosis and treatment, which is central to the purposes of the project. In addition, race and ethnicity data are collected per HHS guidelines, and for use in epidemiologic analyses. The information is required to meet cancer surveillance objectives.
A. Respondents are the 48 NPCR grantees (45 state-based CCRs, the CCR of the District of Columbia, the CCR of Puerto Rico, and the CCR that aggregates information from 10 flag territories and freely associated states in the Pacific Islands). The data items collected in the NPCR CSS are listed in Attachment 4a. The cancer incidence data are already collected and aggregated at the grantee level; thus, the additional burden of annual reporting to CDC is modest, consisting of some enhancements through linkages and algorithms and the time to electronically submit the data. NPCR program standards require funded awardees to submit initial data to the CDC twelve months after the close of a diagnosis year and more complete data at twenty-four months to obtain additional data and vital status from mortality data. The burden of reporting data to CDC is reduced by the use of data standards adopted by all NAACCR member registries as detailed in section A3 of the Supporting Statement. The estimated burden per response is 2 hours and the total estimated annualized burden hours are 96.
Table A12-A. Number of Respondents and Estimated Burden Hours
Respondents | Number of Respondents | Number of Responses per Respondent | Average Burden per Response (in hours) | Total Burden (in hours) |
NPCR Awardees | 48 | 1 | 2 | 96 |
States prepare their data files and send them electronically to CDC. The web page displays the OMB control number, the expiration date and a burden statement (Attachment 4b). This information appears on the login page of the website that the states use to transfer their files electronically.
B. The annualized cost to respondents of reporting data to CDC is estimated to be $3,312 per year. It is estimated that the following state cancer registry personnel will be required to help prepare and submit data electronically to CDC: data managers, information technology staff, and program directors. However, it should be noted that the specific nature of the work in the central cancer registries does not correlate with the employment categories as outlined by the Department of Labor. The categories listed below are similar in job description to those in central cancer registries.
Table A12-B. Annualized Cost to Respondents
Type of Respondents | Number of Respondents | Burden Hours per Respondent | Hourly Wage Rate* | Total Respondent Costs |
Operations Research Analysts | 48 | 1.0 | $36 | $1,728 |
Database Administrators | 48 | 0.5 | $35 | $840 |
Epidemiologists | 48 | 0.5 | $31 | $744 |
Total | | | | $3,312 |
*Based upon U.S. Bureau of Labor Statistics. Occupational Employment Statistics. May 2008 National Occupational Employment and Wage Estimates. Washington, DC: U.S. Bureau of Labor Statistics. Available at: http://data.bls.gov/cgi-bin/print.pl/oes/2008/may/oes_nat.htm [accessed July 6, 2009.]
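The figures in Table A12-B can be reproduced with a few lines of arithmetic. The sketch below assumes that each cost is the number of respondents multiplied by the hours per respondent and the hourly wage rate; the names `ROWS` and `respondent_costs` are illustrative only.

```python
from decimal import Decimal

# (occupation, number of respondents, hours per respondent, hourly wage in dollars)
ROWS = [
    ("Operations Research Analysts", 48, Decimal("1.0"), Decimal("36")),
    ("Database Administrators", 48, Decimal("0.5"), Decimal("35")),
    ("Epidemiologists", 48, Decimal("0.5"), Decimal("31")),
]


def respondent_costs(rows):
    """Cost per occupation = respondents x hours per respondent x hourly wage."""
    return {name: count * hours * wage for name, count, hours, wage in rows}


costs = respondent_costs(ROWS)
total_cost = sum(costs.values())
# Hours summed across occupations should match the total in Table A12-A.
total_hours = sum(count * hours for _, count, hours, _ in ROWS)

for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f}")
print(f"Total cost: ${total_cost:,.0f}; total hours: {total_hours:,.0f}")
```

Under this assumption the three rows come to $1,728, $840, and $744, for a total of $3,312, and the hours across the three occupations add up to the 96 burden hours reported in Table A12-A.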
The computer hardware and software needed for an electronic data submission to CDC are readily available to grantees since they collect and distribute cancer incidence data for state purposes; hence no capital or maintenance costs are anticipated.
The average annual cost for the contractor for data collection is $1,506,958 per year for a five-year total of $7,804,790. A data management contract was awarded to ICF Macro in calendar year 2008. Additional annual costs include personnel costs of federal employees involved in oversight and analysis. The annual staff cost is estimated at $120,000 (1.0 epidemiologist FTE, 0.2 public health advisor FTE, and miscellaneous expenses such as travel).
Table A14. Estimated Annualized Federal Government Cost Distribution
Cost Category | Annualized Cost |
CDC Personnel Subtotal | $120,000 |
Data Contractor Subtotal | $1,506,958 |
Total | $1,626,958 |
This Revision includes a reduction in burden attributable to a reduction in the number of respondents. Respondents will be 48 NPCR awardees (45 state-based CCRs, the CCR of the District of Columbia, the CCR of Puerto Rico, and the CCR that aggregates information from 10 flag territories and freely associated states in the Pacific Islands). In the previous OMB approval period, the territories, commonwealths, or freely-associated states were counted as individual respondents. In the next OMB approval period, the 10 flag territories, commonwealths, and freely-associated states will be counted as one respondent to more accurately reflect funding, operations and actual response burden. The estimated burden per response has not changed.
CDC requests a 3-year clearance for the proposed, recurring data collection. Data will be received every year in November from grantees. In addition to data from the current diagnostic year, data will be requested back to the reference year for the program, which for most states is 1995. Data submissions will usually be a combination of new data from the most recent diagnostic year and re-submissions from previous years that are improved in quality and completeness. Each year the process of data submission, data editing, data enhancement, and creation of public use datasets will be repeated (Table A16). The schedule each year will be:
Table A16. Time Schedule for Data Reporting, Analysis and Publication
Tasks | Schedule |
Data submission received from grantees | November 30 |
Data processed and edited by CDC | January |
Data analysis file created | February |
Data Evaluation Reports and data edits returned to grantees | April |
Public use datasets available for surveillance | July |
United States Cancer Statistics published | October |
There is no request for a date display exemption.
There are no exceptions to the certification.
B. COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS
Respondents are the 48 states and territories that currently receive CDC funds from the NPCR to create and enhance central cancer registries. The maximum number of respondents possible is 63, which includes all 50 states, 12 territories, and the District of Columbia that are eligible to apply for funds from NPCR.
Statistical methods are not employed. Data collection at the state level is population based and these data will be reported annually to NPCR-CSS. Attachment 9 lists the number of cases each state reported to NPCR-CSS in calendar year 2008. Over 17 million incident cases of cancer for diagnosis years 1995-2007 were reported to CDC from 45 states and the District of Columbia.
Required program data will be reported to CDC by grantees once a year in November in lieu of one quarterly report. Because NPCR-funded grantees collect and aggregate data for local public health purposes, they retain primary responsibility for information collection procedures. As depicted in Attachment 3, the first step occurs when a physician makes a diagnosis of cancer. Once a definitive diagnosis has been established and treatment planned, the data are entered into a computer, usually with a commercial software package that includes quality control measures to assure high-quality data (step 2). NPCR has established a goal of no more than six months between the diagnosis and computerization of cancer data. Step 3 on the flow chart occurs when reporting facilities (hospitals, physicians' offices, radiation facilities, freestanding surgical centers, and pathology laboratories) perform additional quality control measures over and above what is performed at data entry. The data are sent to the central cancer registry (step 4). Quarterly submissions to the central registry are common, but larger facilities may report more often, and smaller facilities, less frequently.
After the central cancer registry receives the data, each incoming case must be checked against the existing database to ascertain if it is a new case or has been reported previously. At the same time additional quality control measures are applied (step 5). Based on this processing, the central cancer registry may return data to the reporting facility for clarification (step 6). Central cancer registries must link state incidence data with state mortality data to obtain cases that are first diagnosed at death (death certificate only cases). In addition to work that is done within state boundaries, central cancer registries are funded for inter-state data exchange to obtain cancer data on residents who travel to other states for diagnosis or treatment. Once quality control standards are met and the data are complete, they are ready for use and dissemination by the state and submission to CDC (steps 7 and 8). This process usually takes 12 to 18 months after the close of the year in which the cancer is diagnosed.
The number of years of data that a state will report will depend on when funding began (1995 at the earliest) and when the state began a population-based central cancer registry (the reference year). Twenty-seven central cancer registries have a reference year of 1995, 14 have a reference year of 1996, three have a reference year of 1997, three have a reference year between 1998 and 2000, and one is in the planning and implementation stage.
Once a year, CDC requests cumulative data from central cancer registries beginning with their reference year for NPCR (1995 for most programs) to the close of the most current diagnosis year (e.g., diagnosis 1995-2007 data were received in calendar year 2008, but named the 2009 submission since this is the calendar year that the data products will be released). CDC updates its longitudinal database each year with data from the most recent diagnosis year from the central cancer registries. The data items for the annual submission are based upon the NAACCR Standards for Cancer Registries, Volume II, which is a comprehensive reference to ensure uniform data collection, to reduce the need for redundant coding and data recording between agencies, and to facilitate the collection of comparable data among groups. To meet the needs of standard-setting organizations, central cancer registries, software vendors, and reporting facilities, NAACCR developed guidelines for major changes to be implemented on a three-year cycle (calendar years 2009 and 2012) and minor changes to be implemented on an annual cycle (calendar years 2010 and 2011). Attachment 4a contains a list of data items for the annual data submission for calendar year 2008. This table is updated annually based upon any changes outlined in the NAACCR Standards for Cancer Registries, Volume II.
Prior to receiving the data, CDC requests that central cancer registries run their data through a set of computerized edits. These data edits check the content of data fields against an encoded set of acceptable codes and provide feedback on the quality of the data. There are three types of edits: 1) single-field edits (edits that verify one data item at a time), 2) inter-field edits (edits that verify one data item and its relationship to other related data items), and 3) inter-record edits (edits that compare data recorded across more than one record and are used for patients with multiple tumors). In collaboration with other standard-setting organizations, CDC participates in a working group that modifies and reviews existing edits as well as creates new edits. As with NAACCR Standards for Cancer Registries, Volume II, these edits are continually updated. The instructions included in Attachment 4a contain a list of data edits for the annual data submission for calendar year 2008.
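The three types of edits can be illustrated with a simplified sketch. It is not the edit set used for the annual submission: the field names, the set of accepted codes, and the specific checks below are hypothetical and were chosen only to show how a single-field, an inter-field, and an inter-record edit differ.

```python
from collections import defaultdict
from datetime import date

# Hypothetical code set, for illustration only.
VALID_SEX_CODES = {"1", "2", "9"}


def single_field_edits(record):
    """Verify one data item at a time against its accepted codes."""
    errors = []
    if record["sex"] not in VALID_SEX_CODES:
        errors.append("sex: code is not in the accepted set")
    return errors


def inter_field_edits(record):
    """Verify one data item in relation to another item in the same record."""
    errors = []
    if record["date_of_diagnosis"] < record["date_of_birth"]:
        errors.append("date_of_diagnosis precedes date_of_birth")
    return errors


def inter_record_edits(records):
    """Compare records that share a patient identifier (multiple tumors)."""
    births = defaultdict(set)
    for record in records:
        births[record["patient_id"]].add(record["date_of_birth"])
    return {
        patient_id: ["date_of_birth differs across tumor records"]
        for patient_id, dates in births.items()
        if len(dates) > 1
    }


def run_edits(records):
    """Apply all three edit types and return (subject, message) pairs."""
    report = []
    for index, record in enumerate(records):
        for message in single_field_edits(record) + inter_field_edits(record):
            report.append((f"record {index}", message))
    for patient_id, messages in inter_record_edits(records).items():
        for message in messages:
            report.append((f"patient {patient_id}", message))
    return report


sample = [
    {"patient_id": "A1", "sex": "1", "date_of_birth": date(1950, 3, 14),
     "date_of_diagnosis": date(2006, 5, 2)},
    {"patient_id": "A1", "sex": "1", "date_of_birth": date(1951, 3, 14),
     "date_of_diagnosis": date(2007, 1, 9)},
    {"patient_id": "B2", "sex": "7", "date_of_birth": date(1960, 1, 1),
     "date_of_diagnosis": date(1959, 12, 31)},
]
edit_report = run_edits(sample)
```

In the sample data, the third record fails both the single-field and the inter-field edit, and patient A1 fails the inter-record edit because the two tumor records carry different dates of birth.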
Once CDC receives the data from the individual central cancer registries, they are processed and data evaluation reports are generated as indicated in step nine on the flow chart. The data evaluation reports include the results of evaluating state data by the data standards for completeness of case ascertainment and data quality as adopted by NPCR for program goals and a report detailing the states’ submission including details of edit errors. Attachment 10 outlines the components of the data evaluation reports. In calendar year 2008, over 17 million incident cases of cancer for diagnosis years 1995-2007 were reported to CDC from 45 states and the District of Columbia.
When standards of completeness and quality have been met, CDC will aggregate state data and make them available in non-confidential recalculated rates on the Internet in a format that facilitates obtaining data by sex, race, age, and other common factors of interest. Any data published from NPCR CSS in surveillance reports, either in printed copy or on the Internet, will be scrutinized to assure that the confidentiality of the individual is protected. Current users of the NPCR CSS data must sign a data release agreement as outlined in a data release policy that is updated annually. Restricted-access data sets will be made available at a future date once the appropriate processes relating to confidentiality and security are implemented.
In calendar year 2008, 47 of 48 eligible NPCR grantees (45 states, the District of Columbia, and Puerto Rico) reported their data to NPCR CSS. The use of existing data standards and record layouts for electronic submission of data makes it easy for states to comply with the request. Many NPCR states submit data to NAACCR and exchange data with neighboring states using these standards and formats. There should be few technical difficulties for states in using these familiar processes.
In addition to ease of reporting, there are a number of other incentives for states to submit data. The incentives include an independent and detailed assessment of data quality and the recoding of important data items such as primary site and histology to national standards used for analysis. Evaluation of awardees has been based on progress toward meeting NPCR standards and not solely on achievement of program standards; however, CDC is moving toward performance-based funding.
There is no reason to believe that the response rate in subsequent years would be much lower than 100%. If a state has difficulty submitting data, the CDC Project Officer and/or the CDC data contractor would provide assistance. NPCR will also be working with states to assure that they have complete coverage of the population in their catchment area.
Following the awarding of the data management contract in 2000, CDC requested grantees to voluntarily provide the contractor with test data that summer. In total, 31 state and territorial cancer registries sent files of cancer incidence data. This enabled the contractor to create and test the programs for receiving and evaluating the completeness and timeliness of the cancer incidence data, including processes used to edit data, create reports, provide feedback, display data on the Internet, and create datasets. The first NPCR request for data was made in 2001. Each year the system is tested and refined based on test data from previous years’ submissions. States are not requested to send additional data to test the system.
A data contractor, ICF Macro, has been retained to assist with data management and analysis of NPCR CSS. The data collection was designed by Dr. Hannah Weir, who was the previous technical monitor of the data collection contract. The current technical monitor is Dr. Christie R. Eheman, Chief of the Cancer Surveillance Branch, Division of Cancer Prevention and Control, National Center for Chronic Disease Prevention and Health Promotion, CDC. The CDC project officer for the contract is Christine Dauer, public health advisor at the same address.
U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2005 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute; 2009. Available at: www.cdc.gov/uscs.
National Heart, Lung, and Blood Institute. NHLBI Fact Book, Fiscal Year 2007. Bethesda (MD): National Heart, Lung, and Blood Institute; 2008.
Horner MJ, Ries LAG, Krapcho M, Neyman N, Aminou R, Howlader N, Altekruse SF, Feuer EJ, Huang L, Mariotto A, Miller BA, Lewis DR, Eisner MP, Stinchcomb DG, Edwards BK (eds). SEER Cancer Statistics Review, 1975-2006, Bethesda (MD): National Cancer Institute; 2009. Available at: http://seer.cancer.gov/csr/1975_2006/.
Curry SJ, Byers T, Hewitt M. Fulfilling the Potential of Cancer Prevention and Control. Washington (DC): The National Academies Press; 2003.
Haynes MA, Smedley BD. The Unequal Burden of Cancer: An Assessment of NIH Research and Programs for Ethnic Minorities and the Medically Underserved. Washington (DC): The National Academies Press; 1999.
Tucker TC, Howe HL, Weir HK. Certification of population-based cancer registries. Journal of Registry Management 1999; 26(1): 24-27.
North American Association of Central Cancer Registries (NAACCR). Standards for Cancer Registries: vol. III: Standards for Completeness, Quality, Analysis and Management of Data. Springfield (IL): NAACCR; 2000.
Menck HR, West DW. Central cancer registries. In: Hutchison CL, Roffers SD, Fritz AG. Cancer Registry Management: Principles and Practice. Lenexa (KS): National Cancer Registrars Association; 1997. p. 395-422.
Chronic Disease Committee, Council of State and Territorial Epidemiologists. Inclusion of Cancer Incidence and Mortality Indicators in the National Public Health Surveillance System (Position Statement #CD 4). June 1998.
Centers for Disease Control and Prevention. CDC/ATSDR/CSTE Data Release Guidelines for Re-Release of State Data. Atlanta: Centers for Disease Control and Prevention; 2003 (available upon request).
American Statistical Association. Privacy, Confidentiality, and Data Security Web Site. Alexandria, VA: American Statistical Association; 2005. Available at http://www.amstat.org/comm/cmtepc/index.cfm.
Doyle P, Lane JI, Theeuwes JM, Zayatz LM (eds). Confidentiality, Disclosure, and Data Access: Theory and Practical Application for Statistical Agencies. Amsterdam: Elsevier Science BV; 2001.
Federal Committee on Statistical Methodology. Checklist on Disclosure Potential of Proposed Data Releases. Available at http://www.fcsm.gov/committees/cdac/cdac.html.
Federal Committee on Statistical Methodology. Report on Statistical Disclosure Limitation Methodology. (Statistical Working Paper 22). Washington (DC): Office of Management and Budget; 1994. Available at http://www.fcsm.gov/working-papers/spwp22.html.
McLaughlin C. Confidentiality protection in publicly released central cancer registry data. Journal of Registry Management 2002; 29(3):84-88.
Stoto M. Statistical Issues in Interactive Web-Based Public Health Data Dissemination Systems. Draft report prepared for the National Association of Public Health Statistics and Information Systems, Rand Corporation; September 2002.
Havener, L and Thornton, M (editors). Standards for Cancer Registries, Vol II: Data Standards and Data Dictionary, 13th ed. Springfield, Ill: North American Association of Central Cancer Registries; 2008.
North American Association of Central Cancer Registries. Cancer in North America, 2000-2006: volumes 1-3. Springfield (IL): North American Association of Central Cancer Registries; 2009. Available at: http://www.naaccr.org/index.asp?Col_SectionKey=11&Col_ContentID=50
National Center for Health Statistics. NCHS Staff Manual on Confidentiality. Hyattsville, MD: Centers for Disease Control and Prevention, National Center for Health Statistics; 1997.
File Type | application/msword |
Author | hbw4 |
Last Modified By | tfs4 |
File Modified | 2009-10-22 |
File Created | 2009-10-22 |