Study Description

This sub-study, phs001681 Affy Axiom Array, contains genotype data, sequence data, and selected phenotypes for subjects in the phs001681 study. Summary-level phenotypes for the NCI Lung Cancer Transdisciplinary Research Cohort study participants can be viewed at the top-level study page phs000876 Lung Cancer Transdisciplinary Research Cohort. Individual-level phenotype data and molecular data for the Lung Cancer Transdisciplinary Research Cohort top-level study and all of its sub-studies are available by requesting Authorized Access to the NCI Lung Cancer Transdisciplinary Research Cohort phs000876 study.

The study was conducted under the auspices of the Transdisciplinary Research In Cancer of the Lung (TRICL) Research Team, which is part of the Genetic Associations and MEchanisms in ONcology (GAME-ON) consortium and is associated with the International Lung Cancer Consortium (ILCCO).

Ethics
All participants provided written informed consent. All studies were reviewed and approved by institutional ethics review committees at the involved institutions.

CARET. The Carotene and Retinol Efficacy Trial (CARET) was a randomized, double-blind, placebo-controlled trial of the cancer prevention efficacy and safety of a daily combination of 30 mg of beta-carotene and 25,000 IU of retinyl palmitate in 18,314 persons at high risk for lung cancer. CARET began in 1985, and the intervention was halted in January 1996, 21 months ahead of schedule, with the twin conclusions of definitive evidence of no benefit and substantial evidence of a harmful effect of the intervention on both lung cancer incidence and total mortality. CARET continued to follow participants and collect endpoints through 2005. Pathology reports and medical records were reviewed to confirm cancer endpoints, and death certificates were obtained to capture cause of death. During the active intervention phase of CARET, serum, plasma, whole blood, and lung tissue specimens were collected from participants. These biospecimens make up the CARET Biorepository. For the OncoArray Project, CARET provided DNA extracted from whole blood of lung cancer cases and controls matched on age at baseline (+/- 4 years), sex, race, baseline smoking status, history of occupational asbestos exposure (asbestos vs heavy smoker), and year of enrollment (2-year intervals).

EPIC. The European Prospective Investigation into Cancer and Nutrition (EPIC) study is a multi-center cohort study involving 521,000 study participants from 10 European countries. The current study involved EPIC participants from 7 countries (Greece, Netherlands, UK, France, Germany, Spain, and Italy), including 1010 incident lung cancer cases and 1025 smoking-matched controls.

Nurses' Health Study (NHS)

The NHS was initiated in 1976, when 121,700 United States registered nurses between the ages of 30 and 55 returned an initial questionnaire reporting medical histories and baseline health-related exposures. Every two years, follow-up questionnaires are mailed to the participants (questionnaire response rate >90% for all follow-up cycles). Diet was assessed in 1986, 1990, 1994, and 1996 with a semiquantitative food frequency questionnaire. Between 1989 and 1990, blood samples and relevant sample collection information (blood draw features such as fasting status at blood collection) were collected from 32,826 women.

National Physicians Health Study (NPHS)

The remaining 22,071 physicians were then randomly assigned to receive active aspirin and active beta-carotene (n=5,517), active aspirin and beta-carotene placebo (n=5,520), aspirin placebo and active beta-carotene (n=5,519), or aspirin placebo and beta-carotene placebo (n=5,515). In total, 11,037 physicians were randomized to aspirin and 11,034 to aspirin placebo; summing the same four arms, 11,036 physicians were randomized to beta-carotene and 11,035 to beta-carotene placebo.

The Multiethnic Cohort (MEC). The MEC Study includes 215,251 men and women aged 45-74 years at recruitment, primarily from five ethnic/racial groups: African Americans and Latinos (mostly recruited from CA, mainly from Los Angeles County) and Japanese Americans, Native Hawaiians and whites (mostly recruited from HI). The cohort was assembled in 1993-1996 by mailing a self-administered questionnaire to persons identified primarily through driver's license files. The baseline questionnaire obtained information on demographics, anthropometry, smoking history, medical and reproductive histories, family history of cancer, diet and physical activity. Incident cancer cases are identified by regular linkage with the State of California Cancer Registry and the Hawaii Tumor Registry, both members of the SEER Program of the NCI. In 2001-2006, a prospective biorepository was assembled by collecting a pre-diagnostic blood specimen from 67,594 surviving MEC members. At the time of blood collection, a short questionnaire was administered that included information on smoking during the previous 15 days. For this study, cases were all incident lung cancer cases diagnosed after blood draw and before December 2012. For each case, a control was selected among unaffected MEC participants who were alive at the time of the case's diagnosis and matched on study site, sex, race/ethnicity, age (age at diagnosis for cases; age at blood collection for controls), and date of blood collection.

The Mount-Sinai Hospital-Princess Margaret Study (MSH-PMH). MSH-PMH was conducted in the greater Toronto area from 2008 to 2013. Lung cancer cases were recruited at the hospitals in the network of the University of Toronto. Controls were selected randomly from individuals registered in the databases of family medicine clinics and were frequency matched with cases on age and sex. All subjects were interviewed, and information on lifestyle risk factors, occupational history, and medical and family history was collected using a standard questionnaire. Tumors were centrally reviewed by the reference pathologist (a member of the International Association for the Study of Lung Cancer (IASLC) committee) and a second pathologist in the University Health Network. If the reviews conflicted, a consensus was reached after discussion. Coding of histology was based on the 2001 WHO/IASLC classification. Genomic DNA was extracted using a standard protocol.

PLCO. The PLCO study, a randomized trial aimed at evaluating the efficacy of screening in reducing cancer mortality, recruited approximately 155,000 men and women aged 55 to 74 years from 1992 to 2001. Screening for lung cancer among participants in the intervention arm included a chest x-ray at baseline followed by either three annual x-rays (for current or former smokers at enrollment) or two annual x-rays (for never smokers); participants in the control arm received routine health care. Screening-arm participants provided data on sociodemographic factors, smoking behavior, anthropometric characteristics, medical history, and family history of cancer, as well as blood samples annually for the first 6 years of the study (baseline [T0] and T1 through T5). Lung cancers were ascertained through annual questionnaires mailed to the participants, and positive reports were followed up by abstracting medical records or death certificates. Follow-up in the trial as of July 2009 was 96.7%. Participants were excluded because of a missing baseline questionnaire, previous history of any cancer, diagnosis of multiple cancers during follow-up, missing smoking information at baseline, missing consent for utilization of biologic specimens for etiologic studies, or unavailability or insufficient quantity of serum or DNA specimens.

The Harvard Lung Cancer Study (HLCS). HLCS is a case-control study conducted at Massachusetts General Hospital (MGH) in Boston, Massachusetts from 1992 to 2004. Eligible cases included any person over the age of 18 years with a diagnosis of primary lung cancer that was further confirmed by an MGH lung pathologist. Controls were recruited from the friends or spouses of cancer patients or the friends or spouses of other surgery patients in the same hospital. Potential controls were excluded from participation if they had a diagnosis of any cancer (other than non-melanoma skin cancer). Interviewer-administered questionnaires, based on a modified version of the standardized American Thoracic Society respiratory questionnaire, collected information on demographics, medical history, family history of cancer, smoking history, and a detailed work history, including job titles and tasks. The Institutional Review Board of MGH and the Human Subjects Committee of the Harvard School of Public Health approved the study.

Liverpool Lung Project. The Roy Castle Lung Study of the Liverpool Lung Project (LLP) is a case-control and cohort study which has recruited over 11,500 individuals since 1996 from the Liverpool region in the UK. Detailed epidemiological and clinical data are collected with associated specimens (i.e. tumour tissue, blood, plasma, sputum, bronchial lavage and oral brushings). Participants completed a detailed lifestyle questionnaire at recruitment, with repeat questionnaires at intervals; updated data on clinical outcome and hospital events are collected through the Health and Social Care Information Centre (including Office for National Statistics mortality data, Cancer Registry and Hospital Episode Statistics). The project is registered on the UK National Institute for Health Research (NIHR) lung cancer portfolio and has all the required ethical approvals and sponsorship arrangements in place. The lung tumours were reviewed by the reference pathologist.

The IARC Russian Cancer study. Lung cancer cases and controls were recruited through a multicenter case-control study coordinated by the International Agency for Research on Cancer in Russia, from 2005 to 2013. Cases were incident cancer patients recruited from general hospitals. Controls were recruited from individuals visiting general hospitals and out-patient clinics for disorders unrelated to lung cancer and/or its associated risk factors, or from the general population. Information on lifestyle risk factors and on medical and family history was collected from subjects by interview using a standard questionnaire. All study participants provided written informed consent. The current study included 4882 lung cancer cases and 5313 controls genotyped on the Affymetrix Axiom Array.

Summary: After quality control, the following samples are available for analysis.

Study (code) | Design | Location | Cases, n (%) | Controls, n (%) | Total
CARET (CART22) | Nested case-control | USA | 524 (34.8) | 978 (65.2) | 1502
EPIC-Lung (EPIC45) | Nested case-control | Europe (Multi-site) | 481 (51.3) | 456 (48.7) | 937
NHS/NPHS (HARV50) | Nested case-control | USA | 475 (50.7) | 462 (49.3) | 937
LCS (HSPH01) | Hospital-based | USA | 656 (47.5) | 726 (52.5) | 1382
LLP (LLPC52) | Hospital-based | UK | 393 (50.8) | 381 (49.2) | 774
MEC (MECO46) | Nested case-control | USA | 651 (48.9) | 680 (51.1) | 1331
MSH-PMH (MCCT48) | Clinic-based | Canada | 1049 (55.5) | 842 (44.5) | 1891
PLCO (PLCO53) | Nested case-control | USA | 175 (33.3) | 351 (66.7) | 526
RUSS (RUSS54) | Hospital-based | Russia | 478 (52.2) | 437 (47.8) | 915
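
The per-study counts above can be cross-checked against the overall totals of 4882 cases and 5313 controls quoted earlier. The following is a minimal sketch, assuming the first count in each row is cases and the second is controls; the names `ROWS` and `summarize` are illustrative and not part of the study's own tooling.

```python
# Counts transcribed from the sample summary: (study, cases, controls, total).
# Assumption: first count = cases, second count = controls.
ROWS = [
    ("CARET", 524, 978, 1502),
    ("EPIC-Lung", 481, 456, 937),
    ("NHS/NPHS", 475, 462, 937),
    ("LCS", 656, 726, 1382),
    ("LLP", 393, 381, 774),
    ("MEC", 651, 680, 1331),
    ("MSH-PMH", 1049, 842, 1891),
    ("PLCO", 175, 351, 526),
    ("RUSS", 478, 437, 915),
]


def summarize(rows):
    """Check each row's total and return overall (cases, controls)."""
    for study, cases, controls, total in rows:
        if cases + controls != total:
            raise ValueError(f"{study}: {cases} + {controls} != {total}")
    total_cases = sum(cases for _, cases, _, _ in rows)
    total_controls = sum(controls for _, _, controls, _ in rows)
    return total_cases, total_controls


if __name__ == "__main__":
    cases, controls = summarize(ROWS)
    print(f"cases={cases} controls={controls} all={cases + controls}")
```

Under that column assumption, the sums reproduce the 4882 cases and 5313 controls stated in the text.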

Quality Control for Markers:

  • 1) 414504 markers from TRICL CHIP (414603, excluding 99 markers with missing chromosome)
  • 2) 392574 markers excluding MNP, INDEL
  • 3) 382020 markers excluding duplicated SNPs (based on Affy's SNPs, one to one matched to dbSNPs, but some dbSNPs are missing)
  • Call rate: 376571 of the 382020 SNP markers (0.9857363) have a call rate >= 95%
  • Median Concordance rate among 1801 duplicated markers is 0.998
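
As a rough illustration of the marker filtering described above, the counts can be arranged as a funnel and the number of markers removed at each step derived by subtraction. This is only a sketch built from the figures quoted in the list; the names `FUNNEL`, `removed_per_step`, and `pass_fraction` are hypothetical, and the code does not reproduce the actual QC pipeline.

```python
# Marker counts after each QC step, as quoted in the list above.
FUNNEL = [
    ("markers on TRICL chip", 414603),
    ("after excluding markers with missing chromosome", 414504),
    ("after excluding MNPs and indels", 392574),
    ("after excluding duplicated SNPs", 382020),
]

# Markers (out of the final 382020) with call rate >= 95%.
PASSING_CALL_RATE = 376571


def removed_per_step(funnel):
    """Return (step label, number of markers removed at that step)."""
    return [
        (label, previous - current)
        for (_, previous), (label, current) in zip(funnel, funnel[1:])
    ]


def pass_fraction(passing, total):
    """Fraction of markers meeting the call-rate threshold."""
    return passing / total


if __name__ == "__main__":
    for label, removed in removed_per_step(FUNNEL):
        print(f"{label}: removed {removed}")
    final_count = FUNNEL[-1][1]
    print(f"call-rate pass fraction: {pass_fraction(PASSING_CALL_RATE, final_count):.7f}")
```

The first step removes 99 markers, matching the count stated in item 1, and the pass fraction agrees with the quoted 0.9857363.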
Despite the excellent concordance rate, there were a large number of association signals among markers with low allele frequency. We therefore applied additional filtering steps:

Study Inclusion/Exclusion Criteria

Cases. All cases had a pathologically confirmed diagnosis of lung cancer. Tumors were classified as adenocarcinomas (AD), squamous carcinomas (SQ), large-cell carcinomas (LCC), mixed adenosquamous carcinomas (MADSQ) and other non-small cell lung cancer (NSCLC) histologies following either the International Classification of Diseases for Oncology (ICD-O) or World Health Organization (WHO) coding. Tumors with overlapping histologies were classified as mixed. The ICD-O-3 codes used for classification were: Squamous: 8070/3, 8071/3, 8072/3, 8074/3; Adenocarcinoma: 8140/3, 8250/3, 8260/3, 8310/3, 8480/3, 8560/3, 8251/3, 8490/3, 8570/3, 8574/3; LCC: 8012/3, 8031/3; other NSCLC: 8010/3, 8020/3, 8021/3, 8032/3, 8230/3. All patients were reported to be of European ancestry.
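
The ICD-O-3 grouping above amounts to a lookup table with a rule for overlapping histologies. The following is a minimal sketch of that rule, not the consortium's own code: the names `HISTOLOGY_CODES` and `classify` are hypothetical, and the handling of codes outside the listed groups (returned as "unclassified") is an assumption, since the text does not say how such codes were treated.

```python
# ICD-O-3 morphology codes per histology group, as listed in the text.
HISTOLOGY_CODES = {
    "SQ": {"8070/3", "8071/3", "8072/3", "8074/3"},
    "AD": {
        "8140/3", "8250/3", "8260/3", "8310/3", "8480/3",
        "8560/3", "8251/3", "8490/3", "8570/3", "8574/3",
    },
    "LCC": {"8012/3", "8031/3"},
    "Other NSCLC": {"8010/3", "8020/3", "8021/3", "8032/3", "8230/3"},
}

# Reverse lookup: code -> group label.
CODE_TO_GROUP = {
    code: group for group, codes in HISTOLOGY_CODES.items() for code in codes
}


def classify(codes):
    """Classify a tumor from its ICD-O-3 code(s).

    Codes falling in more than one group give "mixed" (overlapping
    histologies). Codes not in the table give "unclassified", which is
    an assumption of this sketch.
    """
    groups = {CODE_TO_GROUP[c] for c in codes if c in CODE_TO_GROUP}
    if not groups:
        return "unclassified"
    if len(groups) > 1:
        return "mixed"
    return next(iter(groups))


if __name__ == "__main__":
    print(classify(["8070/3"]))
    print(classify(["8140/3", "8071/3"]))
```

A single squamous code maps to "SQ", while a tumor carrying both an adenocarcinoma and a squamous code is labeled "mixed".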

Controls. Controls were selected to be free of lung and other nonmelanoma skin cancers at the time of recruitment. Controls were not matched on smoking status, except in the M.D. Anderson Cancer Center study, which matched on smoking status. Most studies frequency matched on age category (within 5-year intervals) and sex.

Molecular Data
Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment
Whole Genome Genotyping | Affymetrix | Axiom Genome-Wide Human Origins | N/A | N/A |
Study History

This study brings together 9 studies that have all been genotyped on the Affymetrix Axiom platform. One data set provided by the Melbourne Cohort Study was excluded because a large proportion of its samples came from Guthrie cards, which had a high failure rate. One manuscript has been published describing these studies.

Study Attribution
  • Principal Investigator
    • Christopher Amos. National Institutes of Health, Bethesda, MD, USA.
  • Funding Source
    • U19 CA148127. National Institutes of Health, Bethesda, MD, USA.