
Study Description

This sub-study, phs000878 CIDR Lung Cancer, contains genotype data, sequence data, and selected phenotypes for subjects in the phs000878 study. Summary-level phenotypes for the NCI Lung Cancer Transdisciplinary Research Cohort study participants can be viewed at the top-level study page, phs000876 Lung Cancer Transdisciplinary Research Cohort. Individual-level phenotype data and molecular data for the Lung Cancer Transdisciplinary Research Cohort top-level study and all of its sub-studies are available by requesting Authorized Access to the NCI Lung Cancer Transdisciplinary Research Cohort phs000876 study.

The study was conducted under the auspices of the Transdisciplinary Research In Cancer of the Lung (TRICL) Research Team, which is a part of the Genetic Associations and MEchanisms in ONcology (GAME-ON) consortium, and associated with the International Lung Cancer Consortium (ILCCO).

Ethics
All participants provided written informed consent. All studies were reviewed and approved by institutional ethics review committees at the involved institutions.

Sequencing data are derived from four substudies: Harvard, Liverpool, Toronto, and IARC. Each contributing study is described below.

Liverpool Lung Project: The Liverpool Lung Project (LLP) 1 is a case-control and cohort study of over 11,500 individuals, with detailed epidemiological, clinical and outcome data and associated specimens (i.e. tumour tissue, blood, plasma, sputum, bronchial lavage, EBUS and oral brushings). The participants have completed a detailed lifestyle questionnaire, and updated data on clinical outcome and hospital events are collected through the Office of National Statistics, Cancer Registry and Health Episode Statistics. The project is registered on the UK National Institute for Health Research (NIHR) lung cancer portfolio and has all the required ethical approvals and sponsorship arrangements in place. The LLP has detailed standard operating procedures (SOP) for all aspects of recruitment, data and specimen collection, and data storage. The LLP Cohort study has 8,224 participants with blood samples and 7,761 with plasma samples. The LLP case-control samples have been incorporated into a large number of international GWAS and molecular studies 2,3, methylation 4-7, microRNA 8 and next generation studies 9-11, resulting in high ranking publications, as well as forming the basis for the LLP risk prediction model 12-14, which has been utilised in the UK lung cancer screening trial (UKLS) 15-17.
Patient and control DNAs were derived from EDTA-venous blood samples.

Harvard Samples. David Christiani at the Harvard University School of Public Health has been directing research studies to investigate etiological factors influencing lung cancer development since 1983 and has amassed a collection of 2000 controls and 5055 lung cancer cases. He has been actively collecting and storing snap frozen tumor samples since 1992. Around 1500 tumor samples have been collected, with an average wet tumor yield of about 30 grams; 631 of these cases have completely annotated clinical and survival information. Pathology confirmation is provided by two pathologists. At the time of surgery, a minimum of 30 grams of wet lung tumor tissue and 30 grams of non-involved tissue from the same lobe are sectioned, flash frozen and sent to Dr. Christiani's lab for logging and storage. A blood sample for DNA and serum is collected. A structured interview is conducted by trained research staff on each case, and clinical outcomes and treatments are extracted and entered into the molecular epidemiology database at Harvard. Fresh frozen samples have been collected from 1451 lung cancer cases and are available for study. Samples from this collaborative study have played key roles in major studies, including the initial finding describing EGFR mutations in lung cancer 22. Participants in this study are patients, > 18 years of age, with newly diagnosed, histologically confirmed lung cancer. Samples that are included in the analysis have the following histologies: Adenocarcinoma: 8140/3, 8250/3, 8260/3, 8310/3, 8480/3, 8560/3; LCC: 8012/3, 8031/3; squamous carcinoma: 8070/3, 8071/3, 8072/3, 8074/3; and other NSCLC: 8010/3, 8020/3, 8021/3, 8032/3, 8230/3.
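The histology grouping listed for the Harvard samples can be expressed as a simple lookup from ICD-O-3 morphology code to histology group. The following is a minimal sketch only: the code lists are copied from the description above, while the names `HARVARD_HISTOLOGY_CODES`, `CODE_TO_GROUP` and `classify_histology` are hypothetical and are not part of any study pipeline.

```python
# ICD-O-3 morphology codes per histology group, copied from the Harvard
# sample description. The dictionary and function names are illustrative.
HARVARD_HISTOLOGY_CODES = {
    "Adenocarcinoma": ["8140/3", "8250/3", "8260/3", "8310/3", "8480/3", "8560/3"],
    "LCC": ["8012/3", "8031/3"],
    "Squamous carcinoma": ["8070/3", "8071/3", "8072/3", "8074/3"],
    "Other NSCLC": ["8010/3", "8020/3", "8021/3", "8032/3", "8230/3"],
}

# Invert the mapping so each code points at its histology group.
CODE_TO_GROUP = {
    code: group
    for group, codes in HARVARD_HISTOLOGY_CODES.items()
    for code in codes
}


def classify_histology(icd_o_3_code):
    """Return the histology group for a code, or None if the code is not listed."""
    return CODE_TO_GROUP.get(icd_o_3_code.strip())
```

A code that is absent from the listed sets returns `None`, which corresponds to a sample that would not be included in the analysis under this grouping.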

The Toronto Study: The Toronto study was conducted in the Greater Toronto Area between 1997 and 2014. Cases were recruited at hospitals in the network of the University of Toronto and the Lunenfeld-Tanenbaum Research Institute. At the time of recruitment in the clinical setting, provisional diagnoses of lung carcinoma were first assigned based on clinical criteria. Diagnoses for all included cases were histologically confirmed by the reference pathologist, a specialist in pulmonary pathology, based on review of pathology reports from surgery, biopsy or cytology samples in 100% of cases. Diagnostic classification was done initially according to ICD-9, ICD-10, and ICD-O-2, and subsequently converted to ICD-O-3. Tumors were grouped into the major categories included in this analysis according to primary cancer type based on the ICD-O-3 definitions. Controls were randomly selected from individuals visiting family medicine clinics and from Ministry of Finance Municipal Tax Tapes. All subjects were interviewed using a standard questionnaire, and information on lifestyle risk factors, occupational history, and medical and family history was collected. Blood samples were collected from more than 85% of the subjects.

IARC: The IARC data are derived from case-control studies conducted in Russia and include participants with available tissue samples. Patient and control DNAs were derived from EDTA-venous blood samples. The lung cancer patients were classified according to ICD-O-3; SQ: 8070/3, 8071/3, 8072/3, 8074/3; AD: 8140/3, 8250/3, 8260/3, 8310/3, 8480/3, 8560/3, 8251/3, 8490/3, 8570/3, 8574/3. Tumors with overlapping histologies were classified as mixed.

Study Inclusion/Exclusion Criteria

Cases: All cases had to have received a diagnosis of pathologically confirmed lung cancer. Tumors from patients were classified as adenocarcinomas (AD), squamous carcinomas (SQ), large-cell carcinomas (LCC), mixed adenosquamous carcinomas (MADSQ) and other non-small cell lung cancer (NSCLC) histologies following either the International Classification of Diseases for Oncology (ICD-O) or World Health Organization (WHO) coding. Tumors with overlapping histologies were classified as mixed.

All cases and controls were reported to be of European ancestry. During PLINK analysis, cases or controls that clustered more than 6 standard deviations from the centroid of the population were removed.
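The 6 standard deviation filter can be illustrated with a short sketch. This is not the PLINK procedure used by the study, whose exact options are not given here; it assumes each sample already has coordinates on a few ancestry components (for example principal components), and it treats "more than 6 standard deviations from the centroid" as a per-component z-score test. The function name `flag_ancestry_outliers` and its parameters are hypothetical.

```python
import numpy as np


def flag_ancestry_outliers(coords, threshold=6.0):
    """Flag samples lying more than `threshold` SDs from the population centroid.

    coords: array of shape (n_samples, n_components), e.g. principal
    component scores (an assumption; the study does not specify the input).
    Returns a boolean array that is True for samples to remove.
    """
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)       # population centroid per component
    sd = coords.std(axis=0, ddof=1)      # sample standard deviation per component
    z = np.abs(coords - centroid) / sd   # distance from centroid in SD units
    # A sample is an outlier if it exceeds the threshold on any component.
    return (z > threshold).any(axis=1)
```

Samples for which the returned mask is `True` would be dropped before association analysis. A Euclidean or Mahalanobis distance from the centroid would be an equally plausible reading of the criterion, and a component with zero variance would need separate handling.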

Controls were collected at each site according to site-specific matching schemes. The M.D. Anderson Cancer Site collected only ever-smoking cases matched to ever-smoking controls. Data on epidemiological risk factors were not available from the UK/ICR-GWAS, as its controls originated from the 1958 birth cohort and the Wellcome Trust Case Control Consortium, for which such data were not collected.

Molecular Data
Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment
Whole Exome Sequencing | Agilent | SureSelect Human All Exon v.1 Kit | N/A | N/A |
Study History

This study brings together four studies for which extensive genotyping has already been completed using either the OncoArray or an Affymetrix array genotyped in Canada, and which will be deposited into dbGaP once quality control procedures are complete.

Study Attribution
  • Principal Investigator
    • Christopher Amos. National Institutes of Health, Bethesda, MD, USA.
  • Funding Source
    • U19 CA148127. National Institutes of Health, Bethesda, MD, USA.
  • Sequencing Center
    • Center for Inherited Disease Research (CIDR). Johns Hopkins University, Baltimore, MD, USA.
  • Funding Source for CIDR Sequencing
    • HHSN268201200008I, NIH contract "High throughput genotyping for studying the genetic contributions to human disease". National Institutes of Health, Bethesda, MD, USA.
  • Sequencing Quality Control
    • Genetics Coordinating Center. Department of Biostatistics, University of Washington, WA, USA.