The table below describes the input data used for the reference-free deconvolution protocol and how it can be obtained.
Name | URL | Description |
---|---|---|
GDC data download tool | https://gdc.cancer.gov/access-data/gdc-data-transfer-tool | On the website, select the version most appropriate for your computational infrastructure and follow the instructions to install the tool. |
Manifest file for TCGA-LUAD | https://portal.gdc.cancer.gov/legacy-archive/search/f | Select "TCGA" in the category "Cancer program" on the left and set "Disease type" to "lung adenocarcinoma". Then switch to the "Files" tab at the top of the left-hand panel and select "Methylation array" under "Experimental strategy". Finally, select "Raw intensities" under "Data type" and "Illumina Human Methylation 450" under "Platform", and click on "Download Manifest". |
Clinical metadata | https://portal.gdc.cancer.gov/projects/TCGA-LUAD | Click on the "Clinical" button and select "TSV" as the download option. Then unpack the downloaded .tar.gz file into a new subdirectory named "annotation" within your working directory. |
Mutational data | https://www.cbioportal.org/study/clinicalData?id=luad_tcga_pan_can_atlas_2018 | Hover over the download icon until the tooltip "Download clinical data for the selected cases" appears, click it, and store the resulting file luad_tcga_pan_can_atlas_2018_clinical_data.tsv. The file is required as input for the script that compares LMC proportions with mutational data. |
Tumor purity scores | https://static-content.springer.com/esm/art%3A10.1038%2Fncomms3612/MediaObjects/41467_2013_BFncomms3612_MOESM489_ESM.xlsx | Download the Excel file and browse to the sheet "RNASeqV2". From this sheet, select the "lung adenocarcinoma" cases from the column "platform" and store the result as a tabular file named "LUAD_ESTIMATE_RNAseqV2.tab" in the annotation subdirectory. The file is needed for the trait association script. |
Stemness indices | https://ars.els-cdn.com/content/image/1-s2.0-S0092867418303581-mmc1.xlsx | Download the Excel file and browse to the sheet "StemnessScores_DNAmeth". Store this sheet as a new file "stemness_index.csv" in the directory "annotation". This file is required by the stemness correlation script, which correlates LMC proportions with cancer stemness indices. |
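
Once the IDAT files have been fetched with the GDC data transfer tool, it can be worth confirming that every entry of the manifest actually arrived intact. The following is a minimal Python sketch of such a check, not part of the protocol itself. It assumes the usual GDC manifest layout (a tab-separated file with the columns `id`, `filename`, and `md5`) and that each file was stored under a subdirectory named after its file id; the function names `md5_of` and `check_downloads` are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(manifest_path, download_dir):
    """Compare downloaded files against a GDC manifest.

    Returns two lists of file names: (missing, corrupted).
    Assumption: files live at <download_dir>/<id>/<filename>.
    """
    missing, corrupted = [], []
    with open(manifest_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            target = Path(download_dir) / row["id"] / row["filename"]
            if not target.is_file():
                missing.append(row["filename"])
            elif md5_of(target) != row["md5"]:
                corrupted.append(row["filename"])
    return missing, corrupted
```

Calling `check_downloads("gdc_manifest.txt", "idat")` (both paths are placeholders) returns the files that still need to be downloaded and those whose checksum does not match the manifest.
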
The table below describes the supplementary resources that are available for the reference-free deconvolution protocol.
Name | URL | Description |
---|---|---|
Manifest file for TCGA-LUAD | https://github.com/CompEpigen/Decomp_web/blob/master/data/gdc_manifest.2019-01-23.txt | The manifest file for the TCGA LUAD dataset. It can be used to download the IDAT files and associated metadata from TCGA using the GDC data transfer tool. |
RnBeads report | http://epigenomics.dkfz.de/downloads/DecompProtocol/RnBeads_Report_TCGA_LUAD/ | The RnBeads report generated from the lung adenocarcinoma dataset from TCGA (TCGA-LUAD). The protocol only requires the Import and Quality Control modules to be executed, but we provide a complete execution of the RnBeads pipeline including exploratory analysis. |
RnBSet | http://epigenomics.dkfz.de/downloads/DecompProtocol/rnbSet_unnormalized.zip | An RnBSet object comprising sample metadata, DNA methylation data, and CpG annotations. The dataset has not been subjected to preprocessing or normalization. |
CpGs passing quality filtering | http://epigenomics.dkfz.de/downloads/DecompProtocol/sites_passing_quality_filtering.csv | A list of CpG identifiers from the Illumina Human Methylation 450 array that pass the stringent quality criteria for this particular dataset. The list comes as a comma-separated values (CSV) file. |
CpGs passing context filtering | http://epigenomics.dkfz.de/downloads/DecompProtocol/sites_passing_context_filtering.csv | A list of CpG identifiers from the Illumina Human Methylation 450 array that pass the context filtering steps. In this step, sites annotated to single nucleotide polymorphisms or located on the sex chromosomes are removed. |
CpGs passing complete filtering | http://epigenomics.dkfz.de/downloads/DecompProtocol/sites_passing_complete_filtering.csv | A list of CpG identifiers from the Illumina Human Methylation 450 array that pass all filtering steps employed in DecompPipeline. This list extends the one above by additionally removing sites that have been reported to be cross-reactive. |
FactorViz output | http://epigenomics.dkfz.de/downloads/DecompProtocol/FactorViz_outputs.tar.gz | This archive comprises the results of the deconvolution experiment on the TCGA-LUAD dataset; after extraction, it can be imported directly into FactorViz. The folder contains the MeDeComSet, the genomic annotation of the CpGs, and the sample metadata. |
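
Because the three CpG lists above correspond to successive filtering steps, comparing two of them shows which sites a given step removed. The sketch below is one way to do this in Python and is not taken from the protocol. The column layout of the CSV files is not described here, so the code makes an assumption: it treats every cell that looks like an Illumina probe identifier (`cg` followed by eight digits, or a `ch.` or `rs` probe) as a CpG identifier and ignores everything else, such as headers or row numbers. The names `read_probe_ids` and `removed_by_step` are hypothetical.

```python
import csv
import re

# Assumption: identifiers follow the usual Illumina naming scheme.
PROBE_ID = re.compile(r"^(cg\d{8}|ch\.\S+|rs\d+)$")


def read_probe_ids(path):
    """Collect all probe-like identifiers found anywhere in a CSV file."""
    ids = set()
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            for cell in row:
                cell = cell.strip()
                if PROBE_ID.match(cell):
                    ids.add(cell)
    return ids


def removed_by_step(before_path, after_path):
    """Return the sorted identifiers present before a step but not after it."""
    before = read_probe_ids(before_path)
    after = read_probe_ids(after_path)
    return sorted(before - after)
```

For example, `removed_by_step("sites_passing_context_filtering.csv", "sites_passing_complete_filtering.csv")` would list the sites dropped between the context-filtered and the completely filtered list.
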
The table below describes scripts that have been used to generate the plots shown within the protocol.
Name | URL | Description |
---|---|---|
Proportions heatmap | https://github.com/CompEpigen/Decomp_web/blob/master/data/proportion_heatmap.R | An R script to generate the proportions heatmap of the TCGA-LUAD deconvolution experiment with 7 LMCs. |
Trait association | https://github.com/CompEpigen/Decomp_web/blob/master/data/trait_association.R | An R script to compare both quantitative and qualitative traits with LMC proportions. |
Comparison with mutational data | https://github.com/CompEpigen/Decomp_web/blob/master/data/compare_with_mutational_data.R | An R script to associate LMC proportions with various kinds of genetic alterations. |
Comparison to cancer stemness | https://github.com/CompEpigen/Decomp_web/blob/master/data/correlation_with_stemness.R | An R script to associate LMC proportions with cancer stemness indices. |
Differential LMC analysis | https://github.com/CompEpigen/Decomp_web/blob/master/data/compare_LMCs_scatterplot.R | An R script to compare LMCs to one another, for instance to create scatterplots describing multiple LMCs. |
Proportions and gene expression | https://github.com/CompEpigen/Decomp_web/blob/master/data/quantify_gene_expression.R | An R script to compare LMC proportions with gene expression values of cell type marker genes in lung tissue. |
Survival analysis | https://github.com/CompEpigen/Decomp_web/blob/master/data/survival_analyis.R | An R script to perform survival analysis of patients with different LMC proportions. |