Welcome to GD2Viz

Welcome to GD2Viz, a powerful and user-friendly visualization tool designed to help researchers and clinicians explore, analyze, and interpret GD2 Scores across various RNA-Seq datasets. Utilizing the advanced methodology outlined by Ustjanzew et al. (2024), GD2Viz offers an in-depth examination of precomputed GD2 Scores from publicly available RNA-Seq datasets such as TCGA, GTEx, and TARGET. Users also have the flexibility to upload and analyze their own datasets within the context of GD2 Scores.

This R Shiny application version of GD2Viz provides a range of interactive visualizations and preloaded datasets. Additionally, the GD2Viz R package includes several functions for computing Reaction Activity Scores of glycosphingolipid metabolism and predicting GD2 Scores directly within the R environment. For comprehensive guidance, refer to the GD2Viz Vignette.


Key Features

  • Interactive Visualizations: Generate dynamic, real-time interactive plots, heatmaps, and network diagrams to thoroughly explore your data.
  • GD2 Score Analysis for Large Datasets: Investigate GD2 Scores across extensive datasets like TCGA, TARGET, and GTEx. Dive deep into individual projects within the TCGA dataset and analyze GD2 Scores alongside various sample metadata.
  • Predict GD2 Scores for Your Datasets: Effortlessly compute Reaction Activity Scores and GD2 Scores for your datasets, with options to visualize and download the results.
  • User-Friendly Interface: Experience smooth navigation with our intuitive and thoughtfully designed user interface.
  • Group Comparison: Compare two groups or conditions within your dataset to observe log-fold changes in Reaction Activity Scores of glycosphingolipid metabolism.

Development Team

GD2Viz was developed at the Institute for Medical Biostatistics, Epidemiology, and Informatics (IMBEI) of the University Medical Center of the Johannes Gutenberg University Mainz. The development team includes:

  • Arsenij Ustjanzew: Developer
  • Federico Marini: Developer
  • Claudia Paret: Methodological and Clinical Support

For more information, visit our website or consult the GD2Viz Vignette. If you have any questions or need assistance, please don’t hesitate to reach out to our support team.

Background Information

Gangliosides are sialylated glycosphingolipids (GSLs) crucial to the nervous system, with GD2 being a significant member. GD2, a simple ganglioside, is predominantly expressed in embryonic tissues and certain neuroblastic tumors, such as neuroblastoma (NB). Its expression is linked to the undifferentiated state of neural stem cells and NB. GD2 is particularly important because it serves as a target for immunotherapies, including monoclonal antibodies like Dinutuximab and Naxitamab, and CAR-T cell treatments, which are pivotal in treating high-risk NB. Consequently, understanding GD2's role and its expression in tumors can enhance therapeutic strategies and improve patient outcomes not only in neuroblastoma but also in other tumor entities.

We created a graph where nodes are metabolites and edges represent reactions. Reactions describe which enzymes are involved in the respective metabolic step. This graph includes four pathways from the KEGG database related to sphingolipid and glycosphingolipid metabolism, including the lacto-, neolacto-, globo-, and ganglio series. The reactions representing the degradation of GM1 to GM2 and GM2 to GM3 were removed from the graph. This is justified because these reactions represent the degradation process and are not competing enzymatic processes of the biosynthesis. Several enzymes in this metabolic network are specific to GSLs, though some are also involved in other lipid-related pathways.

The graph is weighted using Reaction Activity Scores (RAS), calculated from gene expression data. When more than one enzyme is involved in a reaction, the RAS combines their expression values. How they are combined depends on whether the enzymes of a reaction are subunits that work only in the presence of all enzyme subunits (AND relation) or independent enzymes (OR relation). For an AND relation, the minimum gene expression value of the involved enzymes is used as the RAS value. For an OR relation, the RAS value is computed as the sum of all involved gene expression values. These values weight the graph's edges, creating a directed graph with metabolites as nodes.
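The AND/OR rule can be illustrated with a minimal Python sketch. It is not the GD2Viz implementation (GD2Viz is an R package); the function name, gene names, and expression values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def reaction_activity_score(expression: Dict[str, float],
                            genes: List[str],
                            relation: str) -> float:
    """Combine the expression of a reaction's enzymes into one RAS value.

    relation == "AND": enzymes are subunits, so the least expressed one limits the reaction.
    relation == "OR":  enzymes act independently, so their expression values add up.
    """
    values = [expression[gene] for gene in genes]
    if relation == "AND":
        return min(values)
    if relation == "OR":
        return sum(values)
    raise ValueError(f"unknown relation: {relation!r}")


# Hypothetical expression values for one sample (illustrative numbers only).
sample = {"GENE_A": 12.0, "GENE_B": 3.0, "GENE_C": 5.0}

ras_and = reaction_activity_score(sample, ["GENE_A", "GENE_B"], "AND")  # minimum of 12.0 and 3.0
ras_or = reaction_activity_score(sample, ["GENE_A", "GENE_C"], "OR")    # 12.0 + 5.0
```

Here `ras_and` is 3.0 and `ras_or` is 17.0.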

Early steps in the ganglioside pathway are performed by enzymes with relatively high substrate specificity, whereas downstream enzymes are promiscuous and elongate the parallel series of this pathway, so reactions catalyzed by the same enzyme receive identical RAS values. To adjust for these identical RAS values, we used the topological information of transition probabilities (TP). Three methods were developed; two of them, the 'TP adjustment' and the 'recursively adjusted RAS', are described here. For the simple 'TP adjustment', we compute the TPs from one node to the following node(s) proportional to the RAS values of the outgoing edge(s) and multiply the RAS values of the edges by the TP values. 'Recursively adjusted RAS' replaces TP values that are equal to 1 by recursively propagating the TP value from preceding edges in the chain that are not equal to 1. Lactosylceramide was chosen as the starting node for this calculation. For further details on the methods used, see 'Ustjanzew et al., Unraveling the Glycosphingolipid Metabolism by Leveraging Transcriptome-weighted Network Analysis on Neuroblastic Tumors. Cancer and Metabolism, 2024.'
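The simple 'TP adjustment' can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the package code: the node names M0 to M3 and the RAS values are hypothetical, and the recursive variant is not shown.

```python
from collections import defaultdict
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (source metabolite, target metabolite)


def tp_adjust(ras: Dict[Edge, float]) -> Dict[Edge, float]:
    """Multiply each edge's RAS by its transition probability (TP).

    The TP of an edge is its RAS divided by the summed RAS of all edges
    leaving the same source node.
    """
    outgoing_total: Dict[str, float] = defaultdict(float)
    for (source, _target), value in ras.items():
        outgoing_total[source] += value

    adjusted: Dict[Edge, float] = {}
    for (source, target), value in ras.items():
        total = outgoing_total[source]
        tp = value / total if total > 0 else 0.0
        adjusted[(source, target)] = value * tp
    return adjusted


# Hypothetical toy graph: M0 branches to M1 and M2, M1 continues to M3.
ras = {("M0", "M1"): 6.0, ("M0", "M2"): 2.0, ("M1", "M3"): 4.0}
adjusted = tp_adjust(ras)
```

In this toy graph the edge M0 to M1 has a TP of 0.75 and an adjusted RAS of 4.5, while M1 to M3 is the only outgoing edge of M1, so its TP is 1 and its RAS stays at 4.0. Edges of this kind are the case the recursive adjustment is meant to handle.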

For the model, we combined and normalized RNA-seq data from two sources: TARGET Neuroblastoma samples and the GTEx dataset. After weighting the graph's edges per sample with the (adjusted) RAS values, the next step was to identify GD2-mitigating and GD2-promoting reactions, as shown in the figure above (step 3). For each sample, we calculated the sum of GD2-mitigating and GD2-promoting reactions and used these scores to train a Support Vector Machine (SVM) with a linear kernel to differentiate between Neuroblastoma and other GTEx tissues.

It is important to note that the goal was not to create a perfect discriminator model. Since GD2 concentration is a continuous variable, we use the sample's distance from the hyperplane as the GD2 score. If a sample's GD2 score is positive or slightly below zero, it suggests a higher accumulation of GD2 in that sample, based solely on the RNA-seq data.
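To make the idea of a hyperplane distance concrete, the following Python sketch computes the signed geometric distance of a sample from a linear decision boundary. The weights, bias, and feature values are hypothetical, not the trained GD2Viz model, and whether the application reports the geometric distance or the unnormalized decision value is an assumption left open here.

```python
import math
from typing import Sequence


def signed_distance(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Signed distance of point x from the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical linear model over two features per sample:
# (sum of GD2-promoting RAS, sum of GD2-mitigating RAS).
weights = (3.0, -4.0)
bias = 0.0

score_a = signed_distance((10.0, 5.0), weights, bias)  # promoting reactions dominate
score_b = signed_distance((2.0, 4.0), weights, bias)   # mitigating reactions dominate
```

With these illustrative numbers, `score_a` is 2.0 and `score_b` is -2.0, so the sign indicates the side of the boundary and the magnitude indicates how far the sample lies from it.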

The RNA-seq dataset for which the GD2 score is predicted (the test dataset) has to be normalized relative to the training dataset. We therefore use the DESeq median ratio normalization method, in which the size factor of each test sample is estimated using the gene-wise geometric means from the training set. The normalized count data is then processed to RAS values as described previously. The sum of GD2-mitigating and GD2-promoting reactions is computed for each sample and used as input for the Support Vector Machine model to predict the GD2 score.
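The normalization against a fixed training reference can be sketched as follows. This is a simplified Python illustration of the median ratio idea, assuming that genes with a zero count in any training sample are excluded from the reference; the gene names G1 to G4 and all counts are hypothetical.

```python
import math
from statistics import median
from typing import Dict, List


def training_geometric_means(training: Dict[str, List[float]]) -> Dict[str, float]:
    """Gene-wise geometric means over the training samples (genes with a zero count are skipped)."""
    means = {}
    for gene, counts in training.items():
        if all(c > 0 for c in counts):
            means[gene] = math.exp(sum(math.log(c) for c in counts) / len(counts))
    return means


def size_factor(test_counts: Dict[str, float], reference: Dict[str, float]) -> float:
    """Median of the ratios between a test sample's counts and the training reference."""
    ratios = [test_counts[gene] / ref
              for gene, ref in reference.items()
              if test_counts.get(gene, 0) > 0]
    return median(ratios)


# Hypothetical training counts (two samples) and one test sample.
training = {"G1": [10, 40], "G2": [100, 100], "G3": [0, 5], "G4": [8, 2]}
test_sample = {"G1": 40, "G2": 200, "G3": 7, "G4": 12}

reference = training_geometric_means(training)  # G3 is dropped because of its zero count
sf = size_factor(test_sample, reference)
normalized = {gene: count / sf for gene, count in test_sample.items()}
```

The test sample is sequenced roughly twice as deeply as the reference, so its size factor is about 2 and its normalized counts are about half of the raw counts.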

Getting started with GD2Viz

Explore the GD2 score of six RNA-Seq datasets

The “Public Datasets” tab allows you to analyze the GD2 score across six major RNA-seq datasets: TCGA Tumor samples, TCGA normal samples, GTEx, TARGET, St. Jude Cloud, and the CBTTC dataset from the Pediatric Brain Tumor Atlas. To save time, we’ve precomputed both adjusted and unadjusted RAS values for these datasets. You can fine-tune the SVM model settings by selecting how the RAS values should be used—either raw, ranged, or scaled—and choose the preferred RAS adjustment method. Customize the GD2 score visualization with various plot options, such as scatter, box, or violin plots. Additionally, you can normalize the GD2 score to a 0-1 range, select a grouping variable, or highlight specific sample groups, like Glioblastoma Multiforme within the TCGA dataset.
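As an illustration of the 0-1 option, the sketch below rescales a set of GD2 scores with a min-max transformation. That the application uses exactly this transformation is an assumption, and the scores are made-up values.

```python
from typing import List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale scores linearly so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        # All scores are identical; map them to 0 to avoid dividing by zero.
        return [0.0 for _ in scores]
    return [(s - lowest) / (highest - lowest) for s in scores]


gd2_scores = [-2.0, 0.0, 1.0, 2.0]  # hypothetical GD2 scores
normalized_scores = min_max_normalize(gd2_scores)
```

The rescaling preserves the ordering of samples; here the result is 0.0, 0.5, 0.75, and 1.0.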

Detailed exploration of TCGA subtypes

In the “TCGA Cancer Types” tab, you can dive deeper into specific projects within the TCGA Tumor dataset. Select a TCGA project and further analyze the data by grouping scatter plots based on experimental variables such as gender, mRNA subtype, or DNA methylation subtype. As in the first tab, you set the model parameters by deciding whether to use raw, ranged, or scaled RAS values and selecting the desired RAS adjustment method.

Beneath the initial GD2 score visualization, you can conduct differential gene expression analysis (DEA) for the entire project or a specific subset, like mRNA subtype LGr4 in the Glioblastoma Multiforme project. Use the GD2 score for stratification, with options to divide samples into GD2-high and GD2-low groups using the median, set a specific GD2 threshold, or create three groups based on percentiles. Customize DEA settings, including False Discovery Rate, filtering, and hypothesis weighting. The overall DEA result describes how many genes were identified as up- or down-regulated. The “DEA Genes” table allows you to view and search the significant genes and their statistics. Select a gene by clicking on its row in the table to see further information on the gene, as well as its distribution between the GD2-high and GD2-low groups, in the next two boxes. The two diagnostic plots provide further information on the DEA result, as do the two summary plots: the MA plot and the interactive volcano plot.
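The three stratification options can be sketched in Python as shown below. The sample names, scores, default percentiles, and the handling of ties (samples exactly at a cut-off go to the lower group) are assumptions made for illustration and may differ from the application.

```python
from statistics import median, quantiles
from typing import Dict


def split_by_threshold(scores: Dict[str, float], threshold: float) -> Dict[str, str]:
    """Two groups: samples above the threshold are GD2-high, all others GD2-low."""
    return {s: ("GD2-high" if v > threshold else "GD2-low") for s, v in scores.items()}


def split_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Two groups, using the median GD2 score as the threshold."""
    return split_by_threshold(scores, median(list(scores.values())))


def split_by_percentiles(scores: Dict[str, float],
                         lower: int = 25, upper: int = 75) -> Dict[str, str]:
    """Three groups: below the lower percentile, above the upper percentile, and in between."""
    cuts = quantiles(list(scores.values()), n=100, method="inclusive")
    low_cut, high_cut = cuts[lower - 1], cuts[upper - 1]
    groups = {}
    for sample, value in scores.items():
        if value < low_cut:
            groups[sample] = "GD2-low"
        elif value > high_cut:
            groups[sample] = "GD2-high"
        else:
            groups[sample] = "GD2-mid"
    return groups


# Hypothetical GD2 scores for four samples.
gd2 = {"s1": -1.5, "s2": -0.2, "s3": 0.4, "s4": 2.1}
by_median = split_by_median(gd2)
by_threshold = split_by_threshold(gd2, 0.0)
by_percentiles = split_by_percentiles(gd2)
```

In this toy example the median and the threshold of 0 produce the same two groups, while the percentile split places s1 in the low group, s4 in the high group, and s2 and s3 in between.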

Analyze your own data

The third tab, “Analyze Your Data”, allows you to upload your dataset and compute the RAS values and the GD2 score. You can upload either a count matrix .tsv file together with a metadata .tsv file, or a DESeqDataSet object as an .rds file. The counts should be raw, not normalized, and genes should be annotated as Gene Symbols. The column names (samples) of the count matrix should be identical to the row names of the metadata file. The metadata file can contain further columns with experimental variables, e.g. subgroups or treatments (for further documentation, see the package vignette).
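Before uploading, the two .tsv files can be checked against these requirements with a short script. The Python sketch below assumes a layout in which the first column of each file holds the gene symbols or sample names and the first row is a header; this layout, the requirement that sample order matches, the integer check as a heuristic for raw counts, and the example values are assumptions, not the application's own validation.

```python
import csv
import io
from typing import List


def read_tsv(text: str) -> List[List[str]]:
    """Parse tab-separated text into rows, skipping empty lines."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def check_inputs(counts_tsv: str, metadata_tsv: str) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    counts = read_tsv(counts_tsv)
    metadata = read_tsv(metadata_tsv)
    problems = []

    count_samples = counts[0][1:]                        # header row without the gene column
    metadata_samples = [row[0] for row in metadata[1:]]  # first column of the metadata
    if count_samples != metadata_samples:
        problems.append("sample names differ between count matrix and metadata")

    for row in counts[1:]:
        gene, values = row[0], row[1:]
        if not all(v.isdigit() for v in values):
            problems.append(f"{gene}: values are not non-negative integer counts")
    return problems


# Illustrative inputs; the counts are made up.
counts_ok = "gene\tS1\tS2\nB4GALNT1\t120\t15\nST8SIA1\t30\t4\n"
counts_normalized = "gene\tS1\tS2\nB4GALNT1\t7.91\t4.02\n"
metadata_ok = "sample\tgroup\nS1\ttumor\nS2\tnormal\n"
metadata_swapped = "sample\tgroup\nS2\tnormal\nS1\ttumor\n"

no_problems = check_inputs(counts_ok, metadata_ok)
order_problem = check_inputs(counts_ok, metadata_swapped)
value_problem = check_inputs(counts_normalized, metadata_ok)
```

The first call returns an empty list, the second reports the mismatched sample order, and the third flags the non-integer values.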

After uploading your data files, click the “Compute Reaction Activity Scores” button to compute adjusted and raw RAS values. Then set the model settings, i.e. whether raw, ranged, or scaled RAS values should be used for model training. After you click the “Compute GD2 Score” button, the models and GD2 scores for all 4 RAS matrices are calculated. You can then download the GD2 score predictions and the RAS matrices. The “Global Plot Settings” allow you to choose the RAS adjustment method that will be visualized in the plots below. You can also select one of your experimental variables to group the dataset in the subsequent plots.

The first plot shows a heatmap of the RAS values. The heatmap is adjustable: you can activate or deactivate the dendrograms, choose whether the values should be scaled, choose whether the row (reaction) or column (sample) names should be displayed, and select the distance and clustering methods.

The next scatter plot shows the sum of GD2-promoting vs. GD2-diminishing reactions, which were used as input for the SVM. The third plot shows the predicted GD2 score for your dataset. The plot settings allow you to visualize the GD2 score as a box, scatter, or violin plot. When “scatter” is selected as the plot type, you can also choose a gene to plot the GD2 score against. The fourth plot shows the GD2 score vs. the stemness score, as stemness sometimes has an impact on the GD2 concentration.

The next section allows you to compare the RAS values of two groups. A log2-fold change graph of the glycosphingolipid metabolism is generated, as well as a detailed comparison of the reactions of the ganglioside metabolism pathway.
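A per-reaction log2 fold change between two groups could be computed as in the following Python sketch. The use of group means, the pseudocount of 1, and the reaction and sample names are assumptions made for illustration; the source does not specify how the application computes the fold change.

```python
import math
from statistics import mean
from typing import Dict, List


def log2_fold_changes(ras: Dict[str, Dict[str, float]],
                      group_a: List[str],
                      group_b: List[str],
                      pseudocount: float = 1.0) -> Dict[str, float]:
    """log2 of (mean RAS in group_b / mean RAS in group_a) for every reaction.

    The pseudocount keeps the ratio defined when a group mean is zero.
    """
    result = {}
    for reaction, per_sample in ras.items():
        mean_a = mean([per_sample[s] for s in group_a])
        mean_b = mean([per_sample[s] for s in group_b])
        result[reaction] = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return result


# Hypothetical RAS values: reaction -> sample -> RAS.
ras_values = {
    "R1": {"S1": 3.0, "S2": 3.0, "S3": 15.0, "S4": 15.0},
    "R2": {"S1": 7.0, "S2": 7.0, "S3": 7.0, "S4": 7.0},
}
lfc = log2_fold_changes(ras_values, group_a=["S1", "S2"], group_b=["S3", "S4"])
```

Here reaction R1 has a log2 fold change of 2.0 (a fourfold higher value in the second group after adding the pseudocount), and R2 has 0.0 because both groups are identical.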

The last section is similar to the “TCGA Cancer Types” tab and allows you to perform a differential gene expression analysis based on the predicted GD2-score stratification.

Upload Your Dataset and Compute the GD2 Score

1. Data Input

2. Model Settings

3. Global Plot Settings

4. Download Data

Reaction Activity Scores

GD2 Promoting & Diminishing Reaction Activity

GD2 Score

GD2 Score vs. Stemness Score (mRNAsi)

Compare Two Groups

Settings for Comparison

Ganglioside Metabolism

Group Comparison

Perform Differential Expression Analysis

Settings for Differential Expression Analysis

DEA Result

DEA Genes

Selected Gene

Gene infobox

Diagnostic plots

p-Value Histogram

Histogram of the Log2 Fold-Changes

Summary plots

MA plot

Volcano plot

Acknowledgment

The inspiration and parts of the code for the Differential Gene Expression Analysis are based on the R package ideal by Federico Marini.

Explore the GD2 Score of large RNA-Seq datasets

Global Settings

Plot Settings

TCGA Tumor

Acknowledgements

The TCGA-GTEx-TARGET RNA-seq data (RSEM expected counts) and phenotype data for computing and visualizing the GD2 Score were downloaded from the Xena data portal.

The data originates from The Cancer Genome Atlas (TCGA) project.

Citation of UCSC Toil RNAseq

Vivian, J., Rao, A., Nothaft, F. et al. Toil enables reproducible, open source, big biomedical data analyses. Nat Biotechnol 35, 314–316 (2017). https://doi.org/10.1038/nbt.3772

TCGA Normal

Acknowledgements

The TCGA-GTEx-TARGET RNA-seq data (RSEM expected counts) and phenotype data for computing and visualizing the GD2 Score were downloaded from the Xena data portal.

The data originates from The Cancer Genome Atlas (TCGA) project.

GTEx

Acknowledgements

The TCGA-GTEx-TARGET RNA-seq data (RSEM expected counts) and phenotype data for computing and visualizing the GD2 Score were downloaded from the Xena data portal.

The data originates from the Genotype-Tissue Expression (GTEx) project.

TARGET

Acknowledgements

The TCGA-GTEx-TARGET RNA-seq data (RSEM expected counts) and phenotype data for computing and visualizing the GD2 Score were downloaded from the Xena data portal.

The data originates from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project.

Citation of TARGET:

For further information on the single TARGET subprojects, please visit:

National Cancer Institute (NCI) TARGET: Therapeutically Applicable Research to Generate Effective Treatments dbGaP Study Accession: phs000218.v24.p8

St. Jude Cloud

Acknowledgements

The GD2 Score was computed and visualized using the St. Jude Cloud RNA-seq data and phenotype data obtained from the St. Jude Cloud Genomics Platform. We acknowledge St. Jude Cloud as a collaborative data sharing partner in support of this research.

Citation of St. Jude Cloud

Clay McLeod, et al. St. Jude Cloud: A Pediatric Cancer Genomic Data-Sharing Ecosystem. Cancer Discov January 8 2021 DOI: 10.1158/2159-8290.CD-20-1230

Pediatric Brain Tumor Atlas: CBTTC

Acknowledgements

The Children’s Brain Tumor Tissue Consortium (CBTTC) RNA-seq data and phenotype data for computing and visualizing the GD2 Score can be accessed from the Kids First Data Resource Portal. The RNA-seq data (RSEM expected counts) and phenotype data for computing and visualizing the GD2 Score were downloaded from the Xena data portal.

Citation CBTTC

Ijaz, Heba, et al. "Pediatric high-grade glioma resources from the Children’s Brain Tumor Tissue Consortium." Neuro-oncology 22.1 (2020): 163-165.

GD2 Score of TCGA projects

1. Select TCGA Project

2. Global Settings

GD2 Score of TCGA Project:
