Frequently Asked Questions (FAQs)
- When should I use OmicsAnalyst?
- What are the main features of OmicsAnalyst?
- What are the data formats accepted by OmicsAnalyst?
- Which browsers are supported by OmicsAnalyst?
- What if WebGL is supported but disabled on my browser?
- Which dimensionality reduction methods are implemented in OmicsAnalyst?
- What are some common use cases of OmicsAnalyst?
- Can I test whether two omics datasets are associated and how strong their association is?
- Which features are available in 3D scatter plot viewer?
- Which features are available in dual-heatmap viewer?
- Which features are available in network viewer?
- What is Procrustes Analysis?
- How do I interpret the Procrustes plot?
- How do I interpret the MCIA plot?
- How many data points can be visualized?
- How do I interpret the "Contour" option used to highlight a group of nodes?
-
A key underlying assumption of OmicsAnalyst is that discrete clusters are present in your omics data.
OmicsAnalyst is designed to detect, visualize and analyze these clusters. A direct consequence of this approach is that
OmicsAnalyst will partition your data into clusters, regardless of whether biologically meaningful groups are present.
Although this approach may not be suitable for all omics data, whether such clusters exist is rarely known a priori.
Therefore, we strongly recommend that users evaluate their omics data in an unbiased, data-driven manner to
complement mainstream differential analysis and supervised methods.
In addition, users can visualize and analyze patterns and groups with respect to the metadata they provide.
This function is independent of the clusters detected. For instance, users can directly visualize and compare any
pre-defined groups within our 3D visualization system.
-
OmicsAnalyst was designed to provide an intuitive means for clinicians and bench scientists to work directly with big omics data.
It achieves this by integrating multivariate statistics, density-based clustering, and 3D visual analytics in a user-friendly web-based platform
that allows users to interact with and discover patterns within their large datasets from their personal computer. It offers three main visual analytics systems:
-
Interactive 3D scatter plot that displays feature space and sample space simultaneously.
-
Dual-heatmap viewer to visually compare expression patterns of two omics datasets.
-
2D/3D network viewer to visualize correlations and associations between features.
All of our visual analytics systems are coupled with extensive clustering analysis and flexible differential analysis.
-
Omics Abundance Tables
OmicsAnalyst accepts one or more omics abundance tables generated on high-throughput platforms, such as
metabolomics, transcriptomics, proteomics and miRNA data. Gene and metabolite annotation for human and mouse is supported. Features must be in rows and samples in columns (example below).
Files must be in .txt, .csv, or .zip format.
Example Abundance File
#NAME sample10 sample105 sample11 sample113
#CLASS:Condition Classical Classical Classical Classical
FSTL1 0.04085 1.09922 -0.45374 0.03402
MMP2 1.76569 -0.50303 0.41764 1.25827
BBOX1 0.9542 1.21379 0.95196 0.60273
GCSH 0.59383 0.66385 -0.00448 0.47665
EDN1 0.8455 -0.26195 -0.34848 0.08309
CXCR4 0.1691 0.04824 0.47361 -0.55183
SALL1 0.45267 0.97534 0.35652 1.32005
MMP7 -1.47079 -1.29109 -1.40518 -1.08673
C9orf45 -0.41859 -0.42783 -0.01162 -0.55135
RTN1 0.08844 -1.07261 0.67485 -0.1453
ZEB1 0.51088 -0.10636 -0.13515 0.04285
SEMA4D -0.46286 0.35512 -0.2117 -0.78839
PIR -0.5913 0.83694 0.23133 -0.93257
KIAA1199 -0.27161 0.13885 0.47737 0.81915
SORL1 -0.27511 0.4553 0.49623 -0.4683
......
Example data from a multi-omics (transcriptomics + miRNA) study of Breast Cancer from TCGA
Notes about formatting your data files:
-
Sample and feature names must be unique and consist only of common English letters, numbers and underscores. Latin/Greek letters are not supported.
-
Sample and feature names must be consistent across all files (i.e. omics abundance tables and metadata file).
-
Data values (read counts or proportions) should contain only numeric and positive values. Empty cells or cells with NA values will be replaced with zero.
-
Metadata is not permitted in the abundance tables.
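The formatting notes above can be checked before upload with a small script. The following is only a sketch and not part of OmicsAnalyst: the function name `check_abundance_table`, the tab delimiter and the exact name pattern are assumptions made for illustration. Rows beginning with `#` other than `#NAME` (such as the `#CLASS` row in the example file) are skipped, and only numeric parsing is checked for data values.

```python
import re

# Assumption: "common English letters, underscores and numbers" means this pattern.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9_]+$")


def check_abundance_table(text, delimiter="\t"):
    """Return a list of formatting problems found in an abundance table."""
    rows = [line.split(delimiter) for line in text.splitlines() if line.strip()]
    if not rows or rows[0][0] != "#NAME":
        return ["first row must start with #NAME"]

    problems = []
    samples = rows[0][1:]
    if len(set(samples)) != len(samples):
        problems.append("duplicate sample names")
    for name in samples:
        if not NAME_PATTERN.match(name):
            problems.append("unsupported characters in sample name: " + name)

    seen_features = set()
    for row in rows[1:]:
        feature, values = row[0], row[1:]
        if feature.startswith("#"):
            continue  # header-style row such as #CLASS in the example file
        if feature in seen_features:
            problems.append("duplicate feature name: " + feature)
        seen_features.add(feature)
        if not NAME_PATTERN.match(feature):
            problems.append("unsupported characters in feature name: " + feature)
        if len(values) != len(samples):
            problems.append("wrong number of values for feature: " + feature)
        for value in values:
            if value.strip() in ("", "NA"):
                continue  # empty and NA cells are replaced with zero on upload
            try:
                float(value)
            except ValueError:
                problems.append("non-numeric value for " + feature + ": " + value)
    return problems
```

An empty list means no problems were found under these assumptions; for .csv files, pass `delimiter=","`.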
-
The 3D visualization system was developed based on the Web Graphics Library or WebGL technology.
WebGL is the standard 3D graphics API for the web. It allows developers to harness the full power of the computer’s 3D rendering hardware
from within the browser using JavaScript. Before WebGL, developers had to rely on plugins or native applications and ask their users to
download and install custom software in order to deliver a hardware-accelerated 3D experience.
WebGL is supported by most major modern browsers that support HTML5. We have tested OmicsAnalyst in several major browsers (see below).
Our empirical tests show that Google Chrome usually gives the best performance on the same computer:
Name | Version | Note
Google Chrome | 50+ | ★★★★★
Mozilla Firefox | 47+ | ★★★★☆
Apple Safari | 10.1+ | ★★★☆☆
Microsoft Edge | 12+ | ★★★☆☆
-
Chrome
First, enable hardware acceleration:
- Go to chrome://settings
- Click the + Show advanced settings button
- In the System section, ensure the Use hardware acceleration when available checkbox is checked (you'll need to relaunch Chrome for any changes to take effect)
Then enable WebGL:
For more information, see:
Chrome Help: WebGL and 3D graphics.
Firefox
First, enable WebGL:
- Type about:config in the browser address bar and press enter
- Search for webgl.disabled
- Ensure that its value is false (any changes take effect immediately without relaunching Firefox)
Then inspect the status of WebGL:
- Go to about:support
- Inspect the WebGL Renderer row in the Graphics table:
If your graphics card/drivers are blacklisted, you can override the blacklist.
Warning: this is not recommended! (see blacklists note below). To override the blacklist:
- Go to about:config
- Search for webgl.force-enabled
- Set it to true
Safari
- Go to Safari's Preferences
- Select the Security tab
- Make sure to check the Allow WebGL checkbox
Source: https://superuser.com/questions/836832/how-can-i-enable-webgl-in-my-browser
-
Algorithm | Full Name | Note
MCIA | Multiple Co-inertia Analysis |
mbPCA | Multi-block Principal Component Analysis |
PLS | Partial Least Squares |
Procrustes | Procrustes analysis |
DIABLO | Data Integration Analysis for Biomarker discovery using Latent variable approaches for ‘Omics studies |
rCCA | regularized Canonical Correlation Analysis |
-
OmicsAnalyst is very flexible and can be used to answer many different questions in omics and multi-omics data analysis.
Below are some common questions that OmicsAnalyst can address.
-
Explore inherent trends and patterns in multi-omics data and whether samples cluster according to biological condition
-Heatmap viewer, mbPCA, Procrustes
-
…and identify correlated features between two datasets.
-Correlation network using rCCA, PLS, DIABLO, univariate and partial correlation.
-
…and identify potential biomarker features
-DIABLO, MCIA, differential analysis
-
Identify clusters from dimensionally reduced sample space and/or expression heatmap
-K-Means, Peakcluster, Hierarchical
-
OmicsAnalyst offers Robust Maximum Association Between Data Sets using the high-performance R package ccaPP.
The package estimates the maximum association between two datasets using one of several measures, including Pearson, Spearman or Kendall correlation.
The significance of maximum association estimates can be assessed via permutation tests.
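To illustrate how a permutation test assesses significance, here is a minimal sketch in Python. It is deliberately simplified and is not the ccaPP algorithm: it tests the Pearson correlation between a single pair of variables rather than the maximum association between two multivariate datasets. The function names, the number of permutations and the seed are assumptions chosen for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Return (|r|, p-value), where p is estimated by shuffling y.

    Shuffling y breaks any sample-wise link to x, so the shuffled
    correlations approximate the null distribution of no association.
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point round-off.
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A small p-value means that few shuffled datasets reach an association as strong as the observed one. Ranking both variables before calling `pearson` would give a Spearman-based version of the same test.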
-
Our scatter plot viewer provides more than mere data visualization; it also offers analytics features that allow users to dissect their datasets.
-
Clustering analysis on sample space to identify inherent data structure and patterns of samples: K-means, Peak cluster, Mean shift.
-
Flexible comparative analysis to identify differentially abundant features between groups or clusters of interest.
-
Targeted analysis to dissect meta-data group or cluster of interest.
-
Enrichment analysis to interpret the DE features in the context of current knowledge.
-
Our dual-heatmap viewer facilitates the identification of coordinated expression changes.
-
Clustering features and samples to identify inherent data structure and patterns of feature space using hierarchical clustering.
-
Joint visualization of two heatmaps simultaneously to identify coordinated or contrasting expression patterns within each dataset
-
Enrichment analysis to functionally interpret features of interest in the context of current knowledge.
-
Our interactive network viewer displays feature relationships in the form of 2D or 3D ball-and-stick graphs.
-
Various graph layout options to highlight different features from the network (i.e. visualize shortest paths, network modules, omics type)
-
Topology analysis to reveal network properties such as node degree and betweenness, community structure and shortest paths.
-
Enrichment analysis to interpret the DE features in the context of current knowledge.
-
Procrustes analysis is the analysis of shapes. It takes as input two ordination matrices with corresponding points,
and transforms one ordination by rotating, reflecting, scaling, and translating it to minimize the distances
between its points and the corresponding points in the other ordination (maximizing fit between corresponding observations).
In OmicsAnalyst, raw omics data are first transformed into ordinations with
PCA, which are then aligned to minimize the sum of squared deviations between corresponding points (samples).
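The alignment step can be sketched for the two-dimensional case, where the optimal rotation has a closed form. This is an illustrative sketch only, not the OmicsAnalyst implementation: it assumes the two ordinations are already available as lists of (x, y) coordinates in matching sample order, and the function names are made up for the example.

```python
import math


def _centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _rotate_and_scale(target, source):
    """Best rotation and scaling of centered `source` onto centered `target`."""
    a = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    b = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    theta = math.atan2(b, a)  # angle that maximizes the overlap with target
    scale = math.hypot(a, b) / sum(sx * sx + sy * sy for sx, sy in source)
    c, s = math.cos(theta), math.sin(theta)
    fitted = [(scale * (c * sx - s * sy), scale * (s * sx + c * sy))
              for sx, sy in source]
    sse = sum((fx - tx) ** 2 + (fy - ty) ** 2
              for (fx, fy), (tx, ty) in zip(fitted, target))
    return fitted, sse


def procrustes_2d(target, source):
    """Fit `source` to `target`; return (fitted points, sum of squared deviations)."""
    t_mean, s_mean = _centroid(target), _centroid(source)
    # Translation: move both configurations to the origin.
    t = [(x - t_mean[0], y - t_mean[1]) for x, y in target]
    s = [(x - s_mean[0], y - s_mean[1]) for x, y in source]
    # Reflection: also try the mirrored source and keep the better fit.
    mirrored = [(x, -y) for x, y in s]
    fitted, sse = min(_rotate_and_scale(t, s), _rotate_and_scale(t, mirrored),
                      key=lambda result: result[1])
    # Express the fitted points in the target's coordinate system.
    fitted = [(x + t_mean[0], y + t_mean[1]) for x, y in fitted]
    return fitted, sse
```

The returned sum of squared deviations is the quantity being minimized: a value near zero means the two ordinations have essentially the same shape.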
-
The Procrustes plot provides a visual indication of the match between two ordinations. Spheres represent samples and belong
to either omics 1 or omics 2 depending on the color of the line connected to the sphere.
The line between two spheres connects the position of a sample in the second ordination
to its position in the target ordination. Longer lines between the two spheres indicate a poor match,
while shorter lines indicate good agreement between datasets.
-
The MCIA plot shows the projection of two omics datasets into the same dimensional space. Shapes represent samples, and
identical samples are connected by a line to the center point, which represents the reference structure that
maximizes the covariance derived from the MCIA synthetic analysis.
The shorter the line, the better the agreement for that sample between the different omics datasets.
-
The visualization is limited by the performance of the user's computer and its screen resolution.
Too many data points will result in greater latency when manipulating the plot.
Based on empirical tests and practical utility, we recommend keeping the total number of data points
below 5000; sample sizes rarely exceed this number.
For very large datasets, please make sure your computer is equipped with a high-performance graphics card.
-
OmicsAnalyst uses kernel density estimation (KDE) from the "ks" R package to estimate a probability density function for the selected group.
The resulting probability cloud represents the density contour containing 75% of all data points from the selected group.
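The idea of a contour containing 75% of the points can be sketched as follows. This is a simplified illustration and not the "ks" implementation: it assumes a Gaussian kernel with a single fixed bandwidth supplied by the caller, and it takes the contour level as the density quantile evaluated at the data points themselves. All function names are hypothetical.

```python
import math


def kde_density(points, query, bandwidth):
    """Gaussian kernel density estimate at `query` (one shared bandwidth)."""
    dim = len(query)
    norm = len(points) * (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, p))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / norm


def contour_level(points, bandwidth, coverage=0.75):
    """Density threshold such that about `coverage` of the points lie at or above it."""
    densities = sorted(kde_density(points, p, bandwidth) for p in points)
    n_outside = int(math.floor((1.0 - coverage) * len(points)))
    return densities[n_outside]


def points_inside(points, bandwidth, coverage=0.75):
    """Points whose estimated density is at or above the contour level."""
    level = contour_level(points, bandwidth, coverage)
    return [p for p in points if kde_density(points, p, bandwidth) >= level]
```

Points in sparse regions fall below the threshold and end up outside the cloud, which is why the contour highlights the dense core of a group rather than its full extent.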