Frequently Asked Questions (FAQs)

  1. When should I use OmicsAnalyst?
  2. What are the main features of OmicsAnalyst?
  3. What are the data formats accepted by OmicsAnalyst?
  4. Which browsers are supported by OmicsAnalyst?
  5. What if WebGL is supported but disabled on my browser?
  6. Which dimensionality reduction methods are implemented in OmicsAnalyst?
  7. What are some common use cases of OmicsAnalyst?
  8. How can I test whether two omics data are associated and how strong their association is?
  9. When should I use PCA or PCoA?
  10. When should I use PCoA or NMDS?
  11. How to interpret the NMDS stress score?
  12. What are the advantages of UMAP over t-SNE?
  13. How do I interpret the UMAP plot?
  14. Why is my UMAP plot different even though I chose the same parameters?
  15. How can I get significant features underlying t-SNE/UMAP clusters?
  16. What is Procrustes Analysis?
  17. What is PROTEST?
  18. How do I interpret the Procrustes plot?
  19. How do I interpret the MCIA plot?
  20. How many data points can be visualized?

  1. When should I use OmicsAnalyst?

    A key underlying assumption of OmicsAnalyst is that discrete clusters are present in your omics data. OmicsAnalyst is designed to detect, visualize and analyze these clusters. A direct consequence of this approach is that OmicsAnalyst will partition your data into clusters regardless of whether biologically meaningful groups are present. This approach may not suit all omics data, but whether such groups exist is rarely known a priori. We therefore strongly recommend that users evaluate their omics data in an unbiased, data-driven manner to complement mainstream differential analysis and supervised methods.

    In addition, users can visualize and analyze patterns/groups with respect to the metadata they provide. This function is independent of the clusters detected. For instance, users can directly visualize and compare any pre-defined groups within our 3D visualization system.

  2. What are the main features of OmicsAnalyst?

    OmicsAnalyst was designed to provide an intuitive means for clinicians and bench scientists to work directly with big omics data. It achieves this by integrating dimensionality reduction, density-based clustering, and 3D visual analytics in a user-friendly web-based platform to allow users to interact and discover patterns within their large datasets from their personal computer.

  3. What are the data formats accepted by OmicsAnalyst?

      Omics Abundance Tables

      OmicsAnalyst accepts one or multiple omics abundance tables generated from high-throughput platforms, such as metabolomics (targeted and untargeted), transcriptomics (bulk and single-cell), and sequencing-based microbiome data (16S rRNA and shotgun metagenomics). Gene and metabolite annotation is supported across 25 species. Features must be in rows and samples in columns (example below). Files must be in .txt, .csv, or .zip format.

      Example Abundance File

      Feature	        Patient1 	Patient2	Patient3
      89.10761	26.0996200732	23.1460903921	25.7902022326
      108.04453	23.9072161208	30.2066140474	15.2227715523
      112.05049	25.2105089519	26.4889060662	22.4397627169
      114.06625	25.7236861033	24.5242553536	18.8258771889
      116.07045	25.4935276049	26.6822051326	19.4841708816
      126.06567	28.3976291914	31.7650213292	27.6698396171
      127.12259	25.7866080331	21.3600835035	21.4387426779
      140.08137	21.2081862174	21.2651236842	19.6536824974
      151.09598	24.756381796	28.6858438887	25.2851696948
      153.06519	26.4146136809	25.7091294726	24.3941898518
      153.04049	21.1820994064	20.828480842	27.6147928377
      ......
                                  
      Metadata Table

      OmicsAnalyst accepts a single metadata file containing metadata for all samples. In this table, sample names should be in the first column, followed by the different metadata in subsequent columns (example below). Files must be in .txt or .csv format.

      Example Metadata

      Samples 	TissueType	Age
      Sample1	        Liver		71
      Sample2	        Skin		68
      Sample3	        Liver		90
      Sample4	        Skin		61
      Sample5	        Liver		74
      Sample6	        Skin		73
                                  

      Example data from a multi-omics (shotgun metagenomics + untargeted metabolomics) study of Ulcerative Colitis (UC)

      Notes about formatting your data files:

      • Sample and feature names must be unique and consist only of common English letters, underscores and numbers. Special characters such as accented Latin or Greek letters are not supported.
      • Sample and feature names must be consistent across all files (i.e. omics abundance tables and metadata file).
      • Data values (read counts or proportions) should contain only numeric, positive values. Empty cells and cells with NA values will be replaced with zero.
      • Metadata is not permitted in the abundance tables.
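
      The formatting notes above can be checked before upload. The following is a minimal sketch, not part of OmicsAnalyst: the helper name check_abundance_table is hypothetical, a tab-delimited file is assumed, and zeros are assumed to be acceptable because empty and NA cells are replaced with zero. The character check is applied to sample names only, since the example feature names above contain a decimal point.

```python
import csv
import io
import re

# Sample names limited to English letters, digits and underscores (per the notes above).
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def check_abundance_table(text, delimiter="\t"):
    """Return (samples, rows, problems) for a features-in-rows table.

    Hypothetical helper: empty or NA cells become 0.0, mirroring the
    replacement described in the notes; other issues are only reported.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    samples = header[1:]
    problems = []

    if len(set(samples)) != len(samples):
        problems.append("duplicate sample names")
    for name in samples:
        if not VALID_NAME.match(name):
            problems.append(f"unsupported sample name: {name!r}")

    rows = {}
    for record in reader:
        if not record:
            continue  # skip blank lines
        feature, cells = record[0], record[1:]
        if feature in rows:
            problems.append(f"duplicate feature name: {feature!r}")
        values = []
        for cell in cells:
            cell = cell.strip()
            if cell == "" or cell.upper() == "NA":
                values.append(0.0)  # missing value replaced with zero
                continue
            try:
                value = float(cell)
            except ValueError:
                problems.append(f"non-numeric value {cell!r} in {feature!r}")
                continue
            if value < 0:
                problems.append(f"negative value {value} in {feature!r}")
            values.append(value)
        rows[feature] = values
    return samples, rows, problems


table = "Feature\tPatient1\tPatient2\n89.10761\t26.09\tNA\n108.04453\t23.9\t-1\n"
samples, rows, problems = check_abundance_table(table)
print(samples, rows, problems)
```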

  4. Which browsers are supported by OmicsAnalyst?

    The 3D visualization system was developed based on the Web Graphics Library or WebGL technology. WebGL is the standard 3D graphics API for the web. It allows developers to harness the full power of the computer’s 3D rendering hardware from within the browser using JavaScript. Before WebGL, developers had to rely on plugins or native applications and ask their users to download and install custom software in order to deliver a hardware-accelerated 3D experience.

    WebGL is supported by most major modern browsers that support HTML5. We have tested OmicsAnalyst in several major browsers (see below). Our empirical testing has shown that Google Chrome usually gives the best performance on the same computer:

    Name             Version  Note
    Google Chrome    50+      ★★★★★
    Mozilla Firefox  47+      ★★★★☆
    Apple Safari     10.1+    ★★★☆☆
    Microsoft Edge   12+      ★★★☆☆

  5. What if WebGL is supported but disabled on my browser?

    Chrome

    First, enable hardware acceleration:

    • Go to chrome://settings
    • Click the + Show advanced settings button
    • In the System section, ensure the Use hardware acceleration when available checkbox is checked (you'll need to relaunch Chrome for any changes to take effect)

    Then enable WebGL:

    • Type chrome://flags in the browser and press Enter
    • Ensure that Disable WebGL is not activated (you will need to relaunch Chrome for any changes to take effect)
    • Here you will have to change Default to Enabled in the drop down.

    • If the steps above do not work, enable Override software rendering list

    For more information, see: Chrome Help: WebGL and 3D graphics.

    Firefox

    First, enable WebGL:

    • Type about:config in the browser address bar and press enter
    • Search for webgl.disabled
    • Ensure that its value is false (any changes take effect immediately without relaunching Firefox)

    Then inspect the status of WebGL:

    • Go to about:support
    • Inspect the WebGL Renderer row in the Graphics table:

    If your graphics card or drivers are blacklisted, you can override the blacklist. Warning: this is not recommended! To override the blacklist:

    • Go to about:config
    • Search for webgl.force-enabled
    • Set it to true

    Safari

    • Go to Safari's Preferences
    • Select the Security tab
    • Make sure to check the Allow WebGL checkbox

    Source: https://superuser.com/questions/836832/how-can-i-enable-webgl-in-my-browser

  6. Which dimensionality reduction methods are implemented in OmicsAnalyst?

    Algorithm   Full Name
    PCA         Principal Component Analysis
    PLS         Partial Least Squares
    PCoA        Principal Coordinate Analysis
    NMDS        Non-metric Multidimensional Scaling
    UMAP        Uniform Manifold Approximation and Projection
    O2PLS       Two-way Orthogonal PLS
    sCCA        Sparse Canonical Correlation Analysis
    Procrustes  Procrustes Analysis

  7. What are some common use cases of OmicsAnalyst?

    OmicsAnalyst is very flexible and can be used to answer many different questions in omics and multi-omics data analysis. Below are some common questions that OmicsAnalyst can address.

    1. Whether two omics datasets are correlated, and how significant the correlation is

  8. How can I test whether two omics data are associated and how strong their association is?

    OmicsAnalyst tests for robust maximum association between datasets using the high-performance R package ccaPP. The package estimates the maximum association using one of several correlation measures, including Pearson, Spearman or Kendall. The significance of the maximum association estimate can be assessed via permutation tests.
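
    The idea of assessing significance by permutation can be sketched in a few lines. This is not the ccaPP algorithm, which searches for the projections that maximize association. Here each omics dataset is assumed to be already summarized as one score per sample, Pearson correlation is used, and the function names and example values are made up for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Return (|r|, p) where p is estimated by shuffling the sample order of y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # The observed ordering counts as one permutation, hence the +1 terms.
    return observed, (hits + 1) / (n_perm + 1)


# Illustrative per-sample scores for two omics layers (not real data).
score_omics1 = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1]
score_omics2 = [0.9, 2.3, 3.1, 3.9, 5.3, 6.1, 6.8, 8.4]
r, p = permutation_pvalue(score_omics1, score_omics2)
print(f"r = {r:.3f}, p = {p:.3f}")
```

    A small p-value means that few random pairings of samples produce an association as strong as the observed one.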

  9. When should I use PCA or PCoA?

    PCA searches for patterns in the variables, while PCoA searches for similarities between samples. Unlike PCA and PLS, which use the raw data, PCoA takes a (dis)similarity matrix as input and assigns each item a location in low-dimensional space. Distance-based ordinations such as PCoA are recommended over PCA when there are many missing values, as PCA would result in all samples clustering near the origin.
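
    To make the input to PCoA concrete, the sketch below builds a sample-by-sample dissimilarity matrix. Bray-Curtis is used here as an assumption, since this answer does not name the distance measure; the function names and abundance values are illustrative only.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative abundance profiles."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical here
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def dissimilarity_matrix(samples):
    """Symmetric sample-by-sample matrix with zeros on the diagonal."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = bray_curtis(samples[i], samples[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Each inner list is one sample, i.e. one column of the abundance table.
samples = [
    [10.0, 0.0, 5.0],
    [10.0, 0.0, 5.0],
    [0.0, 8.0, 0.0],
]
matrix = dissimilarity_matrix(samples)
for row in matrix:
    print([round(v, 3) for v in row])
```

    Identical samples get a dissimilarity of 0 and samples with no shared features get 1; an ordination method such as PCoA then places samples so that these values are approximated by distances in the plot.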

  10. When should I use PCoA or NMDS?

    Both PCoA and NMDS take a distance matrix as input. PCoA maximizes the linear correlation between the original distances and the distances in the ordination, whereas NMDS maximizes the rank-order correlation between them. Users should use PCoA if distances between samples are so close that a linear transformation would suffice. NMDS is suggested if users wish to highlight the gradient structure within their data.
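
    The difference between linear and rank-order correlation can be seen with a small example. In this sketch, the made-up list original stands for pairwise distances in the input matrix and embedded for the corresponding distances in a plot; the relationship is monotonic but not linear.

```python
def pearson(x, y):
    """Linear (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks from 1 to n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Rank-order (Spearman) correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
embedded = [d ** 3 for d in original]  # same ordering, distorted spacing

linear = pearson(original, embedded)
rank_order = spearman(original, embedded)
print(f"linear = {linear:.3f}, rank-order = {rank_order:.3f}")
```

    The rank-order correlation is perfect because the ordering of distances is preserved, while the linear correlation is lower because the spacing is distorted. An NMDS-style criterion would accept such an embedding; a PCoA-style criterion would penalize it.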

  11. How to interpret the NMDS stress score?

    Stress scores less than 5 are excellent (rare); 5-10 is good; 10-20 is fair and usable, but could be misleading; scores greater than 20 should be interpreted with caution.
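
    These thresholds can be written as a small lookup. The sketch assumes stress is expressed on the 0-100 scale used above (a value reported as a fraction between 0 and 1 would need to be multiplied by 100 first), and the assignment of the exact boundary values 5, 10 and 20 is an assumption, since the ranges above overlap at those points. The function name is hypothetical.

```python
def interpret_stress(stress_percent):
    """Map an NMDS stress score (0-100 scale) to the categories above."""
    if stress_percent < 0:
        raise ValueError("stress cannot be negative")
    if stress_percent < 5:
        return "excellent"
    if stress_percent <= 10:
        return "good"
    if stress_percent <= 20:
        return "fair: usable, but could be misleading"
    return "interpret with caution"


for value in (3.2, 8.0, 15.5, 27.0):
    print(value, "->", interpret_stress(value))
```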

  12. What are the advantages of UMAP over t-SNE?

    UMAP and t-SNE are both popular dimensionality reduction methods widely used in single-cell transcriptomics. However, t-SNE suffers from some limitations, namely slow computation and an inability to capture global structure. UMAP preserves more local and global data structure than t-SNE, with a shorter computation time. In practice, only within-cluster distances are meaningful in a t-SNE plot, whereas between-cluster relations may be more informative in a UMAP plot.

  13. How do I interpret the UMAP plot?

    Samples that cluster together are closely related. However, the sizes of clusters relative to each other are not meaningful. Distances between clusters are also unlikely to be meaningful: although the relative global positions of clusters tend to be preserved, the distances themselves should not be interpreted.

  14. Why is my UMAP plot different even though I chose the same parameters?

    The UMAP algorithm is stochastic, meaning that different runs with the same parameters may yield different results.

  15. How can I get significant features underlying t-SNE/UMAP clusters?

    Since the embedding is potentially a highly non-linear transformation, t-SNE/UMAP offers no direct measures of feature importance. To answer this question, users can directly compare these clusters using well-established differential analysis methods (such as univariate tests, limma, DESeq2, edgeR) to identify significant features underlying these clusters.

  16. What is Procrustes Analysis?

    Procrustes analysis is the analysis of shapes. It takes as input two ordination matrices with corresponding points, and transforms one ordination by rotating, reflecting, scaling, and translating it to minimize the distances between corresponding points in the other ordination (maximizing the fit between corresponding observations). In OmicsAnalyst, raw omics data are first transformed into ordinations with PCA, which are then superimposed to minimize the sum of squared deviations between corresponding points (samples).
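
    The superimposition step can be sketched for the two-dimensional case. This is an illustration under stated assumptions, not the OmicsAnalyst implementation: the two ordinations are assumed to be already computed, only two axes are used so that points can be handled as complex numbers (a rotation is then a multiplication by a complex number), and the function names and coordinates are made up.

```python
import math


def center_and_scale(points):
    """Translate points to their centroid and scale to unit sum of squares."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / norm for p in z]


def procrustes_2d(target, other):
    """Fit `other` onto `target`; return (sum of squares, fitted points)."""
    x = center_and_scale(target)
    y = center_and_scale(other)
    # Best rotation and scaling without and with a reflection of `other`.
    rot = sum(b.conjugate() * a for a, b in zip(x, y))
    ref = sum(b * a for a, b in zip(x, y))
    if abs(ref) > abs(rot):
        y, coef = [b.conjugate() for b in y], ref  # reflect, then rotate
    else:
        coef = rot
    fitted = [coef * b for b in y]
    m2 = 1.0 - abs(coef) ** 2  # residual sum of squares, between 0 and 1
    return m2, [(p.real, p.imag) for p in fitted]


target = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
# The same shape rotated by 90 degrees, doubled in size and shifted.
other = [(5.0, 5.0), (5.0, 9.0), (3.0, 9.0), (-1.0, 5.0)]
m2, fitted = procrustes_2d(target, other)
print(f"sum of squares = {m2:.6f}")

# A mirror image of the target should also fit perfectly.
mirrored = [(-x, y) for x, y in target]
m2_mirrored, _ = procrustes_2d(target, mirrored)
```

    Because the second configuration differs from the first only by rotation, scaling and translation, the residual sum of squares is essentially zero. Real ordinations from two omics layers would give a value between 0 and 1.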

  17. What is PROTEST?

    PROcrustean randomization TEST (PROTEST) is a permutation test of the significance of the goodness of fit (m²) between two datasets. The null hypothesis of PROTEST is that the two datasets do not exhibit greater concordance than expected by chance. Using the permutation approach, variables from one dataset are randomly ordered while keeping the covariance structure, and the fit between the datasets is recalculated. The original fit is then compared to the fit obtained from the randomized data. This is repeated many times to count how often the original fit was smaller than or equal to the fit obtained from the randomized data. For more information, refer to here. By default, the PROTEST implementation in OmicsAnalyst runs 999 permutations and outputs values for the "sum of squares", "correlation", and "p-value". For the sum of squares (m₁₂²), values vary from 0 to 1, with low values indicating greater concordance.
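
    The permutation logic can be sketched as follows. This is a simplified illustration, not the OmicsAnalyst code: it assumes two-dimensional ordinations, shuffles the sample order of the second configuration as the randomization step, and computes the p-value as (hits + 1) / (permutations + 1). The function names and coordinates are made up.

```python
import math
import random


def procrustes_correlation(a, b):
    """Symmetric Procrustes correlation (0 to 1) of two 2D point sets."""

    def prep(points):
        # Center on the centroid and scale to unit sum of squares.
        z = [complex(x, y) for x, y in points]
        c = sum(z) / len(z)
        z = [p - c for p in z]
        norm = math.sqrt(sum(abs(p) ** 2 for p in z))
        return [p / norm for p in z]

    x, y = prep(a), prep(b)
    rot = abs(sum(q.conjugate() * p for p, q in zip(x, y)))  # rotation only
    ref = abs(sum(q * p for p, q in zip(x, y)))  # reflection allowed
    return max(rot, ref)


def protest(a, b, n_perm=999, seed=0):
    """Permutation test of the Procrustes fit between two ordinations."""
    rng = random.Random(seed)
    observed = procrustes_correlation(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the sample correspondence
        if procrustes_correlation(a, shuffled) >= observed:
            hits += 1
    return {
        "sum of squares": 1.0 - observed ** 2,
        "correlation": observed,
        "p-value": (hits + 1) / (n_perm + 1),
    }


# Illustrative ordination of eight samples (not real data).
omics1 = [(0.0, 0.0), (1.0, 0.2), (2.1, 0.1), (3.0, 1.5),
          (1.5, 2.5), (0.2, 3.1), (-1.0, 1.8), (-1.4, 0.4)]
# Second ordination: the first one rotated by 90 degrees with small offsets.
omics2 = [(-y + 0.05 * ((i % 3) - 1), x + 0.05 * ((i % 2) - 0.5))
          for i, (x, y) in enumerate(omics1)]

result = protest(omics1, omics2)
print(result)
```

    With 999 permutations the smallest attainable p-value is 0.001, which is why the number of permutations sets the resolution of the test.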

  18. How do I interpret the Procrustes plot?

    The Procrustes plot provides a visual indication of the match between two ordinations. Spheres represent samples and belong to either omics 1 or omics 2, depending on the color of the line connected to the sphere. Each line connects the position of a sample in the second ordination to its position in the target ordination. Longer lines between two spheres indicate a poor match, while shorter lines indicate good agreement between datasets.

  19. How do I interpret the MCIA plot?

    The MCIA plot shows the projection of two omics datasets into the same dimensional space. Shapes represent samples, and identical samples are connected by a line to the center point, which represents the reference structure that maximizes the covariance derived from the MCIA synthetic analysis. The shorter the line, the better the correlation between samples obtained by different omics.

  20. How many data points can be visualized?

    The visualization is limited by the performance of the user's computer and its screen resolution. Too many data points will result in greater latency when manipulating the plot. Based on empirical tests and practical utility, we recommend keeping the total number of data points below 5000; sample sizes rarely exceed this number. For very large data, please make sure your computer is equipped with a high-performance graphics card.
