Release date: 2019-12-14
Now you can quickly discover expected and unexpected commonalities among sets of analyses of interest in Analysis Match using a new capability that detects statistically significant associations in their metadata. For example: Are the analyses that match yours often derived from a particular tissue type, disease state, or treatment? Do they tend to derive from a particular mouse strain, or from cells with specific cell surface markers? This approach can help you identify similarities among matching analyses that may previously have been hidden.
IPA scans more than 90 metadata fields across the set of repository-based analyses that you select in Analysis Match and calculates whether any metadata terms are enriched among them. Figure 1A shows an Analysis Match result filtered for analyses that strongly match (or anti-match) an analysis of gemfibrozil-treated rats. Gemfibrozil is a classical PPAR agonist. Selecting the matching set (those in the red dotted box in Figure 1A) and then clicking the Evaluate Metadata button generates p-values calculated with a right-tailed Fisher’s Exact Test. The results are displayed in a table like the one shown in Figure 1B. In Figure 1B, the most significant term among the selected analyses is ‘PPAR agonists’ in the case.subjecttreatment field, with a p-value of 6.98E-08. Other overrepresented terms include ‘white adipose cell’ and ‘preadipocyte’ in the case.celltype field.
Note that the case.subjecttreatment and case.celltype fields are not shown in the Analysis Match table by default. This illustrates how the new feature sifts through and surfaces metadata that may initially be hidden because of space constraints in the user interface (UI).
Figure 1: New feature in Analysis Match to discover commonalities among analyses of interest via shared metadata.
Figure 1A shows Analysis Match results for the transcriptomics analysis of the liver of rats that were treated with the PPAR-alpha agonist gemfibrozil (RNA-seq data from PMID 25150839). The table has been filtered to retain only the strongest matching (average matching percentage > 43) or anti-matching (average matching percentage < -43) analyses. The matching analyses enclosed in the red dotted box were selected and the ‘Evaluate Metadata’ button was clicked. Figure 1B shows the results of the enrichment calculation, where the term ‘PPAR agonists’ was found to be highly enriched (p-value = 6.98E-08) among the matching analyses in the ‘case.subjecttreatment’ field. This level of significance arose because, of the 18 analyses that were selected, three shared the ‘PPAR agonists’ term, whereas only nine analyses in the entire Analysis Match repository of over 57,000 analyses carry that term. Other examples of overrepresented terms are ‘white adipose cell’ and ‘preadipocyte’ in the ‘case.celltype’ field.
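To illustrate how a right-tailed Fisher’s Exact Test turns counts like these into a p-value, here is a minimal Python sketch. It is not IPA's implementation. The function name `right_tailed_fisher` is hypothetical, and the background size of 57,000 is an assumption taken from the "over 57,000" figure above. The notes do not state which background IPA actually uses (for example, whether it is restricted to analyses with the field populated), so the printed value should not be expected to reproduce the reported 6.98E-08.

```python
from math import comb


def right_tailed_fisher(k, n, K, N):
    """Right-tailed Fisher's Exact Test as a hypergeometric tail.

    k: selected analyses that carry the term
    n: number of selected analyses
    K: analyses in the background that carry the term
    N: total analyses in the background (assumed, see the text above)

    Returns P(X >= k), the chance of seeing at least k term-carrying
    analyses when n analyses are drawn at random from the background.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    # Python divides arbitrarily large integers exactly before rounding.
    return tail / total


if __name__ == "__main__":
    # Counts from the Figure 1 example: 3 of 18 selected analyses carry
    # 'PPAR agonists', and 9 analyses carry it repository-wide.
    # N = 57000 is an assumed background size, not a documented value.
    p = right_tailed_fisher(k=3, n=18, K=9, N=57000)
    print(f"right-tailed p-value with assumed background: {p:.3g}")
```

The test is right-tailed because only overrepresentation is of interest: a term counts as enriched when it appears among the selected analyses more often than random selection from the background would predict.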
The analyses annotated with the ‘PPAR agonists’ term involved treatment with tesaglitazar, fenofibrate, or rosiglitazone, all of which are well-known PPAR agonists. The metadata results table can be filtered to focus on certain fields or terms of interest. In Figure 2, the metadata evaluation results are narrowed to show only fields involving the ‘case’ samples (rather than the controls).
Figure 2: Filtering the metadata results table. You can filter the results data to focus on certain types of fields or values, such as fields involving the cases rather than the controls. Note that the computation only considers the metadata in the repository-based analyses. It does not evaluate any metadata that you may have entered for any of your own analyses.
The Build > Grow > Diseases & Functions feature is a powerful way to add biological context to a pathway or network. However, its calculation of statistical over-representation is computationally expensive and often takes 30–60 seconds. In the past, after the first ‘Grow to Diseases & Functions’ operation on a network, IPA repeated the calculation immediately any time nodes were added to or removed from the network, forcing you to wait for updated statistical results with each change. Now you control when the calculation runs by using the new Recalculate button (Figure 3). You can make numerous changes and, when ready, recalculate to determine which diseases and functions are statistically relevant.
Figure 3: Recalculate over-representation of Diseases & Functions on demand. Now you can make multiple additions or subtractions to the network or pathway before performing the computationally expensive overlap calculation.
IPA now supports the upload of .csv dataset files. Some upstream software, such as the 10x Genomics Loupe Cell Browser, exports comma-separated data files, and these can now be imported directly.
Three new Canonical Signaling Pathways
Addition of Activity Patterns to six existing Canonical Signaling Pathways
Now you can enjoy nearly 175,000 new findings (with a total of over 7 million findings), as well as ~350 newly mappable chemicals, including: