CLC Single Cell Analysis Module 23.0
Released on January 17, 2023
New features and improvements
- Import Immune Reference Segments – A new importer for TCR and BCR reference data. Intended for importing data that will be used for Single Cell V(D)J-Seq Analysis.
- Single Cell TCR-Seq Analysis has been renamed to Single Cell V(D)J-Seq Analysis.
- Immune Repertoire tools support D and C segments, and scBCR-Seq data. This change affects the following:
- Additional improvements to Filter Cell Clonotypes:
- New filters:
- “Segment types to retain”, for specifying the segment types required for a clonotype to be retained.
- “Productive status to retain”, for filtering based on the status of the CDR3 sequence: “Productive”, “Out of frame”, and/or “Premature stop codon”. This replaces the “Only productive” filter.
- “Combined chains to retain”, for specifying the chains required for cells to be retained. E.g. TRA + TRB for TCR clonotypes or IGH + IGH + IGK + IGL for BCR clonotypes. This replaces the “Retain only paired clonotypes” filter.
- “Multiple clonotypes” is now the final filtering step. The new placement can change results relative to previous versions.
- Cell Clonotypes data elements have new and updated views:
- New alignment view, containing alignments between the assembled contigs and the reference V, D, J and C segments.
- New Sankey plot view, summarizing the composition and frequencies of clonotypes.
- Table views include a “Cell-level clonotype #” column.
- The Immune Repertoire Analysis from Reads (10xVDJ) and Immune Repertoire and Expression Analysis from Reads (10xVDJ) workflows have been renamed to Immune Repertoire Analysis from Reads (10xV(D)J) and Immune Repertoire and Expression Analysis from Reads (10xV(D)J), respectively.
- Single Cell hg38 (Ensembl) and Single Cell Mouse (Ensembl) reference data sets now contain D and C reference segments. They no longer contain the trim adapter list for trimming the C segments.
CLC Single Cell Analysis Module 22.1.1
Released on August 16, 2022
CLC Single Cell Analysis Module 22.1
Released on April 4, 2022
- A preview of how the cell barcode and sample will be interpreted has been added to all single cell importers.
- MEX importers have been updated to:
- Allow an optional header in the features.tsv file
- Allow an optional header in the barcodes.tsv file
- Allow multiple columns that are tab-separated in the barcodes.tsv file. The barcodes need to be present in the first column.
- A new ‘Sample’ parameter for naming samples has been added to the Expression Matrix and Peak Count Matrix importers.
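The relaxed barcodes.tsv handling described above can be illustrated with a minimal sketch. This is not the module's importer: the function name `read_barcodes` and the `has_header` flag are hypothetical, and the sketch assumes the caller knows whether a header line is present.

```python
import csv
import io


def read_barcodes(handle, has_header=False):
    """Return the barcodes found in the first column of a tab-separated file."""
    rows = csv.reader(handle, delimiter="\t")
    if has_header:
        next(rows, None)  # skip the optional header line
    # Only the first column is read; any additional columns are ignored.
    return [row[0] for row in rows if row and row[0]]


# Hypothetical file content with a header and an extra column.
example = "barcode\tsample\nAAACCTGAGAAACCAT-1\tS1\nAAACCTGAGAAACCGC-1\tS1\n"
barcodes = read_barcodes(io.StringIO(example), has_header=True)
print(barcodes)
```

A single-column file without a header goes through the same function with `has_header=False`.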
Train Cell Type Classifier
QIAGEN Cell Ontology
- Dimensionality reduction and phase portrait plots have been improved with new side panel settings allowing:
- Dot and arrow size adjustment
- Addition of labels and legends
- Improved the default colors when overlaying Transcription Factor counts in a dimensionality reduction plot, such that differences can now be seen between groups of cells. Previously, most transcription factors gave an apparently uniform color to all cells, because the upper position of the color slider was determined by the highest value in the plot, and this was often much higher than the other values.
- Fixed an issue where Create Subset in dimensionality reduction plot editors, and Extract to Table in dimensionality reduction and phase portrait plot editors, would not take into account unsaved changes to clusters.
- Fixed an issue where Import Expression Matrix in Loom format was failing to link cell types from the QIAGEN Cell Ontology found in clusters.
- Fixed an issue where an error would be shown when a selection was made on an Expression Matrix or Peak Count Matrix in a Track List and a table view of the matrix was subsequently opened.
- Fixed an issue with the dimensionality reduction plot editor that would sometimes fail when loading plots with more than 100K cells.
- Fixed an issue where “Extract to Table” in the dimensionality reduction plot editor would:
- For spliced/unspliced matrix selections: put the expression in all three columns (expression, spliced, unspliced)
- For velocity matrix selections: put the spliced read count in all three columns (spliced, unspliced, velocity)
CLC Single Cell Analysis Module 22.0
Released on January 11, 2022
- RNA Velocity analysis is fully supported via RNA-Seq workflows and new tools:
- The following workflows have been updated to calculate velocity, if possible, and produce relevant outputs (phase portraits and a ranking of velocity genes):
- Single Cell Chromatin Accessibility analysis is fully supported via the following new workflows and tools:
- The following tools have been added to support import of 10x ATAC matrices:
- The following tools have been added to support export of 10x ATAC matrices:
Optionally the nearby genes and transcription factors can be imported or exported.
- Single Cell ATAC-Seq Analysis analyzes ATAC-Seq read mappings and produces a filtered and normalized Peak Count Matrix. The peak count matrix contains peaks for each cell, their nearby genes and their transcription factors.
- Split Read Mapping by Cell produces read mappings for groups of cells and Graph Tracks to aid visualization of ATAC-Seq peaks.
- Differential Accessibility for Single Cell calculates peaks / nearby genes / transcription factors for which counts are statistically different between groups of cells.
- It is possible to use the following tools with Peak Count Matrices, either standalone or in combination with an Expression Matrix:
Analysis including both matrix types will only consider cells that are present in both matrices.
- Annotate Reads with Cell and UMI has new options “10x Chromium Single Cell ATAC” and “10x Chromium Single Cell Multiome ATAC” for demultiplexing 10x ATAC data types.
- When ATAC multiome is selected, barcodes are translated to match ATAC GEX.
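When a Peak Count Matrix and an Expression Matrix are analyzed together, only cells present in both are considered. A minimal sketch of that restriction follows. The cell identifiers and the helper name `shared_cells` are hypothetical, and the module's internal representation is not shown here.

```python
def shared_cells(expression_cells, peak_cells):
    """Return the cells present in both matrices, in expression-matrix order."""
    in_peaks = set(peak_cells)
    return [cell for cell in expression_cells if cell in in_peaks]


# Hypothetical cell identifiers (sample name plus barcode).
expression_cells = ["S1:AAAC", "S1:AAAG", "S1:AACT"]
peak_cells = ["S1:AAAG", "S1:AACT", "S1:ACGT"]
common = shared_cells(expression_cells, peak_cells)
print(common)
```

Cells found in only one of the two matrices are left out of the combined analysis.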
MEX format Import
- Import Expression Matrix in MEX Format has been extended with options to take separate matrix files for spliced and unspliced reads. When these are used, the expression file may be omitted and the expressions are calculated from these counts.
- Import Expression Matrix in MEX Format (archive) has been extended with the same option. The two matrix files must be called “spliced.mtx” and “unspliced.mtx”. This matches the convention of STARsolo.
MEX Expression Matrix Export
- The MEX Expression Matrix (Archive) exporter is now called MEX Expression Matrix.
- The MEX Expression Matrix exporter has been extended with the ability to export spliced and unspliced counts. They will be in files with fixed names “spliced.mtx” and “unspliced.mtx” matching STARsolo convention.
- The MEX Expression Matrix exporter now outputs the three files (barcode, features, matrix) in .tar format directly instead of wrapping them in an archive.
Improvements to UMAP and tSNE plot
- Single Cell RNA-Seq Analysis has been updated to output a new expression matrix which additionally contains counts for spliced and unspliced reads.
- Single Cell TCR-Seq Analysis has been updated to better handle situations where the J segment cannot be uniquely identified for a clonotype. This change is not expected to impact the results significantly but may lead to slight changes in read counts as well as identification of clonotypes with low read counts.
- Text files containing tables with columns separated by tabs (.tsv) can now be imported using Import Cell Annotations and Import Cell Clusters.
- The “Empty droplets filter” in the QC for Single Cell tool has been updated to make it easier to retain as cells only the barcodes above the automatically detected knee in the log-log rank plot, without performing simulation-based tests.
- When importing an expression matrix in Loom format, appropriate attributes are provided for easy selection.
- Annotate Reads with Cell and UMI has a new option “10x Chromium Single Cell 5′” with the same read structure as “10x Chromium Single Cell 3′ v2”.
- Import Cell Clusters, Import Cell Annotations and Import Cell Clonotypes have been updated to accept any type of data matrix (rather than just an expression matrix) as a parameter for setting the sample name.
- QIAGEN Cell Ontology has been updated with new cell types.
- Various minor improvements.
- Fixed an issue with fitting Negative Binomial Generalized Linear Model. This leads to slightly different results from Normalize Single Cell Data.
- Various minor bug fixes.
- Re-organized the structure of the toolbox to accommodate the new features better.
- The Single Cell Workflows can now be located in the Template Workflows section of the toolbox.
- When using workflows from previous versions of the Single Cell Analysis Module, major adjustments may be required. Workflows containing the following tools need to be manually redrawn by removing and re-adding:
Additional redrawing of connections may be required.
CLC Single Cell Analysis Module 21.1
Released on June 24, 2021
- It is possible to use Import Cell Annotations and Import Cell Clusters when selecting input to a workflow, so as to perform on-the-fly import.
- A new “Cell format” option in Import Cell Annotations and Import Cell Clusters allows the extraction of sample and barcode information from one column in the imported file. It also allows text, such as a constant prefix, to be trimmed from all barcodes.
- When importing an expression matrix, a feature with a versioned id (e.g. “ENSG00000243485.5”) will be matched to the genomic coordinates of a feature in a track with no version information (e.g. “ENSG00000243485”).
- The Import Expression Matrix tools no longer assign genomic coordinates to imported features when there is ambiguity. For example, if no features have ids, and a feature with the same name is present on two chromosomes, then neither set of genomic coordinates will be assigned to an imported feature with that name.
- Import Expression Matrix in Loom Format imports more of the data present in a Loom file as cell annotations.
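The two matching rules for imported features (versioned ids match unversioned ids, and ambiguous names receive no coordinates) can be sketched as follows. This is an illustration under stated assumptions, not the importer's implementation: the function names, the tuple layout, the feature names, and the coordinates are all hypothetical. Only the id "ENSG00000243485" is taken from the notes above.

```python
from collections import defaultdict


def strip_version(feature_id):
    """Drop a trailing '.<digits>' version suffix, if present."""
    base, dot, version = feature_id.rpartition(".")
    return base if dot and version.isdigit() else feature_id


def build_lookups(track_features):
    """Index track features, given as (feature_id, name, coordinates) tuples."""
    by_id = {}
    by_name = defaultdict(set)
    for feature_id, name, coords in track_features:
        if feature_id:
            by_id[strip_version(feature_id)] = coords
        by_name[name].add(coords)
    return by_id, by_name


def assign_coordinates(feature_id, name, by_id, by_name):
    """Return coordinates, or None when no unambiguous match exists."""
    if feature_id:
        return by_id.get(strip_version(feature_id))
    candidates = by_name.get(name, set())
    # A name seen at more than one location is ambiguous: assign nothing.
    return next(iter(candidates)) if len(candidates) == 1 else None


# Hypothetical track: geneB has no id and appears on two chromosomes.
track = [
    ("ENSG00000243485", "geneA", ("chr1", 100, 200)),
    ("", "geneB", ("chr1", 500, 600)),
    ("", "geneB", ("chr2", 700, 800)),
]
by_id, by_name = build_lookups(track)
print(assign_coordinates("ENSG00000243485.5", "geneA", by_id, by_name))
print(assign_coordinates("", "geneB", by_id, by_name))
```

In this sketch a feature that has an id is matched by id only. Whether the importer falls back to name matching in that case is not stated in these notes.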
Cell type prediction
- Predict Cell Types has been extended with functionality to restrict cell type prediction to selected tissues. This improves accuracy by reducing the range of possible cell types.
- Predict Cell Types matches features with versioned ids in the input (e.g. “ENSG00000243485.5”) to features with no version in the trained classifier (e.g. “ENSG00000243485”) and vice versa. Previously no match would be found for such features.
- The table view of a Cell Type Classifier has been extended to contain information about the features most useful for performing the classification.
- An option to set the sample name has been added to Annotate Reads with Cell and UMI. This can be set either to some literal value or a pattern in much the same way as exporters and output elements.
- Annotate Reads with Cell and UMI has been extended to include two new library preparation types:
- 10x Chromium Single Cell V(D)J
- BD Rhapsody
- An option to “Count intronic reads” has been added to Single Cell RNA-Seq Analysis. This supports analysis of data where many unprocessed transcripts are expected to be present, such as snRNA-seq data.
- The results of Differential Expression for Single Cell additionally contain the feature id and how many cells express each feature.
- New options in Differential Expression for Single Cell allow features to be filtered away when they are insufficiently expressed, prior to testing.
- In the track view of an Expression Matrix, side panel options have been added to show, hide, and color features based on their expression.
- The Expression Analysis from Reads workflow outputs the raw, unfiltered expression matrix. This can be used as input to the Expression Analysis from Matrix workflow, so that changes to cell calling or doublet removal can be made without having to re-map the reads.
- Fixed an issue where Import Cell Annotations would fail when one of the imported columns was completely empty.
- Fixed an issue where the Import Expression Matrix tools could fail to import data in certain circumstances when two features had the same name or identifier. Such features are now supported provided that the combination of name, identifier, and type of feature (e.g. ‘Gene Expression’) is unique.
- Fixed an issue where Import Expression Matrix in CSV/TXT Format gave an error unless the first column header was empty.
- Fixed an issue in QC for Single Cell where cell annotations for “spike-in reads (%)”, “Empty droplet p-value”, and “Empty droplet FDR-corrected p-value” were missing for some cells when the tool was run on a matrix containing multiple samples. This did not affect the output filtered matrix.
- Fixed an issue in QC for Single Cell where an error could sometimes be encountered when generating the histograms for the report.
- Fixed an issue where tooltips shown in the UMAP/tSNE plot did not disappear in some cases after the cursor was moved.
- Fixed an issue where interaction with data via the UMAP/tSNE plot would be slow when the data was generated on a QIAGEN CLC Genomics Server, and then copied and viewed locally while the workbench was still connected to the server.
- Fixed an issue where p-values reported by Differential Expression for Single Cell on data normalized by Normalize Single Cell Data did not use the ‘Pearson residual’ values if a gene had 0 expression. This typically reduced power, leading to fewer differentially expressed genes being detected at a given p-value threshold.
- Fixed an issue when running Create Expression Plot and Differential Expression for Single Cell from the UMAP/tSNE plot, where the pre-selected options would sometimes not reflect the selections in the side panel. This would happen if the tool was run twice, and on the first run different options than the pre-selected ones were used.
- Fixed an issue where Create Subset would output clusters that did not allow easy lookup of cells in the QIAGEN Cell Ontology.
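The uniqueness requirement for imported features noted above (the combination of name, identifier, and feature type must be unique) can be checked before import with a short sketch like the one below. The helper `duplicated_features` and the example values are hypothetical, and this is not how the Import Expression Matrix tools are implemented.

```python
from collections import Counter


def duplicated_features(features):
    """Return (name, identifier, type) combinations that occur more than once."""
    counts = Counter(features)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical features: the same name or identifier may repeat,
# as long as the full combination stays unique.
features = [
    ("geneA", "id1", "Gene Expression"),
    ("geneA", "id1", "Antibody Capture"),
    ("geneA", "id2", "Gene Expression"),
]
print(duplicated_features(features))
```

An empty result means every combination is unique, even though individual names and identifiers are shared between features.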
CLC Single Cell Analysis Module 21.0.1
Released on March 3, 2021
- Fold changes reported by the Differential Expression for Single Cell tool on data normalized by the Normalize Single Cell Data tool were previously calculated using the normalized Pearson residual values. This could give misleading results in some cases – typically exaggerating the effect size. Fold changes on such data are now calculated by using the normalization to re-scale expressions to a fixed sequencing depth.
- The cell annotations output of Predict Cell Types has been updated to only include the probabilities of a relevant subset of all possible cell types. Relevant cell types are those that are either the most likely type for at least one cell, or that have a probability above a certain threshold for at least one cell.
- Predict Cell Types no longer creates empty cell types in the “Cell type (high confidence)” category of the output.
- The Perform Single Cell Analysis from Expression Matrix and Perform Single Cell Analysis from Reads workflows have been updated to include the Convert Metadata to Cell Annotations tool. When a metadata table is used for configuring workflow execution, this change means that the produced cell annotations now include information from the metadata table.
- In the it is now possible to select cells that do not express a certain gene.
- QC for Single Cell retains cells with metrics that are equal to the minimum values for count-based filters and maximum values for extra-chromosomal filters. These were previously considered outliers and were removed.
- Various minor improvements.
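The rule for selecting relevant cell types in the Predict Cell Types output can be sketched as follows: a type is kept if it is the most likely type for at least one cell, or if its probability exceeds a threshold for at least one cell. The function name, the threshold value of 0.1, and the example probabilities are assumptions made for illustration. The threshold used by the tool is not given in these notes.

```python
def relevant_cell_types(probabilities, cell_types, threshold=0.1):
    """Select cell types worth reporting.

    probabilities holds one row per cell and one column per cell type.
    """
    relevant = set()
    for row in probabilities:
        # The most likely type for this cell is always kept.
        best = max(range(len(row)), key=row.__getitem__)
        relevant.add(best)
        # So is any type whose probability exceeds the threshold.
        relevant.update(i for i, p in enumerate(row) if p > threshold)
    return [cell_types[i] for i in sorted(relevant)]


# Hypothetical probabilities for two cells and four cell types.
cell_types = ["T cell", "B cell", "NK cell", "Monocyte"]
probabilities = [
    [0.90, 0.05, 0.04, 0.01],
    [0.30, 0.60, 0.08, 0.02],
]
kept = relevant_cell_types(probabilities, cell_types)
print(kept)
```

Lowering the threshold keeps more columns in the output. Raising it keeps fewer, but never fewer than the set of most likely types.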
CLC Single Cell Analysis Module 21.0
Released on January 12, 2021
First release of this module.