
Latest improvements for QIAGEN CLC Genomics Workbench


QIAGEN CLC Genomics Workbench 23.0.5

Release date: 2023-09-20

Improvements

  • Detect and Refine Fusion Genes has a new option allowing fusions of overlapping genes on opposite strands to be reported.
  • Loading reports with very large tables takes less time.
  • Previously, when annotation tracks were exported to BED format files, the Score column in the exported file contained only 0 values. Now, if the annotation track contains a Score column, those values are reported in the Score column of the exported file. (This does not affect expression tracks, where the expression value is exported as the score.)
  • VCF Import can import VCF files with an unexpected number of values in CLCAD2 or AD. This includes VCF files produced by VarScan2.
  • Various minor improvements
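
The BED export change can be pictured with a minimal Python sketch. The function name `bed_line`, the dictionary keys and the six-column layout are assumptions made for illustration, not the Workbench's implementation. The only behavior taken from the note is that an existing Score value is now written to the score column, where 0 was written before.

```python
def bed_line(annotation):
    """Format one annotation as a 6-column BED line (hypothetical helper).

    `annotation` is a dict whose keys are chosen for this sketch:
    chrom, start (0-based), end, name, and optionally score and strand.
    """
    # Use the annotation's Score value when present; fall back to 0 otherwise.
    score = annotation.get("score", 0)
    fields = [
        annotation["chrom"],
        str(annotation["start"]),
        str(annotation["end"]),
        annotation["name"],
        str(score),
        annotation.get("strand", "."),
    ]
    return "\t".join(fields)


with_score = {"chrom": "chr1", "start": 100, "end": 200,
              "name": "peak1", "score": 850, "strand": "+"}
without_score = {"chrom": "chr1", "start": 300, "end": 400, "name": "peak2"}

print(bed_line(with_score))
print(bed_line(without_score))
```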

Bug fixes

Data related updates

From September 19, 2023, Download Pfam Database downloads Pfam 36.0. This update also affects download using earlier versions of the CLC Genomics Workbench.

Plugin notes

Import Immune Reference Segments, delivered by Biomedical Genomics Analysis and CLC Single Cell Analysis Module, can now import V segments in IMGT format that end in the conserved amino acid. Previously, these segments were silently ignored.

Advance notice

We plan to remove the Pause and Resume options, available for some processes in the Workbench, in a future release. If you are concerned about this proposed change, please contact our Support team by emailing ts-bioinformatics@qiagen.com.



QIAGEN CLC Genomics Workbench 23.0.4

Release date: 2023-05-22

Improvements

  • Download BLAST Databases is more resilient to interrupted connections and similar issues when downloading large databases.

Bug fixes

  • Fixed an issue where workflows containing a BAM export element could not be launched from CLC Genomics Workbench 23.0.3 to run on a CLC Genomics Server due to an error reported after selecting an export destination in the launch wizard (“The parameter ‘Export destination’ File not found.”)
  • Fixed an issue causing workflows to fail if they contained multiple Filter on Custom Criteria elements connected to a single downstream element, and one or more of the Filter on Custom Criteria outputs was empty.
  • Fixed an issue causing QC for Read Mapping to report the number of unaligned ends instead of the number of reads with unaligned ends. This could cause “read count” and “% of all mapped reads” to be too high.
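
The difference between the two counts in the QC for Read Mapping fix can be shown with a small Python sketch. Representing each read as a pair of unaligned-end lengths is an assumption made for this example, and `count_unaligned` is a hypothetical helper rather than part of the tool.

```python
def count_unaligned(reads):
    """Return (unaligned ends, reads with at least one unaligned end).

    `reads` is a list of (left_unaligned, right_unaligned) lengths in bases.
    This representation is an assumption made for the sketch.
    """
    total_ends = 0
    reads_with_ends = 0
    for left, right in reads:
        # A read contributes 0, 1 or 2 unaligned ends.
        ends = int(left > 0) + int(right > 0)
        total_ends += ends
        if ends:
            reads_with_ends += 1
    return total_ends, reads_with_ends


reads = [(0, 0), (5, 0), (3, 7), (0, 12)]
ends, affected = count_unaligned(reads)
print("unaligned ends:", ends)
print("reads with unaligned ends:", affected)
print("percent of mapped reads:", 100 * affected / len(reads))
```

A read with two unaligned ends is counted twice in the first number, which is why reporting it as a read count inflated both the count and the percentage.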

Data related changes

ClinVar data for hg38 and dbSNP data for hg38 and hg19 made available via Download Genomes have been updated. These changes affect data available via CLC Genomics Workbench 22.x and 23.x and took effect on May 15, 2023.

Details of these updates:

  • “Clinical associated variants” (ClinVar) – now accesses the latest releases from the NCBI for hg38. This data source was already used for hg19. Previously, Ensembl was the source for hg38.
  • “Dbsnp variants” (dbSNP) now accesses version 151 from the NCBI. Previously, Ensembl was the source of this data for hg38. UCSC was the source for hg19.
  • “Dbsnp (common) variants” (dbSNP common) now accesses version 151 from the NCBI. Previously, this option was not available for hg38. UCSC was the source for hg19.

Advance notice

We plan to remove the Pause and Resume options, available for some processes in the Workbench, in a future release. If you are concerned about this proposed change, please contact our Support team by emailing ts-bioinformatics@qiagen.com.



QIAGEN CLC Genomics Workbench 23.0.3

Release date: 2023-04-19

Improvements

  • The SAM and BAM exporters have a new option relevant when there are one or more circular reference sequences. The new option, “Export reads spanning the origin of circular chromosomes as unmapped”, is checked by default, making the default behavior of these exporters match that of CLC Genomics Workbench 22.x and earlier. This update changes the default behavior of these exporters relative to CLC Genomics Workbench 23.0.1 and 23.0.2. In those versions, reads that span the origin are exported as extending beyond the end of the reference. That behavior corresponds to unchecking the new option.
  • PacBio SAM/BAM files with Platform Model (PM) set to HIFI are imported as HiFi reads without having to check the “Mark as HiFi reads” option.
  • Producing an Amino Acid Track is now optional in Amino Acid Changes.
  • Various minor improvements

Bug fixes

  • Fixed an issue affecting the homopolymer trimming options of Trim Reads. When enabled, homopolymers that started with 9 identical bases followed by a different base were not trimmed. Other homopolymers were trimmed as expected. This update may affect the number of reads trimmed in a given dataset, and thus could lead to differences in results from downstream analyses, relative to earlier software versions.
  • Fixed an issue causing Detect and Refine Fusion Genes to fail on certain data sets.
  • Fixed an issue causing RNA-Seq Analysis to fail when reads mapped to a gene located close to the origin of a circular chromosome.
  • Fixed an issue that caused Join Alignments to fail in CLC Genomics Workbench 23.0.1 and 23.0.2.
  • Fixed an issue causing SAM/BAM export to fail when reference sequence names contained commas, brackets or other characters not in the set of allowed characters according to the SAM format specification. These characters are now replaced by an underscore in the exported file.
  • Fixed an issue causing import of SAM/BAM files to fail when they contained a Platform (PL) but no Platform Model (PM) in the header. This affected the PacBio importer, the Ion Torrent importer and Standard Import of reads from SAM/BAM files.
  • Fixed an issue that caused the value selection range of gradient sliders in the Side Panel to be displayed incorrectly when using Locale Settings that do not use ‘.’ (dot) as a decimal separator.
  • Fixed an issue where lines in PDFs containing history information were not wrapped, resulting in the ends of long lines being missing from the exported document.
  • Fixed an issue that caused VCF Export to fail when exporting fusions that had two or more filter criteria listed in the Filter column.
  • Fixed an issue that caused Low Frequency Variant Detection, Fixed Ploidy Variant Detection, and Basic Variant Detection to fail when the end of a mapped read supported a deletion, and there was support in other reads for a variant at the subsequent position. This issue has only been observed for RNA-Seq data where splicing combined with primer trimming could lead to this situation.
  • Fixed an issue causing Extract Reads to not correctly extract reads overlapping annotated regions that cross the origin of circular chromosomes when the type of overlap was set to “Span region” or “No overlap”.
  • Fixed issues in read mapping track tooltips affecting positions in paired reads that contained different bases:
    • When viewing the reads as pairs, one of the 2 bases would be chosen to represent both reads for the purposes of the counts in the tooltip. Now, such ambiguous bases are represented by their IUPAC codes and counted accordingly. Thus, the tooltip now corresponds to the bases shown in the read mapping track.
    • When viewing the individual reads making up the pairs, just one of the two bases would be counted, thereby undercounting the total number of bases at such positions.
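
The pair-view counting described in the tooltip fix can be sketched in Python. The two-base IUPAC codes are standard; the helper names and the input format (one base per read of each pair at a given position) are assumptions for illustration, and gaps or N are deliberately not handled here.

```python
from collections import Counter

# Standard IUPAC nucleotide codes for single bases and two-base ambiguities.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def pair_symbol(base_r1, base_r2):
    """Return the symbol representing a pair at one position (hypothetical helper)."""
    return IUPAC[frozenset((base_r1.upper(), base_r2.upper()))]


def column_counts(pairs):
    """Count symbols at one reference position when reads are viewed as pairs."""
    return Counter(pair_symbol(r1, r2) for r1, r2 in pairs)


# Four read pairs overlapping the same position.
pairs = [("A", "A"), ("A", "G"), ("G", "A"), ("C", "T")]
print(column_counts(pairs))
```

Pairs whose two reads disagree are counted under the ambiguity code rather than under one arbitrarily chosen base.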

Plugin notes

Fixed an issue affecting Immune Repertoire Analysis, delivered by Biomedical Genomics Analysis, and Single Cell V(D)J-Seq Analysis, delivered by CLC Single Cell Analysis Module. The tools failed if there were reads where the region that aligned to a C segment was contained within the region that aligned to a J segment.



QIAGEN CLC Genomics Workbench 23.0.2

Release date: 2023-02-13

Improvements and bug fixes

  • The runtime of Amino Acid Changes has been significantly improved.
  • Fixed an issue in the Trim Reads report where the number of “Trimmed (broken pairs)” was not reported per input sequence list; instead, the counts were added together incrementally. The number of reported “Trimmed reads” decreased correspondingly. The issue would occur when paired reads from more than one sequence list were trimmed and broken read pairs were produced.
  • Fixed a rare issue that could cause Trim Reads to retain a wrong part of a read if the read was both trimmed based on quality scores and adapter read-through.
  • Fixed an issue causing the Demultiplex Reads tool to always demultiplex based on a sequence structure of “barcode, sequence”. Adjustments to the tag list, such as adding a linker or placing the barcode at the end, were ignored. This issue did not affect the tool when run in a workflow context.
  • Fixed an issue that could cause Detect and Refine Fusion Genes to fail on Windows when either the dataset was large or fusion genes with many possible transcripts were detected.
  • Fixed an issue that could cause VCF Export to fail when exporting filtered annotation tracks that were empty.
  • Fixed an issue that caused fragment lengths to be incorrect in tables from Design Primers. The values are re-calculated when a table is opened, hence previous designs do not need to be repeated.
  • Fixed an issue causing the download of the QIAseq xHYB Viral Panels reference data set to fail on Windows.
  • Fixed a rare issue where Rebuild Index could not repair a corrupt search index.
  • Various minor bug fixes


QIAGEN CLC Genomics Workbench 23.0.1

Release date: 2023-01-17

Improvements and bug fixes

  • Fixed an issue affecting Trim Reads, where the wrong part of a read was retained if the read was both trimmed to a fixed length and also trimmed by another method from the opposite end of the read.
  • Fixed an issue affecting Trim Reads when both adapter trimming using a trim adapter list and fixed length trimming were selected. This issue could cause the resulting trimmed reads to be shorter than expected.
  • Fixed an issue where fusion plots created by Detect and Refine Fusion Genes were omitted in the report and were not accessible via the fusion track table.
  • Fixed an issue where workflows containing a Branch on Coverage element would fail for read mappings with no zero coverage regions when using reports output by QC for Read Mapping.
  • Fixed an issue where dates indicated with forward slashes in CSV format files were not recognized as dates by Import Metadata.
  • Fixed an issue where the history entry in a sequence list after sorting always stated the sorting was based on length, even when the sorting was based on name or marked status.
  • Fixed an issue causing Annotate with GFF/GTF/GVF file to fail when the option “Ignore duplicate annotation” was checked.
  • Fixed an issue causing Standard Import of GenBank format to stall if qualifier names spanned more than one line.
  • Various minor improvements

Please see the release notes for CLC Genomics Workbench 23.0, below, for a full list of changes since the last general release of this software.



QIAGEN CLC Genomics Workbench 23.0

Release date: 2023-01-17

New tools

  • Homology Based Cloning – Design cloning experiments for cloning methods relying on homologous ends, such as Gibson Assembly®.
  • Create K-medoids Clustering for RNA-Seq – Find clusters of features (e.g., genes, transcripts or miRNAs) whose expression behaves similarly, for example first increasing over time and then decreasing. The tool produces a Clustering Collection, which contains a Sankey plot showing how these features move between clusters under different conditions, for example different treatments. A line graph representation of features from individual clusters or pairs of clusters is also included.

New tools coming from plugins

  • Detect and Refine Fusion Genes – Find fusion genes in RNA-Seq data by identifying potential fusions and then refining that list by evaluation of the evidence for each fusion. This is an updated version of the tool formerly distributed in the Biomedical Genomics Analysis plugin. The updates made are listed in an Improvements section below.
  • Target Region Coverage Analysis – Analyze and compare coverage from multiple samples. This tool was formerly distributed in the Biomedical Genomics Analysis plugin.
  • Create Consensus Sequences from Variants – Create consensus sequences from a variant track and a reference sequence. This tool was formerly distributed in the Biomedical Genomics Analysis plugin.
  • Annotate with GFF/GVF/GTF file – Add annotations from a GFF, GVF or GTF format file onto sequences, individual or in sequence lists. This tool was formerly distributed in the Annotate with GFF file plugin.

Other new functionality and improvements

RNA-Seq analysis tools

  • New tutorial: Get hands-on experience with new RNA-Seq analysis functionality, including Create K-medoids Clustering for RNA-Seq (see New Tools above), with the RNA-Seq analysis with four tissues and six timepoints tutorial.
  • Improvements to RNA-Seq Analysis:
    • Substantial speed improvements. Reads that map to multiple transcripts or genes will be distributed differently than earlier due to different choices of random seed in the new implementation. The algorithm is still deterministic.
    • Transcripts are no longer renamed in Transcript Expression (TE) output unless renaming is necessary to avoid duplicate names. Previously, transcripts were renamed to the gene name plus a number e.g. “BRCA_1”. This change means that TE tracks in this version of the software cannot typically be used together with TE tracks generated using older versions to produce Heat Maps, PCA plots, Expression, etc.
    • Reports UMI fragment counts when relevant. UMI counts are included in the Fragment statistics section of the report if the input reads are annotated with UMIs by tools from the Biomedical Genomics Analysis plugin, and if the library type is set to 3′ sequencing for RNA-Seq Analysis.
  • Improvements to Heat Maps:
    • Samples can be ordered by the Tree, Sample, or Active metadata layer options, or any individual metadata entry.
    • Optimize tree layouts – a new option for reordering features to produce a top-left to bottom-right diagonal.
    • The order of the metadata categories can be adjusted. This order is reflected in the legend.
    • Metadata categories are alphabetically sorted.
  • The Expression Browser includes a new plot for visualizing genes across samples and contrasts. Metadata categories are sorted alphabetically.
  • Venn diagrams support four and five groups. Previously up to 3 were supported. Tooltips indicate which groups are part of a specific intersection.
  • PCA plots produced by PCA for RNA-Seq:
    • Have two table views. The first table view shows the loadings of the principal components. The second table view shows the coordinates of the points.
    • The order of the metadata categories in 2D PCA plots can be adjusted. This order is reflected in the legend.

miRNA analysis tools

Differential Expression for RNA-Seq and Differential Expression in Two Groups

Detect and Refine Fusion Genes

This is an updated version of Detect and Refine Fusion Genes, formerly distributed in the Biomedical Genomics Analysis plugin. The updates listed here are relative to the version distributed with Biomedical Genomics Analysis 22.2.

  • Fusions will not be called for overlapping genes.
  • Novel exon boundary improvements:
    • Options have been expanded to allow for detecting fusions with a single fusion partner (“Detect with novel exon boundaries”) as well as detecting those with 2 fusion partners (“Allow fusions with novel exon boundaries in both genes”)
    • The “Detect exon skippings” option supports detection of fusions with novel exon boundaries.
  • An option has been added to omit non-significant breakpoints from the report.
  • A minimum Z-score can now be specified for use when evaluating evidence for a fusion.
  • Speed improvements
  • The option “Allow fusions with novel exon boundaries in both genes” now defaults to false to reduce the number of false positive fusions. Setting it to true is useful for exhaustive searches of novel fusions.
  • Changes to the maximum number of equivalent matches to the reference allowed for a single read to be retained:
    • When remapping reads to a fusion chromosome, the maximum number is now 30. Previously it was 10.
    • When searching for unaligned ends, the maximum number remains unchanged, as 10.
    • The option “Maximum number of hits for a read” has been removed. Its value was ignored in previous versions.
  • Fusions from mRNA transcripts without an associated gene in the Gene track are not used when detecting fusions. mRNA transcript features must have a gene id in one of the following columns to be matched with the associated gene: “Parent”, “gene_id” or “gene_name”.
  • Fixed an issue where paired end reads were treated as single end reads when the option to “Only use fusion primer reads” was enabled.
  • Fixed an issue where unaligned ends could be too long or too short for reads containing insertions and deletions. This change may lead to small differences in results compared to earlier versions, expected to be due to a decrease in the false positives and false negatives reported.
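
The requirement that mRNA transcript features carry a gene id in “Parent”, “gene_id” or “gene_name” can be pictured with the Python sketch below. The function names, the dict-based feature representation, and the order in which the columns are checked are assumptions made for illustration; the notes only state which columns are considered.

```python
# Columns named in the release note; the lookup order is an assumption.
GENE_ID_COLUMNS = ("Parent", "gene_id", "gene_name")


def associated_gene(mrna, gene_names):
    """Return the gene an mRNA feature refers to, or None (hypothetical helper)."""
    for column in GENE_ID_COLUMNS:
        value = mrna.get(column)
        if value and value in gene_names:
            return value
    return None


def usable_transcripts(mrnas, gene_names):
    """Keep only transcripts that can be matched to a gene in the Gene track."""
    return [m for m in mrnas if associated_gene(m, gene_names) is not None]


genes = {"BRCA1", "TP53"}
mrnas = [
    {"name": "tx1", "Parent": "BRCA1"},
    {"name": "tx2", "gene_name": "TP53"},
    {"name": "tx3", "gene_id": "UNKNOWN_GENE"},
    {"name": "tx4"},
]
print([m["name"] for m in usable_transcripts(mrnas, genes)])
```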

Bisulfite mapping

  • Map Reads to Bisulfite Reference speed improvement. This is data dependent, with about a 50% improvement likely for most data sets. This speedup might change the details of results very slightly.
  • Call Methylation Level speed improvement. This speedup might, in some cases, change results very slightly.
  • Import of read mappings from SAM/BAM now uses methylation information from the optional SAM tags XR for read conversion and XG for reference conversion. The recognized values are “CT” and “GA”. Support for these tags is added so that information is not lost if a bisulfite mapping is exported and then re-imported.
  • Export of read mappings to SAM/BAM format now includes details on bisulfite conversion. These are specified using the SAM tags XR for read conversion and XG for reference conversion. The possible values of these tags are “CT” and “GA”. This is provided for increased compatibility with third party tools.
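
As an illustration of the XR and XG tags described above, the following Python sketch reads them from a single SAM alignment line. The tag names and the values “CT” and “GA” come from the notes; the string tag type (`Z`), the example read, and the helper name `bisulfite_tags` are assumptions made for this sketch.

```python
def bisulfite_tags(sam_line):
    """Extract XR (read conversion) and XG (reference conversion) tags.

    Hypothetical helper: optional fields start after the 11 mandatory SAM
    columns and have the form TAG:TYPE:VALUE.
    """
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        if tag in ("XR", "XG"):
            if value not in ("CT", "GA"):
                raise ValueError(f"unexpected {tag} value: {value}")
            tags[tag] = value
    return tags


# Made-up alignment record used only to exercise the parser.
line = "\t".join([
    "read1", "0", "chr1", "101", "60", "10M", "*", "0", "0",
    "TTGATTAGGA", "IIIIIIIIII",
    "NM:i:0", "XR:Z:CT", "XG:Z:GA",
])
print(bisulfite_tags(line))
```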

Workflows

  • Branch on Coverage – a new workflow control flow element where the downstream processing of read mappings can be controlled based on coverage values within reports.
  • Import with Metadata – new template workflow that imports sequence data into sequence lists and associates the imported elements to a CLC Metadata Table containing descriptive information for each sample.
  • Workflows containing Demultiplex Reads elements and workflows containing Split Sequence List elements can be run in Batch mode.
  • Barcodes can be preconfigured in Demultiplex Reads elements in workflows.
  • Workflow Export elements can be preconfigured to export to locations on AWS S3.
  • When Annotate with Overlap Information is included more than once in the same workflow, columns with overlap information are now always added in the same order. Previously, concurrency issues could cause column order to be different between different runs.

Search for Reads in SRA

Read mappings

Import and export

  • VCF Import:
    • Supports symbolic alleles for inversions (<INV>), insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). Symbolic alleles that do not contain sequence information or are longer than 100,000 base pairs are imported to annotation tracks instead of variant tracks. Previously symbolic alleles were not imported.
    • Improved handling of variants with multiple loci encoded in the same VCF record.
  • VCF Export supports symbolic allele representation for insertions (<INS>), deletions (<DEL>) and tandem duplications (<TANDEM:DUP>). (Inversions (<INV>) were already supported.) With the exception of deletions, variants in annotation tracks are always exported as symbolic alleles. Deletions in annotation tracks and variants in variant tracks above a specified size are also exported as symbolic alleles. The default size is 1000 bp, which corresponds with the QCI Interpret requirement that InDels > 1000 bp must be represented as symbolic alleles.
  • The PacBio importer supports HiFi reads.
  • The read length when exporting to FASTQ format files has been increased from 524,288 bp to 16,777,216 bp.
  • SAM/BAM Mapping Files importer:
    • Performance improvements
    • The circular flag of references is now retained.
  • Import Tracks from File has been updated to show a warning if the file is not imported.
  • GFF3 Export retains the case of attribute headers. Previously, all headers were adjusted to lower case during export.
  • The history information of elements imported using Standard Import includes the specific importer used (e.g. “CSV table importer”, “Fasta Importer”, etc).
  • Standard Import can be used to import files from AWS S3 locations.
  • When exporting images to bitmap-based formats, the Screen resolution and High resolution options are now bounded so the maximum supported number of pixels will not be exceeded.
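
The VCF Export rule for symbolic alleles described above can be summarized as a small decision function. This is a sketch of the rule as worded in these notes, not the exporter's code: the function name, the string labels for track and variant types, and the reading of “above a specified size” as strictly greater than the threshold are assumptions.

```python
def export_as_symbolic(track_type, variant_type, length, max_size=1000):
    """Decide whether a variant would be written as a symbolic allele.

    track_type:   "annotation" or "variant" (labels chosen for this sketch)
    variant_type: e.g. "insertion", "deletion", "inversion"
    length:       variant length in bp
    max_size:     size threshold in bp; 1000 is the default given in the notes
    """
    # Non-deletion variants in annotation tracks are always symbolic.
    if track_type == "annotation" and variant_type != "deletion":
        return True
    # Deletions in annotation tracks, and all variants in variant tracks,
    # are symbolic only above the size threshold.
    return length > max_size


print(export_as_symbolic("annotation", "insertion", 50))
print(export_as_symbolic("annotation", "deletion", 50))
print(export_as_symbolic("variant", "deletion", 1500))
print(export_as_symbolic("variant", "insertion", 1000))
```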

Sequence Lists

  • Checkboxes can be enabled to select sequences within the graphical view of sequence lists. Lists can be sorted based on whether sequences are marked or not, and marked sequences can be deleted.
  • In the Annotation Table view, the following changes have been made to the right-click menu:
    • The underlying sequence of selected annotations can be deleted.
    • The names of the sequences that the selected annotations are on can be copied to the clipboard.
    • The option to export to gff now exports to GFF3 format – Export Selected to GFF3 File. This option has also been updated in the Annotation Table view of individual sequence elements.
  • In the Table view, selected sequences can be deleted, and the names of selected sequences can be copied.
  • Various minor improvements to labels in right-click menus.

CLC Metadata Tables

  • When launching analyses in Batch mode, or when launching workflows with an Iterate element, CLC Metadata Tables with data associated can be used directly as input. Each row in the CLC Metadata Table is a batch unit, with data elements associated to a row, of a type compatible as input to the analysis, being the default contents of a batch unit. When launching workflows, the column to base the batch units on can be specified.
  • New options for editing CLC Metadata Tables, including for adding content from other CLC Metadata Tables or Excel, CSV or TSV files. Rows in a CLC Metadata Table can also be selected and used to make a new CLC Metadata Table.
  • When associating data automatically to CLC Metadata Tables, a preview of the associations that will be made is shown in the wizard.

Other improvements

  • Search for Sequences at UniProt has been substantially updated and improved. Changes include new search fields and more informative results, including links to PubMed entries.
  • Quick Search and Local Search have been substantially improved. Please refer to the documentation for details.
  • The overviews of batch units shown when launching tools or workflows in Batch mode and when launching workflows with control flow elements (Iterate, Collect and Distribute) have been aligned. In the latter, the contents of batch units can now be adjusted by including or excluding elements based on a part of their name. Previously this was only possible when launching analyses in Batch mode. Right-click options to remove batch units or to remove particular data elements from a batch unit have been removed.
  • When Low Frequency Variant Detection, Fixed Ploidy Variant Detection or Basic Variant Detection was used with a mapping realigned using Local Realignment with a guidance variant track, it was possible for partial insertions to be called. Now, the full insertion must be present within at least one, individual read for it to be reported.
  • QC for Targeted Sequencing:
    • Can report coverage statistics per gene.
    • Supports analysis of read mappings generated by RNA-Seq Analysis.
  • The hg38 masking track GenomeReferenceConsortium_masking_hg38_no_alt_analysis_set is provided via the Reference Data Manager as a reference element, and is part of reference sets that use the “hg38_no_alt_analysis_set” genome sequence. It contains regions defined by the Genome Reference Consortium and primarily serves to remove false duplications, including one affecting the gene U2AF1. It is intended for use with Map Reads to Reference.
  • Annotate with Exon Numbers:
    • Can add exon numbers to elements in annotation, expression and statistical comparison tracks. Previously only variant tracks could be annotated with exon numbers.
    • Adds exon numbers when input elements start outside an exon but still overlap the exon.
    • Adds all exons when multiple exons overlap a single input element.
    • Allows annotation with exons from only one transcript or CDS.
  • Filter on Custom Criteria can be used to filter Statistical Comparison Tracks, Statistical Comparison Tables, IsomiR tables, and miRNA Seed Tables.
  • Demultiplex Reads has been updated to:
    • Report barcodes without any matched reads.
    • Show the barcode names in the history.
  • Reports from Create Sample Reports and Combined Report generated using RNA-Seq reports now include the percentage of reads mapped to exons in the Fragment counting statistics table.
  • In Create Sample Report, the percentage of target region positions with coverage above a set threshold can be used as a QC metric.
  • QC for Sequencing Reads processes only the first 100,000 base pairs in long reads. Previously, the tool would fail when provided with very long reads.
  • Local Realignment no longer realigns reads into regions with no coverage, such as introns in RNA-Seq read mappings.
  • Remove Duplicate Mapped Reads uses an improved method to identify duplicate reads when handling paired end reads. In general, this improvement results in slightly more reads being considered duplicates.
  • The options for extracting reads according to their location relative to features in an overlap track have been expanded in Extract Reads. Previously reads had to lie fully within an annotated region to be extracted. Now, in addition to that condition, options are provided for extracting any overlapping reads, extracting only reads that fully span annotated regions or extracting all reads except those that overlap with annotations in the overlap track.
  • Assemble Sequences to Reference supports alignment of reads that span the origin of a circular reference.
  • Secondary Peak Calling has a new option “Peak detection stringency”.
  • The report from Copy Number Variant Detection (CNVs):
    • Includes a table showing the number of genes affected by CNV calls.
    • Contains new coverage plots at genome and chromosome levels.
  • The Trim Reads report now includes statistics for the number of reads in intact pairs and in broken pairs.
  • Updated restriction site database to REBASE 2022-06-30.
  • The Identify Known Mutations from Mappings output channel names when used in a workflow have been improved. The elements produced by the tool have not been changed.
  • While viewing data, in most situations, tooltips can be suppressed by holding down the Ctrl key. Similarly those tooltips can be displayed immediately, instead of a moment after the mouse cursor stops moving, by holding down the Shift key.
  • The Welcome Center content has been updated to focus on information helpful when getting started using the Workbench.
  • Third party plugins for CLC Workbenches can be installed when the Workbench is running in Viewing Mode.
  • A button has been added to the top Toolbar for contacting our Support team.
  • Various minor improvements

Bug fixes

  • Low Frequency Variant Detection, Fixed Ploidy Variant Detection and Basic Variant Detection:
    • Fixed an issue that in very rare cases caused insertions to be called twice. Now, the same insertion is always only included once in the variant track.
    • Fixed an issue in the remove pyro-error variants filter. Previously, the frequency threshold for removing pyro-error variants was ignored and more variants than intended were removed. The filter is generally only used for Ion Torrent data. This fix may result in a small improvement to the precision of variant detection.
    • Fixed a rare issue affecting variant calling in very low coverage regions, where a variant could be reported that was not present in any single read in the mapping.
  • Fixed an issue causing Map Reads to Reference to fail if a masking track covering a whole chromosome was provided as input.
  • RNA-Seq Analysis
    • Fixed an issue where reads were not counted as unique for a transcript in the GE track table, if the read could map in multiple ways to the same transcript, but only to that transcript.
    • Fixed an issue that could lead to an IndexOutOfBounds error when the option “Calculate expression for genes without transcripts” was selected, two or more genes had the same name, at least one of these had no transcripts, and the Region column of the table view of the gene track contained the text “join”, “>”, or “<” (i.e., the genes had splice structure or uncertain end positions).
  • Fixed an issue where the gene identifier would be removed from the statistical comparison track and tables produced by the Differential Expression for RNA-Seq tool when it was not recognized to be an Ensembl gene identifier.
  • Fixed an issue in Differential Expression in Two Groups and Differential Expression for RNA-Seq that affected the calculation of dispersion estimates that include information from nearby genes. This leads to slightly different p-values produced by these 2 tools.
  • Fixed an issue affecting Extract Consensus Sequence where annotations transferred from the reference sequence to the consensus sequence could be wrongly positioned if the read mapping had an insertion in a region that was removed due to low coverage.
  • Fixed an issue where, if two genes had the same name and overlapped, their transcripts might become assigned to only one of the genes. The fix only applies when the gene and transcript annotations are imported from GFF3.
  • Fixed an issue affecting the naming of outputs from Local Realignment when the tool was provided with multiple read mappings as input and not run in batch mode. Each resulting realigned read mapping is now named after the corresponding input. Previously all the realigned read mappings were named after the first read mapping in the set of inputs.
  • QC for Sequencing Reads
    • Fixed an issue in the report where the graph for R1 nucleotide contributions would be truncated to only show the same number of nucleotides as the R2 plot.
    • Fixed an issue where the median read length in the supplementary report could be incorrect when the number of reads was very low. The median reported in the graphical report was correct.
  • Amino Acid Changes
    • Fixed an issue causing the output to be named after the reference data instead of the input data.
    • Fixed an issue that caused the transcripts and proteins listed in the Coding region change and Amino acid change columns in the annotated variant track output to be inconsistently ordered.
  • Fixed an issue in the Trim Reads report, where the number of reads under “No trim” could be incorrect when “Remove fixed number of bases” was enabled.
  • Fixed an issue causing Show Enzymes Cutting Inside/Outside Selection to give wrong results when the selection crossed the junction of a circular sequence and a desired number of cut sites outside the selection was not specified.
  • Fixed an issue in VCF Export, where specified minimum ploidy was not always enforced for complex variants. The issue would only occur when an allele had first been removed from a locus to adhere to the specified maximum ploidy.
  • Fixed an issue where the wrong entry in a trim adapter list would be opened for editing if the list had been sorted or filtered.
  • Fixed a rare issue in K-means/medoids clustering where a gene could be output in multiple clusters. This would occur when genes with identical expressions were chosen to be medoids, and so would only happen when K was comparable to the number of genes with unique expressions across samples.
  • Fixed issues with Quantify miRNA where:
    • It would fail on paired reads if using spike-ins.
    • Opening a sequence list to view it would cause this tool to fail if that same sequence list had been used as input.
  • In the report from Create Sample Report the value column in the summary table is coloured green or yellow according to whether the threshold is met. Previously, the threshold column was coloured.
  • Workflow related
    • Fixed an issue affecting the location of outputs generated from a workflow element that was also linked to a Collect and Distribute element. In cases where the output folder name was defined using the {input} or {2} placeholder, these outputs were sometimes all saved to the first folder created, instead of to different folders as intended.
    • Fixed an issue where default names were applied to outputs from Output elements attached directly to an Iterate element in workflows, even when naming placeholders had been configured.
    • Fixed an issue affecting workflows with nested Iterate elements where results from the outer level of iteration flowed into a Collect and Distribute element. Any output elements generated in the inner iteration, which should have been saved, were lost.
    • Fixed an issue where unlocked options for on-the-fly importers in a workflow would be locked if the Input element was re-opened for editing.
    • Fixed an issue affecting the “Highlight used elements” view setting of the workflow editor, where most elements, not just the unconnected ones, were grayed out when this option was selected.
  • Fixed an issue where exporting information from an Expression Browser element to Excel with the “Export table as currently shown” option selected resulted in information from cells containing very long entries, such as GO Biological processes, being truncated.
  • Fixed issues affecting the right-click option to “Extract Sequence…” over a sequence track in a track list containing an annotation track:
    • The “Extract annotations” setting was having no effect. Even when unchecked, annotations were included in the output.
    • The “Extract annotations” option was disabled when no sequence region had been selected before running “Extract Sequence”. In this situation, the option is now enabled, and turned on by default.
  • Fixed an issue where setting the subtree line color for a particular node of a phylogenetic tree would result in that color also being applied to neighboring nodes.
  • Fixed an issue that could cause the Workbench to freeze when exporting elements with certain view settings to graphics formats, for example read mappings with compactness set to “Not compact” and the Sequence layout set to “No wrap”.
  • Fixed an issue where legends added to heat maps could sometimes be placed on top of an existing legend.
  • Fixed an issue affecting the visualization of certain read mapping tracks where, when zoomed out with the option “Float variant reads to top” selected, some empty space appeared at the top of the mapping.
  • Fixed an issue in reads tracks tooltips where insertions could be reported as present in more than 100% of the reads.
  • Fixed an issue affecting hyperlinked table entries, where HTML tags were sometimes included as text in the information exported to Excel or CSV formats.
  • Fixed an issue where upgrading on Windows systems could be blocked due to a locked file.
  • Fixed an issue where text in installer screens was not visible when installing the software in ‘dark mode’ on Linux.
  • Various other minor bug fixes

Changes

  • Tools in the RNA-Seq and Small RNA Analysis folder of the Toolbox have been rearranged into subfolders related to the natural flow of an analysis.
  • The Cloning tool has been renamed to Restriction Based Cloning.
  • The “Disconnect paired reads” viewing option for read mapping tracks and stand-alone read mappings has been replaced by the option “Show strands of paired reads”. The behaviour of the new option is like the old one except that the members of each pair are connected by a blue line.
  • Indexes used for searching are not the same as the ones used in earlier versions. New indexes are automatically established for each available CLC Workbench Location when installing version 23.0 for the first time. So for Workbenches, this change will be seamless in most cases. However, if you later run an old version of a CLC Workbench and save new data elements to a CLC Workbench location, a search from the newer version of the software will not find those unless you manually re-index your CLC Workbench locations.
  • When tools or workflows are run in Batch mode, “Create subfolders per batch unit” is selected by default. Previously this option was not selected by default.
  • PFAM accessions in the results table created by PFAM Domain Search are linked to PFAM entries hosted by InterPro. Links generated in older versions of the software were to the Pfam website, which is being decommissioned.
  • AWS Connections:
    • An AWS region can be specified in the AWS connection settings. When upgrading from an earlier version with AWS connections already defined, the region will be set to us-east-1 by default. This can be changed by editing the connection. The region setting is primarily relevant if you plan to submit analyses from a CLC Workbench with the CLC Cloud Module installed to run on a CLC Genomics Cloud setup.
    • Information about AWS Connections now includes whether the connection is valid for submitting jobs to a CLC Genomics Cloud, in addition to whether the connection is valid for accessing files on AWS S3.
  • The Java version bundled with CLC Genomics Workbench 23.0 is Java 17.0.4, where we use the JRE from the Azul OpenJDK builds.

Legacy tools and functionality

The following tools have been moved to the Legacy folder of the Workbench Toolbox and will be retired in a future version of the software:

  • QIAGEN GeneReader importer (Legacy)

Functionality retirement

The following tools have been retired:

  • Batch Rename (Legacy)
  • Compare Sample Variant Tracks (Legacy)
  • Empirical Analysis of DGE (Legacy)

Plugin notes

Plugin retirements