Biomedical Genomics Analysis latest improvements
Biomedical Genomics Analysis Plugin 20.0.1
Released on March 10, 2020
- Fixed an issue where reference alleles were being filtered out from variant tracks produced by Trio Analysis and Family of Four Ready-to-Use Workflows, which then led to incorrect coverage values in VCF files exported using these variant tracks. This issue was introduced in Biomedical Genomics Analysis 20.0.
- Fixed an issue in Refine Fusion Genes where a given read could be counted as supporting more than one fusion if those fusions had breakpoints lying very close together. Each read can now only support one fusion model.
- Fixed an issue affecting the Identify Variants (WGS-HD, WES-HD and TAS-HD) Ready-to-Use Workflows where all reference alleles were being filtered away. This was addressed by removing the Remove Reference Variant (Legacy) tool from these workflows.
- Fixed an issue affecting the Identify Somatic Variants in Tumor Normal pairs (WGS, WES and TAS) Ready-to-Use Workflows where all reference alleles were being filtered away. In these workflows the Remove Homozygous Reference Variants tool has replaced the Remove Reference Variant (Legacy) tool, so only orphan reference alleles are removed.
- Fixed a bug where fusions could be incorrectly annotated as known or incorrectly annotated as unknown by Annotate Fusions with Known Fusion Information and Refine Fusion Genes in cases where multiple sets of breakpoints were detected for a given pair of genes. Where only one pair of breakpoints between two genes was identified, fusion annotation was not affected.
Fusion gene pipeline improvements
Various other minor improvements
Biomedical Genomics Analysis Plugin 20.0
Released on December 11, 2019
QIAseq Panel Analysis
- Perform QIAseq Multimodal Analysis for analysing QIAseq DNA and RNA Multimodal Panel data. This workflow performs somatic variant calling on DNA down to a frequency of 0.5% and uses RNA for detecting fusion genes, exon skipping, and gene expression. A reference data set called QIAseq Multimodal Panels hg38 is available for analysis with the reference sequence hg38 no alt analysis set. The catalog panels UHS-005Z, UHS-006Z and UHS-009Z can be run from the Multimodal tab in the Analyze QIAseq Panels guide. Custom panels can similarly be configured for quick execution as long as DNA primers and target regions have been lifted to the hg38 genome before import. When running the stand-alone workflow it is possible to call CNVs if control mappings are supplied.
- Perform QIAseq RNA Fusion XP Analysis for analyzing QIAseq Fusion XP Panel data. This workflow supports variant calling, fusion detection and expression analysis.
- Detect QIAseq MSI Status with Baseline Creation for a combined analysis of MSI status and baseline creation. This workflow can be used for assessing QIAseq MSI from custom or catalog panels boosted with MSI primers or the booster MSI panel as standalone. It can be configured with the QIAseq TMB panel hg38, or a custom reference dataset for hg19 can be constructed from single reference elements, including the newly added MSI loci Track for hg19. The manual includes a description of how to quickly map the samples for the baseline.
An additional VCF exporter, VCF (Biomedical), is now available. This new exporter extends the functionality of the standard VCF exporter by supporting the export of Copy Number Variants (CNV) Tracks (Target, Region and Gene) and Refined Fusion Genes (WT) Tracks, in addition to the export of other variant tracks. VCF files from both exporters are compatible with upload to QCI Interpret.
Fusion gene detection
Detect Fusion Genes
- Detect Fusion Genes now reports fusions with breakpoints that are not close to exon boundaries, such as fusions into exons or introns. This change leads to a wider variety of fusions being detected with a corresponding cost to fusion specificity. The parameter “Maximum distance to exon boundary” now has a slightly different meaning: reads with breakpoints further away than this distance are considered as candidates for fusions into exons or introns.
- The number of fusion partners for a gene now only counts fusions to the same partner once even if there are multiple fusion breakpoints. If the promiscuity threshold is exceeded the top fusions for the gene are selected instead of discarding all fusions that include the gene in question.
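The partner counting described above can be sketched in Python as follows. This is only an illustration: the function names, the dictionary fields, the `top_n` parameter and the ranking of fusions by supporting read count are assumptions made for the example. The notes do not state how Detect Fusion Genes ranks the top fusions or how many it keeps.

```python
from collections import defaultdict


def count_partners(fusions):
    """Count distinct fusion partners per gene.

    A partner is counted once even if several breakpoints were
    detected for the same pair of genes.
    """
    partners = defaultdict(set)
    for f in fusions:
        partners[f["gene_a"]].add(f["gene_b"])
        partners[f["gene_b"]].add(f["gene_a"])
    return {gene: len(p) for gene, p in partners.items()}


def filter_promiscuous(fusions, threshold, top_n):
    """Keep the top fusions of promiscuous genes instead of discarding all.

    Assumption: "top" means the highest supporting read count.
    """
    counts = count_partners(fusions)
    allowed = {}  # promiscuous gene -> indices of the fusions it may keep
    for gene, n_partners in counts.items():
        if n_partners > threshold:
            involving = [
                i for i, f in enumerate(fusions)
                if gene in (f["gene_a"], f["gene_b"])
            ]
            involving.sort(key=lambda i: fusions[i]["reads"], reverse=True)
            allowed[gene] = set(involving[:top_n])
    # A fusion survives if neither of its genes excludes it.
    return [
        f for i, f in enumerate(fusions)
        if all(g not in allowed or i in allowed[g]
               for g in (f["gene_a"], f["gene_b"]))
    ]
```

In this sketch, two breakpoints between the same pair of genes contribute a single partner to each gene's count, and a gene that exceeds the threshold retains its best-supported fusions rather than losing all of them.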
Refine Fusion Genes
- A report is now generated by Refine Fusion Genes. The report includes ‘QIMERA’ fusion plots and summary tables for each fusion passing all filters.
- Publication-ready quality plots can be opened for export by double-clicking on the plot in the report or by clicking on links in the fusion gene tracks produced by the tool.
- Reads that map ambiguously and cross a fusion breakpoint are no longer included when counting fusion crossing reads. This reduces the number of false positives from pseudogenes.
- Detect Differentially Methylated Regions: Launch this tool using the new option under the Targeted Methyl tab of the Analyze QIAseq Panels guide. This runs the Call Methylation Levels tool of the CLC Genomics Workbench with parameters optimized for QIAseq Targeted Methyl data.
- Create UMI Reads from Reads: Available under the QIAseq DNA Panel Expert Tools folder, this tool identifies and merges reads with similar UMIs (up to one mismatch) that likely originate from the same DNA/RNA molecule without the need to first map the reads to a reference genome. It is particularly useful for spliced reads, such as observed when working with RNA-Seq data, and is preconfigured with default values relevant for working with such data. The tool has an advanced settings dialog where hashing, grouping and merging parameters can be defined. It can be used on both single-end and paired-end reads, but will only include R1 in the output reads list in the case of paired reads. The tool is capable of processing tens of millions of reads quickly when using only 8 GB of RAM.
- Annotate RNA Variants: Available under the QIAseq DNA Panel Expert Tools folder, this tool adds annotations to variants that appear to represent RNA changes rather than DNA changes. These annotations can be used when analyzing variants called from RNA data, or to remove low-frequency variants from DNA that has been sequenced together with an RNA sample such that index hopping may have occurred.
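The idea behind grouping reads by similar UMIs with up to one mismatch, without first mapping them, can be illustrated with a small Python sketch. It is not the algorithm used by Create UMI Reads from Reads: the greedy strategy, the function names and the use of the most frequent UMI as group representative are assumptions for this example. The actual tool also takes the read sequences into account through its hashing, grouping and merging parameters.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, max_mismatches=1):
    """Greedily group UMIs that differ by at most `max_mismatches`.

    UMIs are visited from most to least frequent, so each group is
    represented by its most abundant UMI (an assumption of this sketch).
    Returns a dict mapping representative UMI -> number of reads.
    """
    groups = {}
    for umi, n_reads in Counter(umis).most_common():
        for representative in groups:
            if hamming(umi, representative) <= max_mismatches:
                groups[representative] += n_reads
                break
        else:
            # No existing group is close enough: start a new one.
            groups[umi] = n_reads
    return groups
```

For example, six reads carrying `AACG` or `AACT` would end up in one group in this sketch, since the two UMIs differ at a single position.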
Reports generated by the following tools provided by Biomedical Genomics Analysis 20.0.0 can be used with the new Combine Reports tool of the CLC Genomics Workbench:
- Remove and Annotate with Unique Molecular Index
- Calculate Unique Molecular Index Groups
- Create UMI Reads from Grouped Reads
- Create UMI Reads from Reads
- Remove Ligation Artifacts
- Detect Fusion Genes
- Calculate TMB score
- Detect MSI status
Improvements and bugfixes
UPX 3′ Analysis
The Demultiplex tool for UPX 3′ (Demultiplex 3′) reads in the panel guide has been improved:
- The wells used in an experimental setup can be selected as an initial step when launching the tool.
- On launch, the tool will determine the likely plate size, platform and which wells were populated. This information can then be adjusted manually if necessary before the analysis is run.
The Quantify QIAseq UPX 3′ workflow has been updated. The workflow now:
- Runs more quickly per sample due to use of the Create UMI Reads from Reads tool
- Produces a Combined Report summarizing key QC values
- Allows for spike-in controls to be used
- Counts ambiguously mapped reads towards total expression
- Sets the new RNA-Seq Analysis option, “Library type setting”, to “3′ sequencing”, providing better estimates of TPM for this application.
QIAseq Panel Analysis
Detect QIAseq RNAscan Fusions
The Detect QIAseq RNAscan Fusions workflow has been optimized to maintain specificity after improvements to the fusion detection tools. The main changes are:
- An additional homopolymer trimming step has been added.
- The Detect Fusion Genes parameter “Promiscuity” has been reduced from 20 to 8.
- The Refine Fusion Genes parameter “Breakpoint distance” has been increased from 10 to 25.
- The output folder structure has been improved.
Trim Primers of Mapped Reads
- Trim Primers of Mapped Reads is now multi-threaded, supporting much faster execution.
- The handling of broken pair reads has been changed. Broken pairs are kept in the output to help visualize genomic rearrangements, but the primer sequence is unaligned from reads detected to be R1.
- Primers can now be trimmed from the start of single-end reads, as well as from the ends.
- Spliced primers can now be trimmed from reads.
- The new Combine Reports tool of the CLC Genomics Workbench has been added to QIAseq Panel Analysis ready-to-use workflows, resulting in the generation of a combined QC report containing QC information from supported reports generated by other tools in the workflows.
- Outputs from Biomedical Ready-to-Use workflows now have the name of the first input element added as a suffix and spaces in output names have been replaced with underscores.
- All QIAseq DNA Variants (Illumina) workflows have been updated with a new definition of medium and high coverage variants (medium = 20-200 / high > 200) and these variants are now only tested for Read Direction Test probability. Filtering on Read Position Test Probability is no longer performed for these workflows.
- Filters have been updated in the ready-to-use workflows for identifying Rare Disease Causing Mutations and Causal Inherited Variants in trios as well as Family of Four (WGS, WES and TAS), to minimize the number of false positives.
- QIAseq DNA workflows now use the CLC Genomics Workbench tools Annotate with Overlap Information and Remove Homozygous Reference Variants in place of the retired Add Information from Overlapping Genes tool.
- Various minor improvements
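The output naming change for Biomedical Ready-to-Use workflows listed above can be pictured with a short Python sketch. The function name, the underscore used to join the two parts, and the replacement of spaces inside the input name are assumptions for this example. The notes state only that the first input element's name is added as a suffix and that spaces are replaced with underscores.

```python
def output_name(base_name, first_input_name):
    """Build an output name with the first input's name as a suffix.

    Assumptions: the parts are joined with an underscore, and spaces
    are replaced in the whole resulting name.
    """
    return f"{base_name}_{first_input_name}".replace(" ", "_")
```

Under these assumptions, an output called "Variant Track" produced from an input called "Sample 01" would be named "Variant_Track_Sample_01".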
Biomedical Genomics Analysis Plugin 1.2.1
Released on August 15, 2019
- Fixed an issue that caused the ready-to-use workflow Identify and Annotate Variants (WES) to fail with the error “A single input object is required, encountered 2 inputs”.
- Fixed a rare issue in the Trim Primers of Mapped Reads tool that could arise when the “Mispriming events” option was enabled and at least one read in the read mapping used as input was aligned in a relatively unusual way.
- Various minor improvements
Biomedical Genomics Analysis Plugin 1.2
Released on June 27, 2019
QIAseq Panel Analysis
- The QIAseq Panel Analysis guide includes QIAseq 3′ UPX RNA analysis. The 3′ UPX RNA solution supports demultiplexing of the reads by cell ID, as well as downstream analysis of the resulting samples with the Quantify QIAseq UPX 3′ workflow. At the same time as this plugin release, new reference data sets are being released, and are available via the CLC Genomics Workbench Reference Data Manager: QIAseq UPX 3′ Panels hg19 and QIAseq UPX 3′ Panels hg38.
- The QIAseq Panel Analysis guide supports four QIAseq Targeted Methylation Panels with the Detect QIAseq Methylation workflow. At the same time as this plugin release, a new reference data set is being released, and is available via the CLC Genomics Workbench Reference Data Manager: QIAseq Methyl Panels hg38. To implement the Targeted Methylation application, the following changes were made:
QIAseq miRNA Analysis
- Quantify miRNA now allows the use of custom small RNA reference databases to map against. The tool will optionally output a sample grouped by the references in the custom database. The sample can be used with downstream analysis tools, e.g. Differential Expression, PCA for RNA-Seq and Create Heat-Map for RNA-Seq.
- A sequence list containing unmapped reads can now be output by the Quantify miRNA tool.
- Fixed a bug where the report generated by Create UMI Reads for miRNA would report maximum and minimum UMI group sizes as -1 if there were no UMI groups.
General changes, bugfixes and improvements
- The workflows Identify QIAseq DNA Somatic Variants with TMB Score (Illumina) and (Ion Torrent) now produce TMB scores calculated on the basis of target regions with coverage greater than 100x.
- The workflows Identify QIAseq DNA Somatic Variants with TMB Score now output the read mapping for MSI status detection before the Trim Primers of Mapped Reads step is executed. This improves the quality of the calling performed by Detect MSI Status (beta).
- TMB status is not automatically included in the TMB report any more, but can be added by enabling an option in the Calculate TMB Score tool. Once enabled, thresholds to use when determining TMB status can be adjusted.
- Detect MSI Status (beta) now outputs an annotation track containing information about loci.
- In Detect MSI Status (beta), the minimum read coverage required for a given locus to be considered testable can now be set using the “Minimum read count per locus” option. As a result, default parameters for the noise reduction threshold were adjusted from 10 to 5.
- Fixed a bug in the Interquartile range test of the Detect MSI Status (beta) tool, where previously all loci were identified as unstable. This then would result in all samples being identified as unstable.
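To illustrate what restricting the TMB calculation to target regions with coverage greater than 100x can mean in practice, here is a minimal Python sketch. It assumes that TMB is expressed as mutations per megabase, that each region carries a single coverage value, and that coordinates are half-open. The function name and data layout are hypothetical and do not reflect the implementation of Calculate TMB Score.

```python
def tmb_score(regions, variant_positions, min_coverage=100):
    """Mutations per megabase over sufficiently covered target regions.

    regions: list of (chrom, start, end, coverage), half-open [start, end).
    variant_positions: list of (chrom, position) for somatic variants.
    Only regions with coverage strictly greater than `min_coverage`
    contribute to the denominator, and only variants inside them count.
    """
    eligible = [r for r in regions if r[3] > min_coverage]
    total_bp = sum(end - start for _chrom, start, end, _cov in eligible)
    if total_bp == 0:
        return None  # no region is covered well enough to score
    n_variants = sum(
        1 for chrom, pos in variant_positions
        if any(c == chrom and s <= pos < e for c, s, e, _cov in eligible)
    )
    return n_variants / (total_bp / 1_000_000)
```

In this sketch a variant located in a poorly covered region affects neither the numerator nor the denominator.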
QIAseq Targeted DNA
- Fixed an issue in Create UMI Reads that led to the incorrect removal of some UMI reads from the dataset, which could then lead to false negatives in downstream variant calling. This issue affected UMI reads that met all three of the following conditions: they were made from 3 or more paired-end reads, the primer was on the reverse strand, and some reads, but fewer than 50% of them, contained adapter read-through.
- The “Minimum frequency” option presented when launching Ready-to-Use QIAseq DNA workflows, either directly or via the Analyze QIAseq Panels guide, now shows a minimum frequency of 0.50% instead of 0.25%. This does not affect the behavior of the workflows, as downstream variant calling was already configured to detect variants down to 0.5% frequency.
- The tool Remove and Annotate with Unique Molecular Index has been optimized to take less time to run. The benefits are most apparent on paired-end data when the “Trim read-through common sequence and UMI” option is used.
- Reads in read mappings now retain their names after being processed by tools in the QIAseq DNA Panel Expert Tools folder.
- Trim Primers of Mapped Reads is slightly faster when working on IonTorrent data.
- Fixed an issue where batch units that included more than one sequence list could not be set up when using the Analyze QIAseq Panels guide. Previously, when the Batch option was enabled, and a folder was selected as the batch unit, each sequence list within that folder was analyzed independently.
- Various minor improvements
Biomedical Genomics Analysis Plugin 1.1
Released on April 3, 2019
- An analysis pipeline for analyzing Tumor Mutational Burden (TMB). This includes a tool called Calculate TMB Score and associated workflows for both Illumina and Ion Torrent that take advantage of a new reference data set: QIAseq TMB Panels hg38.
- Beta tools and an associated workflow for analyzing Microsatellite Instability (MSI), with two MSI baseline tracks available from the Workbench’s Reference Data Manager.
- A QIAseq miRNA analysis pipeline that includes four new tools and two workflows that access new reference data: QIAseq Small RNA. The new tools are compatible with the existing RNA-Seq Analysis visualization tools.
At the same time as this plugin release, a new reference data set is being released, and is available via the CLC Genomics Workbench Reference Data Manager: hg38 no alternative reference set. This data set includes scaffolds and a virus decoy sequence for improved performance.
General changes, bugfixes and improvements
QIAseq Panel Analysis
- The Import QIAGEN Primers tool can now also import RNAscan primers.
- The Ion Torrent reads importer accessed via the Analyze QIAseq Panel Analysis tool now annotates imported reads with sequencing platform information.
- The performance of the Create UMI Reads tool has been improved, with this being particularly relevant to systems with slow I/O.
- The performance of the Calculate Unique Molecular Index Groups tool has been improved, as has the organization of the options in the wizard.
- In the Identify QIAseq DNA Variants workflows there is now an option to configure the genetic code to use. This is handled automatically when workflows are launched via the QIAseq Panel Analysis guide.
- Fixed a bug in Quantify QIAseq RNA Expression where the log would report an incorrect percentage of “reads rejected” and “reads accepted”. The error only occurred when the total number of reads exceeded 20M.
Trim Primers of Mapped Reads
- The tool was updated to include options for removing mispriming events and artifacts due to pseudogenes.
- The option “Additional bases to trim” has been renamed to “Additional bases to unalign”, and while it was previously available only for trimming single reads, it is now also applied to paired end reads if enabled.
- Fixed a bug in the Trim Primers of Mapped Reads tool that affected the Identify QIAseq DNA Variants Ready-to-Use workflow, where the tool would fail on panel DHS-105Z data with paired reads that spanned the origin of the MT chromosome. The tool now ignores (does not trim) such reads.
- Fixed a bug where the parameter “Maximum additional nucleotides” would be off by one nucleotide. This bug affected single forward reads with reverse primers.
- Minor improvements have been made to some tooltips.
Prepare Guidance Variant Track
- All Ready-To-Use workflows that have a Local Realignment step have been improved through the inclusion of the Prepare Guidance Variant Track tool to capture information about structural variants in the guidance track and ensure left-alignment of insertions and deletions.
- The guidance track generated now includes “Evidence”, “Repeat”, “Variant ratio”, “Sequence complexity” and “# Reads” annotations for variants that originate from the structural variants input.
- The tool no longer includes SNVs in the guidance track output.
- The tool is now found in the Resequencing Analysis folder of the Toolbox.
The Download Example Data functionality in the Help menu of the Biomedical Genomics Analysis plugin has been fixed.
Biomedical Genomics Analysis Plugin 1.0
First release, November 28, 2018
A CLC Genomics Workbench with the Biomedical Genomics Analysis plugin installed delivers the functionality previously provided by the Biomedical Genomics Workbench, the QIAseq Targeted Panel Analysis plugin and the QIAGEN GeneRead Panel Analysis Plugin.
Similarly, a CLC Genomics Server with the Biomedical Genomics Analysis Server Plugin installed delivers functionality previously provided using a CLC Genomics Server with a Biomedical Genomics Server Extension license, the QIAseq Targeted Panel Analysis Server Plugin and the QIAGEN GeneRead Panel Analysis Server Plugin.
The listing below is of changes to tools and workflows delivered with the Biomedical Genomics Analysis and Biomedical Genomics Analysis Server Plugin relative to the corresponding earlier functionality. Further changes relevant to Workbench- and Server-specific functionality can be found on the Latest Improvements pages for CLC Genomics Workbench 12.0 and CLC Genomics Server 11.0.
The information on those pages reflects the introduction into those products of some tools previously only available in the Biomedical Genomics Workbench and Biomedical-enabled CLC Genomics Servers. Thus some tools listed as new will not be new to users of the earlier Biomedical products.
Searching with the name of a Biomedical Genomics Workbench 5.x tool in the Launch tool of the CLC Genomics Workbench 12.0 will return the corresponding tool(s) of interest in the new Workbench. Workflow elements within Ready-to-Use Workflows have the same names as used in the equivalent workflows in Biomedical Genomics Workbench 5.0.1 wherever this was possible.
Changes relative to the functionality of Biomedical Genomics Workbench 5.0.1, QIAGEN GeneRead Panel Analysis 1.12 and QIAseq Targeted Panel Analysis 1.2
General changes and improvements
- An extended Reference Data Manager is available via the CLC Genomics Workbench, and is launched by clicking on the toolbar button labeled “References”. The functionality of the Biomedical Genomics Workbench reference data management tool is presented under the tabs labeled “QIAGEN Sets” and “Custom Sets”.
- When launching a workflow to analyze panel data, the wizard now offers the opportunity to select and, when needed, download the appropriate data set to the relevant reference data location.
- The precision of detection of insertions at positions directly after primers in the Trim Primers of Mapped Reads tool has been improved. This change may somewhat negatively affect the sensitivity of detection of insertions at positions right after primers, but does not affect the sensitivity of detection for other variant types.
- The Differential Expression for RNA-Seq tool, located in the RNA-Seq Analysis folder of the Workbench Toolbox, now includes the functionality formerly delivered by the Differential Expression for Targeted RNA-Seq tool of the QIAseq Targeted Panel Analysis plugin.
- The extension used when naming outputs of the Remove and Annotate with Unique Molecular Index tool has been changed to “RAUMI”. Previously it was “RAMB”.
Changes to Ready-to-Use Workflows
- Each Map Reads to Reference step present in Ready-to-Use workflows under the Whole Exome Sequencing folder is now followed by a Remove Duplicate Mapped Reads step.
- Workflows in the Whole Transcriptome Sequencing folder under Toolbox | Ready-To-Use Workflows designed for analyzing mouse or rat data are now under a single subfolder called Mouse and Rat. The relevant reference data can be selected when launching each workflow.
- The Differential Expression for Targeted RNA-Seq tool, previously distributed with the QIAseq Targeted Panel Analysis plugin, is no longer available. This functionality is now made available through the Differential Expression for RNA-Seq tool, located within the RNA-Seq folder of the CLC Genomics Workbench 12 Toolbox.
- The QIAGEN GeneRead Panel Analysis workflow has been updated to ensure the coverage report considers the full length of the amplicon, while maintaining appropriate trimming of reads. For this, the “Additional bases to trim” option of the Trim Primers and their Dimers from Mapped Reads tool was decreased from 2 to 0 and the Trim Reads tool was added, which includes automatic read-through adapter trimming.
- In the QIAGEN GeneRead Panel Analysis workflow the dbsnp annotation has been replaced with dbsnp common annotation.
- Fixed an issue with the Reference Data Manager where custom reference sets could not be made with a customized Gene Ontology element.
- The Prepare Overlapping Raw Data workflow is no longer available.
Trim Primers of Mapped Single Reads (legacy) and Trim Primers of Mapped Paired End Reads (legacy) will be retired in a future release.