Biomedical Genomics Analysis 21.1
Released on June 24, 2021
New features and improvements
Upload to QCI Interpret and QCI Interpret Translational
- Upload to QCI Interpret now also supports upload to QCI Interpret Translational.
- Easy authentication is available through a browser login.
- It is possible to upload a wider range of variants including CNVs, fusions, inversions, TMB and MSI.
- New tools designed for workflows support bundling of files and upload in separate steps. Prepare QCI Interpret Upload bundles CLC elements into a report and Upload Prepared QCI Interpret Report uploads the resulting report.
- All tools are server- and workflow-enabled to allow easy automation of upload.
Structural Variant Caller
- The runtime of Structural Variant Caller has been improved through optimisation of calculation methods and underlying code. Runtime improvements primarily affect WGS analysis and/or analysis of samples with many broken pair reads.
- The calculation of the consensus sequence for breakpoints has been refined. This is particularly relevant for samples where multiple breakpoints are located close to each other.
- RNA-seq reads are accepted as input. Reads spanning two exons are considered two individual reads.
- It is possible to ignore broken reads. Choosing this option can improve speed but may reduce sensitivity.
- When running in targeted mode, variants where breakpoints are located in different target regions are detected.
RNA-seq read type improvements (QIAseq RNA Fusion XP)
SARS-CoV-2 workflow and reference data
In the following QIAseq workflows Structural Variant Caller has been configured with target regions to improve speed:
In the following WGS workflows the new setting Ignore broken pairs in Structural Variant Caller has been enabled to improve speed:
- Fixed an issue in Extract Reads Matching Primers so that detection of primers now works when primers are situated at the end of a read.
- Fixed sample names in summary report output from the SARS-CoV-2 workflows.
- Fixed an issue where the fusion plot column was missing from the fusion tracks generated by Detect and Refine Fusion Genes.
- Fixed an issue in Create UMI Reads from Reads where if “Enable advanced settings” was enabled and one or more of the advanced settings was altered, and “Enable advanced settings” was then disabled again, the altered settings would still apply. The tool has been updated, so that if “Enable advanced settings” is disabled, default settings will now be used for all advanced settings.
- Fixed an issue causing Structural Variant Caller to fail when reads near the ends of chromosomes had unaligned ends.
- Fixed an issue affecting Structural Variant Caller that caused an error when unaligned ends spanned the junction of a circular reference sequence.
- Fixed an issue in variant tracks created by the Structural Variant Caller that caused heterozygous variants to be annotated as homozygous in exported VCF files.
- Fixed an issue in CNV and LOH Detection, where setting “Minimal purity” to a value higher than 0.7 and enabling “Normalize coverage using allele frequencies” would cause the tool to fail.
- Fixed an issue in CNV and LOH Detection, where features in the same position could have different predicted ploidy states.
- Fixed an issue affecting Target Region Coverage Analysis where an error occurred if inputs had different chromosome names but compatible genomes.
- Fixed an issue in the Create UMI Reads from Grouped Reads report where one of two identical plots showing “Quality scores of all UMI read nucleotides” has now been removed.
- Fixed an issue that caused Create UMI Reads from Reads to fail if highly dissimilar reads were grouped. The bug was rarely seen but could occur with lenient settings for grouping reads. It has not been observed with the present default settings for Create UMI Reads from Reads.
- Replaced Create Sample Report with Combine Reports in the workflow Identify QIAseq Exome Causal Inherited Variants in Trio to fix an issue causing some summary statistics to be missing.
- Various minor bug fixes.
Biomedical Genomics Analysis Plugin 21.0.1
Released on February 3, 2021
- Fixed an issue in all ’Identify Variants’ workflows where the Target regions workflow input was locked to a workflow role not part of the main reference data sets (e.g. hg38 (refseq), hg19 (ensembl), etc.). Affected workflows:
Biomedical Genomics Analysis 21.0
Released on January 12, 2021
Detect and Refine Fusion Genes
Detect and Refine Fusion Genes detects potential fusions in RNA data and then accumulates and evaluates the evidence for each potential fusion.
This tool replaces the Detect Fusion Genes and the Refine Fusion Genes tools, which have been moved to the Legacy section of the Toolbox. We recommend that workflows containing the legacy tools are updated to include Detect and Refine Fusion Genes instead.
The following ready-to-use workflows have been updated to use Detect and Refine Fusion Genes in place of the two legacy tools, which they used previously:
- Perform QIAseq RNA Fusion XP Analysis
- Detect QIAseq RNAscan Fusions
- Perform QIAseq Multimodal Analysis (Illumina)
- Perform QIAseq Multimodal Analysis with TMB and MSI (Illumina)
Identify Mispriming Events
Identify Mispriming Events produces a track of potential mispriming events, i.e. events where reads were amplified from a region of the genome other than the intended target region. Using this track as input to Trim Primers of Mapped Reads, reads originating from mispriming events can be removed from mappings, decreasing the potential for false positive variants to be called in downstream analysis steps. This track can also be used to evaluate the quality of a set of primers.
Improvements to workflows
- The mismatch cost in the Map Reads to Reference element in ready-to-use workflows provided with this plugin has been increased from 2 to 4 to better reflect current sequencing quality.
- The post-filtering of variants with “high UMI evidence” in the QIAseq ready-to-use workflows has been relaxed: fewer variants are considered as having “high UMI evidence”, and the Average Quality cut-off for these has been reduced to 38.
- The Indels and SV tool has been replaced by the Structural Variant Caller in all ready-to-use workflows. In addition to being used for guiding the local realignment, the indel track produced by the Structural Variant Caller is output as part of the results of the workflows. The QIAseq targeted DNA workflows further produce an inversion and a long indel track among the results; these are also generated by the Structural Variant Caller.
- Indel tracks generated by Structural Variant Caller elements in ready-to-use workflows have been updated to include gene information.
- The Identify QIAseq DNA Somatic Variants ready-to-use workflows include detection of internal tandem duplications, including FLT3.
- All ready-to-use workflows now use the name “Genome Browser View” for the Track Lists generated.
- QC for RNAscan Panels can make use of primer tracks imported from BED format files.
- Convert Annotation Track Coordinates includes remapped positions from second pass alignments.
- Reports generated by the following tools can be used when defining QC metrics in the new Create Sample Report tool of the QIAGEN CLC Genomics Workbench:
- The plots in the Loss-of-heterozygosity section of the CNV and LOH Detection report use consistent colors, and include lines that indicate the expected allele frequencies and coverage ratios for each state. The colors in the LOH-track match the colors used in the LOH plot.
- Various minor improvements
- Create UMI Reads for miRNA accepts only sequence lists as input. Previously a single sequence could be input, but the tool would then fail with an error.
- Fixed an issue affecting Create UMI Reads from Reads, where if the “Read structure” option was set to “Paired end reads (discard read 2)” and the read list generated had an odd number of reads, a downstream Trim Reads job would fail with an error.
- Fixed an issue in the Structural Variant Caller that caused some inversion calls to be incorrectly classified as tandem duplications.
- Fixed an issue in the Refine Fusion Genes (legacy) tool where the number of wild type supporting reads was under-counted if the genes involved had other fusion partners.
- Various minor bug fixes
The following tools have been moved to the QIAGEN CLC Genomics Workbench 21.0:
- Extract IsomiR Counts
- Annotate with Repeat and Homopolymer Information
- Merge Variant Tracks
- Trim Primers of Mapped Single Reads (legacy)
- Trim Primers of Mapped Paired Reads (legacy)
- The VCF (Biomedical) exporter. Functionality of this exporter is available in the VCF exporter provided by the QIAGEN CLC Genomics Workbench 21.0 and higher.
- Identify QIAseq DNA Somatic Variants with FLT3 Identification (Illumina) FLT3 detection is included in the Identify QIAseq DNA Somatic Variants workflows.
- Detect Fusion Genes (legacy) and Refine Fusion Genes (legacy) have been moved to the legacy folder in the Toolbox. We recommend using the new Detect and Refine Fusion Genes tool.
Biomedical Genomics Analysis Plugin 20.2.1
Released on February 3, 2021
Biomedical Genomics Analysis Plugin 20.2
Released on November 26, 2020
New ready-to-use workflows
New workflows are available under the Ready-to-Use Workflows area of the Toolbox:
- SARS-CoV-2 Workflows Under this folder are two workflows:
- TSO500 Panel Analysis Under this folder are two workflows for analysis of the TruSight Oncology 500 Illumina bundle, one for DNA analysis and one for RNA analysis.
- QIASeq Panel Analysis | QIASeq Analysis Workflows. Two new workflows have been added:
- Identify QIAseq DNA Somatic and Germline Variants from Tumor Normal Pair (Illumina) This workflow analyses matched tumor normal samples and reports somatic and germline variants in addition to low frequency somatic variants that are also present in the normal sample. It is useful for analysis of driver mutations and we recommend using it in combination with QIAGEN Clinical Insight (QCI) Interpret.
- Perform QIAseq Multimodal Analysis with TMB and MSI (Illumina) This workflow is for analysis of the multimodal Pan Cancer panel (UHS-5000Z). It reports filtered variants, TMB score, microsatellite instability status and fusion genes. When control samples are provided, CNVs are also reported. The Perform QIAseq Multimodal Analysis with TMB and MSI (Illumina) workflow can also be launched using the Analyze QIAseq Panels guide, where it is located under the tab labeled Multimodal TMB/MSI.
- Structural Variant Caller Detects indels, tandem duplications and inversions in somatic and germline WGS and targeted applications. This tool improves upon the detection of tandem duplications and inversions offered by existing tools in the CLC Genomics Workbench, while the InDels track produced can be used as a guidance track in the Local Realignment tool and is expected to provide similar results to outputs of the InDels and Structural Variants tool used for that purpose. If you have been using the Advanced Structural Variant Detection (beta) tool, provided in a separate plugin, we recommend instead using this tool, which has been designed to replace it.
- Compare Immune Repertoires Compares T-cell receptor repertoires identified by Immune Repertoire Analysis and generates a report containing comparisons of read composition, diversity indices, rarefaction, CDR3 length and V/J usage. A heat-map can also be produced based on the Jaccard Similarity between the samples and a similarity table.
- Extract Reads Matching Primers Extracts reads that match a primer and discards reads that do not match a primer, especially useful for cleaning up RNA sequence data to decrease false positives when detecting fusion genes and before generating targeted gene expression matrices.
- Refine Read Mapping Removes potentially problematic mapped reads. Options are provided for removing reads based on the number of SNPs in a certain window and/or removing reads with unaligned ends of a certain length.
- Target Region Coverage Analysis Combines per-region statistics tracks produced by QC for Targeted Sequencing across samples and assesses quality based on different coverage metrics. Based on adjustable thresholds, targets with low coverage are flagged in the output track.
- CNV and LOH Detection Detects copy number variations (CNVs) and loss-of-heterozygosity (LOH) from targeted resequencing experiments. This tool expands upon the existing Copy Number Variant (CNV) Detection tool.
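The Jaccard similarity that Compare Immune Repertoires uses for its heat map can be illustrated with a small Python sketch. It treats each sample as a set of clonotype labels and computes the size of the intersection divided by the size of the union. How the tool itself defines and weights clonotypes is not described in these notes, so the set-based representation, the function name and the example sequences are all assumptions made for illustration.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of clonotype labels."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Two empty repertoires: the ratio is undefined, so this sketch returns 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical clonotypes, labelled by made-up CDR3 amino acid sequences.
sample_1 = {"CASSLGQAYEQYF", "CASSPGTGELFF", "CASRDRGNTIYF"}
sample_2 = {"CASSLGQAYEQYF", "CASSPGTGELFF", "CASSQETQYF"}

similarity = jaccard_similarity(sample_1, sample_2)
print(f"{similarity:.2f}")  # two shared clonotypes out of four distinct ones: 0.50
```

A value of 1 means the two samples contain exactly the same clonotypes, and 0 means they share none.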
- Two new tabs have been added to the Analyze QIAseq Panels guide:
- RNA Fusion XP Analysis of seven catalog RNA FUSION XP panels can be launched from this tab. Primer information for the catalog panels is available in the QIAseq RNA Fusion XP Panels hg38 QIAGEN set in the Reference Data Manager.
- Immune The Perform QIAseq Immune Repertoire Analysis workflow for Human and for Mouse, can be launched from this tab.
- Perform QIAseq RNA Fusion XP Analysis filters have been adjusted and are now able to detect variants at a lower coverage which improves the sensitivity.
- Perform QIAseq RNA Fusion XP Analysis and Perform QIAseq Multimodal Analysis workflows now include Extract Reads Matching Primers, which reduces false positive fusion calls and improves the targeted expression matrix produced.
- Improved handling of inversions by the VCF (biomedical) exporter.
- Trim Primers and their Dimers of Mapped Reads now includes an option to trim the primers of amplicon fragments.
- TMB score thresholds are no longer set by default in the Identify TMB Status workflow. It is now possible to set thresholds manually.
- Short read removal in Trim Primers of Mapped Reads now removes short reads based on alignment length and not just read length.
- The QC for RNAscan Panels report now accounts for missing validation genes. The symbol “-” is used to denote undefined fraction values (those where the denominator is 0). Previously, these values were reported as 0.
- The QC for RNAscan Panels report can now be used as input to the Combine Reports tool.
- The Detect QIAseq RNAscan Fusions workflow and the Perform QIAseq RNA Fusion XP Analysis workflows now include the QC for RNAscan Panels report in the combined report they generate. The Perform QIAseq Multimodal Analysis (Illumina) workflow no longer outputs the QC for RNAscan Panels report, which is thus no longer included in the combined report generated.
- The length of a microsatellite locus read in Generate MSI Baseline and Detect MSI Status now depends not only on identification of flanking region markers, but is also assessed by comparing nucleotide composition in the read against the true locus nucleotide composition. Only reads that match the same composition as the microsatellite are kept. The two reference data baseline elements that are part of the QIAseq TMB Panels hg38 reference data set have been updated with this change.
- The manual information for the Identify QIAseq DNA Somatic Variants with FLT3 Identification workflow has been updated to clarify that it has been optimized to detect FLT3 internal tandem duplications, and this optimization may impact the ability to detect other variants.
- Various minor improvements
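The change to the QC for RNAscan Panels report noted above, where undefined fractions are shown as “-” rather than 0, amounts to distinguishing "nothing to measure" from "measured zero". A minimal Python sketch of that convention follows. The function name and the number of decimals are assumptions for illustration and are not taken from the tool.

```python
def format_fraction(numerator, denominator, digits=3):
    """Format numerator/denominator, using "-" when the fraction is undefined."""
    if denominator == 0:
        # No observations at all: the fraction is undefined, not zero.
        return "-"
    return f"{numerator / denominator:.{digits}f}"


print(format_fraction(0, 0))    # -
print(format_fraction(0, 25))   # 0.000
print(format_fraction(5, 20))   # 0.250
```

Reporting “-” keeps a missing validation gene from being mistaken for a gene that was assayed and had a fraction of zero.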
Reference data bundles
New datasets supporting the new workflows are available under the QIAGEN Sets tab of the Reference Data Manager:
- AmpliSeq SARS-CoV-2 Uses MN908947.3 as reference genome
- QIAseq SARS-CoV-2 Uses MN908947.3 as reference genome
- TSO500 hg38 Uses hg38_no_alt_analysis_set and RefSeq annotations. The bundle includes all reference data types needed for the analysis of both the RNA and DNA parts of TSO500
- QIAseq Multimodal Pan Cancer hg38 Uses hg38_no_alt_analysis_set and RefSeq annotations. The dataset has been custom made to fit analysis of the Pan Cancer panel UHS-5000Z
- QIAseq DNA Panel hg19 Updated with mispriming events tracks
- QIAseq DNA Panel hg19 (RefSeq) Supports analysis of the QIAseq DNA Panels using RefSeq annotations
- Fixed an issue in Remove and Annotate with Unique Molecular Index where the number of reads reported in the report header was incorrect.
- Fixed an issue that caused an error to be thrown by the Trim Primers of Mapped Reads tool when unaligning primers in RNA reads spanning an intron.
- The VCF (biomedical) exporter now supports exports of tracks based on compatible genomes even if the chromosome names differ. Previously, an empty VCF file was generated if the chromosome names differed, even if the tracks used as input were compatible.
- Fixed an issue introduced in Biomedical Genomics Analysis 20.0 where under particular circumstances, when the Remove and Annotate with Unique Molecular Index tool was run on a CLC Genomics Server and other tools were being run on the same execution node or single server at the same time, an empty annotated sequence list could be generated.
- Various minor bug fixes
Biomedical Genomics Analysis Plugin 20.1.1
Released on August , 2020
- Empty fusion tracks can now be handled by the VCF (Biomedical) exporter.
- The average quality value in the “Low average quality variants” filter has been lowered from 44 to 41.5 in the workflows listed below. This change only affects the filtering of variants with high UMI evidence, and is intended to decrease the chance of filtering out true positive variants from the results.
- Fixed a bug in Detect Fusion Genes that caused no fusion genes to be reported if reference data was used where the link between annotations on mRNA and gene tracks could not be established. This affects reference data imported from GTF format, but not reference data imported from GFF3 format files. When using the CLC Genomics Workbench Reference Data Manager to download reference data, this affected results using reference data from Ensembl, but not reference data downloaded from RefSeq.
- Fixed a bug affecting the Perform QIAseq Multimodal Analysis (Illumina) and Perform QIAseq RNA Fusion XP Analysis ready-to-use workflows, where all non-reference alleles of frequencies below 3% were wrongly filtered away.
Biomedical Genomics Analysis Plugin 20.1
Released on June 29, 2020
New QIAseq ready-to-use workflows
After plugin installation, these new workflows can be found in the CLC Genomics Workbench Toolbox under Ready-to-Use Workflows | QIASeq Panel Analysis | QIASeq Analysis Workflows. They can also be launched using the Analyze QIAseq Panels guide.
These workflows offer some benefits over the workflows provided in the Whole Exome Sequencing folder for trio and germline variant detection, including incorporating read trimming as an initial step in the workflow, and retaining QUAL scores in the variant track produced. In addition, the Identify QIAseq Exome Germline Variants workflow provides the option to detect CNVs. The Create QIAseq Exome CNV Control Mapping workflow can be used to generate control mappings suitable for use when running CNV detection on exome panel data.
These workflows can be launched from under the new Exome tab of the Analyze QIAseq Panels guide.
- Perform QIAseq Immune Repertoire Analysis This workflow is aimed at analysis of data generated using the QIAseq Immune Repertoire Panel. The reference data for the analysis of human and mouse TCR samples from the IMHS-001Z and IMMM-001Z panels is not yet available, but work is ongoing to release it. If you would like to register your interest in using this analysis area, please contact our support team by emailing email@example.com.
- Identify QIAseq DNA Somatic Variants with FLT3 Identification (Illumina) Identify somatic variants and structural variants in FLT3 in QIAseq targeted DNA panel data. This workflow can be launched from under the Targeted DNA tab of the Analyze QIAseq Panels guide, when choosing to analyze data from the Myeloid Neoplasms Panel (DHS-003Z) or Comprehensive Cancer Panel (DHS-3501Z).
- Create QIASeq DNA CNV Control Mapping (Illumina/Ion Torrent) These workflows can be used to generate control mappings suitable for use when running CNV detection when analyzing QIAseq targeted DNA panel data. They include the QC for Targeted Sequencing tool, for easy evaluation of sample coverage and quality. The workflows can be launched from under the Targeted DNA tab of the Analyze QIAseq Panels guide, using options available in the drop-down menu for each supported panel type.
Other ready-to-use workflows
- Annotate Variants with Effect Scores (WES) This workflow adds annotations with effect scores to SNVs in variant tracks. Effect scores indicate the impact of a mutation on the gene or transcript. This workflow can be found in the Toolbox under Ready-to-Use Workflows | Whole Exome Sequencing (WES).
- Create Methylation Level Heat Map Hierarchically clusters samples and features, generating a two dimensional heat map of methylation levels, using methylation levels tracks as input.
- Merge Variant Tracks Merges multiple variant tracks into a single variant track.
- Annotate with Repeat and Homopolymer Information Annotates variants with repeat and homopolymer information, based on the variant itself and the genome sequence flanking it.
- Annotate with Effect Scores Annotates SNV variants with precomputed effect scores, which indicate the level of impact a mutation has on the gene or transcript.
- Extract IsomiR Counts Extracts IsomiR composition and count information from each underlying miRNA alignment of the “grouped on mature” expression table output by the Quantify miRNA tool, and produces a table containing that information.
- Convert Annotation Track Coordinates Converts annotation coordinates, either from hg19 coordinates to hg38 coordinates, or vice versa, making use of the NCBI Remapping Service.
- Annotate Structural Variants Estimates count, coverage and frequency information for indels detected by the InDels and Structural Variants tool and generates a variant track containing the original variants with these annotations added.
- Immune Repertoire Analysis Analyzes RNA data to characterize the T cell receptor repertoire.
Extended support for paired reads, including reads generated by duplex TruSight Oncology, Illumina (TSO) UMI protocol
- The following tools are affected by this improvement:
- Remove and Annotate with Unique Molecular Index, where the read structure can be specified as single end, paired end or duplex, and where for paired end, the read the index is on can be specified.
- Create UMI Reads from Reads, where the read structure can be specified as single end, paired end or duplex, and for paired end reads, there is now an option for not discarding read 2.
- All QIAseq ready-to-use workflows have been updated to support duplex reads.
- Single reads must now be designated as such when launching these tools and updated workflows.
Other QIAseq Panel Analysis workflow improvements
- Two new options are available under the UPX 3′ RNA tab of the Analyze QIAseq Panels guide to support simple case vs. control analysis of differentially expressed genes: “Human Detect Differentially Expressed Genes Between Two Groups” and “Mouse Detect Differentially Expressed Genes Between Two Groups”. These options launch the “Human Identify and Annotate Differentially Expressed Genes and Pathways” and “Mouse Identify and Annotate Differentially Expressed Genes and Pathways” workflows, respectively.
- When configuring Demultiplex Reads, launched from the UPX 3′ RNA tab of the Analyze QIAseq Panels guide, the selection of the most probable wells in the well selection preview is now based on information taken from across all the input sequence lists, instead of just the first. Demultiplex Reads was formerly named “Demultiplex”.
- Options specifying the input data as Illumina (Index on Read 2) or Ion Torrent are now present for each panel listed under the Targeted RNAScan tab of the Analyze QIAseq Panels guide.
- The Quantify QIAseq RNA Expression workflow produces a combined summary report per sample. These reports can be used by Combine Reports to make a summary report for
- The Detect Fusion Genes tool option “Minimum fusion read count” has been changed from 2 to 4 in the following workflows:
- The Perform QIAseq Multimodal Analysis (Illumina) and Perform QIAseq RNA Fusion XP Analysis ready-to-use workflows now include the new Annotate with Repeat and Homopolymer Information tool. Low frequency (<3%) repeat and homopolymer variants are now actively removed from the variants passing filters track.
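The removal of low frequency repeat and homopolymer variants described above can be pictured as a filter that needs both conditions to hold before a variant is discarded. The Python sketch below assumes a simplified variant record with a hypothetical boolean flag standing in for the annotation added by Annotate with Repeat and Homopolymer Information. The field names and example values are invented for illustration and do not reflect the workflow's internal implementation.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    position: int
    frequency: float             # allele frequency in percent
    repeat_or_homopolymer: bool  # hypothetical flag standing in for the tool's annotation


def passes_filter(variant, min_frequency=3.0):
    """Discard a variant only if it is BOTH low frequency AND in a repeat/homopolymer."""
    return not (variant.repeat_or_homopolymer and variant.frequency < min_frequency)


variants = [
    Variant(position=101, frequency=1.2, repeat_or_homopolymer=True),   # removed
    Variant(position=250, frequency=1.2, repeat_or_homopolymer=False),  # kept
    Variant(position=377, frequency=8.5, repeat_or_homopolymer=True),   # kept
]

kept = [v for v in variants if passes_filter(v)]
print([v.position for v in kept])  # [250, 377]
```

Low frequency variants outside repeat and homopolymer regions are retained, since frequency alone does not trigger removal in this sketch.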
Updated data sets are available under the QIAGEN Sets tab of the Reference Data Manager:
- Updated hg19 and hg38 (no alt analysis set) reference bundles, both RefSeq and Ensembl.
- Updated versions of dbSNP for both hg19 and hg38, under the Reference Data Elements area.
- A new reference data set, QIAseq Exome Panels hg38, intended for use with the new QIAseq Exome workflows.
- SIFT effect annotations have been added, and are intended for use with the new Add Effect Score to Variants (WES) workflow.
- The hg19 reference set used by default in Biomedical Ready-to-Use workflows is now based on Ensembl version 99.
Export and upload
Quantify QIAseq RNA
- Running time has been reduced. This tool is used in the Quantify QIAseq RNA Expression workflow, thus also affecting its running time.
- A report can now be generated. These reports can be compared across samples using Combine Reports.
Fusion gene detection
- The runtime of Detect Fusion Genes has been substantially improved for paired-end, whole transcriptome RNA-seq data.
- The Fusion Report generated by Refine Fusion Genes has been updated to include the sample name, and the chromosome and exon number of fusion genes detected.
- Evidence for backsplicing in exon skipping fusions, which was sometimes reported, has been removed, as it was not consistently reported.
Calculate TMB Score has been updated to only use exon regions for calculating the TMB score. For this, it requires a track containing the exon regions (typically, an mRNA track).
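To make the role of the exon track concrete, the following Python sketch computes a score as the number of variants falling inside exon regions per megabase of exon sequence. This formula is an assumption made for illustration. The notes above state only that exon regions are used, and the tool's variant filtering and exact normalization may differ. Coordinates are treated as 0-based, half-open intervals, and all names are hypothetical.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def tmb_score(variants, exons):
    """variants: iterable of (chrom, pos); exons: iterable of (chrom, start, end).

    Returns variants within exons per megabase of merged exon sequence
    (an assumed definition, used here for illustration only).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))
    merged = {chrom: merge_intervals(ivs) for chrom, ivs in by_chrom.items()}

    exon_bases = sum(end - start for ivs in merged.values() for start, end in ivs)
    if exon_bases == 0:
        raise ValueError("exon track covers no bases")

    in_exons = sum(
        1
        for chrom, pos in variants
        if any(start <= pos < end for start, end in merged.get(chrom, []))
    )
    return in_exons * 1_000_000 / exon_bases


# Overlapping exons on chr1 are merged so their bases are counted once.
exons = [("chr1", 100, 600), ("chr1", 500, 1100), ("chr2", 0, 1000)]
variants = [("chr1", 150), ("chr1", 2000), ("chr2", 999), ("chr3", 5)]
print(tmb_score(variants, exons))  # 2 variants over 2000 exon bases: 1000.0
```

Merging overlapping exons before summing their lengths avoids counting bases shared between transcripts twice, which would otherwise lower the score.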
- Fixed an issue that caused the Quantify QIASeq UPX 3′ workflow to fail if it was provided with multiple inputs.
- Fixed an issue that arose if workflows were launched in batch mode from the Analyze QIAseq Panels guide, where one Workflow Result Metadata table was created per batch unit, when a single Workflow Result Metadata table, listing results from all batch units, was expected.
- Fixed an issue where Detect Fusion Genes could fail with an error of the form “Empty region for geneA chrA positionA – geneB chrB positionB”. The error was most likely to be encountered when the “Detect fusions with novel exon boundaries” option was enabled, and “Maximum distance to known exon boundary” was set to a high value.
Tutorials for the Biomedical Genomics Analysis plugin are provided online. They will no longer be available from the Help | Plugin Tutorials menu of the CLC Genomics Workbench. The online tutorials can be reached using the Help | Online Tutorials menu option.
Biomedical Genomics Analysis Plugin 20.0.1
Released on March 10, 2020
- Fixed an issue where reference alleles were being filtered out from variant tracks produced by Trio Analysis and Family of Four Ready-to-Use Workflows, which then led to incorrect coverage values in VCF files exported using these variant tracks. This issue was introduced in Biomedical Genomics Analysis 20.0.
- Fixed an issue in Refine Fusion Genes, where a given read could be counted as supporting more than one fusion if those fusions had breakpoints lying very close together. Each read can now only support one fusion model.
- Fixed an issue affecting the Identify Variants (WGS-HD, WES-HD and TAS-HD) Ready-to-Use Workflows where all reference alleles were being filtered away. This was addressed by removing the Remove Reference Variant (Legacy) tool from these workflows.
- Fixed an issue affecting the Identify Somatic Variants in Tumor Normal pairs (WGS, WES and TAS) Ready-to-Use Workflows where all reference alleles were being filtered away. In these workflows the Remove Homozygous Reference Variants tool has replaced the Remove Reference Variant (Legacy) tool, so only orphan reference alleles are removed.
- Fixed a bug where fusions could be incorrectly annotated as known or incorrectly annotated as unknown by Annotate Fusions with Known Fusion Information and Refine Fusion Genes in cases where multiple sets of breakpoints were detected for a given pair of genes. Where only one pair of breakpoints between two genes was identified, fusion annotation was not affected.
Fusion gene pipeline improvements
Various other minor improvements
Biomedical Genomics Analysis Plugin 20.0
Released on December 11, 2019
QIAseq Panel Analysis
- Perform QIAseq Multimodal Analysis for analyzing QIAseq DNA and RNA Multimodal Panel data. This workflow performs somatic variant calling on DNA down to a frequency of 0.5% and uses RNA for detecting fusion genes, exon skipping, and gene expression. A reference data set called QIAseq Multimodal Panels hg38 is available for analysis with the reference sequence hg38 no alt analysis set. The catalog panels UHS-005Z, UHS-006Z and UHS-009Z can be run from the Multimodal tab in the Analyze QIAseq Panels guide. Custom panels can similarly be configured for quick execution as long as DNA primers and target regions have been lifted to the hg38 genome before import. When running the stand-alone workflow, it is possible to call CNVs if control mappings are supplied.
- Perform QIAseq RNA Fusion XP Analysis for analyzing QIAseq Fusion XP Panel data. This workflow supports variant calling, fusion detection and expression analysis.
- Detect QIAseq MSI Status with Baseline Creation for a combined analysis of MSI status and baseline creation. This workflow can be used for assessing QIAseq MSI from custom or catalog panels boosted with MSI primers or the booster MSI panel as standalone. It can be configured with the QIAseq TMB panel hg38, or a custom reference dataset for hg19 can be constructed from single reference elements, including the newly added MSI loci track for hg19. The manual includes a description of how to quickly map the samples for the baseline.
An additional VCF exporter, VCF (Biomedical), is now available. This new exporter extends the functionality of the standard VCF exporter by supporting the export of Copy Number Variants (CNV) Tracks (Target, Region and Gene) and Refined Fusion Genes (WT) Tracks, in addition to the export of other variant tracks. VCF files from both exporters are compatible with upload to QCI Interpret.
Fusion gene detection
Detect Fusion Genes
- Detect Fusion Genes now reports fusions with breakpoints that are not close to exon boundaries, such as fusions into exons or introns. This change leads to a wider variety of fusions being detected with a corresponding cost to fusion specificity. The parameter “Maximum distance to exon boundary” now has a slightly different meaning: reads with breakpoints further away than this distance are considered as candidates for fusions into exons or introns.
- The number of fusion partners for a gene now only counts fusions to the same partner once even if there are multiple fusion breakpoints. If the promiscuity threshold is exceeded the top fusions for the gene are selected instead of discarding all fusions that include the gene in question.
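The partner-counting change can be sketched in Python as follows. Several breakpoints to the same partner count as one partner, and when the promiscuity threshold is exceeded, the fusions to the best-supported partners are kept rather than all fusions for the gene being dropped. Ranking partners by supporting read count, the `top_n` parameter, the tuple layout and the placeholder gene names are assumptions for illustration. These notes do not specify how the tool selects the top fusions.

```python
from collections import defaultdict


def select_fusions_for_gene(gene, fusions, promiscuity_threshold, top_n):
    """fusions: list of (gene_a, gene_b, breakpoint_label, supporting_reads)."""
    best_reads_per_partner = defaultdict(int)
    involving = []
    for fusion in fusions:
        gene_a, gene_b, _, reads = fusion
        if gene not in (gene_a, gene_b):
            continue
        partner = gene_b if gene_a == gene else gene_a
        involving.append((partner, fusion))
        # Several breakpoints to the same partner count as a single partner.
        best_reads_per_partner[partner] = max(best_reads_per_partner[partner], reads)

    if len(best_reads_per_partner) <= promiscuity_threshold:
        return [fusion for _, fusion in involving]

    # Threshold exceeded: keep the best-supported partners (assumed ranking)
    # instead of discarding every fusion that includes the gene.
    ranked = sorted(best_reads_per_partner, key=lambda p: (-best_reads_per_partner[p], p))
    top_partners = set(ranked[:top_n])
    return [fusion for partner, fusion in involving if partner in top_partners]


fusions = [
    ("GENE1", "GENE2", "bp1", 40),
    ("GENE1", "GENE2", "bp2", 12),  # same partner, second breakpoint
    ("GENE1", "GENE3", "bp1", 5),
    ("GENE4", "GENE1", "bp1", 3),
    ("GENE5", "GENE6", "bp1", 9),   # does not involve GENE1
]

# GENE1 has three distinct partners, which exceeds a threshold of 2.
print(select_fusions_for_gene("GENE1", fusions, promiscuity_threshold=2, top_n=1))
```

With these inputs, only the two GENE1/GENE2 fusions are returned. Raising the threshold to 3 would return all four fusions that involve GENE1.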
Refine Fusion Genes
- A report is now generated by Refine Fusion Genes. The report includes ‘QIMERA’ fusion plots and summary tables for each fusion passing all filters.
- Publication-ready quality plots can be opened for export by double-clicking on the plot in the report or by clicking on links in the fusion gene tracks produced by the tool.
- Reads that map ambiguously and cross a fusion breakpoint are no longer included when counting fusion crossing reads. This reduces the number of false positives from pseudogenes.
- Detect Differentially Methylated Regions: Launch this tool using the new option under the Targeted Methyl tab of the Analyze QIAseq Panels guide. This runs the Call Methylation Levels tool of the CLC Genomics Workbench with parameters optimized for QIAseq Targeted Methyl data.
- Create UMI Reads from Reads: Available under the QIAseq DNA Panel Expert Tools folder, this tool identifies and merges reads with similar UMIs (up to one mismatch) that likely originate from the same DNA/RNA molecule without the need to first map the reads to a reference genome. It is particularly useful for spliced reads, such as observed when working with RNA-Seq data, and is preconfigured with default values relevant for working with such data. The tool has an advanced settings dialog where hashing, grouping and merging parameters can be defined. It can be used on both single-end and paired-end reads, but will only include R1 in the output reads list in the case of paired reads. The tool is capable of processing tens of millions of reads quickly when using only 8 GB of RAM.
- Annotate RNA Variants: Available under the QIAseq DNA Panel Expert Tools folder, this tool adds annotations to variants that appear to represent RNA changes rather than DNA changes. These annotations can be used when analyzing variants called from RNA data, or to remove low-frequency variants from DNA that has been sequenced together with an RNA sample such that index hopping may have occurred.
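To illustrate the idea behind Create UMI Reads from Reads, grouping reads whose UMIs differ by at most one mismatch without mapping them first, here is a minimal sketch. It is not the tool's algorithm: the greedy strategy (most abundant UMI first) is an assumption, it looks only at the UMI and ignores the read sequence, and it does not model the hashing, grouping and merging parameters mentioned above.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions; UMIs of unequal length never match."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, max_mismatches=1):
    """Greedily group UMIs around the most abundant representatives.

    Returns a dict mapping each representative UMI to the list of
    observed UMIs assigned to it.
    """
    counts = Counter(umis)
    representatives = []
    assignment = {}

    # Visit UMIs from most to least abundant so that likely error-free
    # UMIs become representatives before their error-containing variants.
    for umi, _ in counts.most_common():
        for rep in representatives:
            if hamming(umi, rep) <= max_mismatches:
                assignment[umi] = rep
                break
        else:
            representatives.append(umi)
            assignment[umi] = umi

    groups = {}
    for umi in umis:
        groups.setdefault(assignment[umi], []).append(umi)
    return groups
```

Each resulting group would correspond to one candidate UMI read. A real implementation would also compare the read sequences before merging.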
Reports generated by the following tools provided by Biomedical Genomics Analysis 20.0.0 can be used with the new Combine Reports tool of the CLC Genomics Workbench:
- Remove and Annotate with Unique Molecular Index
- Calculate Unique Molecular Index Groups
- Create UMI Reads from Grouped Reads
- Create UMI Reads from Reads
- Remove Ligation Artifacts
- Detect Fusion Genes
- Calculate TMB score
- Detect MSI status
Improvements and bugfixes
UPX 3′ Analysis
The Demultiplex tool for UPX 3′ (Demultiplex 3′) reads in the panel guide has been improved:
- The wells used in an experimental setup can be selected as an initial step when launching the tool.
- On launch, the tool will determine the likely plate size, platform and which wells were populated. This information can then be adjusted manually if necessary before the analysis is run.
The Quantify QIAseq UPX 3′ workflow has been updated. The workflow now:
- Runs more quickly per sample due to use of the Create UMI Reads from Reads tool
- Produces a Combined Report summarizing key QC values
- Allows for spike-in controls to be used
- Counts ambiguously mapped reads towards total expression
- Sets the new RNA-Seq Analysis option “Library type setting” to “3′ sequencing”, providing better estimates of TPM for this application.
QIAseq Panel Analysis
Detect QIAseq RNAscan Fusions
The Detect QIAseq RNAscan Fusions workflow has been optimized to maintain specificity after improvements to the fusion detection tools. The main changes are:
- An additional homopolymer trimming step has been added.
- The Detect Fusion Genes parameter “Promiscuity” has been reduced from 20 to 8.
- The Refine Fusion Genes parameter “Breakpoint distance” has been increased from 10 to 25.
- The output folder structure has been improved.
Trim Primers of Mapped Reads
- Trim Primers of Mapped Reads is now multi-threaded, supporting much faster execution.
- The handling of broken pair reads has been changed. Broken pairs are kept in the output to help visualize genomic rearrangements, but the primer sequence is unaligned from reads detected to be R1.
- Primers can now be trimmed from the start of single-end reads, as well as from the ends.
- Spliced primers can now be trimmed from reads.
- The new Combine Reports tool of the CLC Genomics Workbench has been added to QIAseq Panel Analysis ready-to-use workflows, resulting in the generation of a combined QC report containing QC information from supported reports generated by other tools in the workflows.
- Outputs from Biomedical Ready-to-Use workflows now have the name of the first input element added as a suffix and spaces in output names have been replaced with underscores.
- All QIAseq DNA Variants (Illumina) workflows have been updated with a new definition of medium and high coverage variants (medium = 20-200 / high > 200) and these variants are now only tested for Read Direction Test probability. Filtering on Read Position Test Probability is no longer performed for these workflows.
- Filters have been updated in the ready-to-use workflows for identifying Rare Disease Causing Mutations and Causal Inherited Variants in trios as well as Family of Four (WGS, WES and TAS), to minimize the number of false positives.
- QIAseq DNA workflows now use the CLC Genomics Workbench tools Annotate with Overlap Information and Remove Homozygous Reference Variants in place of the retired Add Information from Overlapping Genes tool.
- Various minor improvements
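The medium and high coverage definitions given above for the QIAseq DNA Variants (Illumina) workflows (medium = 20-200, high > 200) can be expressed as a small classifier. This is a sketch for illustration: treating both 20 and 200 as medium is an assumption based on the ranges as written, and the label for coverage below 20 is hypothetical because the notes do not name that class.

```python
def coverage_class(coverage):
    """Classify a variant by read coverage.

    Assumptions: boundaries are inclusive for the medium class (20..200),
    and anything below 20 gets the placeholder label "below_medium".
    """
    if coverage > 200:
        return "high"
    if coverage >= 20:
        return "medium"
    return "below_medium"
```

For example, `coverage_class(200)` yields `"medium"` under these assumptions, while `coverage_class(201)` yields `"high"`.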
Biomedical Genomics Analysis Plugin 1.2.1
Released on August 15, 2019
- Fixed an issue that caused the ready-to-use workflow Identify and Annotate Variants (WES) to fail with the error “A single input object is required, encountered 2 inputs”.
- Fixed a rare issue in the Trim Primers of Mapped Reads tool that could arise when the “Mispriming events” option was enabled and at least one read in the read mapping used as input was aligned in a relatively unusual way.
- Various minor improvements
Biomedical Genomics Analysis Plugin 1.2
Released on June 27, 2019
QIAseq Panel Analysis
- The QIAseq Panel Analysis guide includes QIAseq 3′ UPX RNA analysis. The 3′ UPX RNA solution supports demultiplexing of the reads by cell ID, as well as downstream analysis of the resulting samples with the Quantify QIAseq UPX 3′ workflow. At the same time as this plugin release, new reference data sets are being released, and are available via the CLC Genomics Workbench Reference Data Manager: QIAseq UPX 3′ Panels hg19 and QIAseq UPX 3′ Panels hg38.
- The QIAseq Panel Analysis guide supports four QIAseq Targeted Methylation Panels with the Detect QIAseq Methylation workflow. At the same time as this plugin release, a new reference data set is being released, and is available via the CLC Genomics Workbench Reference Data Manager: QIAseq Methyl Panels hg38. To implement the Targeted Methylation application, the following changes were made:
QIAseq miRNA Analysis
- Quantify miRNA now allows the use of custom small RNA reference databases to map against. The tool will optionally output a sample grouped by the references in the custom database. The sample can be used with downstream analysis tools, e.g. Differential Expression, PCA for RNA-Seq and Create Heat-Map for RNA-Seq.
- A sequence list containing unmapped reads can now be output by the Quantify miRNA tool.
- Fixed a bug where the report generated by Create UMI Reads for miRNA would report maximum and minimum UMI group sizes as -1 if there were no UMI groups.
General changes, bugfixes and improvements
- The workflows Identify QIAseq DNA Somatic Variants with TMB Score (Illumina) and (Ion Torrent) now produce TMB scores calculated on the basis of target regions with coverage greater than 100x.
- The workflows Identify QIAseq DNA Somatic Variants with TMB Score now output the read mapping for MSI status detection before the Trim Primers of Mapped Reads step is executed. This improves the quality of the calling performed by Detect MSI Status (beta).
- TMB status is not automatically included in the TMB report any more, but can be added by enabling an option in the Calculate TMB Score tool. Once enabled, thresholds to use when determining TMB status can be adjusted.
- Detect MSI Status (beta) now outputs an annotation track containing information about loci.
- In Detect MSI Status (beta), the minimum read coverage required for a given locus to be considered testable can now be set using the “Minimum read count per locus” option. As a result, default parameters for the noise reduction threshold were adjusted from 10 to 5.
- Fixed a bug in the Interquartile range test of the Detect MSI Status (beta) tool, where all loci were previously identified as unstable. This would in turn result in all samples being identified as unstable.
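As a rough illustration of restricting the TMB calculation to target regions with coverage greater than 100x, the sketch below uses the common convention of expressing TMB as mutations per megabase. It does not reproduce the Calculate TMB Score tool: the input layout, the use of a single coverage value per region, and the function name are assumptions made for the example.

```python
def tmb_score(regions, variant_positions, min_coverage=100):
    """Estimate TMB as variants per megabase of sufficiently covered target.

    regions: list of (chrom, start, end, coverage) with half-open
        coordinates; the per-region coverage value is a hypothetical input.
    variant_positions: list of (chrom, position) for counted variants.
    Returns None when no region passes the coverage filter.
    """
    # Keep only target regions whose coverage is strictly above the cutoff.
    qualifying = [(c, s, e) for c, s, e, cov in regions if cov > min_coverage]
    total_bp = sum(e - s for _, s, e in qualifying)
    if total_bp == 0:
        return None

    # Count only variants that fall inside a qualifying region.
    n_variants = sum(
        1
        for chrom, pos in variant_positions
        if any(c == chrom and s <= pos < e for c, s, e in qualifying)
    )
    return n_variants / (total_bp / 1_000_000)
```

Variants in regions at or below the cutoff are excluded from both the numerator and the denominator, so poorly covered regions do not dilute the score.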
QIAseq Targeted DNA
- Fixed an issue in Create UMI Reads that led to the incorrect removal of some UMI reads from the dataset, which could then lead to false negatives in downstream variant calling. This issue affected UMI reads that met all three of the following conditions: they were made from 3 or more paired-end reads, the primer was on the reverse strand, and some reads, but fewer than 50% of them, contained adapter read-through.
- The “Minimum frequency” option presented when launching Ready-to-Use QIAseq DNA workflows, either directly or via the Analyze QIAseq Panels guide, now shows a minimum frequency of 0.50% instead of 0.25%. This does not affect the behavior of the workflows, as downstream variant calling was already configured to detect variants down to 0.5% frequency.
- The tool Remove and Annotate with Unique Molecular Index has been optimized to take less time to run. The benefits are most apparent on paired-end data when the “Trim read-through common sequence and UMI” option is used.
- Reads in read mappings now retain their names after being processed by tools in the QIAseq DNA Panel Expert Tools folder.
- Trim Primers of Mapped Reads is slightly faster when working on IonTorrent data.
- Fixed an issue where batch units that included more than one sequence list could not be set up when using the Analyze QIAseq Panels guide. Previously, when the Batch option was enabled and a folder was selected as the batch unit, each sequence list within that folder was analyzed independently.
- Various minor improvements
Biomedical Genomics Analysis Plugin 1.1
Released on April 3, 2019
- An analysis pipeline for analyzing Tumor Mutational Burden (TMB). This includes a tool called Calculate TMB Score and associated workflows for both Illumina and Ion Torrent that take advantage of a new reference data set: QIAseq TMB Panels hg38.
- Beta tools and an associated workflow for analyzing Microsatellite Instability (MSI), with two MSI baseline tracks available from the Workbench’s Reference Data Manager.
- A QIAseq miRNA analysis pipeline that includes four new tools and two workflows that access new reference data: QIAseq Small RNA. The new tools are compatible with the existing RNA-Seq Analysis visualization tools.
At the same time as this plugin release, a new reference data set is being released, and is available via the CLC Genomics Workbench Reference Data Manager: hg38 no alternative reference set. This data set includes scaffolds and a virus decoy sequence for improved performance.
General changes, bugfixes and improvements
QIAseq Panel Analysis
- The Import QIAGEN Primers tool can now also import RNAscan primers.
- The Ion Torrent reads importer accessed via the Analyze QIAseq Panel Analysis tool now annotates imported reads with sequencing platform information.
- The performance of the Create UMI Reads tool has been improved, with this being particularly relevant to systems with slow I/O.
- The performance of the Calculate Unique Molecular Index Groups tool has been improved, as has the organization of the options in the wizard.
- In the Identify QIAseq DNA Variants workflows there is now an option to configure the genetic code to use. This is handled automatically when workflows are launched via the QIAseq Panel Analysis guide.
- Fixed a bug in Quantify QIAseq RNA Expression where the log would report an incorrect percentage of “reads rejected” and “reads accepted”. The error only occurred when the total number of reads exceeded 20M.
Trim Primers of Mapped Reads
- The tool was updated to include options for removing mispriming events and artifacts due to pseudogenes.
- The option “Additional bases to trim” has been renamed to “Additional bases to unalign”, and while it was previously available only for trimming single reads, it is now also applied to paired end reads if enabled.
- Fixed a bug in the Trim Primers of Mapped Reads tool that affected the Identify QIAseq DNA Variants Ready-to-Use workflow, where the tool would fail on panel DHS-105Z data with paired reads that spanned the origin of the MT chromosome. The tool now ignores (does not trim) such reads.
- Fixed a bug where the parameter “Maximum additional nucleotides” would be off by one nucleotide. This bug affected single forward reads with reverse primers.
- Minor improvements have been made to some tooltips.
Prepare Guidance Variant Track
- All Ready-To-Use workflows that have a Local Realignment step have been improved through the inclusion of the Prepare Guidance Variant Track tool to capture information about structural variants in the guidance track and ensure left-alignment of insertions and deletions.
- The guidance track generated now includes “Evidence”, “Repeat”, “Variant ratio”, “Sequence complexity” and “# Reads” annotations for variants that originate from the structural variants input.
- The tool no longer includes SNVs in the guidance track output.
- The tool is now found in the Resequencing Analysis folder of the Toolbox.
The Download Example Data functionality in the Help menu of the Biomedical Genomics Analysis plugin has been fixed.
Biomedical Genomics Analysis Plugin 1.0
First release, November 28, 2018
A CLC Genomics Workbench with the Biomedical Genomics Analysis plugin installed delivers the functionality previously provided by the Biomedical Genomics Workbench, the QIAseq Targeted Panel Analysis plugin and the QIAGEN GeneRead Panel Analysis Plugin.
Similarly, a CLC Genomics Server with the Biomedical Genomics Analysis Server Plugin installed delivers functionality previously provided using a CLC Genomics Server with a Biomedical Genomics Server Extension license, the QIAseq Targeted Panel Analysis Server Plugin and the QIAGEN GeneRead Panel Analysis Server Plugin.
The listing below is of changes to tools and workflows delivered with the Biomedical Genomics Analysis and Biomedical Genomics Analysis Server Plugin relative to the corresponding earlier functionality. Further changes relevant to Workbench- and Server-specific functionality can be found on the Latest Improvements pages for CLC Genomics Workbench 12.0 and CLC Genomics Server 11.0.
The information on those pages reflects the introduction into those products of some tools previously only available in the Biomedical Genomics Workbench and Biomedical-enabled CLC Genomics Servers. Thus some tools listed as new will not be new to users of the earlier Biomedical products.
Searching with the name of a Biomedical Genomics Workbench 5.x tool in the Launch tool of the CLC Genomics Workbench 12.0 will return the corresponding tool(s) of interest in the new Workbench. Workflow elements within Ready-to-Use Workflows have the same names as used in the equivalent workflows in Biomedical Genomics Workbench 5.0.1 wherever this was possible.
Changes relative to the functionality of Biomedical Genomics Workbench 5.0.1, QIAGEN GeneRead Panel Analysis 1.12 and QIAseq Targeted Panel Analysis 1.2
General changes and improvements
- An extended Reference Data Manager is available via the CLC Genomics Workbench, and is launched by clicking on the toolbar button labeled “References”. The functionality of the Biomedical Genomics Workbench reference data management tool is presented under the tabs labeled “QIAGEN Sets” and “Custom Sets”.
- When launching a workflow to analyze panel data, the wizard now offers the opportunity to select and, when needed, download the appropriate data set to the relevant reference data location.
- The precision of detection of insertions at positions directly after primers in the Trim Primers of Mapped Reads tool has been improved. This change may somewhat negatively affect the sensitivity of detection of insertions at positions right after primers, but does not affect the sensitivity of detection for other variant types.
- The Differential Expression for RNA-Seq tool, located in the RNA-Seq Analysis folder of the Workbench Toolbox, now includes the functionality formerly delivered by the Differential Expression for Targeted RNA-Seq tool of the QIAseq Targeted Panel Analysis plugin.
- The extension used when naming outputs of the Remove and Annotate with Unique Molecular Index tool has been changed to “RAUMI”. Previously it was “RAMB”.
Changes to Ready-to-Use Workflows
- Each Map Reads to Reference step present in Ready-to-Use workflows under the Whole Exome Sequencing folder is now followed by a Remove Duplicate Mapped Reads step.
- Workflows in the Whole Transcriptome Sequencing folder under Toolbox | Ready-To-Use Workflows designed for analyzing mouse or rat data are now under a single subfolder called Mouse and Rat. The relevant reference data can be selected when launching each workflow.
- The Differential Expression for Targeted RNA-Seq tool, previously distributed with the QIAseq Targeted Panel Analysis plugin, is no longer available. This functionality is now made available through the Differential Expression for RNA-Seq tool, located within the RNA-Seq folder of the CLC Genomics Workbench 12 Toolbox.
- The QIAGEN GeneRead Panel Analysis workflow has been updated to ensure the coverage report considers the full length of the amplicon, while maintaining appropriate trimming of reads. For this, the “Additional bases to trim” option of the Trim Primers and their Dimers from Mapped Reads tool was decreased from 2 to 0 and the Trim Reads tool was added, which includes automatic read-through adapter trimming.
- In the QIAGEN GeneRead Panel Analysis workflow the dbsnp annotation has been replaced with dbsnp common annotation.
- Fixed an issue with the Reference Data Manager where custom reference sets could not be made with a customized Gene Ontology element.
- The Prepare Overlapping Raw Data workflow is no longer available.
Trim Primers of Mapped Single Reads (legacy) and Trim Primers of Mapped Paired End Reads (legacy) will be retired in a future release.