
Example Data

Next Generation Sequencing example data

Import the example data into QIAGEN CLC Genomics Workbench:

  • Download and save the relevant data set below
  • Unless otherwise stated, unzip the file
  • Open QIAGEN CLC Genomics Workbench
  • Click File->Import
  • Select the appropriate NGS format to start the corresponding Import wizard
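
For readers who prefer to script the unzip step before opening the Workbench, the following is a minimal sketch using only the Python standard library. The function name `unzip_dataset` and the file paths in the usage note are illustrative assumptions, not part of any QIAGEN tooling. Data sets that are imported as a whole ZIP file should not be extracted this way.

```python
import zipfile
from pathlib import Path


def unzip_dataset(archive_path, dest_dir):
    """Extract a downloaded ZIP archive and return the names of its members.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)  # unpack every member into dest
        return sorted(zf.namelist())
```

A call such as `unzip_dataset("downloads/example_data.zip", "example_data")` (both paths are placeholders) would unpack the archive and return the extracted file names, which can then be selected in the Import wizard.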

Raw Data
Name Description Download
Illumina genomic data from Pseudomonas aeruginosa (616 MB) The data set contains four files:

  • SRR396637.sra_1.fastq and SRR396637.sra_2.fastq – paired-end (FR) sequence reads. A distance range of 150 to 350 is reasonable
  • SRR396636.sra_1.fastq and SRR396636.sra_2.fastq – mate-pair (RF) sequence reads. A distance range of 2000 to 3800 is reasonable

This data set is used in the De novo analysis of paired data tutorial.
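
Before importing paired files, it can be useful to confirm that the two files of each pair contain the same number of reads. The sketch below is one way to do that. It assumes uncompressed FASTQ files with exactly four lines per record (no wrapped sequence lines), and the helper names `count_fastq_reads` and `pairs_match` are hypothetical.

```python
def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file.

    Assumes four lines per record; blank lines are ignored.
    """
    with open(path, "r") as handle:
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def pairs_match(path_1, path_2):
    """Return True if both files of a pair hold the same number of reads."""
    return count_fastq_reads(path_1) == count_fastq_reads(path_2)
```

For example, `pairs_match("SRR396637.sra_1.fastq", "SRR396637.sra_2.fastq")` checks the paired-end files listed above. A mismatch usually points to a truncated download or an incomplete unzip.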


CLC Formatted Data
QIAseq Panel data
Name Description Download
QIAseq TMB and MSI Panel DHS-8800Z reads Genomic sequencing reads with high and low TMB scores as well as different MSI status, provided as example data for the Identify TMB Status ready-to-use workflow, distributed with the Biomedical Genomics Analysis plugin. Use Import | Standard Import to import the ZIP file. A subset of this data set was used to create the sample data for the Compare TMB Scores and MSI Statuses from QIAseq Tumor Mutational Burden Panels tutorial.
Mapping data
Name Description Download
QIAGEN CLC formatted mapping data (19.2 MB) This data set contains genomic sequencing reads from a cancer sample and a normal sample for the human mitochondrial genome. Also included are the chromosome M sequence from the hg18 build of the human genome and annotation tracks generated from data from the UCSC Genome Browser site.
There is no need to unzip the file before import; simply use the Standard Import option to import the whole file. Instructions on importing and using this data are included in the Resequencing and Tracks tutorial.
RNA-Seq data
Name Description Download
Subset of the full data set (12.9 MB) This data set is based on the data set published with [Mortazavi et al., 2008]. It contains a subset of the full data set, along with a region of chromosome 16 for use as a reference.
Experiments with the full data set (12.9 MB) This data set is based on the data set published with [Mortazavi et al., 2008] and includes experiments containing the expression values for the full data set.
Variant and protein structure data
Name Description Download
QIAGEN CLC formatted variant table and molecule project (0.8 MB) This data set contains six variants commonly found in the Bcr-Abl fusion gene and is used in the Visualize Variants on Protein Structure tutorial.