Imagine this scenario: You are working on a high-profile scientific finding (for example, the Nevada SARS-CoV-2 reinfection case), and you need to 1) perform a bioinformatics analysis; 2) provide detailed methods of the exact analysis for publication; 3) share those methods so that others can recreate your analysis; 4) provide publication-quality images. What’s more, you need this all done yesterday. What do you do?
Dr. Joel R. Sevinsky, Ph.D., recently found himself in this situation while working with the Nevada State Public Health Laboratory on a potential SARS-CoV-2 reinfection case (published in The Lancet Infectious Diseases; access the article here). He had numerous analysis software options to choose from and decided on QIAGEN CLC Genomics Workbench. The main reason was that it satisfied all the requirements listed above and kept him productive at a time when he had little bandwidth to spare.
Dr. Sevinsky had already developed an analysis pipeline for SARS-CoV-2 using ARTIC amplicons and the Illumina DNA Prep library preparation kit. He had designed it as a workflow in QIAGEN CLC Genomics Workbench and was preparing a tutorial (access tutorial here). Because the workflow was modular, he was able to modify the pipeline in just a few minutes to accept metagenomics data as input, rather than amplicons, and perform the analysis. No coding, no command line. All he had to do was point and click in the workflow diagram, remove one step, redirect a couple of outputs, and the new pipeline was ready. When the analysis was done, he had a complete visualization of the variant differences between the two SARS-CoV-2 strains, confirming the team's hypothesis that this was a clear case of SARS-CoV-2 reinfection supported by genomic data. This visualization could be shared with, and viewed by, anyone who has QIAGEN CLC Genomics Workbench installed, even without a license. Furthermore, Dr. Sevinsky and his team were able to compare the results with those from other bioinformatics platforms because the software allowed results files to be exported in many standard open formats (.bam, .vcf and others).
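For readers who do want to cross-check exported results outside the workbench, the idea of comparing variant calls from two .vcf exports can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the study: it assumes two uncompressed VCF files have already been exported, it compares variants on the fixed CHROM, POS, REF and ALT columns only, and the function names and demo records below are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here by (CHROM, POS, REF, ALT) only.
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect variants from the lines of an uncompressed VCF file."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information (##) and the header (#CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several comma-separated ALT alleles.
        for alt in alts.split(","):
            if alt != ".":
                variants.add((chrom, pos, ref, alt))
    return variants


def compare_variants(a: Set[Variant], b: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Split two variant sets into shared and sample-specific calls."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


if __name__ == "__main__":
    # Made-up demo records, NOT data from the reinfection study.
    sample_a = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "demo\t10\t.\tA\tG\t.\tPASS\t.\n",
        "demo\t20\t.\tC\tT\t.\tPASS\t.\n",
    ]
    sample_b = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "demo\t10\t.\tA\tG\t.\tPASS\t.\n",
        "demo\t30\t.\tG\tA\t.\tPASS\t.\n",
    ]
    result = compare_variants(read_variants(sample_a), read_variants(sample_b))
    for label, calls in result.items():
        print(label, sorted(calls))
```

With real exports, the same functions could be fed open file handles, for example `read_variants(open("sample_a.vcf"))`, where the file name is a placeholder. A production comparison would also normalize indel representation and respect the FILTER column, which this sketch ignores.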
Detailed methods for publication
According to many researchers, the most important aspect of a scientific publication is the methods section. This section should document in exact detail how an experiment and its analysis were carried out so that other scientists can replicate the findings. The bioinformatics results from Dr. Sevinsky’s analysis were accompanied by a full history of the algorithms, workflows, reference files and input files used to generate them, all with version documentation. This detailed history was included as supplemental data in their publication.
Share methods with the scientific community
Sometimes, especially in bioinformatics, even the most detailed methods can leave obstacles to recreating an analysis. Unless a documented workflow uses containers and workflow managers, which require significant bioinformatics expertise to maintain, getting the environment right to recreate the analysis can be difficult. It can also be very time-consuming to set up. Fortunately, in QIAGEN CLC Genomics Workbench the entire workflow, with input data, references and parameters, can be packaged in a single file, exported and shared with the scientific community. Dr. Sevinsky’s journal article will include this file as supplemental data, and for the SARS-CoV-2 tutorial mentioned above you can find a package that includes input fastq files, reference files, primer files and workflows at the resource center here.
Lastly, to get his findings published in a top-tier journal, Dr. Sevinsky and his team needed high-quality images that clearly communicated their findings. QIAGEN CLC Genomics Workbench provided advanced visualization tools whose output could easily be exported in editable formats for publication. Moreover, the visualization settings can be saved, so you can further refine your analysis without having to recreate the formatting of the final figure, which is an enormous time saver.
Overall, QIAGEN CLC Genomics Workbench allowed Dr. Sevinsky and his team to communicate their results as quickly as possible. No GitHub sites to create, no Docker containers to manage. Just efficient analysis and publication of results.
If your scientific position utilizes NGS data and requires a lot of “getting things done”, QIAGEN CLC Genomics Workbench is an invaluable tool for your laboratory.