2021R3 Land Release Notes

In this content release, OncoLand and DiseaseLand added hundreds of new projects, and Human Disease data are now available on Human Genome B38!

If you don't see a Land of interest listed under "Select Land", please ask your OmicSoft Server administrator to check the Cloud Land Publishing function for available data.

Invitation to request new data curation

The OmicSoft team is inviting requests for new OncoLand or DiseaseLand expression projects to curate for upcoming releases, which will be included as part of your subscription.

Let us know if there are important datasets that you would like to see curated and represented in the Lands. Public (GEO, SRA or ArrayExpress) expression studies for human, mouse, and rat will be evaluated. Compatible platforms include single-cell transcriptomics, bulk RNA-seq, and commercial expression arrays from Affymetrix, Illumina and Agilent. Please email omicsoft.support@qiagen.com for more information.


Figure 1. Distribution of disease samples in OncoGEO and Hematology update by Disease Category.


This release adds 2058 new samples and 399 new comparisons from 63 projects, focusing on melanoma and on breast, lung, central nervous system and stomach cancers.
Some highlights in this release are studies that explore the following:


  • CAR-T cells: GSE164902 (SynNotch CAR-T cells), GSE163400, GSE136432, GSE135379, GSE161942, GSE158144
  • Paired pre- and post-treatment, and tumor and non-tumor samples: GSE155164, GSE106128, GSE144020, GSE94104, GSE110114, GSE133039
  • Gene signatures for prognosis accuracy improvement: GSE33331, GSE126870, GSE168009, GSE133713, GSE131769, GSE139050, GSE122220, GSE126044, GSE135565
  • Cancer progression: GSE80609, GSE144020
  • Xenograft models: GSE66346, GSE148310, GSE100066, GSE104020, GSE100669
  • And more!


This release adds 1830 new samples and 222 comparisons from 44 unique project IDs, focusing on subtypes of leukemia, lymphoma and myeloma.
Some highlights in this release are studies that explore the following:

  • CAR-T cells: GSE134937, GSE153437 (axicabtagene ciloleucel), GSE156190, GSE166976 (NK cell), GSE160311 (synthetic T cell antigen receptor (STAR)), GSE147046, GSE156207
  • Paired pre- and post-treatment samples: GSE122934, GSE75086, GSE117090
  • Xenograft models: GSE123485, GSE75086, GSE156207, GSE121007



TCGA major update coming soon

Figure 2. The comprehensive update of TCGA metadata included the review of over 1200 files, the definition of over 1000 fields, the unification and grouping of hundreds of columns, the update of fields representing TCGA publication results and the curation of hundreds of treatment labels.

We are in the final stages of a comprehensive update of TCGA Land (TCGA_B38_GC33).

Look for comprehensive metadata field definitions and tooltips, improved metadata field names, curated treatment information, additional marker paper and PanCanAtlas cluster information, and more!




Figure 3. Distribution of disease samples in the latest release of DiseaseLand by Disease Category.


With this release the Human Disease collection is now available on Human Genome B38/GenCode version 33. All new content requests will be added to HumanDisease_B38_GC33.

If you don't see HumanDisease_B38_GC33, be sure to ask your OmicSoft Server administrator to use "Publish Cloud Land" to select the new Land.

This release adds 7749 new samples and 2297 comparisons from 102 unique project IDs.
This release includes the following:

  • NK cells (CellType: NK Cell) from peripheral blood, umbilical cord blood, lung, liver, spleen and more
  • Infectious diseases and vaccines: over 30 projects in the “Infectious Diseases” Therapeutic Area
  • Arthritis: E-MTAB-6266, E-MTAB-7466, GSE104113, GSE148395, GSE41038, GSE49604, GSE57218, GSE75181, GSE89484, GSE98918
  • Female reproductive diseases: GSE35287, GSE40400, GSE5850
  • Nervous system diseases: GSE121569, GSE122647, GSE135511, GSE136666, GSE137619, GSE138064, GSE141381, GSE150174, GSE151936, GSE22779
  • Tissue-profiling projects: GSE2004, GSE2361, GSE803


This release adds 1200 new samples and 357 comparisons from 59 projects, including studies on the following:

  • Kidney disease: 13 new projects in Therapeutic Area “Renal Disease”, including chronic kidney disease, acute kidney injury and renal fibrosis
  • Liver disease: NAFLD, acute liver failure and acute liver injury models in projects such as GSE102489, GSE104302, GSE111828, GSE120484, GSE124694, GSE128284, GSE130528 and GSE132298
  • Arthritis: E-MTAB-5326, GSE101573, GSE104793, GSE104794, GSE33754, GSE43663 and GSE53857
  • Nervous system diseases (migraine, anxiety, addiction, obsessive compulsive disorder, amyotrophic lateral sclerosis): 14 new projects in Therapeutic Areas “Neurology” and “Psychiatry”
  • Systemic lupus erythematosus: GSE128692, GSE145422, and GSE147359
  • Drug screen: GSE110256


This release adds 383 new samples and 125 comparisons from 13 projects on nervous system disease, cardiovascular and metabolic diseases and aging.

Single Cell Lands

With the latest Single Cell Lands content update, new datasets on ophthalmology, oncology, neurology, gastroenterology, endocrinology, dermatology and more are now available.

  • 29 human projects (23 UMI + 6 non-UMI), 79 datasets and 801 comparisons
  • 9 mouse projects (8 UMI + 1 non-UMI), 15 datasets and 166 comparisons

Figure 4. Human UMI datasets in the latest release, plotting the number of cell clusters with different cell types (colored) by tissue.


Looking ahead to the next release, expect 55 additional projects with 96 "cell map" dimension reduction datasets, profiling 3.2 million cells from 847 samples.

Our new "Single Cell Lite" protocol, which integrates pre-quantified datasets with full manual curation, enables us to bring in datasets for which raw data are not available. Key datasets to be integrated include Tabula Sapiens (UMI and non-UMI), profiling normal tissue expression in humans, and Allen Mouse Brain Atlas (GSE116470).



Did you know?

In OmicSoft curation, we annotate in vivo and in vitro treatment studies in different columns.

  • Treatment: For in vitro studies, describes the treatment performed on a sample, using OmicSoft controlled vocabularies
  • Subject Treatment: For in vivo studies, describes the treatment using OmicSoft controlled vocabularies
    • If the same subject was sampled before and after treatment, Subject Treatment will be the same, but Treatment Status will indicate which sample is post-treatment
  • Treatment Status: Indicates the treatment applied to an individual sample, when the sample came from a Subject (i.e., patient) that was sampled pre- and post-treatment
  • Pre Treatment: Treatment given in vivo or in vitro before the main treatment. In mouse studies, the pre-treatment is often the disease-induction model.
  • Maternal Treatment: In vivo treatment given to the mother prior to or during gestation. The sample is collected from the offspring.

OmicSoft curates controlled vocabulary terms from PubChem, NCIT, DrugBank, ChemSpider and the company website of the treatment source.


Using the SubjectTreatment and TreatmentStatus columns, you can group and subset in vivo treatment studies to reveal interesting patterns in the data, for example, by showing pre-treatment gene expression between patients with differential response to a treatment.

Figure 5. KCNB2 is up-regulated in pre- and post-treatment samples of pancreatic ductal adenocarcinoma from GSE131050. All samples are curated as SubjectTreatment=5-fluorouracil;irinotecan;leucovorin;oxaliplatin;PF-4136309. Pre-treatment samples are identified by TreatmentStatus=none, post-treatment samples are identified by TreatmentStatus=5-fluorouracil;irinotecan;leucovorin;oxaliplatin;PF-4136309. Treatment response is indicated by Response (partial response or stable disease).
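
The grouping and subsetting described above can also be sketched outside the Land views, assuming sample metadata has been exported as one row per sample with SubjectTreatment, TreatmentStatus and Response columns. This is only an illustration: the SampleID, SubjectID and Expression fields, the helper function names, the "other-regimen" label and every value below are hypothetical placeholders, not data from GSE131050 and not an OmicSoft API.

```python
from collections import defaultdict
from statistics import mean

# Treatment string as curated for the study shown in Figure 5.
REGIMEN = "5-fluorouracil;irinotecan;leucovorin;oxaliplatin;PF-4136309"

# Hypothetical exported metadata rows; all values are made up for illustration.
samples = [
    {"SampleID": "S1", "SubjectID": "P1", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": "none", "Response": "partial response", "Expression": 2.0},
    {"SampleID": "S2", "SubjectID": "P1", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": REGIMEN, "Response": "partial response", "Expression": 5.0},
    {"SampleID": "S3", "SubjectID": "P2", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": "none", "Response": "partial response", "Expression": 4.0},
    {"SampleID": "S4", "SubjectID": "P3", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": "none", "Response": "stable disease", "Expression": 1.0},
    {"SampleID": "S5", "SubjectID": "P3", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": REGIMEN, "Response": "stable disease", "Expression": 1.0},
    {"SampleID": "S6", "SubjectID": "P4", "SubjectTreatment": REGIMEN,
     "TreatmentStatus": "none", "Response": "stable disease", "Expression": 2.0},
    {"SampleID": "S7", "SubjectID": "P5", "SubjectTreatment": "other-regimen",
     "TreatmentStatus": "none", "Response": "stable disease", "Expression": 9.0},
]


def subset(rows, **criteria):
    """Keep only rows whose columns match every given column=value pair."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


def group_mean(rows, keys, value):
    """Group rows by the given columns and average the value column per group."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r[value])
    return {k: mean(v) for k, v in groups.items()}


# Subset to subjects on the regimen, then to their pre-treatment samples.
in_vivo = subset(samples, SubjectTreatment=REGIMEN)
pre = subset(in_vivo, TreatmentStatus="none")

# Compare pre-treatment expression between response groups.
pre_by_response = group_mean(pre, ["Response"], "Expression")
for (response,), avg in sorted(pre_by_response.items()):
    print(response, avg)
```

Because SubjectTreatment is identical for the pre- and post-treatment samples of a subject, the first subset keeps both samples of each treated subject, and TreatmentStatus alone then separates pre-treatment from post-treatment samples.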