High-throughput biological technologies (e.g. ChIP-seq, RNA-seq and single-cell RNA-seq) have rapidly accelerated the accumulation of genome-wide omics data in
diverse interrelated biological scenarios (e.g. cells,
tissues and conditions). Integration and differential
analysis are two common paradigms for exploring
and analyzing such data. However, current integrative methods usually ignore the differential part, and
typical differential analysis methods either fail to
identify combinatorial patterns of difference or require matched dimensions of the data. Here, we propose CSMF, a flexible framework that combines the two paradigms
to simultaneously reveal Common
and Specific patterns via Matrix Factorization from
data generated under interrelated biological scenarios. We demonstrate the effectiveness of CSMF in
four representative applications: pairwise
ChIP-seq data describing the chromatin modification
maps of the K562 and HUVEC cell lines; pairwise
RNA-seq data representing the expression profiles of
two different cancers; RNA-seq data of three breast
cancer subtypes; and single-cell RNA-seq data of human embryonic stem cell differentiation at six time
points. Extensive analysis yields novel insights into
hidden combinatorial patterns in these multi-modal
data. The results demonstrate that CSMF is a powerful
tool to uncover common and specific patterns with
significant biological implications from data of interrelated biological scenarios.
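
To make the idea of common and specific patterns concrete, the following is a minimal sketch rather than the CSMF algorithm itself. It assumes that every data matrix X_i shares the same rows (features) but may have a different number of columns (samples), and that each is approximated as X_i ≈ W_c H_c,i + W_s,i H_s,i, with a common non-negative basis W_c shared by all scenarios and a specific basis W_s,i for each one. The function name `csmf_sketch`, its parameters, and the plain multiplicative updates are illustrative assumptions, not details taken from the text above.

```python
import numpy as np

def csmf_sketch(Xs, rank_common, ranks_specific, n_iter=100, eps=1e-9, seed=0):
    """Illustrative common/specific non-negative factorization (hypothetical helper).

    Each X in Xs has shape (n_features, n_samples_i). Returns the shared basis Wc,
    per-dataset bases Ws[i], coefficient matrices Hc[i] and Hs[i], and the squared
    reconstruction error recorded after every iteration.
    """
    rng = np.random.default_rng(seed)
    n_features = Xs[0].shape[0]
    Wc = rng.random((n_features, rank_common))
    Ws = [rng.random((n_features, r)) for r in ranks_specific]
    Hc = [rng.random((rank_common, X.shape[1])) for X in Xs]
    Hs = [rng.random((r, X.shape[1])) for r, X in zip(ranks_specific, Xs)]
    errors = []
    for _ in range(n_iter):
        num_c = np.zeros_like(Wc)
        den_c = np.zeros_like(Wc)
        for i, X in enumerate(Xs):
            # Update both coefficient blocks of dataset i with the bases held fixed.
            W = np.hstack([Wc, Ws[i]])
            H = np.vstack([Hc[i], Hs[i]])
            H *= (W.T @ X) / (W.T @ W @ H + eps)
            Hc[i], Hs[i] = H[:rank_common], H[rank_common:]
            # Update the basis that is specific to dataset i.
            Ws[i] *= (X @ Hs[i].T) / ((W @ H) @ Hs[i].T + eps)
            # Accumulate the terms that drive the shared basis across datasets.
            recon = Wc @ Hc[i] + Ws[i] @ Hs[i]
            num_c += X @ Hc[i].T
            den_c += recon @ Hc[i].T
        Wc *= num_c / (den_c + eps)
        errors.append(sum(
            np.linalg.norm(X - Wc @ Hc[i] - Ws[i] @ Hs[i]) ** 2
            for i, X in enumerate(Xs)
        ))
    return Wc, Ws, Hc, Hs, errors

# Synthetic example: two scenarios with the same 30 features but 20 and 25 samples,
# built from two shared patterns plus one scenario-specific pattern each.
rng = np.random.default_rng(1)
shared = rng.random((30, 2))
X1 = shared @ rng.random((2, 20)) + rng.random((30, 1)) @ rng.random((1, 20))
X2 = shared @ rng.random((2, 25)) + rng.random((30, 1)) @ rng.random((1, 25))
Wc, Ws, Hc, Hs, errors = csmf_sketch([X1, X2], rank_common=2, ranks_specific=[1, 1])
```

Because only the rows have to match, the two matrices can have different numbers of samples. Features with large weights in `Wc` would be read as patterns shared across scenarios, and those with large weights in `Ws[i]` as patterns specific to scenario i. Choosing the ranks and handling the scaling ambiguity between factors are left out of this sketch.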