Abstract
Chromosome conformation capture (3C) is a pivotal driver of 3D genome research. Protocols like Hi-C produce genome-wide sequencing reads of pairs of genomic loci that are proximal in 3D space, which are used to approximate their interaction frequencies. These interaction frequencies are typically represented as a 2D contact matrix in which each dimension corresponds to the linear genome segmented into equal-sized bins. Contact matrices are notoriously sensitive to the experimental protocol, and the resulting artifacts and low signal-to-noise ratio pose several bioinformatics challenges. Quantifying the similarity of two contact matrices is important not only for assessing the quality and reproducibility of a Hi-C experiment but also for drawing sound conclusions about 3D genome organization in diverse cell types or conditions.
Current methods for quantifying contact matrix similarity are rooted in linear algebra, image processing or network theory, each with a different interpretation of the contact matrix. Most methods consider the matrix as a whole and apply various noise-reducing transformations. We propose a fundamentally different entropy-driven approach, ENT3C, inspired by recent work on estimating entropy in correlation matrices to study functional connectivity in the brain.
ENT3C was motivated by the association of contact matrices with images, networks and fractals, all of which have been analyzed in other contexts by various notions of entropy. Contact matrices are characterized by an abundance of intertwined geometries contributing to local pattern complexity, and higher complexity is reflected in higher information entropy. By recording the change in complexity along the matrix diagonal, estimated by the von Neumann information entropy, ENT3C produces a characteristic signal enabling the comparison of contact matrices. Besides robust and accurate quantification of contact matrix similarity, we demonstrate the potential use of ENT3C-derived entropy signals for detailed exploration of biologically relevant intra- and intercellular differences in the 3D genome. Genes within loci identified by ENT3C as most similar between two cell lines were enriched in biological processes vital for most cells. The enriched processes depended on contact matrix resolution, aligning with the assumption that distinct genomic regulatory mechanisms are associated with different scales.
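The idea of tracking von Neumann entropy along the diagonal can be illustrated with a minimal sketch. It is not the ENT3C implementation: the function names, the `window` and `step` parameters, the use of a Pearson correlation matrix scaled to unit trace as the density matrix, and the Pearson correlation of the two signals as the similarity score are all assumptions made here for illustration.

```python
import numpy as np


def von_neumann_entropy(sub):
    """Von Neumann entropy of a square submatrix (illustrative assumption:
    the density matrix is the row-wise Pearson correlation matrix / n)."""
    n = sub.shape[0]
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(sub)          # n x n correlation of the rows
    corr = np.nan_to_num(corr)           # constant rows give NaN; treat as uncorrelated
    np.fill_diagonal(corr, 1.0)          # restore the unit diagonal so trace(corr) = n
    rho = corr / n                       # unit trace, positive semidefinite
    eig = np.linalg.eigvalsh(rho)        # eigenvalues of a symmetric matrix
    eig = eig[eig > 1e-12]               # drop zeros: 0 * log(0) is taken as 0
    return float(-np.sum(eig * np.log(eig)))


def entropy_signal(matrix, window, step=1):
    """Entropy of each window x window submatrix sliding along the diagonal."""
    n = matrix.shape[0]
    starts = range(0, n - window + 1, step)
    return np.array(
        [von_neumann_entropy(matrix[s:s + window, s:s + window]) for s in starts]
    )


def signal_similarity(a, b, window, step=1):
    """Pearson correlation of two entropy signals (hypothetical similarity score)."""
    sa = entropy_signal(a, window, step)
    sb = entropy_signal(b, window, step)
    return float(np.corrcoef(sa, sb)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m1 = rng.random((60, 60))
    m1 = m1 + m1.T                       # symmetric toy "contact matrix"
    m2 = m1 + 0.05 * rng.random((60, 60))
    print(entropy_signal(m1, window=20, step=5))
    print(signal_similarity(m1, m2, window=20, step=5))
```

Because the eigenvalues of a unit-trace density matrix sum to one, each entropy value lies between 0 and the logarithm of the window size. A real contact matrix would likely need preprocessing (for example, handling empty bins) that this sketch only approximates.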
The profound significance of entropy is underpinned by its applications across a myriad of disciplines. Inspired by this well-established concept, ENT3C provides a robust, user-friendly contact matrix similarity metric and a characteristic signal that can be used to gain detailed biological insights into 3D genome organization.
Original language | English |
---|---|
Pages | 129 |
Publication status | Published - 30 Apr 2023 |
Event | Genome Organization & Nuclear Function - Cold Spring Harbor Laboratory, Cold Spring Harbor, United States. Duration: 30 Apr 2024 → 4 May 2024. https://meetings.cshl.edu/meetings.aspx?meet=NUCLEUS&year=24 |
Conference
Conference | Genome Organization & Nuclear Function |
---|---|
Country/Territory | United States |
City | Cold Spring Harbor |
Period | 30/04/24 → 4/05/24 |