Predicting 3D genome organization from the nucleotide sequence with DNA-DDA

Xenia Lainscsek*, Leila Taher

*Corresponding author for this work

Publication: Conference contribution, paper, peer-reviewed

Abstract

The intricate regulation of multiple levels of gene expression enables the remarkable cellular diversity seen in eukaryotic organisms. Around $2\,\mathrm{m}$ of DNA must be packaged into every cell nucleus, which has a diameter of only $\approx 2\,\mu\mathrm{m}$, in a manner that allows for the precise and efficient expression of genes into proteins. Genome folding is therefore characterized by highly organized multi-scale structures, which can be probed by experimental sequencing techniques such as ``Hi-C'' \cite{Lieb09,Rao14}. Hi-C experiments yield so-called 2D ``contact maps'' or ``contact matrices'', which visualize the 3D proximity of genomic loci and portray the fractal-like nature of 3D genome architecture. One important feature of genome folding discovered by this assay is its partitioning into A/B compartments, which are associated with transcriptionally active euchromatin and inactive heterochromatin, respectively \cite{Dekker13}. Although Hi-C and similar technologies have led to major breakthroughs in understanding the principles of genome folding, such as chromosomal compartmentalization, they are costly, tedious and limited by technical constraints. This has fueled the development of computational models that can simulate the complex patterns of chromatin interactions, unravel their molecular determinants and assess the impact of genomic variants \cite{Yang22}.

We are developing ``\textit{DNA-DDA}'', a nonlinear dynamics-based approach for the prediction of contact maps. It adapts the time-series classification framework \textit{delay differential analysis} (DDA) to capture dynamical signatures inherent in genomic sequence data.

In a recent proof-of-concept publication \cite{XLain23}, we demonstrated that DNA-DDA can accurately predict chromosomal compartments via an intermediate step that infers individual interactions at 100 kb resolution. The DNA sequence was represented as a ``1D DNA walk'', in which the walker starts at zero and continues along the nucleotide chain, taking a step up for strongly bonded pairs (C or G) and a step down for weakly bonded pairs (A or T). DNA-DDA performed competitively with state-of-the-art methods \cite{Zhou22,Kirchhof21,Schweiss20,Fuden20}, indicating its potential as a robust alternative tool for the analysis of genomic sequence data. Importantly, while other methods require nearly all chromosomes for training (around $95\%$ of human autosomes), we obtain sparse models from a 20 Mb region on one chromosome ($0.7\%$ of human autosomes) and the corresponding true positives defined by experimental 3D interaction data. We are currently extending DNA-DDA to a multi-scale resolution model that 1) uses a 2D DNA walk sequence representation to include information on all four nucleotides and 2) relies on individual contacts instead of A/B compartment labels for quantifying model performance. Such methods will be key to understanding the interplay between 3D genome architecture and proper cellular function.
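
The 1D DNA walk described above can be illustrated with a minimal Python sketch. It is not the DNA-DDA implementation; the function name \texttt{dna\_walk\_1d}, the unit step size, and the flat step for ambiguous bases such as N are assumptions made here for illustration only.

```python
from itertools import accumulate

# Step directions follow the description in the abstract:
# up for strongly bonded bases (C, G), down for weakly bonded bases (A, T).
# The unit step size is an assumption of this sketch.
STEP = {"C": 1, "G": 1, "A": -1, "T": -1}


def dna_walk_1d(sequence, unknown_step=0):
    """Return the cumulative 1D DNA walk of `sequence`, starting at zero.

    `unknown_step` is a hypothetical parameter: the step taken for any
    character other than A, C, G, T (for example N). The abstract does
    not say how such bases are handled.
    """
    steps = (STEP.get(base, unknown_step) for base in sequence.upper())
    # initial=0 places the walker at zero before the first nucleotide.
    return list(accumulate(steps, initial=0))


if __name__ == "__main__":
    print(dna_walk_1d("GATTACCG"))
    # [0, 1, 0, -1, -2, -3, -2, -1, 0]
```

The returned list has one more entry than the sequence has bases, because it includes the starting position. A GC-rich stretch appears as a rising segment and an AT-rich stretch as a falling one, which is the kind of signal a time-series method can then operate on.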

Original language: English
Pages: 163-168
Number of pages: 5
Publication status: Published - Oct 2023
Event: From the Nonlinear Dynamical Systems Theory to Observational Chaos - 4 Rue San Subra, Toulouse, France
Duration: 9 Oct 2023 to 11 Oct 2023
https://www.cesbio.cnrs.fr/ottochaos-from-the-nonlinear-dynamical-systems-theory-to-observational-chaos/

