Abstract
The investigation of clusters of conserved non-coding elements (CNEs) is expected to provide
insights into the mechanisms of gene regulation and to improve our understanding of human disease.
With the aim of characterizing the evolutionary constraints acting on the distances that separate
the CNEs found in clusters, we downloaded PhastCons elements based on 100-way alignments
from the UCSC Genome Browser. These elements were then filtered and merged to obtain 1.2
million CNEs that were at least 24 bp long and did not overlap any protein-coding exons. Among
these CNEs, 54% were intronic, 44% intergenic, and 2% overlapped with untranslated regions
(UTR). Furthermore, we defined ~30,000 “clusters” containing CNEs with no conserved protein-
coding elements in between them and bordered by the nearest conserved protein-coding
elements upstream and downstream of each CNE in the cluster. On average, the clusters
comprised 40 CNEs, and only ~8,000 CNEs were not part of any cluster, confirming previous
reports indicating that CNEs are often found in clusters. Next, we used squared-change
maximum parsimony to infer inter-CNE distances in the primate and mammalian ancestors. For
comparison, we also estimated the distance between the two protein-coding elements
delimiting each cluster. Generally, we found that inter-CNE distances have contracted with
respect to their loci. We hypothesize that epistatic interactions drive this pattern, and that
clusters of CNEs have relevant functional or structural roles.
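
The ancestral-distance step can be illustrated with a minimal sketch of squared-change parsimony, which picks ancestral values that minimize the sum of squared changes along the branches of a tree. The abstract does not say whether branch lengths were used or which software performed the reconstruction, so this sketch assumes the unweighted form on a rooted tree. The function name, the three-species tree, and the distance values are hypothetical and serve only as an illustration.

```python
def squared_change_parsimony(children, tip_values, root):
    """Ancestral values minimizing the sum of squared changes over all branches.

    children:   dict mapping each internal node to a list of its child nodes
    tip_values: dict mapping each tip to its observed value
    root:       name of the root node
    Unweighted form: every branch counts equally (an assumption of this sketch).
    """
    weight = {}  # curvature of the quadratic subtree cost at each internal node
    mean = {}    # value that minimizes that subtree cost

    def upward(node):
        # Returns (w, m): the cost this subtree passes to its parent is
        # w * (parent_value - m) ** 2 plus a constant.
        if node in tip_values:
            return 1.0, float(tip_values[node])
        messages = [upward(child) for child in children[node]]
        total = sum(w for w, _ in messages)
        weight[node] = total
        mean[node] = sum(w * m for w, m in messages) / total
        return total / (1.0 + total), mean[node]

    def downward(node, parent_value, estimates):
        if node in tip_values:
            return
        if parent_value is None:
            # The root has no parent branch, so its subtree optimum is final.
            estimates[node] = mean[node]
        else:
            # Balance the pull of the parent against the pull of the subtree.
            w = weight[node]
            estimates[node] = (parent_value + w * mean[node]) / (1.0 + w)
        for child in children[node]:
            downward(child, estimates[node], estimates)

    upward(root)
    estimates = {}
    downward(root, None, estimates)
    return estimates


# Hypothetical example: one inter-CNE distance (in kb) measured in three species.
children = {"root": ["primate_anc", "mouse"], "primate_anc": ["human", "chimp"]}
tip_values = {"human": 10.0, "chimp": 20.0, "mouse": 60.0}
estimates = squared_change_parsimony(children, tip_values, "root")
for node, value in estimates.items():
    print(node, round(value, 3))
```

At the optimum each internal node equals the mean of its neighbouring nodes, which gives a quick sanity check on the output. A weighted variant would scale each squared change by the inverse of its branch length.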
Original language | English |
---|---|
Publication status | Published - 20 Sept 2022 |
Event | 21st European Conference on Computational Biology: ECCB 2022 - Sitges, Barcelona, Spain Duration: 12 Sept 2022 → 21 Sept 2022 https://eccb2022.org/poster-presentations/ https://eccb2022.org/ |
Conference
Conference | 21st European Conference on Computational Biology |
---|---|
Abbreviated title | ECCB 2022 |
Country/Territory | Spain |
City | Barcelona |
Period | 12/09/22 → 21/09/22 |
Internet address |
Keywords
- conserved non-coding elements, evolution, bioinformatics