Novel G-quadruplex prone sequences emerge in the complete assembly of the human X chromosome

Journal: Biochimie
Authors: Bohálová N., Mergny J.-L., Brázda V.
Year: 2021
ISSN: 0300-9084

Abstract

G-quadruplexes are non-B secondary structures with regulatory functions and therapeutic potential. Improvements in sequencing methods recently allowed the completion of the first human chromosome, which is now available as a gapless, end-to-end assembly in which the previously remaining gaps are filled and newly identified regions added. We compared the presence of G-quadruplex forming sequences in the current human reference genome (GRCh38) and in the new end-to-end assembly of the X chromosome constructed by high-coverage ultra-long-read nanopore sequencing. This comparison revealed that, even though the corrected length of the X chromosome assembly is, surprisingly, 1.14% shorter than expected, the number of G-quadruplex forming sequences found in this gapless chromosome is significantly higher, with 493 new motifs having G4Hunter scores above 1.4 and 23 new sequences with G4Hunter scores above 3.5. This observation reflects the improved precision of the new sequencing approaches and points to an underestimation of G-quadruplex propensity in the previous, widely used version of the human genome assembly, especially for motifs with a high G4Hunter score, which are expected to be very stable. These G-quadruplex forming sequences probably remained undiscovered in earlier genome datasets because G-rich and repetitive genomic regions were previously unresolved. These observations allow precise targeting of these important regulatory regions.
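
For readers unfamiliar with the G4Hunter scores quoted in the abstract, the following is a minimal Python sketch of how such a score is commonly computed. It is not the authors' pipeline. Each G in a run of n consecutive G's is scored min(n, 4), each C in a run is scored the negative of that, A and T score 0, and the score of a sequence or window is the mean of its per-base values. The function names, the window size of 25 and the sliding-window reporting are assumptions made for illustration; only the 1.4 threshold is taken from the abstract.

```python
def g4hunter_scores(seq):
    """Return per-base scores: G runs positive, C runs negative, capped at 4."""
    seq = seq.upper()
    scores = []
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        # Extend j to the end of the current run of identical bases.
        while j < len(seq) and seq[j] == base:
            j += 1
        run = j - i
        if base == "G":
            value = min(run, 4)
        elif base == "C":
            value = -min(run, 4)
        else:
            value = 0
        scores.extend([value] * run)
        i = j
    return scores


def g4hunter_mean(seq):
    """Mean per-base score of a whole sequence (0.0 for an empty one)."""
    scores = g4hunter_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


def windows_above(seq, window=25, threshold=1.4):
    """List (start, mean) for windows whose absolute mean score exceeds threshold.

    window=25 is an assumed setting, not a value stated in the abstract.
    The absolute value lets C-rich windows (G-rich on the opposite strand) count.
    """
    scores = g4hunter_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) > threshold:
            hits.append((start, mean))
    return hits
```

Calling `g4hunter_mean` on a candidate motif gives a single score that can be compared with the thresholds quoted above, while `windows_above` scans a longer sequence and reports every window that exceeds the chosen threshold on either strand.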
