Understanding the regulation of non-coding RNA transcription in repeat regions
Only 1% of the human genome codes for proteins, yet about 75% of it is transcribed. Further, large parts of this transcribed genome are repetitive in nature; centromeres and telomeres are examples. Although MLL and its family members are best studied for their role as histone methyltransferases (HMTs) in transcription, many aspects of even this role are not well understood. For instance, non-coding repeat regions such as centromeres have been shown to bear H3K4 methylation marks and to support RNA polymerase II-mediated transcription. However, the identity of the methyltransferase that deposits these marks to facilitate transcription in these regions is not known.
We are investigating the role of the MLL family in shaping the epigenetic landscape of centromeres. Centromeres are conserved among eukaryotes, yet they pose an evolutionary conundrum: they are defined epigenetically by centromeric protein A (CENP-A), a histone H3 variant, and not by the presence of α-satellite DNA. Although initially thought to be transcriptionally silent, centromeric chromatin is now known to be transcribed by RNA polymerase II to produce centromere RNA (cenRNA) transcripts. Moreover, histone modifications, centromere transcription, and cenRNA are important for centromere and kinetochore assembly and function.
In an ongoing project, we show that MLL and SETD1A regulate cenRNA transcription. MLL regulates the chromatin state and stability of the centromere by depositing the H3K4me2 mark there. Our data reveal a novel molecular framework in which both the H3K4 methylation mark and the methyltransferases regulate the stability and identity of the centromere.