Structural biology of gene transcription

In eukaryotes, gene transcription is carried out by three related RNA polymerases. RNA Pol I and III transcribe a few dozen genes to produce mainly ribosomal and transfer RNA, respectively. Pol II transcribes all genes coding for proteins, and produces mRNA, which serves as the template for protein synthesis. RNA polymerases are the endpoints of signal transduction pathways, and their regulation underlies cell growth and differentiation. RNA polymerases are very large enzymes, consisting of 12-17 subunits with a total molecular weight of 0.5-0.7 megadaltons. They transiently assemble with many additional factors into large transcription machineries of changing composition. Polymerase-associated factors enable the polymerases to recognize different promoters and to transcribe different classes of genes, to receive different regulatory signals, to direct the co-transcriptional processing of RNA transcripts, and to couple transcription to changes in chromatin structure and modification. Therefore, RNA polymerases are not only the key enzymes of gene expression but also the central coordinators of nuclear events and chromatin transitions.

To elucidate the mechanisms of transcription, we determine three-dimensional structures of RNA polymerases in complex with nucleic acid substrates and protein factors, as well as structures of polymerase-associated factors. We use X-ray crystallography since it allows atomic structure determination of very large and asymmetric macromolecular complexes. We additionally use single-particle cryo-electron microscopy for the structural analysis of large transcription assemblies. We often combine these structural biology methods and complement them with modeling. This integrated structural biology approach can elucidate the architecture of large and transient assemblies.

During the mRNA transcription cycle, Pol II first assembles with general factors at the promoter and initiates mRNA synthesis. Pol II then elongates the RNA chain and finally terminates transcription by dissociating from the DNA template and the RNA product. Pol II is then recycled and can reinitiate at the promoter. Our aim is to obtain enough structural and functional information to visualize the transcription cycle in a three-dimensional movie. We have determined structures of Pol II in functional complexes with DNA and RNA, the initiation factor TFIIB, the elongation factor TFIIS, and inhibitors of transcription. From these, we obtained a first movie of the nucleotide addition cycle. We have also obtained additional structural data on Pol II-associated proteins. We investigated the basis for coupling transcription to RNA 3'-processing and chromatin methylation, and the mechanism of Pol II dephosphorylation during polymerase recycling. We studied how Pol II recognizes DNA lesions for their repair, and clarified the mechanisms underlying mutagenesis, fidelity, and proofreading during transcription.

To understand how different classes of genes are transcribed, we need to study the structure and function of the two larger siblings of Pol II, Pol I and Pol III. These polymerases are central to cell growth, and their deregulation can lead to cancer. Pol I and Pol III have 14 and 17 subunits and molecular weights of around 600 and 700 kilodaltons, respectively. We obtained the first structural data on Pol I and Pol III, including electron microscopic structures of the complete enzymes and crystallographic structures of their specific subunits. We showed that the additional subunits in Pol I and Pol III are related to the Pol II initiation factors TFIIE and TFIIF. In the medium term, we hope to understand how the specific structures, properties, and regulation of the three polymerases evolved.