Chapter 10 PDF With Notes
Chapter 10: Transcription and
RNA Processing
10.1 Types of RNA
In Chapter 10, we will learn more about the different types of RNA and how RNA is
transcribed from the DNA sequence. We will start with section 10.1 covering the types of
RNA.
1
DNA vs RNA
Recall that RNA has two key chemical differences from DNA. RNA uses the sugar
ribose, which carries a 2’-OH group, and it uses the base uracil in place of the thymine
found in DNA.
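The base difference can be illustrated with a minimal sketch. It assumes the input is the DNA coding (non-template) strand written 5’ to 3’, so the matching RNA is obtained simply by swapping thymine for uracil; the function name `dna_to_rna` is hypothetical.

```python
def dna_to_rna(coding_strand: str) -> str:
    """Return the RNA sequence matching a DNA coding strand.

    Assumes the coding (non-template) strand is given 5'->3',
    so the only change needed is thymine (T) -> uracil (U).
    """
    return coding_strand.upper().replace("T", "U")


print(dna_to_rna("ATGGCTTAA"))  # AUGGCUUAA
```

The 2’-OH difference is a property of the sugar backbone and is not visible in a sequence string, so it is not modeled here.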
2
DNA vs RNA
These minor chemical differences result in major structural differences between the
molecules. While DNA is held in the relatively rigid structure of the double helix, RNA is
usually single stranded and can therefore adopt a much wider variety of shapes. RNA can
largely be divided into two types: coding RNA, which carries the code for making proteins
and is also called messenger RNA (mRNA), and non-coding
RNA (ncRNA). The ncRNA can be subdivided into several different types,
depending either on the length of the RNA or on its function. Size
classification begins with the short ncRNAs (~20–30 nt), which include
microRNAs (miRNAs) and small interfering RNAs (siRNAs); the small ncRNAs (up to 200
nt), which include transfer RNA (tRNA), small nuclear RNA (snRNA), and small
nucleolar RNA (snoRNA); and the long ncRNAs (> 200 nt), which include ribosomal
RNA (rRNA), enhancer RNA (eRNA), and long intergenic ncRNAs (lincRNAs),
among others.
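The size classification can be sketched as a simple lookup. The cutoffs follow the approximate ranges given in the notes (about 30 nt and 200 nt); treating them as hard boundaries, and the function name `ncrna_size_class`, are assumptions made only for illustration.

```python
def ncrna_size_class(length_nt: int) -> str:
    """Assign a non-coding RNA to a size class by its length in nucleotides.

    The 30 nt and 200 nt cutoffs are the approximate ranges from the notes,
    treated here as hard boundaries for simplicity.
    """
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    if length_nt <= 30:
        return "short ncRNA"
    if length_nt <= 200:
        return "small ncRNA"
    return "long ncRNA"


# Typical lengths used as examples: a ~22 nt miRNA, a ~76 nt tRNA,
# and a hypothetical 1500 nt lincRNA.
for name, length in [("miRNA", 22), ("tRNA", 76), ("lincRNA", 1500)]:
    print(name, "->", ncrna_size_class(length))
```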
3
Translation
Cells access the information stored in DNA by creating RNA, through the
process of transcription, which then directs the synthesis of proteins through
the process of translation. The three main types of RNA directly involved in
protein synthesis are messenger RNA (mRNA), ribosomal RNA (rRNA), and
transfer RNA (tRNA). The mRNA carries the message from the DNA, which
controls all of the cellular activities in a cell. If a cell requires a certain protein to
be synthesized, the gene for this product is “turned on” and the mRNA is
synthesized through the process of transcription. The mRNA then interacts
with ribosomes and other cellular machinery to direct the synthesis of the
protein it encodes during the process of translation. mRNA is relatively
unstable and short-lived in the cell, especially in prokaryotic cells, ensuring that
proteins are only made when needed.
4
Micro RNA (miRNA)
5
Types of RNA Molecules
At steady state, the vast majority of human cellular RNA consists of rRNA (∼90%
of total RNA for most cells; Figure 10.5). Although there is less tRNA by mass,
the small size of tRNAs means that their molar level is higher than that of rRNA (Figure 10.5).
Other abundant RNAs, such as mRNA, snRNA, and snoRNA, are present in
aggregate at levels that are about 1–2 orders of magnitude lower than rRNA
and tRNA (Figure 10.5). Certain small RNAs, such as miRNAs and piRNAs, can be
present at very high levels; however, this appears to be cell type dependent.
lncRNAs are present at levels that are two orders of magnitude lower than total
mRNA. However, individual lncRNAs may have a very restricted expression pattern
and thus accumulate to higher levels within specific cell types. The number of
distinct lncRNAs is also large: sequencing of mammalian
transcriptomes has revealed that more than 100,000 different lncRNA molecules
can be produced, compared with approximately 20,000 protein-coding
genes. The diversity and functions of the transcriptome within biological
processes are currently a highly active area of research.
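The difference between abundance by mass and abundance by mole can be shown with a short calculation. This is only a sketch: the 90% rRNA mass fraction comes from the notes, but the 8% tRNA mass fraction and the average lengths (1800 nt for rRNA, 76 nt for tRNA) are illustrative assumptions, not measured values.

```python
def relative_molar_amount(mass_fraction: float, length_nt: float) -> float:
    """Relative number of molecules for an RNA class.

    Moles are proportional to mass divided by molecular size; length in
    nucleotides is used as a stand-in for molecular weight.
    """
    return mass_fraction / length_nt


# 0.90 is the rRNA mass fraction from the notes; 1800 nt is an assumed average length.
rrna = relative_molar_amount(0.90, 1800)
# Both the 0.08 mass fraction and the 76 nt length are illustrative assumptions.
trna = relative_molar_amount(0.08, 76)

print(trna > rrna)  # True: fewer grams of tRNA, but more molecules
```

Because each tRNA is so much shorter, a small mass of tRNA corresponds to many more individual molecules than a larger mass of rRNA.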
6
Chapter 10: Transcription and
RNA Processing
10.2 RNA Polymerase Enzymes
RNA Polymerase Enzymes (RNAPs) are required to carry out the process of
transcription and are found in all cells ranging from bacteria to humans. All
RNAPs are multi-subunit assemblies, with bacteria having five core subunits
that have homologs in archaeal and eukaryotic RNAPs. In this section, we will
learn more details about this enzyme class.
7
Transcription Initiation
Transcription takes place in several stages. To start with, the RNA polymerase
holoenzyme locates and binds to promoter DNA. At this stage the RNAP
holoenzyme is in the closed conformation (RPc). Initial specific binding to the
promoter by the sigma factor of the holoenzyme sets in motion conformational
changes in which the RNAP molecular machine bends and wraps the DNA, with
mobile regions of RNAP playing key roles. Next, RNAP separates the two
strands of DNA and exposes a portion of the template strand. At this point, the
DNA and the holoenzyme are said to be in an ‘open promoter complex’ (RPo),
and the section of promoter DNA that is within it is known as a ‘transcription
bubble’.
8
RNAP
The RNAP catalytic core within bacteria contains five major subunits
(α2ββ’ω). Positioning this catalytic core on the correct promoter requires the
association of a sixth subunit called the sigma factor (σ). Within bacteria there
are multiple different sigma factors that can associate with the catalytic core of
RNAP and direct it to the correct DNA locations, where
RNAP can then initiate transcription. For example, within E. coli σ70 is the
housekeeping sigma factor that is responsible for transcribing most genes in
growing cells. It keeps essential genes and pathways operating. Other sigma
factors are activated during certain environmental situations, such as σ38, which
is activated during starvation or when cells reach the stationary phase. When
the sigma subunit associates with the RNAP catalytic core, the RNAP has
formed the holoenzyme. When bound to DNA, the holoenzyme conformation
of RNAP can initiate transcription. Once the transcription bubble has formed
and transcription initiates, the sigma subunit dissociates from the complex and
the RNAP catalytic core continues elongation on its own.
9
Eukaryotic RNA Polymerase Enzyme
In eukaryotic cells, three RNAPs share the task of transcription, the first step in
gene expression. RNA Polymerase I (Pol I) is responsible for the synthesis of the
majority of rRNA transcripts, whereas RNA Polymerase III (Pol III) produces
short, structured RNAs such as tRNAs and 5S rRNA. RNA Polymerase II (Pol II)
produces all mRNAs and most regulatory and untranslated RNAs.
The three eukaryotic RNA polymerases contain homologs to the five core
subunits found in prokaryotic RNAPs. In addition, the eukaryotic Pol I, Pol II and
Pol III have five additional subunits, forming a catalytic core that contains 10
subunits. The core has a characteristic crab-claw shape which encloses a
central cleft that harbors the DNA, and has two channels, one for the substrate
NTPs and the other for the RNA product. Two ‘pincers’, called the ‘clamp’ and
‘jaw’, stabilize the DNA at the downstream end and allow opening and closing of
the cleft. For transcription to occur, the enzyme has to maintain a transcription
bubble with separated DNA strands, facilitate the addition of nucleotides,
translocate along the template, stabilize the DNA:RNA hybrid and finally allow
the DNA strands to reanneal. This is achieved by a number of conserved
elements in the active site, which include the fork loop(s), rudder, wall, trigger
loop and bridge helix.
10
Here you can see a comparison of all three RNA polymerases from eukaryotes. They are all
homologs and share a similar core complex, shown in the gray and tan regions. The stalk
region is also a related structure in all the polymerases. The Pol I and Pol III enzymes have
more subunits than Pol II. They create a more limited array of transcripts and
tend to permanently incorporate transcription factors into their core structures. Pol II,
however, has a much more diverse array of targets and therefore needs to bind a wider
array of transcription factors. These factors are not permanently
incorporated, giving a smaller polymerase structure. In the next section, we will focus on
the activity of RNA polymerase II.
11
Chapter 10: Transcription and
RNA Processing
10.3 Transcription Factors and the Preinitiation
Complex
12
General Transcription Factors
Transcription factors that are critical for the recruitment of RNA Polymerase II are typically
referred to as the TFII class of transcription factors. They are further identified by a letter
(TFIIA, TFIIB, TFIID, and so on). Class II gene transcription is regulated at various levels: while
assembling on chromatin, before and during transcription initiation,
throughout elongation and mRNA processing, and at termination. A host of
activators and repressors has been reported to regulate transcription, including
a central multisubunit complex called the Mediator that helps in the
recruitment of general transcription factors (GTFs) and the activation of RNA
Pol II. Here we will focus on the assembly of the GTFs that make up the
core preinitiation complex (PIC) during transcriptional activation.
13
Preinitiation Complex (PIC)
14
TBP – TATA Box Binding Protein
The TBP component of TFIID binds to a specific DNA sequence called the
TATA box. This DNA sequence is found about 30 base pairs upstream of the
transcription start site in many eukaryotic gene promoters.
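Locating a TATA box relative to the transcription start site can be sketched as a motif search. The notes do not give the motif itself, so the commonly cited consensus TATA(A/T)A(A/T) is assumed here; the promoter string is a made-up toy sequence and `find_tata_boxes` is a hypothetical helper.

```python
import re

# Assumed consensus TATA(A/T)A(A/T); the notes only name the element.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(promoter: str, tss_index: int) -> list:
    """Return start positions of TATA-like motifs relative to the TSS.

    Negative numbers are upstream of the transcription start site.
    Only motifs that begin upstream of the TSS are reported.
    """
    hits = []
    for match in TATA_PATTERN.finditer(promoter.upper()):
        if match.start() < tss_index:
            hits.append(match.start() - tss_index)
    return hits


# Toy promoter: 20 nt of filler, a TATA box, 23 nt of spacer, then the TSS.
toy_promoter = "GC" * 10 + "TATAAAA" + "G" * 23 + "ATGGCC"
print(find_tata_boxes(toy_promoter, tss_index=50))  # [-30]
```

A real promoter scan would also need to handle the opposite strand and degenerate matches, which this sketch ignores.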
15
TBP – TATA Box Binding Protein
If inhibitors bind with TBP, it will be unable to interact with the DNA and start the formation
of the preinitiation complex.
16
TBP – TATA Box Binding Protein
In the absence of inhibitors, TFIID will bind with the DNA and the TATA Box Binding protein
will scan the DNA for the TATA box sequence.
17
TBP – TATA Box Binding Protein
When TBP binds to a TATA box within
the DNA, it distorts the DNA by inserting amino acid side chains between base
pairs, partially unwinding the helix, and doubly kinking it. Note that TBP is
released from TFIID during this process.
18
TBP – TATA Box Binding Protein
The diagram on the right shows the TFIID complex in scanning mode. Notice
that several transcription factor binding sites are blocked during this time, so
that the preinitiation complex won’t form inappropriately. When TBP binds with
a TATA box and dissociates from TFIID, this opens up a binding site for TFIIB
which will in turn bind with RNA Polymerase II, and initiate the formation of the
preinitiation complex.
19
Preinitiation Complex
This slide shows that assembly a little more clearly. TFIID binds to the DNA. When the
TBP finds a TATA box, it releases from TFIID and can recruit TFIIB and the RNA polymerase II
enzyme, in addition to other TFII factors. TFIID may dissociate completely or stay loosely
associated with TFIIA. TFIIB is critical for the transition of RNA Pol II from the closed to the
open conformation, at which point transcription can begin. In most eukaryotes, after
synthesizing about 20–100 bases, RNA Pol II can pause (promoter-proximal
pausing) and then disconnect from promoter elements and other components of
the transcription machinery, giving rise to a fully functional elongation complex
in a process called promoter escape. The promoter-bound components of the
PIC, in contrast, remain in place, and thus only TFIIB, TFIIF, and RNA Pol II need
to be recruited for re-initiation, significantly increasing the transcription rate in
subsequent rounds of transcription.
20
Enhancer Elements
21
Chapter 10: Transcription and
RNA Processing
10.4 Transcription Elongation and Termination
22
Prokaryotic
Transcription
Elongation
Once transcription has been initiated, the elongation phase is where the RNA polymerase
enzyme will use the template strand of the DNA to create the nascent RNA. This diagram
shows the catalytically active RNA Polymerase. You can see that within the enzyme catalytic
center, a small transcription bubble forms where the DNA helix is unwound. The template
DNA strand is shown in blue. The Mg2+ cofactor is shown in red positioned at the catalytic
active site.
23
Prokaryotic Transcription Elongation
This diagram shows the catalytic activity of the polymerase in a little more detail. The DNA
template is shown in grey and the nascent RNA strand is shown in red. The polymerase has
two major conformations during the process of elongation. The closed catalytic
conformation is used when the polymerase is adding nucleotides to the nascent
chain. Notice here that the diphosphate (pyrophosphate) is cleaved from the incoming
nucleotide. Further hydrolysis of the diphosphate releases energy that helps to drive this
reaction forward.
Once the incoming nucleotide has been added, the polymerase has to translocate
down the DNA template to open up the position for the next nucleotide to enter. It does
this by switching to an open conformation that is more flexible. This allows the polymerase
to essentially pull itself along the DNA template, one nucleotide at a time.
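The add-then-translocate cycle described above can be sketched as a loop over the template strand. This is a simplified model, assuming the template is supplied in the 3’ to 5’ direction in which the polymerase reads it, so the RNA comes out 5’ to 3’; the names `PAIRING` and `elongate` are hypothetical, and no conformational detail is modeled.

```python
# Base-pairing rules for a DNA template directing RNA synthesis.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def elongate(template_3to5: str) -> str:
    """Build the nascent RNA (5'->3') from a DNA template read 3'->5'.

    Each loop iteration stands for one cycle: the complementary nucleotide
    is added (closed conformation), then the polymerase moves one position
    along the template (open conformation).
    """
    nascent_rna = []
    for template_base in template_3to5.upper():
        nascent_rna.append(PAIRING[template_base])  # nucleotide addition
        # moving to the next loop iteration represents translocation by 1 nt
    return "".join(nascent_rna)


print(elongate("TACCGAATT"))  # AUGGCUUAA
```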
24
Prokaryotic Transcription Elongation
Within the catalytic center, a bridge region, shown in yellow, and a hinge loop shown in
pink, are required for catalysis and translocation. In the closed conformation, the
polymerase is modestly flexible, which enables the positioning of the Mg2+ cofactor to
facilitate nucleotide addition. Following nucleotide addition, the bridge assumes an open
conformation that is much more flexible. Movement of the hinge-loop causes bending of
the bridge and the polymerase can shift down on the DNA template.
25
Transcriptional Backtracking
Figure from: Sharma, A.K. and Chowdhury, D. (2012) Biophys Rev & Let 07:03n04
The elongation phase does not always move smoothly in the forward direction. The
elongation complex can regress in a process called backtracking, which can
be caused by a misincorporation event or by brief pauses of the RNA
polymerase complex. Backtracking stalls the elongation process, and
the complex must then be rescued by additional protein helpers.
26
Gre Factors to the Rescue!
Gre factors are involved in rescuing a stalled transcriptional elongation complex. During
normal elongation, the Gre protein is bound to the active elongation complex but does not
exert any activity on the complex. It essentially stays out of the way, like a tool in a tool
belt. However, upon backtracking or nucleotide misincorporation, the Gre factor provides
its own trigger loop domain, which supplants the normal catalytic trigger loop of the RNA
polymerase enzyme that we saw in more detail in a previous slide. The Gre factor will then
cleave off any overhanging RNA from the backtracked complex. The elongation complex
is then rescued, and the normal trigger loop domain of the RNA polymerase is again
functional and can resume the elongation process.
27
Prokaryotic Transcription Termination
Termination of transcription in prokaryotes can occur using an intrinsic model. The diagram
on the left shows the polymerase in yellow, in the open conformation with the DNA in blue
and the nascent RNA in red. Intrinsic termination occurs at specific template
sequences – an inverted repeat followed by a run of A residues. This sequence
causes the formation of a short stem-loop structure in the nascent RNA
chain which in essence will derail the polymerase from continuing.
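The signature of an intrinsic terminator can be sketched as a pattern search on the RNA. Because the template carries an inverted repeat followed by a run of A residues, the transcript carries a self-complementary stem followed by a run of U residues. The stem length, loop range, and U-run length below are arbitrary illustrative parameters, only Watson-Crick pairs are considered, and both function names are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem=6, loop_range=(3, 8), u_run=4):
    """Find stem-loops that are immediately followed by a run of U residues.

    Returns (stem_start, stem_end) index pairs, where stem_end is the index
    just past the 3' arm of the stem. Parameters are illustrative only.
    """
    hits = []
    for start in range(len(rna) - stem + 1):
        left_arm = rna[start:start + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem + loop
            right_arm = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + u_run]
            if (len(right_arm) == stem
                    and right_arm == reverse_complement(left_arm)
                    and tail == "U" * u_run):
                hits.append((start, right_start + stem))
    return hits


# Toy transcript: 5 nt leader, 6 nt stem arm, 4 nt loop, matching arm, U-run.
toy_rna = "AUGAU" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
print(find_intrinsic_terminators(toy_rna))  # [(5, 21)]
```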
28
Prokaryotic Transcription Termination
Extrinsic factors can also be involved in the termination process. This is most clearly
seen in the functioning of the Rho protein. Rho is a hexameric, horseshoe-shaped protein
that can clamp onto the nascent mRNA as it extends outward from
the RNA polymerase machinery. Biochemical and structural data suggest
that Rho initially binds to RNA in an open, ‘lock-washer’ conformation that
closes into a planar ring as RNA transfers to the central cavity. There, the ssRNA
contacts an asymmetric secondary binding site (SBS). Upon hydrolysis of ATP,
the ssRNA is pulled through by conformational changes of the conserved Q and R
loops of the SBS, leading to Rho translocation, and ultimately promoting RNA
polymerase (RNAP) dissociation.
29
Eukaryotic Termination
30
Chapter 10: Transcription and
RNA Processing
10.5 Processing of RNA
31
Prokaryotic mRNA Processing
32
Eukaryotic RNA Processing
In multicellular organisms almost every cell contains the same genome, yet
complex spatial and temporal diversity is observed in gene transcripts. This is
achieved through multiple levels of processing leading from gene to protein, of
which RNA processing is an essential stage. Following transcription of a gene by
RNA polymerases to produce a primary mRNA transcript, further processing is
required to produce a stable and functional mature RNA product. This involves
various processing steps, including RNA cleavage at specific sites; intron
removal, called splicing, which substantially increases the transcript repertoire;
and the addition of a 5’ CAP. Another crucial feature of the RNA processing of
most genes is the generation of 3′ ends through an initial endonucleolytic
cleavage, followed in most cases by the addition of a poly(A) tail, a process
termed 3′ end cleavage and polyadenylation.
33
3’- Polyadenylation
34
The Location of Polyadenylation
More than 70% of all genes harbour more than one polyadenylation signal
(PAS). This gives rise to transcript isoforms differing at the mRNA 3′ end. While
alternative polyadenylation (APA) in 3′UTR changes the properties of the mRNA
(stability, localisation, translation), internal PAS usage (in introns or the coding
sequence (CDS)) changes the C-termini of the encoded protein, resulting in
different functional or regulatory properties.
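Scanning a transcript for more than one polyadenylation signal can be sketched as follows. The notes do not state the signal sequence; the canonical AAUAAA hexamer is assumed here, and real transcripts also use variant hexamers that this sketch ignores. The toy sequence and the name `find_pas_sites` are hypothetical.

```python
def find_pas_sites(mrna: str, signal: str = "AAUAAA") -> list:
    """Return every start index of the polyadenylation signal in an mRNA.

    The default hexamer AAUAAA is an assumption (the canonical signal);
    overlapping occurrences are also reported.
    """
    sites = []
    position = mrna.find(signal)
    while position != -1:
        sites.append(position)
        position = mrna.find(signal, position + 1)
    return sites


# Toy transcript with two signals, which could give two 3' end isoforms.
toy_mrna = "AUGGCC" + "AAUAAA" + "GGCCGGCC" + "AAUAAA" + "GC"
sites = find_pas_sites(toy_mrna)
print(sites)           # [6, 20]
print(len(sites) > 1)  # True: candidate for alternative polyadenylation
```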
35
5’- CAP Formation
36
In addition to the polyA tail, nuclear export of RNA is regulated by the cap
binding complex (CBC), which binds to 7-methylguanylate-capped RNA.
37
The CBC is then recognized by the nuclear pore complex and the mRNA is
exported. Once in the cytoplasm, after the pioneer round of translation, the CBC
is replaced by the translation factors eIF4E and eIF4G of the eIF4F complex. This
complex is then recognized by other translation initiation machinery, including
the ribosome, aiding in translation efficiency. The 5’ CAP also protects the message
from degradation in two ways. First, it functionally resembles the 3’ end of the RNA
and so evades the 5’ degradation exonucleases. Second, when the 5’ CAP is bound by
the translation machinery, it is hidden from the decapping enzyme, which increases the
lifespan of the message.
38
Decapping Enzymes and Degradation
39
Intron Splicing
Eukaryotic organisms also have large tracts of non-coding regions interspersed throughout
gene sequences, known as intron sequences. These introns must be removed from the pre-
messenger RNA before it translocates into the cytoplasm to undergo the translation
process. Intron removal is mediated by the spliceosome, which is a macromolecular
complex formed by five small nuclear ribonucleoproteins (snRNPs), termed U1,
U2, U4, U5, and U6, and approximately 200 proteins. Only the snRNP molecules
are shown in the diagram above.
40
Intron Splicing
If we take a closer look at the pre-mRNA that needs to be spliced, we can see that there are
some key sequence elements within the intron sequence that enable the snRNA molecules
to recognize the intron and choose the correct excision sites for the splicing. Exon 1 is
shown here in blue and Exon 2 in grey. There is a key GU sequence on the 5’ side of the
intron and a key AG sequence on the 3’ side that identify the borders of the intron.
There is also a polypyrimidine sequence (Py) that is a key recognition sequence for
snRNP binding, and a branch point sequence that contains the critical adenine
base that will be involved in the splicing reaction.
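The boundary signals listed above can be checked with a few string tests. This is a minimal sketch: the minimum polypyrimidine length of 6 is an arbitrary illustrative threshold, the intron string is a toy sequence, and `intron_signals` is a hypothetical helper. Branch point recognition is not modeled.

```python
import re


def intron_signals(intron: str, min_py: int = 6) -> dict:
    """Report which key recognition signals a candidate intron contains.

    Checks the 5' GU, the 3' AG, and a polypyrimidine (C/U) run of at
    least min_py nucleotides. The min_py threshold is illustrative only.
    """
    intron = intron.upper()
    py_pattern = "[CU]{%d,}" % min_py
    return {
        "five_prime_GU": intron.startswith("GU"),
        "three_prime_AG": intron.endswith("AG"),
        "polypyrimidine_tract": re.search(py_pattern, intron) is not None,
    }


toy_intron = "GUAAGUACUAACUUUCUUUCCUCAG"
print(intron_signals(toy_intron))
```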
41
Two Transesterification Reactions
42
Two Transesterification Reactions
So again, for the chemistry that will be occurring during the reaction, the key elements are
the identification of the correct cleavage sites at the 5’ and 3’ ends of the intron, and the
key Adenine base that is found within the branch point sequence.
43
Two Transesterification Reactions
The first transesterification occurs by the positioning of the 2’-OH group of the
branchpoint Adenine near the 5’-phosphate of the Guanine residue at the 5’-edge of the
intron. The 2’-OH will mediate nucleophilic attack on the 5’-phosphate group of the
guanine residue in the intron. You guessed it! This forms an oxyanion intermediate, and
when the electrons rebound to reform the P-O double bond, the 3’-OH of the last
guanine residue in Exon 1 will serve as the leaving group.
44
Two Transesterification Reactions
This forms the first intermediate product. The branchpoint Adenine residue is now
covalently linked to the 5’ guanine residue of the intron at the 2’-OH position of the
adenine. This has released the 3’-OH of Exon 1. This 3’-OH is positioned at the 5’ side of
Exon 2, where it mediates nucleophilic attack at the 5’-phosphate group of the first residue
of Exon 2. This begins the second transesterification with the formation of the oxyanion
intermediate. In this transesterification reaction, the intron will serve as the leaving group
as the P-O double bond reforms, and the two exons will now be precisely joined
together.
45
Two Transesterification Reactions
The final products are shown here. Exon 1 and Exon 2 are joined, and the intron has been
successfully spliced out. The branchpoint adenine residue remains covalently linked to the
5’-phosphate of the intron. This resulting looped structure is known as a lariat structure. A
lariat is a rope used by cowboys to lasso or tether animals. The intron loop resembles this
type of rope, which is how it got its name.
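The outcome of the two transesterification reactions can be sketched at the sequence level. This models only the products (joined exons plus a lariat made of a loop and a tail), not the chemistry; the coordinates, the toy sequences, and the function name `splice` are hypothetical.

```python
def splice(pre_mrna: str, intron_start: int, intron_end: int, branch_index: int):
    """Return (mature_mrna, lariat_loop, lariat_tail) for one intron.

    intron_start/intron_end are slice-style indices of the intron, and
    branch_index is the position of the branch point adenine.
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    if not intron_start <= branch_index < intron_end:
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch_index] != "A":
        raise ValueError("branch point must be an adenine")

    # Step 2 product: Exon 1 joined directly to Exon 2.
    mature_mrna = pre_mrna[:intron_start] + pre_mrna[intron_end:]
    # Step 1 product: the 5' G is linked to the branch A, closing a loop;
    # the nucleotides after the branch A form the lariat tail.
    lariat_loop = pre_mrna[intron_start:branch_index + 1]
    lariat_tail = pre_mrna[branch_index + 1:intron_end]
    return mature_mrna, lariat_loop, lariat_tail


exon1, toy_intron, exon2 = "AUGGCG", "GUAAGUACUAACUUUCUUUCCUCAG", "GCUUAA"
pre_mrna = exon1 + toy_intron + exon2
branch = len(exon1) + 10  # the second A of the ...CUAAC... stretch in the toy intron
mature, loop, tail = splice(pre_mrna, len(exon1), len(exon1) + len(toy_intron), branch)
print(mature)  # AUGGCGGCUUAA
```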
46
Intron Splicing
So now let’s go back to our original diagram and see how the small nuclear RNP molecules
are involved in this process.
47
Intron Splicing
In the first step, you can see that there are a number of small proteins involved in the
recognition of the 3’ intron border sequence, and that the U1 small nuclear RNP molecule
recognizes and binds to the 5’ edge of the intron sequence. This will happen
simultaneously at the 5’ edges of all the introns within the pre-mRNA molecule. This
is known as the commitment complex. Next, the U2 snRNP will bind to the branch point
sequence. This is known as complex A.
48
Intron Splicing
Complex A is converted to Complex B with the binding of three additional snRNP molecules
and the loss of the SF1 protein. U4, U5, and U6 join the complex and cause the folding of
the mRNA so that Exon 1 will be in close proximity to Exon 2. Sorry, there is no U3 snRNP
involved in this process. Researchers likely identified the snRNP molecules before they
knew their complete functions.
49
Intron Splicing
U1 and U4 dissociate from the complex, and U2, U5 and U6 rearrange, forming Complex B*.
50
Intron Splicing
Complex B* quickly shifts conformation to form Complex C, which is catalytically active and
mediates the two transesterification reactions.
51
Intron Splicing
Once the transesterification reactions are complete, the U2, U5, U6 complex releases the
lariat product and the mRNA with the exons correctly joined together. U2, U5, and U6
will then be recycled for another round of splicing.
52
Alternative Splicing
53
Alternative Splicing
There are several different types of alternative splicing (AS) events, which can be
classified into four main subgroups. The first type is exon skipping, which is the major AS event in
higher eukaryotes. In this type of event, a cassette exon is removed from the
pre-mRNA. The second and third types are alternative 3′ and 5′ splice site (SS) selection.
These types of AS events occur when the spliceosome recognizes two or more
splice sites at one end of an exon. The fourth type is intron retention, in which
an intron remains in the mature mRNA transcript. This AS event is much more
common in plants, fungi and protozoa than in vertebrates.
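Exon skipping, the first event type, can be sketched by enumerating the isoforms that arise when each cassette exon is either included or skipped. This covers only exon skipping, not alternative splice sites or intron retention; the exon labels and the function name `cassette_isoforms` are hypothetical.

```python
from itertools import product


def cassette_isoforms(exons, cassette_indices):
    """List every isoform produced by including or skipping cassette exons.

    exons is an ordered list of exon labels; cassette_indices gives the
    positions of exons that may be skipped. All other exons are always kept.
    """
    isoforms = []
    for choices in product([True, False], repeat=len(cassette_indices)):
        skipped = {
            index for index, keep in zip(cassette_indices, choices) if not keep
        }
        isoforms.append(
            [exon for i, exon in enumerate(exons) if i not in skipped]
        )
    return isoforms


# One cassette exon (E2) gives two isoforms: included and skipped.
print(cassette_isoforms(["E1", "E2", "E3"], [1]))
# [['E1', 'E2', 'E3'], ['E1', 'E3']]
```

With n independent cassette exons this simple model yields 2^n isoforms, which illustrates how alternative splicing expands the transcript repertoire.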
54
Alternative Splicing
Other events that affect the transcript isoform outcome include mutually
exclusive exons, alternative promoter usage, and alternative polyadenylation.
Overall, there are many layers of RNA modifications that help to regulate the
lifespan and translation efficiency of messenger RNA and help to control
protein levels within a cell. In the next chapter, we will look in more depth at
the translation process.
55