Introduction

Gene expression in eukaryotes is a compartmentalized process consisting of several different, yet connected steps. The importance of gene expression for living cells and organisms is exemplified by the existence of diverse molecular mechanisms that detect errors and thereby ensure the accuracy of gene expression. A well-known quality control process, referred to as nonsense-mediated mRNA decay (NMD) or alternatively mRNA surveillance, limits the expression of mRNAs with premature termination codons (PTCs) and other aberrant termination events. NMD prevents the biosynthesis of C-terminally truncated proteins from PTC-containing mRNAs, which may exert dominant-negative activities and interfere with the normal function of full-length proteins present in the cell (Fig. 1). NMD recognizes not only PTC-containing transcripts, which could result for example from frameshift or nonsense mutations, but also regulates the expression of many physiological mRNAs, the so-called endogenous NMD substrates. Hence, NMD represents a molecular quality control mechanism, but in addition also plays an important role during post-transcriptional gene expression.

Fig. 1 Function of nonsense-mediated mRNA decay. Top: A normal transcript consists of a 5′ cap, followed by a 5′ UTR, the ORF, a 3′ UTR, and a poly(A) tail. The ORF of a transcript with a normal termination codon is translated into a full-length protein. Bottom: A transcript carrying a premature termination codon (PTC) has a shortened ORF and an elongated 3′ UTR. Instead of being continuously translated into a truncated protein with a possibly deleterious function in the cell, the PTC-containing transcript is degraded via NMD.

In this review, we aim to provide a comprehensive overview of the factors involved in NMD, their characteristics, and their molecular functions. We also describe NMD-activating features of different types of NMD substrates. Furthermore, we present a model of the NMD mechanism, which explains how the interplay of different factors helps to recognize different classes of mRNAs. Finally, we discuss the function of NMD factors during normal embryonic development and how NMD influences the phenotypes of human disorders. This review focuses on mammalian NMD, but we also provide information about essential aspects of NMD in other organisms.

Terminating the message: steps in normal translation termination

The expression of protein coding genes is essential for cellular functionality, with the final step being cytoplasmic translation of mature mRNA transcripts into proteins by the ribosome. Translation termination plays a pivotal role in the lifespan of an mRNA, because the position of the termination codon and the efficiency of the termination process determine mRNA stability [1–5]. The termination process begins when a stop codon (UAA, UAG, or UGA) enters the A-site of a translating ribosome, causing it to stall. Two important factors involved in the ensuing steps of translation termination are the class I release factor 1 (in humans eRF1) and the class II release factor 3 (in humans eRF3). Using RNA interference, it was shown that human eRF3 is directly involved in translation termination, albeit in a cell-type specific manner, and is required for eRF1 stability in the cell [6, 7]. Of the two human eRF3 isoforms (eRF3a and eRF3b), eRF3a has been shown in HEK 293 cells to act as the main factor in translation termination [7–9]. A termination codon within the ribosomal A-site is decoded by eRF1, which is known to bind the ribosomal A-site and mimic a tRNA [10, 11]. Instead of promoting peptidyl transfer like a regular tRNA, eRF1 mediates via its GGQ motif the hydrolysis of the synthesized polypeptide chain, which is still attached to the ribosomal P-site [12, 13]. The role of the GTPase eRF3 is to enhance the activity of its interaction partner eRF1 [14]. Indeed, it has been shown in yeast that GTP hydrolysis by eRF3 stimulates the release of the polypeptide chain via eRF1 [15]. Although eRF3 is able to bind GTP, its weak inherent GTPase activity is stimulated by a quaternary complex consisting of eRF1, eRF3, GTP, and the ribosome [16, 17].
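As a purely illustrative aid, the position at which a ribosome would encounter a stop codon can be located by reading an mRNA sequence codon by codon from the start codon. The following minimal sketch assumes a plain nucleotide string and a known start position; the function name `first_in_frame_stop` and the example sequence are hypothetical and not taken from any of the cited studies.

```python
# Illustrative sketch only: locate the first in-frame stop codon of an ORF.
# The function name and the example sequence are hypothetical.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_in_frame_stop(mrna, start):
    """Return the 0-based index of the first in-frame stop codon, or None.

    mrna  -- nucleotide sequence (DNA-style T is converted to U)
    start -- 0-based index of the first nucleotide of the start codon
    """
    seq = mrna.upper().replace("T", "U")
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    example = "GGAUGGCUUAAGG"  # AUG at index 2, UAA at index 8
    print(first_in_frame_stop(example, 2))
```

Only stop codons in the reading frame defined by the start codon are reported; a UAA, UAG, or UGA triplet in another frame is ignored, in line with the fact that termination depends on the codon presented in the ribosomal A-site.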

Upon peptide release mediated by eRF1, the mRNA-bound ribosome needs to be removed to free up space for any approaching, upstream ribosomes and any subsequent termination events. This dissociation of the 80S complex is executed by the ATP-binding cassette subunit family E member 1 (ABCE1) ATPase and involves ATP hydrolysis. The exact mechanism by which ABCE1 mediates ribosome recycling is not yet fully understood; however, several steps of this process have already been illuminated. It is known, for example, that eRF1 and ABCE1 interact with each other and that this interaction is required for efficient ribosome splitting [18]. Furthermore, ABCE1 is known to interact with the ribosome and this interaction is mediated by a FeS cluster domain of ABCE1 [19, 20]. This multistep process of translation termination ensures efficient ribosome recycling with continuous rounds of translation and increases mRNA survivability in general.

Several other factors besides the aforementioned ones are involved in translation termination. One such prominent factor in human cells is cytoplasmic poly(A)-binding protein 1 (PABPC1), which binds to the 3′ poly(A) tail of an mRNA. PABPC1 consists of four RNA recognition motifs (RRMs) in the N-terminal region of the protein [21, 22]. The first two RRMs are responsible for PABPC1-binding to the poly(A) tail [23]. The C-terminal region of PABPC1 contains the so-called MLLE (pronounced Mademoiselle) domain, which mediates the interaction with eRF3a [24, 25]. Two PAM2 motifs in the N terminus of eRF3a interact directly with the MLLE domain of PABPC1 and this interaction is known to stimulate polypeptide release as well as the consequent ribosome recycling [26–28]. PABPC1 interacts via its second RRM with eukaryotic initiation factor 4G (eIF4G), which is another important binding partner [29, 30]. eIF4G is part of a larger, multisubunit complex termed the eIF4F complex consisting of eIF4E, eIF4G, and eIF4A. This eIF4F complex is bound to the cap of mRNAs in the cytoplasm via its eIF4E subunit [31]. The interaction between PABPC1 and eIF4G is believed to give rise to a looped mRNA termed the closed loop. The closed loop model predicts a close proximity between a 3′ terminating ribosome and the 5′ end of the mRNA, which is believed to play a role in facilitating ribosome recycling as well as translation initiation [32–34].

PTC carrying transcripts activate NMD

NMD as a quality control mechanism targets aberrant mRNAs harboring a PTC more than approximately 50 to 55 nucleotides upstream of the final exon–exon junction of a transcript. PTCs downstream of this border fail to initiate efficient degradation of a transcript via NMD [35–38]. It is of interest to note that not all mRNAs fully adhere to this 50 to 55 nucleotide rule. For example, the T-cell receptor β (TCR-β) transcript represents one well-known exception [39]. These findings suggest that intron position and nuclear splicing are important determinants for whether NMD is activated or not. This is emphasized by the fact that PTC-containing, but intronless transcripts are immune to degradation via NMD [40–42]. It was later found that a multisubunit protein complex termed exon junction complex (EJC) is deposited 20 to 24 nucleotides upstream of an exon–exon junction during splicing and remains bound to the mRNA until it is displaced during translation in the cytoplasm [43–46]. Although ribosomal transit is likely sufficient to remove EJCs located within the ORF, the protein PYM acts as an additional specific EJC disassembly factor [47]. It has been shown that PYM interacts with the heterodimer MAGOH/Y14, part of the EJC core, via its N-terminal region, thereby causing the disassembly of the EJC from the mRNA [47]. The EJC core complex is composed of four proteins: the heterodimer MAGOH/Y14, Barentsz (BTZ, MLN51 or CASC3), and the DEAD box protein eukaryotic initiation factor 4A-3 (eIF4A3). This core is deposited onto the RNA in the nucleus and stays attached during export to the cytoplasm, as well as in the cytoplasm, by inhibition of the ATPase activity of eIF4A3 [48, 49]. The crystal structure of the EJC core complex revealed that the heterodimer Y14/MAGOH stabilizes the closed conformation of eIF4A3, effectively locking the EJC to the mRNA [50]. BTZ has been shown to interact with eIF4A3 by wrapping around the protein as well as interacting with MAGOH [49].

The EJC serves as a binding platform for NMD factors, thereby providing a direct molecular link between splicing and NMD [51, 52]. The role of the EJC in NMD has further been elucidated by tethering and RNA interference assays. The mRNA levels of a β-globin reporter construct were reduced when Y14 was artificially tethered to the reporter [53]. At the same time, the significance of the EJC for NMD was shown by the knockdown of Y14, which impaired the degradation of a PTC-containing β-globin reporter construct [53].
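The 50 to 55 nucleotide rule lends itself to a simple positional check. The sketch below is a minimal, hedged illustration of that rule and not a validated NMD predictor: the function name, the coordinate convention (0-based positions on the spliced mRNA, with the distance measured from the last nucleotide of the stop codon to the final exon–exon junction), and the default threshold of 50 nucleotides are all assumptions made for this example. Exceptions such as TCR-β and EJC-independent NMD are not captured.

```python
# Illustrative sketch of the 50-55 nucleotide rule; not a validated predictor.
# Coordinate conventions and the default threshold are assumptions.


def predicted_ejc_dependent_nmd(stop_pos, exon_junctions, threshold=50):
    """Apply the 50-55 nt rule to a stop codon on a spliced mRNA.

    stop_pos       -- 0-based index of the first nucleotide of the stop codon
    exon_junctions -- 0-based indices of the first nucleotide of each
                      downstream exon (one entry per exon-exon junction)
    threshold      -- minimal distance in nucleotides between the end of the
                      stop codon and the final exon-exon junction
    """
    if not exon_junctions:
        # Intronless transcripts carry no EJC and escape this mode of NMD.
        return False
    last_junction = max(exon_junctions)
    # Nucleotides between the end of the stop codon and the last junction.
    distance = last_junction - (stop_pos + 3)
    return distance >= threshold


if __name__ == "__main__":
    junctions = [300, 600]
    print(predicted_ejc_dependent_nmd(100, junctions))  # far upstream
    print(predicted_ejc_dependent_nmd(580, junctions))  # close to junction
    print(predicted_ejc_dependent_nmd(700, junctions))  # in the last exon
```

A stop codon in the last exon yields a negative distance and is therefore classified as not NMD-inducing, which corresponds to the position of a normal termination codon in most transcripts.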

NMD substrates lacking an EJC

Despite the fact that EJCs have been shown to activate NMD, further evidence also suggests that NMD can be activated without the need for an EJC. A set of experiments examined the effect of EJC-dependent and EJC-independent NMD. A reporter with an intron downstream of a PTC was susceptible to NMD even after knocking down the EJC core factor eIF4A3, albeit to a lesser extent compared to non-knockdown conditions [54]. Degradation of a second reporter with no intron downstream of a PTC was unaffected by the eIF4A3 knockdown [54]. This suggests that an EJC downstream of a PTC certainly has an NMD-enhancing effect, but an EJC is not mandatory for NMD activation.

Several long 3′ UTR containing mRNAs are known to be regulated by NMD even though they lack an EJC downstream of the stop codon [5]. Furthermore, it has been shown that triosephosphate isomerase (TPI) reporter constructs containing different 3′ UTRs (e.g. SMG5 3′ UTR, UPF3b 3′ UTR, and heterologous GFP coding sequence as a 3′ UTR) are able to elicit NMD [55]. Interestingly, the NMD machinery does not distinguish between the 3′ UTRs of natural NMD targets (SMG5 and UPF3b) but also degrades a TPI reporter with the heterologous GFP coding sequence as a 3′ UTR [55]. This suggests that any long 3′ UTR might be sufficient for NMD activation. Conversely, a different analysis of multiple long 3′ UTR containing reporter constructs revealed that some long 3′ UTRs are targeted by NMD, whereas others are not [56]. This was evident by decreased reporter levels being stabilized by a UPF1 (a central NMD factor, explained in detail below) knockdown [56]. These results imply that the mere length of a 3′ UTR is not enough to determine the fate of a transcript. Instead, they suggest the need for a cis factor binding in close proximity to the termination codon, possibly mediated by the AU-content of the 3′ UTR [56]. An example of a long 3′ UTR containing mRNA is depicted in Fig. 3d (top).

A different explanation for long 3′ UTR-mediated NMD holds that UPF1 occupancy on a transcript is responsible for NMD activation. It has recently been shown that UPF1 populates the 3′ UTR of mRNAs in human cells [57–59]. It has been proposed that UPF1, which is displaced from the mRNA by translating ribosomes, increasingly accumulates on the 3′ UTR, where no ribosome-mediated displacement can occur [58, 59]. This occupancy might differ from transcript to transcript, explaining why some long 3′ UTR containing mRNAs are more susceptible to NMD than others.

The faux 3′ UTR model

Different models of NMD activation have been proposed based on the available data. An early model suggested for NMD in yeast is called the “faux 3′ UTR model”. This model advocates that NMD is activated by an aberrant translation termination event. It has been shown in yeast that a PTC lengthens the distance between a terminating ribosome and the yeast poly(A)-binding protein (Pab1), leading to inefficient translation termination [60]. Furthermore, tethering Pab1 downstream of a PTC suppresses NMD, possibly by restoring normal translation termination [60]. Taking into account that yeast lack EJCs, these data suggest that NMD substrates are defined by 3′ UTR length [61]. This indicates that the distance between a termination codon and Pab1 determines whether NMD in yeast is activated or not. This model was proposed prior to findings linking proper and aberrant translation termination in human cells to NMD activation or suppression.

NMD is linked to active translation

First results indicated that cap-binding protein 80 (CBP80)-bound mRNA was targeted by NMD, suggesting that NMD occurs during the pioneer round of translation [62]. However, it was later reported that NMD is able to occur on eIF4E-bound mRNA, suggesting that NMD is not limited to the first round of translation [63, 64].

It has been well established that NMD requires active translation. Secondary structures in the 5′ UTR inhibiting cytoplasmic translation have been linked to NMD inhibition [36, 65]. Additionally, the protein synthesis inhibitors cycloheximide, anisomycin, emetine, pactamycin, and puromycin were able to suppress NMD of T-cell receptor mRNA harboring a PTC [66]. Furthermore, poliovirus infection, which is known to shut down protein translation by inactivating eIF4GI, eIF4GII, and PABP, increased the levels of an out-of-frame TCR-β mRNA in HeLa cells, indicating NMD suppression [66–68]. Additionally, in-frame start (Met) codons downstream of a PTC in the TPI mRNA and the β-globin mRNA serve as sites of translation re-initiation and have also been shown to suppress mRNA degradation via NMD [69, 70]. This suggests that additional aspects besides active translation and factors recruited to and interacting with a stalling ribosome might be responsible for NMD activation.

Aberrant translation termination

An aberrant translation termination event starts out similar to a normal translation termination, namely by a premature translation termination codon (PTC) entering the A-site of a translating ribosome. However, the subsequent termination cascade has to be different from normal termination, resulting in NMD initiation. Indicative of a substantial difference between normal and premature translation termination are the ribosome profiles at both a normal termination codon (NTC) and a PTC. Toeprinting assays in yeast cell extracts fail to show any signals for ribosomes at NTCs, unless eRF1 was defective [60]. In contrast, ribosomes stalling at PTCs show typical toeprinting signals for a ribosome with an occupied A-site independent of eRF1 inactivation [60]. Human β-globin mRNA with a PTC at position 39 (β-globin NS39) is a known NMD target, which has been identified in patients suffering from beta thalassemia [36, 71, 72]. Toeprinting assays of β-globin NS39 show a signal indicating a stalled ribosome at the PTC, whereas β-globin wild-type mRNA does not [73]. These results are indicative of the aberrant nature of translation termination of PTC-containing mRNA and provide evidence for a mechanistic difference between normal and premature translation termination.

The exact nature of how prematurely terminated translation leads to the degradation of an mRNA is still not entirely clear. Several different sets of data pointing to different models exist. One such model is based on the competition between UPF1 and PABPC1 for eRF3 binding. As described above, PABPC1 is known to interact with eRF3a, stimulating peptide release and ribosome recycling. UPF1 has been shown to interact with eRF3 as well; however, the interaction between PABPC1 and eRF3 seems favored over the UPF1–eRF3 interaction [5, 74]. The fact that both PABPC1 and UPF1 can bind eRF3 but PABPC1 binding to eRF3 is preferred leads to the idea that PABPC1 might be able to antagonize the interaction between eRF3 and UPF1. Indeed, it has been shown that increasing amounts of PABPC1 can prevent the interaction between UPF1 and eRF3 in vitro [5]. This finding suggests that when a translating ribosome stalls at a PTC the interaction between PABPC1 and eRF3 is outcompeted by UPF1, possibly due to the large physical distance between PABPC1 and the termination complex. This model is further supported by the fact that a MS2-tagged PABPC1 tethered downstream but in close proximity to a PTC of a β-globin NS39 reporter mRNA prevented NMD [3, 5]. Tethering PABPC1 close to the PTC shortens the distance between PABPC1 and eRF3, which suggests that PABPC1 efficiently outcompetes UPF1 for eRF3 binding. Supporting evidence for this competition model has been shown by detailing two distinct roles for PABPC1 and UPF1 in efficient or aberrant translation termination, respectively. A readthrough assay with a reporter containing the ORF of Renilla luciferase upstream of the ORF of firefly luciferase, with a termination codon in between, showed decreased readthrough upon UPF1 knockdown for all three termination codons [3]. This suggests that UPF1 is able to decrease the efficiency of translation termination.

Another possible mechanism, besides the competition between PABPC1 and UPF1, for NMD suppression involves both ribosome recycling as well as translation initiation. It is known that NMD can target mRNAs with a long 3′ UTR and tethering MS2-tagged PABPC1 at the beginning of a long 3′ UTR can suppress NMD [2, 5, 75, 76]. Additionally, it has been shown that MS2-eIF4G tethered downstream of a termination codon and upstream of a long 3′ UTR or downstream of a PTC is able to suppress NMD in a similar manner as PABPC1 [2, 4]. Point mutations of PABPC1 that abolish the interaction between PABPC1 and eIF4G lose their ability to suppress NMD [2]. In contrast, PABPC1 lacking the ability to interact with eRF3 can still suppress NMD when tethered downstream of a termination codon and upstream of a long 3′ UTR as well as downstream of a PTC [2, 4]. The fact that PABPC1 does not require the interaction with eRF3 to suppress NMD contradicts the findings that PABPC1 and UPF1 compete for eRF3 binding during translation termination at a PTC and, thereby, either suppress or activate NMD, respectively. Additionally, these findings also implicate eIF4G and, therefore, the eIF4F complex in NMD inhibition. This suggests that the interaction cascade involving eRF3, PABPC1, and eIF4G is required for NMD suppression. These interactions promote a normal translation termination event, followed by efficient ribosome release and recycling, essentially antagonizing NMD [1, 2].

Further support for a model that links ribosome recycling and translation initiation to NMD suppression was given by the fact that the eukaryotic initiation factor 3 (eIF3) was shown to be involved in NMD suppression [4]. Although tethering subunits of eIF3 to an NMD-susceptible reporter did not antagonize NMD, knockdown of the eIF3 subunits eIF3f and eIF3h reduced the mRNA levels of an NMD reporter stabilized by tethered eIF4G [4]. Overall, this leads to the paradoxical situation that efficient translation re-initiation after ribosome recycling suppresses NMD, while at the same time efficient translation of a substrate mRNA is required to activate NMD.

Additional evidence implicates the proximity of PABPC1 to a termination codon in NMD suppression. A reporter carrying several PTCs shows gradually less efficient NMD for 5′ as well as 3′ PTCs [76]. The resistance of the more 5′ PTCs to NMD implicates the closed loop structure of a transcript in translation termination. Since 5′ PTCs are not efficiently degraded, it can be suggested that PABPC1 is brought into close proximity to these PTCs by looping of the mRNA, thereby signaling a normal translation termination event. A similar experiment used a reporter construct that harbors two complementary sequences to fold the poly(A) tail back into close proximity of a PTC. This foldback brings PABPC1 into close proximity of the PTC and is able to antagonize NMD even without the depletion of UPF1 [76].

Figure 2 shows a current model of NMD of a PTC-carrying transcript. However, many links between ribosome recycling, translation initiation, and NMD are still missing and need to be unveiled in future research.

Fig. 2 Model for degradation of PTC carrying transcripts. Step 1: Translation is initiated at the AUG start codon. The ribosome starts translation in the 5′ to 3′ direction. Step 2: When the translating ribosome stalls at a PTC, eRF1 and eRF3 interact with and bind to the ribosome. The physical distance between poly(A)-bound PABPC1 and the stalled ribosome is too large for efficient interaction and subsequent translation termination. Step 3: Since the interaction between PABPC1 and eRF3 is inefficient in this scenario, UPF1 is able to interact with eRF3 instead. Furthermore, the EJC serves as a binding platform for UPF3b and UPF2. Additionally, SMG1 is recruited to UPF1. SMG1 not only interacts with UPF1 and UPF2 but also phosphorylates UPF1. Step 4: Phosphorylated UPF1 recruits the SMG5/7 heterodimer as well as the endonuclease SMG6. The stalling ribosome has been removed from the transcript by this point. SMG6 cleaves the transcript in close proximity of the PTC, whereas SMG5/7 recruit the catalytic subunit of the CCR4-NOT deadenylase complex POP2. Step 5: The NMD factors are removed from the cleaved transcript and decapping and deadenylation commence. As a last step, the exonuclease XRN1 is recruited and degrades the transcript in 5′ to 3′ direction, and the exosome supposedly mediates 3′ to 5′ degradation.

Various cellular transcripts are NMD substrates

Selenoproteins are a class of polypeptides containing the trace element Selenium (Se) and are characterized by the incorporation of the unusual amino acid selenocysteine (Sec) [77]. Selenoprotein-encoding transcripts are believed to be degraded via NMD since the codon UGA, more prominently known for being a stop codon, also codes for Sec when it is accompanied by specific sequence elements (SECIS) [78]. It has been shown that the stability of selenoprotein mRNAs is dependent on Se concentrations [79, 80]. Interestingly, about half of all selenoprotein-encoding mRNAs are regulated by NMD, whereas the other half is not [81]. This occurrence is believed to be related to the location of the termination codon in the mRNA of selenoproteins. Sec codons in the last exon are NMD resistant, while Sec codons in any other exon are subjected to NMD [81].

As described previously, many NMD targets have an EJC downstream of a termination codon. This also applies to many transcripts carrying upstream open reading frames (uORFs), which are usually followed by a long main ORF with several splice sites. Microarray analysis in HeLa cells showed that several uORF transcripts are upregulated after UPF1 knockdown [82]. This indicates that transcripts with a uORF can be endogenous targets for the NMD pathway. It is of interest to note that not all uORF containing transcripts are degraded via NMD. Thrombopoietin (TPO) is such an example. TPO contains seven uORFs and has been shown to be unaffected by knockdown of UPF1 [83]. This leads to the current understanding that many uORF transcripts are in fact targeted by NMD, but a uORF-containing transcript can by no means be generally considered an NMD target.
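Candidate uORFs can be identified in a transcript sequence by searching for AUG triplets upstream of the main start codon that are followed by an in-frame stop codon. The following sketch is a simplified illustration under stated assumptions: the function name `find_uorfs`, the input format, and the example sequence are hypothetical, and the presence of a uORF in such a scan does not by itself indicate NMD sensitivity, as the TPO example shows.

```python
# Illustrative sketch only: enumerate candidate uORFs in a spliced mRNA.
# The function name and the example sequence are hypothetical.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna, main_start):
    """Return (start, stop) index pairs for candidate uORFs.

    mrna       -- nucleotide sequence (DNA-style T is converted to U)
    main_start -- 0-based index of the main ORF start codon
    A candidate uORF is an AUG located entirely upstream of main_start
    that is followed by an in-frame stop codon anywhere downstream.
    """
    seq = mrna.upper().replace("T", "U")
    uorfs = []
    for start in range(0, main_start - 2):
        if seq[start:start + 3] != "AUG":
            continue
        # Read codon by codon until the first in-frame stop codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                uorfs.append((start, i))
                break
    return uorfs


if __name__ == "__main__":
    # uAUG at index 2 with stop at index 8; main AUG at index 14.
    example = "CCAUGGCCUAACCCAUGGGGAAAUGA"
    print(find_uorfs(example, 14))
```

Combining such a scan with the positions of exon–exon junctions would indicate whether EJCs remain downstream of the uORF stop codon, which is the feature discussed above as relevant for NMD.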

Alternative splicing is responsible for modulation of gene expression by generating different mRNA transcript isoforms [84]. It has been shown that alternatively spliced pre-mRNAs can be targets of NMD, since many of the transcript variants will carry a PTC [85]. An interesting example is the splicing factor SRSF2 (SC35). SRSF2 is known to regulate its own mRNA expression by altering its splicing pattern [86].

Another subset of NMD targets are so-called long non-coding RNAs (lncRNAs). A study employed growth arrest-specific 5 (GAS5) as an exemplary lncRNA to examine if lncRNAs are indeed degraded via NMD [87]. The study showed that GAS5 transcript levels were increased upon UPF1 depletion, which indicates that NMD is responsible for the degradation of the GAS5 transcript [87]. It is surprising that lncRNAs are NMD targets since NMD is known to require actively translated mRNA. However, a recent study showed that many lncRNAs are indeed protein coding and therefore actively translated [88]. Figure 3 depicts a graphical overview of possible NMD targets and Table 1 lists examples for each class of NMD substrate.

Fig. 3 Examples of NMD substrates. a: Alternative splicing of a pre-mRNA can lead to PTC formation by, for example, exon-skipping. The PTC-containing transcript isoform is degraded in an EJC-dependent manner, whereas regularly spliced transcripts without a PTC will not be targeted by NMD. b: An error during gene expression can lead to a nonsense mutation in the ORF of a transcript. If the PTC that arises due to this nonsense mutation is upstream of the last exon–exon junction, an EJC will be present downstream of the PTC and the transcript will be degraded via NMD. c: Some long ncRNAs are known to harbor snoRNAs within their introns. The snoRNAs are released when the introns are spliced. snoRNA host genes do not encode a protein, but instead have only a short ORF with a PTC, which will lead to activation of NMD followed by subsequent degradation of the transcript. d: Top: Many transcripts with long 3′ UTRs are known targets of NMD. Middle: Some mRNAs can host one or multiple uORFs, which will lead to EJC-mediated NMD, since the EJCs will not be displaced by a translating ribosome. Bottom: Selenoprotein-encoding transcripts are a class of NMD targets that incorporate the unusual amino acid selenocysteine (Sec) at UGA codons. With low concentrations of Sec present, the UGA codon is decoded as a classical stop codon and, depending on its position, is treated as a PTC.

Table 1 Example targets for different NMD-inducing features

NMD factors and their role in substrate recognition and degradation

The number of identified proteins involved in NMD, especially in metazoan cells, has tremendously increased over the past two decades. The first NMD factors were described more than 30 years ago in S. cerevisiae. At that time, genetic screening was done in a yeast strain expressing a HIS4 transcript with a +1 frameshift, which leads to premature translation termination and decreased stability of the mRNA. The screen aimed to identify mutations that suppress this frameshift and result in increased mRNA stability [89]. Several upf (up–frameshift) mutants were characterized later and the mutated genes were termed UPF1, UPF2 and UPF3 [90–92]. They represent the central set of NMD factors. The UPF proteins have been found in all late-branching eukaryotes, ranging from yeast through nematodes and fruit flies to mammals and plants, whereas the set is incomplete in certain protists [93–96]. Additional factors in the NMD pathway of higher metazoans were found in nonsense suppression screens in C. elegans and were termed smg1-7 (suppressor with morphogenetic effect on genitalia). Mutation of these genes not only abolished NMD, but also caused phenotypical abnormalities in the male bursa and hermaphrodite vulva [97–99]. Smg2-4 represent homologs of the conserved UPF1-3; therefore, the extended NMD core machinery comprises UPF1-3, SMG1, and SMG5-7 [100–106]. An overview of the central characteristics of these core NMD factors can be found in Table 2. Figure 4 depicts the molecular architecture of the core NMD factors and summarizes their functions and interaction sites, both of which are discussed in detail below. A number of additional NMD factors have been identified in different organisms, including SMG8, SMG9, PNRC2, DHX34, NBAS, RUVBL1, RUVBL2, MOV10, GNL2 and SEC13 for the human system [107–113].

Table 2 Key characteristics of core human NMD factors

Fig. 4 Overview of the core NMD factors UPF1, UPF2, UPF3b/a, the kinase SMG1, and the decay inducing factors SMG5, SMG6, and SMG7. The domain architecture for all factors, as well as the NMD-specific functions of important regions are depicted here (for details, see text). The most studied isoforms of NMD factors were chosen for representation and match, except for SMG1, those indicated in Table 2 (see footnote b for SMG1 discrepancy). UPF1 phosphorylation sites (SQ and TQ motifs) verified by various experimental approaches (ultradeep HeLa cell phosphoproteome [222], in vitro phosphorylation assay with UPF1 peptides [106] or full length UPF1 [154]) are indicated. The phosphosites connected to specific recruiting functions are highlighted specifically.

UPF1 is the central nexus of the NMD machinery

UPF1 is considered the central NMD factor, since it acts as an interaction hub for other NMD factors, is involved in all decisive stages of the NMD process, and is essential for NMD in all investigated organisms [91, 114–117]. The central part of UPF1 is highly conserved between species with more than 40 % sequence identity between the yeast and human homolog. It consists of two functional domains: an N-terminal zinc knuckle cysteine-histidine-rich CH domain followed by a central helicase domain [94]. UPF1 is classified based on several functional features. The helicase domain of UPF1 is formed by two RecA-like domains belonging to the superfamily 1Bα (SF1Bα), which uses ATP hydrolysis to unwind double stranded nucleic acids in 5′-3′ direction [118–121]. The helicase domain also mediates the direct binding of UPF1 to the RNA and, according to the current model, RNA-bound UPF1 is required for NMD factor assembly [118, 122]. Whereas the presence of ATP, or of nucleotide analogs that mimic certain transition states in the hydrolysis reaction, reduces the RNA binding affinity of the helicase domain in vitro, these nucleotide-induced conformational changes still allow concomitant ATP and RNA binding [119, 122–124]. Initially it was proposed that UPF1 is recruited to the NMD target by the terminating ribosome via the interaction with the release factors eRF1 and eRF3. This implies that UPF1 recruitment on NMD targets is directly linked to translation termination and hence occurs in the proximity of the termination codon [74]. However, individual nucleotide resolution UV cross-linking and immunoprecipitation (iCLIP) experiments showed that UPF1 directly binds mostly spliced transcripts, regardless of whether they are NMD targets or not [59]. Thus, UPF1 abundance on an mRNA does not correlate directly with its NMD susceptibility.

As mentioned above, UPF1 preferentially occupies the 3′ UTR region of mRNAs due to displacement from the 5′ UTR and coding region by scanning and translating ribosomes, respectively [57–59]. The overall importance of a functional UPF1 helicase domain is reflected by the fact that the ATPase activity and direct RNA binding ability are both required for NMD [124–126]. Once associated with the mRNA, the helicase domain of UPF1 is believed to utilize the hydrolysis of ATP to remodel the downstream mRNP, which is essential to facilitate recycling of NMD factors [119, 121]. This would allow efficient progression of exonucleolytic degradation of the mRNA once initial decay steps have taken place [127]. Alternatively, it was proposed that UPF1 is able to translocate on, or thread in, the mRNA to bridge the physical distance to downstream mRNP components, such as the EJC [128]. This early “licensing” step would explain why a distance-independent activation of NMD is observed, no matter how far away the EJC is located from the upstream terminating ribosome [55]. However, it was shown that both the CH domain as well as a C-terminal region of UPF1 (SQ region) can regulate the helicase activity, which ensures that UPF1 clamps to the RNA and does not translocate in the earlier stages of NMD [123, 129]. More specifically, the CH domain directly interacts with the RecA2 domain, which results in conformational changes that promote more extensive RNA binding and thereby repress the helicase activity [123]. In order to initiate the unwinding activity of UPF1, the CH domain has to be removed from the helicase core, which is achieved by the interaction with the C-terminal UPF1-binding domain (U1BD) of UPF2 [122, 123]. This means in turn that if UPF1 needs to use its helicase activity in the early steps of NMD, for example for finding a downstream EJC, UPF2 already needs to be bound to UPF1 or the CH domain has to be pulled away by an as yet unknown mechanism or factor.

Multiple functions define the role of UPF2

The domain architecture of UPF2 consists of three tandem MIF4G domains (middle portion of eIF4G), followed by the U1BD [130–132]. Proteins containing MIF4G domains are commonly involved in general mRNA metabolism, such as the components of the nuclear or cytoplasmic cap-binding complex (CBC) CBP80 and eIF4G, or the spliceosomal protein CWC22 [130, 133–136]. Consistent with the role of MIF4G domains to provide the surface for critical interactions, the MIF4G-3 domain of UPF2 is required for the interaction with UPF3 [137, 138]. This interaction establishes a linear cascade ranging from UPF1 over UPF2 to UPF3 [122]. Whereas the exact function of the first two MIF4G domains is not known, a structural role was proposed in which these domains are required to form a ring-like scaffolding structure required for NMD factor assembly [131, 139]. Moreover, conserved residues on the surface of the N-terminal helices of MIF4G-1 of the S. cerevisiae Upf2 were shown to be essential for NMD [140]. Although potential interaction partners were identified, the function of these interactions in the molecular pathway of NMD remains unclear [140]. However, this finding strengthens the possibility that the MIF4G-1 and -2 domains of UPF2 play a role that goes beyond providing the correct structural architecture for NMD.

UPF3 bridges decay inducing elements and NMD factors

Higher eukaryotes contain two UPF3 paralogs with high sequence similarity, UPF3a and UPF3b, the latter being expressed from the X chromosome in mammals [103, 138]. In contrast, only one UPF3 protein exists in yeast and invertebrates. UPF3b was found to be the dominant NMD factor of the two paralogs. However, a cross-regulatory circuit was described, in which the stability of UPF3a is regulated as a consequence of the competition of both UPF3 proteins for binding to UPF2 [53, 141, 142]. Besides providing robustness to NMD by functional redundancy, the advantage of having two UPF3 paralogs becomes evident when the expression of one paralog is downregulated in a cell-specific manner. Meiotic male germ cells are one such example, since they inactivate the X chromosome transcriptionally and, therefore, shut down UPF3b expression [141, 143]. UPF3b is a nucleocytoplasmic shuttling protein and contains a conserved N-terminal RNA recognition motif (RRM) [103, 138]. This N-terminal RRM domain is the binding site for the MIF4G-3 domain of UPF2, but does not mediate RNA binding as the name suggests [137]. A short linear motif termed EJC-binding motif (EBM) at the C-terminus of UPF3b is responsible for the interaction with a composite binding site of the EJC formed by the core components eIF4A3, MAGOH and Y14 [53, 122, 144, 145]. The exact molecular function of UPF3 in NMD remains elusive. The proposed role of UPF3b in mammals was the bridging of the EJC and UPF1-UPF2; however, this does not explain the function of UPF3 in EJC-independent NMD [122, 139, 146]. This is especially interesting in organisms that do not employ EJC-enhanced NMD as the standard pathway, but still rely on UPF3 for NMD. Examples are yeast, flies, and worms, which either contain a very small number of spliced transcripts, lack EJC proteins and the EBM in the C-terminus of UPF3, or do not require EJC core components for NMD [53, 94, 111, 147–149].

Regulation and mechanism of UPF1 phosphorylation

It was first observed in C. elegans that the phosphorylation status of the phosphoprotein UPF1/SMG2 is regulated by other core NMD factors. Of these, SMG1, UPF2/SMG3 and UPF3/SMG4 are required for phosphorylation, whereas SMG5, SMG6, and SMG7 are involved in the dephosphorylation of UPF1/SMG2 [105]. SMG1 is a member of the phosphatidylinositol (PI) 3-kinase-related kinase (PIKK) family and was confirmed to be the kinase responsible for UPF1 phosphorylation [102, 105, 106, 150]. The 410 kDa human SMG1 has a complex domain structure with N-terminal helical HEAT repeats, followed by the FRAP, ATM, and TRRAP (FAT) domain, an FKBP12-rapamycin-binding (FRB) domain, the catalytic PIKK domain, and the very C-terminal FATC domain [151]. Cryo-EM structures of SMG1 showed a thinner arm formed by the HEAT repeat region and a globular head consisting of the remaining C-terminal domains [151, 152]. SMG8 and SMG9, additional NMD factors and components of the SMG1 complex (SMG1C), interact with the HEAT repeat region, and regulate the kinase activity of SMG1 by inducing conformational changes [113, 152, 153]. More specifically, SMG9 is required for SMG8 interaction with SMG1, which leads to a motion of the SMG1 arm and a concomitant conformational change of the head region. This conformational change has been shown to repress kinase activity [113, 152]. Although it was previously observed that the C-terminal domain of SMG1 can interact with UPF1 and UPF2, recent structural and biochemical studies refined this observation. It has been shown that UPF1 interacts with the PIKK domain of SMG1 via its helicase domain, whereas UPF2 binds the FRB domain via its MIF4G-3 domain [131, 151]. The concurrent interaction of SMG1 with a UPF2-UPF3b dimer is possible, pointing to different interaction sites within the MIF4G-3 domain of UPF2 for both interaction partners [131].
UPF2 binding to the FRB domain of SMG1 is believed to stimulate the kinase activity [3, 74]. Although UPF2 is also phosphorylated by SMG1 in vitro, preferentially at S1046, the functional relevance of this phosphorylation site is unclear, as it is not essential for NMD [131].

Phosphorylated UPF1 recruits decay inducing factors

SQ and TQ motifs in the extended and unstructured N- and C-termini of UPF1 are the preferred motifs for phosphorylation by SMG1 [105, 106, 154]. Even though phosphorylation was also reported for yeast Upf1, the mechanism and the responsible kinase are different. Yeast Upf1 lacks most of the clustered SQ and TQ motifs in the C-terminus and no ortholog of SMG1 has been found [155, 156]. The phosphorylation sites in mammalian UPF1 act as recruitment platforms for the remaining core NMD factors, SMG5, SMG6, and SMG7. The three proteins share one common domain feature, a 14-3-3-like domain, which folds similarly to 14-3-3 proteins and is able to interact with phosphorylated peptides [157]. SMG5 and SMG7 interact via their N-terminal 14-3-3-like domains in a perpendicular back-to-back orientation to form a heterodimer. This heterodimer exhibits an uncommon arrangement compared to the normal head-to-head interaction found in most 14-3-3 dimers [158–160]. The 14-3-3-like domain of SMG7 is mostly responsible for the phosphorylation-dependent interaction between phosphorylated amino acids (e.g. S1096) in the C-terminus of UPF1 and the heterodimer SMG5-SMG7 [154, 157, 159, 161]. The 14-3-3-like domain of SMG5, which by itself is not able to interact with UPF1, is postulated to provide additional binding strength and specificity [159, 161].
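
As a minimal illustration of how candidate SMG1 target sites could be located in a protein sequence, the following Python sketch lists every serine-glutamine (SQ) and threonine-glutamine (TQ) dipeptide. The function name and the example peptide are hypothetical placeholders (the peptide is not taken from UPF1), and the presence of a motif only marks a candidate site; it says nothing about whether the residue is actually phosphorylated in vivo.

```python
import re

def find_sq_tq_motifs(protein_seq):
    """Return (1-based position, motif) for every S or T directly followed by Q."""
    seq = protein_seq.upper()
    # The lookahead leaves the Q unconsumed, so closely spaced motifs are all reported.
    return [(m.start() + 1, seq[m.start():m.start() + 2])
            for m in re.finditer(r"[ST](?=Q)", seq)]

# Hypothetical peptide for illustration only (not a UPF1 sequence).
example = "MASQGLTQPEKSQA"
for position, motif in find_sq_tq_motifs(example):
    print(position, motif)
```

Applied to a real sequence, the clustering of hits toward the termini could then be compared between species, for example between mammalian UPF1 and yeast Upf1.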

Initiation of exonucleolytic degradation

Early work showed that tethering of full-length SMG7 or the C-terminal proline-rich (PC) region to a reporter mRNA induces mRNA degradation in a position-independent and XRN1-/DCP2-dependent manner [162]. Recently, the direct interaction of the PC region of SMG7 with POP2, the catalytic subunit of the CCR4-NOT deadenylase complex, has been shown [163]. Therefore, SMG7 recruitment to phospho-UPF1 induces deadenylation followed by DCP2-mediated decapping and XRN1-catalyzed degradation of the mRNA in the 5′-3′ direction [163]. Early reports showed that UPF1 can associate with decapping proteins like DCP2, the catalytic subunit of the decapping complex. However, it was unclear if this interaction is direct or mediated by another factor [164–167]. Recent studies have identified the proline-rich nuclear receptor coregulatory protein 2 (PNRC2) as an additional NMD factor. PNRC2 interacts with UPF1 and the decapping complex component DCP1, thereby providing a link for deadenylation-independent decapping during NMD [168, 169]. Additionally, PNRC2 was reported to form a functional complex with SMG5, which is devoid of SMG6 or SMG7 and initiates NMD in a UPF1-dependent manner when tethered to a reporter mRNA [170]. Recent data, however, contradict these observations, as no interaction between PNRC2 and SMG5 could be detected and SMG5-mediated degradation was reported to be strictly SMG7-dependent. Therefore, the existence and contribution of a PNRC2-SMG5 complex to NMD remain unclear [159].

Dephosphorylation of UPF1 is initiated by decay factors

NMD is impaired under conditions where UPF1 accumulates in the hyper- or hypo-phosphorylated form, suggesting that a cycle of UPF1 phosphorylation and dephosphorylation is essential for NMD activity [104–106, 150, 161]. Protein phosphatase 2A (PP2A) was identified as the specific phosphatase essential for the dephosphorylation of UPF1. PP2A associates with the SMG5-SMG7 heterodimer via a direct interaction with SMG5 [104, 171]. SMG5 contains a C-terminal PilT N-terminus (PIN) domain, which is potentially involved in the interaction with PP2A. Deletion of the very C-terminal amino acids or the replacement of a conserved aspartate at position 860 in this domain increased phosphorylation of UPF1 [104]. PIN domains are commonly found in proteins exhibiting endonuclease activity. However, the catalytic triad normally consisting of three aspartate residues is mutated in the SMG5 PIN domain and no endocleavage activity was observed in vivo or in vitro [172–174]. Interestingly, D860 is the one remaining aspartate residue in the active site of SMG5, which was implicated in the regulation of UPF1 phosphorylation status [104]. It is worth mentioning that SMG6 associates with the PP2A complex as well, suggesting that, in line with initial observations in C. elegans, all three SMG5-7 proteins mediate UPF1 dephosphorylation by recruiting phosphatases [175].

Endonucleolytic cleavage is executed by SMG6

Studies to elucidate the degradation pathway of PTC-containing mRNA in D. melanogaster S2 cells showed that the knockdown of exonucleolytic machineries catalyzing deadenylation (CCR4, CAF1, PAN2, PAN3), decapping (DCP1, DCP2, LSM1), 3′-5′ (CSL4, RRP4, RRP6 and SKI2), and 5′-3′ degradation (XRN1, RAT1) could not stabilize reporter mRNA levels [176]. However, evidence for PTC-dependent endonucleolytic cleavage was found due to the accumulation of 3′ and 5′ fragments upon depletion of the major 5′-3′ exonuclease XRN1 and components of the 3′-5′ degrading exosome complex, respectively [176]. In metazoans, SMG6 was identified as the endonuclease responsible for cleavage of the NMD targets in the vicinity of the PTC [176–178]. SMG6 contains a C-terminal PIN domain similar to SMG5. In contrast to SMG5, however, all catalytically important residues are present in the active site and the SMG6 PIN domain exhibits endonucleolytic activity in vitro [174]. Mutation of any of the catalytic aspartate residues, which are required to coordinate divalent metal ions for the nucleophilic attack of H2O on the phosphodiester bond of the RNA, renders the protein inactive and abolishes endonucleolytic degradation of NMD targets [55, 145, 174, 177–179]. Like SMG5 and SMG7, SMG6 contains a 14-3-3-like domain, which is located centrally in the protein [157]. Interestingly, the SMG6 14-3-3-like domain was found to be monomeric, as it forms neither homodimers nor heterodimers with the 14-3-3-like domains of SMG5 or SMG7 [154]. This domain was also suggested to bind phosphorylated UPF1 and biochemical analysis showed that mutation of the residues in the phosphopeptide binding pocket abolished the interaction with UPF1 [161]. Similarly, alanine exchange of T28 in the N-terminus of UPF1 greatly reduced the interaction with SMG6, suggesting that the 14-3-3-like domain of SMG6 interacts with the phosphorylated N-terminus of UPF1 [161].
The phospho-dependent interaction of phosphorylated UPF1 with SMG5-SMG7 was confirmed by recently reported in vitro experiments [154]. However, no interaction of the isolated 14-3-3-like domain of SMG6 with hyperphosphorylated UPF1 was observed [154]. This is in line with recent data showing that phosphorylated UPF1 preferentially occupies the 3′ UTR of NMD targets in a complex with SMG5 and SMG7, but not SMG6 [180]. However, the unstructured region preceding the 14-3-3-like domain of SMG6 was observed to bind UPF1 in a phospho-independent manner in vitro [154]. This observation was supported by functional studies of SMG6 tethering and UPF1 complementation assays performed in another recent publication [179]. Additionally, two EBMs were characterized in the very N-terminus of SMG6, which, similarly to UPF3b, mediate the interaction with the EJC [145]. Despite the initial observation that these motifs are crucial for NMD, recent data suggest that they are dispensable for endocleavage [55, 145]. Taken together, the exact mechanisms by which SMG6 is recruited to a target mRNA remain elusive, although at least three potentially redundant or inter-dependent mechanisms exist: (1) recruitment to the EJC via the EBMs, (2) interaction with various regions of UPF1 in a phospho-independent manner, and (3) phospho-dependent binding of the 14-3-3-like domain to the N-terminus of UPF1.

Interplay between exo- and endonucleolytic decay during NMD

Recent high-throughput sequencing experiments showed that SMG6-mediated endocleavage is the preferred NMD degradation pathway compared to SMG7-mediated degradation [181, 182]. However, other studies showed that the knockdown of either SMG6 or SMG7 alone is not sufficient to achieve NMD inhibition. Consequently, both proteins need to be depleted, thereby shutting down both degradation pathways, to substantially increase reporter mRNA levels [159]. It is still unknown whether both pathways (initiated by SMG6 or SMG5/7) operate independently, or are somehow connected and regulate each other. As the factors required for the initiation of degradation share the majority of their binding sites (N- and C-terminus of UPF1), it is conceivable that there is cross-talk between SMG5-7, PNRC2 and/or DCP1/2. Interestingly, in vitro binding studies showed that due to the phosphorylation-independent interaction of SMG6 with UPF1, both SMG6 and SMG5/7 can in principle be accommodated simultaneously on phosphorylated UPF1 [154].

Physiological functions of NMD factors

In lower eukaryotes, such as S. cerevisiae or C. elegans, an NMD factor deficiency has only very mild effects at the organismal level. In contrast, several mammalian NMD factors are essential for normal embryonic development and knockout mice display striking phenotypes [183]. For example, murine Upf1 is essential for embryonic development. Mouse embryos lacking Upf1 are only viable in the pre-implantation period and die due to massive apoptosis soon after uterine implantation [115]. Furthermore, Upf1-deficient blastocysts could be maintained in cell culture for 5 days, but ultimately regressed showing massive apoptosis. Likewise, homozygous targeting of Upf1 in ES cell lines was unsuccessful and Upf1 null cells were never observed [115]. The importance of NMD for normal embryonic development is underscored by data demonstrating that mice deficient in Upf2 die in utero around E3.5–E7.5 [184], whereas Smg1-deficient mice die before embryonic day 12.5 [185]. The use of conditional Upf2 alleles enabled testing of the function of Upf2 in adult animals [184, 186]. Induced loss of Upf2 is highly detrimental to a steady-state adult liver, but only a relatively minor phenotype is observed in Upf2 null fetal livers. However, Upf2 null fetal livers do not undergo terminal differentiation, which is incompatible with postnatal life [186]. Furthermore, ablation of Upf2 in the hematopoietic system results in a complete extinction of hematopoietic stem cells and subsequent death of the affected organisms [184]. It was suggested that NMD plays a particularly important role in proliferating cells, because differentiated cells were only mildly affected by the Upf2 knockout [184].

Very recently, knockout mice lacking Smg6 have been reported to show embryonic lethality at the blastocyst stage [187]. However, a floxed Smg6 allele could be used to delete the Smg6 locus in cultured embryonic stem cells (ESC) and to establish Smg6∆/∆ ESCs, which were morphologically indistinguishable from control ESCs and proliferated normally. While Smg6 appears to be dispensable for normal viability and self‐renewal of ESCs, its deletion blocks ESC differentiation in vitro and in vivo in a c-Myc-dependent manner. Although it has been previously suggested that the knockdown of NMD factors compromises the survivability of mouse ESCs, the NMD factors Smg1, Smg5, Upf1, and Upf2 could be knocked down successfully using shRNA vectors [187]. However, these ESCs showed a differentiation defect similar to that observed for Smg6, indicating that NMD plays a general role during ESC differentiation.

UPF1, SMG1, and SMG6 have reported functions beyond NMD and it is difficult to disentangle to what extent the dramatic effects of the genetic knockout can be attributed to the inhibition of NMD [188, 189]. However, non-overlapping moonlighting functions of these factors argue that the majority of the observed phenotypes indeed reflect their role in NMD itself. Furthermore, it was shown that SMG1- and UPF2-deficient cells displayed NMD-specific changes in their transcriptome and many potential NMD targets appear to be upregulated [184, 185]. In summary, according to the prevailing view in the field, the regulation of gene expression by NMD has an important role during normal embryonic development and cellular viability.

Human disorders associated with NMD factor mutations

UPF3B

Mutations in the UPF3B gene were described in males with mild to severe X-linked mental retardation [190]. Until now, ten families carrying UPF3B mutations have been analyzed and seven truncation mutations and three missense mutations in the UPF3B gene have been identified [190–194]. A broad range of clinical symptoms, including autism, schizophrenia, and facial dysmorphism, has been observed in patients with UPF3B mutations. The varying degree of mental retardation and of the other phenotypes of patients with UPF3B mutations depends on the amount of UPF3a, which is upregulated in response to UPF3b deficiency [190]. However, the clinical manifestations of UPF3B mutations demonstrate that the degree of UPF3a upregulation is insufficient to completely compensate for the lack of UPF3b. Hence, UPF3a only partially rescues NMD in the absence of functional UPF3b and is a potential modifier of the clinical phenotype of UPF3B patients [195, 196].

UPF2

The idea that misregulation of NMD predisposes to neurodevelopmental disorders is supported by an association with heterozygous deletions of a genomic region that includes UPF2 [197]. In addition, a de novo missense mutation in UPF2 has been identified in a patient with schizophrenia [198].

UPF1

Recently, somatic mutations in the UPF1 gene have been described in pancreatic adenosquamous carcinoma (ASC) tumors [199]. All mutations were somatic in origin and not detected in normal pancreatic tissues from the same patients. ASC-specific point mutations clustered in two regions of the UPF1 gene and many seem to trigger alternative splicing of the UPF1 pre-mRNA, leading to the expression of truncated UPF1 proteins. UPF1 mutations appear to represent a signature of many pancreatic ASC tumors, since no UPF1 mutations were found in non-ASC pancreatic tumors and lung squamous cell carcinomas. Furthermore, no mutations in other NMD genes (UPF2, UPF3A and UPF3B) were detected in ASC tumors. Although the molecular effects of UPF1 mutations remain to be elucidated, it is very likely that they alter the efficiency of NMD. This is supported by the observation that a PTC-containing splice variant of p53, representing an endogenous NMD substrate, was only detectably expressed in tumor tissue.

Y14

TAR syndrome (thrombocytopenia with absent radius) is a rare genetic disorder with 55 cases described to date [200]. Patients with TAR syndrome have low numbers of megakaryocytes, leading to a dramatically reduced platelet count (hypomegakaryocytic thrombocytopenia). In all cases the radius bone in the forearm is absent, but the skeletal abnormalities show a high degree of variation, from absence of radii to virtual absence of upper limbs. The lower limbs, the gastrointestinal as well as the cardiovascular systems may also show defects [201]. In most cases, TAR syndrome manifests in patients having one allele with a reduced expression of the RBM8A gene (encoding the Y14 component of the EJC) in combination with a heterozygous small deletion on chromosome 1q21.1 containing the RBM8A gene [200]. The reduced expression of the RBM8A gene is caused by low-frequency SNPs either in its 5′ UTR or in the first intron. Two patients with TAR syndrome did not have the chromosome 1q21.1 deletion. In these cases the 5′ UTR SNP was found in combination with either a 4-bp frameshift insertion at the start of the fourth exon of the RBM8A gene or a nonsense mutation in the last exon of RBM8A. Hence, the compound inheritance of a null allele (deletion, frameshift, or nonsense mutation) and low-frequency noncoding SNPs in RBM8A is the genetic cause of TAR syndrome [200]. Why certain tissues require higher levels of Y14 and are therefore specifically affected by its reduced expression is currently unclear. While these patients likely have reduced amounts of EJCs due to the insufficiency of the Y14 protein, the effects on NMD will require further analysis.
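
The compound inheritance pattern described above can be written down as a simple rule: one RBM8A allele is null and the other carries a low-frequency noncoding SNP. The Python sketch below encodes only this rule. The allele labels and the function name are hypothetical shorthand introduced for illustration; they are not a clinical nomenclature, and the sketch is not a diagnostic tool.

```python
# Hypothetical labels for the allele classes named in the text.
NULL_ALLELES = {"1q21.1_deletion", "frameshift", "nonsense"}
HYPOMORPHIC_SNPS = {"5'UTR_SNP", "intron1_SNP"}

def matches_tar_genotype(allele_a, allele_b):
    """True if one allele is null and the other carries a low-frequency noncoding SNP."""
    pair = (allele_a, allele_b)
    # Check both orderings, since either parental allele may be the null one.
    return any(x in NULL_ALLELES and y in HYPOMORPHIC_SNPS
               for x, y in (pair, pair[::-1]))

print(matches_tar_genotype("1q21.1_deletion", "5'UTR_SNP"))
print(matches_tar_genotype("wild_type", "5'UTR_SNP"))
```

Genotypes not covered by the text, such as two null alleles, are simply reported as not matching the described pattern; the sketch makes no statement about their consequences.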

eIF4A3

In 1992 Richieri-Costa and Pereira described a new autosomal-recessive syndrome characterized by mandibular median cleft associated with other craniofacial anomalies and severe limb defects [202]. This syndrome, now referred to as Richieri-Costa-Pereira Syndrome (RCPS), almost exclusively affects families from Brazil. Different non-coding expansions in the 5′ UTR of the mRNA encoding the EJC core component eIF4A3 have been found in patients [203]. The expansions are located in a region with several repeat motifs. Homozygosity as well as compound heterozygosity of different repeat expansions and a missense mutation have been described. Notably, the presence of the expansions does not seem to influence processing of the pre-mRNA, but rather reduces the abundance of the eIF4A3 transcript by 30–40 % in patient cells compared to control cells [203]. A similar reduction of the expression of eIF4A3 by injecting specific morpholinos in zebrafish embryos led to developmental defects in several craniofacial cartilage and bone structures [203]. Nonetheless, it is currently unclear how the partial loss-of-function of eIF4A3 leads to the pleiotropic phenotype of RCPS.

Effect of NMD on human disease

Hereditary disorder

Many inherited disorders and several cancers have been suggested to be caused by nonsense or frameshift mutations [204]. More than 2400 genetic disorders have at least one causative nonsense allele [205]. A recent meta-analysis showed that nonsense mutations account for approximately 11 % of all described gene lesions causing human inherited disease. Furthermore, they represent approximately 20 % of disease-associated single-basepair substitutions [206]. For many diseases caused by nonsense mutations, NMD acts as a modifier of the clinical phenotype and eliminates mutated transcripts, which encode C-terminally truncated proteins [207]. Such truncated proteins may have dominant-negative effects and, therefore, may be deleterious to the cell. This observation has been made for β-thalassemia, which is caused in many cases by nonsense mutations in the first or second exon of the β-globin gene. Due to the activity of NMD, only very low levels of mutant β-globin mRNA are present in red blood cells of heterozygous individuals, who are usually clinically asymptomatic. This can be explained by the sufficient amounts of β-globin that are produced from the normal allele. In contrast, nonsense mutations in the last exon of β-globin escape NMD and high levels of mutant mRNA are translated into truncated β-globin. The proteolytic system of the red blood cells fails to degrade these truncated β chains, causing a clinical phenotype in the heterozygote called thalassemia intermedia [208]. A similar effect has been documented in many other diseases [207, 209].
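
The position dependence seen in β-thalassemia can be illustrated with a toy predictor. The Python sketch below assumes the commonly cited heuristic that a PTC elicits EJC-dependent NMD when it lies roughly 50–55 nucleotides or more upstream of the last exon-exon junction; this threshold is an assumption brought in for the example and is not stated in this section. The function name and the exon lengths are hypothetical placeholders, not β-globin values, and the rule ignores EJC-independent NMD and other modulating features.

```python
def predicted_nmd_sensitive(exon_lengths, ptc_pos, min_distance=50):
    """Toy heuristic: is a PTC far enough upstream of the last exon-exon junction?

    exon_lengths: exon sizes in nucleotides, in 5'-to-3' order (spliced mRNA).
    ptc_pos: 1-based mRNA coordinate of the first nucleotide of the PTC.
    min_distance: assumed threshold in nucleotides (illustrative, not fitted).
    """
    if len(exon_lengths) < 2:
        return False  # no exon-exon junction, so no downstream EJC in this model
    # Coordinate of the last nucleotide before the final exon-exon junction.
    last_junction = sum(exon_lengths[:-1])
    return last_junction - ptc_pos >= min_distance

# Hypothetical three-exon transcript; lengths are placeholders.
exons = [140, 220, 260]
for pos in (100, 340, 400):
    print(pos, predicted_nmd_sensitive(exons, pos))
```

In this simplified model a PTC in the first exon is flagged as NMD-sensitive, whereas a PTC close to the last junction or inside the last exon is predicted to escape, which mirrors the contrast between early-exon and last-exon β-globin mutations described above.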

In recent years it has become clear that the protective function of NMD is only one side of the coin. Indeed, NMD can worsen the clinical manifestation of genetic disorders, in which the reduced expression of partially functional proteins leads to haploinsufficiency. For example, nonsense mutations in the dystrophin gene, which activate NMD and preclude the synthesis of truncated dystrophin protein, cause a severe form of Duchenne muscular dystrophy (DMD). In contrast, some mutations in the dystrophin gene escape NMD and produce partially functional C-terminally truncated dystrophin protein. These mutations are usually associated with a clinically less severe disease, termed Becker muscular dystrophy [210].

Treatment of genetic disorders by targeting NMD

A large number of patients are affected by nonsense mutations, but only a limited number of therapeutic treatments are available. In many cases, disease-causing nonsense mutations exert two effects, namely accelerated mRNA degradation due to NMD and translation of a truncated ORF. Hence, potential treatment strategies would require translational read-through by nonsense suppression, inhibition of NMD, or both. Since it may not be necessary to restore normal gene expression levels in order to eliminate the disease [211], translational read-through at stop codons without inhibiting NMD is currently assumed to be the favorable approach. Notably, a drug that is able to induce translational read-through at a given nonsense mutation may potentially be used for the treatment of other nonsense mutations and could, therefore, be used to treat many different diseases caused by nonsense mutations [205].

Currently only a few approaches have been specifically tested to treat diseases caused by nonsense mutations. Originally, aminoglycosides were used due to their ability to cause read-through of translation termination codons by recognition of a near-cognate tRNA and misincorporation of an amino acid [212]. After initial tests of aminoglycosides in cystic fibrosis (CF) cell culture models [213], clinical trials with gentamicin were carried out in patients suffering from CF or DMD. These trials were in principle successful and confirmed that the administration of aminoglycosides in vivo is capable of restoring protein function. However, the high doses of gentamicin required for a prolonged effect may have adverse effects, which limits its application as a treatment for patients [214].

An alternative to aminoglycosides is the drug Ataluren, a small-molecule compound formerly known as PTC124 [205]. Ataluren has been reported to selectively promote the read-through of premature termination codons, while not affecting normal stop codons [215]. However, the molecular mechanism of the compound has recently been challenged [216, 217] and its future prospects will largely depend on its efficacy in currently ongoing clinical trials.

Concluding remarks

NMD plays a pivotal role during mammalian gene expression. It controls not only the fidelity of mRNA expression, but also regulates the expression of many genes at a post-transcriptional level. NMD employs a sophisticated machinery of conserved factors, which act at different steps (nuclear and cytoplasmic) of gene expression. Factors involved in NMD are essential for embryonic development in mammals and regulate the symptoms of inherited and acquired genetic diseases. However, a large portion of the NMD pathway and its specific activation remains elusive and needs to be addressed in future studies.