Gene organization and evolutionary history

Classification

The paradigm of the family is human dystrophin, originally identified [1] through its deficiency in the lethal neuromuscular disorder Duchenne muscular dystrophy (DMD) [2,3]. In addition to dystrophin, vertebrates possess two closely related proteins - utrophin [4] and dystrophin-related protein 2 (DRP2) [5]. A single common ancestor of these three proteins is present in all invertebrate metazoans hitherto examined [6]; I shall refer to these generically as the dystrophins.

A protein distantly related to the carboxy-terminal part of the dystrophins was isolated from the electric organ of the electric ray Torpedo [7]. Now known as dystrobrevin, it is present as a single protein in invertebrates and two closely related proteins (α- and β-dystrobrevin) in vertebrates [8,9,10]. The dystrophins and dystrobrevins bind to each other via a homotypic coiled-coil interaction [11].

Although dystrophin and dystrobrevin-like proteins in non-metazoans have yet to be identified, a very remotely related protein has been described [12]; discontinuous actin hexagon (DAH) is an actin-binding membrane-associated phosphoprotein required for cellularization of the embryonic syncytium in Drosophila. The sheer degree of divergence of DAH from the presumed last common ancestor of dystrophin and dystrobrevin hints at a more ancient history and broader functional scope for these proteins.

Gene organization

At more than 2.4 megabases (Mb), with some introns several hundred kilobases (kb) in length, the human dystrophin gene is the largest ever characterized (see Table 1). The reason for the evolutionary maintenance of this large gene size is unclear, but it appears that other vertebrate dystrophin and utrophin genes are similarly colossal (the human utrophin gene has been estimated at 900 kb). Even the Drosophila dystrophin-like gene [13], at 130 kb, is large by this organism's standards.

Table 1 Properties of dystrophin and dystrobrevin genes

The genes are also complex - the human dystrophin gene itself has 79 coding exons [14], a substantial amount of alternative splicing [15,16], and at least seven tissue-specific promoters, which generate a range of transcripts encoding proteins differing in the length and/or sequence of their amino termini. For example, use of a promoter in intron 29 in the retina results in expression of a protein corresponding to the carboxy-terminal 260 kDa of dystrophin; DMD mutations that disrupt this isoform result in congenital stationary night blindness (in addition to the skeletal myopathy caused by all DMD mutations). The utrophin gene has an almost identical intron-exon organization to that of dystrophin, whereas the DRP2 gene shares most aspects of its structure with exons 55-79 of the dystrophin gene. Although the gene structure of vertebrate dystrobrevins strongly resembles that of exons 64-77 of the human dystrophin gene, the organization of the invertebrate dystrophin and dystrobrevin genes is surprisingly idiosyncratic; the Drosophila dystrophin gene has less than half as many exons as its human counterpart, including a mammoth 3.5 kb coding exon.

Evolutionary history

Analysis of known dystrophin and dystrobrevin sequences yields a clear phylogeny for the protein family that is consistent with the accepted phylogeny of the animals bearing them [6,17]. From this we can fairly confidently infer the following evolutionary history (Figure 1). A distant (non-metazoan) ancestor had a single dystrophin/dystrobrevin protein, which probably functioned as a homodimer. We cannot yet tell whether the long amino-terminal extension of the dystrophins is an ancestral or derived trait. At some point before the last common ancestor of metazoans, a duplication gave rise to separate dystrophin and dystrobrevin genes, their protein products now forming a heterodimer of more specialized components. This is the situation in most extant metazoans, including the protochordate amphioxus [6]. In vertebrates, however, a series of further duplications occurred, as has been documented for many other genes [18]. The first of these gave rise to DRP2 (via a partial duplication) and a common ancestor of dystrophin and utrophin, and to α- and β-dystrobrevin. The second resulted in the separate dystrophin and utrophin genes. All vertebrates appear to have this final complement of three dystrophin-like proteins and two dystrobrevins. Interestingly, the syntrophins, which bind to dystrophins and dystrobrevins (see below), have a tree of identical topology [13].

Figure 1

Phylogenetic tree of the dystrophin/dystrobrevin family, inferred from a tree constructed using sequences of the cysteine-rich and carboxy-terminal domains of human and fruit fly proteins [6]. Branching to form paralogs is shown vertically and branching to form orthologs (speciation) is shown perpendicular to the page.
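The duplication history inferred above can be written down as a small nested structure, which makes it easy to check the stated vertebrate complement of three dystrophin-like proteins and two dystrobrevins. This is only an illustrative sketch: the node labels (for example "dystrophin lineage") and the helper name leaves are my own assumptions, not established nomenclature, and the structure encodes branching order only, not branch lengths.

```python
# Each node is (label, [children]); a node with no children is an extant protein.
# Labels for internal nodes are illustrative assumptions, not formal names.
vertebrate_tree = (
    "ancestral dystrophin/dystrobrevin",
    [
        ("dystrophin lineage", [
            ("DRP2", []),  # arose by partial duplication
            ("dystrophin/utrophin ancestor", [
                ("dystrophin", []),
                ("utrophin", []),
            ]),
        ]),
        ("dystrobrevin lineage", [
            ("alpha-dystrobrevin", []),
            ("beta-dystrobrevin", []),
        ]),
    ],
)


def leaves(node):
    """Return the labels of all childless nodes below (and including) node."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


dystrophin_like = leaves(vertebrate_tree[1][0])
dystrobrevins = leaves(vertebrate_tree[1][1])

print(dystrophin_like)  # ['DRP2', 'dystrophin', 'utrophin']
print(dystrobrevins)    # ['alpha-dystrobrevin', 'beta-dystrobrevin']
```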

Characteristic structural features

These large, multi-domain proteins are traditionally subdivided into the following distinct sections (Figure 2).

Figure 2

Structures of the vertebrate dystrophin/dystrobrevin family compiled from the crystal structures of the dystrophin actin-binding domain, two spectrin repeats from α-actinin, and the cysteine-rich region of dystrophin (PDB numbers 1dxx, 1quu and 1eg4, respectively). Actin binding region: cyan, CH1; green, CH2. 'Cysteine-rich' region: green, WW domain; red, orange, cyan, and purple, EF hands. Carboxy-terminal region: yellow, syntrophin-binding segment; red, leucine heptads.

The actin-binding domain (residues 1-220 of human dystrophin)

The amino-terminal 220 amino acids of dystrophin, utrophin, and the invertebrate dystrophins show clear homology to the well known actin-binding regions of the spectrin and α-actinin families, each of which comprises two tandem calponin-homology domains. The amino-terminal domains of dystrophin and utrophin have been shown by a variety of methods to bind to filamentous actin with binding affinities in the low micromolar range and a marked preference for non-muscle forms of actin. The tertiary structures of actin-binding domains from both proteins have been established (Figure 2), and their position on actin filaments has been modeled from electron micrographs of decorated fibers [19,20].

The rod domain (residues 338-3,055 of human dystrophin)

More than 70% of the length of dystrophin, utrophin and the invertebrate dystrophins consists of a rather weakly repeated motif akin to a loose version of the spectrin repeat [21]. These approximately 110-amino-acid motifs are assumed, like their spectrin counterparts, to form antiparallel three-helix bundles (as is modeled in Figure 2). The considerable variability in length and sequence suggests, however, that such a modular construction may be somewhat 'blurred' in the case of the dystrophins. Electron-microscopic studies confirm that, as with the spectrins, the corresponding repeat region is responsible for conferring on dystrophin an extended rod-like shape approximately 110-170 nm in length. There is, however, no evidence for the antiparallel dimerization observed in both spectrins and α-actinins [22].

In the case of vertebrate dystrophins, approximately 24 repeats can be distinguished. The rod domain in the nematode dystrophin is almost identical in size, suggesting that interaction with some other agent places a tight constraint on length (indeed, some humans with interstitial in-frame deletion of a single repeat can suffer appreciable myopathy). Utrophin has a slightly shorter rod domain, with 22 repeats (the missing length is in the region of repeats 14 and 18 of dystrophin). The Drosophila dystrophin rod domain is even shorter than this, with the region corresponding to repeats 14-20 of human dystrophin being half the length. DRP2, like the 116 kDa Dp116 isoform of dystrophin, has a mere two repeats, with a unique 75-residue random-coiled amino terminus.
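As a rough consistency check on the figures quoted for human dystrophin, the rod domain boundaries (residues 338-3,055) can be compared with the stated count of about 24 repeats of about 110 amino acids each. The sketch below assumes, purely for the sake of the arithmetic, that the repeats tile the whole rod domain; any non-repeat linker sequence within the rod would lower the true mean repeat length. The variable names are illustrative.

```python
# Values taken from the text for human dystrophin.
rod_start, rod_end = 338, 3055
repeat_count = 24            # "approximately 24 repeats"
nominal_repeat_length = 110  # "approximately 110-amino-acid motifs"

# Inclusive residue range, so add 1.
rod_length = rod_end - rod_start + 1

# Assumption: repeats tile the rod with no intervening linker sequence.
mean_repeat_length = rod_length / repeat_count
nominal_total = repeat_count * nominal_repeat_length

print(rod_length)                    # 2718
print(f"{mean_repeat_length:.2f}")   # 113.25
print(nominal_total)                 # 2640
```

Under this assumption the mean comes out within a few residues of the nominal 110, so the quoted repeat count and repeat length are mutually consistent with the quoted domain boundaries.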

The cysteine-rich region (residues 3,056-3,354 of human dystrophin; residues 1-284 of human α-dystrobrevin)

At the end of the dystrophin rod domain is a highly conserved constellation of motifs that constitutes a key feature of the dystrophin/dystrobrevin family. The generally accepted name is something of a misnomer, as only five of the fifteen cysteines (in human dystrophin) that give the region its name are highly conserved, four of these being metal ligands in the ZZ domain (see below).

The region comprises the following domains in amino-to-carboxyl order. First, a WW domain [23], which is small (about 40 amino acids), composed largely of β sheet, and named after its two conserved tryptophan residues; it usually binds proline-rich motifs and is missing from dystrobrevin and DAH. Second, EF hands [24], which comprise hairpins of α helices, with the intervening turn often coordinating Ca2+. These are almost invariably duplicated to form a packed pair of hairpins; although only one such module was originally identified in the dystrophins, the recent crystal structure shows the existence of a second, giving a total of four hairpins. None of the loops appears to coordinate metal ions. Finally, a ZZ domain, which has been found in a wide range of proteins [25]. Its structure is reinforced by coordination of Zn2+ by cysteine side chains (four in the vertebrate dystrophins; six in the invertebrate dystrophins and the dystrobrevins), and its function is not known.

Functional studies show that the cysteine-rich domain mediates the interaction between dystrophin and the intracellular tail of β-dystroglycan, a transmembrane component of the dystrophin complex. This is probably the critical site of membrane attachment for dystrophin, and loss of this interaction results in a null phenotype. The structure of most of this region has recently been solved at the atomic level [26]. It turns out to be a rather compact entity, with the WW domain and the EF hands intimately packed (Figure 2). A co-crystal with a β-dystroglycan peptide reveals that the latter's PPPY motif binds the WW domain, with the remainder of the β-dystroglycan enjoying an extended interaction across one surface of the EF hands [26].

The carboxy-terminal region (residues 3,355-3,685 of human dystrophin; residues 285-686 of human α-dystrobrevin)

After the ZZ domain is an α-helical region, which has been shown in both dystrophin and dystrobrevin to mediate the interaction with the carboxyl termini of members of the syntrophin family of cytoplasmic adapter proteins [27]. The syntrophin-binding segment is subject to complex patterns of alternative splicing in both dystrophins and dystrobrevins, suggesting that the stoichiometry of the complex can be modulated. This is followed by two sets of helical leucine-heptad motifs, which are responsible for the homotypic interaction between dystrophins and dystrobrevins [28].

The extreme carboxyl terminus differs markedly between dystrophin and dystrobrevin, and no function has been ascribed to this region. Alternative splicing can generate novel carboxyl termini: in the case of vertebrate dystrophin in non-muscle tissues, a 39-residue sequence homologous to the constitutive carboxyl terminus of invertebrate dystrophins is added; and in the case of α-dystrobrevin in the neuromuscular junction, a unique 188-residue domain is added that is subject to tyrosine phosphorylation.

The dystrophin carboxy-terminal region appears to be dispensable for normal muscle function, as shown by rescue of dystrophin-deficient mice by transgene expression and by certain rare human mutations ([29] and my unpublished observations).
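The residue ranges given in the headings of this section can be gathered into a simple map to see how the length of human dystrophin is apportioned among its regions. This is a sketch under stated assumptions: the total length of 3,685 residues is taken from the end of the carboxy-terminal range quoted above, and the names DYSTROPHIN_REGIONS and span are hypothetical, introduced only for illustration.

```python
# Residue ranges (inclusive) for human dystrophin, as quoted in the headings above.
DYSTROPHIN_REGIONS = {
    "actin-binding domain": (1, 220),
    "rod domain": (338, 3055),
    "cysteine-rich region": (3056, 3354),
    "carboxy-terminal region": (3355, 3685),
}

# Assumption: full length equals the last residue of the carboxy-terminal region.
TOTAL_LENGTH = 3685


def span(bounds):
    """Length of an inclusive residue range."""
    start, end = bounds
    return end - start + 1


lengths = {name: span(bounds) for name, bounds in DYSTROPHIN_REGIONS.items()}
fractions = {name: length / TOTAL_LENGTH for name, length in lengths.items()}

# Residues not covered by any of the four quoted ranges (here, 221-337).
unassigned = TOTAL_LENGTH - sum(lengths.values())

for name, length in lengths.items():
    print(f"{name}: {length} residues ({fractions[name]:.1%})")
print(f"unassigned: {unassigned} residues")
```

With these boundaries the rod domain accounts for roughly 74% of the protein, in line with the statement that more than 70% of its length consists of the repeated motif, and 117 residues between the actin-binding and rod domains fall outside the four quoted ranges.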

Localization and function

Localization

All members of the dystrophin and dystrobrevin family appear to be membrane-associated. Vertebrate dystrophin is located at the cytoplasmic membrane of skeletal, cardiac and smooth muscle cells [30] and at a subset of synapses in the central nervous system [31]. Shorter forms of dystrophin, generated by alternative promoter usage, are variously expressed at retinal synapses (Dp260 [32]), at the outer surface of Schwann cells in the peripheral nervous system (Dp116 [33,34]), and more widely (Dp71 [35]). Utrophin is widely expressed throughout the body, with a striking concentration at the neuromuscular and myotendinous junctions [36] and at various specialized membranes in the brain [37]. DRP2 is found at a range of synapses throughout the central nervous system [38] and in Schwann cells in the peripheral nervous system (D.L. Sherman, C. Fabrizi, C.S. Gillespie and P.J. Brophy, personal communication). The localization of invertebrate dystrophins is known only for the nematode Caenorhabditis elegans Dys-1 protein, which seems to be mainly expressed in muscle cells [39], although the broader expression of Dyb-1 (see below) suggests that Dys-1 may be more widely expressed.

The dystrobrevins are less well studied, but appear to parallel the dystrophins in their expression, with α-dystrobrevin in muscle and the central nervous system and β-dystrobrevin predominant in the brain and other tissues, such as kidney and placenta. Muscle α-dystrobrevin, like dystrophin, is localized to the muscle plasma membrane (sarcolemma) [40]. Nematode dystrobrevin (Dyb-1) is expressed in most muscles and neurons [41].

Function

The fundamental role of the dystrophins and dystrobrevins remains unclear. Much of what we do know has been gleaned from biochemical studies of associated proteins and from the phenotypic consequences of their loss.

Dystrophin forms part of a complex (Figure 3) that includes both integral (dystroglycan, the sarcoglycans) and peripheral (dystrophin, dystrobrevin, the syntrophins) membrane proteins. Dystroglycan, whose direct interaction with dystrophin is described above, crosses the membrane and binds to agrin and laminin in the extracellular matrix [42]. The sarcoglycans form a separable heterotetrameric transmembrane sub-complex of unknown function [43]; this associates laterally with dystroglycan and sarcospan, and probably also directly with dystrophin and/or dystrobrevin [40,44]. The five syntrophins [45] are cytoplasmic adapter proteins containing pleckstrin-homology and PSD-95/SAP-90, discs large, ZO-1 (PDZ) domains, which bind directly to the carboxy-terminal region of dystrophin and dystrobrevin (see above), and also appear to bind neuronal nitric oxide synthase and voltage-gated Na+ channels via their PDZ domains [46,47]. A number of other proteins (including syncoilin, biglycan, filamin 2, and sarcospan) have been associated with the complex, but their significance is as yet less certain.

Figure 3

Schematic diagram of the dystrophin complex as found in vertebrate skeletal muscle, showing the currently understood relationships between the better characterized components. Dystrophin and dystrobrevin form the cytoplasmic core.

Important mutants

The consequences of null mutation are known for humans and/or rodents in the case of dystrophin, utrophin, and α-dystrobrevin, and for the nematode in the case of dystrophin and dystrobrevin. The lack of dystrophin that underlies DMD results in secondary loss of all other components of the dystrophin complex from the membrane and ultimately leads to a lethal syndrome of skeletal and cardiac myopathy (involving cycles of membrane failure, cell death, failure to regenerate and fibrosis), stationary night blindness, mental retardation, a cardiac-conduction defect and a subtle smooth-muscle defect. Many of these traits are recapitulated in a subset of the limb-girdle muscular dystrophies that result from sarcoglycan defects [48]. The dystrophin-deficient mouse has a similar but milder phenotype, somewhat mitigated by partial complementation of the dystrophin deficiency by utrophin.

There is as yet no known human defect of utrophin, DRP2 or α-dystrobrevin. A mouse knockout of utrophin has an extremely subtle defect in the structure of its neuromuscular junctions; the importance of utrophin's role is revealed only in the double knockout of both dystrophin and utrophin genes, which has a severe myopathy and structural abnormalities of the neuromuscular junction [49,50]. A mouse α-dystrobrevin knockout displays a gross phenotype similar to that of a null dystrophin mutant, but the integrity of the complex is largely maintained [51]. The loss of both dystrophin and dystrobrevin in the nematode C. elegans results in a neuromuscular defect that seems to stem from a hypersensitivity to acetylcholine [52].

Frontiers

With the basic function of this substantial complex still eluding definition, the principal frontier of dystrophin and dystrobrevin research is self-evident. Considerations of the function of dystrophin dwell largely on two areas, namely a mechanical role and a signaling role. Mechanical models note the actin-dystrophin-dystroglycan-laminin axis, which suggests a mechanical link between the intracellular cytoskeleton and the extracellular matrix. Signaling models note the preponderance of circumstantial associations with molecules whose main role is in communication - for example, agrin, neuronal nitric oxide synthase, voltage-gated Na+ channels, and perhaps sarcoglycans. The two models are not mutually exclusive.

A common theme that seems to run through dystrophin biology is that of synaptic function. In vertebrates, dystrophin and DRP2 are localized to central synapses and utrophin to the neuromuscular junction, a specialized cholinergic synapse. The nematode mutants seem to imply a role for dystrophin and dystrobrevin in cholinergic transmission. Is the ancestral function of these proteins one of synaptic structural organization or regulation? If so, what are they doing in clearly non-synaptic places such as the sarcolemma? Investigation of dystrophins and dystrobrevins from disparate organisms and in different tissues may shed light on these and other questions.