Introduction

Eukaryotic photosynthesis was acquired laterally through an endosymbiotic association with a cyanobacterium, which led to the formation of the plastid, a process generally referred to as endosymbiogenesis. The genesis of a plastid through endosymbiosis of a non-photosynthetic host with a cyanobacterium-like prokaryote is called a primary endosymbiotic event. All organisms with primary plastids can be grouped in the kingdom Plantae, which includes the plants, green algae, red algae, and the glaucophytes (Rodríguez-Ezpeleta et al. 2005), implying that they evolved from a common ancestor. This event is predicted to have occurred roughly 1.5 billion years ago (Yoon et al. 2004). Photosynthesis subsequently spread laterally throughout different eukaryotic kingdoms via secondary endosymbiosis, most commonly when a non-photosynthetic host engulfed a red or green alga. The endosymbiont was eventually reduced to an organelle. More recently, serial secondary and/or tertiary endosymbioses, whereby different Chl a/c-containing lineages acquired a “red” plastid on separate occasions in unique eukaryote–eukaryote endosymbiotic events, have been proposed to explain the complex array of plastids scattered across the tree of life that accounts for much of the photosynthetic diversity on the planet (Sanchez-Puerta et al. 2007; Sanchez-Puerta and Delwiche 2008; Archibald 2009).

The acquisition of photosynthesis by eukaryotes was associated with changes in the strategies for harvesting and dissipating light energy. In particular, there was a transition from the dominant phycobilisome-based antenna of cyanobacteria, to the thylakoid membrane-integral light-harvesting complexes (LHCs), which are modular antenna systems associated with each photosystem. Lacking any evidence of LHCs in cyanobacteria, it is presumed they arose in the earliest photosynthetic eukaryotes and diversified to account for all the different types of LHC proteins with unique functions, and pigment-binding capabilities (Wolfe et al. 1994).

The LHCs are encoded in the nucleus by a large multigene family. Following translation, they are targeted to the chloroplast via an N-terminal extension called a transit peptide that is recognized by the import apparatus (see Stengel et al. 2007). During import, the transit peptide is removed, the apoprotein assembles with chlorophyll and carotenoids, and inserts into the thylakoid membrane (Hoober et al. 2007). All LHCs have three membrane-spanning regions connected by both stroma and lumen-exposed loops. There is an inherent degree of symmetry in the LHCs whereby the first and third membrane-spanning regions have a similar sequence, and likely originated via an internal duplication (Green and Pichersky 1994). These regions possess the characteristic “LHC motif” (ExxxxRxAM) where the Glu (E) from one LHC motif binds a chlorophyll a molecule via a salt bridge to the Arg (R) of the other (Kühlbrandt et al. 1994). This coordinated binding of chlorophyll stabilizes the central two helices. The second membrane-spanning region, in addition to a lumen-exposed alpha helical region at the C-terminus of the protein, also participates in chlorophyll binding. Plant LHCII proteins bind a total of 14 chlorophylls: eight chlorophyll a and six chlorophyll b (Kühlbrandt et al. 1994; Liu et al. 2004). LHCs also require carotenoids for proper function, and in LHCII there are four carotenoid binding sites: two centrally located luteins, a neoxanthin near the second transmembrane helix, and a fourth carotenoid that exists between monomers in the trimer LHCII structure (Liu et al. 2004). The carotenoids likely play a role in both light absorption and photoprotection and, thus, are essential components of the antenna (Liu et al. 2004).
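The generalized “LHC motif” described above (ExxxxRxAM, with x standing for any residue) can be expressed as a simple pattern search. The following is a minimal sketch only: the function name and the toy sequence are invented for illustration, the sequence is not a real LHC protein, and real family members may deviate from the consensus, so a literal match of this kind would likely miss divergent sequences.

```python
import re

# Generalized LHC motif from the text: E-x-x-x-x-R-x-A-M (x = any residue).
# The lookahead lets overlapping occurrences be reported as well.
LHC_MOTIF = re.compile(r"(?=(E.{4}R.AM))")


def find_lhc_motifs(sequence):
    """Return (0-based start, matched residues) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LHC_MOTIF.finditer(seq)]


# Invented toy sequence (an assumption for illustration, not a real protein).
toy_sequence = "MKLEVIHSRWAMLGGPFENLAAYKEIKNGRLAMLA"
hits = find_lhc_motifs(toy_sequence)

for start, residues in hits:
    print(f"motif {residues} at position {start}")
```

In a three-helix LHC one would expect two such hits, one in the first and one in the third membrane-spanning region, reflecting the internal duplication noted above.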

We start off our discussion of the evolution of the LHCs by describing the structural, functional, and phylogenetic aspects of the chlorophyll a/b-binding proteins in plants and green algae, as these are the most extensively characterized. In fact, the chlorophyll a/b-binding antenna system provides an excellent example of how the duplication and divergence of LHC genes led to the functional specialization of groups of antenna proteins vital for acclimating to dynamic environments (Ganeteg et al. 2004a). This will be followed by discussion of other LHC antenna systems in the major eukaryotic groups, leading ultimately to the process by which such complexity may have developed over the billion years since the acquisition of photosynthesis by eukaryotes.

Chlorophyll a/b-containing organisms

Photosystem II light-harvesting antennae

In cyanobacteria and photosynthetic eukaryotes, Photosystem II (PSII) assembles as a dimer (Peter and Thornber 1991; Boekema et al. 1995; Nield et al. 2000; Zouni et al. 2001; Kamiya and Shen 2003). Surrounding the PSII cores in eukaryotes is the peripheral antenna system, which together forms the PSII–LHCII supercomplex. This peripheral antenna system is composed of two types of pigment-binding proteins: the major trimer-forming LHCII proteins and the monomeric, minor LHCII proteins.

Major LHCII antenna proteins

The major LHCII proteins are the most abundant in the PSII antenna system, and are arranged as trimers located at the periphery of the PSII–LHCII supercomplex (Butler and Kühlbrandt 1988; reviewed by Dekker and Boekema 2005; Fig. 1). The LHCII complexes help regulate energy flow to the reaction centers by participating in both light harvesting and energy dissipation (Pascal et al. 2005). Early isolation of PSII and LHCII complexes from higher plants using non-denaturing techniques showed a stoichiometry of approximately 8 LHCII trimers per PSII core (Peter and Thornber 1991). However, mild-detergent solubilization of thylakoids, followed by visualization of the photosynthetic complexes by transmission electron microscopy of negatively stained samples, indicated that the majority of intact PSII–LHCII supercomplexes had only two to four trimers per PSII (Boekema et al. 1999a, b). This suggests a central PSII–LHCII supercomplex surrounded by a population of very loosely bound or unbound LHCII trimers. Within the PSII–LHCII supercomplexes, LHCII trimers may be classified as being strongly (S), moderately (M), or loosely (L) bound to the PSII core (C), as determined by their susceptibility to solubilization by detergents (Boekema et al. 1998, 1999a, b; Dekker and Boekema 2005). However, the makeup and organization of PSII–LHCII supercomplexes vary from species to species, and heterogeneity also exists in the oligomerization of PSII–LHCII supercomplexes into larger megacomplexes (reviewed by Dekker and Boekema 2005).

Fig. 1

Phylogenetic reconstruction of the LHC superfamily from chlorophyll a/b-containing organisms. A maximum-likelihood tree is shown (−ln L = 36411.55626) with support values for specific nodes indicated (approximate likelihood ratio test, PHYML bootstrap 100 replicates). A total of 289 taxa and 194 characters were included in the analysis. The substitution model used was LG + I+G. Sequences used in the analysis were obtained from Genbank and the U.S. Department of Energy Joint Genome Institute (JGI). Photosystem illustrations showing the positions of LHCII subunits in relation to the PSII core are based on Dekker and Boekema (2005). Position of LHCI antenna proteins in relation to the PSI core is based on Stauber et al. (2009) and Mozzo et al. (2010). Asterisks on gene labels indicate green algal-specific LHCI proteins

In plants, the trimers are composed predominantly of proteins encoded by multiple Lhcb1 genes, as well as Lhcb2 and Lhcb3, and are known to form either homo- or heterotrimers (Jansson 1994). Interestingly, while knockdown of Lhcb3 in Arabidopsis was partially compensated by an upregulation of Lhcb1/2, there was an alteration in the antenna structure and a decline in the rate of state transitions, indicating an important role for the individual members of the trimeric LHCII (Damkjær et al. 2009). Orthologs to these three LHCII genes are generally found in all land plants, but not within the green algae, whose major LHCII proteins diversified independently (Fig. 1); thus, one would predict unique regulatory features. Nevertheless, LHCII proteins found in both algae and land plants contain a trimerization motif (WYGPDR; Hobe et al. 1995). Trimer formation appears to be a common property that many LHC proteins possess, including those in the Chl a/c family (Beer et al. 2006). It is of interest that while CP26 (chlorophyll protein 26) does not normally form trimers, it has a recognizable trimerization motif and retains the capacity to form trimers in A. thaliana as observed in LHCII knockdown mutants (Ruban et al. 2003).

A new member of the LHC family, recently identified in the C. reinhardtii genome, has been designated LHCQ (Elrad and Grossman 2004). LHCQ is also found in O. tauri (Six et al. 2005) and a number of land plants (Koziol et al. 2007; Fig. 1). LHCQ proteins form a supported subgroup at the base of the LHCII branch (Fig. 1), and appear to have a trimerization motif like other major LHCII proteins, suggesting a role in the peripheral LHCII antenna system. However, their function and localization within the antenna system have not yet, to our knowledge, been examined. LHCQ appears to be a universal feature of Chl a/b-antenna systems, indicating that it may have been a component of the earliest algal LHCII antenna system and has been retained within the green plant lineage.

Special mention must be made of the major LHCII antenna in prasinophytes, specifically the Mamiellales, where we find the unusual LHCP proteins. LHCP proteins are structurally divergent from typical LHCII antenna proteins, and form a separate, strongly supported clade (Fig. 1). LHCPs are also unique in that they bind Chl a, Chl b, and a Chl c-like pigment not found in other Chl a/b-containing organisms (see Six et al. 2005). LHCP-like proteins are also found in Mesostigma viride (Fig. 1; Koziol et al. 2007), an early diverging member of the Streptophyta (Rodríguez-Ezpeleta et al. 2007), and the only genus known to possess orthologs of both the prasinophyte LHCP and chlorophyte LHCII proteins. This suggests that LHCP predated the chlorophyte–streptophyte split and was part of the early LHCII antenna system, but was later lost in most green algal and land plant lineages (Koziol et al. 2007).

Minor PSII antenna (CP29, CP26, and CP24)

Biochemical analyses and determination of the crystal structure of PSII were essential in examining the structural organization of individual antenna proteins around PSII. Such research identified the minor PSII antenna proteins, CP26 (Lhcb5), CP29 (Lhcb4), and CP24 (Lhcb6), located at the interface between the major LHCII proteins and PSII (Boekema et al. 1999b; Yakushevska et al. 2003). CP26 and CP29 were present prior to diversification of the major lineages, as orthologs are found in all green plant groups examined (Koziol et al. 2007; Fig. 1). The conserved nature and distribution of these proteins suggest that they are functionally significant, and there has been extensive research to determine the specific roles of these complexes. Under non-stressful conditions, light energy moves from the major LHCII complexes to the PSII reaction center through the minor antenna proteins. However, under excess light conditions, the antenna system undergoes a conformational change and acquires a quenching state for energy dissipation, termed non-photochemical quenching (NPQ) (Pascal et al. 2005; Ruban et al. 2007). In plants, CP29 and CP26 have been implicated directly in NPQ (Ahn et al. 2008; Avenson et al. 2008), though there does appear to be redundancy in NPQ induction since antisense studies of A. thaliana were unable to identify a single minor antenna protein as a unique site for NPQ (Andersson et al. 2001; Kovács et al. 2006). Collectively, these minor antenna proteins plus the trimeric LHCII can switch from light harvesting to efficient dissipation of energy as heat through induction of the xanthophyll cycle by the accumulating trans-thylakoid pH gradient (see review by Horton et al. 2008).

In Chlamydomonas, CP29 has been shown to bind to PSI during state transitions, the implication being that it is integral in the redistribution of excitation energy between PSI and PSII (Kargul et al. 2005; Takahashi et al. 2006; Tokutsu et al. 2009). More recently, CP29 and CP26 RNAi knockout mutants were used to demonstrate that CP29 is required to allow mobile LHCII trimer, which dissociates from PSII during a state transition, to dock to PSI (Tokutsu et al. 2009). CP26 and CP29 are among the most well-conserved and broadly distributed LHCs in the green algae (Koziol et al. 2007; Fig. 1), suggesting that these mechanisms for regulating light-harvesting were established early during the evolution of the LHCII type antenna systems. Whether CP29 has the same function in other green algae needs to be examined.

The third minor antenna protein, CP24, evolved more recently during the transition to land since it is absent in all green algae sampled, and is only found in land plants (Fig. 1; Elrad and Grossman 2004; Six et al. 2005; Koziol et al. 2007). The main question is what function CP24 imparted to the photosynthetic apparatus. In plants, CP24 has an important role in organizing the antenna system and, in particular, connecting the LHCII-M trimers to the reaction center (Kovács et al. 2006). A CP24 knockout mutant in Arabidopsis showed an uncoupling of a portion of LHCII from the reaction center and a decline in photosynthetic efficiency (Kovács et al. 2006; de Bianchi et al. 2008). As a result of the uncoupling between PSII and LHCII in these CP24 knockouts, the macro-organization of the PSII complexes was changed, causing the formation of PSII aggregates (Kovács et al. 2006; de Bianchi et al. 2008). Interestingly, these PSII arrays are predicted to limit diffusion of plastoquinone (PQ) into its binding site on PSII, thereby limiting electron transport (de Bianchi et al. 2008). Thus, CP24 may have evolved as a mechanism to allow efficient photosynthetic electron transport and aid in the formation of NPQ, which may have been particularly important as grana stacking became more prevalent in plants (Kovács et al. 2006; de Bianchi et al. 2008). With the increased size of grana disks in plants compared to green algae (Lavergne and Joliot 1991; Larkum and Vesk 2003) come issues with the protein and plastoquinone diffusion required for efficient electron transport, state transitions, and PSII repair. However, the macro-organization of the photosynthetic complexes, assisted by CP24, likely provides channels to allow efficient diffusion of a pool of mobile proteins and PQ (Kirchhoff et al. 2002; Kirchhoff et al. 2008; de Bianchi et al. 2008). While the reason for the evolution of grana is still under debate (Dekker and Boekema 2005; Mullineaux 2005; Anderson et al. 2008), it is clear that the organization of the thylakoid membrane is integrally associated with the regulation of energy flow to the reaction centers and that changes in these regulatory mechanisms coincide with the evolution of the antenna system.

Photosystem I light-harvesting antennae

In land plants there are six genes, Lhca1–6, encoding the antenna system of Photosystem I (PSI) (Jensen et al. 2007). These six Light-Harvesting Antennae (LHCI) proteins all form distinctive and well-supported clades, indicating the establishment of the PSI antenna system prior to the transition to land (Fig. 1). However, there are exceptions to the typical land plant complement of LHCI antenna proteins. In the moss Physcomitrella patens, Lhca4 was not detected (Alboresi et al. 2008), while it is present in both liverworts (Marchantia polymorpha) and lycophytes (Selaginella moellendorffii). This could be easily explained if mosses were basal in the land plant lineage, but this is not widely accepted (Qiu et al. 2006). A more likely scenario is that Lhca4 was lost in P. patens, or was not detected due to gaps in the genome sequence. P. patens is also the only moss that has thus far been sequenced, so it is not known whether this loss is species-specific or whether Lhca4 is absent from the entire moss lineage. Lhca6, on the other hand, was only identified in flowering plants (Fig. 1), indicating that this LHC gene was acquired at some point following the development of a vascular system.

Based on the pea PSI crystal structure, Lhca1–4 are arranged as a crescent along the PSI-F/PSI-J side of the PSI core (Ben-Shem et al. 2003; Amunts et al. 2007). These antenna proteins assemble as spectroscopically different heterodimers composed of Lhca1 and Lhca4 (Schmid et al. 1997), plus a dimer typically composed of the Lhca2 and Lhca3 proteins (Ihalainen et al. 2000). Based on the conservation of these proteins between diverse land plant taxa (Fig. 1), it is likely that the pea structure for the LHCI antenna system is representative of the land plants; however, the LHCI antenna system may be more heterogeneous than the crystal structure suggests, based on the detection of LHCI isoforms (Storf et al. 2004) and the identification of the additional Lhca gene types, Lhca5 and Lhca6 (Jansson 1999). Lhca5 accumulates to sub-stoichiometric levels compared to the PSI core, and its expression is enhanced under high-light exposure (Storf et al. 2004; Ganeteg et al. 2004b). Cross-linking studies suggest that Lhca5 associates with the periphery of the LHCI belt by interacting with Lhca2, though it can assemble as a homodimer in the Lhca1/4 site if these proteins are depleted (Lucinski et al. 2006). Thus there seems to be an inherent flexibility in organization (Lucinski et al. 2006). Like Lhca5, Lhca6 is expressed at low levels and regulated in a stress-related fashion distinct from the more abundant LHC genes (Klimmek et al. 2006). In addition, Lhca5 and Lhca6 were found to have a role in mediating the interaction of PSI with the NADPH dehydrogenase complex that is important in cyclic electron flow (Peng et al. 2009).

In comparison, the LHCI complex in the chlorophyte C. reinhardtii is larger and more heterogeneous than that of vascular plants, being encoded by nine LHCI genes (Lhca1–9) and containing six to nine LHCI proteins per PSI core (Kargul et al. 2003; Takahashi et al. 2004; Stauber et al. 2009; Mozzo et al. 2010; Fig. 1). On the available evidence, all of the LHCI proteins from C. reinhardtii have orthologs in other members of the UTC clade, including the Chlorophyceae (Volvox carteri) and the Trebouxiophyceae (Chlorella; Fig. 1). It is also likely that there are corresponding orthologs in the Ulvophyceae; however, in our analysis, and that of Koziol et al. (2007), the data available for this lineage are restricted to EST sampling from one species (Acetabularia acetabulum), and only Lhca2, 7, and 9 homologs were identified (Fig. 1). The LHCI complex binds on one side of the PSI core, as in vascular plants, but individual monomers also interact at alternate sites (Germano et al. 2002; Kargul et al. 2003; Stauber et al. 2009). In C. reinhardtii, the LHCI composition appears to be heterogeneous since certain Lhca proteins have stoichiometric ratios with the PSI core other than 1:1 (Stauber et al. 2009). While Lhca1, 4, and 7 have a 1:1 ratio with the PSI core, Lhca2, 5, 6, 8, and 9 are present in sub-stoichiometric amounts, and Lhca3 has a stoichiometry greater than 1, suggesting a mixed LHCI antenna composition in the PSI population that may vary according to the environmental conditions (Stauber et al. 2009).

Of the green algal LHCI complement, Lhca3 is the only clear LHCI ortholog in both green algae and plants (Tokutsu et al. 2004; Koziol et al. 2007; Fig. 1). Lhca3 does seem to have an important functional role in LHCI antenna organization that could explain its conservation. Lhca3 undergoes N-terminal processing during iron deprivation that leads to LHCI remodeling, which in turn also reduces the efficiency of excitation energy transfer from LHCI to PSI (Naumann et al. 2005). In C. reinhardtii, RNAi knockout mutants of Lhca3 cause a depletion of all LHCI proteins, with the exception of Lhca2 and 9, suggesting Lhca3 is important for linking the LHCI antenna system to PSI, but whether this is a consistent role in other taxa remains to be seen (Naumann et al. 2005). In Dunaliella, for instance, iron stress causes the induction of a large Lhca3 variant that effectively increases PSI antenna size, presumably to balance excitation between the photosystems (Varsano et al. 2006). Overall it seems that green algae are capable of modulating the makeup and structure of the PSI–LHCI supercomplex in response to changing environmental conditions. Whether land plants are capable of such LHCI modulation is less certain. While the two available crystal structures show only a single copy of each protein per PSI (Ben-Shem et al. 2003; Amunts et al. 2007), Bailey et al. (2001) observed light intensity-dependent expression of LHCI polypeptides as quantified via immunoblotting. However, more recently Ballottari et al. (2007) showed that the stoichiometry of LHCI to PSI in Arabidopsis does not change with light intensity by quantifying the individual Coomassie-stained polypeptides, the differences attributed to the quantification linearity of the methodology.

Based on the distribution of different Lhca orthologs within the green algae and land plants, it appears that the PSI antenna system was composed of proteins encoded by Lhca3, 2, and 9 in primitive green algae, but orthologs to Lhca2 and 9 were lost in land plants (Fig. 1; Koziol et al. 2007). It may also be that Lhca1 and Lhca4 were also present prior to the separation of the chlorophytes and streptophytes as clades consisting of these proteins from both lineages are positioned next to each other (Fig. 1). However, this association was not supported by the phylogenetic analysis conducted, and it is not certain whether these are true orthologs. Future sampling in the Streptophyta is required before these relationships can be determined.

Antenna divergence and energy dissipation: role of novel LHC proteins

Much research has been focused lately on identifying LHCs involved in NPQ. One protein that has been directly implicated in NPQ is PsbS. PsbS is a four-helix protein that likely evolved independently of plant LHCs, as there is evidence for internal duplication (Kim et al. 1992). PsbS is present in all land plants (Schultes and Peterson 2007), and it has an important role in the initiation of NPQ in Arabidopsis and the protection of PSII (Li et al. 2000; Li et al. 2002). Interestingly, two lumen-exposed glutamates that become protonated as the lumen acidifies during excess light are vital for the induction of NPQ (Li et al. 2000; Li et al. 2004), and their protonation is correlated with PsbS dimerization (Bergantino et al. 2003). The mechanism underlying the ability of PsbS to mediate NPQ in plants is thought to involve the pH-dependent binding of xanthophylls to two sites on PsbS, which are directly involved in quenching by interacting with the LHCII–PSII complex (Li et al. 2004). Alternatively, PsbS may not be the site of quenching but may instead act as a pH-dependent trigger that activates other quenching sites within the antenna system (Bonente et al. 2008a). PsbS has recently been shown to regulate the organization of the LHCII antenna system with the PSII core complex (Kiss et al. 2008). More specifically, Betterle et al. (2009) showed that PsbS was involved in regulating the association of an antenna hetero-oligomer composed of the M-trimer, CP24, and CP29 (termed the B4C complex) with the inner antenna of the PSII reaction center. The B4C complex reversibly dissociated from the PSII core in a light-, ΔpH-, and PsbS-dependent fashion, and its dissociation from PSII was correlated with NPQ capacity (Betterle et al. 2009). Thus, it appears that B4C stabilizes an unquenched state of the LHCII antenna system, and its dissociation allows for structural modification in the PSII–LHCII supercomplex resulting in a quenching state. Interestingly, while green algae such as Chlamydomonas have genes structurally similar to PsbS (Koziol et al. 2007), the relevance in terms of photoprotection is uncertain since unicellular green algae do not appear to accumulate the protein (Bonente et al. 2008b). Green algae also lack CP24 and, therefore, cannot form these B4C complexes, implying that the mechanistic basis of NPQ in these groups is likely different.

LHCSR (LI818) is present in green algae and moss, but absent in lycopods (Selaginella) and seed plants. Interestingly, LHCSR homologs are also found in a variety of Chl a/c-containing lineages (discussed below, Fig. 2). In green algae (Chlamydomonas), LHCSR is a stress-related member of the LHC protein family whose transcripts accumulate in response to stress conditions such as high-light intensity (Savard et al. 1996; Richard et al. 2000; Im et al. 2003; Miura et al. 2004; Zhang et al. 2004; Naumann et al. 2007; Ledford et al. 2004; Peers et al. 2009). In Chlamydomonas there are three genes encoding LHCSR, but LHCSR3 is required for qE (Peers et al. 2009), which is the component of NPQ that operates on a time-scale of seconds to minutes as a rapidly inducible, short-term response to changing light intensity (Li et al. 2009). Based on the broad distribution of LHCSR in all major green algal lineages (Fig. 1), LHCSR was likely part of an early photo-protective mechanism (Peers et al. 2009). While this protein is present in the moss Physcomitrella patens, its absence in the lycopod Selaginella moellendorffii and other land plants indicates a transition to a different mechanism for NPQ during the transition to land, which in this case likely involved PsbS (Peers et al. 2009). What specific advantage PsbS may have over LHCSR in plants remains an intriguing question.

Fig. 2

Phylogenetic reconstruction of the LHC superfamily in red algae and chlorophyll a/c-containing organisms. A maximum-likelihood tree is shown (−ln L = 31862.92318). Black dots indicate nodes with support values using approximate likelihood ratio test and PHYML bootstrap (100 replicates) greater than 0.80 and 70, respectively. A total of 221 taxa and 137 characters were included in the analysis. The substitution model used was LG + I+G. Sequences used in the analysis include red algal and chlorophyll a/c-containing organisms used by Koziol et al. (2007) and additional sequences obtained from Genbank and the U.S. Department of Energy Joint Genome Institute (JGI). LHCSR (LI818) protein sequences from green algae and P. patens as well as LHC proteins from B. natans were also included in the analysis
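The node-support criterion stated in the caption above can be written as a simple filter. This is a sketch under two assumptions: the caption is read as requiring both thresholds to be exceeded (approximate likelihood ratio test above 0.80 and bootstrap above 70), and the node labels and support values below are invented for illustration and do not come from the tree in Fig. 2.

```python
# Thresholds as stated in the figure caption.
ALRT_MIN = 0.80        # approximate likelihood ratio test support
BOOTSTRAP_MIN = 70     # PHYML bootstrap support (out of 100 replicates)


def is_supported(alrt, bootstrap):
    """True when a node exceeds both thresholds (assumed reading of the caption)."""
    return alrt > ALRT_MIN and bootstrap > BOOTSTRAP_MIN


# Hypothetical nodes: name -> (aLRT value, bootstrap value).
example_nodes = {
    "node_A": (0.95, 88),
    "node_B": (0.91, 64),   # bootstrap too low
    "node_C": (0.80, 75),   # aLRT not strictly greater than 0.80
}

supported = sorted(
    name for name, (alrt, boot) in example_nodes.items()
    if is_supported(alrt, boot)
)
print(supported)
```

Requiring both measures is the stricter reading; a node passing only one of the two tests would not receive a black dot under this interpretation.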

Red algal LHCs and the chlorophyll a/c-proteins

The Chl a/c-containing organisms represent a diverse assemblage of protists that acquired photosynthesis via eukaryote–eukaryote endosymbiosis. While it is generally accepted that stramenopiles, haptophytes, cryptomonads, apicomplexans, and dinoflagellates have red algal-derived plastids, the number and sequence of endosymbiotic events are disputed (Archibald 2009). These groups are often depicted as being derived from a single endosymbiotic event with a red alga (as shown in Fig. 3), as predicted by the “chromalveolate hypothesis” proposed by Cavalier-Smith (1999), but this supergroup is controversial (Keeling 2009; Baurain et al. 2010). Poor taxon sampling within these diverse groups has hampered examination of antenna evolution in the “chromalveolates,” though the completion of several genome sequences has improved the situation. Nevertheless, LHCs from red algae and organisms with complex red plastids can be roughly divided into four major clades: the red algal/cryptomonad LhcaR clade, the very large chlorophyll a/c-protein clade, the LHCSR/LI818 clade, and LHCZ (Fig. 2). While not all clades are supported in the particular analysis shown in Fig. 2, the divisions represent a consensus of work produced in previous studies (Richard et al. 2000; Green 2003; Koziol et al. 2007; Pearson et al. 2009; Lefebvre et al. 2010), and increased sampling will likely improve the resolution of these major clades.

Fig. 3

Evolution of antenna systems in the eukaryotic tree of life. The major eukaryotic lineages are shown along with some key evolutionary events such as the acquisition of photosynthesis from a cyanobacterium via a primary endosymbiogenesis (red circle). This was followed by the transfer of HLIP genes into the nucleus of the host and subsequent evolution of the family of LIL and LHC proteins at different points along the evolutionary tree as indicated (green dots plus icons as explained in the key). Photosynthesis was also spread laterally to other eukaryotic lineages via secondary or tertiary endosymbiogenesis (blue arrows plus symbol), after which the antenna systems continued to diversify. Euglenophytes and chlorarachniophytes acquired plastids independently through a secondary endosymbiosis with a green alga. The Chl a/c-containing organisms acquired a plastid from a red alga, or an organism with a red algal-derived plastid. For simplicity, the tree depicts the relationships of the stramenopiles, dinoflagellates, haptophytes and cryptophytes as predicted by the chromalveolate hypothesis (Cavalier-Smith 1999), which suggests a single secondary endosymbiosis with a red alga, though more complex mechanisms have been proposed (Archibald 2009). In organisms with a complex red plastid, the phycobilisomes were subsequently lost and replaced by an LHC-like antenna system. In cryptomonads, phycobiliprotein remains but does not assemble into phycobilisomes. In peridinin-containing dinoflagellates, a novel, soluble chlorophyll protein (Peridinin-Chl protein), which is not discussed in this manuscript, evolved to supplement the membrane-integral LHC system

In red algae, the LHC antenna system is limited to its association with PSI while phycobilisomes are the predominant antenna for PSII (Wolfe et al. 1994; Marquardt and Rhiel 1997). In Porphyridium, these LHCI-antenna proteins (encoded by the LhcaR genes) bind chlorophyll a, zeaxanthin and β-carotene (Wolfe et al. 1994). Unlike other major groups, there are no accessory chlorophylls, such as chlorophyll b or c, typically associated with the red algal LHCs. The LhcaR clade comprises the entire family of red algal LHCI proteins as well as LHCs from cryptomonads, for which there is clear evidence for a plastid acquisition from a red alga (Douglas et al. 1991). Presumably, many of the cryptomonad LHCs in the LhcaR clade would be associated with PSII, rather than having a strict association with PSI as in the red algae, but there is little biochemical information on any antenna in this group. Also scattered among the LhcaR clade are several LHCs from stramenopiles and haptophytes (Eppard et al. 2000; Green 2003; Koziol et al. 2007; Lefebvre et al. 2010; Fig. 2), which is reflective of an ultimate red algal origin of their plastids, though clearly the majority of LHCs in the Chl a/c-containing organisms are excluded. The Chl a/c-proteins in the LhcaR clade are possibly specialized PSI-associated proteins, by analogy to the specificity of the red algal LHCs. In support of this hypothesis, Veith et al. (2009) have demonstrated that FCP4 from the diatom Cyclotella meneghiniana, which is in the red algal LhcaR clade, does appear to function as a PSI antenna protein. While there is clear evidence emerging for specific LHCs associated with PSI in diatoms, and this LHCI antenna system is much larger than the LHCI system in plants (Veith and Büchel 2007; Veith et al. 2009), specific polypeptides have not yet been identified in most Chl a/c-containing algae.

The Chl a/c clade typically represents the dominant group of LHCs from a variety of stramenopiles (e.g., diatoms, brown algae, and raphidophytes) and haptophytes (Green 2003; Eppard et al. 2000; Pearson et al. 2009; Fig. 2). In these organisms, the dominant carotenoid is fucoxanthin and the LHCs are thus called fucoxanthin-Chl proteins (FCPs). In dinoflagellates, the dominant carotenoid is typically peridinin, but fucoxanthin is found in taxa such as Karenia brevis, which acquired a plastid from a haptophyte via a tertiary endosymbiosis (Nosenko et al. 2007). Compared to plants, however, there are major differences in carotenoid binding of the dominant LHC in algae with complex red plastids, where the chlorophyll:carotenoid ratio is near unity (4:4; Papagiannakis et al. 2005), compared to a ratio of 14:4 in plants (Liu et al. 2004). These pigment differences shift the absorbance spectrum into the 460–570 nm range not available to green plants (Mimuro and Akimoto 2003). Little is known regarding the organization and function of the individual members of this family, but in diatoms, some members of this larger clade have been isolated as trimeric or higher oligomeric fractions (Beer et al. 2006; Lepetit et al. 2007), though a clear association with either PSI or PSII has not been demonstrated. The Chl a/c clade will undoubtedly be subdivided further into taxon-specific groups as more data become available, and the real challenge will be in deciphering their function. Many diatoms, for instance, have a remarkable capacity for inducing NPQ that is dominated by the xanthophyll cycle (Lavaud et al. 2002), which in diatoms is the pH-dependent de-epoxidation of the xanthophyll diadinoxanthin to diatoxanthin. The molecular basis of NPQ and exactly how the antenna functions to induce this photoprotective mechanism in these organisms are unknown, but identification of LHCSR/LI818 relatives in Chl a/c-containing algae may shed light on this process.
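The pigment comparison above can be made explicit with a short calculation. This sketch uses only the figures quoted in the text: 14 chlorophylls (eight chlorophyll a plus six chlorophyll b) and four carotenoids for plant LHCII, and a 4:4 proportion for the dominant Chl a/c antenna. The 4:4 value is treated here as a relative proportion rather than an absolute count per polypeptide, and the dictionary keys are labels chosen for illustration.

```python
from fractions import Fraction

# Pigment numbers as quoted in the text (relative proportions for the Chl a/c antenna).
pigment_counts = {
    "plant LHCII": {"chlorophyll": 8 + 6, "carotenoid": 4},
    "Chl a/c antenna": {"chlorophyll": 4, "carotenoid": 4},
}


def chlorophyll_per_carotenoid(counts):
    """Exact chlorophyll:carotenoid ratio for one antenna type."""
    return Fraction(counts["chlorophyll"], counts["carotenoid"])


ratios = {
    name: chlorophyll_per_carotenoid(counts)
    for name, counts in pigment_counts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {float(ratio):.2f} chlorophylls per carotenoid")
```

On these numbers the plant complex carries 3.5 chlorophylls for every carotenoid, whereas the Chl a/c antenna carries one, which is the sense in which carotenoids make a proportionally larger contribution to light absorption in algae with complex red plastids.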

The LHCSR/LI818 clade includes both Chl a/c- and Chl a/b-containing representatives (Richard et al. 2000; Green 2003; Koziol et al. 2007; Pearson et al. 2009; Lefebvre et al. 2010). Interestingly, diatom (Oeltjen et al. 2002; Zhu and Green 2008; Becker and Rhiel 2006; Park et al. 2010) and haptophyte (Lefebvre et al. 2010) LHCSR homologs are also induced in response to light stress, as observed in green algae (discussed above). In diatoms, expression of an LHCSR homolog (Fcp6/7) was enhanced under high light, and the protein localized to a trimeric FCP fraction (Beer et al. 2006). This was also correlated with an increase in the binding of diatoxanthin, a xanthophyll cycle pigment, and a decrease in the fluorescence yield (Beer et al. 2006; Gundermann and Büchel 2008). This indicates that the LHCSR homologs in diatoms may also be involved in NPQ, as is the case in Chlamydomonas (Peers et al. 2009; discussed above), suggesting an evolutionarily conserved role for this unique LHC-like protein.

The LHCZ clade is the only other group to contain sequences from organisms with both red- and green-derived plastids (Koziol et al. 2007), including the chlorarachniophytes, haptophytes, cryptophytes, and diatoms (Fig. 2). Notably, orthologs of these proteins are absent from red and green algae, the lineages from which the others acquired their plastids, which raises questions about how the clade evolved. These sequences are particularly distinctive, and their function remains unknown.

LHC evolution from the beginning

The LHC protein family is united by the possession of the highly conserved chlorophyll binding domain in two of their transmembrane segments. These “LHC motifs” have been detected in a variety of proteins (Table 1), most notably in the family of Light Harvesting-Like (LIL) proteins (Jansson 1999). The LILs represent a collection of structurally diverse membrane proteins that are distributed throughout oxygenic pro- and eukaryotic organisms (Heddad and Adamska 2002; Neilson and Durnford 2010). Most notably, the LILs differ in the numbers of predicted transmembrane segments, of which at least one contains a classic LHC motif.
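As a rough illustration of how such motifs can be located in a protein sequence, the following minimal sketch scans for the ExxxxRxAM consensus given in the Table 1 caption. It assumes that "x" stands for any single residue; the function name and the toy sequence are hypothetical and were invented for this example, not taken from a real protein. A match only flags a candidate and says nothing about whether the region is membrane-spanning or binds chlorophyll.

```python
import re

# Consensus from the Table 1 caption: E-x-x-x-x-R-x-A-M.
# Assumption: "x" means any single residue. The lookahead lets
# overlapping candidates be reported as separate hits.
LHC_MOTIF = re.compile(r"(?=(E.{4}R.AM))")


def find_lhc_motifs(sequence):
    """Return (1-based start, matched text) for each candidate motif."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in LHC_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    toy = "MKTAELINGRLAMLGVLA"
    for start, hit in find_lhc_motifs(toy):
        print(start, hit)
```

In practice, a hit from a pattern this short would need to be confirmed against predicted transmembrane segments before a protein is counted among those carrying an LHC motif.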

Table 1 Occurrence of proteins with LHC motifs (ExxxxRxAM)

The LIL and LHC protein families evolved from small cyanobacterial proteins called HLIPs (High-Light Inducible Proteins; Dolganov et al. 1995) or Scps (small CAB-like proteins; Funk and Vermaas 1999). HLIPs have a single transmembrane domain and are found within cyanobacteria and the plastid genomes of red algae, glaucophytes, and cryptophytes (see review by Heddad and Adamska 2002). In cyanobacteria, HLIPs have been shown to bind pigments (Storm et al. 2008), presumably through dimerization in a manner similar to that proposed for the classic three-helix LHC proteins (Kühlbrandt et al. 1994). Despite the capability of binding pigments, these proteins are likely not involved in light harvesting, but rather in some aspect of photoprotection (He et al. 2001; Havaux et al. 2003). Specifically, they appear to be involved in regulating pigment biosynthesis, or to form part of a chlorophyll scavenging mechanism that prevents unbound chlorophyll molecules from generating reactive oxygen species (Xu et al. 2002; 2004; Vavilin et al. 2007).

During plastid acquisition, many genes were transferred from the endosymbiont to the host nucleus (Timmis et al. 2004), and the HLIP genes would have been among those transferred. It is in eukaryotes that we find evidence for the expansion of the LIL gene family, with members having one to three membrane-spanning regions. The LIL protein family includes the One Helix Proteins (OHPs), the two-helix Stress Enhanced Proteins (SEPs), and the three-helix Early Light Inducible Proteins (ELIPs), all of which are found in the plants and green algae (Neilson and Durnford 2010). Since these proteins share similarity to the LHCs, it is likely that LHCs evolved from duplication and fusion of HLIP-like genes following evolution of the plastid from endosymbiotic cyanobacteria (Dolganov et al. 1995; Fig. 3). The presence of OHPs (or HLIPs) in all photosynthetic taxa examined would support such an interpretation (Neilson and Durnford 2010). There have been a number of proposals for the order of such events that have incorporated a step-wise progression from one to three membrane-spanning regions (Green and Kühlbrandt 1995; Montanè and Kloppstech 2000; Heddad and Adamska 2000; 2002). While a variety of possibilities exist, it would seem that neither the SEPs nor the ELIPs were precursors for LHC evolution directly, since they are only present in green algae and plants. Instead, it appears that they evolved after the separation of the green algae, red algae and glaucophytes, but prior to green algal divergence (Fig. 3; Neilson and Durnford 2010). While two-helix LILs were detected in some stramenopiles, they are not well conserved and may have evolved independently (Neilson and Durnford 2010), though their small size makes it difficult to distinguish divergence from an independent evolutionary event. In the green algal lineage, appearance of the two- and three-helix LILs is likely coordinated with the loss of phycobilisomes and adaptation to high-light environments, which matches the putative role of many of the LILs in photoprotection.

The origin of the precursor LHC occurred prior to the divergence of the green and red algae (Wolfe et al. 1994), and presumably after the divergence of the glaucophytes, which do not appear to have the classic LHC proteins (Fig. 3). While the phycobilisomes remained the dominant antenna system in red algae and glaucophytes, loss of phycobilisomes in the green lineage triggered diversification of the LHC antenna system into the modular and flexible antenna system we see today (Fig. 1). Duplicated LHCI-related proteins may have been co-opted for the PSII antenna system, ultimately replacing phycobilisomes, assuming that the PSI localization of the red algal LHCs (Wolfe et al. 1994) represents the ancestral state.

Plastids and their antenna systems were also transferred laterally in a number of secondary or tertiary endosymbiotic events (Archibald 2009). Endosymbiotic events with green algae and different eukaryotic hosts gave rise to the euglenophytes and chlorarachniophytes that possess remnants of the green algal-like antenna system (Fig. 3), such as the universal presence of CP29 and LHCII-like proteins (Koziol et al. 2007). In these secondary plastid acquisitions, there was significant restructuring of the antenna system, particularly of LHCI, for which no clear homologs were detected (Koziol et al. 2007). Red algae have also been captured as endosymbionts giving rise to plastids during secondary endosymbioses, or even indirectly through tertiary endosymbiotic events (Fig. 3; Archibald 2009). With the exception of the cryptomonads that have phycobiliproteins located in the thylakoid lumen (Ludwig and Gibbs 1989), other Chl a/c-containing organisms lack any remnants of phycobilisomes. During this lateral transfer, the red algal LHC genes diversified into a large and diverse group of proteins that are commonly known for their high carotenoid:chlorophyll ratios and the accessory pigment chlorophyll c (Macpherson and Hiller 2003).

The discovery that the LHCSR protein family has homologs in the chromalveolates suggests that this protein is a very early evolving LHC type (Richard et al. 2000), and was passed to the Chl a/c-containing algae through secondary endosymbiosis. The lack of homologs in red algae, however, poses a problem in explaining the extant distribution of this unique protein. It is certainly possible that sampling is an issue and that it simply has not yet been detected in red algae. Alternatively, LHCSR may have been present in the lineage before the red and green algae separated, with red algae subsequently losing the LHCSR homolog, as plants have done. We also cannot rule out a lateral transfer of the LHCSR gene as a result of a cryptic endosymbiosis with a green alga (Peers et al. 2009), events that were surprisingly prevalent in diatoms (Moustafa et al. 2009). Such conclusions await improved sampling in the reds and glaucophytes to get a better idea as to the evolutionary transition.

The LHC motif has also been acquired by a variety of other proteins, many of which, to our knowledge, have no direct link to light harvesting or photoprotection (Table 1). In cyanobacteria, a protein with two predicted membrane-spanning regions appears to have arisen through a gene fusion event between an HLIP and a hypothetical single membrane-spanning protein found in most cyanobacteria (Kilian et al. 2007), illustrating shuffling of this domain. The LHC motif is also present in one of the two copies of ferrochelatase in plants, algae, and cyanobacteria. In cyanobacteria, the LHC motif is essential for oligomerization of the ferrochelatase enzyme and subsequent enzyme activity (Sobotka et al. 2008). We also identified an LHC motif at the C-terminus of a family of predicted Rieske iron-sulfur proteins in plants, which we called iron sulfur chlorophyll proteins (ISCPs). ISCPs are predicted to be chloroplast-targeted, and while their function is unknown, the LHC motif perhaps mediates their dimerization and the regulation of their function through the coordination of chlorophyll (Table 1).

Concluding remarks

Following the capture of cyanobacteria and the transformation of this endosymbiont into an organelle, photosynthetic metabolism has diversified and spread through different eukaryotic lineages via secondary and tertiary endosymbiotic events (Archibald 2009). During more than 1 billion years of eukaryotic evolution, there have been tremendous evolutionary changes in the antenna systems that capture, transfer, and dissipate excitation energy. The changes in antenna systems can be rationalized by considering the varied environments that different taxa are adapted to, and the extremes of the stressors to which they are exposed. Ultimately, there has been considerable evolutionary adaptation that led to the evolution of a modular and flexible antenna system that can acclimate to a wide range of environmental conditions to optimize photosynthesis. Ganeteg et al. (2004a) questioned the role of an LHC antenna system that often appears overly complex and redundant, and found that many of these individual LHC proteins appear to be valuable in adapting to a variety of environmental conditions, and are indeed important for fitness.