Genomic Perspectives on the Evolutionary Origins and Variation in Angiosperm Wood

Introduction

The evolutionary and developmental biology of wood is fundamental to understanding the amazing diversification of angiosperms. While flower morphology and reproductive characters associated with angiosperms have been the focus of numerous genetic and genomic evo-devo studies, wood development is a relatively neglected trait of angiosperm evolution and development that is now tractable for comparative and evolutionary genomic studies. The ability to produce wood from a vascular cambium is an ancestral trait of the angiosperms, but has been extensively modified both in basal lineages and in more recent, derived taxa. Interestingly, some wood development traits have arisen independently in unrelated taxa, raising the question of whether common or multiple mechanisms can produce similar developmental changes.

Wood is the product of the vascular cambium. The vascular cambium is a lateral meristem whose initials divide to provide daughter cells that differentiate into secondary xylem (wood) towards the center of the stem, or secondary phloem (inner bark) towards the outside of the stem (Larson 1994). This radial growth from a vascular cambium in woody stems is collectively referred to as secondary growth. For most angiosperms, there are two types of cambial initials. Fusiform initials produce the axial tissues and cell types that can include water-conducting tracheary elements, fibers, and xylem parenchyma. Ray initials produce the radial tissues, the rays. Rays are believed to serve various functions including radial transport across the stem, storage of water and nutrients, and biochemical functions. The axial tissues function primarily in water and nutrient conduction, and mechanical support. Significant quantitative, biochemical, and morphological variation exists among angiosperms for many aspects of secondary growth. Striking examples of morphological variation include gain and loss of the cambium, successive cambia, xylem furrowed by phloem, and interxylary phloem (Spicer and Groover 2010). The mechanisms underlying this variation are only now being revealed.

Wood serves multiple functions, including water conduction, water and nutrient storage, and mechanical support. Wood development thus reflects developmental mechanisms that integrate environmental cues to produce highly complex tissues (Arnold and Mauseth 1999). A familiar example of this integration is the annual rings of many temperate tree species, which reflect modification of wood development in response to the changing environmental conditions within and between growing seasons. How this integration occurs is still opaque, in part because of traditional divisions among the different disciplines associated with wood development. Indeed, wood development has been studied since the earliest days of botany and has provided important insights into the evolution and development of plants in general, and of trees in particular (Baas 1982; Groover and Cronk 2013). More recently, advanced genomic approaches have been applied to wood development, including studies in a limited number of angiosperm model tree species with fully sequenced genomes. Integrating genomic studies with classical literature and traditional disciplines including anatomy and physiology could be extremely effective for comparative studies. An ultimate goal is to describe the molecular mechanisms regulating wood formation, structure and physiological functions, including how these biological processes have been modified in different angiosperm lineages and how environmental cues are integrated into developmental processes.

Arguably, wood formation is an excellent subject for comparative and evolutionary genomic studies because (1) wood development is experimentally tractable; (2) wood is of extreme economic and ecological importance; (3) wood shows incredible phenotypic variation across angiosperm taxa; (4) there is an extensive literature detailing angiosperm wood anatomy; and (5) insights are being gained into at least some of the mechanisms regulating wood development. Additionally, detailed angiosperm phylogenies combined with new sequencing technologies make possible the comparison of gene function across diverse angiosperm taxa, even for non-model species that display unique anatomical or developmental attributes. This chapter presents some of the fundamental information and concepts underlying current and future comparative and evolutionary genomic approaches for wood development in angiosperms.

Expectations for Evolution of Regulatory Mechanisms Regulating Wood Development

Wood development can be subdivided into interrelated processes including the regulation of cell division in the cambial zone, cell expansion and differentiation in the developing xylem, tissue patterning (e.g. the location of vessels within the wood), and numerous other traits. Wood development is also highly plastic, and varies based on stage of development, time of year, and in response to myriad environmental and physiological cues. Anatomical variation has been well-described for large numbers of angiosperms for many wood-related traits in the classical wood anatomy and wood paleobotanical literature (Carlquist 2001). At the same time, increasingly robust phylogenies for angiosperms are being produced using DNA sequence (The Angiosperm Phylogeny 2009). Designing comparative and evolutionary genomic studies of wood formation can take advantage of this information. As discussed later in this review, different phylogenetic scales can be considered for such studies, ranging from ancestral traits that have been modified in different angiosperm lineages, to traits that have arisen independently through convergent evolution in different taxa.

There are a number of wood-related traits that were present in ancestral angiosperms that could be excellent case studies for comparative genomic studies, for example the tracheary element. Tracheary elements are water-conducting cells that undergo an elaborate differentiation process that supports biosynthesis of a lignified secondary cell wall, and ultimately ends in programmed cell death and lysis to produce a hollow cell corpse (Groover and Jones 1999; Escamez and Tuominen 2014). There are two major forms of water-conducting tracheary elements: tracheids, which are elongated cells with pits that facilitate water transport, and vessel elements, which tend to have larger diameters than tracheids and have openings at each end of the cell, termed perforation plates, that vary from ladder-like scalariform perforations to a simple opening (Esau 1977; Bliss 1921). Within the angiosperms, extensive variation can be seen for tracheary element morphology at different taxonomic levels, among organs within a species, among individuals within a species in response to different environmental conditions, and within individuals at different stages of development (Carlquist and Schneider 2002). This variation in tracheary element morphology and associated functional traits has been the basis of major hypotheses of early angiosperm evolution and diversification (Carlquist 2001, 2009; Bailey 1944). Early angiosperms are characterized by specific wood traits including tracheids, long cambial initials, and upright ray cells. The wood of more derived angiosperms possesses vessel elements (or vessel elements and tracheids, or tracheary elements with features of both vessel elements and tracheids), contains parenchyma that can serve to refill embolized vessels, and has procumbent ray cells (Carlquist 2009). Modification of wood anatomies and associated physiological conditions in different lineages likely played a primary role in angiosperm diversification and expansion into extremely diverse habitats. Indeed, xylem heterochrony is a primary factor in understanding the evolution and diversification of early angiosperms (Carlquist 2009).

There are striking innovations in wood anatomy among angiosperms. For example, for at least some, if not all, species with successive cambia, a “master cambium” produces conjunctive tissue along with new cambia that then produce secondary xylem and phloem (Carlquist 2007). The result is a stem with repeating layers of secondary xylem and phloem embedded in a matrix of conjunctive tissue. Successive cambia have evolved within 14 orders scattered through the eudicots, and may enable flexibility in woody stems and lianas (Carlquist 2007; Spicer and Groover 2010).

Wood properties can also be examined from an ecological and evolutionary perspective. Since wood is the water-conducting tissue of stems, wood properties ultimately determine the ability of a species or individual to grow in different habitats or respond to environmental stresses such as drought. For example, changes in wood properties (specifically, smaller tracheary elements less prone to cavitation) were a major factor in the successful movement of woody angiosperms into freezing environments (Zanne et al. 2014). Interestingly, woody perennial growth has also been found to be associated with reduced exploration of climate space (Smith and Beaulieu 2009) and slower rates of molecular evolution (Smith and Donoghue 2008). Parenchymatous woods illustrate how anatomy affects physiology and adaptive traits, for example in cacti (Mauseth and Plemons-Rodriguez 1998). The number and diameter of tracheary elements produced can vary dramatically in response to water availability, although the mechanisms controlling these responses are poorly understood. These and related topics regarding wood evolution and development have direct relevance to understanding how different species will respond to ongoing climate change.

As discussed below, genes and mechanisms regulating wood formation have been identified that can now be examined in an evolutionary context. While this review cannot cover all of these subjects comprehensively, some examples of genes and mechanisms regulating wood formation are presented in the next section.

Transcriptional Regulation of Wood Formation

Transcription appears to be one of the primary points of regulation for wood formation. This can be inferred from studies in which gene expression was measured using microarrays for successive tissue samples taken across developing Populus stems, from the phloem across the cambium and into different stages of secondary xylem maturation (Schrader et al. 2004). In general, changes in gene expression across the tissue profiles are highly correlated with developmental events, with obvious correlations seen between the function of genes associated with xylem development and processes occurring during progressive stages of tissue development (e.g. cell division, cell elongation, and cell wall biosynthesis). Additionally, as discussed below, key transcription factors have been identified that regulate specific aspects of secondary growth and wood formation.

A comprehensive view of the developmental mechanisms underlying wood formation and secondary growth is still emerging. As in other plants and animals, the transcriptional networks regulating gene expression are highly complex, but are beginning to be modeled through integration of different genomic data types using computational approaches. The complexity of transcriptional regulation is shown by ChIP-seq results indicating that individual transcription factors can bind many hundreds or thousands of loci throughout the Populus genome (Liu et al. 2015a). Additionally, there is relatively low correlation between the binding of the individual transcription factors surveyed to date and the expression of bound genes (Liu et al. 2015b), precluding simple models relying on strong correlation of transcription factor binding and transcriptional outcomes for target genes. The lack of strong correlation between transcription factor binding and transcription of target genes may reflect that most genes require the cooperative binding of multiple factors and permissive chromatin states to result in expression change. It may also reflect the type of transcription factors surveyed to date, and “master regulator” transcription factors such as NACs may show more direct correlations between binding and gene expression. Nonetheless, as has been found by the human ENCODE project (Consortium 2012), transcription factors tend to bind to large numbers of targets, and robust prediction of gene expression requires data describing the binding of multiple transcription factors as well as chromatin states.

Several genomic-based approaches have been taken to uncover genes and mechanisms regulating wood formation in angiosperms. Each approach has different strengths and caveats, and ultimately surveys genes affecting wood formation at different levels. For example, association mapping identified single nucleotide polymorphisms in Populus that were correlated with a number of wood chemistry and ultrastructure phenotypes in a large population-based survey (Porth et al. 2013b). Candidate genes identified with strongest effects on phenotypes were primarily structural genes (23 genes), but at least three transcription factors were identified. Limitations of association mapping include that only genes with allelic variation having significant effects on phenotypes can be identified, so the large majority of genes involved in wood development are invisible to this approach. However, the approach does provide insight into the mechanisms and specific genes that are being acted upon by selection. More comprehensively, gene expression during wood formation has been profiled using microarrays and sequencing-based approaches to provide complete catalogues of wood-related genes in species including poplar (Schrader et al. 2004), Eucalyptus (Etienne Paux 2005) and Arabidopsis (Zhao et al. 2005). These approaches do not directly provide functional insights into gene networks or natural genetic variation, however.

Characterization of individual transcription factors has provided important insights into the regulation of wood formation. Transcription factors regulating wood development have been most extensively characterized in Arabidopsis and Populus. While the rosette-form of Arabidopsis lacks fundamental features of arborescent angiosperms such as rays, it does possess a vascular cambium and makes limited secondary xylem in the vegetative rosette. Arabidopsis can also produce limited secondary vascular tissue and interfascicular fibers in the inflorescence stem. Nonetheless, it should be noted that some aspects of wood formation may not be well represented in Arabidopsis.

Developmental genetic, yeast one-hybrid and other approaches have defined a hierarchy of NAC-domain and MYB-related transcription factors that are master regulators of tracheary element and fiber differentiation (Brady et al. 2011; Taylor-Teeples et al. 2015; Zhong et al. 2008; Zhong and Ye 2015). Generally, NAC-domain transcription factors occupy the top levels of this hierarchy, and regulate MYB-related transcription factors that in turn regulate genes encoding enzymes participating in secondary cell wall synthesis. Interestingly, misexpression of VASCULAR-RELATED NAC-DOMAIN6 (VND6) results in the differentiation of parenchyma into metaxylem-like vessel elements, while misexpression of VND7 results in differentiation of parenchyma into protoxylem-like vessel elements (Kubo et al. 2005). Furthermore, dominant repression of VND6 and VND7 specifically inhibits metaxylem and protoxylem vessel formation in roots, respectively. Other NAC family members including SND1, NST1 and NST2 regulate the differentiation of fibers, and snd1/nst1/nst2 mutants fail to produce secondary cell walls in fibers (Zhong and Ye 2015; Zhong et al. 2006). Thus, NAC-domain transcription factors define mechanisms fundamental to the diversification of the ancestral tracheid cell type into vessels with different morphologies as well as fibers.

New insights into the regulation of the cambium and secondary vascular tissue differentiation were initially gained using the in vitro system for tracheary element differentiation in Zinnia (Ito et al. 2006). In this system, isolated mesophyll cells are induced to differentiate into tracheary elements. Differentiation is dependent on exogenous nutritive factors, hormones and a secreted peptide factor, TDIF (Ito et al. 2006). In the plant, TDIF is expressed in the secondary phloem and is perceived by the LRR-like receptor, TDIF RECEPTOR/PHLOEM INTERCALATED WITH XYLEM (TDR/PXY). TDR/PXY in turn modulates expression of a WOX-like transcription factor, WOX4, that regulates the rate of cambial cell division (Hirakawa et al. 2008, 2010; Etchells and Turner 2010). Interestingly, this signaling circuit appears to be evolutionarily ancient, with TDIF activity conserved among extant euphyllophytes (Hirakawa and Bowman 2015). Variation in TDIF signaling is thus an attractive mechanism to survey in angiosperm tree species.

More obscure are the mechanisms that regulate the patterning of secondary vascular tissues and the rate of differentiation of cambial derivatives. Class I KNOX transcription factors have been identified that are expressed both in the shoot meristem and cambial zone of Populus (Du et al. 2009; Groover et al. 2006). Both ARBORKNOX1 (ARK1) and ARK2 alter the differentiation of lignified cells within secondary xylem when misregulated. This function is similar to that of the better characterized orthologs in Arabidopsis (SHOOTMERISTEMLESS and BREVIPEDICELLUS), which generally act to repress differentiation within the shoot apical meristem (Long et al. 1996). The expression of these transcriptional regulators in both the apical and cambial meristems presents an interesting opportunity to better understand the cooption of genes and mechanisms from the shoot apical meristem during the evolution of the cambium.

Another potential example of cooption of mechanisms to the cambium from the shoot apical meristem is given by Class III HD ZIP transcription factors. This small family of transcription factors includes members that function in the patterning and polarity of lateral organs and vascular tissues. For example, mutations in the Arabidopsis Class III HD ZIP, REVOLUTA, condition phenotypes that include adaxialization of both lateral organs and vascular tissues. Mechanistically, REVOLUTA acts antagonistically with another small family of transcriptional regulators, YABBYs. In Populus, misexpression of the REVOLUTA ortholog results in mispatterning of secondary vascular tissues, including reversal of polarity of xylem and phloem and formation of ectopic cambium in the cortex (Robischon et al. 2011). The Class III HD ZIPs are evolutionarily ancient, but functions such as polarity regulation in secondary vascular tissues are derived (Floyd et al. 2006).

Promising advances are being made towards comprehensively modeling transcriptional networks, and identifying correlations between network features and phenotypes. For example, integration of genetic, genomic and phenotypic data was used to identify small, directed networks of genes affecting wood biochemistry (Porth et al. 2013a). Gene co-expression studies can be used to cluster genes into modules that show similar expression patterns across different genotypes, mutants, or in response to experimental treatments. Correlations can then be tested between phenotypic traits and the module eigengene (conceptually, the average) expression values. This approach has the advantage of reducing the dimensionality of the data and minimizing the problems associated with multiple testing commonly encountered in single-gene approaches. In one example, microarray data were used to identify gene modules associated with leaf development in Populus, including putative mechanisms conserved between Populus and Arabidopsis (Street et al. 2011).
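The module eigengene idea described above can be illustrated with a minimal sketch. It assumes the eigengene is taken as the first principal component of the standardized expression matrix for one module, which is a common convention but is not confirmed here as the exact procedure used in the cited studies. The function names, the toy expression values and the toy trait are all hypothetical.

```python
import numpy as np

def module_eigengene(expr):
    """Return one summary value per sample for a module.

    expr is a samples x genes array holding the expression of the genes
    assigned to a single module. The eigengene is taken here as the first
    left singular vector of the standardized matrix (an assumption).
    """
    centered = expr - expr.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant genes
    z = centered / std
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a singular vector is arbitrary, so orient it to follow
    # the average standardized expression of the module.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

def trait_correlation(eigengene, trait):
    """Pearson correlation between a module eigengene and a trait."""
    return float(np.corrcoef(eigengene, trait)[0, 1])

# Toy data (hypothetical): 8 samples, 4 genes that all track one trait.
rng = np.random.default_rng(0)
trait = np.arange(1.0, 9.0)
expr = np.outer(trait, [1.0, 0.8, 1.2, 0.5]) + rng.normal(scale=0.1, size=(8, 4))

eigengene = module_eigengene(expr)
r = trait_correlation(eigengene, trait)
print(f"module-trait correlation: {r:.3f}")
```

Testing one eigengene per module against each trait, rather than every gene individually, is what reduces the number of statistical tests.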

Network-based approaches were used for wood development in a recent study of tension wood development. Different Populus ARK2 genotypes were subjected to gibberellic acid (GA) or control treatments, and then either placed horizontally to induce tension wood formation or left upright (Gerttula et al. 2015). Differentiating tension wood, opposite wood and normal wood were harvested from the trees and used for mRNA-sequencing. The resulting transcript abundance data were subjected to a co-expression analysis to place genes into modules, which were then correlated with wood properties, wood types and treatments. Genes within modules were then further analyzed to identify candidate transcription factors responsible for regulating the expression of other genes within each module. This proof-of-concept study could be expanded by including additional genetic or experimental perturbations, which would allow more precise assignment of genes to smaller gene modules. Importantly, these same approaches and concepts can be used to calculate and compare co-expression networks across species, supporting analyses capable of identifying ancestral mechanisms such as regulation of cambium division as well as taxa-specific derived traits.
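The step of placing genes into co-expression modules can be sketched in a deliberately simplified form. The example below links any two genes whose expression profiles have a Pearson correlation at or above a threshold, and treats each connected group as a module. This is a stand-in chosen for brevity, not the method of the study cited above. The gene names, expression values and the 0.8 threshold are all hypothetical.

```python
import numpy as np

def coexpression_modules(expr, gene_names, threshold=0.8):
    """Group genes into modules by thresholded pairwise correlation.

    expr is a samples x genes array. Genes are linked when their Pearson
    correlation is >= threshold, and modules are the connected groups.
    """
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes matrix
    n = len(gene_names)
    parent = list(range(n))                  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] >= threshold:
                parent[find(i)] = find(j)    # merge the two groups

    groups = {}
    for i, name in enumerate(gene_names):
        groups.setdefault(find(i), []).append(name)
    return sorted(sorted(g) for g in groups.values())

# Toy data (hypothetical): two expression patterns across 6 samples.
pattern_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
pattern_b = np.array([2.0, -1.0, 2.0, -1.0, 2.0, -1.0])
expr = np.column_stack([
    pattern_a,             # geneA1
    2.0 * pattern_a + 1.0, # geneA2, follows pattern_a
    pattern_b,             # geneB1
    3.0 * pattern_b - 2.0, # geneB2, follows pattern_b
])
names = ["geneA1", "geneA2", "geneB1", "geneB2"]

modules = coexpression_modules(expr, names)
print(modules)
```

With more samples from additional genotypes or treatments, the pairwise correlations are estimated more reliably, which is one way to read the point that further perturbations allow assignment of genes to smaller modules.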

Post-transcriptional Regulation of Wood Formation

Wood formation is regulated and modified post-transcriptionally by microRNA, by the regulation of protein abundance, and by a host of post-translational protein modifications. The relative abundance of specific gene transcripts can be negatively regulated by non-coding microRNAs (miRNAs). At the protein level, transcript stability, rate of translation, protein stability and other factors can result in significant differences between relative rates of transcription and actual protein levels for a given gene. Additionally, phosphorylation, glycosylation, lipidation, ubiquitination and other post-translational modifications can have major effects on the abundance, localization, and activity of proteins. As discussed in this section, ‘omics’ approaches are defining the roles of these various types of post-transcriptional modifications during wood formation.

miRNA Regulation of Transcript Abundance During Wood Formation

miRNAs are short, nuclear-encoded non-coding RNAs involved in post-transcriptional regulation of gene expression. In plants, the post-transcriptional regulation by miRNAs is achieved by first processing a miRNA precursor by Dicer-Like1 (Dcl1) into a 21-nucleotide miRNA/miRNA* duplex. The processed miRNA is then incorporated into an Argonaute-associated miRNA-induced silencing complex (miRISC). Specific complementary base pairing between a given miRNA and transcripts from target genes results in cleavage of the target transcript by the miRISC (Meng et al. 2011). miRNAs have been recognized to play crucial roles in diverse biological processes in plants and animals, including cambium differentiation and wood formation in Arabidopsis, as well as Populus and other tree species (Sun et al. 2012).
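The role of complementary base pairing in target recognition can be illustrated with a minimal sketch that scans a transcript for sites matching the reverse complement of a miRNA, allowing a small number of mismatches. This is only a toy model: real target prediction tools use position-dependent scoring and G:U wobble rules that are not modeled here. The 21-nucleotide sequence, the flanking bases and the mismatch limit are all made up for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, U, G, C)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_target_sites(mirna, transcript, max_mismatches=2):
    """List (start, mismatches) for candidate target sites.

    A site is a window of the transcript, the same length as the miRNA,
    that differs from the miRNA's reverse complement at no more than
    max_mismatches positions.
    """
    site = reverse_complement(mirna)
    hits = []
    for start in range(len(transcript) - len(site) + 1):
        window = transcript[start:start + len(site)]
        mismatches = sum(1 for x, y in zip(window, site) if x != y)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

# Hypothetical 21-nt miRNA and a toy transcript with one perfect site
# placed after six arbitrary flanking bases.
mirna = "AUGGCUUACGAUCCGAUAGCU"
transcript = "GGCAUC" + reverse_complement(mirna) + "CCUAGG"

hits = find_target_sites(mirna, transcript)
print(hits)
```

Lowering `max_mismatches` to 0 restricts the search to perfectly complementary sites, while raising it admits more divergent candidates.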

In a pioneering study of miRNAs in wood formation (Lu et al. 2005), miRNA families were identified by DNA sequencing of developing secondary xylem tissues of Populus trichocarpa. In this study, comparisons were made of tension wood and opposite wood from leaning stems, and normal wood from upright trees. Among the miRNAs identified, 12 families were either identical or very similar to Arabidopsis miRNAs, suggesting that these represent evolutionarily conserved mechanisms. Interestingly, 10 Populus miRNA families were not conserved with Arabidopsis, and the majority of these non-conserved miRNAs were associated with tension wood and/or opposite wood formation. These results indicate that there are species-specific miRNAs, and that miRNAs may regulate tree-related traits such as tension wood formation (Lu et al. 2005).

Evolutionarily conserved miRNAs have been implicated in regulating specific processes underlying wood formation. miR165/166 directly targets transcripts encoding Class III HD-ZIP transcription factors that have been shown to play central roles in organ and vascular tissue polarity, and regulate xylem differentiation in Arabidopsis roots (Carlsbecker et al. 2010) and shoots (Emery et al. 2003). The function and conserved role of Class III HD-ZIP genes have also been studied in Populus. Overexpression of miRNA-resistant Class III HD-ZIP POPCORONA resulted in delayed lignification of xylem and phloem fibers (Du et al. 2011), while overexpression of a miRNA-resistant form of popREVOLUTA resulted in the formation of ectopic cambial layers with reversed polarity within cortical parenchyma (Robischon et al. 2011). miRNAs also regulate structural genes involved in wood formation including laccase-encoding genes, whose protein products mediate the polymerization of monolignols during lignification. The Populus miRNA, Ptr-miR397, was found to directly target 29 of 49 predicted Populus laccase gene transcripts (Lu et al. 2013).

miRNAs also appear to be involved in perennial regulation of secondary growth and wood formation. Deep sequencing of short RNAs in the cambial zone of Populus stems identified more than 100 miRNAs with significant expression changes between active growth and dormancy, including developmental-, phytohormone- and stress-related miRNAs. Most of the development-related miRNAs, such as miR164, miR396, miR168, miR319 and miR171, were enriched in the active growth stage (Ding et al. 2014). In contrast, miR166, which targets Class III HD-ZIP transcripts, was more abundant in dormancy and had a low expression level during active growth (Ding et al. 2014; Ko et al. 2006), suggesting that miR166 plays a role in regulating the annual growth cycle.

Profiling miRNAs expressed during wood formation has also been undertaken in a handful of other tree species. In Eucalyptus, the investigation of miRNAs from xylem, phloem and leaves identified 20 miRNAs of 5 families known in other plants and 28 novel miRNAs of 8 additional families (Victor 2006). In Acacia, 6 highly conserved miRNA families were identified as the potential regulators of secondary wall formation in xylem development, including miR166, miR172, miR168, miR159, miR394 and miR156.

Regulation of Protein Abundance and Post-translational Modification During Wood Formation

The term proteome refers to the total set of proteins and post-translational modifications found in a biological sample. A primary goal of proteomics for wood formation is to determine where and when thousands of individual proteins are produced in secondary vascular tissues, and how they are modified and interact.

Different approaches have been used to catalogue the proteins involved in secondary growth and wood formation. In an early wood-related proteomics study, 15 proteins were identified from differentiating xylem, mature xylem and bark of Populus by 2-dimensional electrophoresis (2-DE) coupled with mass spectrometry (MS), including key participants of the phenylpropanoid pathway and lignification (Mijnsbrugge et al. 2000). In addition, highly abundant proteins with unknown function were identified, underscoring that many genes and proteins vital to wood formation remain anonymous. In E. grandis, the cambial region proteome was compared for three ages of growth, and 240 proteins of various putative functions were identified using a 2-DE-LC-MS/MS strategy (Fiorani et al. 2007). From xylem sap of P. trichocarpa x P. deltoides, 97 proteins were identified, including metabolic and glycolytic enzymes and defense proteins (Dafoe and Constabel 2009). The regeneration of secondary vascular tissues after removal of bark provides another useful experimental system for the study of secondary growth. Using this system, 2-DE was used in combination with MALDI-TOF-MS to identify 244 differentially expressed proteins during secondary vascular tissue regeneration in Populus, 199 of which were assigned and classified under different functional classes including metabolism, signaling, cytoskeleton functions, cell cycle and secondary cell wall formation (Du et al. 2006). In another study, shotgun proteomics was performed on xylem and phloem tissues from two Populus species, P. deltoides and P. tremula × alba. In total, 7505 proteins were identified, of which 2627 were confidently identified in both xylem and phloem, 606 were unique to xylem and 461 were unique to phloem (Abraham et al. 2012). As can be seen from these studies, technical advances in proteomics are quickly providing more accurate and informative characterizations of the proteins involved in wood formation.

Proteomic approaches have also been applied to more specific aspects of wood formation in angiosperm trees, including reaction wood formation. Using 2-DE, 140 protein species were identified in the upper side of leaning stems of E. gunnii, with 12 proteins significantly associated with tension wood formation (Plomion et al. 2003). A proteomic analysis of tension wood formation in Populus revealed 39 proteins, primarily cell wall-related, in the G-layer in mature xylem by 2-DE and gel-free MS methods (Kaku et al. 2009). A quantitative proteomic and phosphoproteomic analysis of tension wood formation in Populus revealed remarkable developmental plasticity, identifying 1155 proteins and phosphorylation events in comparison of normal wood and tension wood (Mauriat et al. 2015). These results underscore the importance of rapid and reversible post-translational modifications through phosphorylation during wood formation.

Other studies have sought to identify proteins that are uniquely expressed during secondary vascular development. An investigation of plasma membrane proteins from leaves, xylem, and cambium/phloem in Populus found that proteins involved in cell wall and carbohydrate metabolism and in membrane trafficking were most abundant in the xylem plasma membranes, in agreement with the large role of cell wall biosynthesis in wood formation. Interestingly, the proteins uniquely found in xylem plasma membranes included enzymes involved in lignin biosynthesis, suggesting that they may exist as a complex linked to the plasma membrane, possibly in close proximity to a transporter translocating lignin monomers across the plasma membrane (Nilsson et al. 2010). Another proteomic analysis of the membrane proteins in differentiating secondary vascular tissues of Populus (Song et al. 2011) identified a total of 226 integral plasma membrane proteins, including receptors, transporters, and proteins related to cell wall formation or intracellular trafficking. In particular, a group of RLKs were identified in the differentiating xylem and phloem, suggesting their involvement in secondary vascular development. An endo-1,4-β-mannanase protein identified in the plasma membrane of differentiating xylem tissue potentially produces oligosaccharides that could serve as signaling molecules to suppress cell wall thickening (Zhao et al. 2013).

Culture-based systems for tracheary element differentiation can provide populations of synchronously differentiating cells for proteomics. The secondary cell wall patterning of tracheary elements is guided by underlying microtubules. Using an in vitro system for tracheary element differentiation, microtubule pulldown experiments paired with quantitative proteomic analysis of labeled microtubule interacting proteins identified 605 microtubule interacting proteins associated with specific stages of differentiation. The proteins associated with membrane trafficking, protein synthesis, DNA/RNA binding, and signal transduction peaked during secondary cell wall formation, while proteins associated with stress peaked when approaching tracheary element cell death (Derbyshire et al. 2015). This study thus provided an entire functional microtubule interactome during tracheary element formation, and expanded our understanding of the complexity of microtubule function in xylem development.

Ultimately, the integration of proteomics data with other genomic and phenotypic data is required to comprehensively understand a given biological process, which in turn requires the use of systems biology approaches. Bylesjo et al. (2009) devised a strategy for data generation and integration to model systematic changes in transcript, protein and metabolite profiles associated with lignin biosynthesis in hybrid aspen. The joint covariation for all profiling platforms was calculated using multiple O2PLS models, and the results quantified genotype-specific perturbations affecting lignin biosynthesis and growth (Bylesjo et al. 2009). A systems biology approach was also applied to analyze the effects of oxidative stress in Populus. Transcriptomic, proteomic and metabolomic profiles were acquired from the cambial region of hybrid aspen plants, and the multivariate analysis method OnPLS was then used to integrate the three types of ‘omics’ data. The results provided a first comprehensive model of multi-level responses to oxidative stress in the vascular cambium (Srivastava et al. 2013).

Comparative Evolutionary Genomics Approaches for Wood Formation

There are a number of established approaches for comparative and evolutionary genomic studies that have been applied in both plants and animals, and which could now be applied to wood formation in angiosperm trees. At one end of the spectrum, one common approach is to make detailed comparisons (e.g. of expression pattern) for one or a few genes of interest across species with interesting variation for a trait. At the other extreme, comparative genomic approaches can be used to compare DNA sequence, gene expression, or other comprehensive “omics” data within and across species of interest. Comparative genomic approaches are both more comprehensive and more powerful than one-gene-at-a-time approaches, and allow integration of quantitative genomic, morphological, biochemical, and other wood-related data. These approaches also have the advantage of discovering novel genes and mechanisms that have not been previously described.

A critical challenge for most comparative genomics projects is the determination of orthologous relationships of genes across species. For angiosperms, this is especially challenging given the complex evolutionary histories of angiosperm genomes, with many lineages characterized by whole genome duplication events, ploidy variation, hybridization and structural variation (Soltis et al. 2009). Conceptually and practically, the determination of orthologous genes (formally, genes that are homologous or of common descent in the species being compared) varies depending on the phylogenetic distance and evolutionary history of the genomes of the species being compared. At one extreme, closely related species that have not undergone any genome duplications or rearrangements since diverging from a common ancestor have relatively simple and direct orthology (e.g. orthologous genes having one-to-one orthologous relationships). In cases where one species lineage has undergone whole genome duplication since divergence from a common ancestor, homology may be characterized by one-to-two or one-to-many orthologous relationships. Several different computational approaches have been used to define orthologous relationships, and software is available for performing reciprocal BLAST-based approaches, tree-based approaches, and graph-based clustering approaches for estimating orthologous groups (for example, see Huerta-Cepas et al. 2016; Nakaya et al. 2013; Afrasiabi et al. 2013; Ye et al. 2013; Fischer et al. 2011; Li et al. 2003). In the case of related species with fully sequenced genomes, syntenic relationships can be used in addition to simple sequence data to further infer orthologous relationships (Lechner et al. 2014). Morphological or experimental data can ultimately be integrated with orthology data to ask questions regarding the genetic evolution of traits, including wood formation.
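The reciprocal BLAST-based approach mentioned above can be illustrated with a minimal sketch. It assumes that all-against-all similarity searches between two species have already been run and reduced to (query, subject, bit score) tuples; the gene names, scores, and function names here are hypothetical and chosen only for illustration, and the sketch is not the method of any of the cited tools.

```python
def best_hits(hits):
    """Map each query gene to its single best-scoring subject gene.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical toy data: species B carries two copies (B1, B2) of the gene A1,
# as might follow a whole genome duplication in the B lineage.
a_vs_b = [("A1", "B1", 900.0), ("A1", "B2", 850.0),
          ("A2", "B3", 400.0), ("A3", "B3", 700.0)]
b_vs_a = [("B1", "A1", 890.0), ("B2", "A1", 860.0), ("B3", "A3", 710.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy case the duplicate B2 is not reported, even though its best hit is A1. This shows why strict reciprocal best hits suit one-to-one relationships but tend to miss the one-to-two or one-to-many relationships expected after genome duplication, where tree-based or graph-based clustering methods are generally preferred.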

For the study of the evolution of wood formation in angiosperms, one approach would be to identify genes and pathways regulating wood formation that have been described in model systems (e.g. Populus or Arabidopsis) and survey their diversification among different lineages. This general approach has been successful in providing fundamental insights into the variation in flower (Becker et al. 2011; Irish and Litt 2005) and leaf (Tsiantis and Hay 2003; Dkhar and Pareek 2014; Tsukaya 2014; Champagne et al. 2007) morphology within the angiosperms. Examples of genes and mechanisms associated with wood formation that could be surveyed include those previously discussed, such as NAC-domain, Class I KNOX and Class III HD ZIP transcription factors, or the mechanisms described by the TDIF/CLE TDR/PXY WOX signaling pathway. However, the selection of candidate genes for study is still relatively subjective, and differences in sequence resources (e.g. genome-level sequence) and experimental tractability (e.g. ability to transform) among species make this candidate gene approach challenging and non-comprehensive. Indeed, because efficient transformation is difficult for many woody species, functional assessment of individual genes in multiple species would be difficult. However, techniques for visualizing gene transcripts using in situ hybridization, or protein epitopes using immunolocalization, within tissue sections have been extended to woody models including Populus (Du et al. 2009; Gerttula et al. 2015), and are relatively transferable across species. Examining expression patterns in wood forming tissues during secondary growth could potentially reveal informative differences in the timing or spatial expression of candidate genes among species with unique anatomical, biochemical or other differences in wood development.

A more comprehensive and widely applicable approach would be to use gene expression-based comparative genomics approaches to discover and describe genes and mechanisms underlying phenotypic variation across species. Next generation sequencing (e.g. mRNA sequencing) can now be used to develop the datasets required for such studies from practically any species. This approach has the advantage of discovering genes and mechanisms that may be invariable in population-based approaches or transparent to mutational approaches in model systems. Importantly, this strategy can simultaneously discover mechanisms regulating development and relate them to variation within specific lineages. An example of this approach is given by co-expression gene networks. In this approach, genes that display similar expression across a range of biological conditions (e.g. tissue types or developmental time points) are assigned to co-expression “modules” that often represent genes that work together in common biological processes. For example, a comparative gene regulatory network approach was used to comprehensively describe the transcriptional differences underlying variation in leaf shape for a survey of tomato species (Ichihashi et al. 2014). Gene modules were defined based on co-expression relationships, and were then annotated with features including enrichment for genes in specific Gene Ontology classes or correlation with phenotypic traits. To apply such an approach to wood formation, trees from a survey spanning a relevant taxonomic range would ideally be grown in a common environment, and then subjected to a variety of experimental perturbations that alter wood development (e.g. hormone treatment). Wood forming tissues would then be harvested and subjected to mRNA sequencing to provide data for comparative co-expression analyses within and among species.
Challenges to this approach include collecting together and propagating the required plant material from arboreta or other sources (Groover and Dosmann 2012), and reliably determining orthologous relationships among genes among the different species. However, this approach is currently technically tractable and can utilize analysis approaches and tools developed in other systems. Ideally, such a comparative approach could identify gene modules commonly involved in wood formation across species that might represent ancestral mechanisms, as well as lineage or species-specific modules that could underlie phenotypic differences among species.
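The idea of assigning genes to co-expression modules can be sketched in a few lines. The example below is a deliberately simplified illustration and not the procedure used in the cited tomato study: it computes Pearson correlations between expression profiles, links genes whose correlation meets an arbitrary threshold, and treats each connected group as a module. The gene names, expression values, threshold, and function names are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules: connected components of the graph in which
    two genes are linked when their correlation is >= threshold."""
    genes = sorted(expr)
    neighbors = {gene: set() for gene in genes}
    for i, gene in enumerate(genes):
        for other in genes[i + 1:]:
            if pearson(expr[gene], expr[other]) >= threshold:
                neighbors[gene].add(other)
                neighbors[other].add(gene)
    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        seen.add(gene)
        stack, module = [gene], []
        while stack:
            current = stack.pop()
            module.append(current)
            for nb in neighbors[current] - seen:
                seen.add(nb)
                stack.append(nb)
        modules.append(sorted(module))
    return modules


# Hypothetical expression values for five genes across five samples
# (e.g. wood forming tissue under different treatments).
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 11],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [10, 8, 6, 4, 1],
    "geneE": [3, 1, 4, 1, 5],
}
modules = coexpression_modules(expr)
print(modules)
```

Dedicated network tools typically use weighted correlations and hierarchical clustering rather than a hard threshold. For a comparative study, modules would be built per species and then compared through the orthology relationships of their member genes.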

New approaches could also be developed for functional genomics within at least some model tree systems that could be scaled to survey larger numbers of genes. An example of such an approach is given by an irradiation hybrid mutagenesis screen in Populus. In this approach, a controlled cross was made between two Populus species in which the pollen from the male parent was irradiated to create chromosomal breaks. In a cross between P. nigra and P. deltoides, over 55 % of the 500 F1 progeny produced contained deletions or insertions of chromosomal segments (Henry et al. 2015). The insertions and deletions were mapped with precision in each F1 progeny using low-coverage, whole-genome sequence data. The genomic data can now be used to assist in reverse or forward genetic screens, including association between altered gene dosage in specific regions of the genome and phenotypes of interest. This population also represents a rich source of genetic perturbation useful in ultimately modeling gene regulatory networks (Filkov 2005). A primary aim of this approach is to create novel genetic changes that alter the complex gene dosage relationships that are believed central to the regulation of heterosis and complex quantitative traits (Birchler et al. 2006) including wood development.
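The association between gene dosage in a genomic region and a phenotype can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis of Henry et al. (2015): each line is assumed to carry a list of mapped events given as (chromosome, start, end, copy number change), the baseline is assumed to be two copies, and the line identifiers, coordinates, and phenotype values are invented for the example.

```python
from statistics import mean


def dosage_at(events, chrom, pos, baseline=2):
    """Copy number at a position, given (chrom, start, end, change) events.

    Intervals are treated as half-open: start <= pos < end.
    """
    copies = baseline
    for ev_chrom, start, end, change in events:
        if ev_chrom == chrom and start <= pos < end:
            copies += change
    return copies


def phenotype_by_dosage(lines, chrom, pos):
    """Mean phenotype for each dosage class observed at one locus."""
    groups = {}
    for line in lines:
        copies = dosage_at(line["events"], chrom, pos)
        groups.setdefault(copies, []).append(line["phenotype"])
    return {copies: mean(values) for copies, values in sorted(groups.items())}


# Hypothetical F1 lines with deletions (-1) or insertions (+1) and a made-up
# quantitative wood trait.
lines = [
    {"id": "L1", "events": [("Chr01", 1_000_000, 3_000_000, -1)], "phenotype": 10.0},
    {"id": "L2", "events": [], "phenotype": 14.0},
    {"id": "L3", "events": [("Chr01", 2_000_000, 4_000_000, +1)], "phenotype": 18.0},
    {"id": "L4", "events": [("Chr02", 500_000, 900_000, -1)], "phenotype": 15.0},
    {"id": "L5", "events": [("Chr01", 2_200_000, 2_800_000, -1)], "phenotype": 11.0},
]
summary = phenotype_by_dosage(lines, "Chr01", 2_500_000)
print(summary)
```

A real screen would scan many genomic bins, use replicated phenotype measurements, and apply a statistical test with correction for multiple comparisons. The sketch only shows how overlapping indels translate into dosage classes at a locus.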

Lastly, imaging-based techniques and data are especially useful for all comparative studies of wood formation. Secondary vascular tissues and wood are characterized by complex three dimensional tissues comprised of multiple cell types, and have both radial and longitudinal developmental gradients to consider. An increasing number of options are available to visualize and quantify molecular and anatomical features that can then be integrated with genomics level information. An example of the power of imaging-based techniques is shown in Fig. 1. To determine how woody stems perceive and respond to gravity, an antibody was raised against a peptide from a Populus PIN-like auxin transport protein and used in immunolocalizations to reveal the gravity-sensing cells within the stem (Fig. 1a, b). Using an auxin-responsive DR5:GUS reporter, the consequence of radial auxin transport by these PIN-expressing cells was revealed in stems placed horizontally to induce tension wood on the upper side of the stem (Fig. 1c) or opposite wood on the lower side of the stem (Fig. 1d). Differential auxin response in the cambial zone versus the cortex in tension wood and opposite wood, respectively, provides insights into the mechanisms regulating these distinct wood development programs that can now be surveyed in angiosperm trees with differing gravitropic stem responses. Molecular phenotypes including cell wall components can be surveyed using an extensive battery of publicly available antibodies developed against cell wall epitopes (http://www.ccrc.uga.edu/~mao/wallmab/Home/Home.php). An example is shown in Fig. 2 for an arabinogalactan-recognizing antibody that labels mature gelatinous layers (G-layers) in tension wood fibers in Populus. A more generalized technique for high resolution, three dimensional imaging of wood tissues is laser ablation tomography (Fig. 3).
In this technique, a high powered laser is used to progressively and precisely remove successive layers of tissue, with each layer imaged at high resolution. The images can then be reconstructed in three dimensions and features quantified, such as number, patterning, length, and interconnectivity among vessels. This technique could presumably be applied to any woody species or sample.
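The quantification step can be illustrated with a small sketch. It assumes that each imaged layer has already been segmented into a binary mask in which 1 marks vessel lumen, and that stacking the masks gives a volume indexed as volume[z][y][x]; these assumptions, the toy data, and the function names are hypothetical and do not describe the vendor's software. Vessels are identified as 6-connected groups of voxels, and their number and axial extent are reported.

```python
from collections import deque


def label_vessels(volume):
    """Return the 6-connected components of truthy voxels in volume[z][y][x].

    Each component is a list of (z, y, x) voxel coordinates.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen, components = set(), []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not volume[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill starting from an unvisited vessel voxel
                seen.add((z, y, x))
                queue, component = deque([(z, y, x)]), []
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and volume[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                components.append(component)
    return components


def axial_length(component):
    """Number of consecutive image layers (z) spanned by one component."""
    zs = [z for z, _, _ in component]
    return max(zs) - min(zs) + 1


# Hypothetical stack of three segmented 3 x 4 layers.
volume = [
    [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0]],
]
vessels = label_vessels(volume)
lengths = [axial_length(v) for v in vessels]
print(len(vessels), lengths)
```

Lengths here are in units of layers. Converting them to physical distance would require the layer thickness, and measures of interconnectivity would need an explicit definition of contact between vessels.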

Fig. 1

Imaging of molecular events associated with tension wood formation in Populus. (a) Bright field image of stem cross section. (b) Confocal image of immunolocalization of a Populus PIN3-like protein in the same tissue section. The strong green signal corresponds to the endodermis, which is the innermost layer of the cortex. (c) Tension wood from a DR5:GUS Populus stem of a GA treated tree. (d) Opposite wood from a DR5:GUS Populus stem of a GA treated tree. Blue signal corresponds to auxin response (note that the response is strong in the cambial zone of tension wood, versus the cortex of opposite wood). CO cortex, CZ cambial zone, EN endodermis, SX secondary xylem, PI pith

Fig. 2

Imaging of arabinogalactan protein epitopes in Populus tension wood. (a) A cross section of a Populus stem that has transitioned to tension wood formation. The antibody JIM14 (red signal) strongly labels the gelatinous cell wall layer in the lumen of fibers within tension wood. The gelatinous layer takes time to mature and be labeled by the antibody, providing a molecular marker of fiber type and differentiation. (b) Higher magnification of tension wood fibers with labeled gelatinous layers. Blue signal is UV autofluorescence. CO cortex, CZ cambial zone, DF differentiating fibers, MF mature fibers, NW normal wood, TW tension wood

Fig. 3

Three dimensional rendering of wood sample using laser ablation tomography. Wood block rendered in three dimensions by integration of multiple z-stacked images. For information about this technology see http://l4is.com/. Image courtesy of Lasers for Innovative Solutions, LLC

Conclusions and Future Perspectives

The quickly advancing fields of genomics and computational biology make comparative and evolutionary genomic studies increasingly attractive. The large number of diverse angiosperm tree species of ecological or economic interest makes it infeasible to produce the range of resources needed for each to become a model species. However, sequencing-based experimental approaches such as mRNA sequencing, paired with computational approaches such as comparative gene co-expression network analysis, could be extended to most species. Importantly, comparative approaches can be much more powerful than one-species-at-a-time approaches and can address questions that those approaches cannot. Looking beyond a handful of model tree species would enable answers to such questions as, “what is the core set of genes and mechanisms that make a tree a tree?”