Introduction

One of the most attractive themes in modern biology is the pursuit of molecular mechanisms that underlie the phenotypic diversity upon which adaptive strategies are based. Insect body marking patterns have successfully diversified to adapt to their environments. Switches in larval coloration and body markings in the Lepidoptera are well known (Scoble 1992). The swallowtail butterfly, Papilio xuthus, changes its color pattern dramatically during the molt to the final instar (fourth molt, Fig. 1a). Juvenile instars of P. xuthus mimic bird droppings with many tubercle structures on their surface (Fig. 1a, arrowheads), whereas the fifth (last) instar shows a cryptic pattern. Recently, we found that this pattern switch is regulated by a juvenile hormone (JH, Futahashi and Fujiwara 2008). JH titer decreases during the fourth larval instar, and treatment with a JH analog (JHA) prevents this pattern switch. The expression levels or patterns of the genes involved in determining the larval body pattern should change during the fourth molt. We previously found that five genes involved in melanin synthesis—tyrosine hydroxylase, dopa decarboxylase, ebony, yellow, and GTP cyclohydrolase I—are differentially expressed between the third and fourth molts in accordance with the cuticular pigment pattern of the subsequent instar (Futahashi and Fujiwara 2005, 2006, 2007). Using complementary deoxyribonucleic acid (cDNA) subtraction methods, we cloned the hard cuticular protein genes associated with the specific tubercle structures (HCP1 and HCP2) and the bilin-binding protein gene (BBP) that is only expressed at the final molt (Futahashi and Fujiwara 2008).

Fig. 1

Larval body pattern of the swallowtail butterfly Papilio xuthus and expression analysis of 98 genes identified by cDNA subtraction. a Larval patterns of the third, fourth, and fifth instar larvae (lateral views). Inset shows the tubercle structures (arrowheads) on the surface. Scale bars = 1 cm. b Expression patterns of candidate genes analyzed by RT-PCR at the second, third, and fourth molts. The gene for the ribosomal protein L3 (rpL3) was used as a control. Three genes, actin, PxutCPR2, and PxutCPR4, were obtained from both subtractive libraries. Green arrowheads indicate the genes strongly expressed at the fourth molt, and red arrowheads indicate the genes weakly expressed at the fourth molt. c Relative abundance of mRNAs for cuticular protein genes was examined by semiquantitative RT-PCR. The dilutions for 1–6 thus range from none to 3,125-fold

In this paper, we report the expression patterns of 98 genes identified by cDNA subtraction. The spatial expression patterns of the clearly differentially expressed genes were analyzed by whole-mount in situ hybridization and compared with stage-specific body markings. We identified 13 genes associated with green coloration or tubercle structures, three of which (Fsg02, Psg12, and Psg14) were also reported by Futahashi and Fujiwara (2008).

Materials and methods

Experimental animals and developmental staging

P. xuthus was purchased from Eiko-Kagaku (Osaka, Japan), kindly provided by Dr. Akira Yamanaka (Yamaguchi University, Japan), or collected from the field. Larvae were reared on leaves of Zanthoxylum ailanthoides (Rutaceae) at 25°C under long-day conditions (16 h light/8 h dark). The duration of the fourth instar (the third ecdysis to head capsule slippage, HCS) was approximately 4 days, and the duration of the fourth molt (pharate fifth instar) was approximately 21 h. Staging of the molting period was based on the time at which HCS occurred.

cDNA subtraction

A whole dorsal integument from the thoracic 2 segment to the abdominal 7 segment was dissected from a larva. After the fat body and muscle attached to the epidermis were carefully removed, total epidermal ribonucleic acid (RNA) was isolated using TRI reagent (Sigma, St. Louis, MO, USA) according to the manufacturer’s instructions. Three or four specimens were used every 2 h from 0 to 20 h after HCS of each molt (total 36–40 specimens at each molt). Poly(A)+ RNA samples (≈2 μg) were prepared using the Micro-Fast Track messenger RNA (mRNA) isolation kit (Clontech, San Jose, CA, USA) and subjected to cDNA subtraction using the polymerase chain reaction (PCR)-select cDNA subtraction kit (Clontech) according to the manufacturer’s instructions. We screened for final instar-specific genes (Fsg) and penultimate instar-specific genes (Psg). Fsgs were isolated by subtracting the epidermal cDNA population of the third molt from that of the fourth molt, and Psgs were isolated by subtracting the epidermal cDNA population of the fourth molt from that of the third molt (Fig. 1a). In the initial subtraction screening, 40 randomly selected clones from each library were sequenced. We designated the three redundant cDNA fragments as Fsg01–Fsg03 in the fifth (final)-instar library and as Psg01–Psg03 in the fourth (penultimate)-instar library. To avoid repeatedly sequencing clones corresponding to these genes and thus improve the chances of isolating low-abundance clones, we added specific primers for Fsg01–Fsg03 or Psg01–Psg03 cDNAs (Supplementary Table 1) to the T7 and SP6 primers used for subsequent subcloning (Supplementary Fig. 1a). When the cloned inserts were analyzed by PCR and electrophoresis, the redundant clones produced multiple bands because they were amplified by the T7 and SP6 primers as well as by the redundant gene-specific primers, whereas the other clones produced single bands (Supplementary Fig. 1b).
We therefore sequenced only the single-band clones by using a 3100 DNA sequencer (Applied Biosystems, Foster City, CA, USA). Among 1,044 randomly selected colonies from the final instar-specific library and 1,028 from the penultimate instar-specific library, 252 and 554 colonies were sequenced, respectively (including the 40 colonies analyzed initially, Supplementary Fig. 1). We also checked eight multiple-band colonies from each library and confirmed that each contained an insert of one of Fsg01–Fsg03 or Psg01–Psg03, respectively. Excluding the 40 colonies analyzed initially, the redundancy rates were {(1,004 − 212)/1,004} × 100 = 79% and {(988 − 514)/988} × 100 = 48% for the final and penultimate instar-specific libraries, respectively. We obtained sequences of ∼650 nucleotides on average. To confirm the sequence of each cDNA, we performed sequencing from both ends. To identify the genes from which these partial sequences were derived, cDNA fragments were subjected to 5′ and 3′ rapid amplification of cDNA ends (RACE), and their nucleotide sequences were determined. Clustering was performed manually by grouping clones that shared greater than 95% identity over a sequence longer than 100 nucleotides. We also referred to BLAST results for clustering. After clustering, the 806 clones from both subtractive libraries were classified into 98 independent nonredundant contiguous cDNAs.
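The redundancy rates follow from the colony counts once the colonies sequenced in the initial screen are set aside. The Python sketch below reproduces that arithmetic as an illustration only. The function name `redundancy_rate` is hypothetical, and the assumption that 40 colonies per library belonged to the initial screen is inferred from the totals given in the text (1,044 − 40 = 1,004 and 1,028 − 40 = 988).

```python
# Illustrative check of the redundancy rates; names are hypothetical.
INITIAL_COLONIES = 40  # assumed: sequenced in the initial screen, per library


def redundancy_rate(selected: int, sequenced: int) -> float:
    """Percentage of PCR-screened colonies that were not sequenced.

    Both counts include the initial colonies, which are excluded here
    because they were sequenced before the band-based screen was used.
    """
    screened = selected - INITIAL_COLONIES
    single_band = sequenced - INITIAL_COLONIES
    if screened <= 0 or not 0 <= single_band <= screened:
        raise ValueError("inconsistent colony counts")
    return (screened - single_band) / screened * 100


final_rate = redundancy_rate(selected=1044, sequenced=252)
penultimate_rate = redundancy_rate(selected=1028, sequenced=554)

print(f"final instar-specific library: {final_rate:.0f}%")
print(f"penultimate instar-specific library: {penultimate_rate:.0f}%")
```

Under these assumptions the final instar-specific library gives about 79% and the penultimate instar-specific library about 48%, and the sequenced totals (252 + 554) sum to the 806 clones that were clustered.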

Expression analysis by RT-PCR

To analyze the transcription patterns of candidate stage-specific genes during each molt, total epidermal RNA was isolated from larvae every 4 h during a molting period using TRI reagent. RNA (2 μg) was reverse-transcribed with random primers (N6) using a First-Strand cDNA Synthesis Kit (Amersham, Sunnyvale, CA, USA). Mixtures of the cDNA samples from the different time points (the 4-h interval samples of each stage) were used as templates for the second, third, and fourth molts in the initial screening and in semiquantitative reverse transcriptase (RT) PCR, and individual 4-h cDNA samples, prepared separately from those used for the mixtures, were used as templates to confirm stage specificity. After activation of Taq DNA polymerase at 94°C for 2 min, PCR was performed for 30 cycles of 94°C for 30 s, 52–60°C for 30 s, and 72°C for 60 s. The primers for each gene are shown in Supplementary Table 1. The product lengths range from 80 to 655 bp. The gene for the ribosomal protein L3 (rpL3, Futahashi and Fujiwara 2006), which is expressed constitutively, was used as an internal control to normalize sample loading. For semiquantitative RT-PCR, RT products were subjected to fivefold serial dilutions, and the dilutions were then amplified in parallel by PCR with the primer pair for each gene.
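The fivefold serial dilution used for semiquantitative RT-PCR can be written out explicitly. The short Python sketch below lists the dilution factors for six templates, which is the number implied by the "1–6" labelling in the Fig. 1 legend. The variable names are illustrative and not part of the original protocol.

```python
# Fivefold serial dilution factors for six templates (1 = undiluted).
FOLD = 5
N_TEMPLATES = 6  # assumed from the "1-6" labelling in the Fig. 1c legend

dilution_factors = [FOLD ** step for step in range(N_TEMPLATES)]
# Relative amount of RT product remaining in each template.
relative_template = [1 / factor for factor in dilution_factors]

for number, factor in enumerate(dilution_factors, start=1):
    print(f"template {number}: {factor}-fold dilution")
```

The last factor, 5 to the power 5, is 3,125, which agrees with the range "none to 3,125-fold" stated in the figure legend.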

In situ hybridization

A whole dorsal integument from the thoracic 2 segment to the abdominal 7 segment was dissected from a larva. After the fat body and muscle attached to the epidermis were carefully removed, the larval epidermis was fixed immediately in 4% paraformaldehyde in phosphate-buffered saline. In situ hybridization was performed as described in Futahashi and Fujiwara (2005). Digoxigenin-labeled antisense RNA probes were used, and the color reaction was performed at room temperature in 100 mM Tris–HCl, 100 mM NaCl, and 50 mM MgCl2 (pH 9.5) containing 3.5 μl/ml 5-bromo-4-chloro-3-indolyl-phosphate, 4-toluidine salt, and 4.5 μl/ml nitroblue tetrazolium chloride. Digoxigenin-labeled sense-strand probes were used as negative controls. The template of the RNA probe for each gene was prepared using the same primers as for RT-PCR analysis, and the probe lengths are given in Supplementary Table 1. The probe lengths range from 150 to 455 bp. Six to 24 pieces of integument were used for each probe.

Phylogenetic analysis

Sequences were aligned using Clustal X (Thompson et al. 1997). Phylogenetic trees were constructed by the neighbor-joining method with the MEGA4 program (Tamura et al. 2007). The confidence of the various phylogenetic lineages was assessed by bootstrap analysis. We compared several related genes by phylogenetic analysis. The following sequences were used to create the diagram (see Fig. 2): Fsg02, P. xuthus Fsg02 (AB264632); Aa-JF23, Atrophaneura alcinous JF23 (BAG32140); Ms-INSa, Manduca sexta insecticyanin-a (P00305); Ms-INSb, M. sexta insecticyanin-b (Q00630); Bm-bombyrin, Bombyx mori bombyrin (NP_001036872); Pb-BBP, Pieris brassicae bilin-binding protein (P09464); Pr-BP1, Pieris rapae bilin-binding protein 1; Pr-BP2, P. rapae bilin-binding protein 2. Amino acid sequences of Pr-BP1 and Pr-BP2 were taken from Kim et al. (2006).

Fig. 2

Deduced amino acid sequence of Fsg02 protein and alignment of amino acid sequences of other related genes. The conserved four cysteine residues are denoted by asterisks. Conserved residues are highlighted, and functionally or structurally similar residues are shadowed

The culturing of juvenile instar epidermis

To investigate the effects of JH on gene expression, we dissected a whole dorsal integument, containing tubercle regions, from the thoracic 2 segment to the abdominal 7 segment of fourth instar larvae just after the third ecdysis and cultured it with or without JHA (methoprene) in 0.5 ml Grace’s medium at 25°C. To alter JH conditions, we precultured the epidermis with 1 μg/ml methoprene (+JH) or without methoprene (−JH) for 24 h (Fig. 7a). After the JH treatment, we applied 20-hydroxyecdysone (20E) to the medium for 16 h. The culture medium was then replaced with medium lacking 20E for 24 h (Fig. 7a). This 20E treatment mimics the ecdysone pulse during the fourth molt and is necessary for the induction of the fourth-molt-specific genes (Hiruma and Riddiford 1990). We compared mRNA expression in the cultured epidermis with that during the fourth molt by RT-PCR.
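The culture schedule can be laid out as a simple timeline. The Python sketch below encodes the three steps and their durations as stated above and computes the elapsed time at each transition. It assumes that the steps follow one another without gaps, which the text does not state explicitly, and the names `SCHEDULE` and `timeline` are illustrative.

```python
# Hypothetical encoding of the culture schedule; durations are from the text.
SCHEDULE = [
    ("preculture with (+JH) or without (-JH) methoprene", 24),
    ("20E added to the medium", 16),
    ("medium lacking 20E", 24),
]


def timeline(schedule):
    """Return (label, start_h, end_h) tuples for consecutive steps.

    Assumes each step starts as soon as the previous one ends.
    """
    elapsed = 0
    steps = []
    for label, hours in schedule:
        steps.append((label, elapsed, elapsed + hours))
        elapsed += hours
    return steps


steps = timeline(SCHEDULE)
total_hours = steps[-1][2]

for label, start, end in steps:
    print(f"{start:>2}-{end:>2} h: {label}")
print(f"total culture time: {total_hours} h")
```

Under this assumption the 20E pulse spans 24 to 40 h of culture and the whole treatment takes 64 h.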

Results

Subtraction screening of stage-specific epidermal cDNA clones

After analyzing 2,072 clones (including redundant clones) from the two subtractive libraries, we obtained 98 independent nonredundant contiguous cDNAs. Three cDNAs were found in both the penultimate and final instar libraries. BLASTx searches against Bombyx cuticular protein genes (Futahashi et al. 2008) and against all nonredundant sequences in GenBank (NCBI, http://blast.ncbi.nlm.nih.gov/Blast.cgi) showed that these three genes bore significant similarity to actin (accession no. CAA28818, P < e−150) and to two cuticular protein genes of the silkworm B. mori (BmorCPR2, P = e−47 and BmorCPR4, P = e−32), respectively. We denoted these genes as actin, PxutCPR2, and PxutCPR4, respectively. Expression analysis by RT-PCR indicated that these genes are not differentially expressed (Fig. 1b,c), suggesting that they are expressed ubiquitously and were not completely eliminated by the cDNA subtraction method. Because only three genes out of over 2,000 clones were present in both (reciprocally subtracted) libraries, we considered the subtraction step successful. Excluding these three genes, 31 candidate Fsgs (Fsg01–Fsg31) and 64 candidate Psgs (Psg01–Psg64) were isolated (Tables 1 and 2). A BLASTx search against Bombyx cuticular protein genes (Futahashi et al. 2008) showed that 39 genes have sequence similarity to them (cutoff threshold: P < e−10). Apart from the cuticular protein genes, BLASTx searches against all nonredundant sequences in GenBank showed that 49 genes display significant similarity with known proteins (cutoff threshold: P < e−10), and a further gene, Fsg25, had weak similarity to known proteins (P = 3e−9). Among the remaining six transcripts, four do not have long open reading frames (>200 bp), suggesting that these transcripts may be 3′ untranslated regions. The deduced amino acid sequences of the other two, Fsg20 and Psg36, showed no significant similarity with known proteins (Tables 1 and 2).
Fsg02, Psg12, and Psg14 were previously reported as BBP, HCP1, and HCP2, respectively, by Futahashi and Fujiwara (2008).
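The gene counts given in this section can be cross-checked with a short tally. The Python sketch below is an illustrative bookkeeping exercise only. The category labels are paraphrased from the text, and it assumes that the 39 genes with similarity to cuticular protein genes do not include PxutCPR2 and PxutCPR4, which were set aside with actin as genes found in both libraries.

```python
# Bookkeeping check of the gene counts; labels are paraphrased from the text.
candidates = {"Fsg": 31, "Psg": 64}
shared_genes = ["actin", "PxutCPR2", "PxutCPR4"]  # found in both libraries

annotation = {
    "similar to Bombyx cuticular protein genes": 39,
    "similar to other known proteins": 49,
    "weak similarity (Fsg25)": 1,
    "no long open reading frame": 4,
    "no significant similarity (Fsg20, Psg36)": 2,
}

total_candidates = sum(candidates.values())
total_annotated = sum(annotation.values())
total_cdnas = total_candidates + len(shared_genes)

print(f"candidate stage-specific genes: {total_candidates}")
print(f"genes accounted for by annotation: {total_annotated}")
print(f"nonredundant cDNAs including shared genes: {total_cdnas}")
```

With these assumptions the annotation categories sum to the 95 candidate genes, and adding the three shared genes gives the 98 nonredundant cDNAs.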

Table 1 Final instar-specific cDNA clones from cDNA subtraction and match probabilities (P < e−9) resulting from BLASTx search
Table 2 Penultimate instar-specific cDNA clones from cDNA subtraction and match probabilities (P < e−9) resulting from BLASTx search

Stage-specific expression analysis among the second, third, and fourth molts in the epidermis

We first screened genes with stage-specific expression by RT-PCR. The expression of genes involved in the larval body markings is expected to change during the fourth molt. As shown in Fig. 1b, two Fsgs (Fsg02 and Fsg20) were clearly expressed more strongly at the fourth molt than at the second or third molt (Fig. 1b, green arrowheads), and 11 Psgs (Psg01, Psg05, Psg10, Psg12, Psg13, Psg14, Psg15, Psg22, Psg27, Psg31, and Psg48) were clearly expressed more weakly at the fourth molt than at the second or third molt (Fig. 1b, red arrowheads). To confirm this stage specificity, the expression levels of these genes were also analyzed every 4 h during the third and fourth molts (Supplementary Fig. 2). We also confirmed by semiquantitative RT-PCR that these genes were differentially expressed between the third and the fourth molts (Fig. 1c). Although more quantitative analysis is needed to identify the stage specificity more clearly, we used these results as an initial discovery tool. Other genes also showed stage specificity. Expression of Fsg07, Fsg09, Fsg14, Fsg15, Fsg18, Psg16, and Psg38 was stronger at the third and fourth molts than at the second molt (Fig. 1b). Expression of Psg24, Psg42, Psg47, Psg49, Psg50, Psg55, and Psg60 was weaker at the third and fourth molts than at the second molt (Fig. 1b). However, this specificity did not correspond to the fourth molt, and we therefore excluded these genes from further study.

Fsg02 is similar to blue pigment-binding proteins

Fsg02 has sequence similarity to members of the lipocalin family of butterflies and moths (Fig. 2), including JF23 of A. alcinous (48% identity), bombyrin of B. mori (43% identity), bilin-binding protein of P. brassicae (41% identity), two bilin-binding proteins of P. rapae (41% and 41% identity), and insecticyanin-b and insecticyanin-a of M. sexta (39% and 38% identity, respectively). The bilin-binding protein sequences of P. rapae were taken from Kim et al. (2006), and the other genes showed significant sequence similarity in a BLASTx search against all nonredundant sequences in GenBank. The four conserved cysteine residues of these proteins are also conserved in Fsg02 (Fig. 2, asterisks). All of them are binding proteins for lipophilic ligands. JF23 is a putative binding protein for lipophilic substances related to butterfly oviposition (Tsuchihara et al. 2000), and bombyrin is a neural lipocalin present especially in the central nervous system (Sakai et al. 2001). Importantly, both bilin-binding protein and insecticyanin are blue pigment-binding proteins, which are associated with larval greenish coloration (Riddiford et al. 1990; Schmidt and Skerra 1994; Kim et al. 2006). The similarity between the Fsg02 amino acid sequence and the sequences of these proteins, together with its specific expression at the fourth molt, suggests that Fsg02 is involved in the greenish coloration of P. xuthus larvae. As described above, Fsg20 shows no similarity to other sequences in the databases, and no homolog is found even in the B. mori genome.
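Percent identity values such as those quoted above are computed from a pairwise alignment. The text does not state which denominator was used, and conventions differ between tools, so the Python sketch below shows only one common definition: identical residues divided by the number of columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not real Fsg02 or JF23 fragments.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    This is one of several conventions; it is an assumption here, not
    the definition used in the original analysis.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared * 100


# Toy aligned fragments for illustration only.
toy_a = "MKC-LLVAC"
toy_b = "MRCTLL-AC"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

In the toy example, seven columns are free of gaps and six of them match. Using alignment length or the shorter sequence length as the denominator would give different percentages, so values from different tools are not always directly comparable.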

Fsg02 and Fsg20 expression correlates strongly with the green region of the final instar

We used whole-mount in situ hybridization to investigate the relationships between the distribution of the mRNAs of the two Fsgs (Fsg02 and Fsg20) and larval green coloration. Fsg02 was strongly expressed in the presumptive green region of the fifth instar larvae (Fig. 3a,b). In the final instar, the P. xuthus larva exhibits overall green coloration, except for white and black markings. Green pigments were not observed in the black and white stripes at the edge of the V-shaped markings (Fig. 3a). Fsg02 expression was not detected in the two lines corresponding to these black (Fig. 3a, arrows) and white stripes (Fig. 3a, arrowheads). To confirm the relationship between Fsg02 expression and the V-shaped markings, an integument sample taken just before the fourth ecdysis (17 h after HCS), by which time the black stripe was already pigmented, was used for in situ hybridization. One line of the Fsg02-nonexpressing region coincided with the black stripe of the V-shaped markings (Fig. 3a, arrows). During the third molt, the Fsg02 signal was not detectable, except for spots on the abdomen (Fig. 3c, arrows). These spots probably coincide with the pale blue spots of the fourth instar larvae (Fig. 3c, white arrows). Thus, as expected from its presumed blue pigment-binding function, Fsg02 was specifically expressed in the presumptive green region of the fifth instar and the pale blue spot region of the fourth instar.

Fig. 3

Expression patterns of Fsg02 and Fsg20. Expression patterns of Fsg02 and Fsg20 at 15 or 17 h after HCS during the molt to the fifth instar (a, b) and at 14 h after HCS during the molt to the fourth instar (c) are shown (abdominal 4–5 segments [a, c] and thoracic 3 segment [b]). Lower panels in a show the magnification of red boxes in the upper panels. The black region was already pigmented at 17 h after HCS. Arrows and arrowheads in a indicate the black and white stripe at the edge of V-shaped markings, respectively. Arrows in b indicate the eyespot black region, and arrowheads in b indicate the region where Fsg20 signal is detected but Fsg02 signal is not. Arrows in c indicate the pale spot region. Scale bars = 1 mm

Like Fsg02, Fsg20 was also expressed in the presumptive green region of the fifth instar larva (Fig. 3a,b). Unlike that of Fsg02, the Fsg20 signal was not detected in one line of the V-shaped markings (Fig. 3a, arrows). Around eyespot markings, expression of Fsg20 differed from that of Fsg02. The Fsg02 signal was detected at the black eyespot region, but the Fsg20 signal was not (Fig. 3b, arrows). Around the black eyespot region, there were some regions where the Fsg20 signal was detected, but the Fsg02 signal was not (Fig. 3b, arrowheads). The body coloration was yellowish at the area where the Fsg20 signal alone was detected, suggesting that Fsg20 is associated with yellowish coloration. During the third molt, the Fsg20 signal was not detectable, even in the pale spot region (Fig. 3c). These results suggest that Fsg20 expression is associated with the yellow or green region of the final instar.

Several cuticular proteins are associated with juvenile instar-specific tubercle structures

Across the two subtractive libraries, we found 41 putative cuticular protein genes (17 Fsgs, 22 Psgs, and PxutCPR2 and PxutCPR4, the latter two being found in both subtractive libraries; Tables 1 and 2). Each of them contains a putative signal peptide sequence at the N terminus. Among these 41 cuticular protein genes, 11 (Psg01, Psg05, Psg10, Psg12, Psg13, Psg14, Psg15, Psg22, Psg27, Psg31, and Psg48) were strongly expressed at the second and third molts and were considered juvenile instar specific in the larval epidermis (Fig. 1b, red arrowheads). Juvenile instar larvae of P. xuthus have many tubercle structures on their surfaces (Fig. 1a, inset; Fig. 4, arrowheads). Cuticular proteins are necessary for exoskeleton construction and are good candidates for regulators of juvenile instar-specific tubercle characteristics. Interestingly, all 11 of these young larva-specific Psgs were expressed specifically in the presumptive tubercle regions at 14 h after HCS during the third molt (Fig. 4, arrowheads). The expression pattern of each Psg gene was slightly different. In the small tubercle region, Psg01 and Psg10 signals were barely detected and Psg05 and Psg48 signals were faintly detected, whereas the other Psgs were clearly expressed (Fig. 4, red arrowheads), suggesting that the composition of cuticular proteins differs according to tubercle size. We also analyzed the spatial expression patterns at 13 h after HCS and found similar expression patterns (data not shown). During the fourth molt, the expression of these genes was not detectable (Futahashi and Fujiwara 2008, and data not shown).

Fig. 4

Expression patterns of tubercle-associated hard cuticular protein genes at 14 h after HCS during the third molt. Black arrowheads indicate the position of large tubercle regions, whereas red arrowheads indicate the position of small tubercle regions. The fourth instar patterns are also shown (leftmost). Scale bars = 1 mm

Relationships between cuticular protein motif and expression pattern

Cuticular proteins have been categorized according to their conserved motifs. The type of each cuticular protein gene is denoted in color in Tables 1 and 2 (red: RR-1, blue: RR-2, green: Tweedle, yellow: cuticular protein with a 44-amino acid motif-like (CPFL), purple: glycine-rich, and brown: others). The most prevalent motif is the Rebers and Riddiford Consensus (R&R Consensus), which in an extended form is known to bind chitin (Rebers and Willis 2001; Togawa et al. 2004; Willis et al. 2005). Proteins with the R&R Consensus can be split into three groups, RR-1, RR-2, and RR-3, according to their amino acid sequences (Willis et al. 2005). It is generally claimed that RR-1 proteins are characteristic of soft and flexible cuticles, and RR-2 proteins are associated with stiff and hard cuticles (Willis et al. 2005). Because the precise localization of individual cuticular protein mRNAs has rarely been reported, we analyzed the relationships between the cuticular protein motifs and their expression patterns. Nine genes (Fsg13, Fsg16, Fsg21, Fsg28, Psg31, Psg38, Psg46, PxutCPR2, and PxutCPR4) encode RR-1 proteins, and ten genes (Psg01, Psg05, Psg09, Psg10, Psg13, Psg15, Psg22, Psg27, Psg34, and Psg48) encode RR-2 proteins, as judged by a tool based on profile hidden Markov models on the cuticleDB website (http://bioinformatics2.biol.uoa.gr/cuticleDB/index.jsp; Supplementary Table 2). Most of the juvenile instar tubercle-specific proteins are RR-2 proteins (Fig. 6); however, Psg31 is an RR-1 protein (Supplementary Table 2).

Among the RR-2 protein genes, Psg09 and Psg34 do not appear to be stage specific (Fig. 1b). By whole-mount in situ hybridization, we found that Psg09 was expressed broadly, whereas Psg34 was expressed in the basal cells of hairs at both the third and fourth molts (Fig. 5). Except for Psg31, the RR-1 protein genes were expressed at both the third and fourth molts (Fig. 1b). Interestingly, the expression of the RR-1 protein gene PxutCPR4 was weaker at the tubercles than in other regions (Fig. 5, arrowheads). We also analyzed the spatial expression patterns of cuticular proteins without the R&R Consensus. The expression of the glycine-rich type cuticular protein gene Psg32 was quite similar to that of PxutCPR4 (Fig. 5). As for CPFL proteins, there are three CPFL genes (Psg12, Psg14, and Psg56) in the two subtractive libraries, and all three have sequence similarity with BmorCPFL4 (Table 2). Although expression of Psg12 and Psg14 coincided with the tubercle region (Fig. 4), Psg56 was broadly expressed at both the third and the fourth molts (Fig. 5). We found similar expression patterns at 13 and 14 h after HCS during the third molt and at 13, 14, and 15 h after HCS during the fourth molt (data not shown). Thus, the expression pattern of each cuticular protein gene was intricately regulated and was not entirely associated with sequence similarity or a specific motif. Phylogenetic analysis of the cuticular proteins showed that the tubercle-associated cuticular proteins do not form a single cluster (Fig. 6, red boxes), suggesting that the association with tubercle development evolved independently.
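The relationship between motif type and tubercle-associated expression can be tabulated directly from the gene lists given in the text. The Python sketch below is an illustrative cross-tabulation using set operations. The gene names are taken from the text, whereas the variable names and the restriction to the RR-1, RR-2, and CPFL groups are choices made for this example.

```python
# Gene lists transcribed from the text; the grouping code is illustrative.
TUBERCLE = {
    "Psg01", "Psg05", "Psg10", "Psg12", "Psg13", "Psg14",
    "Psg15", "Psg22", "Psg27", "Psg31", "Psg48",
}
MOTIF = {
    "RR-1": {"Fsg13", "Fsg16", "Fsg21", "Fsg28", "Psg31",
             "Psg38", "Psg46", "PxutCPR2", "PxutCPR4"},
    "RR-2": {"Psg01", "Psg05", "Psg09", "Psg10", "Psg13",
             "Psg15", "Psg22", "Psg27", "Psg34", "Psg48"},
    "CPFL": {"Psg12", "Psg14", "Psg56"},
}

# Genes of each motif type that are, or are not, tubercle associated.
tubercle_by_motif = {m: sorted(g & TUBERCLE) for m, g in MOTIF.items()}
other_by_motif = {m: sorted(g - TUBERCLE) for m, g in MOTIF.items()}

for motif in MOTIF:
    print(motif, "tubercle:", tubercle_by_motif[motif])
    print(motif, "other:   ", other_by_motif[motif])
```

The tabulation shows the pattern described in the text: the 11 tubercle-associated genes are spread across three motif types (eight RR-2, one RR-1, and two CPFL), while each motif type also contains genes that are not tubercle associated, such as Psg09, Psg34, and Psg56.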

Fig. 5

Expression patterns of cuticular protein genes at abdominal 4–5 segments at 14 h after HCS during the third molt (upper panels) and at 15 h after HCS during the fourth molt (lower panels). Arrowheads indicate the tubercle positions of the fourth instar larva. The fourth and fifth instar patterns are also shown (leftmost). The type of each cuticular protein is shown below. Scale bars = 1 mm

Fig. 6

Neighbor-joining tree of cuticular protein genes based on their amino acid sequences. The numbers at the tree edges represent the bootstrap values. The scale bar indicates the evolutionary distance between the groups. Red boxes indicate the tubercle-associated cuticular proteins of P. xuthus in this study

Expression of stage-specific cuticular proteins is regulated by JH

We previously found that JH treatment induces HCP1 (Psg12) and HCP2 (Psg14) expression and suppresses BBP (Fsg02) expression in vivo (Futahashi and Fujiwara 2008). To investigate the effect of JH on the expression of stage-specific Fsgs and Psgs in vitro, we cultured juvenile instar epidermis with or without JHA (methoprene, Fig. 7a). The expression of the young larva-specific Psg mRNAs was suppressed in the absence of JHA, whereas the expression of PxutCPR2 and PxutCPR4 mRNAs was not (Fig. 7b, −JH). In contrast, JHA treatment of the epidermis induced the expression of the young larva-specific Psg mRNAs during the fourth molt triggered by the ecdysone pulse (Fig. 7b, +JH), which is consistent with the results shown in Fig. 1b. These results suggest that JH regulates the expression of the young larva-specific Psg genes and that high levels of JH in the early phase of the fourth instar cause juvenile-type gene expression. We did not observe Fsg02 or Fsg20 expression even in the absence of JHA (data not shown), suggesting that the culture conditions were insufficient or that another factor is necessary for the induction of these two Fsgs.

Fig. 7

a Time course of hormonal treatment in epidermis culture. See “Materials and methods” for detail. b Expression analysis of cuticular protein genes by RT-PCR. Larval epidermis during the second, third, and fourth molt (left) and epidermis cultured with JHA or without JHA (right) were used as the template. Culture condition is shown in a. The gene for the ribosomal protein L3 (rpL3) that is expressed constitutively in the cell was used as the control

Discussion

Expression analysis of the genes isolated by cDNA subtraction revealed that Fsg02 and Fsg20 are strongly expressed in the presumptive green regions only during the molt to the final instar (Fig. 1b,c). Also, several cuticular protein genes are strongly expressed in the tubercle region during the early larval molting period but not during the final larval molt (Fig. 1b). The expression of these genes coincides with either the final instar-specific green coloration or the complex juvenile instar-specific exoskeletal structure (Figs. 3 and 4). Our results provide novel molecular markers and insights into the molecular mechanisms of the larval color pattern and the body shape.

Genes involved in green coloration

The green coloration of butterfly larvae is due in part to the presence of blue and yellow pigments. Blue pigments are thought to function in conjunction with yellow carotenoids as camouflage (Riley et al. 1984), and they are usually associated with proteins in pigment–protein complexes (Law and Wells 1989). However, blue and yellow pigment-binding proteins have been identified in only a few species. Similarity analysis of the amino acid sequence of Fsg02, together with its region-specific expression at the fourth molt, suggests that Fsg02 may be the P. xuthus equivalent of bilin-binding protein (or insecticyanin).

Our present results also show that there are some differences among Fsg02, M. sexta insecticyanin, and P. brassicae bilin-binding protein. The first difference is in their expression patterns. Fsg02 and M. sexta insecticyanin are strongly expressed mainly in the epidermis (Li and Riddiford 1992, 1994), whereas P. brassicae bilin-binding protein is expressed not in the epidermis but in the fat body and gonads (Schmidt and Skerra 1994). Fsg02 is strongly expressed only at the final larval molt (Fig. 1b,c), whereas M. sexta insecticyanin is strongly expressed during all the early larval instars (Li and Riddiford 1992), and P. brassicae bilin-binding protein is expressed from the third ecdysis to the adult stage (Schmidt and Skerra 1994). The second difference is in their sequences. Notably, these three genes do not form a single cluster in the phylogenetic analysis. The highest degree of identity was observed between Fsg02 and JF23 of A. alcinous, a putative binding protein for lipophilic substances related to butterfly oviposition (Tsuchihara et al. 2000). Because the A. alcinous larva does not exhibit green coloration, JF23 is unlikely to be associated with the production of larval coloration. Although the sequence similarity among Fsg02, M. sexta insecticyanin, and P. brassicae bilin-binding protein is rather weak (Fig. 2), the three-dimensional structure of M. sexta insecticyanin is similar to that of P. brassicae bilin-binding protein (Holden et al. 1987; Huber et al. 1987; Schmidt and Skerra 1994), suggesting that these three proteins bind the same known ligand, biliverdin IXγ (Goodman et al. 1985; Huber et al. 1987). These findings suggest that these three proteins evolved independently to serve a common physiological role in each organism. Comparison of blue pigment-binding proteins among more species may clarify the evolution of both pigment-binding proteins and larval green coloration.

Expression analysis suggests that Fsg20 is also associated with green coloration (Fig. 3). Since the areas where Fsg20 was expressed but Fsg02 was not acquire a yellowish coloration (Fig. 3b, arrowheads), one possibility is that Fsg20 is associated with yellow pigment binding. Larvae of the closely related species Papilio machaon have many yellow spots in the black marking. Expression analysis of Fsg20-like genes among the green larvae of other species may clarify the relationship between Fsg20 and coloration.

The body markings and structure of the larva of P. xuthus change dramatically during the final molt; therefore, the epidermis is likely to display a unique gene expression pattern during the fourth molt. However, in a search for final instar-specific genes among the more than 1,000 clones isolated, only Fsg02 and Fsg20 are highly expressed at the final molt, suggesting that only a few genes can change body coloration dramatically.

Intricate regulation of the spatial expression pattern of each cuticular protein gene

The cuticle structures of the flexible (expansion) and rigid (nonexpansion) cuticles in the M. sexta larva are different (Wolfgang and Riddiford 1981). Several reports have suggested that RR-1 proteins are associated with flexible cuticle and RR-2 proteins with rigid cuticle (Cox and Willis 1985; Lampe and Willis 1994; Gu and Willis 2003; Willis et al. 2005). However, the precise spatial localization of each cuticular protein gene was largely unknown. We revealed the spatial expression patterns of many cuticular protein genes and found that the spatial expression pattern of each did not entirely correlate with its motif (Fig. 6). The expression of 11 cuticular protein genes coincided with the tubercle structures, suggesting that these proteins are associated with this complex structure. The combination of cuticular protein genes expressed at the tubercle region is intricately regulated. Some cuticular protein genes, such as PxutCPR4 and Psg32, are less expressed at the tubercle regions (Fig. 5), suggesting that a change in the expression pattern of flexible cuticular proteins is also necessary for the construction of tubercle structures. We found that the expression patterns of two RR-2 protein genes are quite different from those of the other RR-2 protein genes and that expression patterns differ even among the closely related CPFL protein genes (Fig. 5). Our findings indicate that it is difficult to predict the expression pattern of each cuticular protein gene by sequence similarity alone (Figs. 5 and 6). Although the physical differences of the cuticle between the tubercle and nontubercle regions remain to be studied, comparison of spatial expression patterns is necessary for understanding the function of each cuticular protein. The juvenile instar larva of P. xuthus is a good model for such analysis, because both rigid (tubercle) and flexible regions are easy to recognize.

In M. sexta, JH alters the combinations of cuticular protein genes expressed. Although the JH regulation in the epidermis has been reported between final larval instar and pupa (Riddiford et al. 2003), no differential regulation of gene expression by hormones has been reported between the third and fourth molts. We found that the expression of tubercle-associated cuticular protein genes is regulated by JH both in vivo (Futahashi and Fujiwara 2008) and in vitro (Fig. 7). The expression of these cuticular protein genes is induced by 20E in the presence of JH during a larval molt. Then, during the final larval molt, the lower level of JH prevents this induction. Cuticular proteins are good representative proteins with which to study the effects of JH on gene expression. In this study, we focused on genes whose expression patterns were obviously different at the fourth molt (Fig. 1b). A more quantitative comparison of the remaining genes may clarify the differences in gene expression between juvenile and final instar larvae.