Introduction

Bacteria efficiently convert carbon and nitrogen sources into various classes of intracellular and extracellular biopolymers with distinct chemical properties and biological functions1,2. These polymeric substances provide energy storage, adhesion, or protection functions in cells, and their production is regulated by environmental stimuli3. Over the past few decades, a better understanding of the molecular mechanism underlying the biosynthesis of biopolymers has shed light on the production of tailor-made biopolymers and their utility in the industrial field4,5. Among them, polypeptides contribute significantly to a sustainable society due to their biomass origin, biodegradability, and unique functionality6.

Cyanophycin, a nonribosomally synthesized polypeptide consisting of equimolar amounts of aspartate and arginine as the backbone and branched sidechain, respectively, was first identified in cyanobacteria in the form of opaque and light-scattering cytoplasmic granules (Supplementary Fig. 1). Cyanophycin has high nitrogen to carbon ratio of 1:2, making it a reservoir of fixed nitrogen in most cyanobacteria7. Simultaneous oxygenic photosynthesis and N2 fixation are a significant challenge for microorganisms because the O2 produced from CO2 fixation is inhibitory to nitrogenase, which catalyzes the conversion of N2 to NH38. To solve this problem, diazotrophs have developed physical strategies to temporally and spatially separate nitrogenase from O2. The heterocyst-forming cyanobacteria, such as Anabaena spp., perform CO2 fixation and N2 fixation in different cells: CO2 fixation in the vegetative cells and N2 fixation in heterocysts9. Other cyanobacteria, such as Trichodesmium spp., temporally segregate the processes by a diel cycle with CO2 fixation during the day and N2 fixation at night10. The fixed nitrogen accumulates at night as cyanophycin granules separated from the other cellular components but not as compounds that affect cellular metabolic dynamics. Cyanophycin is degraded by cyanophycinase and isopeptidase to release nitrogen to support cell growth during the day. Thus, cyanophycin granules serve as dynamic storage bodies in diazotrophic cyanobacteria to uncouple N2 fixation from overall growth dynamics9. Cyanophycin accumulation enables the optimization of the nitrogen utilization of cyanobacteria in natural environments with fluctuating and limiting nitrogen supplies11.

The biosynthesis of cyanophycin is achieved by a single enzyme named cyanophycin synthetase (CphA1) that catalyzes peptide elongation coupled with ATP hydrolysis5. CphA1 uses a low-molecular-weight cyanophycin (at least 3‒4 dipeptides long) as a primer for cyanophycin polymerization12. Recently, the tertiary structure of CphA1 has become available and reveals that amino acid polymerization is driven by three distinct domains13: the N-terminal domain (N domain), the middle glutathione synthetase-like domain (G domain) with the ATP-grasp fold, and the MurE-like muramyl ligase domain (M domain). The discrete active sites of the G and M domains are used for extending the Asp backbone and attaching the Arg sidechain, respectively (Supplementary Fig. 1). The N domain has been proposed to anchor cyanophycin polymers through electrostatic interactions13. However, it remains unclear how ATP-dependent addition of aspartate to cyanophycin is initiated at the catalytic site of the G domain.

In this work, given the importance of CphA1 in cyanophycin biosynthesis and the biotechnological value that structural information on enzyme-substrate interactions would provide, we investigate the CphA1 enzyme from the marine diazotroph Trichodesmium erythraeum IMS101 (TeCphA1), an important contributor to global nitrogen and carbon cycling14. We solve the TeCphA1 structure bound to ATP analogs, cyanophycin primer peptides, and the substrate aspartate, in addition to apo and ATP analog-bound structures. Key elements, including residues and protein dynamics required for enzymatic activity, are identified based on experimental evidence of the G and M domains. In particular, we reveal the structural bases for aspartate recognition and condensation, which deepen our understanding of the reaction of cyanophycin biosynthesis in the G domain. Moreover, our data show a potential role of protein dynamics in the catalytic efficiency for the reaction of Arg sidechain condensation.

Results

Cryo-EM structures of TeCphA1

The TeCphA1 structures in three different states were determined by single-particle cryogenic electron microscopy (cryo-EM): apo, the adenosine-5’-(γ-thio)-triphosphate (ATPγS, an ATP analog)-bound state, and the substrate-bound state (TeCphA1-aspartate-ATPγS-cyanophycin primer peptide complex) at 2.91, 2.96, and 2.64 Å resolution, respectively (Fig. 1a, Supplementary Figs. 25, and Supplementary Table 1). The cryo-EM data of the substrate-bound state were collected under a high concentration of l-aspartate (100 mM), cyanophycin primer peptide, (β-Asp-Arg)4, and ATPγS. TeCphA1 consists of three domains: an N-terminal domain (N domain, residues 1‒161) that adopts a partially similar structure to E. coli RNA-polymerase α-subunit13, a glutathione synthetase-like domain (G domain, residues 162‒487), and a MurE-like muramyl ligase domain (M domain, residues 488‒876) (Fig. 1b). TeCphA1 basically adopts a tetrameric assembly with D2 symmetry (Fig. 1a and Supplementary Fig. 6), which is a common architecture with another cyanobacterial CphA1 from Synechocystis sp. UTEX2470 (SuCphA1) (Supplementary Fig. 7)13.

Fig. 1: Tetrameric structure and domain architecture of TeCphA1.
figure 1

a Cryo-EM maps of TeCphA1 in the apo, ATPγS-bound, and substrate (ATPγS/aspartate/(β-Asp-Arg)4)-bound states. The maps are divided into six modules (N, Gcore, Glid, Gω, Mcore, and Mlid) and ATPγS, which are shown in different color coordination. The map of (β-Asp-Arg)4 is highlighted with a yellow dashed-line box in the density map of substrate-bound TeCphA1. A lump on the extra map (arrowhead) is observed between two Glid modules in the apo and ATPγS-bound states. Each map is viewed from a twofold axis of tetrameric TeCphA1. b Residue range of each module on the TeCphA sequence and the reaction scheme of the G and M domains. The color coordination of each module is consistent with that in Panel a. c Ribbon diagram of the tetrameric TeCphA structure in the substrate-bound state. Each protomer is divided into the above-defined modules with the same color coordination as Panel a.

The G domain, which is responsible for backbone elongation through linking aspartate to the C-terminal backbone carboxy group of cyanophycin, is based on Gcore (residues 162‒234, 306‒324, and 400‒487), with Glid (residues 235‒305) and Gω (residues 325‒399) modules inserted into the sequence (Fig. 1b and Supplementary Fig. 1). Gcore is a central module that provides a large dimer interface with another Gcore in the TeCphA1 tetramer (Fig. 1c and Supplementary Fig. 8a). Glid acts as a lid of the ATP-binding site of Gcore, and Gω is positioned to surround the substrate-binding site together with Glid. The relative map intensity of Gω versus Gcore and Glid is higher in the substrate-bound state than in the other two states (Fig. 1a), suggesting that Gω is clearly visible as a result of reduced mobility, which is probably due to the binding of aspartate and (β-Asp-Arg)4.

The M domain, which catalyzes the addition of the Arg sidechain to the β-carboxy group of the C-terminal Asp residue of cyanophycin (Fig. 1b), is divided into two modules, Mcore (residues 488‒723) and Mlid (residues 724‒876). Mcore contacts the intramolecular N domain and three Gcore to form the TeCphA1 tetramer (Fig. 1c and Supplementary Fig. 8b), resulting in the clear density map of Mcore (Fig. 1a), and ATPγS binds to the surface pocket of Mcore. Therefore, Mcore assumes a distinct catalytic site from the G domain in TeCphA1, which is similar to the structural observation of SuCphA113. On the other hand, the structural evidence of the ATP-bound forms of the G and M domains remains insufficient for the CphA1 enzymes from Acinetobacter baylyi DSM587 (AbCphA1) and Tatumella morbirosei DSM23827 (TmCphA1)13. Mlid is not well resolved in the apo- and ATPγS-bound states (Fig. 1a and Supplementary Fig. 8c), suggesting the mobile nature of Mlid. However, in the substrate-bound state, the Mlid modules were partially resolved around ATPγS in chains A and B (Fig. 1a and Supplementary Fig. 6). The density of Mlid was relatively weak compared with that of Mcore. A large shift of Mlid was observed in the other CphA1 enzymes13 and Mur ligases15,16,17.

TeCphA1 has a high sequence identity (69.6%) and similarity (83.2%) to SuCphA1 in the region consisting of the N, G, and M domains (residues 1‒876). We compared the structures of TeCphA1 in the substrate-bound state and SuCphA1 that bound a cyanophycin analog (β-Asp-Arg)8-NH2 and  an ATP analog 5’-adenylylmethylenediphosphonate (AMPPCP) to the G domain (PDB 7LGJ)13. These structures are similar to each other. The overall structures of the N domain and each module of the G and M domains (Gcore, Glid, Gω, Mcore, and Mlid) were quite similar between TeCphA1 and SuCphA1 (root-mean-square deviation (RMSD) = 0.521‒1.601 Å). The spatial arrangement of the N domain and the Gcore and Mcore modules was also well conserved between TeCphA1 and SuCphA1 (Supplementary Fig. 7b), whereas the relative orientation of the other modules versus Gcore and Mcore was different (Supplementary Fig. 7c, d). In particular, the distance between the Mlid and N domain of TeCphA1 is shorter than that of SuCphA1 (Supplementary Fig. 7d).

Functional roles of Gcore and Gω in adding aspartate to cyanophycin primer peptide

The TeCphA1 structure in the substrate-bound state visualizes the initial state of the catalytic reaction for the ATP-dependent addition of aspartate to the C-terminal backbone carboxy group of cyanophycin (Fig. 1b and Supplementary Fig. 1). In the cryo-EM map processed with C1 symmetry, all the TeCphA1 protomers in the tetramer bound a cyanophycin primer peptide, (β-Asp-Arg)4, on the surface of Gcore (Fig. 1a). The cryo-EM map showed a clear density for the C-terminal β-Asp-Arg dipeptide unit (4th unit) and the other Asp backbones with a hook-like shape (Fig. 2a). The Arg sidechain of the 3rd unit partially showed weak density. Since the Arg sidechains of the 1st and 2nd units were invisible in the cryo-EM map, these models could not be built in the substrate-bound TeCphA1 structure. In contrast, the Arg sidechain of the 2nd unit was visible in the cryo-EM map of SuCphA113, and its guanidium group was located near Ala188. This residue is substituted with Phe on many cyanobacterial CphA1, including TeCphA1 (Fig. 2a and Supplementary Fig. 9). The residue type at position 188 may affect the Arg-sidechain orientation of the 2nd unit on the active site of Gcore. In the TeCphA1 structure, the C-terminal backbone carboxy group and Arg sidechain of the 4th unit interact with Arg309 (the nearest atomic distance, 2.9 Å) and Glu215, which is derived from another protomer forming the Gcore-Gcore dimer (Glu215’) (2.8 Å), respectively (Fig. 2a and Supplementary Fig. 8a). Since the catalytic activity of TeCphA1 was impaired by the Ala substitutions of Glu215 and Arg309 (Fig. 2b and Supplementary Fig. 9), the interaction between the Gcore-Gcore dimer and the 4th unit seems to be required for the activity of TeCphA1.

Fig. 2: Aspartate and cyanophycin primer peptide recognition at the G domain of TeCphA1.
figure 2

a Binding site of ATPγS, the substrate aspartate, and a cyanophycin primer peptide, (β-Asp-Arg)4, on the surface of the G domain. The cryo-EM map carved 2 Å around Mg2+, ATPγS, the substrate aspartate, and (β-Asp-Arg)4 at contour level 5. Individual chains of the G domain dimer are colored cyan (chain D) and medium blue (chain B). The purple structure shows the large loop of Gω of chain D. The repeating units of (β-Asp-Arg)4 are numbered from the N-terminus. b Relative enzymatic activity of TeCphA1 (WT) and each G domain mutant, which was measured by detecting released phosphates. Data are presented as mean ± standard error of the mean (SEM) (n = 4 independent experiments). c Relative activity of TeCphA1 using the primer β-Asp-Arg peptide with different degrees of polymerization (DP = 1‒5). The data of the reaction solution with (right) and without (left) TeCphA1 are presented as mean ± SEM (n = 3 independent experiments). d Cryo-EM map of substrate-bound TeCphA1. Densities of the four Gω modules of the TeCphA1 tetramer are shown in purple.

The mutations of Glu215 and Arg309 (E215A and R309A) of the G domain were commonly analyzed in the two studies on TeCphA1 and SuCphA113. Of the two, R309A (and its corresponding mutation in SuCphA1) showed almost no activities. Both residues seem to be essential for the catalytic activities of TeCphA1 and SuCphA1. On the other hand, while the assay conditions were not the same between the two, the E215A mutation was different between TeCphA1 and SuCphA1. The E215A of TeCphA1 lost most of the catalytic activity (Fig. 2b), but the corresponding mutation of SuCphA1 retained nearly half the activity. Since the interaction between E215 and the C-terminal Arg sidechain of cyanophycin contributes more to the cyanophycin binding in TeCphA1 than in SuCphA1, the results of this mutational analysis seem to be explained by the difference in the structures. Although the importance of the Arg sidechains for the lack of poly-Asp polymerase activity has been explained by the structural observation of SuCphA113,18, our results showed that the Arg sidechain of the 4th unit is a major contributor to (β-Asp-Arg)4 recognition and determines the position of the 4th Asp backbone at the active site of the G domain to avoid polymerization progression with the aspartate backbone alone. Since (β-Asp-Arg)4 is well accommodated at the binding site of Gcore, a cyanophycin molecule with at least 3‒4 dipeptides long seems to be suitable as a primer peptide for TeCphA1 activity (Fig. 2c). This observation is consistent with the previous results of the other CphA118. According to the common structural features between TeCphA1 and SuCphA1, the cyanophycin primer peptide seems to adopt a hook-like shape at the active site of CphA1 (Fig. 2a).

While (β-Asp-Arg)4 was observed in all protomers, the substrate aspartate was observed only in one protomer (chain D) (Fig. 2a). The map intensity of Gω was significantly different among the four protomers, and Gω of chain D showed the highest map intensity of the four Gω (Fig. 2d). The substrate aspartate was visualized in the most stable Gω. The other Gω modules seemed to be more mobile than those in chain D. TeCphA1 required four substrates, l-aspartate, l-arginine, (β-Asp-Arg)4, and ATP, for the catalytic reaction (Supplementary Fig. 10a). Steady-state kinetic analysis showed a typical Michaelis-Menten kinetics for l-arginine, (β-Asp-Arg)4, and ATP (Supplementary Fig. 10b). While we have determined the apparent Km value for ATP, TeCphA1 has two active sites in the G and M domains for the different catalytic reactions with ATP. As observed for the other CphA1 enzyme19, the sequential condensation reaction of TeCphA1 may be rate-limited primarily by the site with lower affinity for ATP when the Km values for ATP differ between the G and M domains. Unlike these substrates, aspartate acts as a positive effector of the catalytic reaction with the Hill coefficient (h) 1.92 ± 0.07. In addition, Khalf and Kprime (= Khalfh) were estimated as 18.0 ± 1.1 mM and 254 ± 21 mM, respectively. As shown in Supplementary Fig. 10b, the catalytic activity of TeCphA1 increases with the increasing aspartate concentration; the cellular aspartate concentration may be a key regulator for the catalytic activity of TeCphA1. The results of the kinetic analysis for aspartate (Supplementary Fig. 10b) suggest that 100 mM aspartate is appropriate for the cryo-EM analysis. While it is unclear why the aspartate molecule was visible in only one subunit, the cryo-EM structure of TeCphA1 with ATPγS, (β-Asp-Arg)4, and aspartate may be a snapshot of the equilibrium state of complex formation between the enzyme and aspartate.

The substrate aspartate was accommodated in a small space beside Ser396 (Fig. 2a). This binding mode seemed to be conserved in SuCphA1 because SuCphA1 activity was lost by the S396W mutation13. The substituted Trp residue prevented the reaction of Asp backbone elongation by occupying the space for substrate aspartate binding (Fig. 2a). Residues contributing to substrate aspartate recognition are conserved in the cyanobacterial CphA1 enzymes (Fig. 2a and Supplementary Fig. 9). Asn394, a residue on the large loop of Gω (residues 389‒399), and Arg323 interacted with the β-carboxy group of the substrate aspartate at distances of 2.5 Å and 3.4 Å, respectively (Fig. 2a). The α-carboxy group of the substrate aspartate was close to the sidechain of Arg458 (2.5 Å) in the TeCphA1 structure. Each Ala mutation of Asn394, Arg323 or Arg458 impaired TeCphA1 activity (Fig. 2b). These observations are consistent with the notion that Gω functions as a module for substrate aspartate recognition.

Catalytically important residues in Gcore and Glid for Asp backbone elongation

The G domain adopts an ATP-grasp fold that requires nucleophilic attack of the C-terminal backbone carboxy group of cyanophycin on ATP for Asp backbone elongation13,20. However, since these two functional groups are negatively charged, it is necessary to avoid electrostatic repulsion between them. The ATP bound to the ATP-grasp fold typically interacts with one to three Mg2+ ions20. In the G domain of TeCphA1, the γ-phosphate chelates two Mg2+ ions together with two acidic residues, Asp431 and Glu450 (Figs. 2a and 3a). This structural feature is similar to SuCphA113 and other ATP-grasp enzymes, such as bacterial glutathione synthetases21 and N5-carboxyaminoimidazole ribonucleotide synthetase (PurK)20. In addition, the α- and β-phosphates of ATPγS interact with the positively charged sidechains of Lys261 and Lys220, respectively (Fig. 3a). While the triphosphate of ATP is neutralized overall by these interactions, there are still two conserved charged residues, Arg309 and Arg323, around the γ-phosphate. The substrate-bound TeCphA1 structure shows that these residues electrostatically interact with the C-terminal backbone carboxy group of (β-Asp-Arg)4 and β-carboxy group of the bound aspartate (Fig. 3a and Supplementary Fig. 9).

Fig. 3: Glid movement dependent on the binding of ATP and substrates, aspartate and (β-Asp-Arg)4.
figure 3

a Residue arrangement around ATPγS, the substrate aspartate, and the C-terminal backbone carboxy group of (β-Asp-Arg)4. Hydrogen bonds and Mg2+ coordination are shown as yellow dashed lines. b Sequence logo depicting different conserved motifs between the P-loopG of CphA1 and the P-loop of the GshF ATP-grasp-like module. The P-loop sequences are shown in red boxes. The numbers under the logo correspond to the residue numbers of TeCphA1 and GshF from Pasteurella multocida. c Structural comparison of Glid among three states of TeCphA1. A comparison was performed after superposing the Gcore modules of the three states. Green spheres represent two Mg2+ ions. Asterisks and red arrows indicate the position of Lys293 and the C-terminal backbone carboxy group of (β-Asp-Arg)4, respectively. d Superposed structures using the Gcore modules of the substrate-bound TeCphA1 and the AMPPCP/(β-Asp-Arg)8-NH2-bound SuCphA1 structure (PDB 7LGJ)13.

The substrate-bound-state structure suggests that the phosphate-binding loop of Glid (hereafter P-loopG, residues 263‒269) (Fig. 3b) plays a critical role in the catalytic reaction. In the substrate-bound state, His267 on the P-loopG is located between γ-phosphate and the C-terminal backbone carboxy group of (β-Asp-Arg)4 (Fig. 3a). This His267 position seems to be suitable to stabilize an acylphosphate intermediate together with Arg309, following the nucleophilic attack of the substrate aspartate. The importance of these residues was confirmed by the conservation of cyanobacterial CphA1 enzymes, including SuCphA1 and decreased activity of a mutant substituted with Ala both in TeCphA1 and SuCphA1 (Figs. 2b and 3b)13. This loop corresponds to the B loop of PurT and PurK22, which are ATP-grasp enzymes of E. coli. These enzymes function in the de novo purine biosynthetic pathway, and the B loop is required to efficiently generate an acylphosphate intermediate in these enzymes. However, His267, which seems to be important to stabilize the acylphosphate intermediate in TeCphA1, is not conserved in PurT and PurK22.

Glid motion in the catalytic mechanism of Asp backbone elongation

Due to the charge neutralization of the γ-phosphate of ATP and the C-terminal backbone carboxy group of cyanophycin, these two groups can approach each other until reaching the distance of nucleophilic attack. A comparison of the three states of the TeCphA1 structure (Fig. 3c) revealed conformational changes of Glid induced by ATP, the substrate aspartate and (β-Asp-Arg)4 binding. These conformational changes seem to contribute to optimizing the relative dispositions of ATP, aspartate, and (β-Asp-Arg)4 for the catalytic reaction. The binding of ATPγS causes Glid to move toward Gcore by 4.6 Å, which is the distance between the Cα atoms of Lys293 (Fig. 3c, asterisk), and the substrate aspartate and (β-Asp-Arg)4 binding induces a further shift of Glid by 4.3 Å. This Glid movement seems to change the binding position of ATPγS toward (β-Asp-Arg)4. Interestingly, Glid and ATPγS in the substrate-bound TeCphA1 structure are located at a position closer to Gcore than those in the SuCphA1 structure bound with the cyanophycin analog (β-Asp-Arg)8-NH2 (Fig. 3d). Since the residues contributing to the reaction of Asp backbone elongation in the G domain are conserved between TeCphA1 and SuCphA1 (Supplementary Fig. 9), the binding of the substrate aspartate seems to modulate the orientation of ATPγS together with Glid.

While the binding of ATP, the substrate aspartate, and (β-Asp-Arg)4 seems to cause the rearrangement of their relative orientations for the catalytic reaction, the γ-phosphate of ATP is less susceptible to nucleophilic attack by the C-terminal backbone carboxy group of (β-Asp-Arg)4, because they are located at a distance of ~6 Å (Fig. 3c). Although Glid moves toward the active site to change the relative positions of ATP and (β-Asp-Arg)4, a further mechanism is required to bridge this distance gap for the catalytic reaction to proceed. One of the possible mechanisms is the thermal motion of Glid, which is mentioned in relation to SuCphA113. Since the cryo-EM structures of TeCphA1 only represent the average position in the equilibrium of the Glid motion, it may be possible to consider that the protein dynamics allow these two molecules to occasionally access the distance of the catalytic reaction.

While the relative orientation of ATPγS and (β-Asp-Arg)4 in the substrate-bound state needs a greater change for the catalytic reaction, the relative orientation of (β-Asp-Arg)4 and the substrate aspartate in the G domain seems to be suitable for ATP-dependent backbone elongation (Fig. 1b and Supplementary Fig. 1). The amino group of substrate aspartate is located 4.1 Å from the carboxy group at the C-terminus of (β-Asp-Arg)4, and the position of the carboxy group may be adjusted by phosphorylation (Figs. 2a and 3c).

Structural and functional roles of Mlid in Arg sidechain condensation

In the substrate-bound state, Mcore was clearly visible in the four protomers of the TeCphA1 tetramer. ATPγS interacts with the conserved P-loop of Mcore (hereafter P-loopM, residues 495‒501) (Fig. 4a and Supplementary Fig. 9). Lys499 and an Mg2+ ion are located between the β-phosphate and γ-thiophosphate groups of ATPγS, and the Mg2+ ion is chelated by Thr500, Thr522, and Glu558. These interactions with ATP are highly conserved in the Mur ligases, which synthesize the peptide stem of bacterial peptidoglycan23,24. Since the Ala mutation of Lys499 impaired TeCphA1 activity (Fig. 4b), the interaction between Lys499 and ATP is also critical to the catalytic activity of CphA1 enzymes.

Fig. 4: Structural bases of Mlid for ATP-dependent Arg sidechain condensation to cyanophycin.
figure 4

a Binding site of ATPγS between Mcore (wheat) and Mlid (cyan) on the surface of the M domain. The cryo-EM map carved 2 Å around Mg2+ and ATPγS at contour level 10. The magenta structure shows the P-loopM. Hydrogen bonds and Mg2+ coordination are shown as yellow dashed lines. b Relative enzymatic activity of TeCphA1 (WT) and each M domain mutant, including the deletion mutant of Mlid (ΔMlid). Enzyme activities were measured by detecting the released phosphates. Data are presented as mean ± standard error of the mean (SEM) (n = 4 independent experiments). c Superposed structures of the M domain in three TeCphA1 states. Asterisks (*) represent the steric crashes of Phe692 with the adenine ring of ATPγS (pink) and Ala755 (ocher). d Localization of a cyanophycin analog, (β-Asp-Arg)3-Asn, on the N and M domains of TeCphA1. The model of (β-Asp-Arg)3-Asn was created by the superposition between the Mcore modules of the substrate-bound TeCphA1 structure and the ATP/(β-Asp-Arg)8-Asn-bound SuCphA1 structure (PDB 7LGQ)13.

While ATPγS is visible in the Mcore modules of all protomers, the cryo-EM densities for Mlid were observed weakly only in protomers A and B (Supplementary Fig. 11), suggesting a relatively mobile nature of Mlid, even in the substrate-bound state. In protomers A and B, Mlid and Mcore sandwich ATPγS (Figs. 1a and 4a), and Arg731 and His748 in Mlid interact with the α-phosphate and γ-thiophosphate groups of ATPγS. Since Mlid was not observed in the ATPγS-bound state, Mlid is unlikely essential to the stable binding of ATPγS.

However, since the deletion mutant of Mlid (ΔMlid) showed substantially reduced activity of TeCphA1, the Mlid module is necessary for the efficient catalytic reaction of TeCphA1 (Fig. 4b) and seems to play a significant role in ATP-dependent Arg sidechain condensation. Superposition of TeCphA1 in apo- and ATPγS-bound structures reveals a conformational change of Phe692 in Mcore upon ATPγS binding, and Phe692 and Asn497 sandwich the adenine ring of ATPγS (Fig. 4c). This sidechain orientation of Phe692 causes steric hindrance with the methyl group of Ala755 in Mlid (Fig. 4c and Supplementary Fig. 12). On the other hand, the sidechain position of Phe692 in the apo form inhibits ATP binding but not the interaction between Mcore and Mlid (Fig. 4c). The flexible nature of Mlid and the conformational change of Phe692 seem to cause mutually exclusive binding conformations of ATP and Mlid, which may facilitate the exchange of ATP and ADP in the reaction cycle. The invisible Mlid in protomers C and D can be explained by mutually exclusive binding conformations, and protomers C and D are considered ATPγS-binding states. Intriguingly, protomers A and B of the substrate-bound state showed simultaneous binding of ATPγS and Mlid to Mcore. We consider that Mlid and ATPγS in protomers A and B are in a metastable state that can be realized due to a high concentration of ATPγS. The interaction between Ala755 and Phe692 in protomers A and B may be energetically unfavored and thus easily changed to the Mlid-binding or ATPγS-binding state under lower ATPγS concentrations.

While ATPγS was visible on all M domains, (β-Asp-Arg)4 was not observed on the M domain of the substrate-bound TeCphA1 structure. This observation is consistent with the previous result that the Arg-unmodified Asp residue at the C-terminus of cyanophycin is required to bind to the M domain in the SuCphA1 structure13. When the TeCphA1 structure is superposed with the SuCphA1 structure in complex with the cyanophycin analog (β-Asp-Arg)8-Asn13, the cyanophycin analog is located between the N domain and Mcore of TeCphA1. Interestingly, we found a densely charged region with multiple Arg and Asp/Glu residues on the surface of Mlid (Fig. 4d). These residues seem to interact with the cyanophycin analog, as observed in SuCphA1. Mlid may contribute to guiding cyanophycin into the active site of Mcore. In the Mur ligases, the interaction with the substrate induces a rotation of the C-terminal domain, which corresponds to Mlid of CphA1 enzymes, resulting in a closed conformation15,17. However, the orientation of Mlid is almost unchanged by the substrate (cyanophycin) interaction in SuCphA1 (RMSD = 0.158 Å, 2,583 atoms of Mcore and Mlid)13.

Discussion

Cyanophycin is a unique biopolymer that consists of the β-Asp-Arg dipeptide as a repeating unit (Supplementary Fig. 1) and is produced by CphA1-dependent peptide synthesis. Our understanding of this polymerization process has been greatly improved by a recent report that provided structural snapshots of CphA113. However, the catalytic and regulatory mechanisms of this reaction have not been fully elucidated. In the present study, we determined the CphA1 structure with a complete set of substrates for the reaction of the G domain, aspartate and (β-Asp-Arg)4, which visualizes the initial state for ATP-dependent addition of aspartate to the C-terminal backbone carboxy group of a cyanophycin primer peptide. This structure proposes a feasible catalytic mechanism for the Asp backbone elongation reaction that occurs in the G domain (Fig. 5). In the apo state, Glid adopts an open conformation and is the furthest away from Gcore. ATP binding induces the movement of Glid toward Gcore to sandwich an ATP molecule, whereas Gω remains flexible. Cyanophycin and aspartate binding further changes Glid to a closed conformation, and the orientation of Gω is fixed to localize aspartate in a suitable position for the reaction by the interaction of Asn394 on the large loop with the substrate. Thus, ATP, the substrate aspartate, cyanophycin, and residues, including His267 on the P-loopG, are arranged to initiate the reaction of Asp backbone elongation. The C-terminal backbone carboxy group of cyanophycin is activated by ATP to form a high-energy acylphosphate intermediate required for the ATP-grasp enzymes22, which is stabilized by His267 on the P-loopG together with Arg309. The amino group of the substrate aspartate nucleophilically targets the carboxy carbon of the acylphosphate, thus generating a tetrahedral intermediate in which an oxyanion is stabilized by the interaction with Arg309 and Gly456. A new peptide bond forms between cyanophycin and the substrate aspartate with the release of phosphate.

Fig. 5: Schematic diagram of the catalytic reaction for aspartate addition to the C-terminal backbone carboxy group of cyanophycin.
figure 5

(i) G domain from a protomer in the apo state. Three modules, Gcore, Glid, and Gω, are shown in blue, pink, and purple, respectively. (ii) ATP-bound state. The chemical structure of ATP is depicted as AMP diphosphate. (iii) Substrate-bound state. The orientation of Glid further changes compared to the ATP-bound state. (iv) Feasible catalytic mechanism in the G domain. In this mechanism, a tetrahedral intermediate is formed by nucleophilic attack to an acylphosphate intermediate of cyanophycin by the amino group of aspartate. Arg309 is assumed to stabilize the acylphosphate intermediate and the tetrahedral intermediate together with His267 and Gly456, respectively.

The G domain of CphA1 enzymes adopts a fold similar to the ATP-grasp-like module of bacterial hybrid-type glutathione synthetase, GshF, which catalyzes peptide bond formation between the amino group of glycine and the C-terminal backbone carboxy group of γ-glutamyl cysteine13,25. The amino acid sequence of the P-loopG in CphA1 is different from that in GshF. In particular, the P-loopG has a His residue (His267) in CphA1 but a Tyr or Phe residue in GshF (Fig. 3b). In addition, Glu215 and Arg458 of TeCphA1 are substituted with Leu and Met/Tyr in GshF, respectively. These different residues of CphA1 from GshF seem to be optimized for the catalytic reaction for the Asp backbone elongation of cyanophycin. Mutational analysis suggested that His267 plays a critical role in the catalytic reaction for TeCphA1 (Figs. 2b and 5). Glu215 and Arg458 are required for the recognition of (β-Asp-Arg)4 and the substrate aspartate at the active site of Gcore, respectively (Fig. 2a, b).

Although Tlr2120 from Thermosynechococcus elongatus BP-1 has been found to be the only CphA1 enzyme that synthesizes cyanophycin in a primer-independent manner26, the other known CphA1 enzymes show primer-dependent activity. Efficient CphA1 activity requires a cyanophycin primer at least 3‒4 dipeptides long (Fig. 2c)18. Hence, to rapidly initiate cyanophycin synthesis upon the transition to the N2 fixation phase, cyanophycin primer peptides must be available in cyanobacterial cells. Another critical factor for CphA1 activity is the cellular level of aspartate. In the process of N2 fixation, aspartate is produced by the incorporation of nitrogen into oxaloacetate and is consumed directly for cyanophycin synthesis9. Aspartate is also required as a nitrogen donor for arginine biosynthesis in cyanobacteria27. Therefore, the intracellular aspartate concentration is one of the factors most likely to affect the progression of cyanophycin synthesis. The steady-state kinetics of TeCphA1 suggests an allosteric effect with positive cooperativity toward aspartate, and the reaction is, in fact, accelerated with an increasing aspartate concentration up to quite a high concentration range (Supplementary Fig. 10b). The structural comparison between the substrate-bound TeCphA1 and the SuCphA1 bound with the cyanophycin analog (β-Asp-Arg)8-NH2 showed that the binding of aspartate modulates the orientation of ATP together with Glid (Fig. 3d). This aspartate-mediated regulation of Glid seems reasonable to control the CphA1 activity associated with storage of fixed nitrogen in cyanobacteria to avoid wasteful phosphorylation of primer peptides under conditions in which aspartate is not available to CphA1 as a substrate.

In the apo and ATPγS-bound states, a large density was observed between two Glid modules and over the binding site of (β-Asp-Arg)4 (Fig. 1a, arrowhead). CphA1 is known to synthesize cyanophycin in E. coli cells28; hence, cyanophycin is the most likely candidate for the molecule with increased density. In addition, the polymerization reaction of CphA1 often terminates up to a polymer length of 25‒30 kDa26. The mobility of Glid is likely to be affected by the association of a high-molecular-weight molecule into the space between two Glid modules, and the regulation of Glid motion may also be related to the length determination and polymerization termination of cyanophycin, although this mechanism is a question that needs to be addressed in further study. On the other hand, the windshield wiper model using the N domain has been proposed as a promising model for predicting the efficiency of reactions using two distant active sites alternately13. CphA2, which synthesizes cyanophycin by directly linking β-Asp-Arg dipeptides, lacks not only the P-loopM but also the Mlid module to abolish the activity for Arg sidechain condensation29, which is supported by our mutational data of TeCphA1, which does not exert the CphA2 activity (Fig. 4b and Supplementary Fig. 10a). This molecular evolution may imply that the presence of Mlid is unfavorable for the reaction to proceed efficiently only in the G domain, rather than alternately in the two domains. The mobility of Mlid and its densely charged region (Fig. 4c) may play a role in the reactivity of both active sites together with the N domain.

Methods

Materials

Cyanophycin primer peptides β-Asp-Arg, (β-Asp-Arg)2, (β-Asp-Arg)3, (β-Asp-Arg)4, and (β-Asp-Arg)5 were synthesized via the solid phase method by Sangon Biotech (Shanghai, China). The synthesized peptides were confirmed using a TripleTOF 5600+ system (Sciex) with a TurboIonSpray (TIS) probe (Turbo V ion source, Sciex). Time of flight-mass spectroscopy (TOF-MS) spectra (m/z range, 100‒2000 Da) were recorded in positive-ion mode with the following parameters: curtain gas, 15 psi; ion source gas 1, 15 psi; ion source gas 2, 0 psi; ion spray floating voltage, 5500 V; and interface heater temperature, 400 °C. The mass spectrometer was recalibrated using an APCI positive calibration solution (Sciex). The obtained spectra are listed in Supplementary Fig. 13. All the other chemicals were purchased from Sigma–Aldrich or FUJIFILM Wako Pure Chemical Corporation.

Expression and purification of TeCphA1

The gene encoding TeCphA1 (ENA code: ABG51217) was amplified from the genomic DNA of T. erythraeum IMS101 with forward and reverse primers (CphA-for: TATACATATGAAAATCCTCAAACTCCAGA CATTACG; CphA-rev: GGTGCTCGAGAATTGAACTTTTTAAAACTT) containing 5’ 10-bp overhangs complementary to the vector sequence for seamless cloning (TransGen Biotech). The target gene was cloned into the pET-22b(+) vector (Novagen) between the restriction sites NdeI and XhoI along with a C-terminal hexa-histidine tag. All TeCphA1 mutants were prepared by inverse PCR for site-specific mutagenesis using PrimeSTAR Max DNA polymerase (Takara Bio) and primers listed in Supplementary Table 2.

Recombinant proteins were expressed in the E. coli strain BL21(DE3) (Novagen). Cells were inoculated in LB media supplemented with 50 mg ml‒1 ampicillin and cultured overnight. Upon reaching an optical density of ~0.6 at 600 nm, the temperature was decreased from 37°C to 16°C, and the culture was supplemented with 0.5 mM isopropyl β-d-1-thiogalactopyranoside. Protein expression was performed for 16 h, after which cells were harvested by centrifugation at 5,180 × g for 10 min, followed by resuspension in lysis buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 1 mM dithiothreitol (DTT), and 5 mM imidazole). Cells were lysed by sonication, and cell debris was then removed by centrifugation at 40,000 × g for 30 min. The supernatant fraction was loaded onto an equilibrated Ni-nitrilotriacetic acid (NTA) resin (Qiagen). After extensive washing with wash buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 1 mM DTT, and 20 mM imidazole), the recombinant proteins were eluted with elution buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 1 mM DTT, and 200 mM imidazole). The eluate was further purified by loading onto a Mono Q column (Cytiva) for anion exchange chromatography, and elution was then performed with a linear gradient of 0 to 1.0 M NaCl. After concentration with Vivaspin 20 devices (30 kDa cutoff; Sartorius), the protein solution was loaded onto a Superdex200 10/300 GL column (Cytiva) for size exclusion chromatography with buffer containing 20 mM Tris-HCl (pH 8.0), 400 mM NaCl, and 1 mM DTT. The purified TeCphA1 and its mutants were concentrated with Vivaspin 20 devices and then stored in buffer for size exclusion chromatography at ‒80°C until use.

Grid preparation and cryo-EM data collection

Cryo-EM samples were prepared by mixing 24 μM TeCphA1 with each solution. The final sample compositions were as follows: (1) apo state, 12 μM TeCphA1 in a solution containing 20 mM Tris-HCl (pH 8.0), 400 mM NaCl, and 1 mM DTT; (2) ATPγS-bound state, 2 μM TeCphA1 in a solution containing 50 mM Tris-HCl (pH 8.2), 200 mM NaCl, 20 mM KCl, 20 mM MgCl2, 1 mM DTT, 4 mM ATPγS, 5 mM l-aspartate, 0.5 mM l-arginine, and 10 mM (β-Asp-Arg)4; and (3) substrate-bound state, 4 μM TeCphA1 in a solution containing 50 mM Tris-HCl (pH 8.2), 67 mM NaCl, 20 mM KCl, 20 mM MgCl2, 1 mM DTT, 10 mM ATPγS, 100 mM l-aspartate, and 20 mM (β-Asp-Arg)4. Three microliters of sample was applied to a holey carbon grid (Quantifoil, Cu, R1.2/1.3, 300 mesh) rendered hydrophilic by a 30 s glow-discharge in air (11 mA current with PIB-10). The grid was blotted for 15 s with a blot force of 0 (sample 1), for 15 s with a blot force of 10 (sample 2), and for 10 s with a blot force of 10 (sample 3) and flash-frozen in liquid ethane using Vitrobot Mark IV (Thermo Fisher Scientific) at 18 °C and 100% humidity.

For apo and ATPγS-bound states, cryo-EM datasets were acquired with a Talos Arctica (Thermo Fisher Scientific) transmission electron microscope (TEM) operating at 200 kV in nanoprobe mode using EPU software for automated data collection (Supplementary Fig. 5a). The movie frames were collected by a 4 k × 4 k Falcon 3 direct electron detector (DED) in electron counting mode at a nominal magnification of 120,000, which yielded a pixel size of 0.88 Å pixel‒1. Forty-nine and fifty movie frames were recorded at an exposure of 1.02 and 1.00 electrons per Å2 per frame, corresponding to a total exposure of 50 e Å‒2, respectively. The defocus range was ‒1.0 to ‒2.5 μm for the apo and ATPγS-bound states. The dataset for the substrate-bound state was obtained with a 300 kV Titan Krios G3i TEM (Thermo Fisher Scientific) equipped with a Gatan Imaging Filter (GIF), Quantum energy filter (Gatan), and a K3 Summit DED (Gatan) (Supplementary Fig. 5a). The dataset was acquired in counting mode with a nominal magnification of 105,000 (yielding a calibrated pixel size of 0.83 Å pixel‒1) using SerialEM software30. Forty-nine movie frames were recorded at an exposure of 1.00 electrons per Å2 per frame, corresponding to a total exposure of 49 e Å‒2 and ‒0.84 to ‒1.8 μm of defocus range. Data collection details are listed in Supplementary Table 1.

Cryo-EM data processing

For the apo state dataset, 2,109 acquired movies were subjected to beam-induced motion correction by MotionCor231. The contrast transfer function (CTF) parameters were calculated using Gctf32. A total of 340,000 particles were fully automatically selected using SPHIRE crYOLO33,34 with a generalized model using a selection threshold of 0.1. The subsequent processes were performed using RELION3.135. The particles were extracted by downsampling to a pixel size of 2.64 Å pix−1 and subjected to 2D classification. A total of 180,018 particles (17 classes) that displayed secondary-structural elements were selected for ab initio map reconstruction. D2 symmetry was imposed on the generated ab initio map and used as an initial reference map for the 3D classification assuming asymmetry (C1). A total of 103,041 particles associated with the best 3D class with the highest resolution were recentered, re-extracted with a pixel size of 0.88 Å pix−1 and subjected to 3D refinement with D2 symmetry. Three rounds of CTF refinement35 and Bayesian polishing36 followed by 3D refinement with a soft-edged 3D mask yielded a D2 map with 3.11 Å resolution. To further improve the resolution, additional particles were selected using SPHIRE crYOLO33,34 with a relaxed selection threshold of 0.02. A total of 556,730 particles were extracted by downsampling to a pixel size of 2.64 Å pix−1 and subjected to 2D classification. A total of 429,149 particles (14 classes) that displayed secondary-structural elements were selected. The D2 map with a resolution of 3.11 Å obtained above was used as a reference map for the 3D classification assuming asymmetry (C1). A total of 426,572 particles associated with the best three 3D classes were recentered and re-extracted with a pixel size of 0.88 Å pix−1 and subjected to 3D refinement with D2 symmetry. Three rounds of CTF refinement35 and Bayesian polishing36 followed by 3D refinement with a soft-edged 3D mask yielded a 3.03 Å resolution map. Then, no-alignment 3D classification was conducted (D2 symmetry, two expected classes, T = 8) with a soft-edged 3D mask, and 161,823 particles were selected by choosing the best 3D class. The last 3D refinement (D2 symmetry, with a mask diameter of 260 Å) with a soft-edged 3D mask and subsequent postprocessing yielded a map with a global resolution of 2.91 Å according to the Fourier shell correlation (FSC) = 0.143 criteria37. The details of the cryo-EM data processing are shown in Supplementary Fig. 2, and the associated parameters and statistics are summarized in Supplementary Table 1. Directional FSC plots were calculated using the 3DFSC server to evaluate the evenness of the distribution of particle orientations (Supplementary Fig. 5b, c)38.

For the dataset of the ATPγS-bound state, 1,730 acquired movies were subjected to beam-induced motion correction by MotionCor230. The CTF parameters were calculated using Gctf32. A total of 927,603 particles were picked fully automatically using SPHIRE crYOLO33,34 with a generalized model using a selection threshold of 0.001. The subsequent processes were performed by using RELION3.135. The particles were extracted by downsampling to a pixel size of 2.64 Å pix−1 and subjected to 2D classification. The 424,332 particles (16 classes) that displayed secondary-structural elements were selected for ab initio map reconstruction. D2 symmetry was imposed on the generated ab initio map and used as an initial reference map for the 3D classification assuming asymmetry (C1). A total of 210,971 particles associated with the best 3D class with the highest resolution were recentered and re-extracted with a pixel size of 0.88 Å pix−1 and subjected to 3D refinement with D2 symmetry. Two rounds of CTF refinement35 and Bayesian polishing36 followed by 3D refinement with a soft-edged 3D mask yielded a D2 map with 2.91 Å resolution. Then, no-alignment 3D classification was conducted (D2 symmetry, two expected classes, T = 16) with a soft-edged 3D mask, and 49,842 particles were selected by choosing the best 3D class. The last 3D refinement (D2 symmetry, with a mask diameter of 260 Å) with a soft-edged 3D mask and subsequent postprocessing yielded a map with a global resolution of 2.96 Å according to the FSC = 0.143 criteria37. The processing strategy is shown in Supplementary Fig. 3, and the associated parameters and statistics are summarized in Supplementary Table 1. Directional FSC plots were calculated using the 3DFSC server to evaluate the evenness of the distribution of particle orientations (Supplementary Fig. 5b, c)38.

For the datasets of substrate-bound states, all 3,843 acquired movies were dose-fractionated and subjected to beam-induced motion correction implemented in RELION 3.135, and 3,735 motion-corrected micrographs were selected with total motion < 34 Å. The CTF parameters were calculated using CTFFIND439, and 3,730 micrographs were then selected with the estimated maximum resolutions <6 Å. A total of 1,371,057 particles were automatically selected from the selected micrographs in RELION 3.135 using template images, which were generated from 1,338 particles (9 classes) by 2D classification of the 2,008 manually selected particles. The particles were extracted by downsampling to a pixel size of 2.49 Å pix−1 and subjected to several rounds of 2D classification. The 296,927 particles (77 classes) were selected from the last round of 2D classification and applied to two rounds of 3D classification assuming asymmetry (C1). A total of 195,538 particles associated with the best 3D class with the highest resolution of the second round were recentered and re-extracted with a pixel size of 0.83 Å pix−1 and subjected to 3D refinement. CTF refinement35 and Bayesian polishing36 were applied to the resulting particles to refine the per-particle defocus, beam-tilt, and beam-induced motion corrections. The map generated by the 3D refinement with a soft-edged 3D mask after Bayesian polishing showed a strong indication of C2 symmetry, including the Mlid modules. Therefore, C2 symmetry was imposed on the C1 refined map, which was then compared with the original map. The visual inspection showed that the Mlid modules were more clearly resolved with the C2 map. Accordingly, C2 symmetry was assumed for the subsequent processing to further improve the resolution. Two rounds of 3D classification were performed using the C2 map, and 358,932 particle images (133 classes) were reselected from the last round of 2D classification to increase the structural homogeneity of the dataset. A total of 198,918 particles associated with the best class with the highest resolution of the second 3D classification were recentered, re-extracted with a pixel size of 0.83 Å pix−1, and then subjected to 3D refinement. After two cycles of CTF refinement35 and Bayesian polishing36, the resulting particles were subjected to two rounds of 3D refinement imposing C2 symmetry (first no 3D mask and second with a soft-edged 3D mask). Additionally, to investigate the possible asymmetrical nature of substrate binding as well as the Gω and Mlid modules of our research interests, the last two rounds of 3D refinement were repeated assuming asymmetry. Postprocessing yielded a map with a global resolution of 2.52 Å for the C2 process and 2.64 for the C1 process according to the FSC = 0.143 criteria37. The details of the cryo-EM data processing are shown in Supplementary Fig. 4, and the associated parameters and statistics are summarized in Supplementary Table 1. Directional FSC plots were calculated using the 3DFSC server to evaluate the evenness of the distribution of particle orientations (Supplementary Fig. 5b, c)38.

Model building and refinement

Each of the four models of Gcore and Mcore in the TeCphA1 tetramer was manually built based on the post-processed D2 cryo-EM map of the apo state using Coot40. Model refinement was performed using phenix.real_space_refine with the secondary-structure restraints generated by Phenix.secondary_structure_restraints in the PHENIX program suite41,42. The built models were fit to the C2 cryo-EM map of the substrate-bound state with UCSF Chimera43. The fitted models were readjusted into the map and automatically refined using Coot and PHENIX. The structural models of the other modules, the N domain, Glid, Gω, and Mlid, were generated using the protein structure modeling tools SWISS-MODEL44 and AlphaFold245 to utilize them as initial models for the further model building to generate relatively weak density maps. SWISS-MODEL was run using a part of the sequence of TeCphA1 (residues 218‒481) to build the initial models of Glid and Gω. The model with the highest GMQE (0.75) and QMEAN (‒0.84) scores were obtained from the crystal structure of γ-glutamate-cysteine ligase/glutathione synthetase (GshF, PDB 3LN7), sequence identity of 35.5%) as a template. AlphaFold2 predicted the initial models of the N domain (residues 1‒161) and Mlid (724‒879), in which almost the entire region showed high predicted local distance test (pLDDT) scores (>90), except for the terminal residues (residues 160 161, 724‒733, and 877‒879) and loop regions (residues 58‒65 and 826‒833). The initial models were fit to the C2 cryo-EM map of the substrate-bound state with UCSF Chimera. Manual model rebuilding and refinement were performed using Coot and PHENIX, respectively. The refined models were applied to the C1 cryo-EM maps of the substrate-bound state and D2 maps of the apo and ATPγS-bound states. After fitting each module to the maps with UCSF Chimera, the models were readjusted to the map and refined using Coot and PHENIX. Finally, ligands were manually fit in the map of the ATPγS-bound and substrate-bound states. The PDB coordinates AGS and 7ID were used for the models of ATPγS and a β-Asp-Arg dipeptide. All the final models were separately validated using MolProbity46 in PHENIX. The statistics of the model refinement are summarized in Supplementary Table 1. Figures were generated using UCSF Chimera and PyMol (Schrödinger).

CphA1 activity assays

The reaction mixture contained 2.5‒10 µg of purified TeCphA1, 50 mM Tris-HCl (pH 8.2), 20 mM KCl, 20 mM MgCl2, 1 mM DTT, 4 mM ATP, 5 mM l-aspartate, 0.5 mM l-arginine, and 2 mM synthetic cyanophycin primer peptide of different lengths (1- to 5-mer). These are essentially the same conditions as in a previous report19 with a small modification (replacement of 10 mM 2-mercaptoethanol with 1 mM DTT). For enzyme kinetics, the activity of TeCphA1 (1 μM as a tetramer) was measured by varying the concentration of one of the four substrates, ATP, l-aspartate, l-arginine, and (β-Asp-Arg)4. Kinetic analysis revealed that our enzyme-assay conditions for arginine, (β-Asp-Arg)4, and ATP were appropriate. While the concentration of aspartate is rather low for a stable enzyme-activity measurement, we determined the aspartate concentration at 5 mM by referring to an earlier experiment19. In the analysis of TeCphA1 mutants, a 4-mer peptide, (β-Asp-Arg)4, was used as a primer. The reaction was performed at 30°C for 30 min, with a total reaction volume of 25 µL. To confirm the TeCphA1 activity, the amount of released phosphate was measured by the colorimetric molybdenum blue method47. The postreaction solution was mixed with 250 µL of molybdate solution (0.5% ammonium molybdate and 0.5 M H2SO4) and 50 µL of SnCl2 solution (2 mg ml‒1 SnCl2 dissolved in HCl and 0.5 M H2SO4). After incubation for 6 min at 25°C, the absorbance at 660 nm was measured using a Shimadzu UV-1800 spectrometer (Shimadzu Corporation). Data were analyzed using GraphPad Prism (GraphPad Software). In the steady-state kinetics, the data for ATP, l-arginine, and (β-Asp-Arg)4 were fitted with a Michaelis-Menten curve, and the curve fitting for l-aspartate was performed using an allosteric sigmoidal equation as follows:

$${{{{{\rm{Reaction}}}}}}\; {{{{{\rm{rate}}}}}}=\frac{{V}_{{{{{{\rm{max }}}}}}}{[S]}^{h}}{{{K}_{{{{{{\rm{half}}}}}}}}^{h}+{[S]}^{h}}=\frac{{V}_{{{{{\rm{max }}}}}}{[S]}^{h}}{{K}_{{{{{{\rm{prime}}}}}}}+{[S]}^{h}}$$
(1)

where Vmax is the maximum enzyme velocity; [S] is the substrate concentration; Khalf is the concentration of substrate that produces a half-maximal enzyme velocity, and h is the Hill coefficient48.

Sequence alignments

A homologous sequence search was performed using the amino acid sequences of TeCphA1 (Accession No. MBS9770029.1) and GshF from Pasteurella multocida (Accession No. WP_010906990.1) in Protein BLAST49. CLUSTAL W50 was used for multiple sequence alignments using default parameters, and the results were displayed by ESPript 3.051. Sequence logos were created using multiple sequence alignment data in the WebLogo server52.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.