Introduction

Protein sequences encode the information needed to guide a protein along the folding pathways that lead to its biologically active fold. Nonetheless, it is protein function at the atomistic level that directs structure, i.e., a biological function requires the proper set of local protein conformations to be carried out. Three-dimensional structure information is usually described as a simple succession of repetitive structures (see Fig. 1), namely, the α-helix and the β-sheet, connected by “random” coil (Eisenberg 2003; Pauling and Corey 1950). Helical structures are locally stabilized by hydrogen bond patterns between backbone atoms (between residues i and i + 4) (Pauling et al. 1951), while extended structures are also maintained by hydrogen bonds, but between residues that are more distant in sequence (Pauling and Corey 1951a). They represent about one third and one fifth of the total residues, respectively. A third defined state, the β-turn, is characterized by a reversal of the polypeptide chain and is stabilized by a hydrogen bond between its first and last residues (Richardson 1981; Rose 1978; Venkatachalam 1968). Such structures account for 25% of the residues (Bornot and de Brevern 2006).

Fig. 1

Structural characteristics of three secondary structures. a Right-handed α-helix, b left-handed PPII, and c three β-strands forming a sheet. The cartoon representation highlights the structural geometry, while the ball-and-stick representation shows the atomic arrangement of the three secondary structures. The proline rings can be observed in (b), and a comparison of the oxygen (red) and nitrogen (blue) positions clearly indicates the absence of internal H-bonding in PPII. In a and c, the close proximity of oxygen and nitrogen atoms favours internal H-bonding. The high helical rise of PPII and its lack of internal H-bonding make its backbone highly solvent accessible. Visualization is done with the PyMOL software (Delano 2013) (color figure online)

However, another common repetitive conformation exists, characterized in the 1950s before the β-turns but often forgotten, namely, the poly-L-proline II (PPII) helix (Cowan et al. 1955; Pauling and Corey 1951b) (see Fig. 1b). It can be described as a left-handed helical structure with dihedral angles characteristic of β-strands and an overall shape resembling a triangular prism (Arnott and Dover 1968; Sasisekharan 1959) (see Fig. 2 for a comparison with other local structure conformations). The PPII helix has trans-isomers of the peptide bonds, with dihedral angles (Φ, Ψ) of [−75°, +150°]. The rise per residue of the PPII helix is 3.1 Å, with three residues per turn. Thus, this helix rises by 9.3 Å per turn, compared to the 6.0 Å pitch of a 3₁₀ helix. The primary reason for the open and relatively elongated geometry of PPII is the absence of backbone H-bond donor atoms, due to the cyclic side chain of proline residues. Therefore, the PPII conformation readily accepts H-bond donors from its environment or from third-party moieties, which enhances its solvation energy. PPII is commonly observed in the collagen triple helix and was hence deemed confined to fibrous proteins (Bochicchio and Tamburro 2002; Soman and Ramakrishnan 1983; Sreerama and Woody 1994, 2003). Circular dichroism studies subsequently showed that PPII is also present in folded proteins and in other structural and folding contexts. Later, Creamer and co-workers (Whittington et al. 2005) demonstrated the existence of PPII in denatured proteins, while NMR studies (Toal and Schweitzer-Stenner 2014) established PPII as a favoured local structure over α-helices in denatured states. Interestingly, the presence of proline residues is not a strict requirement for a PPII helix, which supports PPII as a distinct class of secondary structure.
Indeed, it has been advocated since 1993 (Adzhubei and Sternberg 1993) that PPII be included among the mainstream secondary structures, alongside α-helices and β-sheets. Strikingly, residues in PPII conformations represent nearly 5% of the total residues in a structure (Mansiaux et al. 2011), but the lack of widely used PPII assignment approaches prevents their systematic analysis.
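As a quick check on the helical parameters quoted above, the pitch of a helix is the rise per residue multiplied by the number of residues per turn. The short sketch below reproduces the 9.3 Å value for PPII. The α-helix values (3.6 residues per turn, 1.5 Å rise) and the 2.0 Å rise used for the 3₁₀ helix are textbook figures assumed here, not taken from this text, and the function and dictionary names are purely illustrative.

```python
def helix_pitch(rise_per_residue, residues_per_turn):
    """Pitch in Å per turn = rise per residue (Å) x residues per turn."""
    return rise_per_residue * residues_per_turn

# (rise per residue in Å, residues per turn)
HELICES = {
    "PPII": (3.1, 3.0),   # values quoted in the text
    "3_10": (2.0, 3.0),   # assumed; consistent with the 6.0 Å pitch in the text
    "alpha": (1.5, 3.6),  # assumed textbook values
}

for name, (rise, per_turn) in HELICES.items():
    print(f"{name:>5}: pitch = {helix_pitch(rise, per_turn):.1f} Å")
```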

Fig. 2

Orientation and structural organization of the different helices. a α-helix: right handed with a spherical coiling. b 3₁₀-helix, c π-helix, and d polyproline helix: left handed with a triangular prism coiling. Proline residues are marked in yellow. e PPII helix with the minimum possible number of residues; as few as three residues can adopt a PPII conformation. In this example, none of the residues is proline. The proline rings can be observed in (d). The high helical rise of PPII can be clearly seen. Visualisation is done with the PyMOL software (Delano 2013) (color figure online)

The structural properties of PPII make it highly suitable for partnered interactions. Since the backbone of PPII lacks any internal hydrogen bonding, it requires external partners for hydration. This property is one reason why PPII conformations interact with SH3 domains and thus play a regulatory role in crucial signalling pathways and cell recognition events involving SH3. Its distinctive structural properties, such as an open, elongated structure, suggest that PPII is also involved in interactions with nucleic acids. PPII has also been observed in amyloid fibrillar pathologies, such as Parkinson’s disease. Since the PPII helix is relatively small and flexible, it is highly useful in the design of cell-penetrating peptides (CPP). The current review covers the different context-dependent definitions of PPII and the various methodologies that assign PPII helices. It then reviews the role of PPII in protein–protein and protein–DNA interactions, the involvement of PPII conformations in pathologies, and recent advances in PPII scaffold applications.

Developments in PPII structural assignment

PPII dihedral angles are quite particular. The most classical way to analyse them is to use the Ramachandran map (1963), as shown in Fig. 3. The map is based on the dihedral angles between two adjacent peptide planes of the protein backbone, hinged at the Cα atoms. Rotation of the planes is restricted by steric clashes, which define the disallowed regions of the map. The map is therefore a very powerful tool for assessing the stability of a structure through a local analysis of the degrees of freedom of the backbone dihedral angles. Further evolution of the map led to the marking of areas for specific secondary structures, namely, α-helix, β-strands, and later β-turns (see Fig. 3a). More recently, an allowed region for PPII was delineated within the north-western quadrant of the map, the quadrant allowed for β-strands (see Fig. 3b). A recent review catalogues the evolution of the Ramachandran map in detail (Carugo and Djinovic-Carugo 2013). It is worth noting that Prof. Ramachandran conceived the idea from the collagen hydrogen bonding argument (Bella et al. 1994; Rich and Crick 1955), which arose from the presence of hydroxyproline.
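To make the notion of a backbone dihedral angle concrete, the sketch below computes a signed dihedral from four atomic positions and applies it to the usual definitions of Φ (C of the previous residue, N, Cα, C) and Ψ (N, Cα, C, N of the next residue). It is a generic geometric calculation in plain Python, not code from any of the assignment programs discussed here; the function names are illustrative, and coordinates are assumed to be (x, y, z) tuples in Å.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone (phi, psi) of one residue from five backbone atom positions."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)
```

Plotting Ψ against Φ for every residue of a structure gives the Ramachandran map; a residue near (−75°, +150°) would fall in the PPII region described above.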

Fig. 3

Ramachandran plot. a From a non-redundant data set of the Protein Data Bank. b Allowed region for the PPII helix, assigned using a modified DSSP approach (Chebrek et al. 2014; Mansiaux et al. 2011). Visualisation is done with the R software (R Core Team 2013)

More than 20 secondary structure assignment methods (SSAMs) have been published over 30 years (Aksianov and Alexeevski 2012; Cao et al. 2016; Carter et al. 2003; Cubellis et al. 2005b; Dupuis et al. 2004; Fodje and Al-Karadaghi 2002; Frishman and Argos 1995; Hosseini et al. 2008; Hutchinson and Thornton 1996; Kabsch and Sander 1983; King and Johnson 1999; Kneller and Hinsen 2015; Labesse et al. 1997; Law et al. 2014; Majumdar et al. 2005; Martin et al. 2005; Oluwatobi Salawu 2016; Parisien and Major 2005; Park et al. 2011; Richards and Kundrot 1988; Sklenar et al. 1989; Zacharias and Knapp 2014). They have been defined with various criteria (Offmann et al. 2007); the most popular SSAMs rely on backbone hydrogen bonding patterns (Carter et al. 2003; Fodje and Al-Karadaghi 2002; Frishman and Argos 1995; Kabsch and Sander 1983; Zhang and Sagui 2015).

Nonetheless, very few SSAMs assign PPII from protein coordinates. To be more precise, only five SSAMs include the assignment of PPII conformations. The first available approach was XTLSSTR (King and Johnson 1999), in which a structure is assigned using a simple approach akin to the visual inspection of secondary structures. It calculates three distances and two angles from the backbone geometry and then searches for amide–amide interactions. It assigns α-helix, 3₁₀ helix, extended β-strand, hydrogen-bonded and non-hydrogen-bonded turns, and polyproline (type II) helices.

SEGNO (Cubellis et al. 2005b) makes its assignments from distance and torsion angle calculations. For assigning PPII, it uses the dihedral angles between two peptide planes separated by one and two residues, named diheco and diheco2, respectively. Importantly, PPII is considered only when a residue is not defined as β-strand by SEGNO and lies within predefined ranges of Φ and Ψ angles. Then, taking into account the ranges of the four diheco angles (220–270 and 100–140), the PPII helical conformation is assigned to the residue. These thresholds are relaxed at the termini of a PPII helix; the minimum helix length is three residues, and the overall shape of PPII is deemed to resemble a triangular prism.

PROSS (Srinivasan and Rose 1999) uses the concept of mesostates from a torsional grid for its assignments. The grid consists of unit squares covering the whole area of a Ramachandran plot. Grids are of two kinds, depending on their unit area: a fine grid with smaller unit squares and a coarse grid with broader unit squares. Accordingly, each unit square is referred to as a fine or coarse mesostate. In principle, therefore, the Ramachandran plot is converted into a Φ/Ψ grid whose marked regions (allowed, favourable, and disallowed) each cover more than one mesostate. In an approach very similar to that of SEGNO, PROSS does not assign the PPII conformation directly either; rather, it resolves PPII among the residues left over after β-strand assignment.
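The idea of a torsional grid can be illustrated with a few lines of Python that map a (Φ, Ψ) pair to the unit square containing it. This is only a sketch of the binning concept, not the PROSS implementation: the cell widths of 30° (fine) and 60° (coarse), the (column, row) indexing starting at −180°, and the function name are all assumptions made for illustration.

```python
import math

def mesostate(phi, psi, cell=30.0):
    """Return the (column, row) index of the grid square containing (phi, psi).

    Angles are in degrees in [-180, 180]; `cell` is the square width in degrees.
    The grid is periodic, so +180 wraps around to the same square as -180.
    """
    n = int(round(360.0 / cell))
    col = int(math.floor((phi + 180.0) / cell)) % n
    row = int(math.floor((psi + 180.0) / cell)) % n
    return col, row

# The PPII core quoted in the text, binned on a fine and on a coarse grid
fine = mesostate(-75.0, 150.0, cell=30.0)    # assumed fine cell width
coarse = mesostate(-75.0, 150.0, cell=60.0)  # assumed coarse cell width
print(fine, coarse)
```

A region of the map, such as the one attributed to β-strand or PPII, would then be a set of such indices, and assigning a residue amounts to a set-membership test on its mesostate.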

DSSP-PPII (Mansiaux et al. 2011) is an extension of DSSP that adds dihedral angle parameters for PPII assignment, thus isolating PPII from coils. Kabsch and Sander’s DSSP (Kabsch and Sander 1983) has been the most widely used method. It is based on the detection of hydrogen bonds defined by an electrostatic criterion. It produces an elaborate eight-state secondary structure assignment: α-helix, 3₁₀ helix, π-helix, β-turn, bend, extended strand, β-bridge, and coil. DSSP has been implemented in numerous databases and software packages, e.g., PDB (Berman et al. 2000; Bernstein et al. 1977) and GROMACS (Pronk et al. 2013; Van Der Spoel et al. 2005). Although it is widely used and treated as a gold standard methodology, DSSP does not assign PPII. DSSP-PPII (Mansiaux et al. 2011) defines the core of the PPII region in dihedral space (Φ = −75°, Ψ = +145°) and extends it outwards by a tolerance ε, increased in steps of 1°. The value of ε is chosen as a compromise between the numbers of amino acids assigned as PPII by the three previous approaches, with an extra constraint: at least two consecutive residues must have dihedral angles assigned as PPII. One of the major features of this method is that it builds on DSSP, which is already an established and trusted method for the other secondary structure elements (SSE). The code can therefore, in principle, be adapted to any other assignment method, if and when required. A dedicated database has been proposed to the scientific community (Chebrek et al. 2014).
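The logic described above can be sketched as a post-processing step on an existing assignment: residues left as coil are relabelled as PPII when their (Φ, Ψ) lie within ε of the core values and at least two such residues are consecutive. The code below is a minimal illustration under stated assumptions, not the published DSSP-PPII code: the square shape of the region, the coil symbol "C", the PPII label "P", the tolerance of 30° used in the example, and all function names are hypothetical choices.

```python
PPII_PHI, PPII_PSI = -75.0, 145.0  # core values quoted in the text

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def in_ppii_region(phi, psi, eps):
    # Assumption: the region is a square of half-width eps around the core
    return angle_diff(phi, PPII_PHI) <= eps and angle_diff(psi, PPII_PSI) <= eps

def assign_ppii(states, angles, eps, min_len=2, coil="C", label="P"):
    """Relabel runs of >= min_len coil residues in the PPII region.

    states: string of one-letter states from a prior assignment
    angles: list of (phi, psi) tuples in degrees, one per residue
    """
    out = list(states)
    cand = [s == coil and in_ppii_region(phi, psi, eps)
            for s, (phi, psi) in zip(states, angles)]
    i = 0
    while i < len(cand):
        if not cand[i]:
            i += 1
            continue
        j = i
        while j < len(cand) and cand[j]:
            j += 1
        if j - i >= min_len:  # consecutive-residue constraint
            out[i:j] = label * (j - i)
        i = j
    return "".join(out)

# Toy example; eps=30.0 is an illustrative tolerance, not a published value
states = "HHCCCCEE"
angles = [(-60, -45), (-60, -45), (-70, 150), (-80, 140),
          (60, 40), (-75, 145), (-120, 130), (-120, 130)]
print(assign_ppii(states, angles, eps=30.0))
```

Because the step only touches residues that the underlying method left as coil, the helix and strand assignments are unchanged, which is the property the text highlights.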

ASSP (Kumar and Bansal 2015) is an extension of the helical geometry calculation program HELANAL-Plus (Bansal et al. 2000), which is used to calculate the local helical structure parameters: twist, rise, virtual torsion, and radius. ASSP uses the differences between these parameters calculated over two or more adjacent windows of four Cα atoms. Later in the protocol, overlaps are resolved on the basis of the established minimum lengths of helices: α (4), 3₁₀ (3), π (5), and PPII (3). PPII conformations are therefore assigned from the helical geometry of the local region. Since it uses HELANAL, which is in turn based on the Sugeta and Miyazawa and the Shakarji methods for helical geometry, ASSP tends to assign β-sheets less efficiently (Shakarji 1998; Sugeta and Miyazawa 1967). Kumar and Bansal applied their SSAM to a detailed analysis of PPII (Kumar and Bansal 2016) and found that nearly three quarters of PPII helices occur in conjunction with α-helices and β-strands, and serve as linkers as well. They also underline a large number of C–H···O H-bonds.
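To illustrate what local helical parameters over a window of four Cα atoms look like, the sketch below estimates twist, rise, and radius from four consecutive positions using second differences of the coordinates, in the spirit of the Sugeta and Miyazawa approach. It is a hedged reconstruction for illustration, not the HELANAL-Plus or ASSP code; the formulas, the sign convention, and the helper `ideal_helix` used to generate test coordinates are assumptions of this example.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def local_helix_params(ca1, ca2, ca3, ca4):
    """Twist (degrees), rise and radius (Å) from four consecutive Cα positions.

    Not defined for collinear points (zero second differences).
    """
    # Second differences point from the middle atom towards the helix axis
    b1 = tuple(a - 2 * b + c for a, b, c in zip(ca1, ca2, ca3))
    b2 = tuple(a - 2 * b + c for a, b, c in zip(ca2, ca3, ca4))
    n1, n2 = _norm(b1), _norm(b2)
    cos_t = max(-1.0, min(1.0, _dot(b1, b2) / (n1 * n2)))
    twist = math.degrees(math.acos(cos_t))
    axis = _cross(b1, b2)
    length = _norm(axis)
    axis = tuple(x / length for x in axis)
    rise = _dot(_sub(ca3, ca2), axis)  # displacement along the local axis
    radius = math.sqrt(n1 * n2) / (2.0 * (1.0 - cos_t))
    return twist, rise, radius

def ideal_helix(radius, twist_deg, rise, n=4):
    """Hypothetical helper: points on an ideal helix along the z axis."""
    t = math.radians(twist_deg)
    return [(radius * math.cos(k * t), radius * math.sin(k * t), k * rise)
            for k in range(n)]

print(local_helix_params(*ideal_helix(2.3, 100.0, 1.5)))
```

With this sign convention, a left-handed helix built with a negative twist returns a positive twist magnitude and a negative rise, so the sign of the rise carries the handedness. Comparing such values between adjacent windows is the kind of difference-based test the text describes.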

All these methods are well designed for PPII assignment. However, the number of PPII assignment approaches is still limited compared to the SSAMs available for other secondary structure elements, which remains a limitation for their use by the scientific community.

Survey of amino acids in PPII conformation

The 1993 paper by Adzhubei and Sternberg (1993) renewed interest in PPII as a mainstream secondary structure, alongside α-helices and β-sheets, and also underlined that PPII need not be composed only of proline residues. Numerous mutational studies, e.g., SH3 domain–PPII peptide binding analyses, provided the evidence that PPII conformations are favourable in the denatured state (Creamer 1998; Ferreon and Hilser 2003). Studies of residue-level mutations show that the PPII conformation is retained even after successive replacement of proline with alanine or glycine residues, implying that PPII helices are not constituted by a succession of proline residues alone. PPII should therefore rather be understood as a structural conformation found with different residue propensities in folded and unfolded states. Other experiments further establish PPII as a separate structural class (Adzhubei et al. 2013; Stapley and Creamer 1999).

Apart from these studies, the restricted coil library analysis performed by Jha et al. explores the influence of neighbours on residues with favourable PPII propensities (Jha et al. 2005). Examination of bias-free coil library sets reveals a dominant PPII conformation for a number of amino acid residues (Pro, Ala, Met, Glu, Leu, Asn, Cys, Gln, Lys, Gly, and Tyr). A similar set of propensities comes from Cubellis and co-workers, who analysed position-specific propensities in 5700 PPII helices and classified the data by peptide length (Cubellis et al. 2005a). Residues such as Ala, Met, Lys, Thr, and Leu favour the PPII conformation in longer peptides, while Asp, Ile, and Glu adopt the conformation in shorter peptides (<3 res). Trp, Phe, and Gly do not favour PPII; interestingly, however, Gly is present in a repetitive motif in the collagen triple helix, while Trp and Phe have been observed in crystal structures of PPII–hydrophobic motif interactions. These residues could thus stabilize and mark the terminus of a PPII helix (Cubellis et al. 2005a). In the most recent survey, Kumar and Bansal show that 40% of PPII helices contain no Pro residues. Besides, aromatic amino acids are avoided within the helix, while Gly, Asn, and Asp residues are preferred in the proximal flanking regions (Kumar and Bansal 2016).

Based on hard-sphere Monte Carlo simulations, the propagation of the PPII helix is logically explained by the interaction between the prolyl ring and the backbone (Cβ) of the previous residue. However, this logic breaks down when poly-alanine adopts a PPII conformation; a better explanation could therefore be the neighbouring environment and the presence of polar residues (Creamer 1998). PPII has no characteristic main chain H-bonding pattern; arguably, therefore, Ser, Thr, Gln, and other polar residues can stabilize the PPII helix through non-local hydrogen bonding with the backbone (Creamer 1998; Cubellis et al. 2005a). The overall survey reveals that amino acid propensities in PPII are highly context dependent. They seem to differ according to whether PPII occurs in a fibrous or a globular protein context.

Role of PPII in protein–protein (PPI) and DNA–protein interactions

The distinct feature of polyproline helices is that, unlike other SSEs, they have no internal hydrogen bonding, making the backbone, as well as the side chains, highly solvent accessible. Such conformations therefore tend to seek partners for hydrogen bonding and stabilization, and the sequence and structural characteristics of PPII make it worth probing for partnered interactions. SH3 domain models are among the important tools for studying the role of PPII in protein–protein and DNA–protein interactions. SH3 (Src homology 3) domains are small yet important structural domains in proteins involved in cell signalling and regulation, e.g., tyrosine kinases. SH3 domains are also well known to interact with PPII conformations (Agrawal and Kishan 2002). Hence, host-pathogen models designed with SH3 domains are critical for understanding the interaction space of the PPII conformation with respect to proteins and/or nucleic acids. Many studies focusing on signal transduction and cell–cell recognition have explored potential PPII–protein and PPII–nucleic acid interactions (Hicks and Hsu 2004; Williamson 1994). For instance, the C-terminus of Synapsin-I, a protein regulating synaptic vesicle transport in neurons, is a proline-rich region. Synapsin-I interacts with the cytoplasmic polyproline region of a membrane protein, vesicle-associated membrane protein 1 (VAMP-I) (Williamson 1994). Phosphorylation of a serine residue upstream of the C-terminal PPII helix regulates the secretion of a synaptic vesicle, while VAMP-I helps in recognition. Similarly, in the Ras-GTP signalling pathway, the SH3 domains of the adaptor protein bind to the polyproline region of the SoS protein (xPxxPPPψxPx), leading to exchange of GTP. Another set of interactions (Booker et al. 1992) occurs in vacuolar sorting, where the SH3 domain of phosphatidylinositol-3 kinase binds to the GTP-binding protein dynamin.
Structurally, it is acknowledged that the PPII helix-binding region of the SH3 domain is a smooth hydrophobic surface flanked by conserved charged residues (Booker et al. 1992). PPII interactions also have a significant structural–functional role in transcription, as many transcription factors have proline-rich termini (Koleske et al. 1992). This could also point to a role of PPII interactions in multimeric complex formation during transcription. A well-characterized case of PPII–protein interaction is RNA polymerase II (RNApolII). The C-terminus of RNApolII has multiple copies of the conserved motif YSPTSPS, which is in turn a two-fold SPXX motif. SPXX is a DNA-binding motif found in DNA-binding domains (Suzuki 1989; Suzuki et al. 1990). Furthermore, Hicks and Hsu (2004) investigated the structural aspects of PPII in DNA binding and recognition. Using three DNA-interacting proteins as examples, viz. the third K homology domain of NOVA-2 [see Fig. 4 (Lewis et al. 2000)], the Epstein–Barr nuclear antigen-1, and the Drosophila paired protein homeodomain, they quantify the binding of PPII to the minor groove of the nucleic acid and underline the specific and non-specific aspects of recognition. The optimal size and specific recognition offered by PPII backbone residues strongly argue for recognizing PPII as a nucleic acid binding motif (Hicks and Hsu 2004).

Fig. 4

Interaction of the Nova protein K homology domain with an RNA hairpin [PDB id: 1ec6_A (Lewis et al. 2000)]. The conserved motif of the variable loop is coloured in yellow. The two PPII helices are coloured in magenta. The occurrence of the C-terminal helix is reported to be the difference between the RNA-bound and unbound forms. Visualisation is done with the PyMOL software (Delano 2013) (color figure online)

Functional role of polyproline in diseases

The roles of PPII in protein–protein and DNA–protein interactions, and in sorting and transport mechanisms, have led to investigations of its involvement in pathologies and diseases. The KISS-1 receptor (KISS1R) has three Proline–Arginine–Arginine (PRR) triplets in its intracellular domain. The addition of a fourth triplet induces the formation of a PPII helix and inhibits the presentation of KISS1R at the cell membrane. The retention of KISS1R in the cytoplasm prevents its interaction with kisspeptin and thus abolishes the secretion of GnRH, leading to hypogonadotropic hypogonadism (Chevrier et al. 2013). Besides, several studies using the ROA (Raman optical activity) and VCD (vibrational circular dichroism) spectroscopic techniques confirm the presence of the PPII conformation in pathological fibrillar aggregates (Adzhubei et al. 2013; Blanch et al. 2000; Bochicchio and Tamburro 2002). The conversion of PPII to β-sheet conformation in an amyloidogenic precursor of human lysozyme may indicate an important role of PPII in numerous amyloid-based conformational disorders (Blanch et al. 2000). For instance, phosphorylation of a threonine flanked by a PPII helix in Tau protein leads to the misfolding and aggregation of microtubular proteins in Alzheimer’s disease (Syme et al. 2002). A similar role of PPII has been found in α-synuclein, responsible for aggregation in Alzheimer’s and Parkinson’s pathologies (Adzhubei et al. 2013). Taken together, these findings emphasize the need for a deeper understanding of the structural features of PPII (Adzhubei et al. 2016).

Recent advances in polyproline research

The growing interest in the physico-chemical and structural properties of PPII helices, especially their short, extended helical structure, has attracted the attention of pharmaceutical companies. Very recently, cell-penetrating vector approaches have been designed on the basis of PPII scaffolds (Eiriksdottir et al. 2010; Foged and Nielsen 2008; Franz et al. 2016; Geisler and Chmielewski 2009; Ruzza et al. 2004; Yamashita et al. 2016). As explained above, the PPII backbone has a high solvent accessibility and is thus highly hydrated in solvent. The use of PPII for cell penetration therefore poses the challenge of reconciling the hydration of the PPII-based moiety with its convenient uptake into hydrophobic membranes (Franz et al. 2016). Chmielewski’s group (Fillon et al. 2005) addressed this by designing and introducing cationic and hydrophobic moieties on the PPII backbone and observed no structural change. The compactness and inherent flexibility of the PPII conformation are the key to its adaptability; when accompanied by cationic and hydrophobic moieties, it becomes highly suitable as a cell-penetrating vector (Foged and Nielsen 2008). The study observes a tremendous increase in the uptake of PPII-based cell-penetrating peptides (CPP) compared to traditional ones. Another important difference is the claimed reduction in toxicity. This is based on the observation that a PPII scaffold-based CPP, the Sweet Arrow Peptide SAP(E), carries a net negative charge, unlike traditional CPPs, which are positively charged (Franz et al. 2016; Geisler and Chmielewski 2009; Li et al. 2010).

Conclusion and perspectives

The polyproline II helix is arguably a distinct member of the secondary structure elements, based on its geometry, sequence, and structure. PPII has a left-handed geometry, in contrast to the right-handedness of the common protein helices (see Fig. 2). Its sequence composition varies depending on whether it occurs in a globular or a fibrous protein environment. It is an interesting observation that proline, a major α-helix breaker/kink, adopts a distinct helical form itself when it occurs in succession. Moreover, PPII dominates over the α-helical form in the denatured state. Such examples can be appreciated in light of the expanse of the secondary structure space. Although the PPII conformation represents only 5% of the conformational space, we strongly advocate that it be considered among the regular secondary structures. Besides, it is at least as well represented as the 3₁₀ helices. The involvement of PPII–protein and PPII–nucleic acid interactions in different pathologies, structural applications, and drug carriers makes it an even more viable candidate for inclusion among the main regular secondary structures. Its potential role in Alzheimer’s and Parkinson’s diseases cannot be ignored, given recent publications on the subject. While the presence of PPII in regular, ordered, and disordered regions establishes its distinctiveness, this is not sufficient to capture the complete structural space of PPII conformations. Therefore, more assignment approaches and coil library experiments are needed to explore such conformations. This review addresses the neglect of conformations such as PPII and the bias towards “regular” secondary structures. Figure 5 shows the number of publications about PPII since 1968. The increase is clear, but remains limited: the number of papers has never exceeded 100 per year. Given the interest of this “lost” secondary structure, we can expect a better representation in the future.

Fig. 5

Year-wise publication trends on polyproline II helices. The bars depict the number of publications for the corresponding year on the x-axis. An exponential function is shown as a blue curve. Dark bars mark a sudden surge in publications compared to the previous year. Visualisation is done with the R software (R Core Team 2013)