Hepatitis B Core Protein, a Component of the Hepatitis B Virus

Hepatitis B Virus (HBV) belongs to the family Hepadnaviridae and is a major human pathogen. Approximately one third of the world population has been in contact with the virus (Alter 2003) and around 250 million people have become chronic carriers with an increased risk of developing liver cirrhosis and primary liver cancer. The virions consist of two shells. The outer shell is a membrane-containing envelope, which is densely packed with surface proteins (HBs). The inner shell is a capsid that is formed entirely by a protein called hepatitis B core protein (HBc) or hepatitis B core antigen (HBcAg). These HBc subunits assemble around a complex of viral pregenomic (pg) RNA and viral polymerase (protein P) (for review see Nassal 2008; Ning et al. 2017) and form an icosahedral capsid. Inside the capsid the viral polymerase reverse transcribes the RNA into partly double-stranded (ds) DNA. After reverse transcription is completed, capsids are either targeted towards the nucleus (early after infection) or they are enveloped by a lipid membrane, which is densely packed with HBs (Fig. 14.1). These mature virions (Summers and Mason 1982) are secreted together with a huge excess of subviral particles formed by HBs and of virus-like, enveloped particles containing empty, highly phosphorylated HBc capsids (Ning et al. 2011).

Fig. 14.1

Schematic representation of Hepatitis B Virus (HBV). a The viral capsid (grey) is enveloped by a lipid membrane (yellow), which is densely packed with surface proteins (green). b The three types of surface proteins (S-, M- and L-HBs) have N-terminal extensions of increasing length. The largest variant, L-HBs, adopts two topologies (topo I and topo II). Topo I supports binding of L-HBs to the cellular receptor and topo II to the capsid

The viral envelope is densely packed with three types of surface proteins (HBs), which are termed large (L), medium (M) and small (S). All types of HBs have the same C-terminal membrane-embedded domain that comprises S-HBs and consists of 4 transmembrane helices. M-HBs and L-HBs have N-terminal additions of increasing length. These additions are termed preS2 for the part that is common to M-HBs and L-HBs, and preS1 for the part that is unique to L-HBs. The whole N-terminal extension to S-HBs in L-HBs is commonly referred to as the PreS-domain (Fig. 14.1). For the envelopment of HBc, L-HBs and S-HBs are important while M-HBs is dispensable. Where these two surface proteins bind on the capsid is still debated.

In the envelope, L-HBs has a dual function as it binds to the cellular receptor during infection and to the capsid during envelopment. This is achieved by a dual topology (Bruss et al. 1994) in which the N-terminus of L-HBs can face either the luminal or the cytosolic side of the membrane (Fig. 14.1, topo I and topo II). Viral secretion by budding requires enveloped particles. Infective particles include capsids with a mature DNA-genome and an envelope that contains L-HBs. All other types of secreted particles are non-infective.

Structure of Hepatitis B Core Protein

The capsid forming human HBc is a small protein of only 183–185 amino acid residues. It consists of an N-terminal assembly domain (1–144), which forms the ordered part of the capsid and a C-terminal domain (CTD), which is largely disordered. The CTD is rich in arginines (arginine rich domain, ARD) and contains several phosphorylation sites, a nuclear export signal (NES) and nuclear localization signals (NLS) (Fig. 14.2). Thus the CTD accommodates the main functional sites of HBc that control ordered progression of viral maturation and its localization within the host cell.

Fig. 14.2

Cartoon of HBc-monomer [chain C of 6HTX (Böttcher and Nassal 2018)]. The assembly domain is shown in colour: N-terminal fulcrum in blue, central spike in green and hand-region in red. The hinge at Gly 111 between the spike and the hand region is indicated in magenta. Cys61, which forms an inner-dimer SS-bridge in HBc dimers, is shown in yellow. The helices are labelled in roman numerals following the numbering of 1QGT (Wynne et al. 1999). Most published structures resolve the assembly domain only up to residue 144. The subsequent CTD is largely disordered and contains seven sites that can be phosphorylated in vitro (Heger-Stevic et al. 2018a), the arginine rich domain (ARD), nuclear localization signals (NLS) and the nuclear export signal (NES). The last resolved residue of HBc in capsids points towards the capsid interior, while the N-terminus is located at the capsid exterior

The structure of the assembly domain of HBc is known from numerous studies by electron cryomicroscopy (Crowther et al. 1994; Kenney et al. 1995; Bottcher et al. 1997; Yu et al. 2013; Böttcher and Nassal 2018; Conway et al. 1997; Schlicksup et al. 2018) and X-ray crystallography (Alexander et al. 2013; Bourne et al. 2006; Venkatakrishnan et al. 2016; Wynne et al. 1999). The assembly domain is mainly α-helical and contains 5 helices (Bottcher et al. 1997; Wynne et al. 1999), which are numbered from the N- to the C-terminus (I–V). The N-terminal part of the assembly domain (residues 1–48) accommodates two short helices connected by extended strands of protein. This N-terminal part resembles a fulcrum and loops around the central helical hairpin formed by the two longest helices III and IV (residues 48–111). The central helical hairpin is followed by helix V which together with a subsequent proline rich turn and an extended stretch form a hand-like region (Fig. 14.2). Helices IV and V are flexibly linked at Gly 111 (Böttcher et al. 2006; Packianathan et al. 2010), enabling a hinge-like movement between the two domains.

Most structures resolve the assembly domain only up to residue 144 and in some exceptional cases up to residue 152 (Böttcher and Nassal 2018). However, the density downstream of residue 144 is very weak, suggesting either flexibility or lower occupancy with some of the C-termini being in different conformations. This region between the ordered, well-resolved assembly domain and the unresolved ARD is often referred to as the linker region. The unresolved residues following 152 contain the arginine rich domain and the phosphorylation sites, of which seven can be phosphorylated in one HBc-monomer in vitro (Heger-Stevic et al. 2018a).

HBc Forms Dimeric Building Blocks

The HBc monomer on its own is most likely unstable and has not yet been observed in solution. Folding experiments suggest that two HBc monomers associate as partly unfolded proteins into a dimer and then fold together (Alexander et al. 2013) into a mature dimer. The dimers are hammerhead-shaped (Crowther et al. 1994) with the handle as the inner dimer interface. This interface is formed by helices III and IV of each monomer. The dimer is further stabilized by an inner-dimer SS-bridge (Fig. 14.3, magenta) between the Cys 61 residues (Fig. 14.2, yellow) in the center of helix III of each monomer. The SS-bridge is dispensable for dimer and capsid formation (Nassal 1992) and slows down capsid assembly in vitro (Selzer et al. 2014).

Fig. 14.3

Dimer in cartoon representation [CD dimer of capsids of wt-HBc, 6HTX (Böttcher and Nassal 2018)]. The dimer is shown along the dimer axis from the capsid exterior. The inner dimer SS-bridge in the centre of the spike is shown in magenta. Positions of point mutants that stabilize dimers (L42A, F23A) and trimers of dimers (Y132A) and abolish capsid formation are shown as green balls

HBc, C-terminally truncated at residue 138, cannot assemble into capsids and remains in its dimeric form, while longer constructs are able to assemble into capsids under favourable conditions (Zlotnick et al. 1996). The dimeric form is also stabilized over capsids at pH 9.5 and at low ionic strength (Wingfield et al. 1995) and stays dimeric under reducing conditions, even at high protein concentrations (Freund et al. 2008). Certain point mutations abrogate the ability of HBc to assemble into capsids but still support dimer formation. Probably the best-known point mutation that abolishes capsid formation is the exchange of tyrosine 132 for alanine (Y132A). This position is at the strained turn of the hand region (Fig. 14.3). In human HBc such mutants form stable, planar trimers of dimers, whereas in woodchuck HBc the corresponding mutant forms dimers (Zhao et al. 2019). The inability of these mutants to form capsids, together with their ability to form well-diffracting crystals, has made them an important construct for routine high-resolution studies of effector binding (Zhou et al. 2017; Qiu et al. 2016; Klumpp et al. 2015) at well below 2 Å resolution. Other point mutations in the fulcrum such as F23A and L42A (Fig. 14.3) also abolish capsid formation and lead to dimers that are stable under a wide range of conditions (Alexander et al. 2013).

HBc Dimers Assemble into Icosahedral Capsids

Dimers are the main building block from which capsids with either T = 3 (180 subunits) or T = 4 (240 subunits) triangulation are formed. The diameter of the capsids across the tips of the spikes is 36 nm for T = 4 capsids and 32 nm for T = 3 capsids (Crowther et al. 1994). Thus T = 3 capsids provide about 70% of the volume of T = 4 capsids. The asymmetric units of T = 4 and T = 3 capsids both contain two types of HBc dimers with different spatial surroundings (Fig. 14.4): one type of dimer is located between a fivefold and a local sixfold symmetry axis (blue, cyan in Fig. 14.4) and the other is placed between two local sixfold symmetry axes (red, yellow in Fig. 14.4). In T = 4 capsids the dimer axes of all dimers are located at local twofold symmetry axes, meaning that these dimers do not have perfect twofold symmetry and the individual monomers deviate slightly in their structures to adapt to the different spatial surroundings. In T = 3 capsids only one of the two dimers is located at a local twofold symmetry axis, whereas the other dimer is located at a strict twofold axis and contributes only one of its monomers to the asymmetric unit. The dimer at the strict twofold symmetry axis in the T = 3 particles has perfect twofold symmetry with both monomers having identical structures.
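The subunit counts and the volume estimate quoted above follow from simple arithmetic. The sketch below is an illustration only and rests on two assumptions that are not taken from the cited studies: it uses the standard Caspar-Klug relation of 60 × T subunits per icosahedral shell, and it approximates each capsid as a sphere with the spike-tip diameter given in the text. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capsid:
    t_number: int       # triangulation number T
    diameter_nm: float  # diameter across the spike tips, as quoted in the text

    @property
    def subunits(self) -> int:
        # Caspar-Klug: 60 asymmetric units, each holding T subunits
        return 60 * self.t_number

    @property
    def dimers(self) -> int:
        # HBc capsids are built from dimers, so two subunits per building block
        return self.subunits // 2


def volume_ratio(small: Capsid, large: Capsid) -> float:
    # Assumption: capsids treated as spheres, so volume scales with diameter cubed
    return (small.diameter_nm / large.diameter_nm) ** 3


t3 = Capsid(t_number=3, diameter_nm=32.0)
t4 = Capsid(t_number=4, diameter_nm=36.0)

print(t3.subunits, t3.dimers)            # 180 90
print(t4.subunits, t4.dimers)            # 240 120
print(f"{volume_ratio(t3, t4):.2f}")     # 0.70
```

Under these assumptions the ratio (32/36)³ comes out at about 0.70, in line with the figure of roughly 70% stated above. A spherical approximation based on the outer diameter ignores shell thickness and the spikes, so the value should be read as an estimate.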

Fig. 14.4

HBc in the asymmetric unit. Left: Surface representation of a T = 4 capsid [EMDB 0271, (Böttcher and Nassal 2018)]. The surface is coloured according to the spatially different positions of the monomers. One type of dimer (blue, cyan) is located between a fivefold and a twofold symmetry axis. The other type of dimer (red, yellow) is located between two twofold symmetry axes. The maximal diameter of T = 4 capsids measured across the tips of the spikes is 36 nm. Right: The 4 subunits of the asymmetric unit [pdb 6HTX, (Böttcher and Nassal 2018)] superposed. The assembly domain is shown in colour corresponding to the spatial surroundings as indicated on the left. The resolved linker residues are shown in white. The structures deviate in the hand region to adapt to the different spatial surroundings around the five-fold and local six-fold symmetry axes. The most adaptation-affected side chains were identified by NMR and group into a bending region (light yellow) and a dynamic region (dark yellow) (Lecoq et al. 2018) and are indicated for chain C

Within the capsids the hammerhead-shaped dimers are orientated such that the head is part of the continuous protein shell and the handle forms a protruding spike. Thus capsids have either 90 (T = 3 particle) or 120 (T = 4 particle) spikes at their surface. At either side of the spikes are large triangular holes of approximately 2 nm diameter that fenestrate the capsid shell and are large enough to be crossed by small molecules or nucleotides.

In the capsids the hand-like C-terminal regions of the assembly domains, which form the opposite tips of the hammerhead, mediate the inter-dimer contacts. Five of these hand-like regions assemble around each strict five-fold symmetry axis and six around each local six-fold symmetry axis. These different spatial environments require slight adaptations in the structure of the protein (Fig. 14.4) that manifest as a swivel of the hand region relative to the spike with Gly111 as a hinge. Solid-state NMR confirms that the hand region comprises a dynamic region between residues 121 and 135 and a bending region between residues 117 and 120 that differ in their chemical shifts between the 4 monomers in the asymmetric unit (Lecoq et al. 2018).

T = 4 capsids are more common than T = 3 capsids and account for more than 90% of capsids in virions (Roseman et al. 2005; Seitz et al. 2007; Dryden et al. 2006). Early experiments with HBc expressed in E. coli suggested that the CTD, and especially the linker region between assembly domain and ARD (Watts et al. 2002), is decisive for whether T = 3 or T = 4 capsids are formed. In addition, full length HBc showed a strong dimorphism (Wingfield et al. 1995; Crowther et al. 1994) whereas HBc truncated at position 149 formed mainly T = 4 particles (Zlotnick et al. 1996). Later experiments with optimized, heterologous expression systems did not confirm the earlier described dimorphism of full length wt-HBc (e.g. Böttcher and Nassal 2018), suggesting that higher dimer concentrations are another decisive factor that controls the speed of capsid assembly (Lutomski et al. 2017) as well as the ratio between T = 3 and T = 4 particles (Harms et al. 2015). The critical dimer-concentration of truncated HBc1–149 for capsid formation is around 0.5 µM and is the same for T = 3 and T = 4 particles (Harms et al. 2015).

Important Sites of the Assembly Domain

HBc is densely packed with functionally important sites. Some of them map to the ordered assembly domain while others are located in the disordered CTD, which probably changes its accessibility during viral maturation. The assembly domain harbours sites that are important for capsid assembly, immunogenicity, formation of virions and secretion (Fig. 14.5). Many functionally important sites map to the inner dimer interface, such as the tips of the spikes and the surroundings of a hydrophobic pocket inside the spikes; others are located at the inter-dimer interface, like a hydrophobic heteroaryldihydropyrimidine-binding pocket between the hand region and fulcrum (Venkatakrishnan et al. 2016).

Fig. 14.5

Part of a capsid in surface representation coloured with important sites: evolutionary conserved residues are coloured in blue [motif I: light blue, motif II: dark blue, motif III: medium blue (Dill et al. 2016)]; residues essential for secretion (Ponsel and Bruss 2003) are coloured in red, mutations causing premature envelopment are shown in yellow and those leading to low level secretion phenotypes in orange (Le Pogam et al. 2000). Magenta highlights residue 80 at the tips of the spikes in the centre of the immune dominant region (Salfeld et al. 1989; Koschel et al. 2000). Green marks sites that interfere with capsid formation but not with dimer formation. These sites are mainly buried in the protein shell and not surface exposed. The left panel shows the outer surface, to which many important sites map. Interestingly, the upper part of the spikes only contains the immune dominant loop but is otherwise neither conserved nor contains residues that are important for secretion. The right panel shows the inner surface, which only displays 2 important residues: Gly 111 in blue in the conserved motif II and Asn 136 in red being essential for secretion

In addition, individual exposed residues of the hand region (R127, I126, F122, A137, N136, and I139), at the outer surface of the spike (L60, L95, and K96) and at the fulcrum (S17, F18) (Fig. 14.5) are essential for the secretion of enveloped virions with a mature genome (Ponsel and Bruss 2003). Even conservative mutations of these key residues abrogate secretion of virions with a mature genome (Ponsel and Bruss 2003) but not that of empty, enveloped capsids (Ning et al. 2018).

The Tips of the Spikes

The tips of the spikes are the most exposed site of the capsids and form the major immuno-dominant loop (Fig. 14.5) and probably the binding platform for the envelope. The tips of the spikes consist of the tight loops between helix III and IV containing residues 77–81. Each monomer contributes one of these loops to the tips. This generates a characteristic cleft between the two loops, which could act as a potential binding site for other factors. The width and length of the cleft are only large enough to support direct contact with 2–3 amino acids. The tip of each spike contains 4 negatively charged residues, which accumulate in the cleft and provide an exposed platform for potential electrostatic interactions (Fig. 14.6). Furthermore, the tips together with the upper half of the spikes are more flexible than the rest of the capsid, as indicated by the high B-factors and the low local resolution in this region of the capsids (Böttcher and Nassal 2018). MD simulations (Hadden et al. 2018) suggest that the tips sample a wide conformational space, forming a dynamic cleft between the loops. These observations are consistent with a potential binding site for small binders following an induced fit mechanism.

Fig. 14.6

Dimer in surface representation coloured with the relative surface charges. Positive charges are coloured in blue, negative charges in red. The tips of the spikes are the most negatively charged exposed region of the capsids. The fulcrum preceding helix III is also surface exposed and is the most positively charged region of the spike and leads to the entrance of the hydrophobic pocket in the interior of the spikes

The tips of the spikes also harbour the main immunogenic loop, which is centred on residue 80 (Salfeld et al. 1989; Conway et al. 1998; Belnap et al. 2003) at the start of helix IV. Deletion of residue 80 (Fig. 14.5, magenta) abolishes the reaction with human anti-HBc positive serum (Koschel et al. 2000) confirming the location of the immunogenic loop close to the tips of the spikes. Some antibodies show a strong preference for certain quasi-equivalent sites over others (Wu et al. 2013; Conway et al. 2003; Harris et al. 2006) demonstrating that the conformational differences between the four monomers in the asymmetric unit of the capsid can be recognized by antibodies although the structural differences of the four monomers are very minor at this position (Fig. 14.5). In certain complications of Hepatitis B virus infection that lead to acute liver failure, a strong B-cell response causes massive production of IgG against HBc (Farci et al. 2010). Antibodies from such patients bind towards the side of the spikes peripheral to the major immune dominant loop and show a strong conformational preference for some of the monomers over others (Wu et al. 2013). Typically only one monomer per dimer has a bound antibody and within the spatial context of the asymmetric unit it is always the same.

The tips of the spikes are also the contact sites with the envelope in virions (Dryden et al. 2006; Seitz et al. 2007) and bind to fragments of PreS and S-HBs (Muhamad et al. 2015). NMR-studies suggest that the S-HBs fragment forms a short helical segment and the PreS fragment a tight loop, which both bind with one face to the cleft at the tips of the spikes (Muhamad et al. 2015). Peptides that interfere with binding of L-HBs to the capsids (Dyson and Murray 1995) also attach to the cleft at the tips of the spikes (Böttcher et al. 1998; Freund et al. 2008). Contacts between capsid and envelope appear variable and approach from varying directions (Seitz et al. 2007). Although these observations strongly suggest that the tips of the spikes are the binding sites between capsid and envelope, none of the residues at the tips of the spikes appears to be of importance for envelopment, virion formation or secretion according to extensive mutational screens (Koschel et al. 2000; Ponsel and Bruss 2003). This apparent contradiction is not yet fully resolved. It might hint at an unspecific electrostatic interaction between HBc and the envelope, and could reflect the different roles of S-HBs and L-HBs in envelopment, with PreS probably mediating the specific contact away from the tips of the spikes (Ponsel and Bruss 2003; Pastor et al. 2019).

The Hydrophobic Pocket Inside the Spikes

The spikes accommodate a hydrophobic pocket (Fig. 14.6, lower panel, arrow) in their centre, which harbours Cys61 and F/I97. In capsids derived from patients’ serum (Roseman et al. 2005) and in some samples of capsids derived from HBc over-expressed in E. coli (Böttcher and Nassal 2018) the pocket contains an unknown factor. Its small size is consistent with a small molecule or an individual amino-acid side chain rather than with a folded motif of a potential protein binding partner. The pocket is of interest because it is lined by numerous residues that have been implicated in either low or premature secretion phenotypes. In particular, residue 97 is implicated in premature secretion. In wt-HBc this position is typically occupied by a large hydrophobic residue, such as phenylalanine (e.g. strains ayw and adyw) or isoleucine (e.g. strains adr and adw); however, in some naturally occurring mutants, these bulky residues are replaced by a smaller leucine. These mutants have a premature secretion phenotype such that capsids containing ssDNA are enveloped (Yuan et al. 1999a, b).

Structural differences between capsids of wt-HBc and of the F97L-mutant are subtle and affect the side chain orientation of residue 97 in helix IV and the disulphide-bridge at the adjacent helix III, which is always reduced in 97L but not in 97F. Both changes together enlarge the hydrophobic pocket (Böttcher and Nassal 2018) such that the pocket factor can penetrate deeper into the spike. The entrance of the pocket is also lined by P5 and L60 (Fig. 14.5, orange). The naturally occurring variants P5T and L60V are implicated in low level secretion phenotypes (Le Pogam et al. 2000). Residue 5 is close to the N-terminus of HBc, which is largely disordered in freshly prepared, phosphorylated capsids and folds slowly over a couple of days (Heger-Stevic et al. 2018b). This slow maturation in vitro might hint at a cis–trans isomerization of the proline, which is generally catalysed by peptidyl-prolyl-cis-trans-isomerases in vivo. Mutations of the proline are likely to abolish the structural maturation and thus interfere with processes that sense a folded N-terminal region. In vivo experiments in Huh cells show that mutating any of the pocket-delineating residues L60, L95, K96 and I126 to smaller alanines abrogates binding of L-HBs to HBc (Pastor et al. 2019). All these observations suggest that the pocket could be involved in binding PreS but is generally too small to accommodate parts of the PreS-sequence directly.

The Hydrophobic Pocket at the Inter Dimer Interface

HBc-dimers interact around the five-fold and local six-fold symmetry axes by packing the hand region of one dimer against the fulcrum and the hand region of the adjacent dimer. This inter-dimer contact forms a hydrophobic pocket between two dimers to which assembly modulators bind (Venkatakrishnan et al. 2016; Schlicksup et al. 2018). These assembly modulators fall into two categories: the heteroaryldihydropyrimidine derivatives (HAPs) (Stray et al. 2005) and the phenylpropenamide derivatives (Katen et al. 2010). While HAP derivatives lead to aberrant structures (Stray et al. 2005), phenylpropenamides accelerate capsid assembly and maintain the capsid structure (Katen et al. 2010). Both types of modulators have antiviral potential (for review see Zlotnick et al. 2015) but have a different mode of action. HAPs probably act by accumulating hexamers of dimers as intermediates (Stray et al. 2005). These hexamers are well stabilized and can lead to tube-like assemblies with hexamers of dimers as the main building block (Liu et al. 2017). In contrast, phenylpropenamides lead to accelerated formation of intact capsids that then progress into empty, enveloped virus-like particles (Feld et al. 2007). HAPs and phenylpropenamides both intercalate between helix V of the hand region in one dimer and helix IV of the spike below the fulcrum of the other dimer (Schlicksup et al. 2018; Katen et al. 2013). The changes in the individual asymmetric units are small but lead to long-range distortions that accumulate in the capsid structures (Katen et al. 2013).

Most of the structural binding studies of these modulators have used C-terminally truncated HBc1–149 variants. The truncation removes the ARD of the CTD and abolishes unspecific interactions with RNA. The resulting capsids are more homogeneous and therefore more amenable to high-resolution structure determination. However, in capsids, C-terminally truncated HBc and full length HBc have a somewhat different orientation of the C-terminal tail of the assembly domain (Yu et al. 2013), which impacts the inter-dimeric hydrophobic pocket. Superposing structures of truncated HBc variants with bound modulator (Schlicksup et al. 2018; Venkatakrishnan et al. 2016; Katen et al. 2013) onto full length HBc-capsids suggests that the modulators also affect residues 151–152 of the adjacent subunit (Böttcher and Nassal 2018), making an even more extended contact at the inter-dimer interface than suggested by the studies of the truncated HBc1–149. It is likely that part of the mode of action of the assembly modulators lies in co-ordinating the CTD and accelerating the interaction of these only weakly ordered residues to form stable capsids. Recently, more substance classes were discovered (Huber et al. 2019; Corcuera et al. 2018) which also target the same hydrophobic pocket and modulate capsid assembly. These modulators provide a versatile toolbox for interfering with viral maturation and thus for the development of new antiviral strategies.

The C-Terminal Domain (CTD)

The CTD downstream of residue 144 is largely disordered. It can be divided into a linker region and the arginine rich domain (ARD). Parts of the linker region have been resolved as weak density in high resolution EM-maps of empty, phosphorylated capsids (Böttcher and Nassal 2018). This linker region folds into a kinked stretch that resembles the loop and stretch of the hand region. The linker packs against the last resolved residues of the assembly domain in the adjacent dimer and contributes to the inter-dimer contact. This involvement in inter-dimer contacts probably explains its impact on the morphology of capsids (Zlotnick et al. 1996).

The ARD downstream of the linker carries many positive charges on its 16 arginines and interacts randomly with host-RNA. These arginines are interspersed with six serines, which can be phosphorylated and which are phosphorylated in HBc-dimers prior to capsid assembly (Zhao et al. 2018). There is evidence that the dephosphorylation of HBc is concomitant with the encapsidation of the pregenomic RNA (Zhao et al. 2018). If HBc is phosphorylated in vitro, capsids no longer incorporate random host RNA and remain empty (Heger-Stevic et al. 2018b). Similarly, the packaging of host RNA is reduced by phosphorylation mimics in which three of the serines are replaced by negatively charged glutamates (Porterfield et al. 2010). Thus, stable incorporation of the genome probably requires a fine-tuned charge balance between the negative charges contributed by the genome and the phosphorylated serines and the positive charges contributed by the arginines (Le Pogam et al. 2005). Phosphorylation of the CTD has no impact on the structure of the assembly domain (Böttcher and Nassal 2018; Heger-Stevic et al. 2018b). Therefore, a conformational change of the capsid cannot act as a maturation signal that directs the capsid either towards envelopment or to the nucleus. However, differences in the accessibility of the CTD in phosphorylated and unphosphorylated capsids as well as in capsids of phosphorylation mimics are evident from a changed accessibility to proteases (Heger-Stevic et al. 2018b; Selzer et al. 2015). Such a conformational accessibility switch is an attractive mode for guiding maturation as it either masks or unmasks the nuclear localization signal. However, it is difficult to understand what the actual driver of this maturation is. Some experiments compare the accessibility of the CTD to trypsin digestion in empty capsids that have been generated from HBc dimers in the absence of RNA (Selzer et al. 2015), whereas other experiments compare capsids that self-assemble in E. coli (Heger-Stevic et al. 2018b) and balance their charge by either picking up RNA (unphosphorylated) or remaining empty (phosphorylated). Accordingly, the observations differ and show different degrees of protection of the CTD. Unfortunately, protease protection assays cannot distinguish whether accessibility has changed due to a partial disintegration of the capsids or a genuine conformational switch. Studies on nucleocapsids with a mature genome suggest that maturation leads to a destabilization of the capsids, probably with partial disintegration (Cui et al. 2013), which could explain changes in accessibility to proteases. In contrast, structural studies, and especially those aiming at high resolution, select for the most intact particles and discard information from a large percentage of particles, which are structurally more variable or damaged but could account for the observed changed accessibility.
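The charge-balance argument can be made more tangible with a rough count. The following sketch is purely illustrative and is not a model from the cited studies. It assumes a T = 4 capsid of 240 subunits, counts only the 16 arginines (+1 each) and the six serines of the ARD mentioned above, and assigns each phosphoserine a nominal charge of −2 (`PHOSPHATE_CHARGE`, an assumption; the effective value depends on pH and local environment). Packaged nucleic acid is counted at −1 per nucleotide, and the nucleotide number is a free parameter rather than a value from the text. All names are hypothetical.

```python
SUBUNITS_T4 = 240         # HBc monomers in a T = 4 capsid
ARGININES_PER_CTD = 16    # positive charges per ARD (from the text)
SERINES_PER_CTD = 6       # phosphorylatable serines in the ARD (from the text)
PHOSPHATE_CHARGE = -2.0   # ASSUMPTION: nominal charge per phosphoserine


def ctd_charge(phosphorylated_serines: int) -> float:
    """Net charge of one ARD, counting only arginines and phosphoserines."""
    if not 0 <= phosphorylated_serines <= SERINES_PER_CTD:
        raise ValueError("number of phosphorylated serines out of range")
    return ARGININES_PER_CTD + PHOSPHATE_CHARGE * phosphorylated_serines


def capsid_net_charge(phosphorylated_serines: int, nucleotides: int = 0) -> float:
    """Net interior charge of a T = 4 capsid with a given nucleic acid load."""
    # Each packaged nucleotide is counted as one negative charge
    return SUBUNITS_T4 * ctd_charge(phosphorylated_serines) - nucleotides


for n in range(SERINES_PER_CTD + 1):
    print(n, capsid_net_charge(n))
```

Within this simplified count, the unphosphorylated ARDs of one capsid carry 3840 positive charges, and full phosphorylation of the six serines reduces this to 960. This illustrates why phosphorylated capsids have less need to take up RNA for charge neutralization, but it neglects all other charged residues and any counter-ions.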

Assuming a conformational switch that makes the CTD accessible to the outside without damaging the gross architecture of the capsids raises the question of where the CTDs emerge from the capsid. Electron microscopic studies of capsids with importin β bound to the CTD suggest that the CTD egresses through the holes at the local six-fold axes (Chen et al. 2016). However, these holes are narrow and cannot easily widen to accommodate an extended peptide stretch without a simultaneous rearrangement of the adjacent hand regions. Far more attractive are the large holes at the base of the spikes. With a diameter of some 2 nm, they are large enough to support the exit of one or two extended stretches of the CTD. In agreement with this model are observations on phosphorylated HBc capsids, which show the last resolved residues of the assembly domain extending towards the large holes adjacent to the base of the spikes (Heger-Stevic et al. 2018b; Böttcher and Nassal 2018). Furthermore, in the phosphorylated premature envelopment mutant F97L the three most C-terminal residues of the CTD (181–183) are located at the base of the spikes directly adjacent to the large holes (Böttcher and Nassal 2018). This places the C-termini of HBc in the capsid interior at a favourable position for exposure through the adjacent large holes without the need for capsid reorganization in the assembly domain.

Evolution of Capsids

Hepadnaviridae and the related Nackednaviridae are found in almost all vertebrates, including fish (Hahn et al. 2015), amphibians (Dill et al. 2016), birds and mammals. Phylogenetic analysis suggests that both families emerged from a common ancestor some 430 Mya (Lauber et al. 2017). While the non-enveloped Nackednaviridae are only reported in fish, the enveloped Hepadnaviridae are found in mammals (Orthohepadnaviruses), birds (Avihepadnaviruses), reptiles and amphibians (Herpetohepadnaviruses) and in some fish (Meta- and Parahepadnaviruses) (Lauber et al. 2017; Schaefer 2007). The non-enveloped Nackednaviridae have a capsid that is structurally similar to the human HBc capsids (Lauber et al. 2017) but probably evolved much earlier. Phylogenetic comparisons identify three conserved motifs in the capsid proteins (Dill et al. 2016), which include the surface-exposed side of the hand region, parts of the fulcrum and the tight kink between helix IV and V connecting the spike domain to the hand region (Fig. 14.5, blue). All three motifs are in close proximity to the hydrophobic pocket at the inter-dimer interface, suggesting that important sites for capsid formation are conserved over geological ages. In contrast, residues at the inner-dimer interface and at the tips of the spikes are not conserved. While most capsid proteins are relatively small with some 180+ residues, the capsid proteins of Avihepadnaviruses in birds and Herpetohepadnaviruses in reptiles and amphibians are much larger with some 260+ residues. The most remarkable difference between these larger variants and the other capsid proteins is an extension of some 40–50 residues replacing the tips of the spikes (Nassal et al. 2007). This demonstrates the structural variability of the spike region. Recent unpublished data show that in duck Hepatitis B capsid protein (DHBc) this extension domain folds separately and much more slowly than the rest of the capsid protein (Makbul et al. 2020). The extension domain packs to the side of the spikes, as already suggested by the earlier discovered broad spikes of DHBc capsids (Kenney et al. 1995), and contributes almost 50% of the area of the inner-dimer interface. Thus it makes an important contribution to capsid stability (Makbul et al. 2020). The inner dimer contact within the core spike of DHBc is also changed in comparison to HBc, with an altered twist of helices III and IV and a stabilization of the contact by stacking of aromatic residues rather than by an inner-dimer disulphide bridge (Makbul et al. 2020). These observations further highlight the fact that the inner-dimer interface in the related core proteins is much less conserved than the inter-dimer packing.

HBc Capsids as Tool in Biotechnology

The ability of HBc to self-assemble into capsids has fascinated many researchers and was quickly developed into a tool for displaying other proteins or protein fragments at the surface of capsids (Schodel et al. 1992; Borisova et al. 1996; Ulrich et al. 1998). Large insertions are tolerated at the tips of the spikes without abolishing particle formation. Depending on the size of the inserted protein, these particles have a diameter of some 40–60 nm and carry up to 240 foreign protein fragments at their surface. This spatially condensed display of protruding epitopes is conceptually interesting for the generation of vaccines that trigger a strong immune response. HBc is a particularly attractive carrier, as it is not infectious on its own, triggers a strong B cell, T cell and CTL response (Milich et al. 1995) and is directly displayed in B cells in mice (Milich et al. 1997). Systematic structural screening has shown that the tips of the spikes tolerate the insertion of whole proteins such as GFP (Kratz et al. 1999; Böttcher et al. 2006) or the dimeric outer surface lipoprotein C (OspC) of the Lyme disease agent (Skamel et al. 2006). However, insertions into HBc at the tips of the spikes require that the C- and N-termini of the inserted fragment are in close proximity and that spatial clashes between the two displayed proteins on the same spike do not interfere with the independent folding of HBc and the inserted protein. Some residual conformational stress at the tips of the spikes is generally well tolerated by the capsids but is propagated through the capsid scaffold and leads to distortions of the capsid that are hinged at Gly 111 at the joint between helix IV in the spike and helix V in the hand region (Böttcher et al. 2006). This type of conformational stress arises during protein folding and capsid formation. It is also possible to induce conformational stress after capsid formation. 
For this, an acidic insertion domain is introduced that forms a coiled coil with a complementary basic peptide (peptide velcro) but is disordered without the binding partner (O'Shea et al. 1993). In the absence of the complementary peptide, capsids have the expected structure of wt-HBc capsids. The large acidic insertions are not resolved by electron cryo-microscopy, as expected for an unfolded insert. Upon addition of the complementary basic peptide, capsids completely disintegrate (Böttcher et al. 2006). This enables the triggered opening of an artificial HBc-based nano-container.

Spatial constraints restrict which proteins and protein fragments can be inserted at the tips of the spikes. These constraints are partly eliminated by using very long flexible linkers, which give the foreign fragment the freedom to fold independently of the icosahedral capsid scaffold. Similarly, using mosaic particles of wt-HBc and chimeric HBc with an inserted foreign protein (fragment) (Vogel et al. 2005) relieves the spatial constraints by reducing the number of surface-exposed foreign epitopes. However, insertions of proteins in which the C- and N-termini are distant remain problematic. For these, a different strategy that uses “split cores” was designed (Walker et al. 2011). In these constructs HBc1–149 is split at the tips of the spikes between helices III and IV into an N-terminal and a C-terminal protein. These two split-core proteins still assemble into icosahedral capsids and also do so when foreign proteins are inserted at the tips of the spikes. This strategy by Walker and co-workers allows the foreign proteins to be expressed at the C- or N-terminal end of a split-core protein rather than in a conformationally more constrained loop and has been used successfully for more problematic inserts.
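
The copy numbers mentioned in this section can be made explicit with a short calculation. The sketch below is illustrative only and rests on stated assumptions: an icosahedral capsid with triangulation number T contains 60 × T subunits, each chimeric subunit carries one insert, each spike is formed by one dimer, and wt and chimeric subunits co-assemble randomly in mosaic particles. T = 4 is assumed because it reproduces the figure of up to 240 displayed fragments; the function names and the chimeric fraction of 0.25 are hypothetical and not taken from the cited studies.

```python
# Illustrative sketch (not from the cited studies): copy number of displayed
# inserts on an icosahedral capsid.
# Assumptions: 60 * T subunits per capsid, one insert per chimeric subunit,
# one spike per dimer, random co-assembly in mosaic particles.

def subunits_per_capsid(t_number: int) -> int:
    """Number of protein subunits in an icosahedral capsid (60 * T)."""
    if t_number < 1:
        raise ValueError("triangulation number must be >= 1")
    return 60 * t_number


def spikes_per_capsid(t_number: int) -> int:
    """Number of dimeric spikes, assuming one spike per dimer."""
    return subunits_per_capsid(t_number) // 2


def expected_inserts(t_number: int, chimeric_fraction: float = 1.0) -> float:
    """Expected number of displayed inserts per capsid.

    chimeric_fraction is a hypothetical parameter: the fraction of subunits
    that carry an insert in a mosaic particle (1.0 = fully chimeric).
    """
    if not 0.0 <= chimeric_fraction <= 1.0:
        raise ValueError("chimeric_fraction must lie between 0 and 1")
    return subunits_per_capsid(t_number) * chimeric_fraction


if __name__ == "__main__":
    print(subunits_per_capsid(4))     # fully chimeric T = 4 particle
    print(spikes_per_capsid(4))       # spikes carrying two inserts each
    print(expected_inserts(4, 0.25))  # mosaic particle, hypothetical fraction
```

Under these assumptions a fully chimeric T = 4 particle displays two inserts on every spike, which is where clashes between neighbouring inserts arise, whereas a mosaic particle lowers the average occupancy per spike in proportion to the chimeric fraction.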