1 Introduction

Human heat shock proteins (HSPs) were originally identified as stress-responsive proteins required to deal with thermal and other proteotoxic stresses. It became clear shortly thereafter that all HSP families also include constitutively expressed members like Hsc70 (HSPA8) in the HSP70 family. The heat shock genes (and the protein family members that they encode) that have been most extensively studied are those that are heat inducible, such as HSP70i (HSPA1A/B), HSP40 (DNAJB1), and HSP27 (HSPB1). With the sequencing of the human genome and the computational annotation of its genes, it became apparent that most HSP families contain additional members. The number of genes coding for the diverse HSP family members varies widely among organisms. For example, in the HSPA (HSP70) family, the number of members varies from three in Escherichia coli to 13 in humans. Gene duplication during evolution likely satisfied the need for additional members in different intracellular compartments as well as for tissue-specific or developmental expression. Moreover, gene duplication provides functional diversity for client specificity and/or processing.

Since the annotation of the human genome, the names used for the human family members in the literature have become rather chaotic, and up to ten different names can be found for the same gene product. In addition, almost identical names have been used to refer to different gene products. For example, HSPA1B has been called HSP70-2, whereas HSP70.2 refers to the testis-specific HSPA2 member. This has greatly hampered studies that involve comparisons of regulation and function between these members. The first attempt to clarify the nomenclature of the HSPA family was published in 1996 (Tavaria et al. 1996) but now requires modification and expansion. Here, we provide updated guidelines for the nomenclature of human HSPA (HSP70) as well as for the HSPH (HSP110), HSPC (HSP90), DNAJ (HSP40), and HSPB (small HSP) families and for the human chaperonin families (HSP60 and CCT). This nomenclature is based on the systematic gene symbols that have been assigned by the HUGO Gene Nomenclature Committee (HGNC) and are used as the primary identifiers in databases such as Entrez Gene and Ensembl. For HSP gene retrieval, we used Entrez Gene (Wheeler et al. 2008). Mouse orthologs were identified using National Center for Biotechnology Information (NCBI) HomoloGene (Wheeler et al. 2008).
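The naming ambiguity described above can be illustrated with a small lookup sketch. This is only an illustration, not part of the nomenclature itself: the table contains just a few historical names mentioned in this text, and the function name resolve_symbol and the table layout are hypothetical. The one design point it demonstrates is that punctuation must be preserved when matching names, because HSP70-2 and HSP70.2 denote different genes.

```python
# Illustrative alias table built only from names mentioned in the text above.
# It is NOT a complete or authoritative HGNC mapping.
ALIAS_TO_SYMBOL = {
    "HSP70-2": "HSPA1B",  # historical name used for HSPA1B
    "HSP70.2": "HSPA2",   # testis-specific member, easily confused with HSP70-2
    "Hsc70": "HSPA8",
    "HSP27": "HSPB1",
}

# Case-insensitive index. Punctuation is deliberately preserved, because
# "HSP70-2" and "HSP70.2" refer to different gene products.
_INDEX = {alias.casefold(): symbol for alias, symbol in ALIAS_TO_SYMBOL.items()}


def resolve_symbol(name):
    """Return the systematic symbol for a historical name, or None if unknown."""
    return _INDEX.get(name.strip().casefold())


print(resolve_symbol("HSP70-2"))  # HSPA1B
print(resolve_symbol("hsp70.2"))  # HSPA2
```

A name that is absent from the table yields None rather than a guess, which seems the safer behavior when similar names can point to different genes.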

2 The HSPA (HSP70) and HSPH (HSP110) families

The human genome encodes 13 members of the HSPA family (Table 1), excluding the many pseudogenes (Brocchieri et al. 2008). The most studied genes are HSPA1A and HSPA1B, the products of which differ by only two amino acids and which are believed to be fully interchangeable proteins. Together with HSPA6, these are the most heat-inducible family members. HSPA7 has long been considered to be a pseudogene, but recent analyses (Brocchieri et al. 2008) suggest that it might be a true gene that is highly homologous to HSPA6. HSPA8 is the cognate HSPA and was designated previously as Hsc70 (or HSP73). It is an essential “housekeeping” HSPA member and is involved in cotranslational folding and protein translocation across intracellular membranes. HSPA1L and HSPA2 are two cytosolic family members with high expression in the testis. HSPA9 is the mitochondrial housekeeping HSPA member (also known as mortalin/mtHSP70/GRP75/PBP74). We also note that there are two murine mortalins, mot-1 and mot-2. HSPA5 is the ER-localized HSPA chaperone (BiP). Stch (which we propose to be called HSPA13) is found in microsomes and may be yet another compartment-specific HSPA member with housekeeping functions. HSPA12A, HSPA12B, and HSPA14 are more distantly related members about which very few data are available.

Table 1 HSP70 superfamily: HSPA (HSP70) and HSPH (HSP110) families

The human genome also encodes four HSP110 (HSPH; Table 1) genes which encode a family of HSPs with high homology to HSPA members except for the existence of a longer linker domain between the N-terminal ATPase domain and the C-terminal peptide binding domain. In fact, two members, HSPA4 (HSPH2) and HSPA4L (HSPH3), were previously named as HSPA members in the Entrez Gene database. Besides the three cytosolic members, one compartment-specific HSPH member (HYOU1/Grp170) is present in the ER, and we propose to name it HSPH4 to be consistent with the rest of the HSP110 family. Recent evidence shows that HSPH members are nucleotide exchange factors for the HSPA family (Dragovic et al. 2006; Raviol et al. 2006).

3 The DNAJ (HSP40) family

A first attempt to standardize the HSP40 family nomenclature was published previously (Ohtsuka and Hata 2000) and parts of this system have been preserved herein. The DNAJ (HSP40) family is probably the largest HSP family in humans (Table 2) and is identified by the presence of a conserved J-domain known to be responsible for HSPA recruitment and stimulation of the HSPA ATPase activity. Cheetham and co-workers (Hennessy et al. 2005) divided this family into three subfamilies based on their homology to the DnaJ protein from E. coli. The human genome encodes four type A proteins (Table 2) that show homology to the E. coli DnaJ and contain an N-terminal J-domain (potentially following a signal sequence), a glycine/phenylalanine-rich region, a cysteine-rich region, and a variable C-terminal domain. To date, there are 14 type B proteins that contain an N-terminal J-domain and adjacent glycine/phenylalanine-rich region. This subfamily contains the most widely expressed and most heat-inducible human DNAJ member, DNAJB1. In addition, humans have 22 type C DNAJ proteins that only contain the J-domain but not necessarily positioned at the N terminus. It has been suggested that these members recruit HSPA members to specific subcompartments and/or functions. Finally, a number of other J-domain containing proteins are found in the NCBI and InterPro databases which have not yet been annotated as DNAJC members. They currently are listed in Table 2 as DNAJC23–DNAJC30. In addition, many DNAJ pseudogenes, which are not listed here, are scattered throughout the genome. Many of these pseudogenes show homology to only part of the J-protein but lack large parts of the protein, in some cases even the entire J-domain. A closely related family of proteins with imperfect HPD motifs has been described as ‘J-like’ proteins (Walsh et al. 2004). 
Only one annotated J-protein with an imperfect HPD motif, DNAJB13, is currently included; it has an HPL motif instead, which is conserved in the mouse ortholog. The gene previously named Dnajb10 is actually the mouse ortholog of human DNAJB2 and hence, at our request, has been renamed Dnajb2 by the Mouse Genomic Nomenclature Committee. Hcg3 is the closest human homologue of DNAJB3/MSJ-1; it encodes both N- and C-terminal domains in the same transcript, but there is a reported frameshift between them, which, if true, results in a truncated protein of 145 amino acids.
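The subfamily definitions given above can be summarized as a simple decision rule. The following sketch is a deliberate simplification under stated assumptions: the function name classify_dnaj and its boolean inputs are hypothetical, and real subfamily assignment rests on sequence homology and curated annotation rather than on four yes/no flags.

```python
from typing import Optional


def classify_dnaj(
    has_j_domain: bool,
    j_domain_n_terminal: bool,
    has_gf_region: bool,
    has_cys_region: bool,
) -> Optional[str]:
    """Assign a DNAJ subfamily letter from coarse domain features.

    Simplified reading of the definitions in the text:
      type A: N-terminal J-domain, G/F-rich region, and cysteine-rich region
      type B: N-terminal J-domain and adjacent G/F-rich region
      type C: J-domain only, not necessarily at the N terminus
    Returns None when there is no J-domain at all.
    """
    if not has_j_domain:
        return None
    if j_domain_n_terminal and has_gf_region:
        return "A" if has_cys_region else "B"
    return "C"


print(classify_dnaj(True, True, True, True))     # A
print(classify_dnaj(True, True, True, False))    # B
print(classify_dnaj(True, False, False, False))  # C
```

The rule does not model J-like proteins with imperfect HPD motifs; deciding whether such a protein counts as having a J-domain is left to the caller.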

Table 2 The DNAJ (HSP40) family

4 The HSPB (small HSP) family

The family of small HSPs currently consists of 11 members (Table 3) that are characterized by a conserved signature crystallin domain flanked by variable N- and C-termini. The best studied members are HSPB1 (HSP27), HSPB4 (αA crystallin), and HSPB5 (αB crystallin). The small HSPs are often found in oligomeric complexes involving one or more family members and as such may provide the cell with a large diversity in chaperone specificity. Interestingly, many members show high and sometimes even exclusive expression in skeletal and cardiac muscle, but high expression is also found in many other tissues.

Table 3 The HSPB family (small heat shock proteins)

5 The HSPC (HSP90) family

This HSP family comprises five members (Table 4), excluding the so-called new member Hsp89-alpha-delta-N (HSP90N) (Schweinfest et al. 1998), which was found to be a chimera of two genes with its main part identical to HSPC1 (Chen et al. 2005). The genes encoding these family members were initially annotated as HSPC members in LocusLink (the forerunner of the current Entrez Gene database). Based on the analysis of the human genome and an additional 31 genomes across all kingdoms of organisms, Chen et al. (2005, 2006) built a nomenclature system for the family to indicate the homologues of different genes. To be consistent with the rest of the HSP family members, we have chosen to use HSPC as the approved designation. However, we recognize that there will be occasions when it will be useful to link the human gene and protein names to earlier systems of nomenclature such as the one developed by Chen and colleagues. This nomenclature system provides an example of how nomenclature equivalence statements, such as HSPC1/HSP90AA1, can be used to advantage, particularly when an author wants to link to an established phylogeny-based nomenclature to discuss homologues of the human HSPC genes in other organisms. We recommend that future phylogeny-based nomenclatures that include human homologues also include the root human designation, such as HSPC, at the beginning of the name of the gene in the other species.
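An equivalence statement of the kind mentioned above, HSPC1/HSP90AA1, is simply two names joined by a slash. The sketch below shows one way to build and split such statements. The helper names are hypothetical, and the only pairing used is the one given in this text.

```python
def equivalence_statement(hspc_name, other_name):
    """Join the HSPC designation and a name from another system with a slash."""
    if not hspc_name or not other_name:
        raise ValueError("both names are required")
    return f"{hspc_name}/{other_name}"


def split_equivalence(statement):
    """Split 'HSPC1/HSP90AA1' into ('HSPC1', 'HSP90AA1')."""
    first, sep, second = statement.partition("/")
    if not sep or not first or not second:
        raise ValueError(f"not an equivalence statement: {statement!r}")
    return first, second


print(equivalence_statement("HSPC1", "HSP90AA1"))  # HSPC1/HSP90AA1
print(split_equivalence("HSPC1/HSP90AA1"))         # ('HSPC1', 'HSP90AA1')
```

Placing the HSPC designation first follows the example in the text. Shorthand such as HSPA1A/B also contains a slash but is not an equivalence statement, so the splitter should only be applied to strings known to be of this form.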

Table 4 The HSP90/HSPC family

6 The human chaperonin families (HSPD/E and CCT)

In the mitochondria, single human orthologs of the E. coli GroEL (HSP60) and GroES (HSP10) are expressed and are annotated as HSPD and HSPE, respectively (Table 5). In the cytosol of human cells, CCT (TRiC), a hetero-oligomeric chaperonin complex composed of eight different subunits, plays an essential role in folding newly synthesized cytosolic proteins and preventing protein aggregation. These subunits are encoded by separate genes and share approximately 30% amino acid sequence identity (approximately 15–20% identity to GroEL). There are two genes encoding CCT6 (zeta subunit): CCT6A (zeta-1) is constitutively expressed, whereas CCT6B (zeta-2) is expressed in a testis-specific manner. None of them has been shown to be heat inducible. Eight genes of this family are annotated in the NCBI database as CCT2–CCT5, CCT6A, CCT6B, CCT7, and CCT8. Although the human gene encoding the alpha subunit of CCT is not currently named CCT1 by the HGNC (the current symbol is TCP1), we think that the symbol CCT1 would be clearer because it explicitly denotes a subunit of CCT. In addition, three chaperonin-like genes, MKKS/BBS6, BBS10, and BBS12, have been identified in the human genome. Mutations in these genes cause McKusick–Kaufman syndrome and/or Bardet–Biedl syndrome (Stoetzel et al. 2006, 2007). Products of these three genes are unlikely to be CCT subunits and may be related to cilia and centrosome/basal body functions.

Table 5 Chaperonins and related genes

7 Other heat-inducible protein families and chaperones

There are proteins in other families that are heat inducible and that have chaperone-like functions. Some of these have also been named heat shock proteins, e.g., HSP47 (Nagai et al. 1999). This ER-resident protein functions as a collagen-specific chaperone. However, it has full-length homology to the serine peptidase inhibitor (serpin) protein family. So far, none of the other serpin paralogs has been demonstrated to be heat inducible or to have chaperone-like activities. Therefore, this gene has been named SERPINH1 by the HGNC and has not been listed as an HSP here.

8 Concluding remarks

This is a first attempt to arrive at a consistent and clear nomenclature for the HSP and related chaperone genes in the human database. We realize that future modifications will be necessary and we plan to update the tables provided here at regular intervals. This nomenclature has been reviewed and approved by the editors of Cell Stress & Chaperones and all proposed modifications to the current HGNC nomenclature are currently under review by the HGNC. It has been adopted by this, the official journal of the Cell Stress Society International, as the accepted nomenclature of human heat shock genes and proteins.