Introduction

Accurate transmission of genetic information is crucial for all organisms, as unequal division of sister chromatids may lead to genomic instability, which is a hallmark of many cancers and genetic disorders (Torres et al. 2008; Ricke and van Deursen 2013; Lee et al. 2016). To keep the genome intact, sister chromatids are tethered together from S phase to the metaphase–anaphase transition by a multiprotein complex called cohesin. This tethering counteracts the pulling force of mitotic spindle microtubules, prevents precocious sister chromatid separation, and enables precise segregation of sister chromatids to daughter cells (Peters et al. 2008; Ding et al. 2016). In addition to its role in chromosome segregation, cohesin was shown to perform important cohesion-independent functions. These include global regulation of transcription, chromosome condensation, and DNA repair by homologous recombination and non-homologous end joining (Ström et al. 2004; Potts et al. 2006; Kawauchi et al. 2009; Tittel-Elmer et al. 2012; Lindgren et al. 2014; Fumasoni et al. 2015; Gelot et al. 2016; Merkenschlager and Nora 2016; Shen and Skibbens 2017). Since mutations in cohesin and its regulatory proteins are the cause of many developmental disorders and cancers, it is of great importance to elucidate the molecular mechanisms that govern cohesin ring formation, deposition, localization, and stability (Liu and Krantz 2009; Losada 2014).

The structure of cohesin ring

In the yeast Saccharomyces cerevisiae, the cohesin complex includes three essential core subunits, Smc1, Smc3, and Scc1 (Rad21 in humans), as well as several regulatory subunits including Scc3 (SA1 and SA2 in humans), Pds5 (PDS5A and PDS5B in humans), and Wpl1 (Wapl in humans) (Fig. 1a). Both Smc1 and Smc3 consist of an N-terminal domain containing the Walker A motif, a long coiled-coil region that is interrupted by a globular domain called the hinge, and a C-terminal domain containing the Walker B motif. Self-folding of the coiled coil enables interaction between the N and C termini, producing the globular head domain. Smc1 and Smc3 interact with each other through the hinge and head domains, creating an elongated heterodimer (Marston 2014). Close proximity of the Smc1 and Smc3 heads creates two separate ATPase active sites that play a crucial role in cohesin loading, establishment, and release (Ladurner et al. 2014; Murayama and Uhlmann 2014; Elbatsh et al. 2016). The third core subunit, Scc1, bridges the Smc subunits by interacting through its C terminus with the Smc1 head domain and by binding Smc3 just above its head domain via its N terminus (Gligoris et al. 2014; Huis in’t Veld et al. 2014). Together, these three proteins form a tripartite ring-like structure which topologically entraps sister chromatids (Haering et al. 2008; Gligoris et al. 2014; Murayama and Uhlmann 2014) (Fig. 1a). Besides its role in cohesin ring formation, Scc1 creates a binding platform for other proteins. These include two essential, stably associated cohesin subunits, Scc3 and Pds5, and the non-essential, less stably associated protein Wpl1. Scc3 interacts with the C terminus of Scc1 while Pds5 binds to the N-terminal part of Scc1 near the Smc3–Scc1 interface (Kulemzina et al. 2012; Chan et al. 2013; Roig et al. 2014). In contrast, Wpl1 binds to cohesin not only through an interaction with Scc1 but also with Smc3, Pds5, and Scc3 (Kulemzina et al. 2012; Chatterjee et al. 2013) (Fig. 1a).

Fig. 1

Possible cohesin cycle model. a Structure of the cohesin complex. Smc1 and Smc3 are composed of two globular domains, called head and hinge, that are separated by a long coiled coil. Smc1 and Smc3 form a heterodimer through interactions between their heads and hinges. Scc1 bridges Smc1 and Smc3. Pds5 and Scc3 bind to cohesin through Scc1. Wpl1 associates with cohesin by interacting with Smc3, Scc1, Scc3, and Pds5. b The hook-shaped cohesin loading complex, composed of the Scc2 and Scc4 subunits, binds at numerous sites within cohesin. c Scc2–Scc4 folds the cohesin ring, bringing head and hinge into close proximity. This induces cohesin ring opening, likely within the hinge domain, creating an entry gate for sister chromatids. d The cohesin loader dissociates, the entry gate closes, and sister chromatids become entrapped inside the cohesin ring. e During DNA replication, Smc3 becomes acetylated by Eco1, which enables stable cohesin association with sister chromatids and blocks Wpl1 releasing activity. f In anaphase, Esp1 separase cleaves Scc1, creating an exit gate for sister chromatids. g, h Cohesins that were not acetylated by Eco1 interact with chromatin only transiently because Wpl1 promotes Scc1 dissociation from Smc3, creating an exit gate for DNA

Interestingly, eukaryotes possess two other complexes called condensin and Smc5/6 that are closely related to cohesin. Condensin ensures correct chromatin compaction and organization while Smc5/Smc6 plays important roles in DNA damage repair (Lindroos et al. 2006; Menolfi et al. 2015; Iwasaki and Noma 2016; Mahendrawada et al. 2016; Robellet et al. 2016). Both complexes show cohesin-like architecture consisting of two Smc subunits (Smc2–Smc4 in condensin and Smc5–Smc6 in Smc5/Smc6 complex) that dimerize and interact through head and hinge domains. The head domains create functional ATPase active sites that are bridged by a non-Smc subunit (Brn1 in condensin and Nse4 in Smc5/Smc6 complex). Both complexes also contain additional regulatory proteins (Ycs4, Ycs5 in condensin and Nse1, Nse3, Nse5, Mms21 in Smc5/Smc6 complex). While it was shown that condensin is a highly flexible complex that may adopt many conformations, little is known about Smc5/6 complex properties (Robellet et al. 2016).

The process of sister chromatid cohesion

The process of sister chromatid cohesion can be conceptually divided into four stages: cohesin loading, establishment of cohesion, maintenance of cohesion, and cohesion dissolution (Fig. 1a–f). Cohesin loading is mediated by the essential Scc2–Scc4 complex (NIPBL and MAU2 in humans), which interacts with cohesin and enables DNA entrapment inside the cohesin ring (Ciosk et al. 2000; Haering et al. 2008; Gligoris et al. 2014; Murayama and Uhlmann 2014) (Fig. 1b, c). After encircling the sister DNA molecules, cohesins are not able to stably hold sister chromatids together and are prone to removal from the chromatin. It was recently proposed that the DNA entrapped within the cohesin ring interacts with two sensor lysines on Smc3 (K112/K113 in budding yeast and K105/K106 in humans), inducing ATP hydrolysis that weakens interactions between the Smc1 and Smc3 heads (Murayama and Uhlmann 2015). This promotes Wpl1–Pds5–Scc3-dependent dissociation of the Scc1 N terminus from Smc3, leading to DNA exit from the cohesin ring (Chan et al. 2012, 2013; Murayama and Uhlmann 2015) (Fig. 1d, g, h). Cohesion establishment occurs in S phase, when cohesins are converted into a tethering-competent state by the essential Eco1 acetyltransferase (Esco1 and Esco2 in humans). Eco1 acetylates the sensor lysines on the Smc3 head, making cohesin refractory to Wpl1 releasing activity, possibly by modulating Smc3 ATPase activity (Çamdere et al. 2015; Elbatsh et al. 2016) (Fig. 1e). Interestingly, it was shown that several proteins mediating replication fork stability and progression, such as Mrc1, Csm3, Tof1, Ctf18, Chl1, or Ctf4, are important accessory factors for cohesion establishment (Xu et al. 2007). It is poorly understood how these proteins contribute to cohesion establishment, but it seems that they might be required for the optimal structure and conformation of replication forks, which is vital for efficient Smc3 acetylation (Borges et al. 2013; Samora et al. 2016).
From G2 phase until the onset of anaphase, cohesion is maintained by Pds5 and Scc3 (Chan et al. 2013; Roig et al. 2014). At least in the case of Pds5, protection of cohesion involves inhibition of Smc3 deacetylation (Chan et al. 2013). Finally, at the time of anaphase, Esp1 separase cleaves Scc1, creating an exit gate from the cohesin ring for the DNA and allowing Hos1 (HDAC8 in humans) to deacetylate Smc3 (Uhlmann et al. 1999; Borges et al. 2010) (Fig. 1f). Next, chromosomes begin to move apart, pulled by the mitotic spindle, and segregate equally to daughter cells. It should be noted, however, that in humans most of the cohesins are removed from the chromosome arms during prophase in a Wapl-dependent manner by opening of the Smc3–Scc1 interface. Only the remaining, mostly centromere-bound, cohesins are cleaved by separase at the metaphase–anaphase transition (Waizenegger et al. 2000; Gandhi et al. 2006).

In the following sections, we will focus on the cohesin loading stage and briefly summarize recent advances in the understanding of this process.

The structure of cohesin loading complex

The Scc2–Scc4 cohesin loading complex was first identified and characterized in budding yeast (Michaelis et al. 1997; Ciosk et al. 2000), but homologs of both Scc2 and Scc4 were soon found in other organisms, including humans, where two isoforms of Scc2, NIPBLA and NIPBLB, exist (Furuya et al. 1998; Rollins et al. 1999, 2004; Gillespie and Hirano 2004; Tonkin et al. 2004; Bernard et al. 2006; Seitan et al. 2006; Watrin et al. 2006). Scc2 is a large protein (171 kDa) that consists of the unstructured N-terminal globular domain followed by over a dozen HEAT repeats and a C-terminal globular domain (Chao et al. 2017). Electron microscopy together with crystallographic analysis revealed that the N terminus of Scc2 creates a head-like domain that is connected to an oblong, HEAT repeat-containing structure that folds back and ends with a C-terminal globular domain (Kikuchi et al. 2016; Chao et al. 2017). Scc4 is a much smaller protein (72 kDa) containing 13 TPR (tetratricopeptide repeat) modules with 2 additional helices at the N terminus and 1 at the C terminus (Hinshaw et al. 2015). Scc4 binds to the N-terminal part of Scc2, creating a tight, highly flexible hook-shaped complex (Braunholz et al. 2012; Chao et al. 2015; Hinshaw et al. 2015) (Fig. 1b). It seems that Scc4 binding stabilizes the unstructured N terminus of Scc2, possibly protecting it from proteolysis, as was shown recently in vivo (Woodman et al. 2014; Hinshaw et al. 2015; Kikuchi et al. 2016).

The molecular insight into cohesin loading

The exact sequence of events that leads to DNA entrapment by cohesin is currently under debate; however, the preassembly of the Scc2–Scc4 complex followed by the interaction between Scc2–Scc4, chromatin, and cohesin seems to be absolutely required for cohesin loading in vivo. Scc2 alone has high affinity for double-stranded DNA and binds only poorly to single-stranded DNA in vitro. Interestingly, the loading reaction can be performed by Scc2 or its C-terminal fragment alone in vitro but requires Scc4 in vivo (Murayama and Uhlmann 2014). These observations led to the hypothesis that Scc4 may be responsible for directing the loading complex to specific chromosomal loci by interacting with other proteins (Chao et al. 2015; Hinshaw et al. 2015). Cohesin loading requires numerous contacts between Scc2–Scc4 and cohesin subunits (Smc1, Smc3, Scc1, Scc3), as disruption of these interactions prevents or greatly disturbs the loading process (Murayama and Uhlmann 2014) (Fig. 1b). Moreover, topological binding of DNA by cohesin requires ATP hydrolysis by the head domains, as mutations in the Walker A motifs of Smc1 and Smc3 abolish cohesin loading (Arumugam et al. 2003; Murayama and Uhlmann 2014). To entrap the DNA molecules, cohesin must be transiently opened. It was shown that artificial tethering of the Smc1 and Smc3 hinge domains prevents cohesin loading, suggesting that temporary hinge dissociation may create an entry gate for the DNA (Gruber et al. 2006). However, this model raises the question of the role of ATP hydrolysis, since the hinge domain is situated far from the head ATPases. Taking into account the high flexibility of the cohesin loader and cohesin, the multiple contacts between them, and the fact that the hinge domain can interact with the head domain, Scc3, and Pds5, it was proposed that the main role of Scc2–Scc4 would be to bend the cohesin ring in such a way that the head and hinge could interact (Mc Intyre et al. 2007; Murayama and Uhlmann 2014, 2015; Chao et al. 2015, 2017) (Fig. 1c).
ATP hydrolysis would then enable hinge opening and DNA entrance (Gruber et al. 2006). An alternative model was also proposed, suggesting that the entry gate for DNA is the same as the exit gate and is located at the Smc3–Scc1 interface. In this model, the main role of Scc2–Scc4 would be to impose hinge and head proximity, which enables the interaction between the DNA and the conserved sensor lysine residues on the Smc3 head. This would induce ATP hydrolysis that allows the DNA to pass the heads, followed by Wpl1-dependent passage through the Smc3–Scc1 interface (Murayama and Uhlmann 2015). In any case, the next step would be the DNA replication-dependent acetylation of Smc3, stable association of cohesin with chromatin, and conversion into the cohesive state. It is worth pointing out that, like cohesin, both condensin and Smc5/6 complexes are able to entrap DNA in a reaction that requires ATP hydrolysis (Kanno et al. 2015; Robellet et al. 2016). Interestingly, in contrast to cohesin, the Scc2–Scc4 complex is only partially required for condensin and Smc5/6 binding to chromatin (Lindroos et al. 2006; D’Ambrosio et al. 2008).

Determinants of Scc2–Scc4 binding to chromatin

In budding yeast, cohesins are first loaded onto chromatin in late G1/early S phase, when Scc1 is resynthesized after its proteolytic cleavage at anaphase (Michaelis et al. 1997; Uhlmann et al. 1999; Ciosk et al. 2000). In humans, by contrast, most of the cohesins that were removed by Wapl in prophase are loaded again already in telophase (Darwiche et al. 1999; Sumara et al. 2000). Cohesins acetylated by Eco1 during replication can be stably bound to chromatin for several hours. In contrast, cohesins loaded outside S phase are highly dynamic and associate with chromatin for seconds, possibly even without encircling the DNA. This transient interaction is thought to be important for transcription regulation but not for cohesion (Gerlich et al. 2006; Bernard et al. 2008; Gause et al. 2010). Whole-genome mapping revealed that in S. cerevisiae the cohesin loading complex associates with the chromosomes at multiple locations, including centromere cores, telomeres, and numerous loci along chromosome arms such as rDNA gene promoters or tRNA genes. Interestingly, Scc2–Scc4 binding sites overlap poorly with loci occupied by cohesin (Lengronne et al. 2004; Lopez-Serra et al. 2014). This is because cohesins are pushed off the loading locations by the transcription machinery to the convergent transcriptional termination sites (Lengronne et al. 2004; Ocampo-Hafalla et al. 2016). In fission yeast, cohesins largely colocalize with cohesin loaders and only partially translocate to the regions of convergent transcription (Schmidt et al. 2009). Interestingly, in Drosophila melanogaster, it seems that cohesin loaders and cohesins occupy the same locations, especially in regions of transcribed genes (Misulovin et al. 2008). Finally, in humans, Scc2 preferentially binds to active gene promoters and does not colocalize with cohesin (Zuin et al. 2014). Recently it was reported that human cohesin can also be repositioned by transcription (Davidson et al. 2016).
It was shown that DNA sequences per se are not sufficient to target Scc2–Scc4 to the chromatin in vivo (Chao et al. 2015). Rather, protein partners are needed to direct the loader complex to the correct locus. Recent research indicates that there are several pathways for Scc2–Scc4 recruitment to chromosomes. It was shown that in Xenopus laevis association of the cohesin loader with chromatin requires the assembly of the pre-replication complex (pre-RC), including the MCM helicase, the ORC complex, Cdc6, Cdt1, and Cdc7–Dbf4 (Dbf4-dependent kinase, DDK) (Takahashi et al. 2004). Later, it was reported that Scc2 interacts physically with the pre-RC through DDK (Takahashi et al. 2008). Disruption of any of the pre-RC components resulted in strongly decreased levels of chromatin-associated Scc2 and cohesin, although neither DNA unwinding nor DNA synthesis was needed for Scc2 to interact with chromatin (Gillespie and Hirano 2004; Takahashi et al. 2004). Interestingly, in this pathway, the presence of an intact cohesin complex is not essential for the cohesin loader to bind to chromatin (Gillespie and Hirano 2004). It is not entirely clear if this pathway is active in other organisms. In S. cerevisiae, cohesin loading is Cdc6 independent, suggesting that the pre-RC is not required (Uhlmann and Nasmyth 1998), even though it was shown that cohesins accumulate transiently at active replication origins (Tittel-Elmer et al. 2012). However, the requirement for DDK seems to be at least partially conserved. It was shown in the fission yeast Schizosaccharomyces pombe that the Swi6 protein, which is essential for global heterochromatin organization including centromeric regions, associates with the Dbf4 (Dfp1 in S. pombe) subunit of DDK as well as with Scc2 (Bailis et al. 2003; Fischer et al. 2009). Importantly, mutations in Swi6 or Dbf4 caused a decrease in cohesin levels at centromeres, but not chromosome arms, leading to precocious sister chromatid separation (Nonaka et al. 2002; Bailis et al. 2003).
However, it is not known whether DDK is essential for the Swi6–Scc2 interaction. In budding yeast, it was shown that the cohesin loading complex is strongly enriched at the centromere cores in a process that requires Scc4-dependent targeting and the presence of DDK, which is recruited to centromeres by the Ctf19 kinetochore complex (Fernius et al. 2013; Natsume et al. 2013; Hinshaw et al. 2015). Interestingly, this pathway requires the presence of the Scc1 cohesin subunit for Scc2 binding to centromeres (Fernius et al. 2013). Mutations in the Ctf19 complex influence cohesin levels only at pericentromeric regions, not at chromosome arms, and do not cause lethality (Fernius et al. 2013; Hinshaw et al. 2015). These results imply that other pathways must exist for Scc2–Scc4 recruitment to centromeres and other chromosomal locations in S. cerevisiae. Recent research suggests that one such pathway may depend on chromatin remodelers. In budding yeast, Scc2–Scc4 was shown to be recruited by the RSC chromatin remodeling complex (SWI/SNF B or PBAF in humans) to a set of specific chromatin locations, including chromosome arms and centromeres. Scc2–Scc4 preferentially binds to nucleosome-free regions that are often located at active gene promoters. RSC and Scc2–Scc4 cooperate to sustain the nucleosome-depleted state (Lopez-Serra et al. 2014). Moreover, it was shown that RSC interacts with cohesin, but the specific interacting subunit is still unknown (Huang et al. 2004). Disruption of the RSC complex leads to severe Scc2 loss from chromatin, decreased levels of chromatin-bound cohesin, and marked precocious sister chromatid separation both at chromosome arms and centromeres (Baetz et al. 2004; Lopez-Serra et al. 2014). Interestingly, recent research showed that in budding yeast another chromatin remodeler, called Irc5 (LSH or HELLS in humans), is also involved in cohesin loading. It was shown that Irc5 is important for efficient association of Scc2 with chromatin and Scc1.
Interestingly, Irc5 associates with the middle part of Scc1, but no interactions with cohesin loader subunits have been shown so far. Disruption of IRC5 or its translocase activity had no effect on interactions between core cohesin subunits but instead led to decreased levels of chromatin-bound cohesin both at chromosome arms and centromeres, resulting in mild premature sister chromatid separation. Moreover, the reduced level of cohesin at the rDNA region observed in the irc5 deletion mutant caused increased recombination in the rDNA array and loss of rDNA repeats. Because Irc5 cannot complement the RSC complex, nor vice versa, it seems that each plays some unique, non-overlapping roles in the cell (Litwin et al. 2017 and personal communication). Interestingly, in humans, SNF2, a catalytic subunit of several chromatin remodeling complexes, was also proposed to promote cohesin loading onto chromatin. SNF2 was shown to interact with Scc1, and lack of functional SNF2 caused a decrease in cohesin levels at the Alu repeat-containing region (Hakimi et al. 2002). Moreover, in a mouse model it was demonstrated that the ATRX chromatin remodeler interacts with the cohesin complex and colocalizes with cohesin on chromatin (Kernohan et al. 2010). Depletion of ATRX in human cells caused cohesion defects at centromeres and telomeres (Ritchie et al. 2008; Eid et al. 2015). However, it is currently unknown whether ATRX recruits the cohesin loading complex.

The role of NIPBL (Scc2) in Cornelia de Lange syndrome (CdLS) pathogenesis

Cornelia de Lange syndrome is a rare genetic disorder that occurs in fewer than 1 in 10,000 births. It is characterized by microbrachycephaly, limb abnormalities, and growth and cognitive retardation. In 60% of patients, CdLS is caused by heterozygous mutations in NIPBL, while 5% of patients carry mutations in HDAC8, which encodes the Smc3 deacetylase, or in genes encoding the Smc1, Smc3, or Rad21 cohesin complex subunits. The cause of the remaining 35% of cases is still unclear (Mannini et al. 2013). Cells from CdLS individuals with NIPBL mutations exhibit decreased levels of NIPBL mRNA, which likely results in decreased protein levels, as was shown in a mouse model (Remeseiro et al. 2013; Kaur et al. 2016). Interestingly, mutations in SMC1, SMC3, and HDAC8 also result in a reduction of NIPBL mRNA, although to a lesser extent (Kaur et al. 2016). Recent crystallographic and biochemical analyses made it possible to determine the effect of NIPBL mutations present in CdLS patients on cohesin loader complex stability and interactions. Mutations positioned at the beginning of the NIPBL N terminus resulted in impaired interaction with MAU2 (Scc4) (Braunholz et al. 2012). On the other hand, mutations at the end of the N-terminal part of NIPBL did not alter complex formation or cohesin loading. Rather, it was proposed that these mutations may disturb interactions with protein partners (Chao et al. 2015). Mutations located in the middle of the elongated HEAT-repeat-containing domain, as well as in the hook-shaped part positioned ahead of the C-terminal globular domain, were also examined. Most of the mutants showed reduced interaction between Scc2 and Scc1, although the biological consequences of these mutations are unknown (Kikuchi et al. 2016). Interestingly, while there is only limited evidence suggesting that premature sister chromatid separation or increased DNA damage sensitivity may be the cause of CdLS (Kaur et al. 2005; Vrouwe et al. 2007), other data suggest that transcription dysregulation may be responsible for this disorder. In agreement with this hypothesis, disruption of yeast Scc2 or human NIPBL alters the expression of many genes, some of which are upregulated and some repressed (Liu et al. 2009; Lindgren et al. 2014). The limited effect on cohesion most likely reflects the existence of mechanisms that prioritize and enable cohesin loading at centromeres even under conditions of cohesin loader or cohesin depletion (Schaaf et al. 2009; Heidinger-Pauli et al. 2010; Fernius et al. 2013). Such prioritized loading tethers sister chromatids together and enables equal division of the genetic material but cannot rescue the defects connected with transcription regulation.

Concluding remarks

In recent years, we have learned a great deal about the architecture of the cohesin complex and the process of cohesin loading. Cohesin, which was first considered to be a static structure responsible only for holding sister chromatids together, turned out to be a flexible and highly dynamic complex with multiple functions that require sophisticated regulatory mechanisms. Among its many regulators, the cohesin loader has attracted much attention because it is the major regulator of cohesin and its mutations are the main cause of Cornelia de Lange syndrome. Nevertheless, there are some questions that need to be addressed. Detailed mechanistic understanding of the cohesin loading process is still incomplete. What are the structural changes that the cohesin ring undergoes in the presence of the loading complex, and what are the consequences of these alterations for cohesin architecture? Moreover, the molecular factors that regulate Scc2–Scc4 chromatin association are only beginning to be elucidated. Do chromatin remodelers attract Scc2–Scc4 to chromatin by physical interaction, or is their role limited to eviction or redeployment of nucleosomes to ensure a proper chromatin environment for loading? Is there a role for chromatin remodelers in later stages of cohesion? Finally, do mutations in human chromatin remodelers contribute to cohesion defects that cause cancers and developmental disorders? Answering these questions will be the key challenge for future studies and will allow a better understanding of the fundamental role of cohesin and the cohesin loader in the cell.