Introduction

Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of short, highly conserved DNA sequence repeats that have been found in many bacteria (Jansen et al. 2002; Mojica et al. 2005). Several conserved CRISPR-associated (Cas) genes are often found in the vicinity of CRISPR loci (Godde and Bickerton 2006; Haft et al. 2005; Jansen et al. 2002). It was recently proposed that CRISPR and Cas genes might be involved in conferring immunity to the host cell against foreign DNA (Makarova et al. 2006; Mojica et al. 2005). In addition, Bolotin et al. (2005) suggested that the unique spacer elements are the traces of past invasions by extrachromosomal elements.

Most recently, Pul et al. (2010) demonstrated that the DNA-binding protein H-NS is involved in silencing the CRISPR-Cas promoters in Escherichia coli, resulting in cryptic Cas protein expression.

Campylobacter organisms, primarily C. jejuni, C. coli and C. fetus, are Gram-negative bacteria and are the major, typically recognized Campylobacter organisms of medical, public health or veterinary interest worldwide (Debruyne et al. 2009; Lastovica and Skirrow 2000; Moore et al. 2005).

The thermophilic species Campylobacter lari was first isolated principally from seagulls of the genus Larus (Benjamin et al. 1983; Skirrow and Benjamin 1980). C. lari has also occasionally been shown to be a cause of clinical infection (Martinot et al. 2001; Nachamkin et al. 1984; Werno et al. 2002). In addition, an atypical group of isolates of urease-positive thermophilic Campylobacter sp. (UPTC) was isolated from the natural environment in England in 1985 (Bolton et al. 1985). Thereafter, these organisms were described as a biovar or variant of C. lari (Mégraud et al. 1988; Owen et al. 1988). Subsequent isolates were obtained in France (Bezian et al. 1990; Mégraud et al. 1988), Northern Ireland (Kaneko et al. 1999; Matsuda et al. 2003; Wilson and Moore 1996), The Netherlands (Endtz et al. 1997) and Japan (Matsuda et al. 1996; Matsuda et al. 2002). Thus, two representative taxa, namely urease-negative (UN) C. lari and UPTC, occur within the species C. lari (Matsuda and Moore 2004).

Regarding CRISPR and Cas in Campylobacter organisms, Schouls et al. (2003) employed sequence analysis of the CRISPRs to genotype a collection of Campylobacter strains (n = 180 for C. jejuni; n = 4 for C. coli). In addition, Price et al. (2007) described a novel method for genotyping the CRISPR locus of C. jejuni and C. coli (a total of 210 Australian isolates), based on high-resolution melt analysis following real-time PCR. However, no reports have appeared on CRISPRs and Cas in other Campylobacter organisms.

Although CRISPR and Cas have recently been identified in C. fetus subsp. fetus 82-40 (DDBJ/EMBL/GenBank accession number NC_008599), following whole genome shotgun sequencing analysis, reports have not yet appeared for C. fetus. In addition, the C. lari RM2100 strain, whose genome analysis has already been carried out (NC_012039; Miller et al. 2008), was shown not to carry any CRISPRs and Cas.

However, during our genome sequence analysis of a representative isolate of the C. lari UPTC taxon, we found, for the first time, the CRISPR and Cas locus in the genomic DNA of the environmental Japanese UPTC isolate CF89-12. Therefore, the first aim of the present study was to identify and molecularly characterize the CRISPR and Cas locus from the C. lari taxon UPTC, using isolate CF89-12. Moreover, we wished to clarify whether or not these genes are expressed in UPTC cells.

Materials and methods

Representative C. lari taxon, UPTC isolate employed in the present study and growth conditions

The Japanese isolate UPTC CF89-12, which was isolated from the water of Asahigawa River, Okayama prefecture, Japan (Matsuda et al. 1996), was employed in the present study.

Following culturing on the charcoal-cefazolin-sodium deoxycholate agar medium (Oxoid, Hampshire, UK), UPTC cells were cultured on blood agar base No. 2 (Oxoid) that contained defibrinated horse blood [7% (v/v); Nippon Bio-test, Tokyo, Japan], supplemented with Butzler Campylobacter-selective medium (Virion, Zurich, Switzerland), under microaerophilic conditions using BBL Campypak Microaerophilic System Envelopes (Becton–Dickinson, NJ, USA) at 37°C for 48 h. Cells were further cultured on Mueller–Hinton agar under the same microaerophilic conditions.

Genomic DNA preparation

UPTC CF89-12 genomic DNA was prepared by the cetyltrimethylammonium bromide (CTAB) method (Sambrook and Russell 2001), followed by RNase treatment. The DNA concentration was adjusted to approximately 800 ng/μL.

Construction of the genome DNA library of UPTC CF89-12

A genomic DNA library was constructed using NEBNext™ DNA Sample Prep. Reagent Set 1 (New England BioLabs Japan Inc., Tokyo, Japan). The DNA was fragmented using a Covaris S-series instrument (Covaris Inc., MA, USA), and fragments of 300–500 base pairs (bp) were separated by agarose gel electrophoresis. Cluster generation was carried out using the constructed library DNA as template with the Cluster Station and Cluster Generation Kit (Illumina Inc., CA, USA).

Nucleotide sequence analysis

The nucleotide sequence was determined using the Genome Analyzer IIx and Sequencing Kit (Illumina Inc.). Nucleotide sequence analysis of the full-length CRISPR locus was carried out using the GENETYX-Windows computer software version 9 (GENETYX Co., Tokyo, Japan).

Schematic representation of the CRISPRs locus in the UPTC CF89-12 isolate

CRISPRs loci for the UPTC CF89-12 organism (AB598370) analysed in the present study are illustrated in Fig. 1.

Fig. 1

The resultant schematic representation of the CRISPR and Cas locus (7,500 bp) identified in the UPTC CF89-12 and the locations of the primer pairs for the RT-PCR amplifications (A), and the primer sequences employed (B)

Total cellular RNA purification and reverse transcription (RT)-PCR

Total cellular RNA was extracted and purified from UPTC CF89-12 cells using RNAprotect Bacteria Reagent and the RNeasy Mini Kit (QIAGEN, Tokyo, Japan). First, RT-PCR analysis was carried out using the primer pair f-/r-p-CasUPTC for the putative (p)-Cas, as shown in Fig. 1, with the QIAGEN OneStep RT-PCR Kit (QIAGEN). This primer pair was designed in silico for RT-PCR amplification of the p-Cas transcript segment from the UPTC CF89-12 isolate, based on sequence information (Nakanishi et al. 2010). We also designed two further primer pairs, f-/r-Cas1UPTC and f-/r-Cas2UPTC, for RT-PCR amplification of the transcript segments of Cas1 and Cas2, respectively (Fig. 1). These primer pairs were expected to generate RT-PCR products of approximately 700 bp with f-/r-p-CasUPTC, 470 bp with f-/r-Cas1UPTC and 230 bp with f-/r-Cas2UPTC. The sequences of the three primer pairs correspond to nucleotide positions (np) 2,824 through 2,847 bp and np 3,535 through 3,512 bp for f-/r-p-CasUPTC, np 4,643 through 4,671 bp and np 5,084 through 5,112 bp for f-/r-Cas1UPTC, and np 5,331 through 5,364 bp and np 5,562 through 5,529 bp for f-/r-Cas2UPTC in the nucleotide sequence data of the CRISPR and Cas gene cluster, including adjacent genetic loci, of the UPTC CF89-12 isolate (AB598370). In addition, another PCR primer pair (f-/r-MOMP-common), constructed for the amplification of the two major outer membrane protein (MOMP) genes, PorA1 and PorA2, from C. lari (Hirayama et al. 2010), was used as a positive control for the total cellular RNA of UPTC CF89-12. This primer pair was expected to generate an RT-PCR product of approximately 1,200 bp. For primer design purposes, nucleotide sequence alignment analysis was carried out by employing the CLUSTAL W software (1.7 program) (Thompson et al. 1994) incorporated in the DDBJ.
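The expected product sizes follow directly from the primer coordinates given above. As a minimal sketch (Python is used here purely for illustration and was not part of the study; the dictionary and function names are hypothetical), the amplicon length is the span from the outermost forward-primer position to the outermost reverse-primer position, counted inclusively:

```python
# Primer coordinates (np, 1-based, inclusive) as stated in the text for AB598370.
# Each entry is (forward primer span, reverse primer span); the names below are
# illustrative only and do not come from the original study.
PRIMER_PAIRS = {
    "f-/r-p-CasUPTC": ((2824, 2847), (3512, 3535)),
    "f-/r-Cas1UPTC": ((4643, 4671), (5084, 5112)),
    "f-/r-Cas2UPTC": ((5331, 5364), (5529, 5562)),
}


def amplicon_length(forward, reverse):
    """Return the inclusive length spanned by a forward and a reverse primer."""
    positions = forward + reverse
    return max(positions) - min(positions) + 1


AMPLICONS = {
    name: amplicon_length(forward, reverse)
    for name, (forward, reverse) in PRIMER_PAIRS.items()
}

for name, size in AMPLICONS.items():
    print(f"{name}: {size} bp")
```

Under these coordinates the computed spans are 712, 470 and 232 bp, in line with the approximate sizes of 700, 470 and 230 bp quoted above.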

Amplified RT-PCR products were separated by 1% (w/v) agarose gel electrophoresis in 0.5× TBE at 100 V and detected by staining with ethidium bromide.

Results

CRISPR locus identification and the resultant schematic representation

The novel CRISPR locus (7,500 bp) identified in the present study consisted of the p-Cas [structural gene 3,012 bp, 1,003 amino acids (aa), np of the structural gene 1,438–4,449 bp], Cas1 (903 bp, 300 aa, np 4,436–5,338 bp), Cas2 (426 bp, 141 aa, np 5,325–5,750 bp), a leader sequence region (146 bp, np 5,751–5,896 bp) and 12 CRISPRs consensus sequence repeats (each 36 bp; 5′-ATTTTATCATAAAGAAATTTAAAAAGAGACTAAAAC-3′), separated by non-repetitive unique spacer regions of similar length (29–31 bp), as shown in Fig. 2. The 12 CRISPRs consensus sequence repeats of 36 bp each had an identical nucleotide sequence in UPTC CF89-12. In contrast, the 11 non-repetitive unique spacer regions had nucleotide sequences that were distinctly different from one another (Fig. 2). The resultant schematic representation of the CRISPR and Cas loci (7,500 bp) identified in UPTC CF89-12 is illustrated in Fig. 1A.

Fig. 2

Nucleotide sequences of the 11 unique spacer regions in the UPTC CF89-12 CRISPRs. The 11 unique spacer regions were numbered (nos. 1–11) from 5′ to 3′ in the CRISPRs

In addition, a possible overlap of 14 bp (np 4,436–4,449 bp) occurred between p-Cas and Cas1 structural genes (AB598370). Another possible overlap of 14 bp (np 5,325–5,338 bp) was also seen between Cas1 and Cas2 structural genes.
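The lengths, residue counts and overlaps reported above can be cross-checked from the coordinates alone. The following is a minimal sketch, assuming 1-based inclusive coordinates and that each structural gene ends with a single stop codon that is not counted as a residue; the names `FEATURES`, `length`, `aa_residues` and `overlap` are hypothetical and are introduced only for this illustration:

```python
# Feature coordinates (np, 1-based, inclusive) as reported in the text (AB598370).
FEATURES = {
    "p-Cas": (1438, 4449),
    "Cas1": (4436, 5338),
    "Cas2": (5325, 5750),
    "leader": (5751, 5896),
}


def length(span):
    """Inclusive length of a (start, end) span in bp."""
    start, end = span
    return end - start + 1


def aa_residues(span):
    """Residues encoded by a structural gene, assuming one terminal stop codon."""
    n = length(span)
    if n % 3 != 0:
        raise ValueError("structural gene length is not a multiple of three")
    return n // 3 - 1


def overlap(a, b):
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name in ("p-Cas", "Cas1", "Cas2"):
    print(name, length(FEATURES[name]), "bp,", aa_residues(FEATURES[name]), "aa")
print("p-Cas/Cas1 overlap:", overlap(FEATURES["p-Cas"], FEATURES["Cas1"]), "bp")
print("Cas1/Cas2 overlap:", overlap(FEATURES["Cas1"], FEATURES["Cas2"]), "bp")
```

With these inputs the sketch reproduces the stated values: 3,012 bp and 1,003 aa for p-Cas, 903 bp and 300 aa for Cas1, 426 bp and 141 aa for Cas2, and a 14 bp overlap at each of the two gene junctions.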

Putative promoter and hypothetical intrinsic ρ-independent transcription terminator structures

To identify putative promoter structures for the CRISPRs and Cas loci, we performed nucleotide sequence alignment analysis of the 100 bp region upstream of the p-Cas ORFs between UPTC CF89-12 and five C. jejuni isolates (81116, NCTC11168, RM1221, subsp. doylei 269.97 and IA3902) (data not shown). The nucleotide sequence of UPTC CF89-12 is shown in Fig. 3. In this region, a typical promoter consensus sequence at the -10 region (TATAAT) was seen at the locus between np 1,395 and 1,360 bp for UPTC CF89-12. However, no consensus sequence was identified at the -35 region; in its place, a semi-conserved T-rich region (T, 11/17) was identified between np 1,354 and 1,370 bp, as has been shown for RpoD promoters in the genome of C. jejuni (Petersen et al. 2003) (Fig. 3). Similar promoter structures were also seen among the five C. jejuni isolates (data not shown). Therefore, these CRISPR and Cas genes may possibly be transcribed by the σ70 factor in the UPTC and C. jejuni organisms, as described by Petersen et al. (2003).

Fig. 3

Nucleotide sequence analysis for the identification of the putative transcription promoter structures in UPTC CF89-12 CRISPRs
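The two promoter features discussed here, a TATAAT hexamer and a 17-bp window containing at least 11 T residues, can be located with a simple scan. The sketch below is illustrative only: the study itself relied on sequence alignment, the function names are hypothetical, and `TOY_UPSTREAM` is an invented sequence, not the UPTC CF89-12 upstream region.

```python
def find_motif(seq, motif="TATAAT"):
    """Return 0-based start positions of every exact occurrence of motif."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def t_rich_windows(seq, width=17, min_t=11):
    """Return 0-based starts of windows of `width` bp with at least `min_t` T residues."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - width + 1)
        if seq[i:i + width].count("T") >= min_t
    ]


# Invented example: a 17-bp T-rich stretch, a short linker, then TATAAT.
TOY_UPSTREAM = "GCGC" + "TTTATTTTCTTTAGTTT" + "GCAGCAGGCA" + "TATAAT" + "GCAGCA"

print("-10 hexamer starts:", find_motif(TOY_UPSTREAM))
print("T-rich window starts:", t_rich_windows(TOY_UPSTREAM))
```

The thresholds `width=17` and `min_t=11` mirror the "T, 11/17" criterion quoted above; the spacing between the two elements is deliberately left unconstrained in this sketch.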

We also performed nucleotide sequence alignment analysis of the leader sequence region (np 5,751–5,896 bp) immediately upstream of the CRISPRs among UPTC CF89-12 and the five C. jejuni isolates, in order to identify any other promoter structure(s) for the CRISPRs. However, no typical promoter consensus sequences or semi-conserved T-rich regions were identified (data not shown).

Moreover, a hypothetical intrinsic ρ-independent transcription terminator structure, which contains a G + C rich region near the base of the stem between np 6,886 and 6,901 bp and a single-stranded run of T residues (np 6,901–6,904 bp), was seen downstream of the 12 CRISPRs consensus sequence repeats in UPTC CF89-12 (Fig. 4).

Fig. 4

Nucleotide sequence analysis for the identification of the putative intrinsic ρ-independent transcription terminator structure in the UPTC CF89-12 CRISPRs
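The defining features of an intrinsic terminator named above, a G + C rich stem that can base-pair with itself followed by a run of T residues, can be searched for with a simple inverted-repeat scan. This is a simplified sketch and not the method used in the study: it requires a perfectly complementary stem, ignores folding energetics, and uses hypothetical names and default parameters; `TOY_TERMINATOR` is an invented sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_terminator_like(seq, stem=6, loop_min=3, loop_max=8, t_run=4):
    """Find perfect hairpins (stem-loop-stem) immediately followed by a T run.

    Returns a list of (stem_start, loop_length, stem_gc_fraction) tuples,
    where stem_start is the 0-based position of the 5' arm of the stem.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - loop_min - t_run + 1):
        left = seq[i:i + stem]
        for loop in range(loop_min, loop_max + 1):
            j = i + stem + loop              # start of the 3' arm
            right = seq[j:j + stem]
            tail = seq[j + stem:j + stem + t_run]
            if len(right) < stem or len(tail) < t_run:
                continue                      # window runs off the sequence
            if right == reverse_complement(left) and tail == "T" * t_run:
                gc = (left.count("G") + left.count("C")) / stem
                hits.append((i, loop, gc))
    return hits


# Invented example: GC-only stem of 6 bp, 4-nt loop, then TTTT.
TOY_TERMINATOR = "AATA" + "GCCGCC" + "TTAA" + "GGCGGC" + "TTTT" + "ACG"

print(find_terminator_like(TOY_TERMINATOR))
```

A real analysis would normally allow mismatches and bulges in the stem and score candidate structures thermodynamically; the exact-match rule here is chosen only to keep the example short.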

In the regions immediately upstream of the p-Cas and downstream of the 12 CRISPRs consensus repeats within the 7,500 bp locus in the UPTC CF89-12 isolate, two putative full-length structural genes were identified: the 5′-methylaminomethyl-2-thiouridylate methyltransferase (TRMU; structural gene 1,017 bp in length, np 179–1,195 bp) and the phosphatidyl glycerophosphatase A (PGPase, 489 bp, np 6,736–7,224 bp; reverse direction), respectively.

In addition, the amino acid sequences of the putative ORFs of TRMU and PGPase from UPTC CF89-12 showed high sequence similarity, approximately 94 and 96%, respectively, to their counterparts in the C. lari RM2100 strain (NC_012039, Miller et al. 2008). Thus, the CRISPRs and Cas locus (approximately 5,700 bp) within the UPTC CF89-12 isolate occurred between the TRMU and PGPase structural genes, both of which showed high sequence similarity to their counterparts in the C. lari RM2100 strain.

No in vivo transcription of the p-Cas, Cas1 and Cas2 genes

When RT-PCRs were carried out with UPTC CF89-12 using the three primer pairs (f-/r-p-CasUPTC for the p-Cas; f-/r-Cas1UPTC for Cas1; f-/r-Cas2UPTC for Cas2) to amplify the gene transcripts of the CRISPRs and Cas locus, no RT-PCR signals of the expected sizes were seen (Fig. 5). However, in vivo MOMP transcription, employed as a positive control for the total cellular RNA, was shown to occur (lane 5 in Fig. 5). Thus, no in vivo transcription of the p-Cas, Cas1 and Cas2 genes was detected in the UPTC CF89-12 cells (lanes 2, 3 and 4 in Fig. 5). In contrast, positive PCR signals for the three corresponding gene segments were obtained when UPTC CF89-12 genomic DNA, rather than RNA, was used as the template, as shown in lanes 7, 8 and 9 in Fig. 5.

Fig. 5

RT-PCR analyses of the p-Cas, Cas1 and Cas2 gene transcripts expressed in the UPTC CF89-12 cells. Lane M, 100 bp DNA ladder (New England BioLabs Inc.). Lanes 1–6, with the UPTC CF89-12 total cellular RNA and without the UPTC CF89-12 genome DNA; lanes 7–10, without the RNA and with the DNA; lanes 1, 6, without the reverse transcriptase (negative control); lanes 2–5, with the reverse transcriptase (Sensiscript and Omniscript) and HotStarTaq DNA polymerase; lanes 7–10, with HotStarTaq DNA polymerase. Three primer pairs (f-/r-Cas1UPTC, lanes 1, 2, 7; f-/r-Cas2UPTC, lanes 3, 8; f-/r-p-CasUPTC, lanes 4 and 9) were employed in the present RT-PCRs. In addition, another PCR primer pair (f-/r-MOMP-common) was employed as a positive control for the total cellular RNA of UPTC CF89-12 (lanes 5, 6, 10)

Discussion

Campylobacter lari RM2100, whose genome analysis has already been described (Miller et al. 2008), was shown not to carry any CRISPRs or Cas, as described above. However, in the present study, we identified, for the first time, the CRISPR locus (7,500 bp) in a representative C. lari taxon, namely UPTC, in the Japanese UPTC CF89-12 isolate.

In Fig. 6, schematic representations of the CRISPRs and Cas loci in the five C. jejuni isolates, whose genome sequences have already been completed (accessible in the DDBJ/EMBL/GenBank databases) are illustrated for comparison.

Fig. 6

Schematic representations of the CRISPR and Cas loci in the five C. jejuni isolates whose genome sequences have already been completed and are accessible in DDBJ/EMBL/GenBank. *Annotated by the authors (C. jejuni RM1221)

Figures 1 and 6 also indicate that UPTC CF89-12 and the five C. jejuni isolates all contained p-Cas and two Cas genes, Cas1 and Cas2, within the CRISPR and Cas loci. Four to 12 CRISPRs consensus sequence repeats, separated by non-repetitive unique spacer regions (Fig. 6), occurred in these six UPTC and C. jejuni isolates. Nucleotide sequence alignment analysis of these CRISPRs consensus sequence repeats was carried out among the six Campylobacter isolates, as shown in Fig. 7. Surprisingly, the nucleotide sequence similarity of these CRISPRs consensus sequence repeats was shown to be approximately 92–100% among UPTC CF89-12, C. jejuni 81116, NCTC11168, RM1221, subsp. doylei 269.97 and IA3902, and the sequence differences among these six isolates were limited to three bases. However, no sequence similarity occurred among the unique spacer regions within the CRISPRs and Cas loci from these isolates (Fig. 2).

Fig. 7

Nucleotide sequence alignment analysis of the CRISPRs consensus sequence repeats among one UPTC and five C. jejuni isolates. Numbers at the left and right refer to the nucleotide positions of each CRISPRs consensus sequence repeat among the six isolates. Dots indicate identical bases; changes are explicitly indicated; dashes are deletions; positions identical in all cases are marked with asterisks
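The lower bound of the reported similarity range is what one would expect for a 36-bp repeat with three differing bases, since 33/36 is about 91.7%. The following minimal sketch makes the arithmetic explicit. It assumes ungapped, equal-length sequences, which is a simplification because the alignment also allows deletions; the function names are hypothetical, and `HYPOTHETICAL_VARIANT` is an invented sequence, not one of the C. jejuni repeats.

```python
# 36-bp consensus repeat of UPTC CF89-12 as given in the Results.
UPTC_REPEAT = "ATTTTATCATAAAGAAATTTAAAAAGAGACTAAAAC"


def percent_identity(a, b):
    """Percent identity of two ungapped, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identity_with_mismatches(length, mismatches):
    """Percent identity for a given length and number of mismatched positions."""
    return 100.0 * (length - mismatches) / length


# Invented variant: the first three bases are substituted (ATT -> GCC).
HYPOTHETICAL_VARIANT = "GCC" + UPTC_REPEAT[3:]

for d in range(4):
    print(d, "mismatches:", round(identity_with_mismatches(36, d), 1), "%")
print(round(percent_identity(UPTC_REPEAT, HYPOTHETICAL_VARIANT), 1), "%")
```

Zero to three mismatches over 36 positions give 100, 97.2, 94.4 and 91.7% identity, which brackets the range of approximately 92–100% described in the text.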

In addition, Bolotin et al. (2005) suggested that the unique spacer regions are the traces of past invasions by extrachromosomal elements. Therefore, it would be worthwhile to clarify whether or not the CRISPRs spacer regions (29–31 bp) that occurred in the UPTC isolate CF89-12 are homologous to other extrachromosomal genes. We therefore compared the nucleotide sequences of the 11 CRISPRs unique spacer regions with other sequences already reported. Consequently, the non-repetitive unique spacer regions nos. 7 and 9 in Fig. 2 were 100% identical to sequences in the pCL2100 megaplasmid genome DNA (46,201 bp) and the C. lari RM2100 genome DNA (1,525,460 bp) (Miller et al. 2008), respectively. The nucleotide sequence of no. 9 was also found to be 100% identical to the sequence of the C. lari integrated element 1 (CLIE1, Cla_0845; np 803,155–803,183) in C. lari RM2100 (Miller et al. 2008). This may indicate that the no. 9 sequence was derived from a prophage following phage infection.
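A 100% identical spacer match of this kind amounts to an exact substring search on both strands of the target sequence. The sketch below illustrates the idea only; it does not reproduce the database comparison actually performed, the function names are hypothetical, and `HYPOTHETICAL_SPACER` and both targets are invented sequences rather than the spacers of Fig. 2 or the RM2100 genome.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_spacer(spacer, target):
    """Return (1-based start, strand) for every exact spacer match in target.

    The start always refers to the position on the target's forward strand.
    """
    target = target.upper()
    hits = []
    for strand, query in (("+", spacer.upper()), ("-", reverse_complement(spacer))):
        pos = target.find(query)
        while pos != -1:
            hits.append((pos + 1, strand))
            pos = target.find(query, pos + 1)
    return hits


# Invented 29-bp spacer and two invented targets carrying it on either strand.
HYPOTHETICAL_SPACER = "ACGTTGCAAGGCTTAACCGGATTACGCAT"
FORWARD_TARGET = "GG" + HYPOTHETICAL_SPACER + "CC"
REVERSE_TARGET = "TTTT" + reverse_complement(HYPOTHETICAL_SPACER) + "GGGG"

print(find_spacer(HYPOTHETICAL_SPACER, FORWARD_TARGET))
print(find_spacer(HYPOTHETICAL_SPACER, REVERSE_TARGET))
```

Note that the match reported for spacer no. 9 (np 803,155–803,183) spans 29 positions, consistent with the 29–31 bp spacer length range.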

Nucleotide and amino acid sequence similarities of the Cas1 and Cas2 among the six isolates including UPTC CF89-12 are also shown in Tables 1 and 2. Nucleotide and amino acid sequence similarities of the Cas1 and Cas2 were very high (97.2–100%) among these five C. jejuni isolates, respectively. However, nucleotide and amino acid sequence similarities of the Cas1 and Cas2 were not so high i.e. 70.7–71.8 and 65.4–66.4% between UPTC CF89-12 and the five C. jejuni isolates, respectively.

Table 1 Nucleotide sequence similarity of Cas1 (upper right) and Cas2 (lower left) among five C. jejuni and one UPTC CF89-12 isolates
Table 2 Amino acid sequence similarity of the putative ORFs for Cas1 (upper right) and Cas2 (lower left) among five C. jejuni and one UPTC CF89-12 isolates

The CRISPRs and Cas locus within the UPTC CF89-12 isolate was found to lie between the TRMU and PGPase structural genes, both of which showed high sequence similarity to their counterparts in the C. lari RM2100 strain, as described above. In the C. lari RM2100 strain, a hypothetical lipoprotein structural gene (423 bp; 140 aa residues for the ORF; reverse direction) occurred between these two counterparts. Therefore, in UPTC CF89-12, the CRISPRs and Cas locus may have been transferred and introduced into the genomic DNA from an extrachromosomal source by homologous recombination.

In the present study, no RT-PCR signals, but positive PCR signals, were obtained for the p-Cas, Cas1 and Cas2 genes in the UPTC CF89-12 cells. In this regard, Pul et al. (2010) recently described that the DNA-binding protein H-NS suppresses CRISPR-Cas gene expression in E. coli K12 cells. Therefore, the absence of RT-PCR signals may possibly be due to a similar suppression of CRISPR and Cas gene expression in the UPTC cells.