Introduction

Osteogenesis imperfecta (OI) is a heritable connective tissue disorder that is typically characterized by bone fractures, tooth abnormalities (dentinogenesis imperfecta), and blue or gray sclera. It is usually transmitted in an autosomal dominant fashion but recessive OI also occurs [1]. Mutations in at least 12 different genes are known to cause OI, but the large majority of individuals with OI have dominant mutations in COL1A1 or COL1A2, the genes encoding the two collagen type I alpha chains [1].

The clinical manifestations of OI are variable, ranging from few symptoms to perinatal lethality, which has led to a clinical classification that distinguishes OI types with different disease severities [2]. Individuals with the mildest form of OI are diagnosed with OI type I, which is associated with an increased propensity for fractures but normal or near-normal stature and absence of long-bone deformities [3]. OI type I is frequently caused by nonsense or frameshift mutations in COL1A1, which lead to haploinsufficiency of the collagen type I alpha 1 chain [1]. Such haploinsufficiency mutations are typically associated with normal-appearing teeth and discoloration of sclera [3].

Heterozygous deletions of the entire COL1A1 gene apparently are a rare cause of OI type I. Individuals from 6 unrelated families with deletions of the entire COL1A1 gene have been described in the literature [4–6]. Two additional individuals with such mutations are listed in the Osteogenesis Imperfecta Variant Database (https://oi.gene.le.ac.uk/) [7, 8] and another two cases are included in the Decipher database of genomic variation [9]. When the deletion encompasses not only COL1A1 but also neighboring genes, a contiguous gene syndrome may arise, with clinical manifestations that are not usually seen in individuals with collagen type I alpha 1 haploinsufficiency. For example, some of the individuals with heterozygous deletions of COL1A1 and of adjacent genes had dental manifestations and learning disability [5, 6].

We have recently established a gene panel for diagnosing pediatric metabolic bone disorders using semiconductor (SC)-based next-generation sequencing [10]. Multiplex PCR is used to amplify the exons and exon/intron boundaries of target genes and the resulting amplicons are sequenced on a SC chip [11, 12]. Even though the system is primarily designed to detect missense mutations and small insertions or deletions, the data provided by a sequencing run can also be used to detect copy number variations (CNV) [13]. In the present study we describe 12 individuals with a diagnosis of OI type I from 5 families, where SC sequencing indicated a heterozygous deletion of the entire COL1A1 gene.

Subjects and Methods

Subjects

The study population comprised individuals with a clinical diagnosis of OI type I in whom molecular diagnostic evaluation at Shriners Hospital for Children in Montreal identified deletions of the entire COL1A1 gene. The DNA-based diagnosis of OI was performed for individuals who are seen either at that institution or at one of several referral centers. Until the time of the present report, the laboratory had completed the molecular diagnostic work-up in individuals with a clinical diagnosis of OI type I from 161 families. Until March 2013, Sanger sequencing was used to examine COL1A1 and COL1A2 (all exons and exon/intron boundaries) in 145 families. After March 2013, SC sequencing was used as the primary sequencing modality to analyze samples from 16 additional families and to reanalyze samples from 8 families in whom prior Sanger sequencing had not revealed disease-causing sequence alterations.

This report presents data on the 5 families where SC sequencing indicated the presence of deletions of the entire COL1A1 gene (Fig. 1). Affected individuals were assessed clinically at Shriners Hospital for Children in Montreal or at the Children’s Hospital of Eastern Ontario in Ottawa. Clinical data were obtained by retrospective chart review. Clinical diagnoses were based on the assessment by one of the co-authors. The study was approved by the Institutional Review Board of McGill University. Informed consent was provided by patients, or, in minors, their parents.

Fig. 1

Pedigrees of the five families with COL1A1 deletions. The index patient of each family is indicated by an arrow. Individuals with clinical features of OI type I are indicated by full black symbols. Individual I-1 of Family 5 could not be evaluated, but historical information suggested a diagnosis of OI type I

Radiological Measures

Dual-energy X-ray absorptiometry was performed in the antero–posterior direction at the lumbar spine (L1–L4) using a Hologic QDR Discovery device (Hologic Inc., Waltham, MA, USA). Lumbar spine areal bone mineral density (aBMD) results were transformed to age- and gender-specific z-scores using published reference data [14, 15]. Peripheral quantitative computed tomography (pQCT; XCT-2000, Stratec Inc., Pforzheim, Germany) was performed at the metaphysis (4 % site) and at the diaphysis (65 % site) of the radius as described, and z-scores were calculated based on age- and gender-matched reference data established by one of the authors [16, 17]. Information about tibia and femur fractures as well as scoliosis was obtained from radiological reports, including results for the Cobb angle (degree of spine curvature on antero-posterior radiographs). Scoliosis was said to be present when a Cobb angle of >10° was observed [18].

Sequencing

Total genomic DNA was extracted from saliva (Oragene® DNA Genotek, Ottawa, Ontario, Canada) or peripheral blood (QIAamp DNA Blood Mini Kit, Hilden, Germany). SC sequencing was performed as described in detail elsewhere [10]. A total of 10 ng DNA per sample was used for target enrichment by multiplex PCR, followed by ligation to index sequences (Ion Xpress™ Barcode Adapters Kit; Life Technologies). Libraries from 4 samples were pooled. To clonally amplify the library DNA onto Ion Sphere Particles (Life Technologies), the library pool was subjected to emulsion PCR using an Ion OneTouch™ 2 system (Life Technologies) following the manufacturer’s protocol. Enriched Ion Sphere Particles were subjected to sequencing on an Ion 316 Chip using Ion PGM 200 Sequencing Kit (Life Technologies) as per the manufacturer’s instructions with the 200-bp single-end run configuration.

Data from the SC sequencing run were processed using Torrent Suite software (version 4.02; Life Technologies) for base calls, read alignments, and variant calling using the reference genomic sequence (hg19) of target genes as described [10]. Called variants were annotated using Ion Reporter (version 4.6), the web-based software provided by the manufacturer of the SC sequencing device.

Analysis for CNV was also performed with Ion Reporter. The CNV detection strategy used by this software has been described by others [13]. The read counts for each amplicon are divided by the total number of reads from the sample to correct for between-sample variability. These normalized read counts are then divided by the normalized reads from a pool of ‘baseline’ samples. We selected as baseline the SC sequencing results from 12 males who had previously been sequenced using the same targeted gene panel. The resulting ratios between test sample and baseline samples are expressed as log2 ratios and corrected for the GC content in each amplicon. When a CNV is detected, the algorithm provides the ‘confidence’ of the finding, a quality metric that is defined as the log-ratio between the called ploidy state likelihood of the segment (i.e., a ploidy state of 1 in the case of a heterozygous deletion) and the likelihood of the normally expected ploidy state (i.e., a ploidy state of 2). High confidence values (>10, according to the manufacturer’s recommendations) indicate a high certainty that the copy number state is different from the expected value of 2.
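The read-count normalization described above can be illustrated with a minimal sketch. This is not the Ion Reporter implementation: the function names and the toy read counts are hypothetical, the baseline is summarized by a simple mean, the GC-content correction is omitted, and the natural logarithm used for the confidence value is an assumption, since the base of the log-ratio is not stated here.

```python
import math

def normalized_counts(read_counts):
    """Divide each amplicon's read count by the sample's total read count."""
    total = sum(read_counts.values())
    return {amplicon: n / total for amplicon, n in read_counts.items()}

def log2_ratios(sample_counts, baseline_samples):
    """Per-amplicon log2 ratio of the test sample to the pooled baseline.

    The baseline is taken as the mean of the normalized counts across
    baseline samples (an assumption; GC correction is not applied).
    """
    sample_norm = normalized_counts(sample_counts)
    baseline_norms = [normalized_counts(b) for b in baseline_samples]
    ratios = {}
    for amplicon, value in sample_norm.items():
        baseline_mean = sum(b[amplicon] for b in baseline_norms) / len(baseline_norms)
        ratios[amplicon] = math.log2(value / baseline_mean)
    return ratios

def confidence_score(likelihood_called, likelihood_expected):
    """Log-ratio of called-ploidy likelihood to expected-ploidy likelihood.

    Natural log is an assumption made for this sketch.
    """
    return math.log(likelihood_called / likelihood_expected)

# Toy data (hypothetical): two amplicons in a deleted gene, two elsewhere.
baseline = [{"del_1": 100, "del_2": 100, "ref_1": 100, "ref_2": 100}] * 3
sample = {"del_1": 50, "del_2": 50, "ref_1": 100, "ref_2": 100}
ratios = log2_ratios(sample, baseline)
```

In this toy example the deleted amplicons lie exactly one log2 unit below the retained ones, as expected when one of two copies is lost. Because the deletion also lowers the sample's total read count, the absolute values are shifted in such a small panel; with many amplicons the shift becomes negligible and deleted amplicons approach a log2 ratio of −1.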

Sanger sequencing of PCR products covering all exons and exon/intron boundaries of the COL1A1 and COL1A2 genes was performed using a BigDye Terminator cycle sequencing kit (Applied Biosystems, Foster City, USA). The nucleotide sequence was determined using an Applied Biosystems 3100 DNA sequencer.

Confirmation of Deletions by Quantitative PCR

The deletions identified by SC sequencing were confirmed by two methods, quantitative PCR and array comparative genomic hybridization (CGH). Quantitative PCR for copy number state was carried out using the TaqMan probes indicated in the legend of Fig. 2 together with a control probe (ribonuclease P RNA component H1) according to the manufacturer’s protocol (Applied Biosystems, Foster City, CA).
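For orientation, copy number from a TaqMan assay with a reference probe is commonly derived by the comparative Ct (ΔΔCt) approach. The sketch below assumes that approach, 100 % amplification efficiency, and a calibrator sample known to carry two copies; the function name and the Ct values are hypothetical and are not data from this study.

```python
def copy_number_from_ct(ct_target, ct_reference,
                        calibrator_ct_target, calibrator_ct_reference,
                        calibrator_copies=2):
    """Estimate copy number by the comparative Ct method.

    Assumes perfect doubling per cycle and a calibrator with a known
    copy number (two copies by default).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = calibrator_ct_target - calibrator_ct_reference
    delta_delta_ct = delta_sample - delta_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)

# Hypothetical Ct values: the target amplifies one cycle later than in
# the calibrator, relative to the reference probe.
estimated_copies = copy_number_from_ct(26.0, 25.0, 25.0, 25.0)
```

A one-cycle delay of the target relative to the reference corresponds to half the template amount, so the estimate here is one copy, the state expected for a heterozygous deletion.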

Fig. 2

Deletions of COL1A1. a The genomic organization of the COL1A1 locus, showing the sites analyzed by quantitative PCR for copy number state (indicated by numbers surrounded by circles; 1 Hs06426040_cn, location Chr17:48279703; 2 Hs00789254_cn, location Chr17:48270203; 3 Hs00453869_cn, location Chr17:48261473). COL1A1 is transcribed from the minus strand, and therefore, the numbering of sites and exons is from right to left. b The quantitative PCR analysis in genomic DNA from each index patient of each family (labeled with ‘F,’ followed by the number of the family in this study) shows copy number state 1 for all three locations. c Results of array CGH analyses. The upper panel shows the size of the deletion found in the index patient of each family. The middle panel indicates the genes located in the deletion area of each family. The extent of the deletions in Family 1 to 5 is shown in different gray levels to indicate which genes were included in each deletion. The lower panel shows the size of the deletions that have previously been reported in the literature [4–6] or in the Decipher database. Two additional individuals with COL1A1 deletions are listed in the Osteogenesis Imperfecta Variant Database, but could not be included in this figure, as the extent of the deletions beyond COL1A1 has not been reported

Array CGH

Array CGH was performed using an Agilent (CGX™-4 CGH v1.1 4-plex) array (Perkin-Elmer). This CGH array contains 180,000 oligonucleotides targeting 980 genes and more than 240 important cytogenetic regions with an average spacing of 65–75 kb throughout the genome, and 10 kb in targeted regions.

Results

Phenotype Description

The 5 families comprised 12 individuals with OI type I, ranging from 1.1 to 43 years of age (Table 1; Fig. 1). All but one of these individuals had a height within normal limits (z-score between −2.0 and +2.0), whereas lumbar spine areal BMD z-score was low (<−2.0) in 5 of the 11 individuals (45 %) in whom bone densitometry had been performed. Two individuals had not sustained any long-bone fractures; the others had suffered between 1 and 10 long-bone fractures. Vertebral fractures and scoliosis were observed in 2 and 1 individuals, respectively. All individuals reported here were able to walk independently.

Table 1 Phenotypic characteristics of individuals with COL1A1 deletions

Family 1

The index patient (II-1) is a boy who was last evaluated at 11 years of age. He was born at term (birth weight 3175 g, length 52 cm) by spontaneous vaginal delivery to healthy parents. In infancy, he was treated twice for intestinal intussusception. He started ambulating at 18 months of age. He had vesicoureteral reflux for which he received antibiotic prophylaxis. He was diagnosed with mild learning disability and speech delay when he entered school. Teeth were discolored and fragile, indicating dentinogenesis imperfecta, and required capping before the age of 6 years. The first fracture (of the left tibia) occurred at the age of 9 months, followed by three more long-bone fractures. A thoracic vertebral compression fracture was noted at 9 years of age. He was treated with oral risedronate as part of a randomized trial for 2 years [19]. Examination of collagen type I protein in skin fibroblasts was reported as indicating a decreased production of alpha 1 chains, but Sanger sequencing of COL1A1 and COL1A2 did not reveal any disease-related sequence alterations.

Family 2

The index patient (II-1) was a male who was last assessed at 17 years of age. He was born at term without complications. The first fracture (of the left tibia) occurred at the age of 12 months, followed by two more long-bone fractures during the next 16 years. A grade 2 compression fracture of thoracic vertebra 8 and a thoracic kyphosis of 53° were noted at the age of 17 years. He attended a special class at school due to learning disabilities.

His younger brother (II-2, 8 years of age at last follow-up) was born after 32 weeks of pregnancy (birth weight 1500 g) by spontaneous vaginal delivery. There were no postnatal complications. He sustained the first fracture (of the right tibia) at the age of 6 months but had no long-bone fractures thereafter. Learning disabilities and speech delay were noted when he started school. Hearing tests were normal.

The mother (I-2) of the two boys had not sustained any fractures but blue sclera suggested that she had OI. When she was assessed at 34 years of age, her lumbar spine areal BMD was within normal limits, but pQCT of the distal radius revealed low trabecular volumetric BMD (z-score −2.4).

Family 3

The index patient (II-1) sustained his first fracture (of the left tibia and fibula) at the age of 2.3 years after a minor fall. Transiliac bone biopsy showed a large number of osteoblasts and hyperosteocytosis with borderline low trabecular bone volume. Treatment with intravenous zoledronic acid was started at the age of 2.8 years.

His younger brother (II-2) experienced his first fracture (of the right tibia) after a minor injury at the age of 2.5 years. Transiliac bone biopsy showed a normal amount of cortical and trabecular bone but increased numbers of osteoblasts and osteoclasts. Treatment with intravenous zoledronic acid was started at the age of 2.9 years.

The mother (I-2) of these two boys had fractured her left tibia at least twice and had undergone rodding surgery in her left tibia during childhood. No hearing problem was reported.

Family 4

The index patient (II-1) was born at term (birth weight 3090 g, length 52 cm) without fractures. Both parents were healthy. The first fracture (of the left tibia) occurred at the age of 12 months, followed by three more long-bone fractures until the last follow-up at 15 years of age.

Family 5

The index patient (III-1) was born at term (birth weight 3600 g) without fractures. He sustained the first fracture (right tibia) at 6 months of age, followed by six more long-bone fractures in the following 2.5 years. A thoracic vertebral compression fracture was noted at the age of 16 months. Treatment with intravenous zoledronic acid was started at 3 years of age. No further fractures of long bones or vertebrae have been observed during the available follow-up period, up to the age of 10 years.

His mother (II-2) had a history of fractures and normal lumbar spine areal BMD results. Peripheral QCT was not performed in the mother. The maternal aunt (II-4) of the index patient had a history of 10 long-bone fractures and had normal lumbar spine areal BMD, but radius pQCT showed low trabecular volumetric BMD. Her son (III-2) was evaluated at 13 months of age and had blue sclera but had not sustained any fractures. The grandfather (I-1) of the index patient had OI type I according to historical information but could not be assessed.

Molecular Diagnosis

SC sequencing was performed in the index patients of the five families. In four index patients, genomic DNA had previously been analyzed for mutations in COL1A1 or COL1A2 by Sanger sequencing, but no disease-related sequence alterations had been found. The DNA of one index patient was analyzed directly by SC sequencing, without prior Sanger sequencing.

SC sequencing achieved a mean read depth above 500 in each sample (Table 2). No disease-related missense or small indel mutations were found in COL1A1, COL1A2, CRTAP, LEPRE1, PPIB, SERPINH1, FKBP10, PLOD2, SP7, SERPINF1, BMP1, TMEM38B, IFITM5, WNT3A, DKK1, or LRP5. However, SC sequencing indicated heterozygous deletions of the entire COL1A1 gene in each index patient. Confidence was above 300 in each sample, indicating a high degree of certainty that the deletions were true positive findings.

Table 2 SC sequencing and array CGH results in the index patients of each family

To confirm the heterozygous COL1A1 deletions, we used quantitative PCR with genomic copy number assays interrogating three sites in and close to COL1A1 (Fig. 2a, b). This confirmed that one copy of the entire gene was deleted in each of the index patients. The same technique was used to confirm the presence of the COL1A1 deletion in each of the affected family members described earlier. DNA was not available from family members who did not have a clinical diagnosis of OI.

To further confirm the COL1A1 deletions and estimate the size of the deletion in each family, we performed array CGH. The locations of the most centromeric and the most telomeric oligonucleotide probes that were part of the deletions are indicated in Table 2. The distance between these locations indicates the minimal size of each deletion. The neighboring oligonucleotide probes that are just outside of the deletions allow for an estimate of the maximal deletion size. Deletion size differed markedly between index patients, varying from 18.5 kb to 2.23 Mb (Fig. 2c). The number of genes encompassed by the deletion ranged from one gene (only COL1A1) to 47 genes.
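The bounds on deletion size follow directly from the probe positions and can be sketched as below. The function name and the coordinates are hypothetical and are not the values listed in Table 2; positions are assumed to be single genomic coordinates on the same chromosome, in ascending order.

```python
def deletion_size_bounds(first_deleted, last_deleted,
                         left_retained, right_retained):
    """Return (minimal, maximal) deletion size in base pairs.

    first_deleted / last_deleted: outermost probes inside the deletion.
    left_retained / right_retained: nearest flanking probes that still
    show a normal copy number.
    """
    if not left_retained < first_deleted <= last_deleted < right_retained:
        raise ValueError("probe positions must be ordered along the chromosome")
    minimal = last_deleted - first_deleted
    maximal = right_retained - left_retained
    return minimal, maximal

# Hypothetical probe coordinates (bp) for illustration only.
minimal_bp, maximal_bp = deletion_size_bounds(
    first_deleted=1_000_000, last_deleted=1_250_000,
    left_retained=990_000, right_retained=1_320_000,
)
```

With these illustrative positions the deletion spans at least 250 kb and at most 330 kb. The gap between the two bounds depends on local probe spacing, which is why the estimate is tighter in targeted regions of the array than elsewhere in the genome.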

Discussion

In this study, we identified heterozygous deletions of the entire COL1A1 gene in 5 unrelated families who had OI type I with blue sclera. We initially detected COL1A1 deletions by SC sequencing and then confirmed them using quantitative PCR and array CGH, which revealed that deletion sizes varied markedly between families. Even though all affected family members had clinical manifestations of OI type I, in some families we observed additional features, in particular dentinogenesis imperfecta and learning disability, that are not usually associated with OI type I caused by COL1A1 haploinsufficiency mutations.

The severity of the skeletal involvement in the described individuals was broadly similar to OI type I caused by COL1A1 stop or frameshift mutations that we had investigated previously [3]. The two groups had similar mean height z-scores and a similar percentage of individuals were treated with bisphosphonates (Table 1). Deletions of the entire COL1A1 gene seem to be a rare cause of OI type I, as we found such deletions in 5 of 161 families (3 %) with this diagnosis.

Array CGH showed that the deletion encompassed up to 46 genes in addition to COL1A1. In that respect, it is interesting to note that the index patient of Family 1 was the only individual in the present series who had tooth discoloration and on that basis was clinically diagnosed as having dentinogenesis imperfecta. Dentinogenesis imperfecta is not usually associated with haploinsufficiency mutations in COL1A1 [3]. The index patient of Family 1 was the only individual presented here in whom the deletion included DLX3 and DLX4 (Fig. 2c). These two genes were also deleted in the two individuals reported by Harbuz et al. and Mannstadt et al., who had OI and tooth involvement [5, 6] (Fig. 2c). Enamel loss was also noted in one individual with a deletion encompassing DLX3 and DLX4 in addition to COL1A1 [4]. In contrast to these individuals, the three affected members of our Family 2 appeared to have normal teeth, even though their deletion included all the genes that were deleted in the individual reported by Mannstadt et al., except for DLX3 and DLX4.

DLX3 and DLX4 encode transcription factors that are thought to be important for craniofacial development [20]. DLX4 haploinsufficiency due to a heterozygous frameshift mutation has recently been proposed as a cause of cleft palate and jaw abnormalities [21]. We did not observe cleft palate in the index patient of Family 1 and this feature has also not been noted in the other reports mentioned earlier [4–6]. However, cleft palate and Robin sequence were observed in an individual with OI type I who had a deletion of chromosome 17q21.33 to 17q23.1, which includes both DLX4 and COL1A1 [22]. It is also interesting to note that heterozygous frameshift mutations in DLX3 that lead to a truncated protein, apparently with dominant negative effect, have been associated with tricho-dento-osseous syndrome [23, 24], which includes tooth abnormalities (amelogenesis imperfecta). Overall, our observations are compatible with the hypothesis that heterozygous deletions involving DLX3 and DLX4 are associated with dental manifestations.

Another noteworthy finding of the present study is that the three affected individuals in Families 1 and 2 all had learning disabilities and speech delay. Again, these manifestations are not usually present in OI type I, but have been reported in the two individuals described by Harbuz et al. and Mannstadt et al. [5, 6] (Fig. 2c). The deletions in Families 1 and 2 and in those previous reports overlap for a stretch of DNA that contains a large number of genes and it is, therefore, not possible to prove which gene is responsible for causing the learning disability. Harbuz et al. hypothesized that the deletion of CACNA1G might be responsible for this aspect of the phenotype, given that CACNA1G codes for a calcium channel that is expressed in neurons and has been implicated in causing a recessive cognitive disorder [25]. This hypothesis is in accordance with the observation that affected individuals in Families 1 and 2 as well as individuals 267067 and 255632 of the Decipher database (Fig. 2c) had intellectual disability in addition to OI and that the deletions in these individuals also included CACNA1G.

From a methodological perspective, the present study confirms that amplicon-based SC sequencing of a targeted gene panel can identify not only missense and small indel mutations but also large heterozygous deletions affecting an entire gene. We had previously used an in-house method to detect large heterozygous deletions by SC sequencing [10]. In the present study, we used the software provided by the device manufacturer to identify CNV. Integration of the CNV algorithm into the software facilitates the detection of such variants.

All patients reported here had deletions of all exons of the COL1A1 gene, but we did not observe partial multi-exon deletions. Partial COL1A1 deletions appear to be rare, as they represent only 6 out of the approximately 1500 pathogenic COL1A1 mutations that are presently listed in the Osteogenesis Imperfecta Variant Database (accessed 14-Sep-2015). Another reason for not observing partial COL1A1 deletions may be that the present study examined patients with OI type I, whereas partial deletions often lead to in-frame deletions, which tend to result in more severe OI phenotypes [26, 27].

In summary, the present study shows that deletions of the entire COL1A1 gene can be detected using SC sequencing. We found such deletions in 3 % of families with a clinical diagnosis of OI type I. Deletions encompassing not only COL1A1 but also neighboring genes can lead to contiguous gene syndromes that may include dental involvement and learning disability.