Introduction

Determination of sibship is a frequent question in kinship analysis, e.g., if the alleged father of a child is not available so that (half) siblings of the child have to be analyzed, or in probate disputes. Moreover, it is the method of choice in routine forensic case work when, e.g., putrefied, skeletonized, or severely burned corpses have to be identified and no other method such as comparison of dental records, fingerprints, or identification by tattoos is applicable. Several studies have been published on improving DNA extraction and/or STR analysis, especially for highly degraded tissues from putrefied or severely burned bodies, so that a kinship analysis is possible [1, 2]. STR results from the deceased can be compared to data from living offspring or from parents, and a regular paternity analysis can be conducted for identification purposes. Additionally, DNA is often extracted from personal items that carry DNA from the deceased (e.g., from toothbrushes or combs [3]), and a matching probability can be calculated when the resulting STR pattern is identical.

But there are also cases in which no relative other than the putative brother or sister of the deceased is available for analysis [4, 5]. Sometimes only putative half siblings can be investigated [6]. In such cases, a statistical calculation can be done which is based on the assumption that siblings theoretically share more alleles than unrelated persons. Whether two unrelated persons share one or two alleles by chance depends only on the population frequencies of the alleles. Related persons, however, can also share alleles by descent. A pair of full sibs inherits the same allele from each of their two parents with a probability of 0.5. Since the transmissions from both parents to the two children are assumed to be independent, there are probabilities of ¼, ½, and ¼ for the two sibs to share 0, 1, or 2 alleles identical by descent, respectively. In addition, there is a probability for the two sibs to share alleles for the same reason as unrelated persons do, namely that two or more alleles of the parents were the same by chance. The probability for a pair of full sibs to share 0, 1, or 2 alleles (identity by state) can be calculated from the identity by descent probabilities and the population frequencies of the alleles [7].
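The reasoning above can be illustrated with a minimal Python sketch. It derives the identity by descent probabilities from two independent parental transmissions and combines them with allele frequencies into a single-locus likelihood ratio (relationship versus unrelated). This is a simplified textbook model under stated assumptions: Hardy-Weinberg proportions, no mutation, and no co-ancestry correction. The function names and the allele frequencies in `FREQ` are hypothetical and are not taken from this study or from the program used in it.

```python
# Identity-by-descent (IBD) coefficients (k0, k1, k2): probabilities that two
# persons share 0, 1, or 2 alleles IBD at a locus. Each parental transmission
# is shared with the given probability, independently of the other parent.
def ibd_coefficients(p_share_maternal, p_share_paternal):
    m, f = p_share_maternal, p_share_paternal
    return ((1 - m) * (1 - f), m * (1 - f) + (1 - m) * f, m * f)

FULL_SIBS = ibd_coefficients(0.5, 0.5)  # (0.25, 0.5, 0.25)
HALF_SIBS = ibd_coefficients(0.5, 0.0)  # (0.5, 0.5, 0.0), same mother only

# Hypothetical allele frequencies for one locus (illustration only).
FREQ = {"a": 0.1, "b": 0.2, "c": 0.7}

def genotype_prob(g, freq):
    """Hardy-Weinberg probability of an unordered genotype (x, y)."""
    x, y = g
    return freq[x] ** 2 if x == y else 2 * freq[x] * freq[y]

def prob_given_one_ibd(g1, g2, freq):
    """P(g2 | g1, exactly one allele IBD); the other allele is a random draw."""
    c, d = g2
    total = 0.0
    for shared in g1:  # each allele of g1 is the IBD allele with probability 1/2
        if c == d:
            total += 0.5 * (freq[c] if c == shared else 0.0)
        else:
            total += 0.5 * ((freq[d] if c == shared else 0.0)
                            + (freq[c] if d == shared else 0.0))
    return total

def locus_lr(g1, g2, freq, k):
    """Single-locus likelihood ratio: relationship with IBD coefficients k vs unrelated."""
    k0, k1, k2 = k
    p2 = genotype_prob(g2, freq)
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * prob_given_one_ibd(g1, g2, freq) / p2 + k2 * identical / p2

# Two persons homozygous for a rare allele: (1 + p)^2 / (4 p^2) with p = 0.1
print(round(locus_lr(("a", "a"), ("a", "a"), FREQ, FULL_SIBS), 6))  # 30.25
# No shared allele: only the k0 term remains
print(locus_lr(("a", "a"), ("b", "c"), FREQ, FULL_SIBS))            # 0.25
```

Under the usual assumption of independent loci, the single-locus values are multiplied to give the combined likelihood ratio for a pair.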

For many years, several STR multiplex PCRs from different companies have been routinely used in kinship analysis, such as the AmpFlSTR Identifiler kit (Applied Biosystems) or the Powerplex® 16 (Promega), both amplifying 15 STR loci in parallel [8–12]. Recently, multiplex PCRs were developed comprising even more loci, such as the GlobalFiler® with 24 loci (Applied Biosystems) or the Powerplex® 21 providing 20 STR loci (Promega). The latter has already been tested in our own studies and showed great potential for a forensic approach [13].

The aim of this study was to establish reliable background data for sibship analysis to be used in forensic case work by investigating three different groups: full siblings with identical mother and father, verified half siblings with different fathers, and unrelated individuals.

Material and methods

Samples

DNA samples for the first part of this study came from twin pairs that had to be genetically investigated for determination of mono- or dizygosity by STR analysis [14]. In all cases, samples were only taken after informed consent from the parents. Only dizygotic twin pairs were chosen for this study. Altogether, 346 twin pairs could be recruited; all were of European origin.

Samples for the second project, “Half siblings,” were derived from real paternity cases analyzed in the Institutes of Legal Medicine in Kiel or Essen. Half sibship for all 30 half sibling pairs was confirmed by regular paternity analysis of all three parents.

Additionally, group three consisted of 112 unrelated individuals from real paternities in the Institute of Legal Medicine in Essen. Every individual was compared with everyone else which resulted in 6216 comparisons. All individuals were of German descent and apparently unrelated. All persons investigated gave informed consent.
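The number of comparisons follows from pairing each of the 112 individuals with every other individual exactly once, i.e., n(n − 1)/2 pairs. A short Python check, in which the integer labels merely stand in for the anonymized individuals:

```python
from itertools import combinations
from math import comb

# Integer labels stand in for the 112 unrelated individuals.
individuals = range(112)

# Every unordered pair of two different individuals is one comparison.
pairs = list(combinations(individuals, 2))

print(len(pairs))    # 6216
print(comb(112, 2))  # 6216, the same count from the binomial coefficient
```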

DNA extraction

From all twins, DNA had already been extracted from buccal swabs or umbilical cord tissue [14] and was sent to us for genetic analysis. From the individuals of group 2 (half siblings) and group 3 (unrelated individuals), DNA was extracted from buccal swabs using the innuPrep DNA Swab Kit (Analytik Jena, Jena, Germany) according to the manufacturer’s instructions or by use of 300 μl SwabSolution™ per sample (Promega, Mannheim, Germany). In Kiel, buccal swabs were subjected to DNA extraction using the Invisorb Spin Swab Kit (Stratec, Berlin, Germany).

DNA amplification and fragment analysis

Samples were analyzed with the Powerplex® 16HS kit, comprising 15 STRs, or the newly introduced Powerplex® 21 kit, which includes the 15 STRs of the Powerplex® 16HS kit as well as D2S1338, D19S433, D1S1656, D12S391, and D6S1043 (both Promega). Ninety sibling pairs were additionally analyzed with the Powerplex® 21 kit. In addition, selected samples were analyzed with the male-specific Powerplex® Y23 kit (Promega). The amplification protocols for all kits followed the manufacturer’s instructions but with a reduced PCR volume of 12.5 μl in the GeneAmp® PCR system 9700 (Applied Biosystems). The use of this non-standard reaction volume has been thoroughly and independently validated according to the existing quality management procedures.

Amplification products (0.5 μl in 12 μl sterile water with 0.3 μl ILS500 standard) were separated and detected on an ABI Prism 3130 Analyzer (Applied Biosystems) in comparison to the allelic ladders which are components of both kits. Electrophoresis results were analyzed using the GeneMapper® ID Software v3.2 with the bin set provided by the manufacturer (www.promega.com). Allele peaks were interpreted when their height was at least 50 relative fluorescence units (RFU) and at most 3000 RFU. All positive controls had to show the expected full STR profile with allelic peaks between 50 and 3000 RFU and without any allelic drop-out or drop-in. All other data were dismissed and the analysis was repeated.
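The interpretation window for peak heights can be expressed as a simple filter. The following Python sketch is only an illustration of the stated thresholds; the function name and the example peaks are hypothetical and do not reproduce the GeneMapper® ID workflow.

```python
MIN_RFU = 50    # lower interpretation threshold stated in the text
MAX_RFU = 3000  # upper interpretation threshold stated in the text

def interpretable_peaks(peaks):
    """Keep (allele, height) pairs whose height lies within the thresholds."""
    return [(allele, rfu) for allele, rfu in peaks if MIN_RFU <= rfu <= MAX_RFU]

# Hypothetical peak list for one locus (allele label, height in RFU).
example = [("12", 45), ("13", 50), ("14", 1800), ("15", 3000), ("16", 3200)]
print(interpretable_peaks(example))  # [('13', 50), ('14', 1800), ('15', 3000)]
```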

Statistical analysis

A commercially available computer program (M.B. Baur, R. Fimmers, W. Spitz, Version 1.19 m+, Bonn, Germany) was used to calculate likelihood ratios and posterior probabilities for each pair of sibs and half sibs given the hypotheses below. The prior probabilities for this calculation were assumed to be equal for the hypotheses. W-values and likelihood ratios (LR) for sibship probability were calculated.

For all groups, the following hypotheses were regarded:

  1. Scenario 1:

    Hypothesis 1: Both individuals investigated are full siblings

    Hypothesis 2: Both individuals are unrelated

    and

  2. Scenario 2:

    Hypothesis 1: Both individuals investigated are half siblings with the same mother

    Hypothesis 2: Both individuals are unrelated

Additionally, for full siblings and half siblings, a comparison of the following hypotheses was performed:

  1. Scenario 3:

    Hypothesis 1: Both individuals investigated are full siblings

    Hypothesis 2: Both individuals investigated are half siblings with the same mother
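With equal prior probabilities for the two hypotheses of a scenario, the W-value (posterior probability of hypothesis 1) follows directly from the likelihood ratio as W = LR / (LR + 1), which gives the correspondence used throughout this paper (LR 9 and W 90 %, LR 99 and W 99 %, LR 999 and W 99.9 %). A minimal Python sketch of this conversion; the function names are chosen here for illustration and are not part of the program used in the study.

```python
def w_value(lr, prior=0.5):
    """Posterior probability of hypothesis 1 given its LR and prior probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)

def lr_from_w(w):
    """Inverse conversion for equal priors: LR = W / (1 - W)."""
    return w / (1 - w)

for lr in (1 / 9, 1, 9, 99, 999):
    print(f"LR = {lr:.4g}  ->  W = {100 * w_value(lr):.2f} %")
```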

Simulation of data

In addition to the empirical data, we simulated marker data for one million pairs of full sibs, half sibs, and unrelated persons each. For these simulations, we used 15 and 20 marker systems with the same allele frequencies as we used for the evaluation of the empirical cases. Moreover, we assumed a value of θ = 0.01 for the co-ancestry of the unrelated persons and the parents of the sibs as well as a mutation rate of 0.001 for each transmission independently for all marker systems. The likelihood of the simulated/observed genotype data of the pairs was then calculated under three different hypotheses about their relationship (full sibs, half sibs, and unrelated). Likelihood ratios and W-values were calculated for pairs of the three hypotheses using frequency data from our own study of unrelated Germans [13].
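The general idea of such a simulation can be sketched in Python as follows. This is a deliberately reduced illustration and not the simulation performed for this study: it uses hypothetical loci with eight equally frequent alleles, far fewer pairs, and it ignores both the co-ancestry coefficient θ and mutation. All names and parameter values in the code are assumptions made for the example.

```python
import random

# Hypothetical locus: eight equally frequent alleles (not the study frequencies).
FREQ = {allele: 1 / 8 for allele in range(8)}
FULL_SIBS = (0.25, 0.5, 0.25)  # IBD coefficients (k0, k1, k2)

def draw_genotype(freq, rng):
    """Two independent allele draws from the population frequencies."""
    alleles = list(freq)
    weights = [freq[a] for a in alleles]
    return tuple(rng.choices(alleles, weights=weights, k=2))

def simulate_pair(freq, relationship, rng):
    """Genotypes of two persons at one locus (no mutation, no co-ancestry)."""
    if relationship == "unrelated":
        return draw_genotype(freq, rng), draw_genotype(freq, rng)
    mother = draw_genotype(freq, rng)
    father1 = draw_genotype(freq, rng)
    # Half sibs share the mother only; the second child has a different father.
    father2 = father1 if relationship == "full" else draw_genotype(freq, rng)
    child1 = (rng.choice(mother), rng.choice(father1))
    child2 = (rng.choice(mother), rng.choice(father2))
    return child1, child2

def genotype_prob(g, freq):
    x, y = g
    return freq[x] ** 2 if x == y else 2 * freq[x] * freq[y]

def prob_given_one_ibd(g1, g2, freq):
    """P(g2 | g1, exactly one allele identical by descent)."""
    c, d = g2
    total = 0.0
    for shared in g1:
        if c == d:
            total += 0.5 * (freq[c] if c == shared else 0.0)
        else:
            total += 0.5 * ((freq[d] if c == shared else 0.0)
                            + (freq[c] if d == shared else 0.0))
    return total

def locus_lr(g1, g2, freq, k):
    """Single-locus LR: relationship with IBD coefficients k versus unrelated."""
    k0, k1, k2 = k
    p2 = genotype_prob(g2, freq)
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * prob_given_one_ibd(g1, g2, freq) / p2 + k2 * identical / p2

def fraction_above(threshold, n_pairs, n_loci, relationship, k, seed=1):
    """Share of simulated pairs whose combined LR exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        lr = 1.0
        for _ in range(n_loci):  # loci are treated as independent
            g1, g2 = simulate_pair(FREQ, relationship, rng)
            lr *= locus_lr(g1, g2, FREQ, k)
        hits += lr > threshold
    return hits / n_pairs

# Full sibship versus unrelated, evaluated on true full sibs and on unrelated pairs.
print(fraction_above(9, 300, 15, "full", FULL_SIBS))
print(fraction_above(9, 300, 15, "unrelated", FULL_SIBS))
```

The two printed fractions correspond to the kind of entries summarized per LR class in the tables, here for the single threshold LR > 9.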

Ethics

The initial study was approved by the local committee on research in human subjects of the University of Luebeck and the local committees of all participating centres. There were no more investigations made than necessary for the initial studies. All investigations were done with anonymized samples.

For the half sibling project, all people gave informed consent. Samples were obtained and analyzed after advice of the Medical Ethics Committees of the University of Duisburg-Essen and University Hospital of Schleswig-Holstein in accordance with the Declaration of Helsinki. The anonymity of the individuals investigated was preserved corresponding to the rules of data protection of the Human Medical Faculties Essen and Kiel.

Results and discussion

Testing of relationship and data reliability

From 346 twin pairs, 19 pairs tested for full sibship (versus unrelated) showed an LR below 10 (W-value < 90 %) after analysis of 15 STRs (Table S1). Those samples were investigated further to ensure that no sample mix-up had led to falsely negative or rather low results.

All samples to be investigated were clearly and reproducibly labeled; thus, a simple mix-up was not very likely. However, all 19 cases were reanalyzed and labeling was thoroughly checked. Six sample pairs were derived from male pairs and could therefore be subjected to a Y chromosome-specific STR analysis using the Powerplex® Y23 (Promega). Four pairs (5, 8, 12, and 13) showed an identical Y-specific haplotype, pointing to real sibship despite the low values. Two pairs displayed clearly different haplotypes (1 and 6); thus, a mix-up of samples was assumed and the corresponding samples were omitted and not used for further investigations. In eight cases, LR values of >100 (average, 156.1; median, 127.8), corresponding to W-values > 99 % with an average of 99.43 % and a median of 99.21 %, for a full sibship could be obtained by analyzing 20 STRs instead of only 15 loci (marked in bold in Table S1). Thus, these pairs were considered to be real siblings. Three cases led to rather low LRs for a full sibship but to high values for half sibship. Retesting of new samples from those individuals showed that they really were siblings and that no sample mix-up had occurred. For the last three pairs, W-values for full sibship or half sibship remained below 12.7 % (LR under 0.14476). Re-analysis of samples from the original tissues, including the mothers, confirmed the origin of the samples. Thus, these data were included in the study.

In using this experimental approach, we were aware that dizygotic twins are not always full siblings. There are a few references describing dizygotic twin pairs with two different fathers [15, 16], but the frequency of this so-called heteropaternal superfecundation has not been reliably determined. A study of the literature revealed that its reported frequency among dizygotic human twins varied between 2.4 % [17] and 0.25 % [18]. More data could not be found, although it is a quite normal event in other mammals such as cats and dogs [19]. In our own institutes, 22 (Essen) and 18 (Kiel) cases of dizygotic twins were investigated in the last 8 years, none with different fathers. Since such cases are very rare among humans, we are confident that such occurrences do not noticeably affect our calculations and conclusions.

Investigation of full siblings (twin group)

Scenario 1: full siblings/unrelated

After investigation of 344 dizygotic twin pairs, we obtained a maximum LR of 9.35 × 10¹⁰ (W-value > 99.999999 %) for the a posteriori probability of hypothesis 1 (full sibship) when including the 15 STRs provided by the Powerplex® 16; the minimum LR was 5.407 × 10⁻⁵ (W-value 0.005407 %), the median LR was 20,643 (W-value 99.9953325 %), and the average was 291.2 × 10⁶ (96.554953 %).

For an easier interpretation of a possible relationship, we divided the results into eight different groups (see Table 1). In summary, in 95 % of all investigated full sibling pairs, an LR for full sibship of 9 or higher (W-value ≥ 90 %) could be achieved, while less than 3 % led to LRs below 1 (W-value ≤ 50 %). Almost 75 % of the pairs as well as of the simulated comparisons demonstrated an LR > 999 (W-value > 99.9 %). However, we did find twins with LRs < 1/9 (W-value < 10 %) (1.8 % of pairs). This contradicts other studies, e.g., the investigation of 50 known siblings using 15 STR markers by Reid et al. [8] in which all sibling pairs had a combined sibship index (CSI) of >107. The CSI is based upon the degree of allele sharing between two DNA profiles. A CSI of less than 1 implies that the two individuals are not related as siblings, while a value over 1 supports this kind of relationship [6]. The difference to our results might be due to the rather low number of individuals investigated by Reid et al. [8]. Pu and Linacre, for example, generated 357,630 full sibling pairs from DNA profiles of random populations (Chinese, Caucasian, and African Americans) using the 15 STRs provided with the Identifiler Kit (Applied Biosystems) [9]. They found CSI values of less than 1 in about 1.5 % of their cases, which is in line with our finding of 1.8 % of pairs with LR < 1/9.

Table 1 Results for calculating full sibship versus unrelated from 344 dizygotic twin pairs in comparison to simulated data. Calculations were done using German population frequencies [13] after amplification of 15 STRs

The high concordance between experimental and simulated data underlines the reliability of our results and makes the possibility of heteropaternal superfecundation among our twins highly unlikely.

Scenario 2: half siblings/unrelated

Investigating the possibility for a half sibship relation between our twin pairs, the values differed slightly. Regarding this possibility, 96.8 % demonstrated a LR over 9 (W-value ≥ 90 %), 49.7 % even a LR ≥ 999 (W-value ≥ 99.9 %) (Table 2). These data show that most cases clearly supported a relationship.

Table 2 Results for calculating half sibship versus unrelated from 344 dizygotic twin pairs in comparison to simulated data. Calculations were done using German population frequencies [13] after amplification of 15 STRs

Investigation of 20 instead of 15 STRs

In sibship analysis, several authors recommend investigating as many STRs as possible, since the significance of the analysis would benefit greatly from including more than 15 STRs or additional genetic markers such as mitochondrial DNA or gonosomal STRs [9]. In an attempt to verify these assertions, we additionally investigated 88 of our twin pairs (including the cases with the lowest probabilities) using the Powerplex® 21 kit, which comprises altogether 20 STR markers.

Looking at these 88 twin pairs, maximal and minimal LR values for a full sibship were 1.24 × 10⁹ and 5.41 × 10⁻⁵ when regarding 15 STRs and 5.9123 × 10¹³ and 1.817 × 10⁻⁵ after increasing to 20 STRs. The total number of cases with an LR over 999 increased when regarding 20 STRs (from 61.4 to 76.1 %; Table 3). Regarding the cases individually, the probability increased in 78 out of 88 twin pairs (=88.6 %) and decreased in 10 cases (=11.4 %). While the increase was often very high, the decrease usually was only marginal (e.g., from 546 to 304 or from 279 to 215). However, twice the LR dropped from >1000 to 419 and 151, respectively. This example shows that increasing the number of loci can greatly improve the informative value but not necessarily in every case. This was already mentioned in a study by Wenk and Shao [20].

Table 3 Results for calculating full sibship (half sibship) versus unrelated from 88 dizygotic twin pairs. Calculations were done using German population data [13] after amplification of 15 STRs or 20 STRs. Data for testing these same twins in a half sibship versus unrelated hypothesis is given in brackets

As expected, combined data for the possibility of half sibship from our twin pairs were also higher when calculating with 20 STRs (Table 3). In 95 %, an increase of the LR could be obtained, maximum value increased from 447,810 to 308,600,000, median from 414 to 4111. Only four times, LR dropped, while in 26 % of the cases, the LR changed to at least 1000.

Scenario 3: full siblings/half siblings

In many cases, the people requesting a sibship analysis are certain that they have the same mother, but want to know if they have the same father, too. Our results demonstrate that it is much more difficult to answer this question. Only about 60 % and up to 76 % of pairs demonstrated a LR over 9 (W-value ≥ 90 %) investigating 15 or 20 STRs, respectively (Table 4). Moreover, more than 25 % of full sibling pairs with LR greater than 99 (W-value ≥ 99 %) in the full sibship versus unrelated calculation drop below an LR of 9 here. An analysis of at least 20 STRs is clearly recommended to distinguish between full sibship and half sibship.

Table 4 Results for calculating full sibship versus half sibship from 260 (15 STRs) or 88 (20 STRs) dizygotic twin pairs. Calculations were done using German population data [13]

Investigation of half siblings (granted half siblings)

Altogether, 30 pairs of half siblings could be included in this study. In all cases, the relationship was confirmed prior to our project by additionally investigating the corresponding parents. In theory, at any given locus, half siblings have a 0.5 chance of sharing one allele identical by descent and a 0.5 chance of sharing none [5].

Scenario 1: full sibling/unrelated

When using the hypothesis full siblings versus unrelated, a maximum value of 99.993 % could be achieved, but on average, values were lower than for half sibship (see scenario 2) showing that the calculation results really prefer the possibility of half sibship.

Scenario 2: half siblings/unrelated

When regarding two hypotheses (half sibship/unrelated), 23 % of our pairs reached probabilities of at least 99 % (LR at least 99); two of them were over 99.9 % (LR > 999). In 57 % of the cases, a W-value of 90 % or higher was reached for the possibility of half sibship. Only 3 % of the calculations led to values lower than 10 % (Table 5). These data also fit to a former simulation study: In a simulation of 355,620 pairs of half siblings (15 STRs, three populations), about 12 % of obtained combined half sibling index (CHSI) values were found to be less than 1 [9].

Table 5 Results for calculating half sibship versus unrelated from 30 half sibling pairs in comparison to simulated data. Calculations were done using German population frequencies [13] after amplification of 15 STRs and 20 STRs

Investigation of 20 instead of 15 STRs

The confirmed half siblings were also additionally investigated with 20 STRs. As Table 5 shows, the pooled values here increased as well as most of the single data: In 63 % of the cases, probability for half sibship increased with 20 STRs while in 37 % values were lower. Again as expected, these data show that the increase of markers investigated does not necessarily improve the data reliability.

Scenario 3: half siblings/full siblings

In this scenario, the sharing of even one allele is interpreted as a hint for full sibship by the program used for calculation. Therefore, only those STRs in which the half siblings do not share any allele increase the possibility for half sibship. Nevertheless, 40 % and up to 65 % of half sibling pairs displayed a LR over 9 (W-value ≥ 90 %) investigating 15 or 20 STRs, respectively (Table 6).

Table 6 Results for calculating half sibship versus full sibship from 30 half sibling pairs. Calculations were done using German population data [13] after amplification of 15 or 20 STRs

Investigation of unrelated individuals

We applied the same hypothetical models to our unrelated group (112 people) and calculated the probability for full sibship or half sibship versus being unrelated in 6216 comparisons using either 15 or 20 STRs.

Scenario 1: full siblings/unrelated

Regarding full sibship, an LR below 1/9 was achieved in more than 93 % (15 STRs) or more than 97 % (20 STRs) of comparisons (Table 7). LRs higher than 9 (W-value ≥ 90 %) were reached in less than 0.5 % of comparisons. So, in most cases, a strong hint towards “unrelated” could be obtained. In the already mentioned study by Pu and Linacre regarding simulated sibships, 178,815 non-sibling pairs were also generated from the same populations [9]. About 1.5 % of these pairs showed CSI values larger than 1, which points to a sibship. This is consistent with our results, with less than 0.5 % of the comparisons reaching LRs ≥ 9.

Table 7 Results for calculating full sibship versus unrelated from 6216 pairwise comparisons among 112 unrelated people in comparison to simulated data. Calculations were done using German population frequencies [13] after amplification of 15 and 20 STRs

Scenario 2: half siblings/unrelated

For the hypotheses half siblings versus unrelated, more than 85 % (15 STRs) and more than 89 % (20 STRs) of comparisons demonstrated an LR below 1/9 (Table 8). Approximately 2 % of comparisons led to LRs higher than 9.

Table 8 Results for calculating half sibship versus unrelated from 6216 pairwise comparisons among 112 unrelated people in comparison to simulated data. Calculations were done using German population frequencies [13] after amplification of 15 and 20 STRs

Concluding recommendations

Since the data of empirical and simulated comparisons are in agreement in each of the three groups investigated (full siblings, half siblings, unrelated people), even if the number of empirical half sibling pairs is rather low, we can assume a high reliability of our data and are able to give some recommendations for dealing with this kind of sibship analysis. To this end, the power of the calculated values was tested under the following question: if identical numbers of full sibling (or half sibling) pairs and unrelated pairs are chosen by chance, how many of the pairs with an LR above a defined value are really full siblings (half siblings)? The results are displayed in Table 9, which should be read as follows. Regarding the question of full sibship, more than 99.9 % of comparisons with an LR > 999 (W-value > 99.9 %) are in fact full siblings. If the result of a full sibship analysis is only an LR between 9 and 99, the kinship expert has to keep in mind that nearly 9 % of comparisons with this LR are not related. On the other hand, less than 0.05 % of pairs with an LR for full sibship below 1/99 are really full siblings.

Table 9 Percentages of actual full siblings (half siblings) among an identical number of siblings and unrelated with regard to W-value and LR
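The percentages in Table 9 answer the question of how many pairs within an LR class are true siblings when identical numbers of sibling and unrelated pairs are considered. Under that assumption, the share is the fraction of sibling pairs falling into the class divided by the sum of the fractions of sibling and unrelated pairs falling into it. A minimal Python sketch of this arithmetic; the function name and the example fractions are hypothetical and are not values from this study.

```python
def share_true_siblings(frac_siblings_in_class, frac_unrelated_in_class):
    """Share of true sibling pairs among all pairs in one LR class,
    assuming identical numbers of sibling and unrelated pairs."""
    total = frac_siblings_in_class + frac_unrelated_in_class
    if total == 0:
        raise ValueError("no pairs fall into this LR class")
    return frac_siblings_in_class / total

# Hypothetical example: 20 % of sibling pairs and 2 % of unrelated pairs
# fall into the same LR class.
share = share_true_siblings(0.20, 0.02)
print(f"{100 * share:.1f} % true siblings, {100 * (1 - share):.1f} % unrelated")
```

With unequal numbers of sibling and unrelated pairs, the two fractions would have to be weighted by the respective group sizes, which corresponds to using unequal prior probabilities.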

Now, in a given case, a kinship expert can identify the value of “his” determined kinship probability and better understand its meaningfulness. With regard to our data, we recommend a cutoff level of LR > 9 (W-value > 90 %) as minimal value for full sibship (half sibship), since at this level less than 10 % of pairs will be wrongly attributed as full siblings (half siblings), even though they are unrelated. However, it remains very difficult to reliably answer the question if a male and a female person with the same mother also have the same father.