Introduction

Artificial selection has made the domestic dog the most phenotypically diverse mammal in the world, as illustrated by the tremendous variation in morphology and behavior among breeds. This also holds true for life history traits such as adult lifespan, which varies about two-fold across dog breeds [1,2,3]. Given that dogs suffer from many of the same diseases as humans, share our environment, and have a sophisticated level of medical care, what we learn about lifespan in dogs is likely to have a high degree of relevance to humans as well [4,5,6].

The strongest determinant of lifespan in dogs is body size, with small dogs outliving their large counterparts (by about 5 years on average and up to 8 years) [7,8,9,10]. Another factor that has the potential to affect canine lifespan is genetic diversity or the level of inbreeding [7, 11,12,13]. The genetic diversity found across dogs as a species is comparable to humans, but within dog breeds it is much smaller [14, 15]. Reproductive isolation (i.e., closed stud books), founder effects, bottlenecks, and genetic drift as well as breeding practices such as selection for extremes, inbreeding, and the use of popular sires have led to strong genetic differentiation among dog breeds and high levels of homozygosity in purebred dogs [15,16,17,18,19,20]. The detrimental effects of inbreeding on fitness-related traits have been documented in a wide variety of wild, captive, and domestic populations [21,22,23,24,25,26,27]. Populations with small effective population sizes and high levels of homozygosity, such as modern dog breeds, are at a particularly high risk of inbreeding depression, though the benign living environment of these dogs and purging might mitigate it to some extent [21, 25, 27,28,29]. In line with this, the lifespan advantage of mixed-breed dogs, which typically have a much higher genetic diversity compared to purebred dogs, has been robustly documented based on various data sources, including owner surveys [1, 2], cemetery data [30], teaching hospital data [7, 11] and primary veterinary practices ([31] but see [12]). Mixed breed dogs live on average about 1.2 years longer than similarly-sized purebred dogs [11, 31]. There is also some evidence that individual inbreeding level negatively affects juvenile survival and adult lifespan within breeds [11, 13].

Despite the effects of genetic diversity on lifespan on those scales, we could not confirm the expected negative correlation between inbreeding levels and lifespan across dog breeds when accounting for body size in an earlier study [11], even though more inbred breeds have recently been shown to have a higher burden of overall morbidity [32]. There are several possible reasons for this counterintuitive finding. One possible explanation discussed in the study [11] is that the effect of inbreeding on lifespan is primarily due to loss of allelic diversity and a concomitant accumulation of deleterious mutations caused by bottlenecks during breed formation, with breeds exhibiting a relatively similar degree of historic inbreeding depression. This idea is consistent with the fact that bottlenecks during domestication and breed formation, rather than further inbreeding of established breeds, explain the increased genetic load in dogs compared with wolves [33]. Nevertheless, data on genetic diversity across breeds suggests substantial variation from highly inbred breeds to breeds with levels of genetic diversity quite close to that of mixed breed dogs ([32], see table S1).

We might also fail to detect an expected relationship between inbreeding and lifespan due to (a) the fact that we do not have estimates of lifespan and inbreeding from the same individual dogs; and (b) the relatively low quality of available lifespan estimates in veterinary studies, most of which suffer from bias due to incomplete birth cohorts (i.e., right-censored data [34]). Studies that include only those dogs that die before a specified date can lead to underestimating lifespan, if only dogs dying relatively young are captured in the estimate for birth cohorts, such that a large proportion of dogs might still be alive at the time of sampling. This bias can be exacerbated when population numbers, and hence age structure, change over time, as is often the case with dog breeds gaining or losing popularity (see, e.g., [35] for a low median lifespan in a rapidly increasing population of Chihuahuas). If the bias across breeds is relatively similar and/or the effect of the variable of interest is very strong, as in the case of body size, its relationship with lifespan will still be detectable. For traits with smaller effects on lifespan, especially if cohort bias is related to the trait of interest, it might mask the effect of the trait.
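The censoring mechanism described above can be illustrated with a small deterministic sketch. All numbers here are hypothetical (lifespans of 6 to 15 years, cohort sizes, a 10% annual growth rate) and are not taken from any data set; the function name is likewise an assumption made for illustration.

```python
def observed_mean_lifespan(cohort_sizes, ages_at_death, first_birth_year, cutoff_year):
    """Mean age at death among deaths recorded on or before cutoff_year.

    Every cohort is assumed to share the same lifespan distribution:
    each value in ages_at_death is equally likely.
    """
    total_age = 0.0
    total_deaths = 0.0
    for offset, size in enumerate(cohort_sizes):
        birth_year = first_birth_year + offset
        deaths_per_age = size / len(ages_at_death)
        for age in ages_at_death:
            if birth_year + age <= cutoff_year:  # this death has already been observed
                total_age += deaths_per_age * age
                total_deaths += deaths_per_age
    return total_age / total_deaths


ages = list(range(6, 16))                      # hypothetical lifespans, true mean 10.5 years
stable = [100] * 20                            # constant registrations per birth year
growing = [100 * 1.1 ** k for k in range(20)]  # registrations rising 10% per year

# Cohorts born 1980-1999 have all died by 2020; cohorts born 2000-2019 have not.
complete_mean = observed_mean_lifespan(stable, ages, 1980, 2020)
stable_mean = observed_mean_lifespan(stable, ages, 2000, 2020)
growing_mean = observed_mean_lifespan(growing, ages, 2000, 2020)
```

Under these assumptions the completed cohorts recover the true mean, the incomplete cohorts underestimate it, and the underestimate is larger when the population is growing, because recent large cohorts contribute only their early deaths.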

To better understand how size and genetic diversity shape lifespan across breeds, we need to analyze their relationship with the underlying causes of death. Large size seems to incur lifespan costs mainly via an increased rate of aging [9], whereas the mixed breed lifespan advantage seems to be caused by both a decreased age-independent mortality hazard as well as a decreased rate of aging [11], suggesting that these two factors might differ in their effect on the major causes of death. The impact of a given cause of death on mean lifespan depends on two parameters—the age at which dogs succumb to this cause of death, and its prevalence in the population. Here we investigate the effect of body size and genetic diversity on two major causes of death in dogs, one usually deemed desirable — “dying of old age,” the other one often dreaded — cancer.

Strictly speaking, old age is not a cause of death as aging is not a disease (but see [36]), though for practical reasons it has been termed as such in surveys on dog mortality before, and we will do so here as well. While old age strongly increases the risk of dying from many diseases, it is not a terminal process itself. Given that dog owners typically choose to euthanize their pets once they consider the pet’s quality of life to be too poor for continued survival, their owner-determined lifespan has been suggested as a measure of the end of their healthspan [4, 5]. Hence, the lifespan of dogs that reportedly died of old age can be viewed as a measure of the breed-specific healthspan potential. With a reported prevalence of around 25% based on owner reports including purebred and mixed breed dogs [1, 2], old age mortality in dogs seems to be of a similar magnitude to cancer mortality, but so far, no analysis of factors affecting its prevalence has been undertaken. Cancer is usually an age-related disease and the single most common cause of death in dogs, accounting for about 15–30% of all deaths but with prevalence varying widely across breeds (reviewed in [37]). Because canine cancers resemble human cancers in prevalence and biology, dogs have become a valuable model in comparative oncology [38,39,40,41,42,43,44]. Several studies have documented an increased cancer risk in larger breeds [1, 38], and Fleming et al. [45] showed, based on a large sample of dogs dying at veterinary teaching hospitals, that larger size is linked to an increased risk of dying of neoplasia, but not of other pathophysiological processes. Evidence for an effect of genetic diversity on cancer mortality is more ambiguous. While some evidence points to increased cancer incidence in purebred compared to mixed breed dogs [46, 47], Bellumori et al. [48] did not find a significant mixed breed advantage in cancer prevalence. None of these studies, however, accounted for size effects on cancer mortality. We are not aware of any analyses that have looked at the effect of size or genetic diversity on cause-of-death-specific lifespan.

Taking advantage of a large public database for purebred dogs (koiranet of the Finnish Kennel Club), we address the problem of bias in lifespan and prevalence estimates by only including completed birth cohorts (i.e., all dogs of these cohorts have died off) in our mortality parameter estimates. Additionally, we quantify the bias in breed-specific lifespan estimates caused by incomplete birth cohorts and changing registration numbers and show that this indeed can explain our earlier failure to detect the relationship between genetic diversity and lifespan across dog breeds. We then quantify the effect of body size and genetic diversity on breed-specific mean lifespan, as well as lifespan and the proportion of old age and cancer mortality. We also show that our findings hold true when accounting for the relatedness among breeds.

Methods

Mortality data

We used average breed-specific lifespan and cause of death data from the public database koiranet of the Finnish Kennel Club (https://jalostus.kennelliitto.fi/frmEtusivu.aspx?Lang=en&R=57). To avoid the problem of underestimating lifespan due to the inclusion of incomplete birth cohorts (a birth cohort being all individuals born and registered within a specific calendar year), and additional bias due to changing registration numbers, we restricted the data to the birth cohorts of 1988 to 2002 (data extraction January 2020). This means that only the few dogs of the later birth cohorts that completed their 18th year of life (for 2002), 19th year of life (for 2001) and so on, are not included in the data set, which might lead to minor underestimates of lifespan, especially in small breeds where such long lifespans are more frequent. Restricting the included birth cohorts even further would have substantially decreased the number of breeds with a large enough sample size.

Table 1 lists the mortality parameters we extracted from the data. To exclude extrinsic mortality from our corrected mean lifespan estimate, we recalculated the average lifespan given for each breed without the causes of death categories “accident,” “damage done by large carnivores,” “euthanasia due to behavioural problems,” “lost,” “wash-outs” (dogs euthanized because of unsuitability for the intended working purpose), and “cause of death not specified” for all breeds. We did not exclude the categories “dead without diagnosis of illness,” “euthanasia, non-diagnosed,” because the percentage of dogs in these categories that might have died due to extrinsic causes of deaths is likely very small (results hold qualitatively if these are excluded, and also if “cause of death unspecified” is included).

Table 1 Mortality parameters computed for each breed and their meaning. Further explanations can be found in the text

We additionally extracted average breed lifespan without the birth cohort restriction (complete causes of death data available for a given breed up to the end of January 2021). As with the restricted data set, we recalculated this parameter excluding clearly extrinsic causes of mortality and deaths with unspecified cause (uncorrected mean lifespan). To estimate the bias in breed lifespan estimates introduced by incomplete birth cohorts and changes in population numbers, we computed the difference between the breed-specific mean lifespan estimate based on the complete and the restricted data set (bias in mean lifespan). For all breeds, we calculated the percentage change in annual registration numbers (i.e., the number of new individuals in a breed that are registered by the Finnish Kennel Club) over the years that are included only in the estimate of uncorrected mean lifespan, based on the difference between mean annual registration numbers for the years 2002–2006 and the years 2016–2020.
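The two derived quantities in this paragraph can be sketched as follows. The sign convention (corrected minus uncorrected, so that a positive value means the unrestricted estimate is too low) and the choice of the 2002–2006 mean as the reference for the percentage are assumptions made for illustration; the function names and the example numbers are hypothetical.

```python
def bias_in_mean_lifespan(corrected_mean, uncorrected_mean):
    """Difference between the cohort-restricted and the unrestricted estimate (years).

    Assumed sign convention: positive when the unrestricted estimate is lower.
    """
    return corrected_mean - uncorrected_mean


def percent_change_in_registrations(early_counts, late_counts):
    """Percentage change between the mean annual registrations of two periods.

    early_counts: annual registrations for the earlier period (e.g. 2002-2006)
    late_counts:  annual registrations for the later period (e.g. 2016-2020)
    Assumption: the change is expressed relative to the earlier period's mean.
    """
    early_mean = sum(early_counts) / len(early_counts)
    late_mean = sum(late_counts) / len(late_counts)
    return 100.0 * (late_mean - early_mean) / early_mean


# Hypothetical breed: registrations rise from 100 to 150 per year.
example_bias = bias_in_mean_lifespan(12.0, 10.5)
example_change = percent_change_in_registrations([100] * 5, [150] * 5)
```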

As a rough measure or proxy of breed-specific healthspan potential, we used the average age at death of dogs with the cause of death reported as “old age (natural or euthanasia),” which we refer to as old age lifespan. Additionally, we computed the proportion of dogs that reportedly died of old age as the % old age mortality. To characterize breed-specific cancer mortality, we extracted the average age at death for dogs with the reported cause of death “tumor, cancer” (i.e., cancer lifespan). Furthermore, we calculated the % cancer mortality as the proportion of dogs that died of cancer. In this case, we only included deaths with a specific diagnosis in the denominator, i.e., we excluded dogs in the cause of death categories “dead without diagnosis of illness,” “euthanasia, non-diagnosed.” It seems plausible that a non-negligible percentage of deaths in these categories were due to cancer, and therefore including them would have underestimated cancer rates (results hold if these are not excluded). The next most common cause of death in the database was heart disease, but case numbers for this and the other categories of death were too limited for many breeds to obtain meaningful cause-specific mortality estimates.
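A minimal sketch of how the two percentages differ in their denominators is given below. The category labels are quoted from the text; the "heart disease" label, the example counts, and the assumption that % old age mortality uses all non-extrinsic deaths as its denominator are illustrative and not confirmed by the source.

```python
# Categories excluded from all mortality parameters (quoted from the text).
EXTRINSIC = {
    "accident",
    "damage done by large carnivores",
    "euthanasia due to behavioural problems",
    "lost",
    "wash-outs",
    "cause of death not specified",
}
# Categories additionally excluded from the cancer denominator only.
UNDIAGNOSED = {"dead without diagnosis of illness", "euthanasia, non-diagnosed"}


def mortality_percentages(death_counts):
    """Return % old age and % cancer mortality for one breed.

    death_counts maps a cause-of-death category to the number of dogs.
    """
    retained = {c: n for c, n in death_counts.items() if c not in EXTRINSIC}
    all_retained = sum(retained.values())
    diagnosed = sum(n for c, n in retained.items() if c not in UNDIAGNOSED)
    old_age = retained.get("old age (natural or euthanasia)", 0)
    cancer = retained.get("tumor, cancer", 0)
    return {
        "pct_old_age": 100.0 * old_age / all_retained,  # assumed denominator
        "pct_cancer": 100.0 * cancer / diagnosed,       # diagnosed deaths only
    }


# Hypothetical counts for a single breed.
example = mortality_percentages({
    "old age (natural or euthanasia)": 30,
    "tumor, cancer": 20,
    "heart disease": 10,
    "dead without diagnosis of illness": 20,
    "euthanasia, non-diagnosed": 20,
    "accident": 10,
    "lost": 5,
})
```

In this example, dropping the two undiagnosed categories raises the cancer percentage from 20% of all retained deaths to one third of diagnosed deaths, which is the direction of the correction argued for in the text.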

We included data from 119 breeds for which at least 80 deaths were documented. These breed-specific estimates of corrected mean lifespan are based on a total of 40,841 dogs.

Body size, genetic diversity, and genetic relatedness data

When available, we used the average of the female and male size (i.e., body mass) described in the Fédération Cynologique Internationale (FCI) breed standard. For breeds without an expected body mass in the breed standard, we used the expected body mass found in other breed descriptions from the internet (mainly American Kennel Club [AKC] standards). As an estimate of breed-specific genetic diversity, we used median heterozygosity measured by MyDogDNA® (Genoscoper Laboratories Oy, or Optimal Selection®, Wisdom Health, Vancouver, WA, USA) and published by [32]. Since Genoscoper was originally located in Finland, the majority of the dogs tested so far come from Europe and many from Finland. Hence, the dogs from which the genetic diversity data come are not identical to those used for the mortality estimates, but most come from the same, albeit temporally derived European breed populations. Estimates were provided for breeds with at least 30 genotyped individuals. In contrast to the study by Yordy et al. (2019), levels of genetic diversity were not significantly correlated with body size (Pearson correlation: r = −0.062, p = 0.504).

We used genetic data from a recent large-scale dog genotyping analysis [49] that was generated using the Illumina Canine HD SNP chip (Illumina, San Diego, CA), which covers 172,000 SNPs spread out across the dog genome. We estimated genetic relatedness among individuals by calculating identity-by-state (IBS) for all pairs of individuals using PLINK v.1.9 (default settings: plink --file datafile --cluster --matrix) [50]. Using this matrix, we then generated breed-average IBS matrices by calculating the mean IBS within breeds and between pairs of breeds (Table S2).
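The averaging step from an individual-level IBS matrix to a breed-level one might look like the sketch below. It assumes a symmetric matrix held as a list of lists and one breed label per individual, and it leaves self-comparisons (the diagonal) out of the within-breed means; the text does not say how the diagonal was handled, so that choice, the function name, and the example values are all hypothetical.

```python
def breed_average_ibs(ibs, breeds):
    """Average pairwise IBS within each breed and between each pair of breeds.

    ibs:    symmetric matrix (list of lists) of pairwise IBS values
    breeds: breed label for each row/column of ibs
    Returns a dict keyed by a sorted (breed_a, breed_b) tuple;
    (b, b) holds the within-breed mean for breed b.
    """
    sums = {}
    counts = {}
    n = len(breeds)
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once, diagonal skipped
            key = tuple(sorted((breeds[i], breeds[j])))
            sums[key] = sums.get(key, 0.0) + ibs[i][j]
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical IBS values for two dogs of breed A and two of breed B.
labels = ["A", "A", "B", "B"]
matrix = [
    [1.00, 0.90, 0.70, 0.60],
    [0.90, 1.00, 0.80, 0.70],
    [0.70, 0.80, 1.00, 0.85],
    [0.60, 0.70, 0.85, 1.00],
]
breed_ibs = breed_average_ibs(matrix, labels)
```

A breed represented by a single genotyped individual has no within-breed pair and therefore gets no (b, b) entry under this scheme.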

Statistical analyses

To test for the difference between uncorrected and corrected mean lifespan (i.e., the bias), we used a paired t test. We used linear regression to analyze the relationship between bias as well as our mortality parameters and our predictor variables body size, genetic diversity (heterozygosity), and percentage change in registration numbers. For the three percentage variables, we first used logistic regression, but model assumptions were violated (significant overdispersion and the response was not linear on a logit scale). In contrast, linear model assumptions were met relatively well, and therefore we used linear regression for these, too. To identify statistical outliers, we used studentized residuals and Bonferroni corrected p-values [51]. We checked for influential points visually using Cook’s distances and changes in the regression coefficients [51]. One breed, the Basenji, proved to be a highly influential point in several of the analyses, leading to a change of around 20% in the β coefficient for heterozygosity, and was subsequently removed from all analyses except the bias and the percentage change analyses. This resulted in a sample size of 118 breeds for analyses of mortality variables. None of the statistical outliers were influential points, so we did not remove them. We also did not remove influential points identified in only one of our analyses, because we wanted to keep the data set constant for our mortality analyses to ensure comparability of R2 values and β coefficients. Table S3 gives an overview of outliers and their direction as well as influential points and their effects on the model statistics. None of these decisions affected our basic results. Because sample sizes varied widely among breeds, we also checked for the need to use weighted regression using the DuMouchel-Duncan test of model change with weights. None of the tests indicated a significant change for the weighted model.
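The outlier and influence diagnostics named above can be sketched for a single-predictor least-squares fit. This is a simplification for brevity: the models in this study had several predictors, and the Bonferroni-corrected p-values are omitted because they require a t-distribution function that the Python standard library does not provide. The function name and the example data are hypothetical.

```python
import math


def regression_diagnostics(x, y):
    """Leverage, studentized residuals and Cook's distance for y ~ a + b*x."""
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Internally studentized: residual scaled by its estimated standard error.
    internal = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(residuals, leverage)]
    # Externally studentized: same, but with the point left out of the variance.
    external = [r * math.sqrt((n - p - 1) / (n - p - r ** 2)) for r in internal]
    cooks = [r ** 2 / p * h / (1 - h) for r, h in zip(internal, leverage)]
    return {
        "slope": slope,
        "intercept": intercept,
        "leverage": leverage,
        "studentized": external,
        "cooks_distance": cooks,
    }


# Hypothetical data: the last point lies far from the others in x and off the trend.
diag = regression_diagnostics([1, 2, 3, 4, 5, 10], [1.1, 1.9, 3.2, 3.9, 5.1, 2.0])
most_influential = diag["cooks_distance"].index(max(diag["cooks_distance"]))
```

A point can have a large studentized residual without being influential, and vice versa, which is why outliers and influential points are treated separately in the text.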

Due to a shared phylogenetic history, dog breeds are not strictly independent data points. To ensure that our results are robust with respect to breed relatedness, we repeated our analyses, employing phylogenetic comparative methods that controlled for breed-level identity by descent (studies that used this approach for research on dogs include [52,53,54]). First, we used a mixed model approach (package EMMREML [55]). Second, we used a Bayesian regression model framework (package brms [56, 57]). Although models did converge with this latter approach, results need to be interpreted with caution since for most models a number of divergent transitions and low effective sample sizes were reported. Nevertheless, estimated model parameters were very similar to the EMMREML approach. Because genetic information was not available for all breeds for which we had data on mortality, size, and genetic diversity, only 96 breeds could be included in these analyses. After accounting for breed relatedness, the results were qualitatively the same and quantitatively very similar to those based on the analyses including all breeds (Tables 2, S4, and S5).

Table 2 Statistical effects of body size, heterozygosity, and the percentage change in registration numbers (% change) on our parameters of interest.

All statistical analyses were performed in R version 4.0.3 [58]. Effect plots showing partial residuals were created using package jtools [59].

Results

Table 2 provides statistics for all regressions performed and Table S1 shows the breed-specific values for all variables used in the analyses.

Bias in lifespan estimates

Lifespan estimates for the dataset including only completed birth cohorts were higher than those based on the unrestricted data set for all breeds, with the difference in means being statistically significant (paired t test: mean bias = 1.17 years, t(118) = 24.227, P < 2.2e-16). The estimated bias ranged from 0.02 to 3.0 years (Table S1) and was statistically linked to our variables of interest. It increased significantly with decreasing size and increasing heterozygosity (Fig. 1a,b). As expected, it also increased significantly with the percentage change in registration numbers, which explained 34.2% of its variance (Table 2, Fig. 1c). This shows that the mean lifespan of small and more heterozygous breeds as well as those rising in popularity is underestimated. The percentage change in registration numbers was significantly larger for breeds with higher heterozygosity (Table 2, Fig. 1d).
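For reference, the paired t statistic reported here is the mean of the per-breed differences divided by its standard error, with n − 1 degrees of freedom (118 for 119 breeds). A minimal sketch, using hypothetical numbers rather than the study data:

```python
import math
import statistics


def paired_t(corrected, uncorrected):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [c - u for c, u in zip(corrected, uncorrected)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / se, n - 1


# Three hypothetical breeds with differences of 1, 2 and 3 years.
t_value, df = paired_t([11.0, 12.0, 13.0], [10.0, 10.0, 10.0])
```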

Fig. 1

(a) The estimated bias decreases with body size (P = 0.040, partial R2 = 3.6%), (b) increases with heterozygosity (P < 0.001, partial R2 = 11.3%), and (c) with the percentage change in registration numbers (P < 0.001, partial R2 = 34.2%). (d) Breeds with higher heterozygosity showed a stronger percentage increase in registration numbers (P < 0.001, partial R2 = 51.2%). Depicted are predicted values with 95% confidence intervals and partial residuals accounting for the effect of the other variables in the regression models (n = 119)

Mean lifespan

For uncorrected mean lifespan, which varied between 6.4 and 12.3 years (Table S1), the model predictors accounted for 51.3% of the variance (Table 2). While size was a statistically significant predictor, heterozygosity was not significantly associated with this parameter (Fig. 2a, b, Table 2). Corrected mean lifespan varied between 7.0 and 13.9 years (Table S1). When the bias introduced by incomplete birth cohorts was removed, corrected mean lifespan decreased significantly with increasing body size, and increased significantly with increasing heterozygosity (Fig. 2c, d, Table 2). The two factors together explained 61.0% of the variance in mean lifespan across breeds. Judged by partial R2 values, size was a relatively more important predictor than heterozygosity (Table 2). For each additional 1 kg of body mass of the averaged breed standard mass, mean breed lifespan is estimated to decrease by 25.6 days, and for each additional one percent of heterozygosity, a breed is predicted to gain 30.7 days of lifespan. Over the observed size range, that translates into a difference of 4.5 years in predicted mean lifespan from the smallest to the largest breeds, and a difference of 1.7 years from the lowest to the highest observed heterozygosity value.
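As a rough consistency check, the two slopes and the two predicted gaps can be converted back into the ranges of body mass and heterozygosity they imply. This is a back-calculation from rounded values, assuming 365.25 days per year; the resulting ranges are not stated in the text and should be read as approximate.

```latex
\Delta_{\text{mass}} \approx
  \frac{4.5\ \text{years} \times 365.25\ \text{days/year}}{25.6\ \text{days/kg}}
  \approx 64\ \text{kg},
\qquad
\Delta_{\text{het}} \approx
  \frac{1.7\ \text{years} \times 365.25\ \text{days/year}}{30.7\ \text{days per percentage point}}
  \approx 20\ \text{percentage points}
```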

Fig. 2

(a) Uncorrected mean lifespan decreases with body size (P < 0.001, partial R2 = 51.0%), (b) but it lacks a statistically significant association with heterozygosity (P = 0.133, partial R2 = 2.0%). (c) Corrected mean lifespan decreases with size (P < 0.001, partial R2 = 57.6%) and (d) increases with heterozygosity (P < 0.001, partial R2 = 14.1%). Depicted are predicted values with 95% confidence intervals and partial residuals accounting for the effect of the other variables in the multiple regression model (n = 118)

Causes of death

Old age mortality

The average lifespan of dogs reportedly dying of old age varied between 9.3 and 14.9 years across breeds (Table S1). On average, old age lifespan was 1.7 years higher than mean lifespan across breeds, and dogs dying of disease died on average 2.9 years younger than those dying of old age. The findings observed for old age lifespan were similar to those for corrected mean lifespan. We found that old age lifespan was significantly negatively correlated with body size and positively correlated with heterozygosity (Table 2, Fig. S1a, b). The percentage of dogs dying of old age varied between 16.8 and 62.5% across breeds (Table S1) and was also negatively correlated with body size and positively correlated with heterozygosity (Fig. 3a, b, Table 2). The full model explained 39.5% of the variance in the percentage old age mortality (Table 2), with heterozygosity explaining 15.1% of the variance, which increased to 22.6% when we removed two influential points, Rough Collie and Smooth Collie. We found that the predicted percentage old age mortality increased by 1% for every decrease of 3 kg in breed size or, respectively, each increase of 1.4% in heterozygosity. Over the observed size range, that translates into a difference of 20.7% in the predicted percentage old age mortality from the smallest to the largest breed, and a difference of 14.1% from the lowest to the highest observed heterozygosity value.

Fig. 3

(a) The percentage of old age mortality decreases with body size (P < 0.001, partial R2 = 31.1%) and (b) increases with heterozygosity (P < 0.001, partial R2 = 15.1%). (c) The percentage of cancer mortality increases with body size (P < 0.001, partial R2 = 12.2%), (d) but there was only a weak trend for it decreasing with heterozygosity (P = 0.086, partial R2 = 2.5%). Depicted are predicted values with 95% confidence intervals and partial residuals accounting for the effect of the other variables in the multiple regression model (n = 118)

Cancer mortality

Cancer lifespan varied between 6.7 and 13.0 years (Table S1). It was on average 0.7 years lower than mean lifespan and 2.4 years lower than old age lifespan across breeds. With respect to size and heterozygosity, the pattern for cancer lifespan mirrored that for the other lifespan parameters: cancer lifespan decreased significantly with body size and increased significantly with heterozygosity (Table 2, Fig. S1c, d). The percentage cancer mortality varied between 5.8 and 53.7% (Table S1) and the full model explained 13.5% of the variance in the percentage of cancer mortality. It increased significantly with body size and tended to decrease with heterozygosity (Fig. 3c, d, Table 2). However, the trend was not replicated in the reduced data set correcting for breed relatedness, and removing an outlier (Flat-Coated Retriever) erased it. Each 1 kg of additional size increased the predicted percentage of cancer mortality by a bit more than 0.2%, which translates into a difference of 12.5% in the predicted percentage cancer mortality over the observed range of breed sizes.

Discussion

Our findings show that both body size and genetic diversity are important factors in shaping lifespan across dog breeds, with body size exerting a stronger effect than genetic diversity across all mortality measures. Genetic diversity played a larger role in old age mortality than in cancer mortality. However, both of these owner-reported cause of death categories are very heterogeneous, so more detailed analyses are needed to improve the accuracy and interpretation of the quantitative estimates of these associations.

Mean lifespan

Our study shows that the bias in lifespan estimates due to right-censored age-at-death data as well as changes in population size and age structure can be substantial, up to 3 years in our set of breed populations. Both of our variables of interest affected the size of the estimated bias, with heterozygosity showing a tighter association than body size. The larger bias in small and more heterozygous breeds is consistent with the fact that right-censoring bias is expected to be higher in long-lived populations [34]. As expected, we saw a strong correlation between population increase and the estimated bias, consistent with other studies that found low lifespan in rapidly growing breed populations [35, 60, 61]. Interestingly, change in breed popularity was also linked to breed average heterozygosity, suggesting that heterozygosity affected right-censoring bias in our population via two mechanisms: increased longevity and increased popularity. Factors that might contribute to the increase in popularity of more genetically diverse breeds are that newly recognized breeds become fashionable over time and/or people might switch away from very inbred breeds due to health concerns. The causality of the relationship might also be reversed, with breeds losing genetic diversity as they become rarer. The fact that the bias in lifespan estimates was positively correlated with genetic diversity likely explains why we could not detect its expected effect on lifespan across dog breeds in an earlier study [11]. Once this bias is largely corrected for, not only size, but also genetic diversity emerges as an important factor in shaping lifespan across dog breeds.

The strong negative effect of body size on breed-specific mean lifespan we document here is consistent with earlier studies [8,9,10,11, 31, 62]. However, we found a larger percentage of variance in mean lifespan explained by size compared with earlier studies (this study 58%, 40% [10], 44% [9]). Comparing the partial R2 of size for uncorrected and bias corrected mean lifespan suggests that this difference is due in part to the fact that we corrected for right-censoring bias. Additionally, our high explanatory power might have been influenced by the fact that we took genetic diversity into account as an additional predictor, that we used a relatively larger sample size for many breeds, and that we considered a more representative sample population compared to, e.g., teaching hospital data. Despite size being the strongest determinant of lifespan in dogs, we still do not fully understand the proximate mechanisms underlying this relationship. The insulin-like growth factor (IGF-1) pathway, a known modulator of lifespan in many model organisms, likely contributes to the lifespan advantage of small dogs [63] which have been shown to have lower IGF-1 levels than their large counterparts [64,65,66]. More recent studies suggest that glycolytic metabolic rates [67], mitochondrial bioenergetics and thermoregulation [68] or tryptophan metabolism [69] might play a role in lifespan differences of small and large dogs. Jimenez [70] reviews the physiological mechanisms underlying canine aging.

More genetically diverse breeds outlived their more inbred counterparts, with each percent in median heterozygosity adding one month in predicted mean lifespan when accounting for body size. This finding is well in accordance with expectations based on population genetic theory and in line with previous studies reporting a lifespan advantage of mixed breed dogs [7, 11, 31] as well as a negative effect of individual pedigree-based inbreeding coefficient on lifespan documented in at least one breed (Golden Retrievers [11], but see [71] for Irish Wolfhounds). Similarly, Urfer et al. [12] found that breeds with an inbreeding coefficient below the median enjoyed a lifespan that was 3 to 6 months longer than those above the median inbreeding coefficient, but, surprisingly, they did not detect a lifespan advantage of mixed breed dogs. Across the range of observed genetic diversity in our breed sample, the effect of genetic diversity translated into an average lifespan gap of about 1.7 years between the least and the most genetically diverse breed, which is at the higher end of the reported estimates of the lifespan gap between mixed breed and purebred dogs (1.8 years [7], 1.2 years [11, 31]). The relative importance of genetic diversity was smaller than that of body size, but it is possible that we are underestimating its explanatory power as measured by partial R2. First, the effect of inbreeding depression on lifespan also has a random component. Because of founder effects and subsequent breeding events (e.g., a specific disease-linked mutation in a popular sire), different deleterious alleles linked to mortality become enriched in the various breed populations and the ages at death for those causes will vary. For example, dilated cardiomyopathy in Dobermans kills dogs at a younger age than degenerative myelopathy in Corgis (see also [72]). Second, the population on which our mortality parameter estimates are based is not the same as, but rather ancestral to, the one that has been genotyped for the heterozygosity estimates. This might introduce a variable amount of error in the median heterozygosity estimates across breeds.

“Dying of old age”

Old age was the most reported cause of death in our study population. Therefore, it is not surprising that lifespan patterns in this category largely match those for mean lifespan. Interestingly, the relative importance of size was somewhat smaller and the slope less steep than for mean lifespan. The expected lifespan gap across the range of sizes decreased to 3.4 years compared to 4.5 years for mean lifespan. Rapid growth has been conceptually linked to a higher probability of developmental errors and consequently “jerry-built” bodies [73, 74]. Kraus et al. [9] hypothesized that this might cause a higher variance in quality in large dogs compared to small ones. Hence, while on average the bodies of large dogs break down earlier, there will be some individuals that escape a high burden of errors. The size-related pattern we observed in our study (i.e., a lower percentage of large dogs dying of old age) combined with a somewhat attenuated effect of size on the lifespan of those dogs fits this idea well. However, so far there is no clear empirical evidence of a higher burden of replication error in large dogs. There is some indication for lowered production of reactive oxygen species (ROS) in cells of long-lived breeds [67, 68], but the link to size is less clear [70]. Contrary to expectations, a higher rate of damage in circulating lipids was found in small compared to large dogs [52].

Other factors might also contribute to the observed pattern. Large dogs with a similar morbidity burden to small dogs might be more difficult to manage and therefore the decision for euthanasia might be reached earlier, which could also contribute to the body size effect on old age lifespan. Moreover, while people are generally aware of the fact that large dogs have lower life expectancies, they might still be less likely to consider “old age” as the appropriate cause of death category, because ages of death below about 10 years might still seem young for dogs in general. This might decrease the old age mortality percentage for larger dog breeds.

Increased genetic diversity was positively correlated with old age lifespan as well as the percentage of dogs dying of old age. The latter makes intuitive sense. Founder effects during breed creation and subsequent breeding practices have inadvertently led to an enrichment of deleterious mutations in the closed populations of purebred dogs, resulting in a multitude of breed-specific disease predispositions [75, 76]. Inbreeding depression is thought to be mainly caused by deleterious recessive alleles becoming unmasked with increased homozygosity [26]. Consistently, while mixed breed dogs were found to be more likely to carry a common recessive disease-linked mutation, purebred dogs were more likely to be genetically at risk, i.e., homozygous for such mutations [77]. Hence, we would expect that we are more likely to find homozygosity of recessive disease-associated alleles in breeds with a low average heterozygosity, so more individuals are expected to succumb to disease rather than die of age-related morbidities. The higher level of morbidity in more inbred dog breeds is in line with this expectation [32].

The shorter lifespan of the aged population in less genetically diverse breeds emphasizes that inbreeding not only affects canine lifespan via an increased likelihood of dying of inherited diseases, but also more generally in a cumulative way. Indeed, Marsden et al. [33] found that weakly deleterious mutations contributed most of the additive genetic load in dogs. There is some evidence that mixed breed dogs age more slowly than purebred dogs when size is accounted for [11], but to confirm that the rate of aging is negatively associated with genetic diversity across breeds we need to construct age-specific mortality and morbidity trajectories, which was outside the scope of this study. Consistent with an effect of genetic diversity on age-related morbidity, a study of the Hungarian dog population based on owner questionnaires found that purebred dogs started to suffer from health problems at an earlier age than mixed breed dogs, suggesting that mixed breed dogs might have a longer healthspan than their purebred counterparts [78]. Intriguingly, the explanatory power of genetic diversity on old age lifespan was larger than for mean lifespan or cancer death lifespan, perhaps because the variable age at onset of diseases associated with mortality adds variance to the relationship compared to the effect of genetic diversity on aging per se.

The group of dogs dying or being euthanized because of age-related morbidities rather than an unequivocal disease process can provide an especially valuable comparative model for human research into morbidity and mortality patterns in old age [4, 5]. Across breeds, dogs that reportedly died of old age lived on average 3 years longer than those that succumbed to a specified disease, albeit with substantial variation around this mean value. Studies of human centenarians have suggested that some people are able to live a particularly long and healthy lifespan as “escapers,” individuals that live without chronic disease for their entire lives, while others are “survivors,” managing to not succumb to chronic disease [79]. Ongoing long-term longitudinal studies such as the Dog Aging Project [80] and the Golden Retriever Lifetime Study [81] might tell us whether the same is true in dogs.

Dying of cancer

The lifespan patterns for dogs dying of cancer mirrored those for mean lifespan, showing that cancer deaths contribute to the negative effect of size and the positive effect of genetic diversity on mean lifespan. Just as in humans and other species, canine cancer is typically an age-related disease [41, 82, 83], likely because the aging process and cancer development share several important pathways (reviewed in [63, 82]). Hence, the increased rate of aging in large as well as purebred dogs [9, 11] might be causally linked to the shorter lifespan of dogs of larger and genetically less diverse breeds that die of cancer. One of the proximate mechanisms involved might be the growth hormone/insulin-like growth factor 1 (GH/IGF-1) pathway, since small dogs have lower serum IGF-1 levels than large dogs [64,65,66]. Some dwarf mice with disruptive mutations in this pathway show increased lifespan and a later onset of cancer (reviewed in [82]). There is also intriguing evidence of cancer protection in a human population with congenital IGF-1 deficiency [84]. Our finding that size affects the lifespan of dogs dying of cancer is consistent with earlier evidence suggesting that the onset of cancer might occur later in smaller breeds [85].

Across our breed sample, dogs dying of cancer lived on average 0.9 years longer than those dying of other diseases, consistent with cancer being a disease that typically strikes at an advanced age, but they still lost on average 2.4 years of life compared to those reportedly dying of old age. The evolutionary model of cancer suggests that late-onset cancers are often sporadic in origin, i.e., due to somatic mutations alone, while inherited genetic variants are more likely to contribute to the development of early-onset cancers [86]. Due to founder effects, genetic drift and breeding practices, many breeds are at an increased risk for specific types of cancer with variable ages of onset [37, 40, 43]. So, while in some breeds sporadic cancers with little lifespan cost might predominate, in others, these genetic predispositions lead to higher lifetime costs (e.g., histiocytic sarcoma in Flat Coated Retrievers and Bernese Mountain Dogs). This variation might also contribute to the somewhat lower explanatory power of genetic diversity for cancer mortality lifespan compared to mean or old age lifespan.

With respect to the proportional cancer mortality across breeds, we could only identify size as a statistically significant risk factor. Whereas across species, cancer risk is largely independent of size, an observation known as Peto’s Paradox [87,88,89,90], the evolutionary model of cancer predicts that within species, cancer risk increases with size, simply because of a larger cell number and higher cell division rate in larger individuals [86, 91, 92]. This should also hold true for dog breeds, since likely not enough time has elapsed on an evolutionary timescale for natural selection to adjust mechanisms of cancer resistance accordingly [88]. Consistent with this, Michell [1] documented higher cancer mortality in large compared to small breeds and Fleming et al. [45] showed that larger size is linked to an increased risk of death due to neoplasia in a large sample of dogs dying at veterinary teaching hospitals (see also [83]). Our findings confirm this relationship for a more representative population of purebred dogs. This within-species effect of size has also been documented for humans for various cancers [93]. Additionally, mice with mutations disrupting the GH/IGF-1 pathway are typically smaller and tend to have a decreased cancer incidence (reviewed in [63, 82]), suggesting that this pathway is also implicated in the decreased cancer mortality risk of small dogs. Increased glycolytic metabolic rates might also contribute to the predisposition of larger dogs to cancer [67, 70]. For a more complete understanding of the positive correlation between size and cancer risk in dogs, it will be necessary to distinguish between cancer types. For example, while size has a major effect on risk of osteosarcoma (reviewed in [94]), we know less about the importance of size for other types of canine cancer. Additionally, analysis of the association between size and cancer risk within breeds could help us to separate the relative effects of size and breed.

We found only weak, non-robust support for an increase in cancer death risk with decreasing genetic diversity in our data set. This is in line with the lack of strong support for a mixed breed advantage in cancer death risk ([31, 48], but see [47]). Still, several factors might have masked a more clear-cut effect of genetic diversity on proportional cancer mortality. While suggestive evidence exists that low genetic diversity and inbreeding are linked to an increased cancer risk (reviewed in [95]), the effect does not necessarily have to be linear. Different deleterious alleles rise to high frequency in different populations, with oncogenic mutations appearing to be enriched only in some breeds (reviewed in [37, 40, 42, 44]). Hence, we might expect to see a linear effect of inbreeding level on cancer risk within breeds (e.g., Golden Retrievers [19]), but not necessarily across breeds. Furthermore, the effect of genetic diversity on cancer risk might be tumor specific and not evident in overall cancer risk ([95], e.g., mammary cancer [96], lymphoma [19]). A methodological caveat is that we were not able to account for competing hazards in our analyses, so if an early-onset disease kills many dogs in a breed, a high cancer risk might be masked. Finally, the quality of our cause of death data is not very high. The causes of death in the koiranet database are all owner-reported (i.e., no validation of the diagnosis necessary), and the categories provided are not fully mutually exclusive. In particular, when older dogs have cancer, some owners might not pursue a definitive diagnosis for the actual cause of death and enter their death as due to old age rather than cancer.
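
The competing-hazards caveat above can be made concrete with a small simulation. The Python sketch below is a toy model, not the analysis used in this study: every dog draws a latent age at cancer death and a latent age at death from other late-life causes, and dies of whichever comes first. A hypothetical early-onset disease with a constant hazard is then added for a second "breed" while the cancer hazard is left unchanged. All distributions, parameter values and names are assumptions chosen purely for illustration.

```python
import random


def proportional_cancer_mortality(early_disease_rate, n_dogs=50_000, seed=7):
    """Fraction of simulated dogs whose recorded cause of death is cancer.

    early_disease_rate is the yearly hazard of a hypothetical early-onset
    disease; 0 means the disease is absent from the breed.
    """
    rng = random.Random(seed)
    cancer_deaths = 0
    for _ in range(n_dogs):
        # Latent ages at death (years); Weibull scale and shape are assumed.
        t_cancer = rng.weibullvariate(12.0, 4.0)
        t_other = rng.weibullvariate(14.0, 4.0)
        if early_disease_rate > 0:
            t_early = rng.expovariate(early_disease_rate)
        else:
            t_early = float("inf")
        # The recorded cause of death is whichever latent event occurs first.
        if t_cancer < min(t_other, t_early):
            cancer_deaths += 1
    return cancer_deaths / n_dogs


if __name__ == "__main__":
    print("no early-onset disease:  ", proportional_cancer_mortality(0.0))
    print("with early-onset disease:", proportional_cancer_mortality(0.15))
```

In this toy setting the proportion of cancer deaths drops once the early-onset disease is present, even though the underlying cancer hazard is identical in both runs. This is the masking effect that proportional mortality cannot separate from a genuinely lower cancer risk.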

Study limitations

There are two main solutions to the problem of biased lifespan estimates due to incomplete birth cohorts. One is using statistical methods developed for right-censored data; the other is to eliminate death data from individuals coming from birth cohorts with a significant percentage of dogs still alive [34]. Urfer et al. [12] used the first method to investigate factors affecting lifespan based on data from primary care US veterinary hospitals. However, if the censoring itself is not random with respect to the outcome, this might lead to biased estimates as well, which might explain the unusually long life expectancies reported by Urfer et al. [12]. We took the alternative approach and included only completed birth cohorts in our estimates. While this avoids the problem explained above, it is not without potential problems of its own. We inevitably miss more recent changes in breed-specific mortality resulting from changing prevalence of diseases, such as those due to increasing allele frequencies of deleterious mutations because of recent inbreeding or a decrease in these same alleles due to better breed management via genetic testing. Another caveat comes from the fact that the population on which our mortality parameter estimates are based is not the same as, but mainly ancestral to, the one that has been genotyped for the genetic diversity estimates used.
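
To illustrate why incomplete birth cohorts bias lifespan estimates, and how restricting the data to completed cohorts removes that bias, the following Python sketch simulates a death registry. It is a minimal illustration under stated assumptions, not a reproduction of our pipeline: lifespans are drawn from a hypothetical normal distribution (mean 11 years, SD 3 years), only deaths occurring before the end of data collection are recorded, and a cohort is treated as "completed" if it was born at least an assumed maximum lifespan before that date. All names and parameter values are invented for the example.

```python
import random

TRUE_MEAN = 11.0           # assumed true mean lifespan (years)
TRUE_SD = 3.0              # assumed standard deviation (years)
STUDY_END = 2021.0         # assumed end of data collection
MAX_LIFESPAN = 20.0        # assumed age by which a cohort is essentially extinct


def simulate_recorded_deaths(n_dogs=20_000, first_year=1990.0,
                             last_year=2020.0, seed=1):
    """Return (birth_year, age_at_death) for dogs that died before STUDY_END."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_dogs):
        birth = rng.uniform(first_year, last_year)
        lifespan = max(rng.gauss(TRUE_MEAN, TRUE_SD), 0.1)
        # Dogs still alive at STUDY_END leave no death record at all.
        if birth + lifespan <= STUDY_END:
            records.append((birth, lifespan))
    return records


def mean_age_at_death(records):
    return sum(age for _, age in records) / len(records)


def completed_cohorts_only(records):
    """Keep only dogs born early enough that their cohort has died out."""
    return [(b, age) for b, age in records if b + MAX_LIFESPAN <= STUDY_END]


if __name__ == "__main__":
    deaths = simulate_recorded_deaths()
    print("all recorded deaths:   ", mean_age_at_death(deaths))
    print("completed cohorts only:", mean_age_at_death(completed_cohorts_only(deaths)))
```

Using all recorded deaths underestimates the mean, because recent cohorts contribute only their short-lived members, whereas the completed-cohort estimate recovers the assumed true mean at the price of discarding the most recent birth years.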

We used a large, well-used public database, which allowed us to include more than 40,000 dog deaths while avoiding the problem of case selection bias inherent in earlier large-scale studies on dog mortality based on teaching hospital or insurance data [9, 11, 45, 85, 97]. Because of the large sample size and the permanent accessibility of the database for owners, our study population is likely also more representative than those based on short-term owner surveys. Nevertheless, there are several potential sources of bias that we were not able to control for in our analyses. Desexing has been linked to extended lifespan, especially for females ([31, 98], but see [99, 100]), and to increased cancer risk (reviewed in [101, 102]). However, desexing dogs is not a common practice in Finland [103], rendering it unlikely that this is a major source of error in our analyses. One major drawback shared by many population-based studies on dog mortality is the owner classification of the cause of death data, resulting in a higher degree of uncertainty for this part of our study. Finally, as in most dog mortality studies published so far, most dogs included in our analyses likely were euthanized (estimated at 86% in UK dogs [31]) and euthanasia decisions might partly depend on size.

Conclusions

Our large-scale study shows that size affects lifespan across dog breeds not only via decreasing the age at which dogs die from two major causes of death, but also by affecting the proportional mortality due to “old age” and cancer. Many of the large breeds have become even larger over the last century (compare today’s breed weights to those in, e.g., [104,105,106]). Our findings suggest that reversing this trend would decrease the probability of these dogs dying prematurely from disease, and hence improve lifespan as well as healthspan. Correcting for right-censoring bias, we were able to provide strong evidence that genetic diversity does indeed impact lifespan across dog breeds. As with size, genetic diversity affected the age at death from “old age” and cancer as well as the proportion of old age deaths. These results strongly suggest that many breeds would not only benefit from managing the existing breed-wide genetic diversity, but also from outcross programs to increase genetic diversity [107, 108]. Population genetic simulations by Windig and Doekes [109] emphasize that continuous, low-level outcrossing is needed to minimize inbreeding rate in small populations, typical for many dog breeds. The analyses presented here of two important sources of mortality shed some light on which mortality components behind lifespan are affected by our focal variables, though studies with improved diagnostic validity are clearly needed to confirm our findings and suggest specific underlying mechanisms. The highest quality data on mortality and morbidity come from prospective cohort studies, but only a few small-scale ones have been conducted so far in dogs [110, 111]. Large-scale longitudinal studies such as the Golden Retriever Lifetime Study [81] and the Dog Aging Project [80] have the potential to substantially improve our understanding of the determinants and underlying mechanisms shaping mortality and morbidity patterns in our canine companions.