1 Introduction

Nowadays it is common to subscribe to some type of guarantee or assistance contract to protect our goods. The automotive business is no different. When one buys a vehicle, one is obliged by law to take out an insurance contract, but some brands also offer the possibility of subscribing to an assistance contract. An assistance contract can be defined as an integrated package of maintenance and repair services in which problems such as defects or failures are rectified by an external service provider for an agreed period of time. The service provider charges a price for this service, usually collected as a fixed monthly fee.

In recent years, assistance contracts have received significant attention because they increase profits for service providers and reduce risk for their subscribers. However, in a competitive environment, customers compare offers from multiple service providers, looking for the best deal. Service providers that offer competitive contracts may therefore expect a larger market share. With this in mind, service providers constantly seek an aggressive strategy that secures objectives such as maximizing the number of customers or the global financial return of their business.

This paper presents a mathematical model for estimating the prices of assistance contracts of the NORS Group (Auto-Sueco and Galius companies) that takes into account the risk associated with the estimation of the costs. NORS is a company in the automotive industry that focuses on the import, sale and assistance services of heavy-duty vehicles of the Volvo Trucks and Renault Trucks brands. In fact, NORS is considered a specialist in heavy-duty vehicle assistance. Auto-Sueco and Galius are the main importers and service providers for the Volvo Trucks and Renault Trucks brands, respectively. These companies sell assistance contracts in which the maintenance of each vehicle is included and fully planned according to the activity and function of the newly acquired truck. When designing a new contract, the business manager plans the whole maintenance schedule, taking into account the most appropriate components for each truck (e.g. type of oil, specific filters). In general, these contracts include all repairs, except those due to accidents or other events that may be covered by regular insurance. Consequently, the covered repairs may occur at any time, and their values vary widely, which accounts for the risky part of the contract. For these reasons, the proposed model treats, for each period of the contract, both the repair costs and the maintenance costs as independent random variables and fits adequate probability distributions to them according to the historical data available in the company. Once the probability distributions have been fitted, the empirical probability distribution of the total contract cost is obtained by simulating random samples from the fitted distributions. The total contract cost is thus modelled as a function of the period (time or kilometres) and of the quantiles of interest of the probability distributions. To define the final value of the contract, the product manager can add a profit margin to the estimated value of the costs.

Recent literature was reviewed to understand current and future practices in cost estimation models for maintenance and service contracts (see, for example, [1, 11] and [4]). Those authors report different perspectives on cost estimation, both qualitative and quantitative (with stochastic or deterministic approaches). Models for component and system level degradation and assessment, along with life cycle “big data” analytics, were referred to as the two most important areas of knowledge and skill. The solution presented in this paper for cost estimation in assistance contracts is based on business knowledge and on the literature. According to the business knowledge, the costs were split into two mutually exclusive groups: the maintenance costs, which are related to the regular maintenance programmed by the constructor, and the repair costs, which are all the costs not included in that set. Both groups were modelled with respect to the regular maintenance intervals. By default, both groups were modelled by parametric methods, but for data with nonstandard features kernel methods were applied, as the literature generally suggests.

Section 2 describes and analyses some of the historical data available in the group. Section 3 describes the methodology followed to estimate the overall cost of the contracts. The definition of the contract price is presented in Sect. 4, while Sect. 5 presents the first results of applying this model to the historical data available in two companies of the NORS group. Finally, Sect. 6 presents the main conclusions of this approach as well as some possible further lines of research related to this subject.

2 Historical Data

The NORS Group is a Portuguese group whose vision is to be a world leader in transport solutions, construction equipment and agriculture equipment. Its origins lie in 86 years of history and activity in Portugal, which started with the representation of the Volvo brand in 1933. In 2017, the NORS Group was present in 17 countries across 4 continents, with more than 3,700 employees and sales exceeding 1.6 billion Euros. For this work, we were granted access to the relevant financial data as well as to the maintenance and repair data related to 1478 and 1520 assistance contracts from Company A and Company B, respectively, signed in the last 4 to 5 years. For confidentiality reasons, the monetary units used in this work were rescaled and only data prior to 2018 were considered. The data set provided for each contract includes its temporal duration, the identification of the truck, the truck model, as well as all the interventions applied to the vehicle that are chargeable to the contract. For each intervention, a repair sheet was provided that included the repair date, the mileage of the truck, the services and parts applied and their costs. In this paper, which introduces an initial approach to modelling activity within the group, we disregard the technical information about which parts and services were applied in each repair and concentrate only on their financial costs.

To give a first view of the data, a subset is represented graphically in Fig. 1. On the left, the cumulative repair costs for each contract in the database are plotted against the contract temporal length. On the right, the same costs are plotted against the vehicle mileage at the time of each repair. Each line represents a different contract (vehicle), and all the contracts represented relate to the same truck model. It can be seen that the variability of the costs increases along the x-axis and is wider on the temporal scale (a) than on the mileage scale (b).

Fig. 1 Cumulative total costs versus months (a) and cumulative total costs versus km (b)

For each truck model, the regular maintenance intervals and their operations are preset by the vehicle constructor, and their execution is mandatory according to the contract terms (e.g., oil changes, filter or timing belt replacements, regular inspections). In addition, there are other replacements and services that may be carried out under the contract agreement but are not part of the regular maintenance scheme defined by the constructor (e.g., part failures, consumables). In this work we therefore classified each cost related to the maintenance contract into one of two groups: the maintenance costs, which are related to the regular maintenance programmed by the constructor, and all the remaining costs, which we label as repair costs. Additionally, we consider that the estimated annual mileage travelled by each vehicle/contract and the regular maintenance intervals of the model are known inputs, and they will be considered in the definition of the contract’s price.

Figure 2 shows the cumulative maintenance costs (left) and cumulative repair costs (right) against the vehicle mileage. The first subplot suggests a strong positive linear relationship between the cumulative maintenance costs and the mileage of the truck (\(r^2 = 0.92\)). Concerning the repair costs, the values are small in the first kilometres travelled by the truck and increase very quickly with the mileage of the contract. Moreover, the orders of magnitude of the maintenance and repair costs are significantly different.

Fig. 2 Cumulative maintenance costs versus vehicle mileage (a) and cumulative repair costs versus vehicle mileage (b)

As stated before, maintenance costs are related to the spare parts and services associated with each periodic maintenance recommended by the vehicle constructor for a given truck model. As will be properly defined in (1), for each truck model the developed model assigns one maintenance cost random variable to each mileage interval; each interval is centred at a multiple of the maintenance periodicity and extends half of the periodicity to each side. We consider the same mileage partition for the repair costs, adding an extra initial interval that covers [0, “half maintenance periodicity”]. This interval is not needed for modelling the maintenance costs, as the trucks do not have any maintenance costs in that initial period. Figures 3 and 4 represent the maintenance and repair costs as a function of the mileage for a given truck model. The periodic maintenance for this type of truck occurs every 75,000 km, and each interval is marked in the figures.

Fig. 3 Maintenance costs vs mileage for a model of truck with maintenance periodicity of 75,000 km

Fig. 4 Repair costs vs mileage for a model of truck with maintenance periodicity of 75,000 km

3 Estimating the Costs of Contracts

This section describes the development of a conceptual cost model based on a probabilistic approach. The contract includes provision for maintenance activities and repair actions; therefore, the total cost of the contract is given by the sum of the costs of each intervention in the different mileage intervals.

3.1 Total Cost

The duration of an assistance contract may depend on time, on the number of kilometres travelled by the vehicle, or on both. A previous analysis of the historical data showed that, for the NORS clients, the mileage is time dependent and the costs are highly correlated with the mileage. For this reason, in this work the costs are modelled as a function of the number of kilometres. As found in Sect. 2, the costs have a large variability in each interval, and both the maintenance costs and the repair costs take very different values within the same interval. On the other hand, the previous analysis shows that the Pearson correlation coefficient for the costs of two consecutive intervals is low (less than 0.5), indicating the absence of a “strong” positive correlation between the costs of two consecutive intervals. This supports our assumption that the total cost of the contract for a truck of model k can be obtained as the sum of a set of independent random variables that represent the costs of each intervention (maintenance and repair) in the different intervals of kilometres:

$$\displaystyle \begin{aligned} C_{n}=\sum_{i=1}^{n} M_{i} + \sum_{i=0}^{n} R_{i} \,, \end{aligned} $$
(1)

where \(M_i\) is a random variable that represents the maintenance costs in the interval \(I_i\), \(i = 1, \dots, n\), and \(R_j\) is a random variable that represents the repair costs in the interval \(I_j\), \(j = 0, \dots, n\). The intervals of kilometres are \(I_{i}=\left ]\left (i-0.5\right )\times A_k,\:\left (i+0.5\right )\times A_k\right ]\), \(i = 1, \dots, n\), and, for repairs only, \(I_{0}=\left ]0, \:0.5\times A_k\right ]\). \(A_k\) is the periodicity of the maintenance of a truck of model k.
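As an illustration of this partition, the following minimal Python sketch maps a mileage reading to the index of the interval that contains it. The function name `interval_index` is hypothetical and is not part of the implementation described in this paper; the 75,000 km periodicity is the one used in the figures of Sect. 2.

```python
import math


def interval_index(km: float, periodicity_km: float) -> int:
    """Index i of the interval ](i - 0.5) * A, (i + 0.5) * A] containing km.

    Index 0 denotes the extra repair-only interval ]0, 0.5 * A].
    """
    if km <= 0 or periodicity_km <= 0:
        raise ValueError("km and periodicity_km must be positive")
    # The intervals are closed on the right, hence the ceiling.
    return max(0, math.ceil(km / periodicity_km - 0.5))


A = 75_000  # maintenance periodicity in km
for reading in (20_000, 37_500, 80_000, 112_500, 300_000):
    print(reading, "km -> interval", interval_index(reading, A))
```

A cost recorded at 80,000 km, for instance, is assigned to the interval centred at the first scheduled maintenance (75,000 km).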

Since the total cost of the contract for a truck of model k is a linear function of the independent random variables \(M_i\) and \(R_j\), this cost is also a random variable whose (unknown) distribution depends on the probability distributions of these random variables. From the estimated density of \(C_n\), it is possible to obtain a set of statistical quantities of interest. If the distributions of \(M_i\) and \(R_j\) are known, Monte Carlo methods (see, for example, [13]) allow us to simulate the total cost distribution. The Monte Carlo procedure uses algorithmically generated pseudo-random numbers, which are then transformed to follow a prescribed probability distribution. Figure 5 schematically represents the input variables simulated from a probability distribution, the functional relationship that provides the output (total cost) and the probability distribution of the output.

Fig. 5 Monte Carlo simulation

As stated, Monte Carlo simulation of the total cost distribution requires knowledge of the distributions of the random variables \(M_i\) and \(R_j\). The initial descriptive study of the historical maintenance and repair costs under this interval-based formulation indicates that the truncated Normal, Gamma, Log-normal and Weibull distributions are good options for modelling most of those variables. All of these distributions have \(\mathbb {R}^+\) as support. In the cases where the cost in an interval does not have a simple analytical representation, kernel density estimation (KDE) methods were considered.

When possible, the parameters of the parametric distributions were estimated by maximum likelihood; otherwise the method of moments was used [7]. The Anderson-Darling goodness-of-fit test (AD-Test) was used to select the theoretical distribution with the best adherence to the data in each interval.
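To make the method-of-moments fallback concrete, the sketch below applies it to the Gamma distribution, for which a shape k and a scale θ satisfy mean = kθ and variance = kθ². The choice of the Gamma family, the function name and the cost values are assumptions made only for illustration; the paper does not state which moment equations were used in each case.

```python
import statistics


def gamma_method_of_moments(costs):
    """Match the sample mean and variance to a Gamma(shape, scale)."""
    mean = statistics.fmean(costs)
    var = statistics.variance(costs)  # unbiased sample variance
    if mean <= 0 or var <= 0:
        raise ValueError("costs must be positive and not all equal")
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


# Hypothetical costs observed in one mileage interval (monetary units).
costs = [820.0, 1150.0, 990.0, 1430.0, 760.0, 1210.0]
shape, scale = gamma_method_of_moments(costs)
print(f"shape = {shape:.2f}, scale = {scale:.2f}")
```

By construction, the product of the fitted shape and scale equals the sample mean, which gives a quick sanity check on the estimates.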

The KDE approach was used in the instances where the cost in an interval did not have a simple analytical representation. Essentially, this method consists of a sum of bumps, each placed at an observation, with a given shape and a fixed width:

$$\displaystyle \begin{aligned} \displaystyle{\hat{f}(x)=\frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x-X_{i}}{h}\right)}, \end{aligned} $$
(2)

where \(X = (X_1, \dots, X_n)\) is an i.i.d. random sample from an unknown distribution with density function f(x), the kernel function \(K(\cdot)\) is a smooth function and h is the smoothing bandwidth. In other words, \(K(\cdot)\) is the shape and h is the width of the bumps. As kernel function we considered the most common option, the Gaussian kernel, defined by:

$$\displaystyle \begin{aligned} \displaystyle{ K(x)=\frac{e^{\frac{-\|x\|{}^2}{2}}}{\int e^{\frac{-\|x\|{}^2}{2}} dx} }. \end{aligned} $$
(3)

For the estimation of the bandwidth we consider the estimator

$$\displaystyle \begin{aligned} \displaystyle{\hat{h}=\frac{0.9\times \min (S, IQ/1.34)}{n^{1/5}}}, \end{aligned} $$
(4)

where S is the standard deviation of X and IQ is the interquartile range of X.
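The estimator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bandwidth follows the rule of thumb 0.9 × min(S, IQ/1.34) × n^(−1/5), which is the rule implemented by the R function bw.nrd0; the quartiles are computed with Python's default convention, which may differ slightly from R's; and the function names and data are hypothetical.

```python
import math
import statistics


def rule_of_thumb_bandwidth(sample):
    """Bandwidth 0.9 * min(S, IQ / 1.34) * n ** (-1/5)."""
    n = len(sample)
    s = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iq = q3 - q1
    spread = min(s, iq / 1.34) if iq > 0 else s  # guard against IQ = 0
    return 0.9 * spread * n ** (-1 / 5)


def gaussian_kde(sample, h=None):
    """Return the density estimate f_hat as a function of x."""
    if h is None:
        h = rule_of_thumb_bandwidth(sample)
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps of width h centred at the observations.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


# Hypothetical positive repair costs in one interval (monetary units).
sample = [10.0, 12.0, 15.0, 20.0, 30.0]
f_hat = gaussian_kde(sample)
print(f"h = {rule_of_thumb_bandwidth(sample):.3f}, f_hat(15) = {f_hat(15.0):.4f}")
```

Note that a Gaussian kernel places some mass below zero, so a practical implementation for cost data would need to handle the boundary at zero; the sketch does not.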

Before describing the simulation procedure, it is important to note that the \(M_i\), \(i = 1, \cdots , n\), were treated as absolutely continuous random variables, while the \(R_j\), \(j = 0, \cdots , n\), were treated as mixed random variables, with a point mass at zero and a continuous part for positive costs. This classification reflects the fact that maintenance costs exist in all intervals \(I_i\), \(i = 1, \cdots , n\), whereas not every truck needs repair assistance in every interval.

Concerning the repair costs, let \(R_j^c\) be a continuous random variable that represents the positive repair costs in the interval j, \(j = 0, \cdots , n\), with probability density function given by

$$\displaystyle \begin{aligned} f(r_j^c)= \begin{cases} f_D(r^c_j), \: r^c_j>0 \\ 0, \: \mbox{otherwise} \end{cases}, \end{aligned} $$
(5)

where \(f_D(\cdot)\) is the probability density function of one of the theoretical distributions referred to above, or the result of applying the KDE method to the positive repair costs. Finally, the repair cost in the interval j, \(R_j\), \(j = 0, \cdots , n\), is related to \(R_j^c\) through

$$\displaystyle \begin{aligned} R_j=\begin{cases} R_j^c, \: \mbox{with probability } 1-P(R_j=0) \\ 0, \: \mbox{with probability } P(R_j=0) \end{cases} \end{aligned} $$
(6)

At this stage, every random variable (cost) in each interval has a probability distribution associated with it (parametric or non-parametric). The distribution of the total cost is the distribution of the sum of these independent random variables. For simulation purposes, we assume that \(M_{i} \sim F_{M_i}(\theta _{M_i})\), \(i = 1, \dots, n\), and \(R_{j} \sim F_{R_j}(\theta _{R_j})\), \(j = 0, \dots, n\), where \(\theta _{M_i}\) and \(\theta _{R_j}\) represent the distribution parameter vectors. One draw from each of these distributions defines the vector \((M_1^{1,*}, \cdots , M_n^{1,*},R_0^{1,*}, \cdots , R_n^{1,*})\), from which the total cost for the first realization is obtained as \(C_n^{1,*}=M_1^{1,*}+ \cdots +M_n^{1,*}+R_0^{1,*}+ \cdots +R_n^{1,*}\). This procedure is repeated N times, where N is a large number, in order to create the sample \(C_n^{1,*}, C_n^{2,*}, \cdots , C_n^{N,*}\). The empirical distribution of this sample is the Monte Carlo approximation of the distribution of the total cost of the contract.

The sample from the distribution of the total cost of the contract makes it possible to obtain several quantities of interest, such as probabilities and quantiles. In this work, we use the α × 100% quantile, \(q_{F_{C_n^*} } (\alpha )\), to estimate the contract cost, where 1 − α represents a measure of risk.
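The simulation procedure and the quantile extraction can be sketched as follows. This is a minimal Python illustration and not the R implementation used in this work. Each repair cost is drawn as zero with probability P(R_j = 0) and from a positive distribution otherwise. All function names and distribution parameters are hypothetical; only the 6% repair frequency of the first interval is taken from the example in Sect. 3.2.

```python
import math
import random


def draw_repair(rng, p_zero, sampler):
    """Mixed repair cost: 0 with probability p_zero, else a positive draw."""
    return 0.0 if rng.random() < p_zero else sampler(rng)


def simulate_total_costs(maintenance, repairs, n_sims=1000, seed=0):
    """Return n_sims simulated totals C_n = sum of M_i plus sum of R_j."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        total = sum(sampler(rng) for sampler in maintenance)
        total += sum(draw_repair(rng, p0, s) for p0, s in repairs)
        totals.append(total)
    return totals


def empirical_quantile(sample, alpha):
    """Smallest observation with at least a fraction alpha of the sample at or below it."""
    ordered = sorted(sample)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[k]


# Hypothetical fitted models for n = 4 intervals (all parameters invented).
maintenance = [lambda rng: rng.gammavariate(20.0, 50.0)] * 4          # M_1..M_4
repairs = [(0.94, lambda rng: rng.lognormvariate(5.0, 1.0))]          # R_0
repairs += [(0.40, lambda rng: rng.weibullvariate(3000.0, 1.2))] * 4  # R_1..R_4

totals = simulate_total_costs(maintenance, repairs, n_sims=1000, seed=42)
q05 = empirical_quantile(totals, 0.05)
q95 = empirical_quantile(totals, 0.95)
print(f"5% quantile: {q05:.0f}, 95% quantile: {q95:.0f}")
```

Fixing the seed makes a run reproducible, which is convenient when comparing quantiles obtained with different values of N.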

With this approach, the contract costs depend only on the limit of kilometres. In practice, however, the duration of a contract is sometimes defined in time. To include this possibility, we defined the function \(f_{km}(t)\), which relates the contract time to the truck mileage. According to the data provided by the company (see Fig. 6), this relation seems to be reasonably well fitted by a linear function, so we considered the function

$$\displaystyle \begin{aligned} f_{km}(t)=a\times t + b \end{aligned} $$
(7)

to define the expected distance travelled by a truck as a function of the number of months in the contract, where a and b are the coefficients of the linear regression.
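A minimal sketch of how a and b could be obtained by ordinary least squares is given below. The function names and the (months, km) observations are invented for illustration and are not taken from the company data.

```python
def fit_line(months, kms):
    """Ordinary least-squares estimates (a, b) for km = a * t + b."""
    n = len(months)
    if n < 2 or n != len(kms):
        raise ValueError("need at least two paired observations")
    mean_t = sum(months) / n
    mean_k = sum(kms) / n
    sxx = sum((t - mean_t) ** 2 for t in months)
    sxy = sum((t - mean_t) * (k - mean_k) for t, k in zip(months, kms))
    a = sxy / sxx
    b = mean_k - a * mean_t
    return a, b


def f_km(t, a, b):
    """Expected mileage after t months of contract."""
    return a * t + b


# Hypothetical (months, km) observations for one truck model.
months = [6, 12, 18, 24, 36]
kms = [61_000, 118_000, 183_000, 239_000, 362_000]
a, b = fit_line(months, kms)
print(f"a = {a:.1f} km/month, b = {b:.1f} km, f_km(30) = {f_km(30, a, b):.0f} km")
```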

Fig. 6 Travelled distance vs time (months), for one truck model

3.2 An Illustrative Example

The proposed method is illustrated with a truck model with a maintenance periodicity of 75,000 km and a contract with a duration of 300,000 km.

For practical purposes, the proposed methodology was implemented in the R software [9] by means of a Graphical User Interface (GUI). The package truncnorm [2, 8] was used to fit the truncated Normal distribution. The parameters of the Gamma, Log-normal and Weibull distributions were estimated using the function fitdist from the package fitdistrplus [5]. The AD-Tests were carried out with the R package goftest [6, 10].

As kernel function we consider a Gaussian kernel, and the bandwidth estimate was calculated using (4). In this case, the estimation was carried out with the R functions density and bw.nrd0 [3, 12].

For this work, N = 1000 Monte Carlo simulations were performed. Typically, the number of Monte Carlo simulations is defined taking into account the desired accuracy. Thus, it is desirable to determine a value of N that obtains a suitable level of accuracy for a given problem at hand. There are several methods for determining N that guarantees a specified level of accuracy for standard error estimates, confidence intervals, confidence regions, hypothesis tests or bias correction. As this work aimed to obtain the sampling distribution of total cost, N was chosen based on the histogram regularity of the samples. Different histograms were produced by varying the number of simulations. For N = 1000 the histogram of the samples turned out to be regular beyond any reasonable doubt.

Table 1 provides, for each interval and for both types of intervention, the intervals of kilometres considered, the fitted distributions and the empirical probability of repair occurrences. The results showed that in the first interval only 6% of the contracts led to repair costs, while in the other intervals this probability exceeded 50%. As stated before, the maintenance costs for the \(I_0\) interval are not simulated, as there is no periodic intervention in that mileage range: the first maintenance intervention is scheduled by the constructor at 75,000 km and, according to the company engineers and the historical data, no customer brings the first intervention forward by as much as 37,500 km.

Table 1 Fitted distribution for costs and empirical probability of repair occurrences

Taking into account the fitted distributions presented in Table 1 and the procedure described in Sect. 3.1, a sample from the distribution of the total contract cost was obtained. Figure 7 shows the histogram with a smoothed probability density function and the plot of the empirical distribution function of the random variable that represents the total cost of the contract.

Fig. 7 Histogram (a) and empirical cumulative distribution plot (b) for the contract total cost of the given example

This gives the manager a risk-related estimate of the total contract cost. For example, given α = 0.95, the cost estimate is \(q_{F_{C^*_4}}(0.95)=25{,}857\) monetary units, while for α = 0.05 the value obtained was \(q_{F_{C^*_4}}(0.05)=12{,}840\) monetary units.

4 The Price

This section presents a procedure to define the price of an assistance contract which is based on the idea that this price is a linear function of the estimated costs.

4.1 The Price of the Contract

The price of an assistance contract can be defined as the estimated costs plus a profit margin. To define the price of the contract, it is necessary to choose the model of the truck, the periodicity of the regular maintenance defined by the constructor, and the maximum mileage covered by the contract or the duration of the contract. If the duration is a limit of time, say t months, then consider \(K = f_{km}(t)\) km. If the duration is a limit of kilometres, say K km, then it is not necessary to adjust the value. Considering A (km) the periodicity of the regular maintenance defined by the constructor, we define \(m=\min \{i : K \leq (i+0.5)\times A, i=1,\dots ,n\}\) as the number of intervals to be considered for this contract, that is, the index of the first interval whose upper limit reaches K. Therefore, the total cost for this generic contract will be \(q_{F_{C^*_m}}(\alpha )\) with a risk 1 − α. The true limit of kilometres for this contract will be \(K_{max} = (m + 0.5) \times A\) kilometres. Finally, the price of the contract will be given by:

$$\displaystyle \begin{aligned} \rm{Price}(\alpha,\beta)=\beta \times q_{F_{C^*_m}}(\alpha) \;, {} \end{aligned} $$
(8)

where β is the margin defined by the business manager and 1 − α the risk.
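The pricing rule can be illustrated with the figures already given in this paper. The sketch reads m as the smallest index i with K ≤ (i + 0.5) × A, which is an assumption of this example; it is the reading consistent with K max and with the use of \(C^*_4\) for the 300,000 km contract of Sect. 3.2. The function names are hypothetical. The quantile 25,857 is the α = 0.95 estimate of Sect. 3.2 and β = 1.05 is the value used in Sect. 5.

```python
import math


def number_of_intervals(limit_km, periodicity_km):
    """Smallest m >= 1 such that limit_km <= (m + 0.5) * periodicity_km."""
    if limit_km <= 0 or periodicity_km <= 0:
        raise ValueError("limit_km and periodicity_km must be positive")
    return max(1, math.ceil(limit_km / periodicity_km - 0.5))


def contract_price(cost_quantile, beta):
    """Price(alpha, beta) = beta * q(alpha), as in (8)."""
    return beta * cost_quantile


A = 75_000   # maintenance periodicity (km)
K = 300_000  # contract limit (km)
m = number_of_intervals(K, A)
k_max = (m + 0.5) * A
price = contract_price(25_857, beta=1.05)
print(f"m = {m}, K_max = {k_max:.0f} km, price = {price:.2f} monetary units")
```

For this contract the sketch gives m = 4 and a true limit of 337,500 km, with a price of about 27,150 monetary units.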

4.2 Penalty

For the type of assistance contracts that this paper refers to, it is usual to set a limit on the kilometres that a truck can travel, and it is not plausible for a truck to stop at the exact moment the contract ends. When a truck passes that limit, a monetary value (penalty cost) is applied for the extra kilometres travelled. We modelled the total cost of a contract as the cost for m intervals and the penalty cost as an adjustment towards the cost of a contract with m + 1 intervals. The penalty cost is then defined by

$$\displaystyle \begin{aligned} C_{P}(n,\alpha, A_k)=\frac{q_{F_{C^*_{n+1}}}(\alpha) - q_{F_{C^*_{n}}}(\alpha) }{A_k} \end{aligned} $$
(9)

monetary units per kilometre, for contracts with n intervals and a maintenance periodicity of \(A_k\) kilometres.
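As a worked illustration of (9), the sketch below computes the penalty per extra kilometre from two cost quantiles. The function name is hypothetical, and the quantile for n + 1 = 5 intervals is an invented value, since the paper reports only the quantile for 4 intervals.

```python
def penalty_per_km(q_next, q_current, periodicity_km):
    """Penalty cost (9): extra cost of one more interval, spread over A_k km."""
    if periodicity_km <= 0:
        raise ValueError("periodicity_km must be positive")
    return (q_next - q_current) / periodicity_km


q_4 = 25_857  # alpha = 0.95 quantile for 4 intervals (Sect. 3.2)
q_5 = 33_357  # hypothetical alpha = 0.95 quantile for 5 intervals
rate = penalty_per_km(q_5, q_4, periodicity_km=75_000)
print(f"penalty = {rate:.3f} monetary units per km")
```

With these assumed values, each kilometre beyond the contract limit would be charged 0.1 monetary units.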

5 Results

The approach explained in this paper was implemented in two companies at NORS, using the historical data available on closed contracts. Figures 8 and 9 represent the real and estimated profit margins over the whole range of real durations of the assistance contracts in both companies. The numbers inside each cell represent the number of contracts of that type. For comparison purposes, the value of β in (8) was set to β = 1.05 for both companies. It may be observed that contracts with longer duration and more kilometres travelled tend to have a smaller margin, or even turn into losses.

Fig. 8 Real (a) and estimated (b) profit margins of Company A versus duration of the assistance contracts

Fig. 9 Real (a) and estimated (b) profit margins of Company B versus duration of the assistance contracts

It is important to note that, as expected, the contracts with longer duration and more kilometres have much more variance in the costs, and therefore their risks are significantly higher. Nevertheless, the other contracts are reasonably stable under this model parametrization, and a detailed analysis should be made by the management to decide whether to assign the same risk factor to all contracts or to make it depend on the length of the contract.

From another perspective, Figs. 10 and 11 represent the real and estimated margins of the assistance contracts of both companies as a function of mileage. It can be observed that the profit has a large variability and that extended contracts tend to have a lower margin.

Fig. 10 Real (a) and estimated (b) profit margins of Company A versus mileage

Fig. 11 Real (a) and estimated (b) profit margins of Company B versus mileage

This project was developed in an industrial company with the purpose of helping the management of the assistance contracts to estimate and simulate their risks and prices. In fact, we went a little further: besides the mathematical model, a graphical user interface (GUI) was built in order to implement the model and to give additional information to the management, as presented in Fig. 12. This GUI allows the business manager to estimate contract price proposals according to the model presented, to compare them with the ones previously set by the company, to explore and graphically analyse the current situation of all the contracts for one particular type of truck and, of course, to be aware of the risk level being taken when making the proposal. Finally, the parametrization of the model regarding, for example, the best fitted distributions is almost automatic and is updated monthly with the new information from the ongoing contracts, keeping the model as close as possible to reality.

Fig. 12 Snapshot of the developed Graphic User Interface

6 Conclusions

In this paper we address a challenge posed by two NORS group companies regarding the costs and risks related to their assistance contracts for trucks. We built a stochastic model that uses Monte Carlo simulation to estimate, at a certain risk, the total cost of the contracts. The price of a contract is a function of the total cost. The resulting model was applied to two different models of trucks, belonging to different segments.

In the short term, the results indicate that the existing parameterization leads to good results without significant losses. However, for long range contracts, there is an increased risk of losses. The model was implemented in the company via a Graphical User Interface that allows the managers to estimate contract price proposals according to this formulation and to compare them with the ones previously set by the company. It also allows them to explore and graphically analyse the historical data, that is, to take an overview of the current situation of all the contracts for each particular type of truck and, of course, to be aware of the risk level when making a particular proposal to a new client.

As this is an ongoing project, further work remains to be done. One point of interest is the optimization of the margin β in (8) of the contract for the risk taken. At this point, the margin is chosen by the business manager and takes into account some client segmentation. Another point of interest would be to explore the references of truck parts. For example, there are some types of single repairs that cost about 1.1 times the total value of the contract. According to the historical data, these repairs are very rare, but these situations must be covered by the profit margin of the other contracts. On the other hand, if the lifetime of the truck parts is modelled, it becomes possible to anticipate costs, or even to provide more information for eventual contract renewals.