
1 Introduction

The ocean transportation of cargoes such as chemicals and petrochemical gases is undertaken by vessels that are technologically advanced, highly specialised and capital intensive, with a wide range of technical specifications. The heterogeneous and relatively small global fleet of such vessels and a concentrated ownership structure lead to low liquidity in the second-hand asset markets. For these reasons, asset valuation is a much more challenging task than for the larger segments of commodity shipping, but no less important for the market players and financial institutions involved. Research into the formation of second-hand ship prices has hitherto been based on time series of values for generic vessels in the tanker or drybulk sectors (see for instance [3, 5, 6, 7]). As the sole exception, Adland and Koekebakker [2] propose a nonparametric ship valuation model based on sales data for Handysize bulk carriers, though they consider only the size and age of the ship and the state of the freight market.

In this paper, we extend the research on ship valuation using micro data on vessel transactions and technical specifications by proposing a semi-parametric generalized additive model. Under the assumption of separable factors, we can quantify the pricing effect of a large number of technical variables in the valuation of highly sophisticated chemical tankers. Such a vessel valuation model is particularly valuable for brokers, financiers and owners when performing “desktop valuations” of specialised ships for which brokers’ estimates are costly or perhaps not available.

2 Methodology

A Generalized Additive Model (GAM) extends the generalized linear model by letting the linear predictor combine parametric terms with a sum of smooth functions of explanatory variables. In general, a model may look like

$$g(\mu_{i} ) = X_{i}^{*} \theta + f_{1} (x_{1i} ) + f_{2} (x_{2i} ,x_{3i} ) + \ldots$$
(1)

where µi ≡ E(Yi), Yi is the response variable, distributed according to some exponential family distribution, \(X_{i}^{*}\) is a vector of explanatory variables that enter the model parametrically, θ is the corresponding parameter vector and the fj are smooth functions of the variables that are modelled non-parametrically. GAMs provide enough flexibility to take non-linear relationships into account without making any specific assumptions about the functional form of these relations. They also provide reliable results in samples of moderate size.

Our estimations are based on thin plate regression splines (TPRS) in combination with generalized cross validation (GCV). Standard bases for regression splines, such as cubic splines, require the user to choose knot locations, and thereby the basis dimension. Furthermore, they can represent the smooth of only one predictor variable. TPRS surmount both problems and are, in a limited sense, ‘optimal’ in these respects. Given the problem of estimating g(x) such that yi = g(xi) + εi, thin plate spline smoothing estimates g(x) by finding the function f minimising

$$\left\| {y - f} \right\|^{2} + \lambda J_{md} (f)$$
(2)

where Jmd(f) is a penalty function measuring the “wiggliness” of f and λ is a penalisation parameter. Instead of choosing the basis dimension to control the model’s smoothness, the trade-off between model fit and smoothness is controlled by the smoothing parameter λ. If λ = 0 the spline is unpenalized, while λ → \(\infty\) leads to a straight-line estimate (over-smoothing). Furthermore, λ relates to the effective degrees of freedom (EDF) of a smooth term: the more complex the smooth term, the larger the EDF and the lower λ.
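The relation between λ and the EDF can be illustrated with a minimal sketch. As a simplifying assumption, it replaces the thin plate penalty Jmd(f) with a discrete second-difference penalty on equally spaced points (a Whittaker-type smoother), so it is not the TPRS estimator used in this paper. The data are simulated, and the function names are hypothetical.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix D with (D f)[i] = f[i] - 2 f[i+1] + f[i+2]."""
    return np.diff(np.eye(n), n=2, axis=0)

def penalised_smooth(y, lam):
    """Minimise ||y - f||^2 + lam * ||D f||^2; return (fitted values, EDF)."""
    n = len(y)
    D = second_difference_matrix(n)
    # Influence ("hat") matrix A, so that the fitted values are A @ y
    A = np.linalg.solve(np.eye(n) + lam * D.T @ D, np.eye(n))
    # The EDF of the smooth is the trace of the influence matrix
    return A @ y, np.trace(A)

# Simulated data (an assumption for illustration, not the paper's dataset)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

lambdas = (0.0, 1.0, 100.0, 1e8)
edfs = [penalised_smooth(y, lam)[1] for lam in lambdas]
for lam, edf in zip(lambdas, edfs):
    print(f"lambda = {lam:g}: EDF = {edf:.2f}")
```

With λ = 0 the fit interpolates the data and the EDF equals the number of observations. For very large λ the EDF approaches 2, the dimension of a straight line, which is the over-smoothing case described above.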

The problem of choosing an optimal value for λ is solved via generalized cross validation (GCV). For more details on GAMs and their practical implementation, the interested reader is referred to, for example, Härdle et al. [4] or Wood [8]. One disadvantage of GAMs is that hypothesis testing is only approximate and that satisfactory interval estimation requires a Bayesian approach. Using the Bayesian posterior covariance matrix and the corresponding posterior distribution allows us to calculate p-values and confidence intervals. Typically, the p-values calculated this way will be too low, because they are conditional on the uncertain smoothing parameter [8]. Therefore, we are conservative when interpreting results and significance levels.
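As a hedged illustration of how GCV selects λ, the sketch below evaluates the usual GCV score, n‖y − Ay‖² / (n − tr(A))², over a grid of candidate values, where A is the influence matrix of the smoother. It again assumes a simple second-difference penalty in place of the thin plate penalty, uses simulated data, and searches a coarse grid rather than optimising numerically. All names are hypothetical.

```python
import numpy as np

def hat_matrix(n, lam):
    """Influence matrix of a second-difference penalised smoother (assumed penalty)."""
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

def gcv_score(y, lam):
    """GCV score n * RSS / (n - tr(A))^2 for smoothing parameter lam."""
    n = len(y)
    A = hat_matrix(n, lam)
    resid = y - A @ y
    return n * (resid @ resid) / (n - np.trace(A)) ** 2

# Simulated data (an assumption for illustration only)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Candidate smoothing parameters on a log-spaced grid (lam = 0 is excluded,
# because the denominator n - tr(A) would then be zero)
grid = 10.0 ** np.arange(-2, 7)
scores = np.array([gcv_score(y, lam) for lam in grid])
best_lam = grid[np.argmin(scores)]
print(f"GCV-selected lambda on this grid: {best_lam:g}")
```

The score penalises both a poor fit (large residual sum of squares) and excessive flexibility (tr(A) close to n), so the selected λ balances the two.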

Our independent variables include both macro and ship-specific variables and can be justified as follows (subscript i is omitted, but refers to the value of the variable for sales transaction i, or at the time of transaction i for the macro variables):

NB: newbuilding price (USD/Compensated Gross Tonne, CGT). The cost of ordering a brand new vessel (i.e. the replacement value).

EARN: spot market vessel earnings ($/day), as calculated on the benchmark Houston–Rotterdam route on the basis of $/tonne rates for 3000-tonne ‘easychem’ parcels.

SIZE: deadweight carrying capacity of the vessel (tonnes). A larger vessel should attract a higher price due to higher earnings capacity, all else equal.

SPEED: design speed of the vessel (knots). A greater speed indicates higher efficiency, though this may come at a cost of higher fuel consumption.

AGE: age of the vessel at the time of the sale (years). As vessels depreciate, older vessels have lower values, all else equal.

NOTANK: the number of cargo tanks. A higher number increases the potential number of different chemical parcels carried simultaneously.

IHULL: dummy variable indicating hull configuration (double hull, double bottom, double sides or single hull).

ICOAT: dummy variable indicating tank coating type (epoxy, polyurethane, zinc or stainless steel coating). Higher-grade coating increases the cargo flexibility.

IIMO: dummy variable for the vessel’s IMO classification, which reflects the environmental and safety hazard of the cargoes (Type 1, 2, 3 with Type 1 being most severe).

ICOUNTRY: country of build, as a proxy for perceived overall build quality.

CARGODIV: an interaction variable representing cargo diversity of the vessel, as measured by the product of the number of coatings and the number of tanks.

PUMPDIV: an interaction variable representing the ability and flexibility of cargo handling, as measured by the product of the number of discharge pumps and pump capacity.
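The two interaction variables are simple products of ship-specific characteristics. A minimal sketch of their construction follows. The record layout, field names, units and example values are hypothetical and not taken from the dataset.

```python
from dataclasses import dataclass

@dataclass
class Vessel:
    """Hypothetical record of the technical fields needed for the interactions."""
    n_coatings: int       # number of different tank coatings
    n_tanks: int          # number of cargo tanks (NOTANK)
    n_pumps: int          # number of discharge pumps
    pump_capacity: float  # capacity per pump (unit assumed, e.g. m3/hour)

def cargo_div(v: Vessel) -> int:
    """CARGODIV: number of coatings times number of tanks."""
    return v.n_coatings * v.n_tanks

def pump_div(v: Vessel) -> float:
    """PUMPDIV: number of discharge pumps times pump capacity."""
    return v.n_pumps * v.pump_capacity

# Illustrative vessel with made-up specifications
example = Vessel(n_coatings=2, n_tanks=20, n_pumps=20, pump_capacity=150.0)
print(cargo_div(example), pump_div(example))
```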

As an example, the most comprehensive model specification can be written as:

$$\begin{aligned} g(E(PRICE_{i} |.)) & = \gamma_{0} + s(NB_{i} ) + s(EARN_{i} ) + s(SIZE_{i} ) + s(AGE_{i} ) + I_{i}^{HULL} \\ & \quad + s(CARGODIV_{i} ) + s(PUMPDIV_{i} ) + I_{i}^{IMO} \\& \quad+ s(SPEED_{i} ) + I_{i}^{COUNTRY} \\ \end{aligned}$$
(3)

All regressions are carried out using g(.) = log(.) as the link function and assume that second-hand prices follow a Gamma distribution, PRICEi ~ G(α, β). Experiments with the Normal distribution and different link functions did not improve results.
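To make the distributional assumption concrete, the sketch below fits a Gamma regression with log link by Fisher scoring. It covers only parametric terms (the smooth terms and their penalties are omitted), and the data, coefficient values and variable names are simulated assumptions rather than estimates from this paper. For the Gamma family with log link the working weights are constant, so each iteration reduces to an ordinary least squares fit on a working response.

```python
import numpy as np

def fit_gamma_log_glm(X, y, n_iter=50, tol=1e-10):
    """Gamma GLM with log link, fitted by Fisher scoring (parametric terms only)."""
    # Starting values: least squares on log(y)
    beta = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta                # linear predictor
        mu = np.exp(eta)              # inverse of the log link
        z = eta + (y - mu) / mu       # working response; working weights are all 1
        beta_new = np.linalg.lstsq(X, z, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data: hypothetical age effect and double-hull dummy
rng = np.random.default_rng(2)
n = 2000
age = rng.uniform(0.0, 25.0, size=n)
double_hull = rng.integers(0, 2, size=n)
X = np.column_stack([np.ones(n), age, double_hull])
true_beta = np.array([3.0, -0.05, 0.3])   # assumed values for illustration
mu = np.exp(X @ true_beta)
shape = 10.0                              # assumed Gamma shape parameter
y = rng.gamma(shape, scale=mu / shape)    # Gamma draws with mean mu

beta_hat = fit_gamma_log_glm(X, y)
print(np.round(beta_hat, 3))
```

In this parameterisation the variance is proportional to the squared mean, which suits price data where dispersion grows with the price level.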

3 Data and Empirical Results

Tables 1 and 2 show the descriptive statistics for our variables. The dataset obtained from Clarkson Research Ltd. includes 842 observations of chemical tanker sales since October 1990. We remove vessels sold under unusual circumstances, including those sold at auction, judicial sales, vessels sold with attached time charter contracts and en-bloc transactions, leaving 736 observations for further analysis.

Table 1 Data overview
Table 2 Data distribution for build country, coating, hull and IMO type

Table 3 shows the regression results for four different specifications (A to D) of the general pricing model in Eq. 1. The upper panel reports, for each non-parametric component, the effective degrees of freedom (EDF), which reflect the degree of non-linearity in that regressor, together with its significance. The lower panel reports the parametric components as point estimates (PE) with their significance, which can be interpreted directly.
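Because all models use the log link g(.) = log(.), a point estimate for a dummy variable translates into a multiplicative effect on the expected price. As a worked illustration, take a purely hypothetical coefficient β = −0.20 on a dummy I (this value is an assumption and is not taken from Table 3):

$$E(PRICE_{i} |I_{i} = 1) = e^{\beta } \,E(PRICE_{i} |I_{i} = 0),\qquad e^{ - 0.20} - 1 \approx - 0.181$$

That is, such a coefficient would correspond to a discount of roughly 18% relative to the reference category, all else equal.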

Table 3 Regression results for models A through D

The results are broadly consistent across specifications and can be summarized as follows. Firstly, the relationships between asset value and the replacement cost (NB price), vessel age, earnings and vessel size are non-linear and highly significant. Secondly, hull configuration matters, with non-double-hulled tonnage attracting a substantial discount. Thirdly, and perhaps somewhat surprisingly, tank coating does not significantly affect asset values. Fourthly, our proxies for versatility and efficiency (CARGODIV, PUMPDIV, SPEED) are highly significant. Finally, IMO classification matters, albeit perhaps not in the way expected, as IMO2 and IMO3 vessels carry a premium compared to the technically more advanced IMO1 vessels. The explanatory power of the model is relatively high, starting out at 81.9% for the basic ‘macro’ model and increasing to 86.3% as we add more technical vessel variables.

Table 4 presents the results for our most comprehensive model (Eq. 3). The results from the earlier specifications remain robust. Additionally, vessels built in certain countries (Denmark, Germany and Norway) attract a quality premium, while Ukrainian-built tonnage is perceived to be of lower quality, which is reflected in lower asset values.

Table 4 Regression results for final model

Figure 1 presents the estimated smooths relating newbuilding prices, earnings, size and age to second-hand prices. The relationships are strongly non-linear, with narrow confidence bands. Second-hand prices increase with size, decrease with age, and are also broadly increasing with the replacement cost and spot market earnings. The latter effect is less clear for high values, due to fewer observations and mean reversion in rates [1].

Fig. 1 Smooth of NB price, earnings, size, and age

Figure 2 illustrates the joint non-linear effect of vessel age and vessel size on second-hand prices, similar to Adland and Koekebakker [2].

Fig. 2 3D plot of asset values against size and age

4 Concluding Remarks

We have developed a comprehensive multivariate semi-parametric framework for the estimation of chemical tanker second-hand prices. Previous non-parametric models have shown that non-linear modelling is appropriate, but have suffered from the curse of dimensionality. Our model surmounts this issue and extends the existing literature by applying semi-parametric GAMs to a cross-sectional dataset of actual sale and purchase transactions of chemical tankers. Even the heterogeneous nature of chemical tankers and the high variation in their second-hand prices can be satisfactorily modelled with this framework. Ship-specific factors which have not been included in previous models are shown to have a significant impact on prices, and the explanatory power of this model appears to exceed that of linear methods of estimation. Most of the factors show the expected effects on prices. To sum up, semi-parametric methods, especially GAMs, provide an appropriate framework to model asset values for highly heterogeneous assets.