1 Introduction

Private vehicle ownership plays a pivotal and ubiquitous role in the travel choices of individuals and households. Given the growing focus on global climate change, the energy crisis, air pollution and sustainable development, the topic is of great importance to decision makers, and it is one of the most researched topics in transportation. The household vehicle ownership model is an important part of comprehensive urban transportation modeling systems. In previous studies on household vehicle ownership, transportation planners have mainly focused on choice dimensions including vehicle fleet size, vehicle type, vehicle usage and vehicle transactions.

In this paper, emerging economies refer to economic entities with a booming growth represented by developing countries, particularly those with a great potential for development based on emerging technologies (e.g., mobile internet, internet of things, big data, artificial intelligence, etc.) and economic models (e.g., sharing economy). In recent years, emerging economies have maintained a high level of contribution to global economic growth, thus enhancing their positions and influences in the world economy, which will continue to provide a great impetus for future global economic growth.

With rapid economic growth, the vehicle population and motorization levels keep growing in emerging economies and account for an increasingly high share of the global total. A discussion and review of vehicle-ownership models can therefore provide valuable references for emerging economies in this field. On the other hand, the rise of new economic models also exerts a subtle impact on the travel behavior and lifestyle of urban residents. Under the dual influence of the mobile internet and the sharing economy, a series of travel modes are gradually emerging, such as e-hailing, shared cars and shared bikes, which may lead to a redistribution of the transportation market and exert a strong impact on the existing travel modes. At the same time, research on alternative-fuel vehicles (AFV) is expanding along with the popularization of new energy technology. Thus, it is necessary to conduct a comprehensive and in-depth review of modeling studies to better understand the demand for and trends in household vehicle ownership in emerging economies in the new era.

The initial studies concerning the demand model of household vehicles appeared in the 1930s1. Most of the relevant studies focused on the analysis of private car ownership, while other types of household vehicles were less considered. Early researchers placed emphasis on the analysis of vehicle ownership at the aggregate level2, 3, such as the regional or national level. The main drawback of this approach is that it cannot capture the underlying behavioral mechanisms that actually guide the household decision-making process. In addition, the policy sensitivity and precision of the aggregate methods are limited4. Investigating vehicle ownership at the disaggregate level can make up for these deficiencies: this approach treats each “individual” or “household” as the analytical unit, which provides a more precise and detailed model for evaluating relevant policies5. This paper focuses on the analysis at the disaggregate household level. The methodology can take numerous covariates into consideration, while the modeling forms vary from simple to complex.

In recent years, with the progress of computer technology, many advanced methods and theories have been applied to household vehicle ownership and demand models. The progress in vehicle-ownership modeling has been extensively discussed in previous literature reviews. Jong et al.6 provided an overview of car ownership models developed prior to 2002, and compared the model types on the basis of 16 criteria. Potoglou and Kanaroglou7 offered a critical review of household discrete-choice-based automobile demand models for conventional and alternative-fuel automobiles. In addition, the household dynamic vehicle-ownership models (holding models vs. transactions models) were compared by de Jong8 in 2009. Anowar et al.9 classified the modeling methods of vehicle ownership into four categories from the perspectives of unobserved factors and time evolution: exogenous static model, endogenous static model, exogenous dynamic model and endogenous dynamic model. Research based on the exogenous model treats vehicle ownership as independent of other decisions, while the endogenous model examines the mutual effects among different decisions related to vehicle ownership and considers the effects of unobservable factors. The current literature nevertheless has limitations: existing reviews mostly focus on developed countries and pay less attention to developing countries or emerging economies, and discussions of household vehicle ownership under the new economic and policy situation are still lacking. This paper aims to fill this gap. Specifically, the retrospective review and prospective discussion in this paper focus on vehicle-ownership modeling for emerging economies, such as China and India, with emphasis on the influence of the sharing economy and the mobile internet on vehicle ownership. By giving an overview of various aspects of the disaggregate vehicle-ownership modeling process, the paper offers research experience and directions for future vehicle-ownership model development in emerging economies.

The paper reviews household vehicle-ownership modeling from multiple perspectives including data, methodology, application and prospect. The next section presents a review of the data source and type, SP/RP data and panel data for existing studies. After this section, we offer an overview of the modeling methodology of household vehicle ownership, divided into static and dynamic models. Section 4 discusses the application of vehicle-ownership models in emerging economies, especially for developing countries. In addition, a prospective view on the challenges and opportunities is provided in Sect. 5.

2 Data

Data acquisition is an essential prerequisite for modeling household vehicle ownership. The data of existing literature are summarized in three aspects as follows: data source and type, SP/RP data and panel data. From those aspects, some survey methods with reference value can be learned. In addition, in emerging economies, RP and SP data can be combined to improve the accuracy and precision of vehicle-ownership model and panel data can be applied to dynamic modeling analysis considering the dimension of time.

2.1 Data Source and Type

On the basis of data source and data type, the existing studies (since 1990) on vehicle ownership are selected and classified, as shown in Table 1. Through statistics and analysis, several observations can be obtained from the table.

Table 1: Data sources and types of vehicle-ownership research.
  • More than 50% of the study areas for vehicle ownership are located in North America (47), especially in the United States (37). Studies in Europe (26) account for about 30%, while studies from Asia (12) and Australia (2) are relatively few.

  • Existing data sources mainly come from the developed countries, while only a few developing countries have conducted disaggregate studies for vehicle ownership (e.g., China, Malaysia and Thailand). In addition, effective panel data have not been observed in developing countries possibly due to the limitation of economic and technical conditions.

  • In the US studies, the main data type is travel survey data, which were often derived from NHTS10,11,12,13,14,15,16, NPTS17,18,19,20, BATS21,22,23,24,25,26 and NYMTC’s RT-HIS27,28,29,30. The collection of these large-scale data is supported by relevant government agencies, which ensures a sufficient sample size and high reliability.

In addition, the current data can be classified into two broad categories: large-scale data collected with government support and small-scale data collected by researchers. In general, the former category is mostly RP data and has a large sample size (over 10,000 households in most cases), while the latter category is usually collected for a specific research purpose and has a smaller sample size (ranging from a few hundred to a few thousand). The large-scale data are also more widely used in the existing studies than those collected by researchers, probably because data in the latter category are usually not shared among researchers.

The main data collection methods are telephone interview, mail-back survey, home interview and internet survey. Along with the development of communication technology, the method of data collection is changing gradually. Taking NPTS87 as an example, surveys were conducted by the home-interview method in 1967 and 1977, and by the telephone method in 1988 to decrease cost and expand sample size; then, in 1995, a written diary and a household roster of trips were used to help respondents recall the trips made by them and with their household members. Besides, e-mail and online questionnaires have also become common forms of internet-based survey with the development of the internet.

In existing studies, the following measures have been adopted to improve the survey response rate: (1) rewarding the respondents with cash or gifts as an incentive35, 36; (2) carrying out face-to-face surveys with the assistance of relevant personnel (e.g., the police)66, 84; and (3) conducting interviews with a computer-assisted system to enhance the convenience and accuracy of the survey31, 32. With the increasing maturity of information technology under the new economic situation, online and e-mail questionnaires with a computer-assisted system can provide a convenient way for data acquisition in emerging economies. In addition, an online reward mechanism can be used to induce responses (e.g., online sweepstakes). If it is necessary to conduct a household survey or a face-to-face interview, the relevant departments (such as the residential committee or police station) can assist so as to improve the efficiency and accuracy of data acquisition.

2.2 SP and RP Data

Household vehicle demand models can be developed based on two types of survey data: revealed preference (RP) survey data or stated preference (SP) survey data. RP survey data correspond to actual choices observed through a survey of household vehicle ownership88, while an SP survey allows for a flexible design of hypothetical scenarios related to key attributes and yields multiple choice observations on one individual under the designed hypothetical scenarios89. Both data types have their respective drawbacks: RP data cannot capture respondents’ preferences on alternatives that do not exist in the current market or are unfamiliar, and SP data may not produce reliable forecasts of real choices.

Integrating RP and SP data can take advantage of the complementary strengths of each data source and help to develop a model with higher accuracy and precision. Morikawa et al.90 initially integrated RP and SP data for travel mode choice model development. In recent years, the RP/SP joint model has frequently been used in studies of travel mode choices91, 92. In the research on household vehicle ownership, RP and SP data can also be integrated for the analysis of alternative-fuel vehicles (AFV). Brownstone et al.93 first introduced the method of combining RP and SP data into the dynamic simulation system of automobile demand. Subsequently, they compared the joint RP/SP multinomial logit (MNL) and mixed logit (ML) models for traditional, alternative energy and electric vehicles using RP and SP data from California94. RP data are important for capturing the actual use of vehicles, and SP data are helpful for obtaining preferences on vehicle attributes that are not available in the current market. Thus the combination of RP and SP data allows the common parameters to be estimated more precisely. The results show that the ML model performs better than the MNL model in terms of goodness-of-fit. Golob et al.95, 96 conducted SP experiments for both alternative-fuel and petrol vehicles with consideration of various attributes such as body type, fuel type, refueling range and purchase price. Hensher and Greene97 analyzed RP/SP data through an NL (nested logit) model and three ML models, so as to study the preferences of vehicle-owning households for traditional and alternative energy vehicles. In addition, a stated preference experiment was used in Beck’s study58 to examine how environmental attitudes might influence vehicle choice behavior under an emission charging regime.
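The pooling idea described above can be illustrated with a minimal sketch. It assumes a joint MNL in which the RP and SP observations share the same taste coefficients, while the SP utilities are multiplied by a relative scale parameter. The function names, the toy attributes (price and range) and the parameter `mu_sp` are hypothetical and are not taken from any of the cited studies.

```python
import math

def mnl_probs(utilities):
    """Return MNL choice probabilities for a list of systematic utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def joint_rp_sp_loglik(beta, mu_sp, rp_obs, sp_obs):
    """Pooled log-likelihood of RP and SP choices.

    Each observation is (attribute_rows, chosen_index). RP utilities use a
    scale of 1; SP utilities are multiplied by the assumed scale mu_sp.
    """
    ll = 0.0
    for scale, data in ((1.0, rp_obs), (mu_sp, sp_obs)):
        for rows, chosen in data:
            utils = [scale * sum(b * x for b, x in zip(beta, row)) for row in rows]
            ll += math.log(mnl_probs(utils)[chosen])
    return ll

# Hypothetical data: each alternative is described by (price, range).
rp_obs = [([(2.0, 4.0), (3.0, 5.0)], 0)]
sp_obs = [([(2.0, 4.0), (2.5, 1.5)], 1)]
beta = [-1.0, 0.5]  # illustrative shared coefficients
print(joint_rp_sp_loglik(beta, 0.8, rp_obs, sp_obs))
```

In an actual application the coefficients and the scale parameter would be chosen to maximize this pooled log-likelihood; the sketch only evaluates it at given values.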

In standard SP experiments, the alternatives are designed without regard to the respondent’s RP choice, even if RP data are pooled with SP data in model estimation. In contrast, a “pivoting” design is sometimes applied in SP experiments. “Pivoting” means that the attributes in SP experiments are created by changing attributes of the chosen RP alternative98,99,100, so that unobserved factors in RP choices are carried over into SP choices. As a simple example of pivoting, respondents are asked to choose among alternative routes whose travel time and cost are set higher or lower than those of the route actually chosen in their recent trip. “SP-off-RP” scenarios, which can be considered a form of pivoting, describe a situation in which the chosen RP alternative is worsened and the other alternatives are improved to induce a change of choice. Unlike general pivoting experiments, respondents in SP-off-RP experiments face the same number of alternatives in the RP and SP scenarios, and the correspondence between SP and RP alternatives is one-to-one. For this reason, SP-off-RP questions have the advantage of being more realistic and easier for respondents to understand. Several studies are based on SP-off-RP data. Train and Wilson101 first proposed a modeling approach to deal with endogeneity issues in SP-off-RP data in 2008 and applied the approach to model agricultural shippers’ mode and route choices in the Pacific Northwest. Then, Train and Wilson102 presented a series of Monte Carlo exercises that explore the properties of estimators based on simulated SP-off-RP data. The results show that SP-off-RP questions are more efficient and accurate than standard SP questions. Guevara and Hess103 proposed a limited information maximum likelihood (LIML) approach to address the endogeneity issue in discrete-choice models based on SP-off-RP data. van Cranenburgh et al.104 generalized the SP-off-RP estimation procedure and estimated a generalized model to understand vacation behavior under high travel cost conditions. Owing to the complexity of the SP-off-RP modeling process, few studies have conducted SP-off-RP experiments on household vehicle ownership, except those by Yu et al.105 and Jiang et al.106 on survey design for choice modeling in this area.
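The construction of an SP-off-RP scenario can be sketched as follows. This is only an illustration of the design principle described above (the chosen RP alternative is worsened, the others are improved, and alternatives correspond one-to-one); the function name, the attribute names and the adjustment factors are assumptions made for the example.

```python
def sp_off_rp_scenario(rp_alternatives, chosen, worsen=1.2, improve=0.9):
    """Build one SP scenario from an RP choice situation.

    rp_alternatives: list of dicts holding cost-type attributes
                     (higher values are worse), e.g. 'time' and 'cost'.
    chosen: index of the alternative actually chosen in the RP setting.
    worsen / improve: hypothetical multiplicative adjustment factors.
    """
    scenario = []
    for i, alt in enumerate(rp_alternatives):
        factor = worsen if i == chosen else improve
        # Each SP alternative is derived from exactly one RP alternative.
        scenario.append({k: round(v * factor, 2) for k, v in alt.items()})
    return scenario

# Hypothetical RP situation with two alternatives; the first one was chosen.
rp = [{"time": 30.0, "cost": 10.0}, {"time": 40.0, "cost": 8.0}]
sp = sp_off_rp_scenario(rp, chosen=0)
print(sp)
```

The respondent would then be asked whether the original choice is retained under the modified attributes.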

2.3 Panel Data

From the perspective of the space–time dimension, econometric data can be classified into the following three categories: (1) Cross-sectional data: observations that come from different individuals or groups at a single point in time. These data highlight the differences among individuals; (2) Time-series data: a series of data obtained from continuous observation of the same object at different times, for which attention should be paid to the behavioral consistency of the selected objects and the serial correlation of random errors within the time interval; (3) Longitudinal data or panel data: data that integrate cross-sectional data and time-series data, which can be used to analyze the characteristics of each object over time. Panel data can not only reveal the differences between individuals through model parameters, but also describe the dynamic variation of individuals107.

A summary of the literature shows that most vehicle demand models relied on cross-sectional data from behavioral surveys, while panel data were generally used for the dynamic analysis of vehicle ownership. For example, Woldeamanuel et al.70 analyzed the changes in household car ownership over time based on the German Mobility Panel data from 1996 to 2006. In addition, panel data were also used to analyze vehicle holding and transaction behaviors. Yamamoto et al.44 developed a competing-risk-duration model, where data were obtained from a three-wave panel survey conducted in California in 1993, 1994, and 1996 to forecast demand for alternative-fuel vehicles. The collection of panel data can be substituted by a retrospective survey for dynamic vehicle demand modeling. For example, Mohammadian et al.53, 55, 56 conducted a retrospective survey on household vehicle ownership in Toronto, Canada, and collected relevant data from over 900 families for vehicle transaction modeling.

As the acquisition cost of panel data is quite high, pseudo-panel data or quasi-panel data have been used as alternatives in many studies to simplify the data collection procedure. Repeated cross-sectional databases over multiple time points can be merged to generate pseudo-panel data, which is a compromise between a single-year cross-sectional dataset and a true panel dataset collected across multiple years51, 73. Differing from panel data, pseudo-panel data allow for individual differences in each time period and for comparing the behaviors of groups of individuals with similar characteristics in different time periods. Anowar et al.51 employed a pseudo-panel approach to study vehicle-ownership evolution in the Montreal region, Canada, using cross-sectional OD survey datasets of 1998, 2003 and 2008. In this study, a temporal variable called “time elapse from 1998” is introduced to represent the time difference between the survey year and the base year (1998), so as to examine trends of vehicle ownership over the years when panel data are difficult to obtain. On the other hand, quasi-panel data were applied to analyze the impact of changes in neighborhood characteristics or the built environment on vehicle ownership35, 64. In the acquisition of ideal panel data, the relevant attributes need to be measured both before and after the residential relocation. Owing to the challenges of collecting true panel data, quasi-panel data may be collected as substitutes. Unlike true panel data, the earlier variables of quasi-panel data are not measured directly at time (t − 1), but are recalled at time (t). For instance, Cao et al.35 applied a mail-out and mail-back approach to collect data containing the recalled values of variables at time (t − 1) for each mover, all of whom are current residents who moved into the neighborhood within the previous year.
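The pseudo-panel idea can be sketched in a few lines: households from repeated cross-sections are grouped into cohorts, and the cohort mean in each survey year is tracked over time. The record layout, the cohort definition (birth decade of the household head) and the function names below are hypothetical; the helper `years_since_base` mirrors the “time elapse from 1998” variable described above.

```python
from collections import defaultdict

def build_pseudo_panel(records, cohort_key, year_key="year", value_key="vehicles"):
    """Average a variable by (cohort, survey year) across repeated cross-sections."""
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        cell = sums[(rec[cohort_key], rec[year_key])]
        cell[0] += rec[value_key]  # running total for the cell
        cell[1] += 1               # number of households in the cell
    return {key: total / n for key, (total, n) in sorted(sums.items())}

def years_since_base(year, base_year=1998):
    """Temporal variable measuring the time elapsed from the base survey year."""
    return year - base_year

# Hypothetical households drawn from two independent cross-sectional surveys.
records = [
    {"year": 1998, "cohort": "1960s", "vehicles": 1},
    {"year": 1998, "cohort": "1960s", "vehicles": 2},
    {"year": 2003, "cohort": "1960s", "vehicles": 2},
    {"year": 2003, "cohort": "1970s", "vehicles": 0},
]
panel = build_pseudo_panel(records, cohort_key="cohort")
for (cohort, year), mean_vehicles in panel.items():
    print(cohort, year, years_since_base(year), mean_vehicles)
```

The households in each year differ, but the cohort means form a series that can be followed over time as if it came from a panel.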

The acquisition of panel data is a long-term process, which requires a great deal of effort accumulated over a long time period. Fast-growing emerging economies have compressed the motorization process, which took developed countries many decades, into just a number of years. Given the rapid growth of emerging economies, true panel data may not be appropriate for vehicle-ownership forecasting over a short time period. Therefore, in the current situation, the retrospective survey, quasi-panel survey or SP survey may be applied for modeling dynamic vehicle ownership in emerging economies. Meanwhile, it is necessary for governments or research institutions in developing countries to carry out large-scale panel surveys, so as to lay a foundation for future studies on the trends of vehicle ownership.

3 Methodology

Vehicle ownership is a mature research topic in travel demand analysis, and a review of its modeling methodology is an indispensable part of this topic. In this paper, the modeling methodologies for vehicle ownership are classified into two broad categories: static and dynamic models. The existing literature discusses static vehicle-ownership models extensively, but pays less attention to dynamic models, except for hazard-based duration models. Since there is a large gap in the dynamic modeling of vehicle ownership in emerging economies, analyzing the development trend and dynamic change of vehicle ownership there, with reliable data support, can become another research hotspot. Although the existing literature is mostly focused on developed countries, its modeling approaches have important reference value for analyzing vehicle ownership in emerging economies. In addition, advanced modeling methodology can be applied in emerging economies to better forecast vehicle ownership and demand.

3.1 Static Vehicle-Ownership Models

The static vehicle-ownership models predict vehicle ownership at a particular time while ignoring the dynamics of vehicle-ownership evolution. The modeling approaches can be classified in the following common categories. The summary of modeling approaches for static vehicle ownership is displayed in Table 2.

Table 2: Summary of modeling approaches for static vehicle-ownership models.

3.1.1 Standard Discrete-Choice Model


The standard discrete-choice model was generally applied to analyze vehicle ownership and vehicle type. For vehicle ownership, various ownership levels are usually considered as discrete choices made by each household. For vehicle type, discrete-choice model is usually applied to analyze the preference of households on different types of vehicles but usually not for the vehicle count of each type. The commonly used models for vehicle type are multinomial logit (MNL) model and nested logit (NL) model.

In general, the binary logit (BL) model or binary probit (BP) model were preferred by most researchers to evaluate the levels of household vehicle ownership39, 66, 80, such as whether a household owns vehicle(s) or not. This type of model has a simple form and can identify and quantify significant factors affecting whether private car is owned or not. However, it is unable to determine the specific number of vehicles owned by households. Whelan1 used the hierarchical binary logit model and considered the composition of preference set. Based on this model, the author further analyzed influential factors of vehicle purchase and explained reasons why some households do not own a vehicle.

Concerning the fleet size of vehicles, the BL and BP models are no longer applicable. Two general decision mechanisms have been extensively used for model development: the ordered-response mechanism (ORM) and the unordered-response mechanism (URM). Household vehicle ownership generally appears in the form of ordered discrete variables in travel surveys. Correspondingly, the inherent order of the discrete variables can be exploited by ordered-response models. The representative ordered-response models of vehicle ownership include the ordered logit (OL) model and ordered probit (OP) model10, 19, 21, 27, 35, 37, 47. On the other hand, the multinomial logit (MNL) model is the most widely employed among the unordered-response models based on the random utility maximization (RUM) principle29, 38, 57, 67, 82, 108. Compared with the OP and OL models, the MNL model has the disadvantage of treating the alternatives as unordered and thus ignoring the ordinal nature of ownership levels. However, the MNL model has a specific utility function for each alternative and can, therefore, fit data better. Bhat and Pulugurta21 compared the MNL model with the ordered-response model and found that the former was superior to the latter in goodness-of-fit.
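The structural difference between the two mechanisms can be made concrete with a small sketch. In the ordered logit, a single latent propensity is cut by increasing thresholds; in the MNL, each ownership level has its own utility. The numerical values of the propensity, thresholds and utilities are illustrative assumptions only.

```python
import math

def logistic_cdf(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(xb, thresholds):
    """P(y = k), k = 0..K, for latent propensity xb and increasing thresholds."""
    cdf = [logistic_cdf(t - xb) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cdf:
        probs.append(c - prev)  # probability mass between adjacent thresholds
        prev = c
    return probs

def mnl_probs(utilities):
    """P(y = k) when every ownership level has its own systematic utility."""
    m = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical household choosing among 0, 1, or 2+ vehicles.
print(ordered_logit_probs(0.3, [-1.0, 1.0]))  # one index, two thresholds
print(mnl_probs([0.0, 0.8, 0.1]))             # one utility per alternative
```

The ordered model needs only one set of coefficients plus thresholds, whereas the MNL estimates a separate utility function for each ownership level, which explains its greater flexibility in fitting the data.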

As an extension of the MNL model, the nested logit (NL) model relaxes the Independence of Irrelevant Alternatives (IIA) property and allows for unequal cross-elasticities for alternatives in different nests. The model can accommodate correlation between alternatives in common nests, which is measured by a nesting coefficient (the nesting coefficient takes a value between zero and one, and a smaller value indicates higher correlation or greater substitutability between the alternatives)30, 85, 109. For example, Kermanshah and Ghazi109 developed an NL model to partition the vehicle fleet decision into two levels, in which the first level represented whether a vehicle is owned and the second level represented the ownership of one or of two or more vehicles.
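A two-level nested logit of this kind can be sketched as below, using one common textbook formulation in which utilities within a nest are divided by the nesting coefficient and the nest probability is driven by the inclusive value. The nest structure loosely follows the own/not-own example above, but the utilities and coefficients are hypothetical and the code is not a reproduction of any cited model.

```python
import math

def nested_logit_probs(nests, lambdas):
    """Two-level nested logit probabilities.

    nests:   dict nest_name -> dict alternative -> systematic utility
    lambdas: dict nest_name -> nesting coefficient in (0, 1]
    """
    # Inclusive value (log-sum) of each nest.
    iv = {n: math.log(sum(math.exp(v / lambdas[n]) for v in alts.values()))
          for n, alts in nests.items()}
    denom = sum(math.exp(lambdas[n] * iv[n]) for n in nests)
    probs = {}
    for n, alts in nests.items():
        p_nest = math.exp(lambdas[n] * iv[n]) / denom
        within = sum(math.exp(v / lambdas[n]) for v in alts.values())
        for alt, v in alts.items():
            # P(alternative) = P(nest) * P(alternative | nest)
            probs[alt] = p_nest * math.exp(v / lambdas[n]) / within
    return probs

# Hypothetical utilities: upper level own / not own, lower level 1 vs 2+ vehicles.
nests = {"no_vehicle": {"0": 0.0}, "owner": {"1": 1.0, "2+": 0.5}}
print(nested_logit_probs(nests, {"no_vehicle": 1.0, "owner": 0.5}))
print(nested_logit_probs(nests, {"no_vehicle": 1.0, "owner": 1.0}))  # equals MNL
```

When every nesting coefficient equals one, the expression collapses to the MNL model; values below one make the alternatives within the same nest closer substitutes.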

The standard discrete-choice model can also be applied to vehicle-type choices. For instance, Choo and Mokhtarian33 developed an MNL model in which vehicle models were classified into nine categories for analysis. McCarthy and Tay43 explored a two-level NL model based on fuel efficiency and evaluated the effect of fuel efficiency on vehicle purchase. In this model, the first level included three nests of low, middle and high fuel efficiency, and the second level included a reasonable number of make/models, drawn by a random sampling procedure to define a choice set for each nest. Mohammadian and Miller54 developed an NL model based on data collected in Toronto, and considered the possibility of choosing among six vehicle types.

3.1.2 Count Model


The count model is applicable to the case that the dependent variable takes non-negative integers with a considerable proportion of zero values. As the household vehicle count is a non-negative integer, several researchers developed count models for household vehicle counts. In the existing studies, count models mainly fall into two categories: the Poisson regression model32 and negative binomial regression model31, 110. Besides, the Poisson-log-normal model, developed based on the Poisson distribution mixed with log-normal heterogeneities, was also used by researchers66. This model captures the unobserved heterogeneity with a symmetric normal distribution but its main drawback is not having a closed-form expression for the likelihood function.
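The two main count distributions can be compared with a short sketch. The negative binomial is written here in the common mean-dispersion form, in which the variance equals the mean plus the dispersion parameter times the squared mean; the function names and the parameter values are assumptions for illustration.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of k vehicles given mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def negbin_pmf(k, mu, alpha):
    """Negative binomial probability with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)

# Hypothetical household with an expected count of 1.5 vehicles.
mu, alpha = 1.5, 0.5
for k in range(4):
    print(k, round(poisson_pmf(k, mu), 4), round(negbin_pmf(k, mu, alpha), 4))
```

With the same mean, the negative binomial places more probability on zero and on large counts, which is why it is often preferred when the observed counts are over-dispersed relative to the Poisson assumption of equal mean and variance.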

3.1.3 Discrete–Continuous Model

The joint discrete–continuous model system provides an integrated econometric framework for discrete and continuous decisions. There are two types of discrete–continuous models which are currently applied in the literature111.

The first type of model, namely the two-stage procedure (or Heckman correction), was initially proposed by Heckman112 to solve the problem of self-selection and thereby obtain a consistent estimate of the continuous model. Dubin and McFadden113 extended this two-stage binary model to the case of multiple choices. For the research on vehicle ownership, Fang111 developed a discrete–continuous model by combining multivariate ordered equations with Tobit equations and jointly estimated vehicle-type choice and usage. Spissu et al.24 presented a joint multinomial logit (MNL)–continuous regression model of vehicle-type choice and miles of travel. Konduri et al.11 developed a probit-based discrete–continuous model and applied it to the joint modeling of vehicle-type choice and tour length. Similarly, Liu et al.16 also applied the discrete–continuous model to the analysis of household vehicles. Therein, the multinomial probit (MNP) model is used to estimate household vehicle ownership, the MNL model is used to estimate the vehicle class and vintage decision, and the regression model is used to estimate vehicle usage.
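The correction term used in the second stage of a Heckman-type procedure is the inverse Mills ratio, which can be computed as in the sketch below. The function names and the numerical inputs are hypothetical; the sketch only shows how the expected continuous outcome of a self-selected household shifts with the selection index, not a full estimation routine.

```python
import math

def inverse_mills_ratio(z):
    """Ratio of the standard normal density to its cumulative distribution at z."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf

def selected_mean(xb, rho_sigma, z_gamma):
    """Expected outcome for a selected unit: xb plus the selection correction.

    rho_sigma is the assumed product of the error correlation and the
    outcome error standard deviation; z_gamma is the first-stage probit index.
    """
    return xb + rho_sigma * inverse_mills_ratio(z_gamma)

# Hypothetical values: mileage index, error covariance term, selection index.
print(inverse_mills_ratio(0.0))
print(selected_mean(xb=10.0, rho_sigma=0.6, z_gamma=0.5))
```

If the correction term is omitted when the errors of the two stages are correlated, the coefficients of the continuous (usage) equation are generally inconsistent.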

The multiple discrete–continuous extreme value (MDCEV) model is another type of utility-based econometric model. In some cases, household vehicle-ownership decisions may be simultaneously related to both discrete variables (number of vehicles, vehicle types, etc.) and continuous variables (vehicle use and mileage, etc.). To solve such multiple discrete–continuous choice problems, Bhat114, 115 proposed this modeling framework. On the basis of random utility theory, the MDCEV model has a closed-form likelihood expression and can be applied to situations with a large number of discrete choices. Then Bhat and Sen22 applied the mixed MDCEV model, which could simultaneously deal with unobserved heteroscedasticity and the error correlations among the utility functions of different vehicle types. Ahn et al.84 developed the MDCEV model to explain consumer preferences for vehicles with different fuels. In another study, Bhat et al.116 adopted the joint nested MDCEV–MNL model, where the MDCEV model was developed to analyze vehicle type and usage in the first level and the MNL model was developed to analyze vehicle make/model in the second level. Through the same modeling framework, Vyas et al.42 modeled choice combinations of household vehicle fleet size and vehicle type and identified the primary driver of each combination.

3.1.4 Multidimensional Choice Model

By constructing a series of submodels for different choice dimensions, various decision processes can be jointly modeled in a unified mixed multidimensional choice modeling system. The submodels can be ordered, multinomial, continuous and count models, etc.12, 13, 23, 26 For example, Bhat and Guo23 developed a joint model with an MNL model for residential location choice and an OL model for vehicle ownership. Bhat et al.12 used the bivariate MNP model to analyze the choice process of residential location and vehicle ownership, where the two MNP models are joined in a bivariate system.

In most of the previous literature, the bivariate ordered probit (OP) model was developed to analyze both household automobile and motorcycle ownership, since it can take account of the interdependencies between them76, 77, 117, 118. Furthermore, the multivariate ordered probit model can be developed to extend the choice dimension. Fang111 developed the Bayesian multivariate OP model to estimate household decisions on the number of vehicles in each category. Ma et al.81 presented a multivariate OP model to understand four types of vehicle ownership within a household, including automobile, motorcycle, electric bicycle and human-powered bicycle. In addition, Zhao18 applied a multivariate negative binomial model for the number of vehicles by type and provided a way to capture the underlying preference of vehicle type.

3.1.5 Copula-Based Model


Recently, some researchers have applied copula functions to develop joint models for household decisions on vehicle type and use24, 25, 41, 56, 119. The advantage lies in the accommodation of non-normal marginal distributions and closed-form likelihood expressions. Bhat and Eluru119 explained copula functions in detail and applied the Gaussian copula to the joint decision analysis of residential choice and daily vehicle miles of travel (VMT). Spissu et al.24 analyzed the relationship between vehicle types and miles of travel using discrete–continuous models based on copula functions. In the copula-based framework of Eluru et al.25 and Paleti et al.41, a GEV-based logit model is used to represent the choices of residential location, vehicle count and vehicle type. The modeling framework can adapt to a multidimensional decision system composed of joint residential choices and vehicle composition. Rashidi and Mohammadian56 utilized a set of multivariate Archimedean copula density functions to approximate the multivariate probability density function and captured the correlation among multiple transaction types.
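How a copula couples two marginal distributions can be illustrated with the Clayton copula, one member of the Archimedean family mentioned above. The choice of the Clayton form, the function name and the numerical values are assumptions for illustration and do not correspond to the specification of any cited study.

```python
def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for marginal probabilities u, v in [0, 1], theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0  # the joint probability vanishes when either margin is zero
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

# Hypothetical marginal probabilities of two household decisions,
# e.g. P(vehicle type index <= a) = 0.5 and P(mileage <= b) = 0.5.
u, v = 0.5, 0.5
print(u * v)                     # joint probability under independence
print(clayton_cdf(u, v, 2.0))    # joint probability with positive dependence
print(clayton_cdf(u, v, 0.001))  # a small theta approaches independence
```

The marginal distributions can be of any form (for example, logit for the discrete choice and a regression error for the continuous outcome), since the copula only operates on their cumulative probabilities.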

3.1.6 Structural Equation Model (SEM)

When applied to model vehicle ownership, the structural equation model treats vehicle ownership as an endogenous variable in order to explain its complex relationships with the residential environment and travel behavior. SEM helps to distinguish three types of effects: total effect, direct effect and indirect effect. Therein, the direct effect is the response of the ‘effect’ variable to a change in the ‘cause’ variable; the indirect effect is the effect of one variable on another that operates through one or more intervening endogenous variables; and the total effect is the sum of the direct effect and the indirect effect. For example, in the model of Giuliano and Dargay20, not only can the direct effect of income on travel decisions be considered, but the indirect effect can also be measured through the effect of income on vehicle ownership and of vehicle ownership on travel decisions.
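For a linear path model, the decomposition can be worked through numerically. The path coefficients below are hypothetical values chosen for illustration (they are not estimates from Giuliano and Dargay or any other study), with income affecting travel both directly and through vehicle ownership.

```python
def decompose_effects(direct, mediated_paths):
    """Split a total effect into direct and indirect parts.

    direct:         coefficient of the cause on the outcome
    mediated_paths: list of (cause -> mediator, mediator -> outcome) coefficients
    """
    indirect = sum(a * b for a, b in mediated_paths)
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}

# Hypothetical standardized coefficients:
# income -> travel = 0.20, income -> vehicles = 0.50, vehicles -> travel = 0.40
effects = decompose_effects(0.20, [(0.50, 0.40)])
print(effects)
```

In this illustration half of the total effect of income on travel is channeled through vehicle ownership, which a single-equation regression of travel on income alone could not reveal.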

3.1.7 Simultaneous Equation System

The simultaneous equation model system consists of interdependent choice models and can be applied to model vehicle ownership17, 28, 62, 116. Chen et al.28 developed two simultaneous equations with two dependent variables: vehicle ownership and the propensity to use vehicles. In that study, the use of a vehicle for commuting is observed, while the underlying propensity is not. The authors therefore assumed that the latent propensity captures unobserved traits and attitudes toward car use. Schimek17 also developed a simultaneous equation system to explore individuals’ residential choices and travel decisions, with vehicle ownership as an intermediate variable.

3.2 Dynamic Vehicle-Ownership Models

Most disaggregate studies of household vehicle ownership over the past few decades have employed static models based on cross-sectional data, and only a small number have employed dynamic models. In recent years, some progress has been made in dynamic vehicle-ownership models, which can be used to analyze trends in household vehicle demand. Panel data are generally preferred, since they can capture the dynamic characteristics of vehicle-ownership decisions. In a panel sample, households are usually asked to report the number and type of their vehicles at equal time intervals (usually 1 year). Households can choose to maintain or change their vehicle ownership so as to maximize utility121. Because panel data are difficult to collect, there are almost no disaggregate studies of dynamic vehicle-ownership models in emerging economies, which makes this one of the future research directions for these economies.

Recently, the duration model has become a hot spot in the dynamic analysis of vehicle ownership. This model originated in the field of biostatistics, where it is called “survival analysis” and is applied to explore life spans and the factors affecting survival rates. Hockerman122 was the first to study a vehicle transaction model, using “whether or not to trade” to represent trading decisions. In existing studies of vehicle transaction behavior, the duration model is the most common approach, especially the hazard-based duration model. Models of this kind can be used to investigate vehicle-ownership duration and vehicle transaction behavior as a function of the characteristics of the household, the economy and so on63, 122.

The hazard is the probability that a spell (here, the holding of a vehicle) ends at a certain time, given that it has lasted until that time. Hazard-based duration models can generally be classified into parametric, semi-parametric and non-parametric models. In the conventional duration model, vehicle ownership is terminated by a single type of event45, 63, 79. However, the termination of an existing vehicle holding can actually be attributed to several types of events: acquiring a new or used vehicle, replacing a vehicle with a new or used one, and disposing of a vehicle without replacement8, 44, 53, 55, 71, 107. Accordingly, hazard-based duration models fall into the following two categories:

3.2.1 Single Hazard Duration Model

The single hazard duration model assumes that the termination of vehicle ownership is caused by a single type of event. For example, the duration model developed by de Jong63 investigated household vehicle holding duration and analyzed the effects of multiple factors on it.
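To make the notion of a hazard concrete, the sketch below evaluates the hazard and survival functions of a Weibull duration model, a common parametric choice. The Weibull form and the parameter values are assumptions chosen for illustration; they are not taken from de Jong63 or any other cited study.

```python
import math


def weibull_hazard(t, shape, scale):
    """Hazard h(t): rate at which a holding spell ends at time t,
    given that it has lasted until t."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_survival(t, shape, scale):
    """Survival S(t): probability that the vehicle is still held at time t."""
    return math.exp(-((t / scale) ** shape))


if __name__ == "__main__":
    # Hypothetical parameters: shape > 1 means the hazard rises with holding time.
    shape, scale = 1.5, 8.0
    for years in (1, 5, 10):
        print(years,
              round(weibull_hazard(years, shape, scale), 3),
              round(weibull_survival(years, shape, scale), 3))
```

With shape equal to 1 the hazard is constant (the exponential model); a shape above 1 represents a vehicle that becomes more likely to be given up the longer it has been held.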

3.2.2 Competing Hazards Duration Model

The competing hazards duration model accounts for the several types of events that can end a duration (e.g., acquisition, disposal and replacement) and defines an individual hazard for each specific exit state. However, dependences have been reported among the hazards of different events, and this correlation needs to be analyzed. Gibert107 was one of the first to use a duration model based on panel data to estimate vehicle holding duration. Yamamoto et al.44, 71 developed a competing hazards duration model to analyze household vehicle transaction behavior, assuming three types of competing hazards. To consider the effect of transactions, changes in household variables were converted into dummy variables indicating whether or not each variable had changed, and these were entered into the model as covariates. In addition, a transaction itself changed the demand and motivation to own a vehicle and thus affected the timing and type of subsequent transactions. Rashidi and Mohammadian56 used a competing proportional hazard model to combine the type and timing of transactions at a disaggregate level. For the error terms of the transaction types in the nested framework, a set of multivariate Archimedean copula density functions was used to estimate the multivariate probability density function, so as to accommodate the inter-correlations among multiple types of transactions.
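The competing hazards idea can be sketched in its simplest form as follows. The example assumes constant, mutually independent cause-specific hazards, which is exactly the simplification that the studies above relax when they model dependence among hazards; the hazard values and function name are hypothetical.

```python
import math


def competing_risks(hazards, t):
    """Survival and exit probabilities by time t under constant,
    independent cause-specific hazards.

    hazards: dict mapping exit type -> hazard rate per year.
    Returns (probability of no transaction by t,
             dict of probabilities of exiting by t through each type).
    """
    total = sum(hazards.values())            # overall hazard is the sum
    survival = math.exp(-total * t)
    exits = {name: (h / total) * (1.0 - survival)
             for name, h in hazards.items()}
    return survival, exits


if __name__ == "__main__":
    # Hypothetical annual hazards for the three exit states.
    rates = {"acquire": 0.05, "replace": 0.10, "dispose": 0.02}
    surv, exits = competing_risks(rates, t=3.0)
    print("no transaction:", round(surv, 3))
    for name, p in exits.items():
        print(name, round(p, 3))
```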

In addition, the random-effect model and the simultaneous equation model system can also be used for the dynamic analysis of vehicle ownership. For example, Mohammadian and Miller53 used a random-effect model to distinguish between heterogeneity and state dependence in choice behavior. They then used the NL model to analyze the choice of vehicle type and identify the significant variables that affect transaction choices. Rashidi and Mohammadian46 developed a hazard-based dynamic equation system and analyzed the timing of vehicle transactions, residential mobility and employment relocation, treating both workplace and residential location as endogenous variables.

3.3 Comparison of Model Types

According to research object and purpose, household vehicle demand models can be roughly classified into four categories: vehicle ownership (VO), vehicle type (VT), vehicle use (VU) and vehicle transaction (VTR) models. Table 3 displays the relations between model types and modeling approaches, based on the applications found in the literature. By comparing the model types, appropriate modeling approaches can be selected for different modeling objectives in emerging economies. The model types are described and compared as follows.

Table 3: Matrix for model types and modeling approaches.

3.3.1 Vehicle-Ownership Model

The vehicle-ownership model is the most common vehicle demand model, and the literature on vehicle ownership is extensive. The model is usually used to forecast (1) whether or not a vehicle is owned (binary choice) or (2) the number of vehicles owned, which is often called ‘vehicle count’ or ‘fleet size’ in the literature. When the model is used alone, the research object is generally limited to automobiles (cars).

3.3.2 Vehicle-Type Model

‘Vehicle’ has two interpretations in the literature on vehicle-type models. When ‘vehicle’ is a collective concept, vehicle type covers household private vehicles, both motorized and non-motorized, such as cars, motorcycles, bicycles and electric bicycles; when ‘vehicle’ refers to the automobile (car), ‘vehicle type’ refers to the classification of automobiles. Classifications differ across surveys; Table 4 lists examples for comparison.

Table 4: Vehicle classification schemes.

3.3.3 Vehicle Use Model

The vehicle use model is usually not studied separately. Previous studies tend to combine it with a vehicle-ownership model or a vehicle-type model. Vehicle miles traveled (VMT) or vehicle kilometers traveled (VKT) and the number of vehicle trips are used to represent vehicle usage in vehicle use models.

3.3.4 Vehicle Transaction Model

The vehicle transaction model is a dynamic demand model. Panel data are usually used to explore trends in household vehicle ownership and to analyze the choices of replacing, disposing of and acquiring vehicles. The duration model is usually chosen as the main modeling approach.

4 Application

4.1 Relevant Explanatory Factors

The factors that influence household vehicle ownership can be classified into the following categories: household attributes, individual attributes, built environment, transit, vehicle and other attributes. The literature shows that household attributes and built environment attributes are the two most important considerations in vehicle demand analysis. In recent years, transit attributes have become another important factor in vehicle-ownership models9. The specific variables in each category are presented in Table 5.

Table 5: Classification of explanatory factors.

4.2 Application in Developing Countries

With the growth of urbanization rate and increase in residents’ income level, the cities in developing countries are generally facing challenges caused by high-speed motorization. The popularity of private cars has brought a series of problems, such as parking difficulties, environmental pollution and energy consumption. As a major component of emerging economies, developing countries have also conducted quite a few studies on vehicle-ownership modeling.

In recent years, approximately 50% of urban trips worldwide have been made by private car. It is estimated that car trips will reach 6.2 million by 2025, about twice the number in 2005123. Developing countries account for a higher share of the increase in private vehicle stock124. For example, the private vehicle stock in China has increased from 43.39 to 108.50 million since 2006125. By 2025, car ownership is projected to grow 36% per year in China, 14% in India and nearly 1% in North America and Europe86. However, car ownership per capita in developing countries (such as China and India) is much lower than that in developed countries and therefore has considerable room for future growth126.

Some of the current studies in developing countries are listed in Table 6. The table yields several interesting findings about the factors influencing vehicle ownership in developing countries. First, the distance to the Central Business District (CBD) or city center is a common explanatory variable. According to the studies of Yamamoto75 and Li et al.80, households close to the CBD are more inclined to own vehicles (cars, motorcycles and bicycles) in Malaysia and China. In the study of Chile108, households located about 10 km from the CBD are the most likely to own a car, while households farther from this distance are less likely to do so. This non-linear pattern arises because both “distance to CBD” and “distance to CBD squared” are significant in the model. The study also indicated that households far from the CBD tend to drive more kilometers. Second, population density is another important built environment variable. Car ownership is negatively correlated with population density in the current studies of developing countries (China and Malaysia). In Malaysia, households in areas with high population density prefer to own motorcycles and bicycles. By contrast, families in areas with high population density in Shanghai tend to own fewer motorcycles, which may be due to the policy of restricting motorcycle use in urban areas. Third, household income is a household attribute of great concern. Households with low income are less likely to own a car (Thailand85 and Iran123), while households with higher income are more likely to own a car (China80 and Chile108). Fourth, studies in developing countries commonly find that the more workers there are in a household, the more likely the household is to own a vehicle75, 76, 123 or to travel more kilometers108.
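The role of the “distance to CBD” and “distance to CBD squared” terms can be illustrated with a short sketch: when the linear coefficient is positive and the squared coefficient is negative, the contribution of distance to the utility of owning a car peaks at a turning point and declines beyond it. The coefficients below are hypothetical values chosen so that the peak falls at 10 km; they are not the estimates reported in the study of Chile108.

```python
def distance_term(distance_km, beta_linear, beta_squared):
    """Contribution of CBD distance to the utility of owning a car."""
    return beta_linear * distance_km + beta_squared * distance_km ** 2


def turning_point(beta_linear, beta_squared):
    """Distance at which the quadratic distance term reaches its extremum."""
    return -beta_linear / (2.0 * beta_squared)


if __name__ == "__main__":
    # Hypothetical coefficients: positive linear term, negative squared term.
    b1, b2 = 0.20, -0.01
    print("turning point (km):", turning_point(b1, b2))
    for d in (2, 10, 25):
        print(d, round(distance_term(d, b1, b2), 3))
```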

Table 6: Main empirical studies of vehicle ownership in developing countries.

For cities and metropolitan areas with high population density and a well-developed public transportation system, vehicle ownership concerns whether a household owns a car or how many cars it owns. By contrast, a different but particularly important issue in developing countries is the ownership of other vehicle types, such as motorcycles, electric bicycles and bicycles. Non-motor vehicles remain an indispensable part of the study of vehicle ownership. In the study of Li et al.80, households with a bicycle or e-bicycle are less likely to own a car. Cherry et al.127 analyzed the dynamics of electric bike ownership in Kunming, China, and concluded that e-bikes have been replacing bus, car/taxi and bicycle trips over the years. Ma et al.81 used the error correlation matrix of a multivariate ordered probit model to analyze the correlation between the ownership of different vehicle types. They found a strong substitution relationship between automobiles and electric bicycles/motorcycles, a complementary relationship between the ownership of electric bicycles/motorcycles and that of bicycles, and a substitution relationship between motorcycles and electric bicycles.

The motorization process of cities in developing countries differs from that of developed countries. Most cities in developing countries are characterized by high population density, rapid growth in gross domestic product (GDP) and a mixture of private transportation modes82. Some studies of vehicle ownership compare developing and developed countries. For example, Sanko et al.76 analyzed household car and motorcycle ownership in Bangkok (Thailand) and Kuala Lumpur (Malaysia) and compared it with that in Nagoya (Japan). The results show that car and motorcycle ownership are substitutes in the developing countries but complements in Nagoya (a developed country). Car and motorcycle ownership behaviors in Nagoya in 1981 are closer to those in Bangkok and Kuala Lumpur in 1991 and 2001 than the behaviors in Nagoya in 1991 and 2001 are. This finding indicates that vehicle-ownership behaviors in developing countries are similar to the earlier behaviors in developed countries.

From the perspective of data collection, there are quite a few studies on demand prediction based on aggregate data in the context of developing countries125, 126, 128, 129, primarily because researchers have easy access to complete aggregate data from national statistics bureaus. However, most developing countries have not yet established a complete microsurvey database at the household level. Researchers can therefore only conduct small-scale surveys to develop disaggregate vehicle-ownership models, which is time-consuming and may suffer from sampling bias. Developed countries, on the other hand, benefit from well-archived databases for disaggregate modeling analysis. At present, developing countries still lack panel data, which take a long time and are costly to collect.

4.3 Emerging Travel Technologies

Technologies are driving an unprecedented wave of innovation in mobility services. With the development of the mobile internet and the sharing economy, a series of travel modes are gradually emerging, such as e-hailing, shared cars and shared bikes, which greatly influence the lives and activity-travel patterns of urban residents. As a result, household vehicle ownership may be affected if residents shift from private vehicles to the new travel modes. For instance, the use of e-hailing and shared cars may reduce household demand for private automobiles, and shared bikes may become an alternative to privately owned non-motor vehicles. In this context, previous conclusions may no longer hold, since existing models of household vehicle ownership do not consider the new alternative travel modes. It is thus of great importance to analyze household vehicle ownership in light of emerging travel technologies and economic models, especially for emerging economies.

4.3.1 E-Hailing

E-hailing is the process of ordering a car, taxi, limousine or any other form of transportation pickup via a mobile device, through platforms that allow drivers and customers to communicate efficiently. E-hailing services have developed rapidly and become popular all over the world. Some of the best-known e-hailing companies are Lyft, Uber, Curb and DiDi. Zhong et al.130 analyzed how different factors influenced travelers’ choice of e-hailing services based on a web survey in Shanghai, China. Rayle et al.131 surveyed e-hailing users in San Francisco and found that at least half of the e-hailing trips had previously been made by public transit or private car. The findings of Wu et al.132 indicated that e-hailing services had replaced not only private vehicle trips but also trips made by other modes such as mass public transit, walking and bicycling.

4.3.2 Car Sharing

Car sharing originated in Zurich, Switzerland in 1948133 and has become popular since the 1990s. In the past 20 years, car sharing has been popularized in Europe and extended to North America, South America, Asia and Australia134. As a new travel mode positioned between the car and public transit, car sharing has been introduced to encourage residents to switch from private cars to shared cars, which plays an active role in reducing private car ownership and greenhouse gas emissions135. Car sharing has been found to be very successful in reducing car ownership. Caulfield67 examined the characteristics of car-owning households in Dublin, Ireland, and identified the households that would be most likely to give up an extra car and use a car-sharing scheme. Kim et al.136 explored the factors affecting attitudes toward an electric vehicle sharing program, program participation and car ownership in Seoul. Based on a survey in the Netherlands, Nijland and van Meerkerk137 found that car sharers owned fewer cars and drove less than they did prior to car sharing, and that the shared cars mostly replaced a second or third car.

4.3.3 Bike Sharing

Growing concerns about global motorization and climate change have led to increasing interest in sustainable transportation alternatives such as bike sharing. Several studies have documented bike sharing’s social and environmental benefits, which include reduced auto use, increased bicycle use, etc.138 Shaheen et al.139 found that bike sharing was capturing modal share from bus transit, walking, auto, and taxis, and was attractive to car owners.

Furthermore, private vehicles are responsible for a large proportion of oil consumption, leading to air pollution and greenhouse effects in urban areas around the world. Adding alternative-fuel cars to the market would effectively lower gasoline and diesel consumption and the emission of pollutants84. With the advance of new energy technology, vehicles powered by non-traditional energy (such as hybrid electric vehicles and battery electric vehicles) have come onto the market140. SP surveys are mostly used in research on new energy vehicles. For example, Potoglou and Kanaroglou141 examined the factors and incentives that influence a household’s choice of cleaner vehicles, where the alternatives in the SP experiments included a conventional gasoline vehicle, a hybrid vehicle and an alternative-fuel vehicle. Caulfield et al.142 analyzed individual preferences for hybrid electric and alternative-fuel vehicles and concluded that reliability, automobile safety, fuel costs and cost price were of most concern to respondents.

5 A Prospective View on the Challenges and Opportunities

At present, a large body of literature covers a wide range of topics in household vehicle-ownership modeling, and some progress has been made recently. However, there are still opportunities and challenges in data collection and modeling for emerging economies, which are stated as follows:

5.1 Panel Data Collection

The commonly used cross-sectional data cannot identify changes in travel behavior over continuous time and are therefore not suitable for the dynamic analysis of household vehicle ownership. However, the collection of panel data suffers from high cost, long duration and low retention rates. To shorten the data collection cycle, a retrospective survey may be adopted instead. Pseudo-panel data can be an alternative according to recent studies, since the pseudo-panel method stitches a series of cross-sectional datasets together. Quasi-panel data obtained through recall can be another alternative. Since developing countries and emerging economies are unable to obtain reliable panel data in a short time, it is currently more feasible to conduct retrospective surveys or apply the quasi-panel method to analyze the dynamic trend of vehicle ownership. Developing countries can also gradually plan and start collecting panel data for future research.
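One common way to stitch repeated cross-sections into a pseudo-panel is to group households into cohorts that can be followed across survey years and to track cohort averages instead of individual households. The sketch below assumes that cohorts are defined by the decade of birth of the household head; the field names and records are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict


def cohort_means(records, cohort_width=10):
    """Average vehicles per household by (birth cohort, survey year).

    records: iterable of dicts with keys 'survey_year', 'birth_year'
    (of the household head) and 'vehicles'. Different households may be
    sampled in each survey year; only the cohort is followed over time.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        cohort = rec["birth_year"] // cohort_width * cohort_width
        key = (cohort, rec["survey_year"])
        sums[key][0] += rec["vehicles"]
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sorted(sums.items())}


if __name__ == "__main__":
    # Two hypothetical cross-sectional surveys with different households.
    sample = [
        {"survey_year": 2010, "birth_year": 1972, "vehicles": 1},
        {"survey_year": 2010, "birth_year": 1978, "vehicles": 0},
        {"survey_year": 2015, "birth_year": 1975, "vehicles": 2},
        {"survey_year": 2015, "birth_year": 1971, "vehicles": 1},
        {"survey_year": 2015, "birth_year": 1985, "vehicles": 1},
    ]
    for key, mean in cohort_means(sample).items():
        print(key, mean)
```

The resulting cohort-by-year averages can then be treated as the observations of a panel, at the cost of losing within-cohort variation.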

5.2 RP/SP Data Fusion


RP and SP data each have their own drawbacks. RP/SP data fusion can take advantage of the complementary strengths of the two data sources and help to develop a model with a higher level of precision and accuracy. The integration of RP and SP data can be achieved through the estimation of RP/SP or SP-off-RP joint models. The SP-off-RP experiment can become a research direction for vehicle-ownership modeling in emerging economies. Such an experiment overcomes the unrealistic scenarios of conventional SP experiments, because respondents face the same alternatives in the RP and SP settings, so the experimental results are closer to the real situation. Train and Wilson101 proposed a full information maximum likelihood (FIML) estimation method for SP-off-RP data using simulation-based integration. Although this seminal work is widely cited, the modeling approach is rarely applied in practice, probably because of its complexity. The modeling process for SP-off-RP data thus has room for improvement. Very recently, Guevara and Hess103 proposed a limited information maximum likelihood (LIML) method based on the control function (CF), which may provide an opportunity for the application of SP-off-RP data in emerging economies. However, the two-stage control-function (2SCF) approach is also imperfect because of its lower efficiency. Future effort is needed to develop a more sophisticated modeling approach for SP-off-RP data on vehicle ownership in emerging economies.

5.3 Population Issues

The lack of population data is also a challenge for model application in developing countries. Synthetic population and population evolution processes have been proposed to generate base-year and future-year microlevel population data. A population evolution modeling system, which is important for realistic travel demand forecasting, can provide the linkages among demographics, land use and transportation. The synthetic population is a key input to transportation microsimulation models. Because of confidentiality and privacy issues in using census data, methodologies that generate a synthetic population as a replacement for the census have received considerable attention in the recent literature. For example, Ye et al.143 proposed the iterative proportional updating (IPU) algorithm to generate a synthetic population in which both household and person attributes can be controlled. Sun et al.144 proposed a mixture modeling framework for generating representative household structures in population synthesis. Eluru et al.145 designed an analytical framework to predict future-year population characteristics by updating all the relevant attributes of households and individuals in the base year. For modeling vehicle ownership and forecasting vehicle demand in emerging economies, it is critical to project population growth and changes in household structure over time, particularly under special policy scenarios (e.g., the birth control policy and its recent relaxation in China).
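The flavor of population synthesis can be conveyed with classical iterative proportional fitting (IPF), the simpler procedure on which many synthesizers build. The sketch below is not the IPU algorithm of Ye et al.143, which additionally matches household-level and person-level controls at the same time; the seed table, the control totals and the function name are hypothetical.

```python
def iterative_proportional_fitting(seed, row_targets, col_targets,
                                   max_iter=200, tol=1e-10):
    """Scale a seed cross-tabulation until its margins match the targets.

    seed: list of rows (e.g., household size by car ownership from a
    small sample survey). row_targets and col_targets: known marginal
    totals (e.g., from aggregate statistics) with equal grand totals.
    """
    table = [list(map(float, row)) for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [x * target / s for x in table[i]]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Stop once the row margins are still matched after column scaling.
        if max(abs(sum(r) - t) for r, t in zip(table, row_targets)) < tol:
            break
    return table


if __name__ == "__main__":
    fitted = iterative_proportional_fitting(
        seed=[[1, 2], [3, 4]], row_targets=[40, 60], col_targets=[30, 70])
    for row in fitted:
        print([round(x, 2) for x in row])
```

IPF preserves the association structure (odds ratios) of the seed sample while matching the known margins, which is why a small household survey can be expanded to a full synthetic population.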

In addition, the literature review yields the following insights. For data collection, questionnaires can be administered online through professional survey platforms to reduce the investment of human and financial resources. The collection of SP data and panel data could enrich the data sources in developing countries. As for the types of vehicle-ownership models to focus on, researchers may attempt to develop vehicle-type models and vehicle transaction models, which still merit further exploration. Since the level of motorization in developing countries is generally lower than that in developed countries, it will remain important to consider non-motor vehicle ownership for a long time to come. Finally, a comprehensive set of explanatory variables should be considered when analyzing the determinants of household vehicle ownership in emerging economies.

With the rapid development of urban motorization and the mobile internet, research on household vehicle ownership is facing new opportunities and challenges. In emerging economies, the values of urban residents have changed, and many of them favor green travel modes, which are environmentally friendly, sustainable and healthy. Against this backdrop, there are promising research prospects in travel sharing (shared cars or shared bikes) and new energy vehicles, both of which are associated with household vehicle ownership. In addition, several representative policies, such as purchase restrictions in China, have great implications for household vehicle ownership. It is thus necessary to conduct an in-depth study of the demand for and trends in household vehicle ownership in the context of emerging economies. A great deal of experience can be drawn from previous studies in developed countries. Compared with developed countries, developing countries (e.g., China and India) offer more scope for research on household vehicle ownership because of the great potential for future vehicle demand under rapid economic growth. In the complex policy and economic environment of emerging economies, vehicle ownership is affected by multiple factors, such as population policy, environmental awareness, the sharing economy and emerging technologies. The complexity of vehicle-ownership behavior in emerging economies may pose great challenges for modeling and forecasting their future vehicle demand.

6 Summary

Household vehicle-ownership models have been widely applied in travel demand modeling systems, as witnessed by the nearly 100 applications cited in this review article. This paper summarized and discussed the behavioral models of household vehicle ownership, with a focus on their potential for application in emerging economies. The discussion covers four aspects: data, methodology, application and prospects.

This review paper carried out classifications of various models. From the perspective of questionnaire survey design, models can be classified into those developed based on data from RP survey, SP survey and integrated data from both RP and SP surveys. The models can also be classified into those based on data being collected at a single point or multiple points on temporal axis (e.g., cross-sectional data or panel data), which correspond to static and dynamic models. In terms of model type, models can be classified into vehicle ownership, vehicle use, vehicle type, and vehicle transaction models.

With respect to model application, this review paper analyzed the factors that influence vehicle demand. Household demographic attributes and built environment attributes are found to be the primary considerations in vehicle-ownership models. In addition, this review paper discussed the possible effects on vehicle ownership of travel sharing and new energy vehicles based on emerging technologies, as well as the application of vehicle-ownership models in emerging economies.

Finally, this review article argues that there are both opportunities and challenges for the future development of vehicle-ownership models in emerging economies. In light of the limitations and gaps in current studies, researchers could make the following efforts in future research: exploring the transaction behavior and dynamic trend of vehicle ownership based on panel or quasi-panel data; exploring SP-off-RP experiments and the associated modeling process in emerging economies; and analyzing residents’ preferences for household vehicles.