Simultaneous preference estimation and heterogeneity control for choice-based conjoint via support vector machines

Journal of the Operational Research Society

Abstract

Support vector machines (SVMs) have been successfully used to identify individuals’ preferences in conjoint analysis. One of the challenges of using SVMs in this context is to properly control for preference heterogeneity among individuals to construct robust partworths. In this work, we present a new technique that obtains all individual utility functions simultaneously in a single optimization problem based on three objectives: complexity reduction, model fit, and heterogeneity control. While complexity reduction and model fit are handled via SVMs, heterogeneity is controlled by shrinking the individual-level partworths toward a population mean. The proposed approach is further extended to kernel-based machines, conferring flexibility to the model by allowing nonlinear utility functions. Experiments on simulated and real-world datasets show that the proposed approach in its linear form outperforms existing methods for choice-based conjoint analysis.


References

  • Abernethy J, Evgeniou T, Toubia O and Vert J (2008). Eliciting consumer preferences using robust adaptive choice questionnaires. IEEE Transactions on Knowledge and Data Engineering 20(2):145–155.

  • Arora N and Huber J (2001). Improving parameter estimates and model prediction by aggregate customization in choice experiments. Journal of Consumer Research 28(2):273–283.

  • Atchade YF (2006). An adaptive version for the Metropolis adjusted Langevin algorithm with a truncated drift. Methodology and Computing in Applied Probability 8(2):235–254.

  • Bertsekas D (1982). Constrained Optimization and Lagrange Multiplier Methods. Academic Press: New York.

  • Camm JD, Cochran JJ, Curry DJ and Kannan S (2006). Conjoint optimization: An exact branch-and-bound algorithm for the share-of-choice problem. Management Science 52(3):435–447.

  • Chapelle O and Harchaoui Z (2005). A machine learning approach to conjoint analysis. In: Saul LK, Weiss Y and Bottou L (eds) Advances in Neural Information Processing Systems, vol 17, pp. 257–264. MIT Press: Cambridge, MA.

  • Cui D and Curry D (2005). Prediction in marketing using the support vector machine. Marketing Science 24(4):595–615.

  • Evgeniou T, Boussios C and Zacharia G (2005). Generalized robust conjoint estimation. Marketing Science 24(3):415–429.

  • Evgeniou T, Pontil M and Toubia O (2007). A convex optimization approach to modeling heterogeneity in conjoint estimation. Marketing Science 26(6):805–818.

  • Gelman A and Pardoe I (2006). Bayesian measures of explained variance and pooling in multilevel (hierarchical) models. Technometrics 48(2):241–251.

  • Green PE, Krieger AM and Wind Y (2004). Thirty years of conjoint analysis: Reflections and prospects. In: Wind Y and Green PE (eds) Marketing Research and Modeling: Progress and Prospects, International Series in Quantitative Marketing, vol 14, pp. 117–139. Springer: Berlin.

  • Hensher D, Louviere J and Swait J (1998). Combining sources of preference data. Journal of Econometrics 89(1):197–221.

  • Hsu C-W, Chang C-C and Lin C-J (2010). A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University.

  • Irani S, Dwivedi Y and Williams M (2014). Analysing factors affecting the choice of emergent human resource capital. Journal of the Operational Research Society 65(6):935–953.

  • Maldonado S and López J (2014). Alternative second-order cone programming formulations for support vector classification. Information Sciences 268:328–341.

  • Maldonado S, Montoya R and Weber R (2015). Advanced conjoint analysis using feature selection via support vector machines. European Journal of Operational Research 241(2):564–574.

  • Maldonado S, Weber R and Basak J (2011). Simultaneous feature selection and classification using kernel-penalized support vector machines. Information Sciences 181(1):115–128.

  • Mankila M (2004). Retaining students in retail banking through price bundling: Evidence from the Swedish market. European Journal of Operational Research 155(2):299–316.

  • Mercer J (1909). Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London, Series A 209:415–446.

  • Rosenthal JS (2011). Optimal proposal distributions and adaptive MCMC. In: Brooks S, Gelman A, Jones GL and Meng X-L (eds) Handbook of Markov Chain Monte Carlo, pp. 93–112. Chapman and Hall: Boca Raton.

  • Rossi PE, Allenby GM and McCulloch R (2005). Bayesian Statistics and Marketing. Wiley: New York.

  • Schebesch K and Stecking R (2005). Support vector machines for classifying and describing credit applicants: Detecting typical and critical regions. Journal of the Operational Research Society 56(9):1082–1088.

  • Schölkopf B and Smola AJ (2002). Learning with Kernels. MIT Press: Cambridge.

  • Scholl A, Manthey L, Helm R and Steiner M (2005). Solving multiattribute design problems with analytic hierarchy process and conjoint analysis: An empirical comparison. European Journal of Operational Research 164(1):760–777.

  • Thyne M, Lawson R and Todd S (2006). The use of conjoint analysis to assess the impact of the cross-cultural exchange between hosts and guests. Tourism Management 27(2):201–213.

  • Tikhonov AN and Arsenin VY (1977). Solution of Ill-posed Problems. Winston & Sons: Washington.

  • Toubia O, Evgeniou T and Hauser J (2007a). Optimization-based and machine-learning methods for conjoint analysis: Estimation and question design. In: Gustafsson A, Herrmann A and Huber F (eds) Conjoint Measurement: Methods and Applications, pp. 231–258. Springer: New York.

  • Toubia O, Hauser J and Garcia R (2007b). Probabilistic polyhedral methods for adaptive choice-based conjoint analysis. Marketing Science 26(5):596–610.

  • Tsafarakis S, Grigoroudis E and Matsatsinis N (2011). Consumer choice behaviour and new product development: An integrated market simulation approach. Journal of the Operational Research Society 62(7):1253–1267.

  • Vapnik V (1998). Statistical Learning Theory. Wiley: New York.

  • Venkatesh V, Chan FK and Thong JY (2012). Designing e-government services: Key service attributes and citizens’ preference structures. Journal of Operations Management 30(1):116–133.

  • Verbeke W, Dejaeger K, Martens D, Hur J and Baesens B (2012). New insights into churn prediction in the telecommunication sector: A profit driven data mining approach. European Journal of Operational Research 218(1):211–229.

  • Yajima Y (2005). Linear programming approaches for multicategory support vector machines. European Journal of Operational Research 162(2):514–531.


Acknowledgments

The authors thank Olivier Toubia and Bryan Orme for providing the data for the two empirical applications. The first author was funded by FONDECYT project 1160894 and by CONICYT Anillo ACT1106. The second author was supported by FONDECYT projects 1140831 and 1160738. The third author was supported by FONDECYT project 1151395 and FONDEF Project IT13I20031. This research was partially funded by the Complex Engineering Systems Institute, ISCI (ICM-FIC: P05-004-F, CONICYT: FB0816).

Author information

Corresponding author

Correspondence to Sebastián Maldonado.

Appendices

Appendix A

Strict convexity of problem (6)

To prove that Formulation (6) is strictly convex, we first rewrite it in compact form, following the derivation of Yajima (2005) for the multicategory SVM. Let us denote by

$$\begin{aligned} \widetilde{\mathbf{w}}=[{\mathbf{w}}_0^\top ,\mathbf{w}_1^\top ,\ldots ,\mathbf{w}_N^\top ]^\top \in {\mathfrak {R}}^{J(N+1)}, \end{aligned}$$

and

$$\begin{aligned} {\mathcal{Q}}(\theta )=\left[ \begin{array}{lllll} N\theta I_J &{} -\theta I_J &{}-\theta I_J &{}\cdots &{} -\theta I_J \\ -\theta I_J &{} (1+\theta )I_J &{} 0 &{} \cdots &{} 0 \\ -\theta I_J &{}0 &{}\ddots &{}\ddots &{}\vdots \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} 0 \\ -\theta I_J &{} 0 &{} \cdots &{} 0 &{} (1+\theta )I_J \end{array} \right] \in {\mathfrak {R}}^{J(N+1)\times J(N+1)}, \end{aligned}$$
(11)

where \(I_J\) denotes the identity matrix of size J. Then, the quadratic term in (6) can be expressed as

$$\begin{aligned} \sum _{i=1}^N(\left\| \mathbf{w}_i\right\| ^2 +\theta \left\| \mathbf{w}_i-\mathbf{w}_0\right\| ^2) = \tilde{\mathbf{w}}^\top {\mathcal{Q}}(\theta ) \tilde{\mathbf{w}}. \end{aligned}$$
(12)
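
As a sanity check, identity (12) can be verified numerically. The following minimal sketch (illustrative only, not part of the original derivation) builds \({\mathcal{Q}}(\theta )\) as the Kronecker product of an \((N+1)\times (N+1)\) coefficient matrix with \(I_J\); the values of N, J, and \(\theta\) are arbitrary:

```python
import numpy as np

# Illustrative sizes (not from the paper): N individuals, J partworths, theta > 0.
N, J, theta = 4, 3, 0.7

# Q(theta) from (11) equals A kron I_J for an (N+1) x (N+1) coefficient matrix A.
A = (1 + theta) * np.eye(N + 1)
A[0, 0] = N * theta
A[0, 1:] = A[1:, 0] = -theta
Q = np.kron(A, np.eye(J))

# Random partworths w_0, w_1, ..., w_N stacked into w_tilde.
rng = np.random.default_rng(0)
w = rng.standard_normal((N + 1, J))
w_tilde = w.reshape(-1)

lhs = sum(w[i] @ w[i] + theta * (w[i] - w[0]) @ (w[i] - w[0])
          for i in range(1, N + 1))
assert np.isclose(lhs, w_tilde @ Q @ w_tilde)   # identity (12)
```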

Proposition 1

For any \(\theta >0\), the matrix \({\mathcal{Q}}(\theta )\) is symmetric positive definite. Moreover,

$$\begin{aligned} {\mathcal{Q}}(\theta )^{-1}=\frac{1}{N}\left[ \begin{array}{lllll} \frac{\theta +1}{\theta } I_J &{} I_J &{} I_J &{}\cdots &{} I_J \\ I_J &{} \frac{\theta +N}{\theta +1} I_J &{} \frac{\theta }{\theta +1}I_J &{} \cdots &{} \frac{\theta }{\theta +1}I_J \\ I_J &{}\frac{\theta }{\theta +1}I_J &{}\ddots &{}\ddots &{}\vdots \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} \frac{\theta }{(\theta +1)}I_J \\ I_J &{} \frac{\theta }{\theta +1}I_J &{} \cdots &{} \frac{\theta }{(\theta +1)}I_J &{} \frac{\theta +N}{\theta +1}I_J \end{array} \right] . \end{aligned}$$
(13)

Proof

The matrix \({\mathcal{Q}}(\theta )\) is symmetric by construction, and it is positive definite because the quadratic form (12) is a sum of nonnegative terms that vanishes only when \(\mathbf{w}_0=\mathbf{w}_1=\cdots =\mathbf{w}_N=0\). Now, we denote by \(F_i\in {\mathfrak {R}}^{J\times J(N+1)}\) the i-th block row of \({\mathcal{Q}}(\theta )\) and by \(C_i\in {\mathfrak {R}}^{J(N+1)\times J}\) the i-th block column of \({\mathcal{Q}}(\theta )^{-1}\). Then,

$$\begin{aligned} F_1C_1=I_J,\quad F_1C_i=\frac{1}{N}\left(N\theta -\frac{N\theta +\theta ^2N}{\theta +1}\right)=0,\ i=2,\ldots ,N+1, \end{aligned}$$
$$\begin{aligned} F_iC_1=0,\quad F_iC_i=I_J,\quad F_iC_j=\frac{1}{N}(\theta -\theta )=0, i\ne j. \end{aligned}$$

Thus, the result follows. \(\square\)
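
The closed-form inverse (13) can likewise be checked against a generic numerical inverse. A minimal illustrative sketch, using the same Kronecker representation as above:

```python
import numpy as np

N, J, theta = 4, 3, 0.7   # illustrative values

# Q(theta) as in (11), written as A kron I_J.
A = (1 + theta) * np.eye(N + 1)
A[0, 0] = N * theta
A[0, 1:] = A[1:, 0] = -theta

# Closed-form inverse (13), written as (1/N) * B kron I_J.
B = np.full((N + 1, N + 1), theta / (theta + 1))
np.fill_diagonal(B, (theta + N) / (theta + 1))
B[0, 0] = (theta + 1) / theta
B[0, 1:] = B[1:, 0] = 1.0

assert np.allclose(np.kron(B, np.eye(J)) / N,
                   np.linalg.inv(np.kron(A, np.eye(J))))
```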

Appendix B

Dual formulation of problem (6)

Let us denote by \({\varvec{\xi }}_t^k=(\xi _{1t}^k,\ldots ,\xi _{Nt}^k)\in {\mathfrak {R}}^N\), and by \(\mathbf{X}^k_{t}=\left[ \begin{array}{l} 0\\ X^k_{t} \end{array} \right] \in {\mathfrak {R}}^{(N+1)J\times N}\) with

$$\begin{aligned} X^k_{t}=\left[ \begin{array}{llll} \mathbf{x}_{1t}^1- \mathbf{x}_{1t}^k &{}0 &{} &{} 0 \\ 0 &{} \ddots &{} \ddots &{} \vdots \\ &{} \ddots &{} \ddots &{} 0\\ 0 &{}\cdots &{} 0&{} \mathbf{x}_{Nt}^1- \mathbf{x}_{Nt}^k \end{array} \right] \in {\mathfrak {R}}^{NJ\times N}. \end{aligned}$$

Then, the constraints of problem (6) can be expressed as follows:

$$\begin{aligned} {\varvec{\xi }}_t^k\ge 0,\quad {\mathbf{X}_t^k}^{\top }\tilde{\mathbf{w}}\ge \mathbf{e}-{\varvec{\xi }}_t^k, \ t=1,\ldots ,T,\ k=2,\ldots ,K. \end{aligned}$$

With this notation, the Lagrangian function associated with formulation (6) is given by

$$\begin{aligned} L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= & {} \frac{1}{2}\tilde{\mathbf{w}}^\top {\mathcal{Q}}(\theta ) \tilde{\mathbf{w}}+\sum _{t=1}^T\sum _{k=2}^K\left[ C({\varvec{\xi }}_{t}^k)^\top \mathbf{e}-{{\varvec{\alpha }}_t^k}^\top ({\mathbf{X}_t^k}^{\top }\tilde{\mathbf{w}}- \mathbf{e}+{\varvec{\xi }}_t^k)\right. \nonumber \\&\left. -({\varvec{\xi }}_{t}^k)^\top \mathbf{s}_t^k\right] . \end{aligned}$$
(14)

Then, Problem (6) can be written equivalently as

$$\begin{aligned} \min _{\tilde{\mathbf{w}},{\varvec{\xi }}_t^k}\max _{{\varvec{\alpha }}_t^k,\mathbf{s}_t^k}\{L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k):{\varvec{\alpha }}_t^k,\mathbf{s}_t^k\ge 0,\, t=1,\ldots ,T,\, k=2,\ldots ,K \}. \end{aligned}$$

Hence, the dual formulation (see, e.g., Bertsekas, 1982) of (6) is given by

$$\begin{aligned} \max _{{\varvec{\alpha }}_t^k,\mathbf{s}_t^k}\min _{\tilde{\mathbf{w}},{\varvec{\xi }}_t^k}\{L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k):{\varvec{\alpha }}_t^k,\mathbf{s}_t^k\ge 0,\,t=1,\ldots ,T,\, k=2,\ldots ,K \}. \end{aligned}$$

The above expression enables us to compute the dual problem based only on the Lagrange multipliers \(\varvec{\alpha }\). The first-order conditions of the inner minimization problem yield

$$\begin{aligned} \nabla _{\tilde{\mathbf{w}}}L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= & {} {\mathcal{Q}}(\theta ) \tilde{\mathbf{w}}-\sum _{t=1}^T\sum _{k=2}^K\mathbf{X}_t^k {\varvec{\alpha }}_t^k=0,\end{aligned}$$
(15)
$$\begin{aligned} \nabla _{{\varvec{\xi }}_t^k}L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= & {} C\mathbf{e}-{\varvec{\alpha }}_t^k-\mathbf{s}_t^k=0. \end{aligned}$$
(16)

Since \(\mathbf{s}_t^k\ge 0\), from (16) it follows that \({\varvec{\alpha }}_t^k\le C\mathbf{e}\) for \(t=1,\ldots ,T\), and \(k=2,\ldots ,K\).

Remark 1

Note that using (1), (15), and the notation of \(\mathbf{X}_t^k\), we have that

$$\begin{aligned} \mathbf{w}_0=\frac{1}{N}\sum _{i=1}^N \mathbf{w}_i. \end{aligned}$$

On the other hand, by using (15) and (16) in (14), we obtain that

$$\begin{aligned} L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= \sum _{t=1}^T\sum _{k=2}^K{{\varvec{\alpha }}_t^k}^\top \mathbf{e}-\frac{1}{2}\left\| {\mathcal{Q}}(\theta )^{1/2} \tilde{\mathbf{w}}\right\| ^2. \end{aligned}$$

Since \({\mathcal{Q}}(\theta )\) is nonsingular, it follows from (15) that the above expression can be written as

$$\begin{aligned} L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= \sum _{t=1}^T\sum _{k=2}^K{{\varvec{\alpha }}_t^k}^\top \mathbf{e}-\frac{1}{2}\left\| {\mathcal{Q}}(\theta )^{-1/2} \sum _{t=1}^T\sum _{k=2}^K\mathbf{X}_t^k {\varvec{\alpha }}_t^k\right\| ^2. \end{aligned}$$
(17)

The following result allows us to rewrite the above equality.

Proposition 2

For \(\theta >0\), let \(\widetilde{{\mathcal{Q}}}(\theta )= NI_{JN}+\theta \mathcal{J}\in {\mathfrak {R}}^{JN\times JN}\), where \(I_{JN}\) denotes the identity matrix of size JN and

$$\begin{aligned} \mathcal{J}= \left[ \begin{array}{lll} I_J &{} \cdots &{} I_J \\ \vdots &{}\ddots &{}\vdots \\ I_J &{} \cdots &{} I_J \end{array} \right] \in {\mathfrak {R}}^{JN\times JN}. \end{aligned}$$

Then, there exists a symmetric matrix \(\widetilde{{\mathcal{Q}}}(\theta )^{1/2}\) satisfying \((\widetilde{{\mathcal{Q}}}(\theta )^{1/2})^2=\widetilde{{\mathcal{Q}}}(\theta )\).

Proof

Let

$$\begin{aligned} \widetilde{{\mathcal{Q}}}(\theta )^{1/2}= \sqrt{N}I_{JN}-\frac{1-\sqrt{1+\theta }}{\sqrt{N}}\mathcal{J}. \end{aligned}$$
(18)

Note that \(\mathcal{J}^2=N\mathcal{J}\). Then,

$$\begin{aligned} (\widetilde{{\mathcal{Q}}}(\theta )^{1/2})^2=NI_{JN}-2(1-\sqrt{1+\theta })\mathcal{J}+\frac{(1-\sqrt{1+\theta })^2}{N}(N\mathcal{J})=NI_{JN}+\theta \mathcal{J}. \end{aligned}$$

\(\square\)
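
A short numerical check of the square root (18), again with arbitrary illustrative sizes:

```python
import numpy as np

N, J, theta = 4, 3, 0.7   # illustrative values

# J_cal = ones(N, N) kron I_J, and Q_tilde(theta) = N*I_{JN} + theta*J_cal.
J_cal = np.kron(np.ones((N, N)), np.eye(J))
Q_tilde = N * np.eye(J * N) + theta * J_cal

# Candidate square root (18).
root = np.sqrt(N) * np.eye(J * N) - (1 - np.sqrt(1 + theta)) / np.sqrt(N) * J_cal

assert np.allclose(root @ root, Q_tilde)   # (Q_tilde^(1/2))^2 = Q_tilde
```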

By using the relation (13), Proposition 2, and the definition of \(\mathbf{X}_t^k\), the expression (17) reduces to

$$\begin{aligned} L(\tilde{\mathbf{w}},{\varvec{\xi }}_t^k,{\varvec{\alpha }}_t^k,\mathbf{s}_t^k)= \sum _{t=1}^T\sum _{k=2}^K{{\varvec{\alpha }}_t^k}^\top \mathbf{e}-\frac{1}{2N(\theta +1)}\left\| \widetilde{{\mathcal{Q}}}(\theta )^{1/2} \sum _{t=1}^T\sum _{k=2}^K X_t^k {\varvec{\alpha }}_t^k\right\| ^2. \end{aligned}$$

Hence, the dual formulation is given by

$$\begin{aligned} \begin{array}{ll} \max _{{\varvec{\alpha }}_t^k\in {\mathfrak {R}}^N}&{}\ \sum\limits_{t=1}^{T}\sum\limits_{k=2}^{K}{{\varvec{\alpha }}_t^k}^\top \mathbf{e}-\frac{1}{2N(\theta +1)}\left\| \widetilde{{\mathcal{Q}}}(\theta )^{1/2} \sum _{t=1}^T\sum _{k=2}^K X_t^k {\varvec{\alpha }}_t^k\right\| ^2\\ \text{ s.t. }&{}\quad 0\le {\varvec{\alpha }}_t^k\le C\mathbf{e},\ t=1,\ldots ,T,\ k=2,\ldots ,K. \end{array} \end{aligned}$$
(19)
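
Since (19) is a box-constrained quadratic program, it can be solved with off-the-shelf tools. The sketch below is illustrative rather than the authors' implementation: it generates synthetic choice data (taking the first profile of each task as the chosen one, as the formulation assumes), stacks the matrices \(X_t^k\) into a single matrix, and solves (19) with SciPy's L-BFGS-B routine; all sizes and parameter values are arbitrary:

```python
import numpy as np
from scipy.linalg import block_diag
from scipy.optimize import minimize

# Synthetic choice-based conjoint data; all sizes and values are illustrative.
rng = np.random.default_rng(1)
N, T, K, J = 5, 8, 3, 4            # individuals, tasks, alternatives, attributes
theta, C = 1.0, 10.0               # heterogeneity and fit trade-off parameters
X = rng.standard_normal((N, T, K, J))   # profiles x_it^k; index k = 0 is chosen

# Stack the matrices X_t^k of Appendix B column-wise: the i-th column of X_t^k
# holds the difference x_it^1 - x_it^k in its i-th block of J rows.
Z = np.hstack([block_diag(*[(X[i, t, 0] - X[i, t, k]).reshape(J, 1)
                            for i in range(N)])
               for t in range(T) for k in range(1, K)])

# With Q_tilde(theta) = N*I + theta*(ones kron I_J), the dual (19) is the
# box-constrained QP: min_a 0.5*a'Ha - e'a subject to 0 <= a <= C.
Q_tilde = N * np.eye(N * J) + theta * np.kron(np.ones((N, N)), np.eye(J))
H = Z.T @ Q_tilde @ Z / (N * (theta + 1))

res = minimize(lambda a: 0.5 * a @ H @ a - a.sum(),
               x0=np.zeros(H.shape[0]),
               jac=lambda a: H @ a - 1.0,
               bounds=[(0.0, C)] * H.shape[0], method="L-BFGS-B")
alpha = res.x                      # optimal dual variables alpha_it^k, stacked
```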

Remark 2

From (15) and (13), it follows that

$$\begin{aligned} \mathbf{w}_0=\frac{1}{N}\sum _{i=1}^N\sum _{t=1}^T\sum _{k=2}^K (\mathbf{x}_{it}^1-\mathbf{x}_{it}^k) { \alpha }_{it}^k, \end{aligned}$$

and

$$\begin{aligned} \mathbf{w}_i=\frac{1}{N(\theta +1)}\left( (\theta +N)\sum _{t=1}^T\sum _{k=2}^K (\mathbf{x}_{it}^1-\mathbf{x}_{it}^k) { \alpha }_{it}^k+\theta \sum _{j=1,j\ne i}^N\sum _{t=1}^T\sum _{k=2}^K (\mathbf{x}_{jt}^1-\mathbf{x}_{jt}^k) { \alpha }_{jt}^k\right) , \end{aligned}$$
(20)

for \(i=1,\ldots ,N\).
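
Continuing the previous sketch, the population and individual partworths can be recovered from the optimal dual variables via Remark 2 and (20), and the consistency property of Remark 1 verified:

```python
# alpha has T*(K-1)*N entries, ordered by task t, then alternative k, then
# individual i (matching the columns of Z in the previous sketch).
a = alpha.reshape(T, K - 1, N)

# S[i] = sum over t, k of (x_it^1 - x_it^k) * alpha_it^k.
S = np.zeros((N, J))
for t in range(T):
    for k in range(1, K):
        for i in range(N):
            S[i] += a[t, k - 1, i] * (X[i, t, 0] - X[i, t, k])

w_0 = S.sum(axis=0) / N                                                  # Remark 2
w = ((theta + N) * S + theta * (S.sum(axis=0) - S)) / (N * (theta + 1))  # (20)

# Remark 1: the population partworths equal the mean of the individual ones.
assert np.allclose(w_0, w.mean(axis=0))
```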


About this article


Cite this article

López, J., Maldonado, S. & Montoya, R. Simultaneous preference estimation and heterogeneity control for choice-based conjoint via support vector machines. J Oper Res Soc 68, 1323–1334 (2017). https://doi.org/10.1057/s41274-016-0013-6

