1 Introduction

Since negotiation is a complex decision making process involving two or more parties discussing many issues in an effort to reconcile their opposing interests [9], it may require support and facilitation to avoid impasses, deadlocks or stalemates. Therefore a number of support methods and software tools have recently been developed to facilitate negotiations. From the methodological viewpoint, various multiple criteria decision making (MCDM) methods [8, 12, 16] are applied to help negotiators at the prenegotiation phase in constructing their own negotiation offer scoring systems. Such systems measure the scales of concessions and visualize the negotiation progress and therefore are of use in quantitative evaluation of the negotiation offers. Various formal decision support models are implemented in the negotiation support systems (NSS) or electronic negotiation systems (eNS) used in business research and training, such as OpenNexus (http://en.opennexus.pl/), Inspire [5] or Negoisst [13]. Decision support provided by the vast majority of NSS/eNSs is based on the simple additive weighting (SAW) method [4]. For discrete negotiation problems, SAW requires assigning rating points to each element of the negotiation template, assuming that more preferable issues and options obtain higher ratings. A SAW-based negotiation offer scoring system makes it possible to evaluate any offer built with the options defined within the template by adding up the ratings of these options.

Even though SAW seems easy, intuitive and technically uncomplicated, there is some empirical evidence of its drawbacks and of problems with using SAW-based scoring systems. Interestingly, it has been observed [10] that a majority (57 %) of decision makers, when given a choice of the method for defining their preferences, express them qualitatively using linguistic or descriptive labels. If quantitative scores are used, they are usually of ordinal nature. Thus, it should not be surprising that some earlier electronic negotiation experiments showed that negotiators do not precisely know how to interpret SAW-based ratings and therefore misuse the scoring systems and incorrectly interpret the final scores of offers [17]. Furthermore, laboratory experiments performed with groups of students of economics asked to rank the negotiation offers and to compare them with other predefined rankings determined automatically by means of various versions of SAW, revealed many problems with comparing and selecting the predefined ranking that best fits the students’ intrinsic preferences [11]. Most frequently, the negotiators evaluated as more useful (better) a predefined ranking that differed more from their own subjectively defined one. These are, however, interpretative problems that can be reduced or alleviated, as we believe, by implementing appropriate visualization techniques and tools [2].

In this paper we focus on the prenegotiation process of building a negotiation offer scoring system by means of SAW to find out whether the negotiators are able to construct systems that reflect their preferences in an accurate and reliable way. In our research we analyze a dataset of electronic negotiation experiments conducted in the Inspire system, with a predefined multi-issue bilateral business negotiation case. We study the ability of the negotiators to transform correctly the preferential information included in the case description into a system of ratings to be used later to evaluate complete packages exchanged by the parties during the actual negotiation. We measure the scale of potential inaccuracy in determining the negotiation offer scoring systems. Inspired by earlier research by Vetschera [15], we use a negotiation case with precise graphical information about the parties’ preferences and therefore are able to introduce two separate measures of accuracy: a more general ordinal accuracy and a detailed cardinal accuracy measure. Finally we analyze the influence of the negotiators’ correctness in defining the scoring systems on the negotiation results obtained as well as the difference between the objective quality of such compromises and the subjective perception of their quality resulting from inaccurate rating systems.

The paper consists of four more sections. In Sect. 2 we describe briefly the Inspire system and its protocol for defining the negotiators’ preferences, as well as the case used in our experiment including details of the preference representation used. In Sect. 3 we discuss two notions of accuracy of preference definition that we use to measure the quality of the scoring systems built by the negotiators. In Sect. 4 we analyze the experimental results, while in Sect. 5 some future work is suggested.

2 Inspire

2.1 The System and Its General Functionalities

Inspire [5] is an eNS that supports bilateral negotiations conducted via the Web. It has been used for teaching and training, simulations and research in negotiations since the late 1990s. Data from the Inspire experiments have been widely used by a number of researchers investigating, among other things, cross-cultural aspects of electronic negotiations [7], the process of strategy formulation and communication [17], negotiators’ behavior and motivations [6], and decision aspects of negotiations [15].

Inspire supports negotiators throughout the whole negotiation process; however, for our experiment the most important are its decision support facilities implemented in the prenegotiation phase. As regards decision support Inspire offers a SAW-based tool that helps negotiators to analyze their preferences and set up priorities regarding different elements of the negotiation template. This tool is implemented as an element of the prenegotiation preparation check-list imposed on the users by the Inspire protocol. The process of building a negotiation offer scoring system consists of three steps which follow the general SAW requirements [4]. In the first stage a pool of 100 rating points is distributed among all the negotiation issues to define their weights. In the second stage the negotiator rates the options within each issue assigning the maximum score, equal to the issue weight, to the best (most preferred) option, and 0 to the least preferred one. All the intermediate options obtain scores greater than 0 but lower than the issue weight. In the third stage Inspire displays a list of selected complete packages with global scores determined as the sums of the ratings of options that comprise these packages. If the user changes the global scores of selected packages, Inspire, by applying elements of conjoint analysis [1], recalculates the ratings of issues and options in the initial scoring system.
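The first two stages of this protocol, together with the additive scoring of a package, can be sketched in Python. This is only a minimal illustration and not Inspire's actual implementation: the issue names, option labels and ratings are hypothetical, and the conjoint-analysis readjustment of the third stage is omitted.

```python
# Minimal SAW sketch; all issue names, options and numbers are hypothetical.

def validate_scoring_system(weights, option_ratings, pool=100):
    """Check the SAW constraints described in the text."""
    if sum(weights.values()) != pool:
        raise ValueError("issue weights must sum to the rating pool")
    for issue, ratings in option_ratings.items():
        values = list(ratings.values())
        if max(values) != weights[issue]:
            raise ValueError(f"best option of {issue!r} must equal the issue weight")
        if min(values) != 0:
            raise ValueError(f"worst option of {issue!r} must be rated 0")
    return True


def score_offer(offer, option_ratings):
    """Global score of a package: the sum of the ratings of its options."""
    return sum(option_ratings[issue][option] for issue, option in offer.items())


# Stage 1: distribute 100 points among the issues.
weights = {"price": 60, "delivery": 40}
# Stage 2: rate the options within each issue (best = issue weight, worst = 0).
option_ratings = {
    "price": {"low": 60, "medium": 35, "high": 0},
    "delivery": {"fast": 40, "slow": 0},
}
validate_scoring_system(weights, option_ratings)

offer = {"price": "medium", "delivery": "fast"}
print(score_offer(offer, option_ratings))  # 75
```

Because the score is purely additive, any error in a single option rating carries over unchanged to every package that contains that option.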

2.2 The Negotiation Case and the Preferential Information

Various negotiation cases may be used for experiments with Inspire. In our experiment a Mosico-Fado bilateral negotiation case was implemented, in which a musician and a broadcasting company discuss the terms of a potential contract. In this case the negotiation template is defined by means of four issues, each with a predefined list of salient options, which allows 240 different offers to be built (see Table 1).

Table 1. Mosico-Fado negotiation template.

In the Mosico-Fado case each negotiator, representing either the musician or the broadcasting company, is provided with private information containing a detailed description of their preferences that should be used in building a negotiation offer scoring system. The structure of preferences of the parties is described both verbally and graphically. An example of preference description is presented in Fig. 1.

Fig. 1. Verbal and graphical representation of preferences in Inspire

As noted in the case description, the graphical representation of the preferences was elaborated by the negotiating parties and accepted by their supervisors. The circle sizes indicate the importance of each issue and option. However, as was also emphasized in the private information, the circles were drawn casually, so their radiuses do not necessarily reflect the preferences very precisely and accurately. Note that the description of the circles mentions both the circle sizes (areas) and their radiuses, which may be confusing, since these indicate different reference points in the process of building a formal scoring system of offers. Complete graphical information about the preferences of both parties is presented in the Appendix.

3 Measuring the Accuracy of the Negotiation Offer Scoring Systems

Inspire does not verify the correctness or accuracy of the scoring systems built individually by the negotiators; it allows them to rate the issues and options at their own discretion and according to their own understanding and interpretation of verbal and graphical preference information. Thus, a fundamental research question arises: whether and to what extent the negotiators adhere to the preference description while building their SAW-based negotiation offer scoring systems. The negotiators’ accuracy can be measured with two different statistical concepts: (1) by analyzing the relationships between the scoring systems and determining the correlation coefficients; (2) by analyzing the similarities of the scoring systems and measuring the distances between the negotiator’s own system and the reference one. The first of these approaches could be implemented if the relationships between the rankings of full packages were to be studied, each represented by a single frequency distribution. In our problem, each scoring system is represented by a series of five frequency distributions (one representing issue weights and four representing option ratings within each issue) with some elements of these distributions being strongly mutually dependent. This would require a thorough reconsideration and modification of the correlation-based approach. Therefore, the second approach will be applied here, which is easier to modify and interpret in the analytical context of our problem.

3.1 Ordinal Accuracy

Before measuring the similarities of the scoring systems, basic information about preserving the general preference information can be verified. It can be checked whether the negotiators follow the order of preference represented by the circle sizes for the ratings of both issues and options. This notion of agreement in defining preferences will be called ordinal accuracy. Formally, if \( n \) issues (or options) \( A_{1} , \ldots ,A_{n} \) are ordered according to decreasing preferences (the circle sizes representing these issues decrease while moving from \( A_{1} \) to \( A_{n} \)), the ratings \( u(A_{i} ) \) of the issues are accurate if they satisfy the following condition

$$ u\left( {A_{1} } \right) > u\left( {A_{2} } \right) > \ldots > u\left( {A_{n} } \right) . $$
(1)

For instance, if the preferences regarding the negotiation issues presented in Fig. 1 are analyzed and scored, the ordinal accuracy requires that u(“Number of concerts”) > u(“Number of songs”) > u(“Royalties for CDs”) > u(“Signing bonus”). The ordinal accuracy index of the scoring system built by the \( i \)th negotiator can be represented as a ratio of the number of correct rankings (\( n_{i}^{\text{cor}} \)), i.e., subjective rankings that are in agreement with rankings in the reference order, to the total number (\( n \)) of all the rankings that have to be built for the negotiation template:

$$ OA_{i} = \frac{{n_{i}^{\text{cor}} }}{n}. $$
(2)

Fig. 2. The structure of globally ordinally accurate negotiators

In our problem, \( n = 5 \), since there is one ranking representing the importance of the issues and four others, reflecting the orders of salient options for each issue respectively. Note that ordinal accuracy can also be measured, for instance, by means of the Kendall tau rank correlation coefficient. However, as mentioned before, this would require a modification of the original formula since not every pair of elements of the negotiation template can be compared (e.g. rankings of options of different issues cannot be compared).
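Condition (1) and the index of formula (2) can be illustrated with a short Python sketch. The ratings below are hypothetical; each list is assumed to be given in the reference order, from the most to the least preferred element.

```python
def is_ordinally_accurate(ratings_in_reference_order):
    """Condition (1): ratings must strictly decrease along the reference order."""
    r = ratings_in_reference_order
    return all(a > b for a, b in zip(r, r[1:]))


def ordinal_accuracy_index(rankings):
    """Formula (2): share of rankings that satisfy condition (1)."""
    correct = sum(is_ordinally_accurate(r) for r in rankings)
    return correct / len(rankings)


# Hypothetical negotiator: one issue ranking plus four option rankings (n = 5).
rankings = [
    [35, 30, 20, 15],      # issue weights: accurate
    [35, 20, 10, 0],       # options of issue 1: accurate
    [30, 25, 27, 10, 0],   # options of issue 2: order violated (25 < 27)
    [20, 12, 5, 0],        # options of issue 3: accurate
    [15, 15, 0],           # options of issue 4: a tie breaks the strict order
]
print(ordinal_accuracy_index(rankings))  # 0.6
```

Note that the strict inequality in condition (1) treats ties as inaccurate, which matters when the reference preferences themselves contain equally important elements.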

3.2 Cardinal Accuracy

By determining the global deviations (distances) between the ratings subjectively assigned by the negotiators and the ideal ratings which follow from the corresponding circles (areas or radiuses), cardinal accuracy of the negotiation offer scoring system can be measured. However, the specificity of the SAW algorithm, in which the option ratings of one issue depend on the ratings assigned previously to this issue (see Sect. 2.1), requires a different approach to measuring cardinal accuracy for issues and options. Cardinal inaccuracy of issue ratings (\( II_{i} \)) for the negotiator \( i \) is measured as a sum of differences in ratings for each issue \( j \) with respect to the reference ideal ratings:

$$ II_{i} = \mathop \sum \limits_{j} \left| {u_{j}^{\text{ref}} - u_{j}^{i} } \right| , $$
(3)

where \( u_{j}^{\text{ref}} \) is the reference rating (radius-based or area-based) of the \( j \)th issue, and \( u_{j}^{i} \) is the subjective rating of the \( j \)th issue defined by the \( i \)th negotiator.
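Formula (3) amounts to a sum of absolute deviations over the issues. A minimal Python sketch follows; the reference and subjective ratings are hypothetical numbers chosen only to illustrate the arithmetic.

```python
def issue_inaccuracy(reference, subjective):
    """Formula (3): sum of absolute rating deviations over all issues."""
    return sum(abs(reference[j] - subjective[j]) for j in reference)


# Hypothetical ratings (each set sums to 100 rating points).
reference = {"concerts": 32, "songs": 32, "royalties": 22, "bonus": 14}
subjective = {"concerts": 35, "songs": 30, "royalties": 20, "bonus": 15}

print(issue_inaccuracy(reference, subjective))  # 3 + 2 + 2 + 1 = 8
```

Since both rating sets sum to 100, a negotiator cannot deviate on one issue only, so the smallest possible nonzero value under integer ratings is 2.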

While determining the cardinal inaccuracy of option ratings we need only to verify if the proportions of the circle sizes (radiuses) are preserved by the negotiators regardless of the rating of the issue under consideration. This way we will avoid double-counting of the deviations resulting from the issue ratings incorrectly assigned. Thus, we will determine the normalized reference ratings for options of each issue separately (\( \bar{u}_{jk}^{\text{ref}} \)) and compare them with the normalized subjective ratings (\( \bar{u}_{jk}^{i} \)) of the negotiator to determine the normalized deviations. The normalized deviation for each option will be multiplied by the reference issue rating (\( u_{j}^{\text{ref}} \)) resulting in the option inaccuracy rate. Formally, cardinal inaccuracy of option ratings of the \( j \)th issue for the \( i \)th negotiator can be measured by the following formula:

$$ OI_{ij} = u_{j}^{\text{ref}} \cdot \sum\limits_{k = 1}^{N_{j}} \left| {\bar{u}_{jk}^{\text{ref}} - \bar{u}_{jk}^{i} } \right| , $$
(4)

where \( N_{j} \) is the number of options of the issue \( j \).

A simple example of measuring the inaccuracy of option ratings assigned by a representative of Fado for the issue of the number of concerts is presented in Table 2. The normalized deviations \( \left| {\bar{u}_{jk}^{\text{ref}} - \bar{u}_{jk}^{i} } \right| \) are then aggregated according to formula (4) and the option inaccuracy rate is determined as \( OI = 32 \cdot \left( {0 + 0.2 + 0.37 + 0} \right) = 18.24. \)

Table 2. Normalized inaccuracy rates for option ratings.
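The computation behind formula (4) can be sketched as follows. The normalization scheme is an assumption made for illustration (each option rating is divided by the highest rating within its issue, so the best option gets 1), and the option ratings are hypothetical; only the final line reuses the deviations quoted in the text.

```python
def normalize(ratings):
    """Rescale option ratings so that the best option gets 1 (assumed scheme).

    Under SAW the best option equals the issue weight, which is positive,
    so the division is safe for valid scoring systems.
    """
    top = max(ratings)
    return [r / top for r in ratings]


def option_inaccuracy(issue_ref_rating, ref_ratings, subj_ratings):
    """Formula (4): reference issue rating times the summed normalized deviations."""
    ref_n = normalize(ref_ratings)
    subj_n = normalize(subj_ratings)
    return issue_ref_rating * sum(abs(a - b) for a, b in zip(ref_n, subj_n))


# Hypothetical option ratings for one issue with reference weight 32.
ref_ratings = [32, 24, 8, 0]     # normalized: 1, 0.75, 0.25, 0
subj_ratings = [40, 20, 20, 0]   # normalized: 1, 0.5, 0.5, 0
print(option_inaccuracy(32, ref_ratings, subj_ratings))  # 32 * 0.5 = 16.0

# Arithmetic of the example in the text, using its normalized deviations.
print(round(32 * (0 + 0.2 + 0.37 + 0), 2))  # 18.24
```

Because the subjective ratings are normalized before comparison, a negotiator who misjudged the issue weight (40 instead of 32 here) is not penalized a second time at the option level.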

To determine the global cardinal inaccuracy rate for the whole scoring system of the \( i \)th negotiator, the issue inaccuracy rate and the option inaccuracy rates for all issues need to be aggregated:

$$ CI_{i} = II_{i} + \mathop \sum \limits_{j} OI_{ij} . $$
(5)
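As a brief illustration of formula (5), the aggregation is a plain sum. The input values below are hypothetical and stand for one negotiator's \( II_{i} \) and the four \( OI_{ij} \) rates.

```python
def global_cardinal_inaccuracy(issue_inaccuracy, option_inaccuracies):
    """Formula (5): CI_i = II_i + sum over issues j of OI_ij."""
    return issue_inaccuracy + sum(option_inaccuracies)


# Hypothetical values: II_i = 8 and OI_ij for the four issues.
print(global_cardinal_inaccuracy(8, [16.0, 4.5, 0.0, 2.5]))  # 31.0
```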

4 Online Experiment and Results

We analyzed the results of a negotiation experiment conducted in Inspire in the spring of 2014. The participants of this experiment were 378 students from Poland, Austria, China, Taiwan, Great Britain, Ukraine and Canada, paired into 189 active instances. Once the incomplete records had been eliminated, 176 representatives of the Mosico party and 174 representatives of the Fado party were considered to analyze the accuracy of building a negotiation offer scoring system and its impact on the negotiation outcome.

4.1 Ordinal Accuracy in Building the Scoring Systems

Analyzing the Inspire dataset, we were surprised to find that 52 representatives of the Mosico party (32 %) and as many as 114 of the Fado party (66 %) were inaccurate from the viewpoint of ordinal accuracy (\( OA < 1 \)). Such a high percentage of inaccurate Fados may be caused by the peculiar structure of preferences defined for their party, with the first two issues equally important and represented by circles of the same size. However, due to some optical illusions (see [14]), for some of them those two circles might have looked different. Therefore we eliminated from the list of inaccurate Fados those who claimed that the number of concerts is more important than the number of songs (or vice versa), but by no more than 5 rating points, and were accurate for other issues. This still left as many as 81 of them (46 %) inaccurate. The situation looked similar if ordinal inaccuracy was determined for the ratings of options within each negotiation issue (see Table 3). It is surprising that Mosicos, who were more accurate in defining the issue ratings, are now more inaccurate than Fados in building their individual option ratings for the successive issues.
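The relaxed rule applied to the Fados' issue ratings can be sketched in Python. This is one possible reading of the rule described above, not the authors' code: the function name is hypothetical, the ratings are invented, and the input is assumed to list the two equally important issues first, followed by the remaining two issues in reference order.

```python
def fado_issue_ranking_accurate(ratings, tolerance=5):
    """Relaxed ordinal check for a reference order whose top two issues are tied.

    A gap of at most `tolerance` rating points between the first two issues
    is accepted; the remaining issues must still be rated strictly lower,
    in decreasing order.
    """
    first, second, third, fourth = ratings
    top_two_close = abs(first - second) <= tolerance
    rest_ordered = min(first, second) > third > fourth
    return top_two_close and rest_ordered


print(fado_issue_ranking_accurate((35, 30, 20, 15)))  # True
print(fado_issue_ranking_accurate((40, 30, 20, 10)))  # False: gap of 10 points
print(fado_issue_ranking_accurate((32, 32, 15, 21)))  # False: order violated
```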

Table 3. Ordinal accuracy in option ratings.

Based on the information regarding the inaccuracy of the issue and option ratings, we determined the global ordinal accuracy index according to formula (2). Thus, we counted for each negotiator the number of accurate rankings out of five different rankings they were asked to build. The results, determined separately for the Mosico and Fado parties, are shown in Fig. 2.

Even though the percentage of fully inaccurate negotiators is the same for the Mosico and Fado parties, the numbers of fully accurate ones differ significantly. In the Mosico group there were 69 negotiators (39 %) who built their negotiation offer scoring systems preserving the ordinal preferential information for both issue and option ratings (\( OA = 1 \)). Among Fado’s representatives the group of fully accurate negotiators was 17 pp smaller than among Mosico’s. These relatively small percentages of accurate negotiators are intriguing and thought-provoking, since we did not expect the negotiators to map the preferential information into the system of ratings precisely, but only to follow the order of preferences visualized by the circle sizes. This did not require any sophisticated calculations or analysis but only a careful glance.

4.2 Cardinal Accuracy in Building the Scoring Systems

Next we analyzed the negotiators’ scale of cardinal accuracy of issue ratings using formula (3). We used two reference ratings: area-based and radius-based (see Appendix). When analyzing the cardinal inaccuracy of issue ratings for ordinally accurate and inaccurate negotiators we found that the results differ depending on the reference rating used (see Table 4).

Table 4. Mosicos’ and Fados’ cardinal inaccuracy for issue ratings (\( II \)).

No matter which reference rating is applied, the representatives of the Fado party who are ordinally accurate are, on average, more cardinally accurate than the ordinally inaccurate ones. The same margin of five rating points in differences between the scores of the first two issues was applied, as in the ordinal accuracy analysis. However, Fados seem to refer to radiuses rather than the areas of circles. For the Mosicos, there is no significant difference in cardinal inaccuracy if a radius-based reference system is used (\( p = 0.091 \)). However, the ordinally accurate Mosicos seemed focused more on circle sizes (areas) than on radiuses. If we compare them with ordinally inaccurate Mosicos, the difference in the cardinal accuracy is significant (\( p = 0.000 \)). Comparing the Mosicos and the Fados, applying the same notion of ordinal accuracy, we see that the Mosicos are cardinally more accurate than the Fados.

Next, based on formula (5), we determined the global cardinal inaccuracy rates for Mosicos and Fados in our experiments (see Table 5). The \( CI \) rates prove once again that the ordinally accurate negotiators are also far more cardinally accurate (for both reference ratings the differences are statistically significant for \( p = 0.000 \)) than those who did not preserve even the order of preferences. Therefore we can reject the conjecture formulated at the beginning of Sect. 4.2, that the ordinally inaccurate negotiators might have built rating systems that are relatively close to the reference ones (ideally accurate).

Table 5. Mosicos’ and Fados’ global cardinal inaccuracy (\( CI \)).

4.3 Accuracy of Scoring Systems and the Negotiation Outcomes

Knowing the scale of negotiators’ inaccuracy in defining the scoring systems we aimed at verifying its potential impact on the negotiation agreement. The inaccurate negotiators, if they rely on their incorrect scoring systems, may have a false impression of the negotiation reality, may interpret the negotiation progress and concessions incorrectly and, consequently, may accept mediocre or weak agreements. Therefore we analyzed the percentage of agreements reached by accurate and inaccurate negotiators and scored the agreements reached using the negotiators’ subjective scoring systems as well as the reference ones. The results for Mosicos and Fados are presented in Tables 6 and 7, respectively.

Table 6. Rates of agreements reached by the Mosico group.
Table 7. Rates of agreements reached by the Fado group.
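The comparison underlying Tables 6 and 7 scores the same agreement twice, once with the negotiator's own scoring system and once with a reference one. A minimal Python sketch follows; the two-issue template, the option labels and all ratings are hypothetical.

```python
def score(agreement, scoring_system):
    """SAW score of an agreement under a given scoring system."""
    return sum(scoring_system[issue][option] for issue, option in agreement.items())


def perception_gap(agreement, own_system, reference_system):
    """Subjective minus reference score; a positive gap means overrating."""
    return score(agreement, own_system) - score(agreement, reference_system)


# Hypothetical scoring systems for a two-issue template.
own_system = {
    "A": {"x": 50, "y": 40, "z": 0},
    "B": {"p": 50, "q": 30, "r": 0},
}
reference_system = {
    "A": {"x": 60, "y": 30, "z": 0},
    "B": {"p": 40, "q": 20, "r": 0},
}

agreement = {"A": "y", "B": "q"}
print(score(agreement, own_system))        # 70
print(score(agreement, reference_system))  # 50
print(perception_gap(agreement, own_system, reference_system))  # 20
```

A positive gap corresponds to a negotiator who believes the agreement is better than it is when measured by the reference ratings.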

In the Mosico group, if the results of accurate and inaccurate negotiators are compared within each type of the scoring system (individual, radius-based and area-based), no significant differences are observed. Thus, from the viewpoint of the external observer, both the accurate and inaccurate Mosicos reached agreements of similar quality. On the other hand, if we compare the outcomes for accurate and inaccurate Mosicos separately, we will see that the accurate negotiators, by relying on their accurate scoring system, had a correct perception of reality and were able to interpret the negotiation progress and history correctly. For them, the differences in the ratings of agreements between the individual, radius- and area-based scoring systems are not significant. Yet, the inaccurate Mosicos had, on average, a false impression of their efficiency and of the quality of their performance. They thought they had reached quite profitable agreements (77.7 rating points on average), while objectively their agreements were significantly worse, i.e. 72.9 if measured by the radius-based scoring system, and 72.0, if by the area-based one. We may presume that they similarly misinterpreted the whole negotiation process. The question is: if they had known the real value of the offers submitted and the potential agreement, would they have negotiated differently and obtained better results?

The situation is a little more evident if we analyze the results for the Fado group. Here, from the viewpoint of the external observer, the results obtained by the negotiators are objectively worse in the group of the inaccurate negotiators than in the group of the accurate ones (81.3 vs. 76.5 for the radius-based and 79.9 vs. 74.9 for the area-based scoring systems). Similarly, the inaccurate Fados interpreted their agreements to be significantly better (80.8 on average) than they actually were, when scored by means of the reference ratings (76.5 and 74.9 respectively).

5 Conclusions and Future Work

In our research we tried to check whether the negotiators build their negotiation offer scoring systems in accordance with their intrinsic preferences (or the ones that were imposed on them). We found that in a vast majority of situations SAW-based scoring systems are inaccurate and give the negotiators a false perception of the negotiation progress and of the results they obtain. Unfortunately, we are not able to answer unambiguously the question: what (if anything) would have changed in the negotiation style, concession strategy or the results if the inaccurate negotiators had built their scoring system correctly and had had a correct perception of the negotiation situation throughout the whole negotiation process? The results obtained for the Mosico and Fado groups (see Tables 6 and 7) are ambiguous: they confirm that the accurate Fados performed significantly better, while the Mosicos’ results, although even better, are not confirmed by statistical significance tests.

There is, however, another question that was not answered here, mainly due to the lack of adequate data, and which is of a more fundamental nature: what is the cause of building such inaccurate scoring systems and how can the negotiators be helped to avoid making errors in rating the issues and options? To answer the first part of this question, in-depth research is required that will examine the occurrence of a syndrome of fast thinking and various heuristics [3] in the analytical process of building negotiation offer scoring systems. It will also require experimenting with different methods of visualizing the preferences (e.g. using bars instead of circles) and different algorithms for eliciting the negotiators’ preferences. Hence, our future research will consist in designing and performing new electronic negotiation experiments investigating in detail the causes of inconsistencies in the preference elicitation processes in electronic negotiations and producing prescriptive conclusions on the methodological solutions that would eliminate potential behavioral and technical errors made by the negotiators or caused by a support algorithm that is too cognitively demanding.