1 Introduction

The present chapter lays out the basis for the understanding of Quality of Experience (QoE) as adopted in this book [Footnote 1]. The terms quality and Quality of Experience are typically used with an engineering goal in mind, reflecting the fact that perceived quality is a key criterion for evaluating systems, services or applications during the design phase or during operation. As such, QoE research often takes a measurement-centered, reductionist perspective to assess known services and identify quality-relevant criteria. How to create certain (possibly new) types of “experiences” is typically the domain of User Experience (UX) research. An in-depth comparison of QoE and UX is given in Chap. 3. In the present chapter, a combined engineering- and perception-oriented view is used to discuss work on QoE from different fields [Footnote 2].

In the present chapter, quality and QoE are addressed from the perspective of a person whose experiencing in a given situation involves a technical application, service or system. Figure 2.1 depicts the multi-layered context that characterizes the person’s situation. The signal(s) as well as the different contexts influence the perception and quality formation processes discussed in this chapter. The different contexts as well as the associated ecosystem of multimedia usage are discussed in more detail in Chap. 7. The contextual information is addressed in more detail in Chap. 4, in terms of factors influencing quality and QoE. In turn, the present chapter presents definitions and considerations in the context of QoE, focusing on the perceptual and cognitive processes underlying the quality formation in the perceptual world of the person.

Fig. 2.1 Different contexts a person may be embedded in, inspired by De Moor and Geerts, cf. [14, 19]. Each context is associated with a specific ecosystem that several different stake-holders are involved in, and in which the person takes different roles (as a viewer, friend, customer etc.), as further discussed in Chap. 7.

This perspective can be illustrated using the following example: A person watches a soccer match on TV at home with friends. Here, the signals are of acoustic and visual form [signal(s) in Fig. 2.1]. The person interacts with the other persons, possibly with the TV set and the home environment (interactional context). Jointly watching the soccer match in the home environment sets the situational context. The socio-cultural background of the group of friends forms the socio-cultural context. How the person under consideration experiences the soccer match and evaluates the quality of the (technically mediated) experience depends on the audiovisual signals and the contextual settings. As such, this information represents the inputs to the quality formation process discussed in this chapter.

The remainder of this chapter is structured as follows: Sect. 2.2 reviews the related work on quality and QoE in different fields, and provides an updated view of QoE, introducing complementary terms and concepts. In Sect. 2.3, a conceptual model of the quality formation process is presented. In Sect. 2.4, general considerations on quality assessment and evaluation are summarized, and Sect. 2.5 discusses open issues and trends.

2 QoE Foundations, Terms and Definitions

In this section, we discuss the terms and concepts of quality and Quality of Experience. In the first step, we introduce our view of the concepts ‘perception’ and ‘experience’—or ‘experiencing’—as used in this book.

Here, perception is the conscious processing of sensory information the human subject is exposed to. Perception is assumed to involve two successive processing stages before a percept finally appears in the perceiver’s world, namely,

  1. Conversion of stimuli via the respective physiologically adequate sensory organs into neural signals.

  2. Processing and transmission of these neural signals in the central nervous system up to the cortex, finally resulting in the appearance of specific percepts in the person’s perceptual world.

Based on this view, we define experiencing as follows:

Experiencing is the individual stream of perceptions (of feelings, sensory percepts and concepts) that occurs in a particular situation of reference.

Here, we follow the widely accepted understanding that experiencing [Footnote 3] can have hedonic (feelings) and pragmatic (concepts) aspects (see Chap. 5). In terms of the application domain of this book, experiencing may result, for example, from an encounter of a human being with a system, service or artifact. Experiencing in this definition does not include a quality judgement. Quality judgements are considered to be the result of additional cognitive processes on top of experiencing, as described in more detail in the remainder of the section. A conceptual model of the perception, experiencing and quality formation processes is presented in Sect. 2.3.

2.1 Quality and Quality of Experience: Related Work

In the following, we discuss different concepts of quality and important contributions from other authors, before we present an updated view of quality and quality formation.

Qualia

The concept of Quality of Experience can be related to the concept of Qualia. Based on the considerations by Jackson [24], Qualia can be seen as an inherent property of experiencing that cannot be shared by verbal description or technical means, that is, it can only be accessed via individual experiencing. The respective perceptual features may be referred to as Quale. Jackson writes [24]: “Tell me everything physical there is to tell about what is going on in a living brain, the kind of states, their functional role, their relation to what goes on at other times and in other brains, [...] you won’t have told me [...] about the characteristic experience of tasting a lemon, smelling a rose, hearing a loud noise or seeing the sky.” In the context of the present chapter, examples according to the Qualia concept are listening to a spatial audio production, or using a smartphone with intuitive touch input, both representing experiencing that cannot be explained verbally to a person who has never had a comparable encounter.

Qualitas and quality

Martens and Martens [35] discuss two extremes of existing approaches for understanding quality: (1) an “objective”, rationalistic and product-oriented approach, and (2) a perceptual, “subjective” approach. The authors discuss these two approaches along four quality definitions, of which (QD1) quality as qualitas and (QD2) quality as excellence/goodness are most useful in the context of this book. The approach of (QD1) focuses on generalizable characteristics and properties of the item under consideration, that is, quality as the description of the item’s characteristics. In contrast, the perceptual approach (QD2) requires the human evaluator to actually experience the perceptual ‘event’ under consideration and evaluate the experience in terms of “evaluated excellence or goodness”. This approach is strongly related to the degree of need fulfillment [35] or utility. Note that the two notions of quality (QD1 and QD2) are in line with Letowski’s work on sound quality [34].

Utility and “Quality of Experiencing”

Two connotations of the term utility [Footnote 4] in the context of experiencing have been distinguished by Kahneman [29]:

  • Experienced Utility ...as the judgment in terms of good/bad of a given experience, related with individually perceived “pleasure and pain”, “point[ing] out what we ought to do, as well as determine what we shall do” (Kahneman [28], making reference to Bentham [4]). Experience(ing) in this context may refer to painful medical investigations such as colonoscopy as in [29], or pleasant phases of experiencing, for example during a concert, or to the quality of life at large [29].

  • Decision Utility is considered by (external) observation in terms of whether or not certain decisions have been taken, for example on whether or not a service is being used, a low-quality phone call is being ended or a web-item is being clicked.

In principle, both connotations of utility are of relevance for this book: Experienced utility is related with perception and experiencing from an individual perspective. In turn, decision utility is a useful concept when it comes to whether or not a service or application is actually being used, and thus relates to the concept of acceptance (see next section and Chap. 7).

The previous and following discussions mainly focus on experienced utility, and it is noted that Kahneman explicitly uses the term quality of experience in this regard. Note that Kahneman has illustrated his ideas referring to quite different domains than the ones addressed in this book, such as medical treatments like colonoscopy, or a person’s own life (at large!). For assessment, Kahneman distinguishes a moment-based and a memory-based approach [28]: For the moment-based approach, momentary or instantaneous judgments of experience are asked for, and for remembered utility (memory-based), respective judgments refer to past or just ended phases (or episodes) of experience(ing). The so-called peak-end effect and temporal integration properties related with momentary or remembered quality (utility) are addressed in Chap. 9.
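The difference between moment-based and memory-based assessment can be illustrated with a small numerical sketch. The code below is not taken from Kahneman's work or from this book: the rating scale, the sample values and the function names are hypothetical, and the peak-end summary (mean of the most intense and the final momentary rating) is used here only as a commonly cited simplification of how remembered utility may deviate from the average of momentary judgments.

```python
from statistics import mean


def moment_based_utility(momentary_ratings):
    """Moment-based summary: plain average of the instantaneous judgments."""
    return mean(momentary_ratings)


def peak_end_estimate(momentary_ratings):
    """Memory-based summary (assumed heuristic): mean of the peak and the final rating."""
    return (max(momentary_ratings) + momentary_ratings[-1]) / 2


# Hypothetical annoyance ratings (0 = none, 10 = extreme) sampled during one episode.
ratings = [2, 3, 8, 4, 1]
print("moment-based:", moment_based_utility(ratings))
print("peak-end estimate:", peak_end_estimate(ratings))
```

In this toy episode the two summaries differ, which is the kind of gap between momentary and remembered judgments that the two assessment approaches are meant to capture.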

Standards’ Views

One of the most comprehensive reviews of quality definitions has been given by Reeves and Bednar [54]. They identify the most pervasive definition to be “the extent to which a product or service meets and/or exceeds a customer’s expectations”, which they attribute to the service marketing literature. According to their review, services had been the most difficult aspect to cover in quality definitions up to that date. Around 1990, it was acknowledged that “only customers judge quality” and “all other judgments are essentially irrelevant” (cited by [54] from [67]). It is noteworthy that this perspective is well reflected in standardized quality definitions, such as the one in ISO 9000:2000 [21]:

  • Quality ...“is the ability of a set of inherent characteristics of a product, system or process to fulfill requirements of customers and other interested parties”.

The current definition of Quality of Service (QoS) by the ITU-T is similar to the ISO-definition of quality given above, with an explicit view from a service operator’s or manufacturer’s perspective:

Quality of Service: [The] Totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service.

The standardized QoE definition most frequently used in the QoE (and QoS) context is the one according to ITU-T Rec. P.10 (Amendment 2, 2008):

  • QoE(P.10) ...“The overall acceptability of an application or service, as perceived subjectively by the end user.”

  • Note 1: Includes the complete end-to-end system effects.

  • Note 2: May be influenced by user expectations and context.

It was pointed out by Möller [38] and others that the inclusion of the term acceptability as the basis for a QoE definition is not ideal. As a consequence, acceptability was newly defined during the Dagstuhl Seminar 09192 in May 2009 [38]:

  • Acceptability ...“is the outcome of a decision [yes/no] which is partially based on the Quality of Experience.”

It is noted that this definition is in line with Kahneman’s decision utility.

Several authors such as Martens and Martens [35] and Jekosch [26] have made reference to quality as defined by earlier engineering-, service- or production-related standardization bodies.

Quality, Quality Elements and Quality Features

A definition of quality extending the standards’ view is that by Jekosch [26]:

  • Quality results from the “judgment of the perceived composition of an entity with respect to its desired composition”.

Here, the desired composition refers to the set of internal references and expectations against which the perceived composition is being compared.
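As a purely illustrative sketch, the comparison of perceived and desired composition can be pictured as a gap between two sets of feature values. Nothing below is part of Jekosch's definition: the feature names, the 0 to 1 scale, the optional weights and the use of a weighted mean absolute difference are all assumptions made for the sake of the example, and the actual judgment is a perceptual and cognitive process rather than an arithmetic one.

```python
def feature_gap(perceived, desired, weights=None):
    """Weighted mean absolute gap between perceived and desired feature values.

    Both arguments map hypothetical feature names to values on a 0..1 scale.
    A gap of 0 means the perceived composition matches the desired one.
    """
    if perceived.keys() != desired.keys():
        raise ValueError("perceived and desired must describe the same features")
    if weights is None:
        # Assumption: all features matter equally unless stated otherwise.
        weights = {name: 1.0 for name in desired}
    total_weight = sum(weights[name] for name in desired)
    weighted_sum = sum(
        weights[name] * abs(perceived[name] - desired[name]) for name in desired
    )
    return weighted_sum / total_weight


# Hypothetical example: the picture is less sharp than desired, loudness is as desired.
perceived = {"sharpness": 0.5, "loudness": 0.75}
desired = {"sharpness": 1.0, "loudness": 0.75}
print(feature_gap(perceived, desired))
```

The desired values stand in for the internal references and expectations mentioned above; in practice they differ between persons and change over time.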

To reflect the design process in typical quality management or engineering concepts, in [26] Jekosch takes up the definition of quality element from the Deutsches Institut für Normung (DIN):

Quality element ...is the “contribution to the quality of a material or immaterial product... in one of the planning, execution or usage phases.” [26]

In simple terms, quality elements can be seen as the material or immaterial knobs and screws that may affect perceived quality. In contrast, a quality feature can be described as [26]:

Quality feature ...is the perceived characteristic of an entity “that is relevant to the entity’s quality”.

Factors affecting quality perception (that is, quality elements) are summarized in Chap. 4, and quality features for different multimedia services are outlined in Chap. 5. An in-depth discussion of the relation between QoS and QoE is given in Chap. 6.

2.2 Quality of Experience: Updated Terminology

From the previous discussions, it is obvious that the term quality has different connotations, depending on the context it is used in (see work by Parasuraman et al. [45], Reeves and Bednar [54], Blauert and Jekosch [7], Jekosch [26] and Martens and Martens [35]). In this subsection, we present our synthesis of the different views on quality and present new or updated definitions of relevant terms.

For the following considerations, we apply Jekosch’s definition of quality [26] so as to exclusively address perception that involves sensory processing of external stimuli:

Quality (based on experiencing) results from the “judgment of the perceived composition of an entity with respect to its desired composition”.

This way, we explicitly distinguish it from assumed quality:

Assumed quality corresponds to the quality and quality features that users, developers, manufacturers or service providers assume regarding a system, service or product that they intend to be using, or will be producing, without however grounding these assumptions on an explicit assessment of quality based on experiencing.

Here, it is noted that the underlying assumptions or expectations are positioned at a different level of the perceptual/cognitive system than actual sensory and emotional references [Footnote 5], namely, at the level of concepts. Assumed quality as introduced here comprises the traditional views of quality as it was used up to the 1990s in the context of quality management, for example in the production cycle in terms of excellence and conformance to specifications (cf. Reeves and Bednar [54]). Yet, to a certain extent, it also includes the view of quality in terms of “meeting and/or exceeding customer’s expectations” [54], which is more in line with the definition of quality (based on experiencing) as given above. However, assumed quality excludes explicit experiencing involving sensory processing of external stimuli.

Another term used in the following is quality of experiencing. This term is equivalent to Kahneman’s use of “quality of experience” [29] and the related concept of experienced utility outlined in more detail in Sect. 2.2.1. We here define this concept as follows:

Quality of experiencing is the degree of delight or annoyance of a person during the process of experiencing [Footnote 6]. It results from the person’s evaluation of the fulfillment of his or her expectations and needs with respect to the utility (pragmatic and hedonic) in the light of the person’s context, personality and current state.

In the above definition, context refers to the multi-layered view discussed in Sect. 2.1, see Fig. 2.1. Personality refers to “...those characteristics of a person that account for consistent patterns of feeling, thinking and behaving”, following Pervin and John [48], and current state is used in terms of “situational or temporal changes in the feeling, thinking or behavior of a person” (translated from German from Amelang et al. [1]). Note that the current state is both an influencing factor of experiencing (see also Chap. 4), and a consequence of the experiencing.

In this chapter, quality of experiencing refers to judgments during or after experiencing (cf. momentary utility/experience versus remembered utility/experience as in [28, 29], see previous subsection). In the following, for the applications addressed in this book, let us consider that the experiencing explicitly involves some kind of technology that impacts the signals presented to the person. For example, this may be a person’s overall judgment on the quality of experiencing a concert show, or a soccer match on television together with friends. Note that we use the term experience here referring to an evaluation of the experiencing at a given moment in time, or in retrospect, considering a certain period of experiencing (cf. remembered utility or experience, [28], discussed in detail in Chap. 10).

For the special case of quality of experiencing addressing the context of using multimedia services and applications, we proposed the following definition in the Qualinet White Paper [40]:

  • Quality of Experience (QoE): “is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.”

Here, an application and a service are defined as:

  • Application: “A software and/or hardware that enables usage and interaction by a user for a given purpose. Such purpose may include entertainment or information retrieval, or other.” [40]

  • Service: “An episode in which an entity takes the responsibility that something desirable happens on the behalf of another entity.” (Dagstuhl Seminar 09192, May 2009, cited after [60])

For the definition of QoE, the context of applications or services adds a number of specifications to the definition of quality of experiencing: A snapshot is taken, so that experiencing is replaced by experience. Further, the person takes the role of a user [14, 19]. The experiencing happens in the context of using the application or service. In our definition of quality of experiencing, utility is considered to have both pragmatic and hedonic connotations, where enjoyment is implicitly considered in terms of a (perceived) need.

However, we identify a major limitation of the above QoE definition in the fact that it addresses the explicit experiencing of an application or service. Instead, we believe that a more global view should be taken that also comprises the evaluation of the contribution of a given application, system or service implementation to the quality of experiencing as defined above in a more global sense. Further, the delight or annoyance related with the experiencing needs to be evaluated to come to QoE, which appears less clear from the above definition. As a result, the following updated definition of QoE is proposed:

Quality of Experience (new) (QoE) is the degree of delight or annoyance of a person whose experiencing involves an application, service, or system. It results from the person’s evaluation of the fulfillment of his or her expectations and needs with respect to the utility and/or enjoyment in the light of the person’s context, personality and current state.

With the inclusion of the term system, even the use of, for example, concert halls, public address (PA) systems or television sets can be included in the QoE definition. We here acknowledge the fact that a person who uses an ICT (Information and Communication Technology) product actually takes the role of a user, see De Moor’s and Geerts’ work [14, 19]. However, it appears less evident that a person attending a concert, and possibly judging the quality of experiencing the concert including the employed PA system, is an actual user. As a consequence, we have re-introduced the person instead of the user. It is clear that, if the interaction with the application, service or system is at the core of the consideration, the person mainly takes the role of a user.

3 Experiencing and Quality Formation

In the following, we present a conceptual model of the quality formation process, taking the perspective of the experiencing person. The quality formation process comprises a perception-component at its basis, as well as the higher-level reference-based quality formation, which we consider as parallel and interactive processes.

3.1 Perception and Experiencing Process

The basis for quality and QoE as addressed in this book is perception. Figure 2.2 schematically depicts a current concept of how the neural signal processing during perception takes place in an iterative way. The process of perception starts when stimuli reach one or more of the human sensory organs. In the sensory organ(s), the physical representations of the stimuli are converted into neural representations that include characteristic electric signals. This representation is conveyed to the brain through neural transmission for further processing. Throughout the transmission to the respective brain region, these representations are transformed from initial representations of stimuli into more abstract, symbolic representations. For details on assessing the neurophysiological basis of these processes, refer to, for example, Chap. 8.

Fig. 2.2 Schematic illustration of the authors’ concept of the perception process. Circles represent perceptual processes, two parallel horizontal lines represent storages for different types of representations, and boxes outside of the person represent input information. Note that continuous lines represent direct input to the perception process. Here, for simplification, contextual information is assumed to be processed, too, but by parallel perceptual processes not shown in the picture. The person’s state refers to both the cognitive as well as the physiological, current state of the person. In turn, assumptions here refer to the person’s attitude and concepts. See text for further details.

Current physiological knowledge supports the following model assumptions with regard to different levels of neural processing. In the first, sensory processing step, neural processing by the sensory periphery results in a multidimensional, topologically organized neural representation, covering aspects of time, space, frequency and activity (see e.g. Raake and Blauert [51] for a model framework for spatial audio perception and quality, and complementary considerations in the work by Blauert et al. [8]). The neural representation precedes the actual formation of perceptual objects (perceptual event formation). Further steps towards this goal are conducted at higher levels of the brain, where multiple parallel processors perform the bottom-up pre-segmentation of the multidimensional feature representation, leading to a Gestalt-analysis [Footnote 7] of features for object and event identification. Subsequent processing steps analyze the pre-segmented features in terms of objects in the specific modalities, such as visual objects or aural scene objects, or words in an utterance. Already at these levels, perception is influenced by remembered perceptual events and subsequent feedback-based adaptation of the processing, such as, for example, noise suppression once a human voice is sensed. As a consequence, the neural features likely to belong to the same object are associated. The pre-segmentation and object formation can be subsumed under the process perceptual event formation. At this stage, information from other modalities is already integrated, via respective sensory processing.

Based on internal references and rules, hypotheses are created in a top-down manner and verified against the bottom-up perceptual evidence [8]. In Fig. 2.2, this process is denoted as anticipation and matching. The results of the iterative processes of perceptual event formation and anticipation and matching are recognized objects of perception that have a specific perceived character. The presence of certain stimuli may lead to exploratory action, such as the so-called turn-to reflex in audio-visual perception, where a low-level representation of an impulsive sound from a given direction typically causes a reflexive turning of the head towards the sound source [11]. Similarly, in a top-down manner, actions such as exploratory head movements [5], tactile exploration [33] or overt-attention eye movements may be carried out due to salient properties of the stimuli and/or contextual information, or may be governed by higher-level cognitive processes that direct visual attention [16, 22]. This, in turn, alters the sensory and subsequent neural input information (see also [42]).

It is noted that contextual and/or task-related information given to persons is processed via their sensory organs and the subsequent neural processing, too, possibly in other modalities. Such information either directly affects the perceptual process, or does so via information made available in terms of higher-level concepts, here referred to as assumptions (see Fig. 2.2). Further, in principle, perception is largely co-determined by the person’s (current) state. This state reflects the “situational or temporal changes in the feeling, thinking or behavior of a person” (translated from German from Amelang et al. [1]).

Memory and Perceptual References

In Fig. 2.2, different stores (storages) are depicted by parallel lines. According to authors such as Cowan [13], Coltheart [12] and Baddeley [2], different levels of memory have been identified, with respective roles in the perception process, and respective storage durations. Such memory levels are:

Sensory memory: Is a peripheral memory that stores sensory stimulus representations for short durations between 150 ms and 2 s so as to be retrieved by higher processing stages [2, 12, 13, 36].

Working memory: Stores re-coded information at symbolic level for longer durations from a few up to tens of seconds [3].

Long-term memory: Covers longer time spans up to years or even a full lifetime. It involves multiple stages of encodings in terms of symbolic and perceptual representations [2]. Current theories assume that a central executive component controls the linking between long-term memory and working memory via an episodic buffer at working memory level that integrates information into episodes, and that this central component is associated with attention [3].

Perceptual references as depicted in Fig. 2.2 can be present at different levels of memory: Working memory for the perceptual integration of a scene and respective scene analysis, as well as information retrieved from long-term memory, for example for the identification of objects in a scene or words in an utterance. Similarly, the perceived character or the respective perceptual event or flow of events can be situated in working memory, or be stored in long-term memory, for example after verbal or episodic re-coding has occurred (cf. Chap. 9).

In this context, learning of perceptual or conceptual references is directly associated with expertise and know-how. In Fig. 2.2, learning is considered as being implicitly integrated into the processes that are involved in perception, which enables more fine-grained performance with learning, as well as the increasing availability of respective detailed references in long-term memory. For example, a person who is fluent in a learned language can relate utterances with respective references, and a skilled musician or sound engineer will be able to associate a given auditory percept with respective actions, while an unskilled person is usually not able to do so. Considerations on categories of references can be found in Neisser’s cognitive system theory, cf. [42].

3.2 Quality-Formation Process

A conceptual model of the quality formation process has been proposed by Jekosch [26], further adapted in [50]. This view is extended in the following, see Fig. 2.3. The quality-formation process can be seen as a parallel but higher-level cognitive process related with the process of experiencing (cf. Sect. 2.3.1). Here, it is assumed that experiencing itself may be subject to quality of experiencing evaluation, in case that the person reflects upon it (reflection & attribution in Fig. 2.3). This reflection can be triggered by an external task to evaluate what has been experienced (for example in a quality test), during or after the process of experiencing [28, 29]. Here, the task is contained in the assumptions as the abstract conceptual expectations and attitude of the person (Fig. 2.3). Or, the reflection may be triggered by unexpected events, where the experiencing deviates from assumptions.

Fig. 2.3 Quality formation process during experiencing (includes Fig. 2.2). The picture extends the initial ideas from [26, 50]. See text for details. Note that this picture does not include the interaction components that need to be added for services such as human-to-human communication or human-computer interaction (for further discussion see Chap. 11).

The triggering of an actual quality evaluation is represented in Fig. 2.3 by a quality-awareness component that operates like a cognitive gate, focussing the person’s attention on some sort of quality evaluation. The resulting reflection is linked with the identification of emotional, sensory, conceptual or actional quality features of the experience, as well as respective desired features. In this particular case, the output of the quality formation process, labeled as quality in the picture, corresponds to the quality of experiencing. According to the above definitions, the final step of quality formation lies in some kind of comparison of expected and experienced features (cf. Chap. 5 and further considerations on expectation in Sect. 2.2.1). Examples of this case are unexpected events in the plot of a movie that the person watches, which may lead to a positive or negative judgment of quality of experiencing.

Since the experiencing results from the processing of the (perceived) character of the items under consideration (cf. Sect. 2.3.1), any impact of technology on the perceived character may alter what is experienced, for example in terms of the degree of immersion or enjoyment. During the reflection and attribution stage, the causes for certain states of experiencing may be reflected upon. Here, the technology or system as the underlying cause of enjoyment or annoyance may not be noticed as such. A typical example is that of a telephone conversation with substantial delay on the line, where experienced conversation problems may be attributed to the other interlocutor rather than the delay induced by the system [53, 57]. In cases where the perceived character is considered to be the cause, and the person attributes this to the system, the resulting quality evaluation may comprise both notions of quality of experiencing and of quality based on experiencing (“I did not enjoy watching the documentary on TV yesterday, since the quality of the picture was so bad”; “The movie session yesterday at your place was amazing, your projector is really awesome!”). Here, the quality awareness is triggered by events in terms of perceived character.

Another case is that of an explicit quality test in the laboratory or in the field. Here, test-specific contextual or task information affects the assumptions based on which a person may experience certain stimuli. According to Jekosch’s terminology, the person conducts a “controlled quality evaluation” [26], and quality awareness is triggered by the respective task- or context-based assumptions [Footnote 8]. Here, the resulting quality typically corresponds to quality based on experiencing, of course also depending on the employed test method (see Sect. 2.4). A similar case is that of a person who has, for example, the intention to buy a new multimedia device. Then, too, respective assumptions may trigger an evaluation of different systems in the shop, leading to a judgment of quality based on experiencing.

In all of the cases discussed up to here, the quality evaluation of the service, application or system involves an actual experiencing including the respective perception process. However, as mentioned earlier, users or system designers often develop a notion of assumed quality before or without an actual process of experiencing taking place. This notion may be strongly influenced, for example, by what people read or hear about a product, or what they think about the brand. Here, the perception- and experiencing-path shown on the right side of Fig. 2.3 does not carry information from direct sensory processing that can be exploited during quality formation. It is currently under debate within the QoE community which criteria must be fulfilled by respective non-perceptual sources of information in cases where no ground truth data in terms of explicit quality based on experiencing is available. Such sources of information can include system specifications, quality metrics such as PSNR (Peak Signal-to-Noise Ratio) or SSIM (Structural Similarity index, cf. [64] and Chap. 19), quality prediction algorithms, or even models of the human quality formation process (Sect. 2.4). It is generally accepted that key performance indicators (KPIs) such as packet loss or stalling rate alone cannot be used as a direct measure of quality in the sense laid out here (e.g. [17, 20, 49]).
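To make concrete what a purely signal-derived metric such as PSNR computes, the following minimal sketch may help. It is not taken from the chapter: the function name `psnr`, the flat list representation of the signals, and the default peak value of 255 (8-bit samples) are assumptions made for illustration. The sketch uses the standard definition, 10·log10(peak² / MSE), and involves no perceptual processing at all, which is why such a value on its own cannot be read as quality based on experiencing.

```python
import math


def psnr(reference, processed, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally long sample sequences.

    `reference` and `processed` are flat sequences of sample values
    (hypothetical representation; a real image would be flattened first).
    `peak` is the maximum possible sample value (assumed 255 for 8-bit data).
    """
    if len(reference) != len(processed) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")

    # Mean squared error between reference and processed signal.
    mse = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)

    if mse == 0:
        # Identical signals: the ratio is unbounded.
        return math.inf

    return 10.0 * math.log10(peak ** 2 / mse)
```

For instance, `psnr([0, 0, 0, 0], [0, 0, 0, 10])` yields an MSE of 25 and thus a PSNR of about 34.15 dB. Whether a difference of that size is noticed, let alone judged as annoying, is not something the number itself can tell.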

For all the cases discussed above, the person’s state as well as his/her personality play a key role, and affect several of the presented processes. Further, as outlined in the description of the perception and experiencing process in Sect. 2.3.1, personality is contained in the processes themselves, as well as in the value system established by references. Since the underlying references involve memory, which is accessed during quality formation, they are influenced by contextual factors and undergo temporal changes. In the process, perceptions and knowledge about the service or system are turned into references that belong to the domain of the person’s expectations. The reference formation and assessment of features in terms of their plausibility are performed in a top-down manner, where attentional processes at different levels of the perceptual-cognitive system of the human person steer the information provided by the bottom-up components (cf. Raake and Blauert [51]).

References and Semiotics

Let us now take a closer look at internal references and their use during quality perception as discussed above. A reference-related concept initially suggested by Piaget is the one of schemata and the respective formation or adaptation processes in terms of accommodation and assimilation (see e.g. Neisser’s [41] and Jekosch’s [26] works). This concept is useful for the (qualitative) understanding of perceptual and conceptual references and their formation: In case of unknown perceptual/cognitive information, a disequilibrium with available references (schemata) and thus the anticipated event may result. In this context, assimilation refers to the adaptation of the stimulus-related representation, so as to fit to an existing schema. In turn, accommodation refers to the case that not the representation of the perceived or experienced, but the (reference) schema is adjusted. If a person encounters multiple similar phases of experiencing over time, the initially flexible schema may crystallize into a new schema during a learning process. These considerations help to understand the formation of references, for example when using new types of technology such as spatial audio or 3D video.

Fig. 2.4

Quality perception in the context of creation/production. The person and creator may be identical. Quality of experiencing or quality based on experiencing (for the respective definitions see Sect. 2.2.2) will be used by the creator as target for optimization. Obviously, for the creator, the creation process is part of the experiencing, too. Note that the creator’s experiencing during creation is a different one from the experiencing of the created, cf. [26]

It is useful to further consider that perception and cognition as well as communication can be discussed in terms of the underlying “signs”. Jekosch [25, 26] has introduced semiotics, that is, the science of signs, into quality assessment research. Semiotics addresses the relation between the sign carrier and the associated meaning, and the perspectives of different persons that may interact with a sign such as a picture, video sequence or speech message. To this aim, different sign models have been introduced [26, 43, 44, 47]. The classical triadic form is composed of a sign carrier, the referent, and the meaning. In our context, the sign carrier may be the physical form of the sign as in case of a transmitted video sequence or a word in a phone call. The referent is the item the sign stands for, and may be abstract or concrete (for example, the specific chair shown in a video sequence). The meaning results from the interpretation of the sign by the interpreting person. The dynamic process during which the effect of a sign (or rather of an interconnected set of signs) is created is referred to as semiosis [43]. Here, semiosis can be any kind of interpretation of a sign by a cognitive system. Obviously, a different meaning may be assigned by different interpreting persons, who can have the role of the creator or the receiver of the sign.

Semiotics is a very useful concept to discuss, for example, the criteria based on which quality is being judged by a given person, that is, whether the sign carrier, the referent or the meaning for the person have been addressed. For the example of a photograph, the judgment could address the carrier (in terms of the camera, lighting conditions, framing, coding, resolution, paper used for printing, etc.), the referent (what objects are shown in the picture), or the meaning (what does the picture tell, what is its impact on me, etc.?).

The creation and experiencing processes of media, involving technology during their creation, are illustrated in Fig. 2.4. It should be noted that technology comes into play at different stages here, namely at the creation stage, the (post-)processing stage, and the presentation stage (including possible transmission and display). Further note that the presentation may also apply to the viewing by the photographer during post-production. Artists or content producers create entities (carriers, signs) that can be experienced, and thereby may attempt to deliberately provoke or achieve specific experiencing. As discussed by the authors of this chapter also in the Qualinet QoE White Paper [40], in terms of semiosis, “meaning” is associated with the creator’s intentions (“sender”), while at the “receiving” end, “meaning” results from interpreting the content during experiencing.

Expectations and Service Context

Let us now take a more service-oriented viewpoint, discussing that quality is based on the comparison of perception with expectations. The aspect of expectation has been addressed in a more global manner in the context of marketing research, considering the person’s role as a customer (cf. Fig. 2.1). Service Quality is used in the respective works by Parasuraman et al. [46], Boulding et al. [9], and Zeithaml et al. [66] in terms of perception vs expectations. Here, perception may refer to both the perception during an encounter with a service, and the conceptual impression of the Service Quality associated with a given company after a number of encounters, namely in terms of Customer Satisfaction and Dissatisfaction (CS/D, [9, 46, 66]).

A model of expectations vs perceptions-based Service Quality has been proposed by Boulding et al. [9], introducing different types of expectations. Here, the impact of external information on expectations is explicitly considered. To this aim, two types of expectation are distinguished, namely will expectations in terms of what users expect will happen in their next interaction with a company’s service, and should expectations in terms of what should happen in that next encounter, based also on what they may know about the performance of competitors’ services. Both types of expectations are assumed to be time-varying and dependent on what has been perceived during previous service encounters. Boulding et al. further distinguish should expectations from ideal expectations in terms of what the customer wants “in an ideal sense” for the respective type of a service.

Zeithaml et al. [66] distinguish two levels of expectations in relation to the acceptance of a certain service configuration in a given context: (1) The “desired service” corresponds to what the user wishes to have, in terms of a construct in-between the should and ideal expectations as of Boulding et al. [9]; (2) the “adequate service” reflects what the user may still perceive as acceptable under given contextual and situational constraints, for example related to the current weather or the location the user is in (and, for example, respective degradations, as they may be encountered during mobile service usage). Hence, the “adequate service” expectation-level is what determines the acceptability for the customer. The zone in-between the two expectation-levels (1) and (2) is referred to as the “zone of tolerance” for what is being perceived [66]. This concept of expectation has been adopted in recent work by Sackl and Schatz [56], who have applied it for explaining different quality tests that varied in terms of the considered user-types (affecting the ideal or “desired services” expectation level), and the context-specific influences (assumed to be affecting mainly the level of “adequate services”). It is noted that this Service Quality perspective bears several similarities to the quality taxonomies developed by Möller for different types of telecommunications-related services, see [37, 38] and Chap. 5.
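The two expectation levels and the zone of tolerance between them can be illustrated with a small sketch. It is purely illustrative and not part of the cited models: the names `ExpectationLevels` and `classify_perception`, the use of a single numeric scale for perception and expectations (here an arbitrary 1 to 5 scale), and the example values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectationLevels:
    """Hypothetical container for the two expectation levels (same scale as perception)."""
    adequate: float  # what is still acceptable in the given context
    desired: float   # what the user wishes to have

    def __post_init__(self):
        if self.adequate > self.desired:
            raise ValueError("adequate level must not exceed desired level")


def classify_perception(perceived: float, levels: ExpectationLevels) -> str:
    """Locate a perceived service level relative to the zone of tolerance."""
    if perceived < levels.adequate:
        return "below adequate"
    if perceived <= levels.desired:
        return "within zone of tolerance"
    return "above desired"


# Assumed example: the same perceived level of 3.0 in two usage contexts.
# Only the "adequate" level is lowered for the mobile context.
at_home = ExpectationLevels(adequate=3.5, desired=4.5)
on_train = ExpectationLevels(adequate=2.5, desired=4.5)

print(classify_perception(3.0, at_home))   # below adequate
print(classify_perception(3.0, on_train))  # within zone of tolerance
```

The example mirrors the argument in the text: the context mainly shifts the “adequate service” level, so that an identical perception may be acceptable in one situation and unacceptable in another.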

Another noteworthy expectation-related perspective addresses the (product) features that underlie customer satisfaction. According to Kano’s model [30], features can be grouped into three types of requirements: (1) Must-be requirements (sometimes referred to as hygiene factors): their under-fulfillment leads to dissatisfaction, while their fulfillment does not lead to satisfaction (example: today’s touch-control in smartphones); (2) one-dimensional requirements (i.e. performance factors): their fulfillment is linearly related with satisfaction (example: bandwidth of customer’s home internet connection); (3) attractive requirements: unexpected features that, if fulfilled, lead to delight (example: high resolutions of smartphone displays when first introduced some time ago). With time, features that have initially been of type (3) will ultimately end up as features of type (1), that is, are generally expected to be fulfilled. We will not further detail the Kano model and surrounding work in marketing research here. It is nevertheless a useful tool for describing why certain service innovations such as color TV or later high definition video eventually become must-be requirements.

4 Quality Assessment

The central question for quality assessment is how to operationalize the concept of QoE in terms of performing reliable and valid measurements. The respective quality of quality assessment methods [37] is of cardinal importance, since the respective results can easily be misused. The overarching question is: How can we quantify quality, and how can we measure it? This question is of course not unique to media-related quality (of experiencing) as mainly addressed in this book, but also extends to numerous other disciplines, for example food quality (cf. Lawless and Heymann [32]) or service quality in a broader sense (cf. Parasuraman et al. [45], Reeves and Bednar [54]). In this context, according to Jekosch [25], assessment is the “measurement of system performance with respect to one or more criteria. Typically used to compare like with like, whether two alternative implementations of a technology, or successive generations of the same implementation”, with the criterion being quality based on experiencing or quality of experiencing. Ideally, quality assessment methodologies should act as a translator between the quality elements (see above and Chap. 4), and QoE, or the underlying quality features (see Chap. 5). Quality assessment methods can be classified into perception-based and instrumental ones, depending on whether human subjects are involved in the assessment process or not. A brief discussion of these two assessment approaches is given in the following. More details can be found, for example, in [37, 39, 50] for speech and audio quality, in [52, 63, 65] for video quality, and in [32] for food quality.

Perception-based methods are the most valid way to assess quality, and typically provide the ground-truth data for the development of instrumental methods. Perception-based methods are used in tests with human evaluators to gather quality-related information for a certain test condition or set of stimuli. To this aim, test subjects are presented with one or several simultaneously or subsequently available stimuli, or are involved in an interaction with a system or another person via the system. The test participants are asked for (quantitative) ratings of momentary or remembered quality on a set of scales, or for qualitative descriptions of the features of the stimuli. In a subsequent statistical analysis of their judgments, a QoE value for each of the test conditions is determined. This and more complex statistical analysis of the test data can provide information about the underlying structure and dependencies on the applied test conditions, that is, the quality elements.
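The statistical analysis step can be sketched as follows, under stated assumptions: the per-condition QoE value is taken to be the mean of the individual ratings (a mean opinion score, which is one common choice and is not prescribed by the text), the confidence interval uses a normal approximation rather than Student’s t distribution, and the function name `mos_per_condition` and the `(condition, score)` input format are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist, mean, stdev


def mos_per_condition(ratings, confidence=0.95):
    """Aggregate individual ratings into one value per test condition.

    `ratings` is an iterable of (condition, score) pairs (assumed format).
    Returns {condition: (mean_score, ci_half_width)}, where the half-width
    is based on a normal approximation and is NaN for a single rating.
    """
    by_condition = defaultdict(list)
    for condition, score in ratings:
        by_condition[condition].append(score)

    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)

    result = {}
    for condition, scores in by_condition.items():
        if len(scores) > 1:
            half_width = z * stdev(scores) / sqrt(len(scores))
        else:
            half_width = float("nan")
        result[condition] = (mean(scores), half_width)
    return result


# Assumed toy data: three ratings for condition "A", two for condition "B".
example = [("A", 5), ("A", 4), ("A", 3), ("B", 2), ("B", 2)]
for condition, (score, ci) in mos_per_condition(example).items():
    print(f"{condition}: {score:.2f} +/- {ci:.2f}")
```

With as few ratings per condition as in the toy data, the normal approximation understates the interval width; an actual test would use more participants and typically a t-based interval.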

Instrumental methods provide estimates of quality using an appropriate algorithm or instrument. These estimates are based on quality metrics such as the Peak Signal-to-Noise Ratio (PSNR) [64], estimation algorithms such as the so-called E-model for speech [23], or explicit quality models that implement certain portions of the human perception and quality-formation process (peripheral signal processing, cognitive processing). The different algorithms are fed by a set of input features acquired from the technical system, or with signals as they would be presented to human assessors in a respective test. The type of model input can be utilized to classify different instrumental methods: (1) Signal-based models that employ the signal (as processed by the system) as the only input (No-Reference methods, NR), or that additionally use a reduced or complete version of the reference (reduced- or full-reference models, RR, FR, respectively). (2) Parametric algorithms that predict QoE based on certain system or signal parameters. The latter can further be subdivided into (a) parametric planning models fed with a-priori known system parameters and (b) packet-level or bitstream models that extract parametric information at the packet level. (3) Hybrid algorithms, which apply a mix of signal-based and parametric information.
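As an illustration of the final step of a parametric planning model, the sketch below maps a transmission rating factor R, as produced by the E-model, to an estimated mean opinion score. The mapping is the one commonly quoted from ITU-T Rec. G.107 and is reproduced here from that recommendation rather than from this chapter; the function name `r_to_mos` is hypothetical, and the computation of R itself from system parameters is not shown.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (range 1.0 to 4.5).

    Mapping as commonly quoted from ITU-T Rec. G.107 (assumption: not taken
    from the present chapter). R is clamped to the interval [0, 100].
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping for 0 < R < 100. Note that for very small R the raw
    # polynomial dips marginally below 1.0; it is left unmodified here.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


for rating in (50, 70, 80, 93):
    print(rating, round(r_to_mos(rating), 2))
```

Since the input R is derived from a-priori known parameters only, the resulting value is an estimate for planning purposes and does not replace a perception-based test.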

In addition to the above differentiation, assessment methods can be distinguished as utilitarian and analytic, depending on what type of output information they provide. Here, the term utilitarian makes direct reference to utility, and represents a typically single-valued index based on which systems or services can be ordered with regard to their quality. In turn, analytic means that the perceptual features relevant for quality are being assessed.

Utilitarian Quality Assessment. The purpose of utilitarian measurements is to objectively quantify an “overall” or “general” impression of quality. This assumes that the subject is in some form of integrative state of mind, where the impressions of the individual attributes, the context, the mood, the expectations, the previous experience, traditions and so on, are all combined into one single-valued rating (providing a ranking “worse-to-better”) that establishes the basis for some form of action of the person.

Analytic Quality Assessment. The main aim of analytic assessment methods is to decompose and measure certain quality features related with a given stimulus or system (Chap. 5). They result in a multi-dimensional description inherent to the character of the experience. These different features can then be used either for diagnostic purposes, that is, when systems are analyzed, or for analyzing the relation between utilitarian quality and underlying stimulus characteristics.

5 Discussion

In this chapter, we have introduced a procedural model of the quality formation process, and have linked it with related quality and QoE concepts and research. The goal was to take a perceptionist’s view (cf. Blauert [6]) by treating QoE from an individual’s perspective. There are still crucial issues to be addressed in the context of quality and QoE research, and the application of respective methods. For the time being, the majority of research efforts have been focused on quality based on experiencing. Only little work has been devoted to assessing actual QoE in terms of quality of experiencing. One of the key challenges here is the handling of the respective, let us call it, Schrödinger’s cat problem of QoE research, namely, how can QoE be assessed without interfering with the experiencing, that is, how can random experiencing [26] be probed? This question is particularly important in the context of applications or systems that trigger new types of schemata or references, as in case of 3D video or spatial audio (cf. Chaps. 17 and 20). Some approaches along these lines have been proposed by, for example, Staelens et al. [59] and Jumisko-Pyykkö et al. [27]. Another approach is the assessment of an inferred quality of experiencing, for example in terms of the person’s acceptance: If users are dissatisfied with a given usage session, they may abandon it, which may be observed in measures such as call durations (see Skype’s blog [55]), durations of watching individual videos (see Dobrian et al. [15]), or cancelation rates in web-browsing (see e.g. Shaikh et al. [58]). Another approach that is more instructive in terms of the quality-formation process is not to ask persons for actual quality ratings, but rather to try to understand what actually characterizes the experiencing, and what role the underlying quality elements play for it: Along these lines, physiological correlates of experiencing will be discussed in Chap. 8, the role of emotions in QoE will be addressed in Chap. 9, and Chap. 11 discusses the role of interaction performance for the QoE of interactive services or applications. Further work in this direction is related with the understanding of appeal of media such as pictures or movies, and the understanding of the role of quality elements and features in this context. These approaches will be supported by the explicit inclusion of exploratory and attentional processes in quality assessment and respective instrumental models, which is expected to gain further importance in future research [16, 18, 51].