1 The N400 and the Late Posterior Positivity

An ERP component is the summation of the post-synaptic potentials of large ensembles (on the order of thousands or millions) of neurons synchronized to an event. When measured from the scalp, continuous ERP waveforms manifest themselves as voltage fluctuations that can be divided into components. A component is taken to reflect the neural activity underlying a specific computational activity carried out in a given neuroanatomical module. The N400 component is a negative deflection in the ERP signal that starts around 200–300 ms post-word onset and peaks around 400 ms. Besides the N400 component, there is a set of later positive-going ERP components that is visible at the scalp surface between approximately 500 and 1000 ms. The most prominent of these is the late posterior positivity (LPP, also known as semantic P600), which is maximal at parietal and occipital sites.

The two most prominent interpretations of the underlying neuro-cognitive function of the N400 are the integration and the retrieval views. On the integration account, the N400 amplitude ‘indexes the effort involved in integrating the word meaning of the eliciting word form with the preceding context, to produce an updated utterance interpretation’, [DBC19]. On the retrieval/access account ‘the N400 amplitude reflects the effort involved in retrieving from long-term memory conceptual knowledge associated with the eliciting word which is influenced by the extent to which this knowledge is cued (or primed) by the preceding context’, [DBC19]. What is left open by the above characterization is which properties of words and the context underlie the N400 amplitude. Five prominent properties that have been suggested are (i) semantic features, (ii) plausibility, (iii) semantic similarity, (iv) selectional restrictions and (v) schema-based knowledge. However, taken individually, none of the five properties can explain the N400 amplitude.

Evidence for semantic features as being correlated with the N400 amplitude comes from the fact that the correlation between the N400 amplitude and the cloze probability, that is the probability of a target word to be the best completion in a cloze test, is not monotone.

figure a

In (1) ‘pines’ but not ‘tulips’ comes from the same semantic category ‘tree’ as the best completion ‘palms’. Though ‘pines’ and ‘tulips’ have the same low cloze probability (< 0.05), their N400 amplitudes differ. Within category violations (pines) elicit smaller N400 amplitudes than between category violations (tulips). Federmeier and Kutas argue that this result suggests that it is feature overlap like being tall or having a similar form that affords within category violations a processing benefit relative to between category violations, [FK99, p. 485].

However, feature overlap with the best completion is not without exceptions, as shown by the following example.

figure b

Though the critical word ‘victims’ shares few semantic features with the best completion ‘divers’, no N400 effect is observed.

A second candidate is plausibility, which can be quantified by offline rating tasks using, e.g., a Likert scale. Plausibility is often related to the integration view of the N400. The less plausible a resulting interpretation is, the more difficult it is to integrate the critical word into the preceding context. Evidence for the role of plausibility comes from the fact that in the Federmeier & Kutas study best completions elicited the smallest N400 amplitude and the highest plausibility ratings. Between category violations elicited the largest N400 amplitudes and got the lowest plausibility ratings. Within category violations were intermediate on both variables, [FK99, p. 486]. However, in semantic illusion data like that in (3) no N400 effect is observed although the sentence has an implausible interpretation.

figure c

A third candidate is semantic similarity. On this account the N400 amplitude is modulated by the degree to which a critical word in a target sentence is semantically related to the words preceding it in the context. One way of quantifying semantic similarity is to use Latent Semantic Analysis. On this account pairwise term-to-document semantic similarity values (SSVs) are extracted from corpora (see [KBW0] for an application). Semantic similarity underlies the Retrieval-Integration model of [VCB18]. One of its strengths is that it can explain semantic illusion data as given in (3). As there is a semantic relation between the arguments preceding the verb (‘fox’, ‘poacher’) and the verb itself (‘hunted’) no N400 effect is expected for the verb.

However, similar to both the notions of semantic feature overlap and plausibility, there are counterexamples to the thesis that the N400 amplitude is (monotonically) related to the corresponding LSA value. Kuperberg et al. [KPD11] showed that the degree of causal relationship in three-sentence scenarios with matched SSVs influences the N400 amplitude: highly related < intermediately related < causally unrelated. The authors conclude that it is the situation model constructed from the context (message-level meaning) that influences semantic processing of the critical word and not semantic relatedness. Similarly, [KBW0] could show an influence of high- versus low-constraint contexts on the N400 amplitude for controlled SSVs.

A fourth property is related to selection restrictions imposed by verbs. Each verb imposes constraints on its arguments that are independent of the context in which it is used. One prominent example of such a constraint is animacy. Violations of selectional restrictions (typically) evoke robust N400 effects that are larger than those for non-expected words that do not violate these restrictions. Furthermore, the amplitude of the N400 in case of such violations is not modulated by semantic similarity measured by LSA.

figure d

In (4), taken from [PK12], both ‘drum’ and ‘coffin’ violate the animacy constraint imposed by ‘strum’ on its actor argument. Furthermore, though ‘drum’ is semantically more related to the preceding context than ‘coffin’ using LSA (0.18 vs. 0.01), the two N400 amplitudes did not differ. By contrast, the N400 amplitude evoked by words that do not violate selectional restrictions is modulated by semantic relatedness quantified by LSA.

figure e

Similar to the case of ‘drum’ and ‘coffin’, the semantic relatedness to the preceding context differs: 0.18 for ‘drummer’ vs. 0.00 for ‘gravedigger’. However, in contrast to (4), in (5) the N400 amplitude for the semantically unrelated ‘gravedigger’ is larger than that for ‘drummer’.

However, violations of selection restrictions need not always produce an N400 effect, which brings us to the fifth property that is related to schema-based knowledge.

figure f

In (6) ‘jacket’ violates the selection restriction (animacy) imposed by the verb ‘build’. Although a robust (large) N400 effect is expected due to the restriction violation, only an attenuated N400 is measured compared to the expected ‘snowman’. This data suggests that the N400 is also modulated by schema-based knowledge about a particular scenario that is depicted by a discourse (cf. [PK12, KBW0]). This knowledge is based on a semantic network of interrelated concepts and goes beyond the information provided by words in a single sentence. For example, in (6) a winter scene involving children is described. The corresponding semantic network is related to the clothes of the children which are most likely such that they keep warm, a condition satisfied by jackets. Evidence for such a dependency of the N400 on schema-based knowledge comes from the fact that the attenuation of the N400 effect of such examples depends on the context in which the target sentence containing the critical word is embedded. Leaving this context out, e.g. the two sentences preceding the target sentence in (6), leads to a robust N400 effect on the critical word.

The second ERP component that we are considering here is the Late Posterior Positivity (LPP), which is usually associated with the impossibility of an interpretation. Evidence for this functional interpretation comes from examples like those in (7).

figure g

In each case an LPP is elicited due to the violation of a selection restriction that blocks a direct interpretation. An LPP is not only elicited if there is a violation of selection restrictions, but also if direct interpretation is blocked in a different way. One example is the class of so-called reversal anomalies, which are a subset of the semantic illusion data.

figure h

In (8) no selection restriction is violated as both arguments satisfy the animacy constraint imposed by ‘serve’. What is unexpected and explains the elicited LPP is the assignment of thematic roles. Instead of the waitress being the actor and the customer being the theme, the roles are reversed. An LPP can also be elicited at the discourse level:

figure i

In (9) it is the order of events which is unexpected. The first sentence triggers schema-based knowledge about a restaurant which includes particular kinds of actions that are partially ordered. This ordering requires the opening of the menu to occur before the leaving of the restaurant.

In summary, an LPP is elicited whenever a direct interpretation is impossible due to selection restrictions or world knowledge about thematic roles or schema-based knowledge.

2 The Functional Interpretation of the N400 and the LPP

Our main theses concerning the two ERP components are: (i) Two principal levels of representation must be distinguished: situation models (global) and event models (local); (ii) predictions are related to the level of situation models whereas integration operations are related to both levels; they are based on (a) syntagmatic relationships, (b) semantic features and (c) world knowledge; (iii) the N400 is directly related to predictions and, therefore, to the level of situation models; in addition, it is related to integration at the level of situation models but not to integration at the level of event models; its amplitude is modulated by a complex semantic-cognitive property and a pragmatic (discourse) property related to linking, i.e. the referent of the critical word needs to be linked to an object that is already part of the current situation model; and (iv) the LPP is related to failure at the level of integration at the situation model and at the event model.

In this article we will pursue two aims that are closely related. On the one hand, we will combine functional interpretations of the N400 and the LPP that have been given in the neurolinguistic literature (access and integration) with concepts used in formal semantic theories (e.g. update operations). On the other hand, we will outline an extension of a dynamic semantics in which these functional interpretations can be incorporated. For example, we interpret access as the introduction of objects or features into the model and integration as an update operation. Predictions are modelled in terms of probability distributions on frames.

2.1 Predictions and Situation Models

We follow growing evidence that predictions are based on scripts. A script is a standardized sequence of events that together make up a particular complex situation and that describes some stereotypical human activity such as going to a restaurant or visiting a doctor. Script knowledge is common knowledge that is shared between speakers of a community or culture. This knowledge comprises information about sorts of events and the sorts of objects typically involved in the realization of a script. In addition, it includes information about the temporal and causal relations between the events and which sorts of objects are related to which sorts of events. Consider the following example from [MTD+17].

figure j

These examples are partial descriptions of a restaurant script. Knowledge about such a script includes knowledge about events like ordering, bringing and eating as well as objects participating in these events like waitresses, food and bills. One possible temporal ordering of the events is: enter, being seated, bring menu, order food, bring food, ask for bill, bring bill, pay bill, leave. Examples like (10) show that script knowledge not only constrains the sort of objects participating in an event relative to a particular thematic role but that the sort of the object also depends on the temporal placement of the event in the temporal order specified by the script. Theoretically, a bringing event as in (10-a) can be located at any of the three possibilities in the temporal order. Hence, expected objects are (instances of) food, the menu or the bill. By contrast, in the context of (10-b) the bringing is temporally located after the being seated so that the menu is the most expected object. Finally, in (10-c) the expected object is the food because the bringing event is temporally located after the ordering. The two above examples show that script knowledge can impose additional constraints on objects and events by constraining for particular events and objects participating in them. On the other hand, a context can constrain strongly for a particular situation model but not for a specific event or a specific object that is related to this event [KJ16]. For example in the blizzard example in (6), the jacket is not expected as a theme of the building event. Semantic processing of the critical word ‘jacket’ is facilitated because the semantic features associated with its interpretation are expected relative to a particular situation model (winter scene) and an object already introduced into this model (children) though these features are (highly) unexpected or even anomalous relative to the current event model (building). 
Objects of sort ‘jacket’ are expected as clothes of the children because the situation model ‘winter scene’ expects clothes that keep warm.

The important point about script knowledge is that upon its instantiation it activates a network of individuals and events as well as relations between these objects. Given such a network, predictions are not restricted to the current event (e.g. ‘What is being brought?’) or the next event (e.g. ‘Which event is mentioned next and which objects participate in it?’) but can target both objects that have already been introduced into the current situation model (‘What were the children wearing?’) as well as objects that are likely to be encountered in the continuation of the description of the situation (e.g. the bill and the leaving event). Two principal cases need to be distinguished: (a) to what degree are the features (properties) of a newly introduced object expected given the current partial description of a situation model?, and (b) can a newly introduced object be related to an object that has already been introduced into the current situation model?

More formally, suppose that a context specifies a situation model whose prototypical realization consists of the action sequence \(e_1 \ldots e_r\) with objects participating in them given by the set \(\{ o_1, \ldots , o_t \}\) and that so far the initial sequence \(e_1 \ldots e_k\) has been introduced into the context. Predictions are possibly related to any of the events \(e_{k+1} \ldots e_r\) and objects participating in them as well as to participants that are related to objects involved in one of the events \(e_1 \ldots e_k\). Hence, script knowledge makes it possible to capture ‘long-range dependencies’, [MTD+17].
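As an informal illustration of this characterization, the following minimal sketch computes the prediction targets for a toy restaurant script. The names `Script` and `prediction_targets` and the listed events and participants are hypothetical and serve only as an example; the sketch also simplifies by treating the participants of \(e_1 \ldots e_k\) themselves as the objects to which further participants can be related.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Script:
    """Hypothetical container for a script (not part of the original proposal)."""
    events: tuple        # prototypical action sequence e_1 ... e_r
    participants: dict   # event name -> set of object sorts taking part in it


def prediction_targets(script, k):
    """Return what is open to prediction once e_1 ... e_k have been introduced."""
    seen = script.events[:k]        # e_1 ... e_k
    upcoming = script.events[k:]    # e_{k+1} ... e_r
    upcoming_objects = set()
    for event in upcoming:
        upcoming_objects |= script.participants.get(event, set())
    introduced_objects = set()
    for event in seen:
        introduced_objects |= script.participants.get(event, set())
    return upcoming, upcoming_objects, introduced_objects


# Toy data, assumed for illustration only.
restaurant = Script(
    events=("enter", "be_seated", "bring_menu", "order_food",
            "bring_food", "pay_bill", "leave"),
    participants={
        "be_seated": {"customer"},
        "bring_menu": {"waitress", "menu"},
        "order_food": {"customer", "waitress"},
        "bring_food": {"waitress", "food"},
        "pay_bill": {"customer", "bill"},
    },
)

upcoming, upcoming_objects, introduced_objects = prediction_targets(restaurant, 2)
```

After the first two events, the next event is the bringing of the menu, while the bill already counts as a possible target even though it belongs to a much later event, which is the long-range character of script-based predictions.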

2.2 The Functional Interpretation of the N400

Expectations are based on semantic features (or properties) of objects. Consider again example (1) repeated below for convenience.

figure k

Given the preceding context, expected features are the tropics as the natural geographical range, and tall trees as sort for visibility. Objects that satisfy all of these features, like palms, are most expected, followed by objects like pines that satisfy a proper subset (being trees and being tall); objects like tulips, which satisfy none of these features, are the least expected. The N400 amplitude is modulated in accordance with these expectations, leading to our first thesis concerning the functional interpretation of the N400:

figure l

However, the following example shows that this thesis is too weak to fully capture the behavior of the N400.

figure m

(13) is a partial description of a concert scenario whereas ‘the bass was strummed ...’ is a partial description of an event in the concert. Each realization of such a scenario has attributes musicians, instruments and actions whose values are the set of musicians, instruments and actions, respectively. For example, for the partial description in (13) one has: musicians = \(\{ pianist \}\), instruments = \(\{ bass \}\) and actions = \(\{ play, strum\}\). Predictions are related to features of objects belonging to the values of these attributes: How likely is it that this concert also has a drummer or a guitarist, respectively, and how likely is it that a drum is an instrument? For these objects, the respective probabilities are high. For example, guitarist and drummer are expected as extensions of the value of the musicians attribute whereas a drum is expected as an extension of the value of the instruments attribute. By contrast, neither a gravedigger nor a coffin is expected relative to these two attributes. Thus, according to thesis (12), one would expect a larger N400 amplitude for ‘coffin’ than for ‘drum’, contrary to the fact that both elicit amplitudes of the same magnitude. One may argue that the amplitude of the N400 in examples like (13) is due to a selection restriction violation which overrides any semantic relationships based on features and world knowledge. However, this strategy fails to explain the absence of an N400 effect for the critical words in (6) as well as for the critical words in the following semantic illusion data in which the thematic role assigned to the argument(s) clashes with the constraints imposed by the verb on these roles.

figure n

The problem with (12) is that it ignores semantic relationships that exist between objects belonging to different attributes. It does not constrain how a newly introduced object is or can be related to objects that have already been introduced into the current situation model. We hypothesize that the difference between ‘drummer’ and ‘guitarist’ on the one hand and ‘drum’ and ‘coffin’ on the other lies in the way they can be anaphorically linked to the preceding context. Consider first the examples in (15) taken from [Bur06].

figure o

Burkhardt found an attenuated N400 effect for bridged DPs (Konzert - Dirigent) and an enhanced effect for new DPs (Nina - Dirigent) compared to the given DP (Dirigent - Dirigent). We follow Burkhardt and assume that this modulation of the N400 amplitude is related to discourse linking. In (15) this modulation cannot be related to the event model to which the object introduced by the interpretation of the critical word belongs because this object is the first to be introduced into this model. Rather, what is at stake in these contexts is a constraint to the effect that the newly introduced object needs to be linked to an object that has already been introduced. In (15-a) and (15-b) a concert script (scenario) is introduced in the first sentence. In (15-b) this situation model is explicitly introduced by the DP ‘the concert’. In (15-a) the interpretation of the two occurrences of the DP ‘the conductor’ can be anaphorically linked by the relation of identity. In (15-b) the interpretation of ‘the conductor’ can be linked to the interpretation of ‘the concert’ in the preceding context. The conductor is the value of an attribute that is defined for the concert, e.g. the attribute conductor. By contrast, in (15-c) no situation model to which an object of sort ‘conductor’ can be linked is explicitly introduced. As a result, ‘the conductor’ cannot be anaphorically linked to the preceding context.

We generalize discourse linking in the following way. Let \(o_1 \ldots o_k\) be the objects already introduced into the current situation model and let \(o_{k+1}\) be the object related to the interpretation of the currently processed word. Then \(o_{k+1}\) has to be linked to an \(o_i, 1 \le i \le k\). As a consequence, linking can also be done relative to the current event model. Linking defined in this way satisfies ‘maximize anaphoricity’ because each newly introduced object needs to be related to an object already introduced and is therefore a necessary condition to ensure discourse coherence. When taken together we arrive at our second hypothesis for the functional interpretation of the N400 component.

figure p

On this approach, the effect of a selection restriction violation is to exclude one possibility of linking the critical word to the current situation model via a particular thematic role in the current event model. This violation alone is therefore not sufficient to block the establishment of a bridging inference. This is different if the situation model is reduced to a single event model, e.g. if the context is made up of a single sentence.

figure q

In (17) ‘sour’ must be linked to the trains because no other objects have been introduced so far. Since there is no attribute of objects of sort ‘train’ for which ‘sour’ is an admissible value, linking fails. As expected ‘sour’ in (17) elicits an N400.
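The linking condition discussed in this section can be illustrated with a minimal sketch. It assumes a toy lexicon `ATTRIBUTES` that maps a sort to its attributes and their admissible value sorts; this table, its entries and the function name `find_link` are hypothetical and only meant to make the condition concrete.

```python
# Hypothetical toy lexicon: sort -> {attribute: set of admissible value sorts}.
ATTRIBUTES = {
    "concert": {
        "conductor": {"conductor"},
        "musicians": {"pianist", "drummer", "guitarist"},
    },
    "train": {"colour": {"yellow", "white"}},
}


def find_link(new_sort, introduced):
    """Try to link the new object o_{k+1} to some already introduced o_i.

    Returns (anchor, relation) for the first anchor found, where relation is
    either "identity" or the name of a bridging attribute, and None if no
    link can be established.
    """
    for anchor in introduced:
        if anchor == new_sort:
            return anchor, "identity"
        for attribute, values in ATTRIBUTES.get(anchor, {}).items():
            if new_sort in values:
                return anchor, attribute
    return None


given = find_link("conductor", ["conductor"])           # as in (15-a)
bridged = find_link("conductor", ["concert"])           # as in (15-b)
unlinked = find_link("conductor", ["Tobias", "Nina"])   # as in (15-c)
sour = find_link("sour", ["train"])                     # as in (17)
```

In this toy setting the given and the bridged DP both find an anchor, whereas the new DP in (15-c) and ‘sour’ in (17) do not, so linking fails for them.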

2.3 The Functional Interpretation of the LPP

According to the preceding section, the N400 is based on two factors: paradigmatic relationships based on semantic features and anaphoric linking, or, more generally, the establishing of a bridging inference. The linking operation fails if no bridging inference can be established. We hypothesize that this failure triggers a revision-modification operation. At the ERP level, this operation is indexed by the LPP.

figure r

An example of a revision operation is to question bottom-up information. Consider e.g. (15-c). A comprehender could countenance a reading or hearing error and assume ‘a conductor’ instead of ‘the conductor’. Alternatively, bottom-up information already processed can be questioned in a similar way. Depending on which bottom-up information is questioned, situation models that have already been discarded can again become options. A third strategy is to extend the current situation model with additional information. One possibility is to introduce a concert as the subject or topic about which Tobias talked to Nina. This has the effect that some other situation models that are options according to bottom-up information become excluded, for example, models in which the topic is not a concert.

If the linking operation succeeds, the current situation model is updated with the information provided by the critical word. This success does not imply that a corresponding transition at the level of the current event model is possible as well. Two principal cases must be distinguished. For the first case consider example (19).

figure s

Although the critical word ‘served’ can be linked to an object already introduced into the current situation model (‘restaurant’) and no selection restriction violation occurs, an LPP is elicited. Recall that predictions relative to arguments of a verb are dependent on the placement of the events in the temporal ordering if the sort of events denoted by the verb can occur more than once in this ordering. Generalizing this pattern, one has that each sort of objects that is admissible in a particular situation model is related to a particular set of action-role pairs that specify in which actions it can occur in which thematic roles in this situation model. For example, in a restaurant script an object of sort ‘waitress’ is at least assigned the set \(\{ \langle serve, actor \rangle , \langle ask, actor \rangle , \langle ask, theme \rangle \}\). If objects of a particular sort are assigned such a set, they are said to be free only for pairs in this set. We hypothesize that if the interpretation of the critical word is assigned an action and a thematic role for which it is not free, an LPP is elicited. This is the case if an object of sort waitress is assigned the theme role in a serving event in a restaurant scenario. The second principal case occurs if objects are not assigned action-role pairs that are relevant in the situation model.

figure t

In (20), none of the action-role pairs like \(\langle dig,actor\rangle \) associated with the object ‘gravedigger’ is licensed by the situation model ‘playing music’. Such objects are free for any action-role assignment that respects the selection restrictions and no LPP is elicited. This accounts for the absence of an LPP for ‘gravedigger’ in a ‘playing music’ script. We hypothesize that freeness is a second factor underlying the LPP.

figure u

In response to a violation of a freeness constraint, one strategy open to a comprehender is to extend the set of possible situation models by changing the freeness constraint. For example, upon encountering ‘The restaurant owner forgot which waitress the customer had served’, a comprehender can extend his action-role assignments for restaurant scripts by adding the action-role pair \(\langle serve, actor\rangle \) to the sort ‘customer’ and the pair \(\langle serve, theme\rangle \) to the sort ‘waitress’. As a result, restaurant scripts now also allow serving events in which customers serve waitresses. Freeness is a special case of anaphoric linking that differs from it in the following two respects. First, in contrast to linking, freeness is restricted to the current event model and, second, satisfaction of sortal constraints is not sufficient, as shown by (19).
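The freeness constraint and the revision strategy just described can be sketched as follows. The table `FREE_FOR` is a hypothetical toy assignment of action-role pairs for a restaurant script: the pairs for ‘waitress’ follow the example in the text, while those for ‘customer’ are assumed for illustration. Selection restrictions are deliberately left out of the sketch.

```python
# Hypothetical action-role table for a restaurant script (toy data).
FREE_FOR = {
    "waitress": {("serve", "actor"), ("ask", "actor"), ("ask", "theme")},
    "customer": {("serve", "theme"), ("ask", "actor"), ("ask", "theme")},
}


def violates_freeness(sort, action, role, table=FREE_FOR):
    """True if the sort has action-role pairs in this script but not this one."""
    pairs = table.get(sort)
    if pairs is None:
        # No action-role pairs are relevant in this situation model, so the
        # object is free for any assignment (selection restrictions aside).
        return False
    return (action, role) not in pairs


def extend_freeness(table, sort, action, role):
    """Revision strategy: return a copy of the table with one added pair."""
    extended = {s: set(pairs) for s, pairs in table.items()}
    extended.setdefault(sort, set()).add((action, role))
    return extended


reversal = violates_freeness("waitress", "serve", "theme")
revised = extend_freeness(FREE_FOR, "waitress", "serve", "theme")
after_revision = violates_freeness("waitress", "serve", "theme", revised)
```

Under these assumptions the reversed role assignment counts as a violation before the revision and is admissible afterwards, whereas a sort without any entry in the table, such as ‘gravedigger’, never violates freeness.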

2.4 The Processing Model Underlying the N400 and the LPP

The processing model outlined in the last two sections based on particular functional interpretations of the N400 and the LPP consists of three steps. In the first step participants and actions must be linked to objects that have already been introduced into the current situation model. The leading question is: ‘Does this information continue information already supplied in the context?’ Success of this linking operation is a precondition for the next operation to be applied. This has the following consequences: (i) if the linking operation fails, neither paradigmatic relationships based on semantic features nor the freeness constraint play a role and (ii) as an effect, the N400 amplitude is therefore not modulated by this relationship and this constraint, in accordance with the empirical findings about this component. Failure of the linking operation triggers a revision-modification operation that is indexed by the LPP. Processing of the remaining text is continued on the basis of the result of this operation. In the case that the linking operation succeeds, the second step consists in integrating the new information into the current situation model. The leading question is: ‘How probable is this information given the information in the context?’. This operation is related to paradigmatic relationships based on semantic features. As an effect, the modulation of the N400 amplitude is graded. Hence, the N400 is related to the linking operation in a double way. If it fails, an N400 effect is elicited and if it succeeds a graded N400 effect results with the limiting case that no N400 effect is elicited. The final step is related to integrating the new information in the current event model. This integration fails if the freeness constraint is violated. Similar to the failure of the linking operation, a revision-modification operation is triggered which is indexed by the LPP. Processing is continued with the result of this operation which, again, is a changed model. 
Since the N400 indexes integration at the situation model, no N400 is elicited if freeness is violated. If this constraint is not violated, the new information gets integrated into the current event model without eliciting an LPP.

The LPP indexes the impossibility of executing an integration operation, either at the level of situation models or at the level of event models. Both the N400 and the LPP are elicited at most once. If linking fails, a biphasic N400 - LPP is elicited. Since the other operations are not executed no second effect in relation to these two components is produced. If linking succeeds, no LPP in relation to this operation is elicited. Such an effect is still possible if freeness is violated. Similarly, an N400 effect can be produced in relation to the integration operation based on (successful) linking and the semantic-cognitive property. Since a violation of freeness does not elicit an N400 effect, this effect is produced at most once.
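The control flow of the three steps can be summarized in a minimal sketch. The function name `predicted_components`, the numeric `expectedness` value in the interval [0, 1] and the linear mapping to an N400 magnitude are assumptions made purely for illustration; the model itself only claims that the N400 is graded when linking succeeds and unmodulated when it fails.

```python
def predicted_components(link_succeeds, expectedness, freeness_violated):
    """Return the ERP effects predicted for one critical word.

    expectedness is a hypothetical score in [0, 1] standing in for the
    paradigmatic (feature-based) expectation relative to the situation model.
    """
    if not link_succeeds:
        # Step 1 fails: a biphasic N400-LPP pattern. The remaining steps are
        # not executed, so expectedness and freeness play no role here.
        return {"n400": 1.0, "lpp": True}
    # Step 2: integration into the situation model yields a graded N400,
    # with no N400 effect as the limiting case.
    n400 = 1.0 - expectedness
    # Step 3: integration into the event model. A freeness violation
    # triggers an LPP but does not add to the N400.
    return {"n400": n400, "lpp": freeness_violated}


linking_failure = predicted_components(False, 0.9, False)
role_reversal = predicted_components(True, 1.0, True)
unexpected_word = predicted_components(True, 0.25, False)
```

Each component is produced at most once per critical word in this sketch, in line with the claim that neither effect is elicited a second time.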

Empirical evidence for this model is based on two studies. First, a study by [DMK16] challenges the one-step model of language comprehension proposed in [HHBP04], who considered sentences like (22).

figure v

For each sentence, the N400 amplitude was measured relative to the critical word. They found that there was no difference in the N400 onset or peak latency between the semantic violation ‘sour’ and the world-knowledge violation ‘white’. The authors concluded that semantic and world knowledge are processed in parallel during language comprehension. In a recent study this conclusion was challenged by [DMK16]. Similar to [HHBP04], the authors used correct sentences, semantically violated sentences and sentences violated by world knowledge. In contrast to [HHBP04], the critical word was kept constant. In addition to analyzing standard measures for component onset, i.e. the fractional area under the N400 curve and the relative-criterion-peak latency measure, they used a cluster-based permutation test that is sensitive to differences because it takes biophysical constraints into account in the testing procedure and that is able to deal with the multiple comparison problem. Specifically, this method made it possible to determine the time point at which each of the conditions reached a fixed 2 \(\upmu \)V criterion starting from the peak preceding the N400. When using this method, the authors found that the semantic violation condition differed significantly from the world-knowledge condition with regard to the time point when the 2 \(\upmu \)V criterion was reached: the former crossed this criterion earlier than the latter.
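To make the criterion-based latency measure more tangible, the following sketch finds the first time point at which a waveform has moved a fixed amount in the negative direction away from a preceding peak. Reading the criterion as a 2 \(\upmu \)V change relative to that peak is an assumption, the function name `criterion_latency` is hypothetical, and the waveform values are invented for illustration and are not data from [DMK16].

```python
def criterion_latency(times_ms, voltages_uv, peak_index, criterion_uv=2.0):
    """First time at which the signal lies criterion_uv below the peak value.

    Assumes a negative-going deflection that starts at peak_index; returns
    None if the criterion is never reached.
    """
    peak_value = voltages_uv[peak_index]
    for time, voltage in zip(times_ms[peak_index:], voltages_uv[peak_index:]):
        if peak_value - voltage >= criterion_uv:
            return time
    return None


# Invented toy waveforms (microvolts), sampled every 50 ms.
times = [200, 250, 300, 350, 400]
semantic_violation = [1.0, 0.0, -1.5, -3.0, -4.0]
world_knowledge_violation = [1.0, 0.5, -0.5, -1.5, -3.5]

semantic_latency = criterion_latency(times, semantic_violation, 0)
world_latency = criterion_latency(times, world_knowledge_violation, 0)
```

With these invented values the first waveform reaches the criterion one sample earlier than the second, which is the kind of latency difference the measure is designed to detect.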

The second study is [PK12] who found that the onset of the LPP to selection restriction violations in examples like ‘The pianist played the music while the bass was strummed by the drum / coffin during the song’ was somewhat later (approximately 100 ms) than the LPP evoked on verbs in semantic illusion data like ‘The restaurant owner forgot which waitress the customer had served’. Applied to our approach, the results of the study in [DMK16] support a sequential execution of the operations associated with linking and the semantic-cognitive property. Success of the linking operation is a precondition for the execution of the operation associated with the semantic-cognitive property. The result by [PK12], on the other hand, is evidence for a temporal dissociation of the two conditions evoking an LPP: failure of linking and violation of a freeness constraint.

3 The Formal Framework

In order to account, within theoretical linguistics, for the empirical neurophysiological findings discussed in the previous sections, it is necessary to develop a truth-theoretical formal semantics that reflects the empirical results. Our approach is based on Frame Theory and Incremental Dynamics extended by continuations.

3.1 Frame Theory

Frames are elements of a separate domain \(D_f\) of frames. Each frame is related to a particular object (an individual or a (complex) event) as its root and is a partial description of that object in a particular world. Being a partial description of an object, a frame is linked to a relational structure that is built by (finite) chains of attributes. This link is captured by a function \(\theta \) which maps a frame f to a set of pairs \(\theta (f) = \{ \langle R_1,o_1\rangle , \ldots , \langle R_n,o_n\rangle \}\); each pair consists of an attribute chain \(R_i\) and an object \(o_i\) that is related to the root of the frame by the chain. The \(R_i\) are 3-ary relations (\(R_i\subseteq D_f\times D_o \times D_o\)) that are functions in the sense that different objects cannot be related to the frame root by the same chain. Being partial descriptions, frames can be ordered by the information ordering \(\sqsubseteq \). A frame \(f'\) is an extension of a frame f (\(f \sqsubseteq f'\)) iff (i) f and \(f'\) have the same root and (ii) \(\theta (f) \subseteq \theta (f')\). Furthermore, \(f''\) is said to be a subframe of f (\(f''\preceq f\)) if it is embedded in f, that is, in f there is a chain connecting the root of f with the root of \(f''\) (e.g., a conductor frame is a subframe of a concert frame). For a given object, its associated frame stores the information obtained during the discourse so far as well as world knowledge. Besides the domain \(D_f\), there are the domain \(D_i\) of individuals and the domain \(D_e\) of events, which together make up the domain \(D_o\) of objects. We extend our approach in [NP19a] by set-valued frames for the current situation. Situation models sm are based on complex events. Their associated frames \(f_{sm}\) have an attribute actions whose value is the set of actions (events) occurring in this scenario together with an associated frame (denoted by \(a(f_{sm})\)). 
A second attribute is participants whose value is a set of individuals together with an associated frame \(p(f_{sm})\). Each element of this set is related to at least one action or one other participant, the set of these pairs \(pr(f_{sm})\) is the value of the attribute participancy_relation. The value of the attribute order is a set \(o(f_{sm})\) of pairs of events that preorders the value of the actions attribute. Situation frames are sorted by SM which are sorts of complex events like ‘wintery scenario’ or ‘restaurant scheme’.
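The two orderings on frames can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that objects are named by strings and that an attribute chain is a tuple of attribute names; the class `Frame` and the functions `is_functional`, `extends` and `is_subframe` are hypothetical names and not part of the formal theory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A frame as a root object plus theta(f), a set of (chain, object) pairs."""
    root: str
    theta: frozenset = frozenset()


def is_functional(f):
    """No attribute chain relates the root to two different objects."""
    chains = [chain for chain, _ in f.theta]
    return len(chains) == len(set(chains))


def extends(f, f_ext):
    """f ⊑ f_ext: same root and theta(f) is a subset of theta(f_ext)."""
    return f.root == f_ext.root and f.theta <= f_ext.theta


def is_subframe(f_sub, f):
    """f_sub ⪯ f: some chain in f connects root(f) with root(f_sub)."""
    return any(obj == f_sub.root for _, obj in f.theta)


# Toy example (assumed data): a concert frame extended by a conductor attribute.
concert = Frame("concert1")
concert_ext = Frame("concert1", frozenset({(("conductor",), "conductor1")}))
conductor = Frame("conductor1")
```

In this toy example `concert_ext` extends `concert`, and the conductor frame is a subframe of `concert_ext` but not of the unextended `concert`.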

Our frame theory is embedded into a particular type logic that combines de Groote’s continuation-based framework with van Eijck’s Incremental Dynamics, [DG06, vE01]. De Groote, [DG06], extends Montague’s framework with a continuation-passing style technique. In addition to the two basic types e of entities and t of truth values, there is a third type \(\gamma \), representing the type of contexts or environments. Terms of this type store the information from what has already been processed in the computation of the meaning of the whole discourse, [Leb12]. The type \(\gamma \) is taken as a parameter which can define any complex type. This has the effect that the context can easily be elaborated without affecting the core of the logical framework, [Leb12]. For example, in [DG06] the context is a list of objects (or discourse referents), whereas in [Leb12] it is taken as a list of propositions or a list of pairs consisting of an object and a proposition. The interpretation of a sentence can change the context, e.g. by adding a new object or by adding an anaphoric relationship between discourse referents. This updated context needs to be passed as an argument to the interpretation of the next sentence. In De Groote’s approach this requirement is implemented by defining the meaning of a sentence not as a set of contexts or a relation between contexts but as a function of its (input) context and a continuation with respect to the computation of the meaning of the whole discourse. Specifically, continuations are of type \(\langle \gamma , t\rangle \). Hence, a continuation denotes what is still to be processed in the computation of the meaning of the whole discourse, [Leb12]. As a result, the interpretation of a sentence is of type \(\langle \gamma , \langle \langle \gamma , t\rangle , t \rangle \rangle = \varOmega \). For example, the interpretation of (23-a) is (23-b).

figure w

In (23-b) \(c^*\) is the context obtained by updating the input context c. The conjunct \(\phi (c^*)\) indicates that an updated context is passed as an argument to the continuation of the proposition expressed by (23-a). If the context c of type \(\gamma \) is interpreted as a list of objects (or discourse referents), both proper names in (23-a) contribute an object. For example, the interpretation of ‘John’ is (24).

figure x

In (24) P is a dynamic property of type \(\langle e, \varOmega \rangle \) and \({:}{:}\) is an update function of type \(\langle e, \langle \gamma , \gamma \rangle \rangle \), i.e. it maps an object and a context to a (new) context. Applied to (23-b), the updated context is \(c^* = j {:}{:} m {:}{:} c\). When taken together, the interpretation is (25) and the updated context \(j {:}{:} m {:}{:} c\) is accessible by future computations.

figure y

The update of a discourse interpreted as D with a sentence interpreted as S both of type \(\varOmega \) is defined by \(\lambda c. \lambda \phi . D(c)(\lambda c'. S(c')(\phi )) \).
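The discourse update rule can be transcribed almost literally into Python. The sketch below assumes that a context is a tuple of objects and that \({:}{:}\) prepends an object to it; `introduce` is a hypothetical stand-in for a sentence meaning of type \(\varOmega \) and is not the interpretation given in (23-b).

```python
def update(D, S):
    """Discourse update: λc. λφ. D(c)(λc'. S(c')(φ))."""
    return lambda c: lambda phi: D(c)(lambda c2: S(c2)(phi))


def introduce(obj, holds=True):
    """Hypothetical meaning of type Ω: asserts a truth value `holds`,
    prepends `obj` to the context and hands the result to the continuation."""
    return lambda c: lambda phi: bool(holds and phi((obj,) + c))


# Two toy sentences processed in sequence, starting from the empty context.
discourse = update(introduce("m"), introduce("j"))
final_continuation = lambda c: c == ("j", "m")
result = discourse(())(final_continuation)
```

The continuation passed to the first meaning already contains the second one, so the context seen by `final_continuation` is the one updated by both sentences.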

We follow [NP19b, NP19a, NPG18], based on Incremental Dynamics, [vE01], and take a context as a stack. A stack can be thought of as a function from an initial segment \(\{0, \ldots , n-1\}\) of the natural numbers \(\mathbb {N}\) to entities of a domain \(D_o\) that are stored in the stack. Hence, a stack can equivalently be taken as a sequence of discourse objects \(\{ \langle 0, d_0 \rangle , \ldots , \langle n-1, d_{n-1} \rangle \}\) of length n. If c is a stack, |c| is the length of c. By c(i) we denote the object at position i in stack c. A link between stack positions and the discourse objects that are stored at these positions is established by two operations. First, there is a pushing operation:

figure z

Pushing an object d on the stack extends the stack by this element at position |c|. The second operation retrieves a discourse object from the stack.

figure aa

We write c[i] for ret(i)(c). In our application objects stored at a position i are pairs consisting of an object and an associated frame. Such objects are called discourse objects.
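As a minimal sketch of the two stack operations, a stack can be modelled as a Python tuple whose index plays the role of the stack position. The curried form of `push` and `ret` mirrors the notation ret(i)(c); representing a discourse object as an (object, frame) pair of strings is an assumption made for illustration.

```python
def push(d):
    """Push discourse object d: the new stack has d at position |c|."""
    return lambda c: c + (d,)


def ret(i):
    """Retrieve the discourse object at position i; undefined outside 0..|c|-1."""
    def retrieve(c):
        if not 0 <= i < len(c):
            raise IndexError(f"position {i} is not defined for a stack of length {len(c)}")
        return c[i]
    return retrieve


# Toy stack: positions 0 and 1 hold two (object, frame) pairs.
c0 = ()
c1 = push(("j", "f_j"))(c0)
c2 = push(("m", "f_m"))(c1)
```

Here `ret(1)(c2)` yields the pair for m, and the length of the stack grows by one with every push.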

3.2 Adapting the Framework

The framework introduced in the last section still resembles standard semantic theories in one important respect. The interpretation of a sentence is derived in parallel to its syntactic structure. This way of deriving the interpretation is not built on an incremental left-to-right processing strategy. For example, a sentence with a transitive verb is derived by first combining the verb with the direct object, and only then is the resulting VP combined with the subject. By contrast, neuro- and psycholinguistic studies and experiments are based on an incremental left-to-right processing strategy. This makes it necessary to calculate semantic representations for non-constituents. For example, in the context of ‘The cat chases ...’ it is necessary to have a semantic representation of the combination of the NP and the verb before the second NP is encountered, [BS17]. This example also shows a second problem. ‘The cat’ can be interpreted e.g. as actor, as theme or as experiencer. This indeterminacy of thematic role assignment must be modelled in a formal framework as well.

Incremental Left-to-Right Processing. As our starting point for implementing an incremental left-to-right processing strategy we choose [BS17], which presents an event semantics with continuations based on [DG06]. In this framework all expressions are translated as terms of type \(\langle \langle t, t\rangle , \langle t, t\rangle \rangle \). For example, the general format for the interpretation of a verb is (28).

figure ab

In (28) c is of type \(\langle t, t \rangle \) and ranges over continuations which take the existential quantifier in their scope; p is of type t and ranges over continuations within the scope of the quantifier and which provide additional information about the event. (28) maps two continuations to a truth value. This type is also used for the determiner ‘a’ and the interpretation of common nouns. This has the effect that the general rule of combination is functional composition: \(\llbracket A + B \rrbracket := \lambda c.(\llbracket A \rrbracket (\llbracket B \rrbracket (c)))\). A verb and its arguments are combined by thematic roles, which too are of type \(\langle \langle t, t \rangle , \langle t, t \rangle \rangle \).

figure ac

Note that the interpretation of common nouns and the determiner ‘a’ contains a (possibly free) indexed object variable. When a determiner and a common noun are combined, it is supposed that both indices are the same. Furthermore, the interpretation of a thematic role contains a free event variable which is assumed to be the same variable as the event variable of the verb. This has the effect that constructions with more than one verb cannot be accounted for.

We will adapt this framework in the following way. First, instead of having contexts of type t and continuations of type \(\langle t, t\rangle \), we follow de Groote and have contexts of type \(\gamma \) and continuations of type \(\langle \gamma , t\rangle \). Second, in our approach objects that are associated with the interpretation of lexical elements are always related to the current situation model and/or the current event model, which are interpreted as discourse objects, i.e. pairs consisting of a (complex) event and an associated frame. Both kinds of models are not fixed once and for all but change during the processing of a discourse. For events, this is obvious because with each verb a new event is introduced. Empirical evidence for a fine-grained individuation of situation models comes from ERP-experiments using data like the following.

figure ad

[DDC18] found an N400 effect at the critical word ‘Joggen’ compared to the critical word ‘Abtrocknen’. This is taken by the authors as evidence that comprehenders expect the description of a situation model (or a complex event) to be continued in the next sentence or the subsequent discourse. Whenever this expectation is not satisfied because a new situation model (or complex event) is described, an N400 effect is elicited. For example, in (30) a breakfast scenario is followed by a scenario describing an outdoor activity. Hence, two different situation models are involved. In our approach situation and event models are similar to indexical elements of a discourse like the speaker, the speech time and the reference time which, too, change during the processing of a discourse due to new bottom-up information. We therefore assume that the current situation model and the current event model are stored in particular stack positions called sm and em, respectively. Specifically, we assume that they are stored at positions 0 and 1, respectively. This has the effect that the current situation model and the current event model are always accessible when new bottom-up information is processed. In contrast to other elements like the speaker or the reference time, situation models and event models are built up incrementally.

The Interpretation of DPs. We follow [Cha15] and [BS17] and assume that the interpretation of a verb in the lexicon does not (yet) provide information about thematic roles. Rather, thematic roles are introduced separately. Specifically, we assume the following structure for DPs: \([[Det N]_{DP_{1}} [TR]]_{DP_{2}}\). Whereas N provides sortal information, TR assigns a thematic role by which the object introduced by the interpretation of Det is related to the event introduced by the interpretation of the verb. On this interpretation the assignment of a thematic role can be taken as a non-deterministic operation that introduces branching.

Evidence for such a non-deterministic assignment is the fact that semantic processing in the brain is done in a left-to-right, incremental manner (see [BS17] for examples and further evidence). Further empirical evidence for such an analysis of thematic roles comes from studies involving languages like German in which the thematic role can at least sometimes be uniquely determined from the case of the determiner. Consider the following examples from [FS01].

figure ae

The authors observed an N400 effect at the position of an inanimate subject (actor) following an animate object (theme) in German verb-final sentences, (31-b). No such effect was found for (31-a) where both arguments are animate. In our approach this is explained as follows. In (31-b) ‘welchen Angler’ is (deterministically) assigned the theme role because ‘welchen’ being accusative only allows for this role. As an effect, the actor argument is expected next. However, ‘Zweig’, being inanimate, cannot be assigned this role so that an N400 effect compared to ‘der Jäger’ is elicited (see [BSS08] for a similar analysis based on predictions). If in English or Dutch thematic roles were assigned on the basis of a thematic role hierarchy (actors outrank themes) or a syntactic analysis based on an NP VP structure, one would likewise expect an N400 to be elicited on the verb or the second noun in fragments like ‘For breakfast, the eggs would eat...’ and ‘De speer heeft de atleten...’. However, no such effects are observed (see [BSS08] for further discussion).

The Interpretation of Common Nouns and Verbs. The interpretation of common nouns and verbs has to reflect the fact that each lexical element of one of these two syntactic categories can possibly modulate the amplitude of the N400 as well as that of the LPP. According to the analysis of these ERP-components given above, the N400 amplitude is modulated by a linking property and paradigmatic relationships based on features. By contrast, the LPP is related to the failure of a constraint. Either linking fails or a freeness constraint is violated. These properties and constraints apply to the level of situation models and/or the level of event models. Following the considerations in Sect. 2.4, we further assume that there is a temporal dissociation between the two properties: the linking property applies before paradigmatic relationships are applied.

The relation between these constraints and our formal framework is the following. Consider the case of common nouns (Footnote 2). They are part of DPs with the structure \([[Det N]_{DP_{1}} [TR]]_{DP_{2}}\). Each component in this structure is related to a particular update operation, which, in turn, is correlated to a particular information ordering. Furthermore, the properties associated with the ERP components are related to these constituents and their update operations in a particular way. Similar to standard dynamic approaches, the interpretation of the determiners ‘a’ and ‘the’ is a domain expansion operation: a new object is pushed on the stack. Hence, this operation is directly related neither to the current situation model nor to the current event model. The interpretation of the nominal element (i.e. the head noun) is related to linking and paradigmatic relationships based on features and applies to the level of the current situation model. Linking is modelled as an update operation that targets the participancy_relation attribute in these models. This operation tests whether the frame component of a newly introduced discourse object o can be a subframe of an extension of the frame component of an object \(o'\) that has already been introduced into the current situation model. If this test is successful, the pair \(\langle o', o \rangle \) is added to the value of the participancy_relation attribute. For example, in (15-a) above the conductor can be linked to the concert by extending the concert frame with the attribute conductor whose value is the frame associated with the interpretation of ‘conductor’ in the second sentence. Linking fails if no such relationship between o and some \(o'\) in the situation model can be established. In this case none of the remaining update operations are executed. The update operation associated with paradigmatic relationships is related to the participants attribute. 
It adds the newly introduced object together with its associated frame to the value of this attribute. The precondition of this update operation is a successful execution of the linking update operation. This means that there must be an \(o'\) such that \(\langle o', o\rangle \) is an element of the participancy_relation attribute. This operation has no side-effects, i.e. it always succeeds provided its precondition, the update operation associated with the linking property, is satisfied.

The operations associated with linking and paradigmatic relationships based on features together integrate a new object into the current situation model. However, success of these two operations does not guarantee that the newly introduced object can be successfully integrated into the current event model as well. Integration at the level of an event model is always related to the current event and a thematic role. This integration operation fails if a freeness constraint associated with the sort of o is violated. If successful, this update operation adds the pair \(\langle R_{tr}, o\rangle \) to the value of \(\theta \) for the current event. This operation is related to the TR element in a DP structure. The relationships between DP structure, update operations, and the levels and attributes they apply to are summarized in Table 1 and formally defined in Sect. 3.3.
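The order of the two situation-model updates can be sketched as follows. This is a strongly simplified illustration, not the formal definition of Sect. 3.3: frames are reduced to sort labels, the subframe-of-an-extension test is replaced by a lookup in a hypothetical world-knowledge table `EMBEDDABLE`, and the names `link` and `add_participant` are assumptions.

```python
# Hypothetical world knowledge: sorts that a frame of a given sort can embed.
EMBEDDABLE = {"concert": {"conductor", "orchestra"}}


def link(sm, obj, sort):
    """Linking: succeeds if obj can be embedded under some object already in sm;
    on success the pair (host, obj) is added to participancy_relation."""
    for host, host_sort in sm["participants"].items():
        if sort in EMBEDDABLE.get(host_sort, set()):
            pairs = sm["participancy_relation"] | {(host, obj)}
            return {**sm, "participancy_relation": pairs}
    return None  # linking fails, so no further update operation is executed


def add_participant(sm, obj, sort):
    """Paradigmatic update: presupposes a successful link for obj and then
    adds obj (with its sort) to the participants attribute."""
    if not any(o == obj for _, o in sm["participancy_relation"]):
        return None  # precondition not met
    return {**sm, "participants": {**sm["participants"], obj: sort}}


# Toy situation model containing only a concert.
sm0 = {"participants": {"c1": "concert"}, "participancy_relation": frozenset()}
sm1 = link(sm0, "x1", "conductor")
sm2 = add_participant(sm1, "x1", "conductor")
```

With this toy table a conductor can be linked to the concert, whereas an object of a sort not listed in `EMBEDDABLE` cannot, so that for such an object the second update is never reached.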

Table 1. Update operations and the level and attributes they apply to.

3.3 Formal Definitions of the Update Operations

The update operations listed in Table 1 are uniformly of type \(\langle \gamma , \langle \langle \gamma , t \rangle , t \rangle \rangle = \varOmega \). The update operation interpreting the determiners ‘a’ and ‘the’ is defined in (32).

figure af

The determiners ‘a’ and ‘the’ push a new discourse object on the stack, i.e. they add such an object to the input context. \(R_o\) is the lift of the domain \(D_o\) to the relational level with frames: \(R_o(f)(o)(o') = 1\) iff \(root(f) =o \wedge o = o' \wedge o \in D_o\). The frame component is the most general one which applies to any object in the domain because so far no sortal information is provided (see [NP19a] for details). The update operation correlated with linking is defined in (33).

figure ag

The constraint \(o \in D_{\sigma }\) is related to the sortal information of the head noun. For example, for ‘dog’, \(\sigma = dog\) and \(D_{\sigma }\) is the set of dogs. The linking operation tests whether the frame \(f_o\) associated with the newly introduced object o is a subframe (\(f_o \preceq f'_{o'}\)) of an extension \(f'_{o'}\) of the frame \(f_{o'}\) (\(f_{o'} \sqsubseteq f'_{o'}\)) associated with an object \(o'\) already in the current situation model (see [NP19a] for definitions and further details). If this test succeeds, the pair \(\langle o', o\rangle \) is added to the participancy_relation attribute of the current situation model. It is not required that \(o'\) be an element of the input context c, because objects can be added to the current situation model via a modification operation (accommodation) if linking fails (see Sect. 4.3 below for details). The frame component is not added because this is accounted for in the update operation correlated with paradigmatic relationships based on features, which is defined in (34).

figure ah

This operation always succeeds provided the preceding update operation associated with linking succeeds. It adds the newly introduced object together with its associated frame o at position \(c[|c|-1]\) to the participants attribute of the current situation model. The associated frame is extended by the sortal information provided by the head noun. Similar to \(R_o\), \(R_{\sigma }\) is the lift of the subdomain \(D_{\sigma }\) of objects of sort \(\sigma \) to the relational level: \(R_{\sigma }(f)(o)(o') = 1\) iff \(root(f) =o \wedge o = o' \wedge o \in D_{\sigma }\). The update operation correlated with a thematic role constituent is defined as follows.

figure ai

The update operation correlated with the thematic role constituent tests whether this role is already defined for the current event model. If this is not the case, this model is updated by adding the thematic role together with the object o to the value of \(\theta \) yielding a new frame \(f'_{e_{em}}\).
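The test described here can be sketched in a few lines of Python. The sketch assumes that \(\theta \) of the current event frame is a set of (attribute, object) pairs with attributes named by strings; `assign_role` is a hypothetical name, and returning `None` is just one way to signal that the update fails.

```python
def assign_role(theta, role, obj):
    """Thematic-role update: fails if `role` is already defined for the
    current event; otherwise returns theta extended by the pair (role, obj)."""
    if any(r == role for r, _ in theta):
        return None  # the role is not free: the update fails
    return theta | {(role, obj)}


# Toy event frame for a chasing event with no roles assigned yet.
theta0 = frozenset({("sort", "chase")})
theta1 = assign_role(theta0, "actor", "cat1")
theta2 = assign_role(theta1, "theme", "dog1")
```

A second attempt to assign the actor role to the same event, for instance `assign_role(theta1, "actor", "dog1")`, fails, while the theme role is still free.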

In contrast to the interpretation of DPs, the interpretation of verbs is related neither to a determiner nor to a thematic role. Therefore, the four update operations are related to the interpretation of a verb as a whole and are not distributed over several constituents. The first three update operations do not differ from those for DPs except for the fact that the newly introduced object is added to the actions attribute. The interpretation of a verb introduces a discourse object on the stack. This object needs to be linked to an object that is already an element of the current situation model. If this operation is successful, the pair relating the event to this object is added to the participancy_relation of the current situation model and next the object together with its associated frame is added to the actions attribute of the current situation model. The update operations differ with respect to the thematic role. In the case of a DP the thematic role constituent relates an object to the current event model by a thematic role. By contrast, the interpretation of a verb adds sortal information about the current event to this model. Hence, the contribution of tr is the relation \(R_{\sigma }\) and not a thematic role \(R_{tr}\). Furthermore, the event \(e_{em}\) is updated by o because this event has now been introduced.

figure aj

Each update operation is correlated with a particular information ordering. For situation models, the most general ordering is defined in (37).

figure ak

According to (37), situation model \(sm'\) extends situation model sm if it contains at least the information about all objects in sm. Specifically, \(sm'\) extends sm by (i) possibly having information about more objects, (ii) by having more information about objects in sm or (iii) by having more information about relations between objects. The information ordering correlated with the update operation associated with linking is defined in (38).

figure al

The (possible) extension is related to the participancy_relation attribute of a situation model. \(\sqsubseteq _{link}\) only reflects the changes of a (successful) linking operation to the value of the sm position. It does not reflect the test that is executed inside this operation. However, this test only checks whether the linking operation can be successfully executed. The information ordering correlated with the update operation associated with paradigmatic relationships based on features is defined in (39).

figure am

Both \(\sqsubseteq _{link}\) and \(\sqsubseteq _{sem}\) are subrelations of \(\sqsubseteq _{sm}\). The information ordering correlated with the update operation associated with thematic role assignment is defined in (40).

figure an

The ordering \(em \sqsubseteq _{em} em'\) on event models holds if the two models describe the same event and if \(f'_e\) contains at least the information about that event that \(f_e\) contains. The additional information is either sortal information about the event or information relating the event to an object by means of a thematic role.

4 Probability Distributions and Information Metrics

Having defined update operations together with their information orderings that are related to ERP components, we are interested in probabilities between a given context and its possible continuations relative to these update operations and orderings. The relation between probabilities and the ERP components, in particular the N400, is the following. The N400 amplitude on a word w in a context \(c = w_1 \ldots w_t\) is typically inversely related to its conditional probability given this context: \(P(w \, | \, c)\), [KJ16]. Underlying this relation is a model of online processing according to which at every step during this processing there exists a probability distribution over the words that could be encountered next. On this view, a prediction is simply the presence of such a probability distribution (see [KJ16] for an overview). This conditional probability can be measured in at least two ways. The first way uses subjective human ratings and is based on the notion of cloze probability. Participants are presented with the context plus the target sentence with the critical word missing. They are then asked to fill in the first word that comes to their mind. The cloze probability of a word is the percentage of participants who provide this word as the filler. A second way of quantifying predictability is the information-theoretic notion of surprisal. Given an initial sequence of words \(w_1 \ldots w_{t-1}\), \(w_t\) can be viewed as a random variable. Its surprisal (or self-information) is defined as the negative logarithm of the conditional probability \(P(w_t \, | \, w_1 \ldots w_{t-1})\) and is estimated by probabilistic language models trained on large text corpora. In contrast to these strategies, we define probabilities not at the level of word forms (or referring expressions) but at the semantic level. 
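For comparison, the two word-level measures just described can be computed as in the following sketch. The participant responses are invented toy data, the function names are assumptions, and the choice of base 2 for the logarithm (surprisal in bits) is a convention that the text does not fix.

```python
import math
from collections import Counter


def cloze_probability(responses, target):
    """Proportion of participants who filled the gap with `target`."""
    return Counter(responses)[target] / len(responses)


def surprisal(p):
    """Surprisal in bits of an outcome with conditional probability p."""
    return -math.log2(p)


# Invented responses of four participants for one gap.
responses = ["palms", "palms", "pines", "palms"]
p_palms = cloze_probability(responses, "palms")
p_pines = cloze_probability(responses, "pines")
```

The less probable a word is in its context, the higher its surprisal; a word that no participant produced has cloze probability 0, for which surprisal is undefined.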
Interpreting lexical items as objects of type \(\varOmega \) has the effect that each input context is related to its set of possible continuations on which probability distributions can be defined. More specifically, one has the following. For a given context c, \(\lambda \phi .T(c)(\phi )\) is of type \(\langle \langle \gamma , t\rangle , t \rangle \) and, therefore, a set of continuations. Each continuation is a set of contexts. The contexts in a continuation can be ordered according to one of the information orderings defined in the preceding section. It is therefore necessary to lift the orderings on these models in a first step to the level of contexts. Since there are three orderings, we get a total of three lifts:

For the ordering correlated with the update operation associated with linking, the lifted ordering on contexts is defined in (41-a).

figure ao

For example, the lifted ordering correlated with the update operation associated with linking \(c[sm] \sqsubseteq _{link} c[sm']\) requires that the frame component of the discourse object stored in \(c[sm']\) extends the corresponding component in c[sm] according to (38). The other two lifted orderings are defined analogously.

4.1 Probability Distributions on Frames

Next we define properties of frames. We start with properties of events in event models. Let the current event be e of sort \(\sigma \) with a frame \(f_e\) and \(\theta (f_e) = \{\langle R_1, o_1 \rangle , \ldots \langle R_m, o_m \rangle \}\). Each \(R_i\) is a relation that maps \(f_e\) and its root to a (unique) object o. Hence, to each \(R_i\) corresponds the property of frames \(Q_i = \{ f \, | \, \exists o. R_i (f)(root(f))(o) \} = dom(R_i)\). The frame \(f_e\) is therefore related to the property \(Q_1 \cap \ldots \cap Q_m\). If the next expression is a DP, it contributes the discourse object \(\langle o, f_o\rangle \) with \(\theta (f_o) = \{ \langle R'_1, o'_1\rangle , \ldots , \langle R'_k, o'_k \rangle \}\). Relative to the current event model, this triggers a move along the information ordering \(\sqsubseteq _{tr}\) based on the update operation tr defined in (35). Let \(tr_1 \ldots tr_l\) be the thematic roles defined for events of sort \(\sigma \). If in the given context \(tr_1, \ldots , tr_j\) have already been discharged, information growth is possible only with respect to the thematic roles \(tr_{j+1}, \ldots tr_l\). Hence, \(f_o\) is related to \(f_e\) by some thematic role \(tr_k, j+1 \le k \le l\) with interpretation \(R_{tr_k}\). One therefore has \(\theta (f'_e) = \theta (f_e) \cup \{ \langle R'_1, o'_1\rangle , \ldots , \langle R'_k, o'_k \rangle \}\) and \(Q_{f'_e}\) the corresponding property of frames.

We define conditional probability functions on subsets of \(D_f\). \(P_{\sqsubseteq _{em}}(Q_{f'} \, | \, Q_f)\) is the probability that frame f can be extended to frame \(f'\) by a move along the information ordering \(\sqsubseteq _{tr}\). \(P_{\sqsubseteq _{tr}}(Q_{f'} \, | \, Q_f) > 0\) indicates that frame \(f'\) is accessible from frame f relative to \(\sqsubseteq _{tr}\). The probability \(P_{\sqsubseteq _{tr}}\) of a move along \(\sqsubseteq _{tr}\) depends on the context. For example, given that the actor of an event has already been introduced, the probability of extending the frame of the current event by this relation is 0 because the corresponding update operation fails.

For situation models, properties of frames are defined in a way similar to that for event models. Given a situation model sm with associated frame \(f_{sm}\) and \(\theta (f_{sm}) = \{ \langle R_1, S_1 \rangle \ldots , \langle R_n, S_n \rangle \}\), the corresponding property is \(Q_1 \cap \ldots \cap Q_n\) where \(Q_i = \{ f_{sm} \, | \, \exists S.R_i(f_{sm})(root(f_{sm}))(S)\}\). Similar to the case of an event model, the contribution of the next word is based on the discourse object \(\langle o, f_o\rangle \). For situation models, there are two update operations with corresponding information orderings \(\sqsubseteq _{link}\) and \(\sqsubseteq _{sem}\). Hence, one gets two conditional probability distributions: \(P_{\sqsubseteq _{link}}\) and \(P_{\sqsubseteq _{sem}}\). The constraints on these distributions are the same as in the case of \(P_{\sqsubseteq _{em}}\). For example, \(P_{\sqsubseteq _{link}} (Q_{f'} \, | \, Q_f) > 0\) means that frame \(f'\) is accessible from frame f along the ordering \(\sqsubseteq _{link}\).

4.2 Information Metrics: Entropy and Entropy Reduction

The situation model sm stored at the stack of a context c can be taken as a partial description of a (complete) situation model \(sm_c\), i.e. one has \(sm \sqsubseteq _{sm} sm_c\). Given context c and sm, a comprehender wants to know which \(sm_c\) is described by the discourse. Each new word that is processed contributes additional information and therefore (possibly) decreases the uncertainty the comprehender has about which situation model is described. The comprehender expects the new information to comply with her expectations based on discourse principles (linking) and paradigmatic relationships as well as her world knowledge. The update operation associated with linking targets bridging inferences. The conductor example in (15) shows that at this level the N400 amplitude is smallest in the case of an identity relation as in (15-a). This kind of DP does not exclude any extensions that were possible before this DP was encountered because the information related to this DP was already known in the input information state. For bridged DPs, this will in general not be the case because some extensions are excluded by establishing a linking relation that was not known before. Take, for example, the case of the jackets in (6). This excludes situations in which the children were wearing coats or ski suits. If linking fails, no transition along \(\sqsubseteq _{sem}\) is possible so that all continuations are discarded. This data suggests that the update operation associated with linking is related to the information metric of entropy reduction. Hence, we hypothesize the following relation to the modulation of the N400 amplitude:

figure ap

Let us make this idea formally precise. One way to proceed is to define entropy over maximal continuations relative to a particular situation model. However, the number of possible continuations in such contexts is in general far too large. We will therefore use another approach and define n-step entropy instead (see [Fra13] for further details). We start by defining conditional probabilities \(P_{\sqsubseteq _{link}}(c_j \, | \, c_i)\) between contexts relative to the ordering \(\sqsubseteq _{link}\) defined above for situation models. Let \(f_{sm_i}\) be the frame component of the discourse object stored at position sm in context \(c_i\).

figure aq

In the next step we define conditional probabilities for n-step transitions. This is done by using the chain rule from probability theory.

figure ar

In (44) \(c^{t+i-1}_1\) is the context obtained from \(c_1\) by \(t+i-2\) moves along the ordering \(\sqsubseteq _{link}\). More generally, \(c_i^j\) is the context obtained from context \(c_i\) by \(j-i\) moves along the ordering \(\sqsubseteq _{link}\). The definition of n-step entropy is given in (45).

figure as

\(\varPhi ^n\) is the set of n-step continuations. Processing word \(w_{t+1}\) leads to the new context \(c_1^{t+1}\) which drops out of the computation of uncertainty concerning the situation model described by the discourse. The relevant entropy at this point is over the probabilities of moves in \(\varPhi ^{n-1}\) so that the simplified reduction in entropy due to \(w_{t+1}\) becomes (46).

figure at
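The construction in (43) to (46) can be illustrated with a small numerical sketch. It assumes that the simplified reduction in (46) has the form of the n-step entropy at the current context minus the (n-1)-step entropy at the next context, as the preceding sentence suggests; the transition table `P` stands in for \(P_{\sqsubseteq _{link}}(c_j \, | \, c_i)\) and contains invented values, and all function names are hypothetical.

```python
import math

# Invented one-step transition probabilities between toy contexts.
P = {
    "c0": {"a": 0.5, "b": 0.5},
    "a": {"a1": 0.5, "a2": 0.5},
    "b": {"b1": 1.0},
}


def n_step_paths(c, n):
    """Yield (path, probability) for every n-step continuation of c;
    path probabilities are products of one-step probabilities (chain rule)."""
    if n == 0:
        yield (), 1.0
        return
    for nxt, p in P.get(c, {}).items():
        for path, q in n_step_paths(nxt, n - 1):
            yield (nxt,) + path, p * q


def n_step_entropy(c, n):
    """Entropy in bits over the n-step continuations of context c."""
    return -sum(p * math.log2(p) for _, p in n_step_paths(c, n) if p > 0)


def entropy_reduction(c, c_next, n):
    """Simplified reduction: H_n at c minus H_(n-1) at the next context."""
    return n_step_entropy(c, n) - n_step_entropy(c_next, n - 1)
```

In this toy model the 2-step entropy at `c0` is 1.5 bits. Moving to `b`, after which only one continuation remains, reduces it by 1.5 bits, whereas moving to `a` reduces it by only 0.5 bits.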

However, using entropy reduction in this way is problematic for cases involving paradigmatic relationships as in the example of the holiday resort.

figure au

Recall that for ‘pines’ the N400 amplitude was less enhanced than that for ‘tulips’. However, if for example pines and tulips have the same (low) conditional probability, they do not differ with respect to entropy reduction. As a result, the N400 amplitude for ‘pines’ and ‘tulips’ should be the same, contrary to the empirical findings. The same shortcoming arises with cloze probabilities: both ‘pines’ and ‘tulips’ have the same (low) cloze probability.

We suggest the following solution to this problem. One has to compare the actual (surviving) continuations with the continuations that have the highest conditional probability. Let us make this precise. Given a particular context c with \(c[sm] = \langle e_{sm}, f_{sm} \rangle \), there is a maximal sortal constraint on elements of the values of \(f_{sm}\). This constraint is determined by selectional restrictions, bottom-up information and world knowledge. Given these constraints, particular extensions are most expected, i.e. have the highest conditional probability relative to \(\sqsubseteq _{sem}\). For example, in the case of (47) these are extensions which assign to the theme of the planting event objects that are tall trees whose geographical range is the tropics. Whereas palms satisfy all of these features, pines only satisfy two (they are trees and tall) and tulips satisfy none of these features. Hence, the question is: to what degree do the features actually found satisfy the most predicted ones? This idea can be made precise as follows.

Let the input context obtained after processing (47) up to but excluding the critical word be \(c_t\) and the next word be \(w_{t+1}\) with interpretation \(\langle o, f_o\rangle \). In the case of (47) this is either ‘palm’, ‘pine’ or ‘tulip’. In all three cases the plant o (i.e. the palms, the pines or the tulips) can be linked to the event of planting by the theme relation. One has that \(f_o\) is a subframe of an extension \(f'_{o'}\) of the frame \(f_{o'}\) associated with the planting event \(o'\) (\(f_o \preceq f'_{o'}\) and \(f_{o'} \sqsubseteq f'_{o'}\)). Bottom-up information only yields the sortal information provided by the head noun. Enriching this information with world knowledge yields frames with the following values: \(\theta (f_{palm}) = \{ \textsc {sort} = \textit{palm}, \textsc {range} = \textit{tropics}, \textsc {species} = \textit{plant}, \textsc {subspecies} = \textit{tree} , \textsc {height} = \textit{tall}\}\), \(\theta (f_{pine}) = \{ \textsc {sort} = \textit{pine}, \textsc {range} = \textit{moderate}, \textsc {species} = \textit{plant}, \textsc {subspecies} = \textit{tree} , \textsc {height} = \textit{tall}\}\) and \(\theta (f_{tulip}) = \{ \textsc {sort} = \textit{tulip}, \textsc {range} = \textit{moderate}, \textsc {species} = \textit{plant}, \textsc {subspecies} = \textit{flower} , \textsc {height} = \textit{small}\}\).Footnote 3 These frames will be referred to by \(f_{found}\). Predictions are calculated by extensions of \(c_t\) along the information ordering \(\sqsubseteq _{sem}\). Instead of entropy reduction in the case of linking, we consider n-step conditional probabilities. Probabilities at the level of contexts relative to the ordering \(\sqsubseteq _{sem}\) are defined in a way similar to those for \(\sqsubseteq _{link}\).

figure av

We are interested in those contexts obtained after n steps that have the highest conditional probability given \(c_t\) relative to the ordering \(\sqsubseteq _{sem}\). \(\phi ^n\) is the set of n-step continuations.

figure aw

Let us assume for the sake of simplicity that \(max(c, \sqsubseteq _{sem})\) is a singleton, i.e. there is only one most probable continuation of length n. Let \(c^*\) be the maximal element in this continuation relative to \(\sqsubseteq _{sem}\) with \(c^*[sm_{c^*}] = \langle e_{sm_{c^*}}, f_{e_{sm_{c^*}}}\rangle \) and \(p(f_{e_{sm_{c^*}}}) = \{o_1, \ldots , o_k\}\). Since \(w_{t+1}\) contributed the object o which is linked to the planting event \(o'\) by the theme relation \(R_{theme}\), we need the object \(o_j\) in \(p(sm_{c^*})\) for which one has \(\langle o', o_j\rangle \in pr(f_{e_{sm_{c^*}}})\) and \(\langle R_{theme}, o_j\rangle \in \theta (f_{o'})\). The frame associated with \(o_j\) is \(f_{o_j}\). Recall that we are interested in the question: given \(f_{found}\), i.e. the frame for the palms, the pines or the tulips, what is the percentage of features that this frame has in common with \(f_{o_j}\)? The set of features common to both frames is given by \(\theta ^*(f_{found}) \cap \theta ^*(f_{o_j})\) where \(\theta ^*(f)\) is the projection of \(\theta (f)\) to its relational component. Finally, one calculates the percentage in (50).

figure ax

If \(\theta ^*(f_{o_j})\) is \(\theta ^*(f_{palm})\), one gets the following. For ‘palm’, (50) yields a value of 1. By contrast, for ‘pine’, \(\theta ^*(f_{found}) \cap \theta ^*(f_{o_j})\) has three elements, which yields a value of 0.60. For ‘tulip’, finally, one has \(\theta ^*(f_{tulip}) \cap \theta ^*(f_{o_j}) = \{\textsc {species} = \textit{plant} \}\) and one gets 0.20. Tulips satisfy only the most general feature that is determined by ‘plant’ for its theme argument.
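The three values can be checked with a small Python sketch. It flattens the frames to attribute-value dictionaries, which is a simplification of the frame representation used here, and it assumes that the percentage is the number of shared attribute-value pairs divided by the number of pairs in the most predicted frame; the exact denominator of (50) is not reproduced in the text, so this choice is an assumption. The function name `feature_overlap` is hypothetical.

```python
# Attribute-value pairs of the three frames, as listed in the text.
PALM = {"sort": "palm", "range": "tropics", "species": "plant",
        "subspecies": "tree", "height": "tall"}
PINE = {"sort": "pine", "range": "moderate", "species": "plant",
        "subspecies": "tree", "height": "tall"}
TULIP = {"sort": "tulip", "range": "moderate", "species": "plant",
         "subspecies": "flower", "height": "small"}


def feature_overlap(found, predicted):
    """Share of the predicted frame's attribute-value pairs also present in the found frame.

    Assumption: the denominator is the size of the predicted frame.
    """
    shared = set(found.items()) & set(predicted.items())
    return len(shared) / len(predicted)


if __name__ == "__main__":
    # The most predicted frame is taken to be the frame for 'palm'.
    for name, frame in (("palm", PALM), ("pine", PINE), ("tulip", TULIP)):
        print(name, feature_overlap(frame, PALM))
```

With the frame for ‘palm’ as the predicted frame, the sketch yields 1.0 for ‘palm’, 0.6 for ‘pine’ and 0.2 for ‘tulip’, matching the values given above.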

4.3 The LPP and Exception Handling

Due to lack of space, we can only sketch how the LPP component is related to our formal framework. By way of example, we will illustrate with the linking operation. Recall that the linking update operation is based on the establishment of a bridging inference. So far, there are only two possibilities: such an inference can be established or not. However, what is required is a threefold distinction between true and false bridging inferences and the failure of such an inference. Recall that empirical evidence for such a distinction is twofold. First, N400 amplitudes that correspond to failure of linking in our approach are maximal and are independent of semantic similarity and paradigmatic relationships based on features. Second, cases of failure of linking in our approach elicit an LPP (usually associated with semantic violations) whereas this is not the case for cases in which linking succeeds but is false according to general world knowledge (usually associated with world knowledge violation).

We follow [dGL10] and [Leb12] and assume that update operations that have side-effects depend on a function sel. In our approach, for the linking update, sel takes a context c, an object o, a frame f and returns, if successful, another frame \(f'\). It is defined in (51).

figure ay

By itself, sel is a partial function: if it returns an object \(o'\), linking is successful. In this case the established bridging inference can either be true or false. If no object is returned, sel raises an exception to the effect that no object was found whose frame can be linked by a feature to the frame \(f_o\). The exception is caught and the object is passed to the exception handler. The handler introduces a new object into the context whose associated frame allows for a bridging inference with \(f_o\). Hence, the linking update operation is called with an enriched context that makes a bridging inference possible. Formally, this can be defined in terms of an exception handling mechanism (see [Leb12] for details).

figure az

In (52) D is the discourse up to the linking operation. It is of type \(\langle \langle \gamma , t\rangle , t\rangle \). S is the update operation associated with linking. \(\textit{handle} \,\,\textit{with}\) takes a set of continuations, an exception of type \(\chi \) and a set of continuations and maps them to a set of continuations. The effect of the exception handling is to execute D with respect to continuations that are augmented by an additional object together with its associated frame so that a bridging inference relative to \(f_o\) becomes possible. Note that in this case the frame for \(o''\) can directly be assumed to have the required attribute (feature) that links it to the frame \(f_o\). The revised linking operation is given in (53).

figure ba
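The control flow of this mechanism can be sketched in Python with an ordinary exception. This is only an illustration under strong simplifying assumptions, not the continuation-based formalization of [Leb12]: contexts are dictionaries from object names to frames, a frame is a dictionary with a `sort` and a set of `attributes` that name the sort they accept, and a bridging inference is possible when some attribute accepts the sort of the new frame. All names (`NoLinkableObject`, `sel`, `accommodate`, `link`, `o_new`) are hypothetical.

```python
class NoLinkableObject(Exception):
    """Raised by sel when no object in the context can be linked to the frame."""

    def __init__(self, frame):
        super().__init__(f"no object can be linked to sort {frame['sort']!r}")
        self.frame = frame


def sel(context, frame):
    """Partial function: return (object, attribute) admitting a bridging inference."""
    for obj, obj_frame in context.items():
        for attribute, accepted_sort in obj_frame["attributes"].items():
            if accepted_sort == frame["sort"]:
                return obj, attribute
    raise NoLinkableObject(frame)


def accommodate(context, frame):
    """Handler: add a new object whose frame has an attribute linking it to the frame."""
    enriched = dict(context)  # the input context is left unchanged
    enriched["o_new"] = {"sort": "accommodated",
                         "attributes": {"link": frame["sort"]}}
    return enriched


def link(context, obj, frame):
    """Linking update; falls back to the handler if sel raises an exception."""
    handled = False
    try:
        antecedent, attribute = sel(context, frame)
    except NoLinkableObject as exc:
        handled = True
        context = accommodate(context, exc.frame)
        antecedent, attribute = sel(context, frame)  # succeeds in the enriched context
    updated = dict(context)
    updated[obj] = frame
    return updated, (antecedent, attribute, obj), handled


if __name__ == "__main__":
    ctx = {"e_plant": {"sort": "planting", "attributes": {"theme": "plant"}}}
    print(link(ctx, "o1", {"sort": "plant", "attributes": {}})[1:])
    print(link(ctx, "o2", {"sort": "engine", "attributes": {}})[1:])
```

The flag `handled` merely records whether the handler was executed. In the sketch it marks the cases of failure of linking, which according to the discussion above are the cases that elicit an LPP.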

The test for a bridging inference is now part of the sel-function which provides or fails to provide an argument of the update operation. Our hypothesis for the LPP is given in (54).

figure bb

For the update operation associated with thematic roles, other handling mechanisms are required that modify constraints like those imposed by freeness. We assume that such constraints are part of the common ground which, in turn, is part of the initial context of a discourse. Elaborating on this strategy must be left to another occasion. This account of the LPP may also shed some light on the fact that the evocation of an LPP is sometimes task-dependent. For example, the critical word in ‘De bomen die in het park speelden...’ (‘The trees that in the park played...’) elicited an LPP effect compared to the expected ‘stonden’ (‘stood’) (and no N400 effect) when participants made explicit sentence acceptability judgments about these sentences, but when participants simply read the sentences for comprehension, the critical words only evoked an N400 effect and no LPP effect (see [Kup07] for references and further details). In our approach this difference is explained as follows. Participants execute an exception handling operation (accommodation) if they know that the discourse is continued or if they have to evaluate the coherence of the discourse so far. If they only have to read a particular discourse up to a particular point, there is no need to adapt the current context in order to continue or answer a question related to its coherence.

5 Comparison to Three Related Models

Three related models that have been proposed in the literature are the Retrieval-Integration model by Brouwer and colleagues, the MUC-model by Baggio and Hagoort and the approach by Rabovsky and colleagues that is based on a probabilistic representation of meanings.

The Retrieval-Integration model of Brouwer et al. [BFH12, BCVH17, DBC19] is based on the assumption that incremental, word-by-word language processing proceeds in retrieval-integration cycles where each cycle is modelled by a function process which maps a word form \(w_1\) and a context to an updated context. The function process, in turn, is the composition of two functions retrieve and integrate. The former maps a word form and the prior context to the disambiguated meaning of the word form whereas the latter takes this meaning and the context and maps them to an updated context. The N400 component reflects the effort involved in retrieving from long-term memory conceptual knowledge associated with the eliciting word, which is influenced by the extent to which this information is cued (or primed) by the preceding context, [DBC19, p. 2]. The retrieval operation is viewed as a bottom-up process that does not involve integrative semantic processing or semantic composition, [BFH12, p. 134]. Top-down information, e.g. from the existing mental representation of the preceding sentence fragment, does play a role, but it adds to the activation pattern and does not constrain the pattern of activation. A reduced (attenuated) N400 amplitude reflects facilitated access, and hence retrieval, of lexical information, [DBC19, p. 2]. As a consequence, the N400 amplitude for a critical word should be relatively insensitive to the plausibility of a sentence within which it is contained. For example, if one of two words makes a given sentence implausible, while the other does not, there will be no N400 effect if both are approximately equally primed by the preceding context. A by-product of this conception of the retrieval operation is that the language processing system is able to anticipate or predict upcoming words, [BFH12, p. 134].
In this approach, the absence of N400 effects in semantic illusion sentences results from contextually-cued retrieval mechanisms that are based on semantic similarity or semantic associations, [DBC19, p. 2]. An N400 effect is observed for critical words that are semantically weakly associated with the prior context. By contrast, if there is a strong semantic association, no N400 effect occurs.

According to the Retrieval-Integration model, late positivities, to which the LPP belongs, reflect the word-by-word construction, reorganization or updating of a mental representation of what is being communicated. They are functionally interpreted as the brain’s natural electrophysiological reflection of updating a mental representation with new information. Each member of this family corresponds to a specific subprocess of this updating process. Subprocesses include: accommodating new discourse referents; establishing linking relations between discourse referents; assigning thematic roles to discourse referents; imposing constraints on discourse referents; revision of already established relations and resolving conflicts between different sources of information. Integration difficulty does not result from a conflict between two or more processing streams. Rather, it reflects the degree to which the current mental representation needs to be adapted to incorporate the current input, [BFH12, p. 138].

[BCVH17] use a neurocomputational model that is an extension of a Simple Recurrent Network to implement this approach. This network instantiates the process function with its two subprocesses retrieve and integrate. The N400 amplitude is an index of the amount of processing involved in activating the conceptual knowledge associated with an incoming word in memory. Specifically, the N400 amplitude for a word w is taken as the degree of change that w induces in the activity pattern of the retrieval layer that implements the ‘retrieve’ subprocess. Similarly, the LPP amplitude for a given word w is estimated as the degree of change that processing this word induces in the activity pattern of the integration layer, which implements the ‘integrate’ subprocess.
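How such layer-based estimates could be computed is sketched below in Python. The text above only speaks of a ‘degree of change’; the sketch assumes that this degree is measured as cosine dissimilarity between the activity patterns of a layer before and after a word, which is one possible choice and not necessarily the one used in [BCVH17]. The function names and the toy activation vectors are hypothetical.

```python
import math


def cosine_dissimilarity(x, y):
    """1 - cos(x, y); assumes both vectors are non-zero and of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm


def layer_changes(states):
    """Change in a layer's activity pattern induced by each successive word."""
    return [cosine_dissimilarity(prev, cur)
            for prev, cur in zip(states, states[1:])]


def erp_estimates(retrieval_states, integration_states):
    """Per-word N400 estimates (retrieval layer) and LPP estimates (integration layer)."""
    return layer_changes(retrieval_states), layer_changes(integration_states)


if __name__ == "__main__":
    # Toy activity patterns: the state before the first word, then after each word.
    retrieval = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    integration = [[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]]
    n400, lpp = erp_estimates(retrieval, integration)
    print("N400 estimates:", n400)
    print("LPP estimates:", lpp)
```

In the toy data, the second word changes the retrieval pattern completely but leaves the integration pattern untouched, so it receives a large N400 estimate and an LPP estimate of (approximately) zero.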

The Retrieval-Integration model and our model have in common that language processing is taken as a biphasic process with the first phase indexed by the N400 and the second by the LPP. The difference is twofold. First, we distinguish between a global level of the situation model and a local level of the event model. The representation of an incoming word must be integrated at both levels, which is modelled by update operations. Integration at the level of the situation model is related to the N400. Adding the object is followed by an integration operation which adds the semantic representation of the incoming word to the situation model. Since this operation adds new information to this model, its associated probability distribution is changed. This change leads to a change in the expectations of the comprehender. Hence, the N400 is related to an operation that changes and, therefore, constrains the model.

In both models the LPP indexes integration. However, the set of operations modelling this integration operation is only a subset of those assumed in the Retrieval-Integration model. In the latter model integration captures all kinds of semantic update operations, whereas in our model these operations are restricted to those related to the current event model. For example, the LPP is related to establishing a linking operation.

Rabovsky et al. [RHM18] interpret N400 amplitudes as the change induced by an incoming word in a probabilistic representation of meaning. In this model each word in a sentence provides clues that constrain the formation of a probabilistic representation of the event described by the sentence, [RHM18, p. 693]. The context and each word are represented by a set of activation units which are modelled as probability distributions over features. Examples of such units are ‘Agent’, ‘Action’ and ‘Patient’. Features for the ‘Agent’ unit include ‘woman’, ‘man’, ‘boy’ and ‘girl’ and capture semantic similarities among event participants. The magnitude of the activation update produced by each successive word of a sentence corresponds to the change in the model’s probabilistic representation that is triggered by that word, [RHM18, p. 693]. The N400 amplitude of the n-th word is defined as the semantic update (SU) induced by this word. This update is defined as the sum of the absolute values of the change of each unit’s activation (across the model) that the word triggers. For a given unit \(a_i\) the change is the difference between the unit’s activation after processing the n-th word and the activation of this unit prior to processing it, i.e. after the (n-1)-th word.

\[ SU_n = \sum _{i} \left| a_i(w_n) - a_i(w_{n-1}) \right| \]

Consider the sentence fragment ‘I take my coffee with cream and ...’. The activation state associated with this fragment already implicitly represents a high subjective probability that in addition to cream the speaker takes her coffee with sugar. As a result, the state will change very little if ‘sugar’ is in fact found as the next word, and the N400 amplitude is small. If instead ‘dog’ is encountered, the activation state is changed to a much larger degree so that a larger N400 amplitude is elicited.
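The semantic update for this example can be sketched in a few lines of Python. The function follows the verbal definition given above (sum over all units of the absolute change in activation); the three activation values per state are hypothetical toy numbers chosen for illustration and are not taken from [RHM18].

```python
def semantic_update(before, after):
    """Sum of absolute activation changes across all units."""
    return sum(abs(a - b) for a, b in zip(after, before))


if __name__ == "__main__":
    # Hypothetical activations of three units after 'I take my coffee with cream and'.
    # The first unit stands for 'sugar', the second for 'dog', the third for anything else.
    state_before = [0.80, 0.05, 0.10]
    state_after_sugar = [0.95, 0.01, 0.02]  # expected continuation: small change
    state_after_dog = [0.05, 0.90, 0.02]    # unexpected continuation: large change

    print("SU for 'sugar':", semantic_update(state_before, state_after_sugar))
    print("SU for 'dog':", semantic_update(state_before, state_after_dog))
```

Under these toy values the update for ‘dog’ is several times larger than the update for ‘sugar’, which corresponds to the larger N400 amplitude described above.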

In contrast to most other accounts of N400 activity this model does not assume separate stages for lexical access and subsequent integration. It resembles an access view in that the change in activation state is fast, automatic and implicit. However, there is no separate step that consists of the isolated representation of the incoming word. Rather, the resulting activation state already is the updated activation state, i.e. the change that is triggered by this word. Hence, this activation state can be taken as representing the result of integrating the representation of the incoming word with the representation of the context. This model and our model have in common that the effect of processing a word is represented as a change in information state. However, in contrast to this model, in our model a static representation can be isolated for each word, which is the frame representation of the concept associated with this word. Furthermore, in our model, separate stages of processing are distinguished: the two stages of N400 activity and the stage indexed by the LPP. By contrast, the resulting activity state in the Rabovsky et al. approach represents all aspects of the event described by the sentence, [RHM18, p. 700].Footnote 4

The approach by Baggio and Hagoort, [BH11], is based on the Memory-Unification-Control (MUC) model of language processing in the brain. The memory component is a lexicon that stores phonological, syntactic and semantic information about morphemes, words and other constructions. What gets stored are unification-ready structures which supply constraints across levels of description. The unification component combines stored lexical information into more complex units. This is done by solving (or unifying) sets of constraints given by the context and an input, say the next word in a sentence. This solving of constraints is done in a dynamic fashion. Memory supplies constraints for the Unification component, which retains a context for subsequent stages of memory retrieval and unification, [BH11, p. 1341f]. Finally, the Control component presides over executive functions in language like turn taking in conversations. Each component corresponds to a set of brain regions. The memory component is localized in temporal regions (superior temporal gyrus, STG; middle temporal gyrus, MTG; and inferior temporal gyrus, ITG). The unification component is subserved by the inferior frontal gyrus (IFG) and the Control component is localized in anterior cingulate and dorsolateral prefrontal cortices.

The N400 is explained as the result of the summation of currents injected by frontal into temporal areas (unification) with currents that are already circulating within temporal cortex due to the local spread of activation to neighbouring neuronal populations (pre-activation). More specifically, the N400 component reflects reverberating activity within the MTG/STG-IFG network, [BH11, p. 1358f]. Processing an initial fragment of a sentence or a discourse sets up a context, i.e. a set of unification-ready structures or constraints, in MTG/STG. This corresponds to the pre-activation component. Encountering the next word of the sentence/discourse similarly activates a unification-ready structure representing the meaning of this word. The next step is the unification component, i.e. the solution of the constraints representing the context and the new word, which amounts to calculating the unification of the unification-ready structures. If the constraints representing the context include features that are also part of the constraint associated with the new word, there will be some overlap between the populations in MTG/STG associated with the context and those associated with the word. The relation to the N400 is the following. The larger the overlap of features between the representations of the context and the new word, the smaller the amplitude of the N400. Consider the sentence ‘The girl was writing letters when her friend spilled coffee on the paper/tablecloth’. Processing the initial fragment up to but excluding the final word sets up a representation of the context that activates more features contained in the representation of ‘paper’ than in the representation of ‘tablecloth’. This is due to features activated with the representations of the words ‘write’ and ‘letters’. As a result, the N400 amplitude for ‘paper’ is smaller than that for ‘tablecloth’.

Similar to this theoretical account of the N400, we assume that N400 activity is related to two components: prediction and integration at the level of situation models. However, whereas in the Baggio and Hagoort account the N400 amplitude is modulated only by the unification component, this amplitude is a function of both components in our model. Second, in the Baggio and Hagoort approach unification is an operation at the sentential or discourse level because the representation of the context and that of the incoming word are combined (unified) into a new (updated) context. By contrast, in our approach integration is related to two different levels: the situation model and the event model. The N400 activity is related to integration in the situation model, i.e. to the combination of the representation of the incoming word and the representation of the situation model. Integration at the event model is related to the LPP. Finally, in our approach stochastic frames are used as representations in the lexicon, which results in a probabilistic framework that allows for a weighting of features.

6 Closing Outlook

We have outlined a formal framework in which results from neuro-linguistic research on the N400 and the LPP can be incorporated. Obviously, this framework needs to be extended in several directions. Two of the most important directions are: (i) besides the N400 and the LPP, data on the Left Anterior Positivity has to be accounted for as well as more data on the N400 and the LPP; (ii) our implementation of a left-to-right processing strategy only accounts for simple sentences. Extending it to include constructions like proper quantification and modification, e.g. in the form of adjectives, adverbs or relative clauses, requires a more complex framework that has to use some kind of storage mechanism (see [BS17] for a similar argument).