1 Introduction

As traditional computer vision technology matures, higher-level forensic semantic understanding of visual surveillance data has been gaining increasing attention. Such forensic semantic analysis deals with a propositional assumption to be investigated after an incident, and the answer to that assumption should be the result of epistemic reasoning over previously observed evidential and contextual cues. It therefore requires intelligent reuse of low-level vision analytic results together with additional visual and non-visual contextual cues. However, unlike domains that can rely solely on a deterministic knowledge model, in visual surveillance both contextual knowledge and low-level vision analytic results are fraught with uncertainty, incompleteness and inconsistency. The key challenges for such high-level analysis are therefore the choice of an appropriate contextual knowledge representation and of a proper mechanism for reasoning under uncertainty. Depending on how they handle uncertainty, such approaches can be roughly categorized into intensional and extensional approaches [1]. Intensional approaches, also known as state-based approaches, attach uncertainty to ‘subsets of possible states’ and handle it by taking into account the relevance between the states. Extensional approaches, also known as rule-based systems, treat uncertainty as a generalized truth value attached to formulas and compute the uncertainty of any formula as a function of the uncertainties of its subformulas. There is a trade-off between the two. Intensional approaches assume completeness of the state model and are therefore semantically clear but computationally clumsy. Extensional approaches are computationally convenient but semantically sloppy.
In forensic visual surveillance, however, given the variety of possible semantics in scenes, extensional approaches offer advantages in flexibility and expressive power because they can derive a new proposition based only on what is currently known, (a) regardless of anything else in the knowledge base (locality) and (b) regardless of how the current knowledge was derived (detachment). Locality and detachment are together referred to as modularity [1]. Owing to this advantage, there has been a certain amount of work on the use of logic programming languages with different uncertainty handling formalisms for visual surveillance and computer vision problems. In such approaches, intermediate metadata from vision analytics and additional visual or non-visual contextual cues are encoded as either symbolized facts or rules. The uncertainty that comes with the vision analytics is then represented according to the chosen uncertainty formalism and attached to the symbolized facts. Similarly, uncertainty in the form of general trustworthiness or priority among rules is represented according to the chosen formalism and attached to the given contextual rules. Once the uncertainty has been attached, principled inference, which is often nonmonotonic, is conducted. Examples of such principled inference are default reasoning [2] to handle inconsistent information, abduction [3] to find the most probable hypothesis for a given observation, and belief revision over time as observations change. In this pipeline, appropriate uncertainty assignment and a proper uncertainty formalism therefore play an important role. However, there are often cases in which the trustworthiness of a rule is itself uncertain, so that the uncertainty attached to the rule should rather be functional. Knowledge of the type ‘the more X, then the more Y’ is one such example.
To enable this type of rule modeling, in this chapter we further explore our previous work [46], where we proposed the use of subjective logic [7] with logic programming and demonstrated that the proposed approach can cover the handling of inconsistent information as default reasoning, as well as bidirectional reasoning as is typically done in intensional approaches. We first propose a reputational subjective opinion function that is similar to a fuzzy membership function but can also take into account the uncertainty of the membership value itself. We then adopt subjective logic’s fusion operator to accumulate the acquired opinions over time. To demonstrate reasoning under uncertain rules, we present a preliminary experimental case study in which we intentionally restrict the available metadata to the results of human detection and tracking algorithms. Automatic human detection and tracking is one of the most common analytics and is becoming more widely employed in automated visual surveillance systems. The meta-information that most human detection modules generate typically comprises localization information such as coordinates, width, height and time, and (optionally) additional low-level visual feature vectors. We intend to use such information to evaluate the relationship between two persons and, more specifically, to estimate whether one person could serve as a witness of another person in a public area scene. Examples of (linguistic) domain knowledge applicable to this scenario include: (1) (At least) two distinct people are required for building a relationship. (2) The closer the distance between two people is, the higher is the chance that they can identify each other. (3) If two persons approach each other directly (face-to-face), then there is a higher chance that they can identify each other. Such linguistic knowledge can be modeled and encoded as rules by the proposed approach.
The case study is further extended to demonstrate more complex forensic reasoning by considering additional contextual rules together with the shown uncertain rules.

Table 8.1 A comparison of previous extensional approaches

The rest of the chapter is organized as follows. In Sect. 8.2, we briefly review related work on intensional and extensional approaches, with more focus on the latter. In Sect. 8.3, we give a short introduction to subjective logic theory. In Sect. 8.4, we introduce our approach to modeling uncertain rules. Section 8.5 presents a case study scenario in a typical public area scene and deals with rule encoding and preliminary experimental demo results. Section 8.6 further extends the scenario with more complex situational rules. Finally, Sect. 8.7 concludes with discussions and future research directions.

2 Related Work

To address high-level context modeling and reasoning in the visual surveillance domain, whole-model-based approaches such as Bayesian networks have traditionally been used. Such approaches are called ‘intensional’. Bremont et al. [8] employ a context representation scheme for surveillance systems. Hongeng et al. [14] consider an activity to be composed of action threads and recognize activities by propagating constraints and likelihoods of event threads in a temporal logic network. Other approaches use a qualitative representation of uncertainty [15], an HMM to reason about human behaviors based on trajectory information [16], a Bayesian network and an AND/OR tree for the analysis of specific situations [17], or a GMM-based scene representation for reasoning about activities [18]. In such approaches, contextual knowledge is represented as a graph structure whose nodes are considered symbolic facts. In the sense of logic, two connected nodes can be interpreted as a propositional logic rule that can express only one relation, the causal implication. A piece of propositional knowledge must exist within the whole graph structure, so once the uncertainty propagation mechanism has been learnt, adding further pieces of knowledge requires restructuring the causal influence relations of the whole graph. This restricts expressive power and increases the modeling cost. Because of this complexity and lack of modularity, such approaches have focused on relatively narrow and specific semantics. However, as the forensic sense of semantics in visual surveillance gains more attention, more flexible knowledge representation and uncertainty handling mechanisms are required. For this reason, there has been some work on the use of logic programming languages to achieve better expressive power and on the use of different uncertainty handling formalisms to reason under uncertainty.
The better expressive power is mainly due to the first-order predicate logic that logic programming provides. While propositional logic deals with simple declarative propositions, first-order logic additionally covers predicates and quantifiers. Akdemir et al. [9] proposed an ontology-based approach for activity recognition, but without an uncertainty handling mechanism (in the ontology community, Description Logics (DLs) are often used as the knowledge representation formalism, and DLs are decidable fragments of first-order logic). Shet et al. [11] proposed a system that adopts Prolog-based logic programming for high-level reasoning. In [12], the same authors extended their system with the bilattice framework [19] to detect humans under partial occlusion based on the output of parts-based detectors. Jianbing et al. [10] used rule-based reasoning with Dempster–Shafer theory [20] for a bus surveillance scenario. Anderson et al. [13] used fuzzy logic [21] to model human activity for video-based eldercare. Han et al. [46] proposed the use of logic programming and subjective logic [7] to encode contextual knowledge with uncertainty handling, and demonstrated bidirectional conditional inference and default reasoning. Such logic-framework-based uncertainty handling approaches can be categorized as ‘extensional’. Table 8.1 shows a brief comparison of the previously proposed extensional approaches. The table shows that the subjective logic based approach has the broadest coverage. For example, while several approaches provide an information fusion capability for fusing two contradictory information sources, such as Dempster–Shafer’s fusion operator, the bilattice operator and subjective logic’s consensus operator, only some of them support default reasoning, which handles such contradictory information to draw a reasonable decision, and belief revision. Bidirectional inference is supported only by the subjective logic based approach.
In this chapter, we further propose an approach to modeling uncertain propositional rules, and to inference under such uncertain rules, for high-level semantic analysis of visual surveillance data. In terms of the linguistic interpretation of rules, the most similar previous approach is [13]. In that work, quantitative low-level features from human detection are linguistically symbolized into terms such as ‘high’, ‘medium’, ‘low’ and ‘very low’ according to their corresponding membership functions. Defining the membership functions is therefore critical in such an approach. The linguistic symbols are then used to form conjunctive logical patterns of human activities, which means that the rules contain symbolized static facts. In our approach, rules may contain the variables themselves. Moreover, our approach allows uncertainty on a membership-like function through the reputation operator of subjective logic, and thereby relieves the burden of defining the exact form of the membership-like function.

3 Subjective Logic Theory

Jøsang [22, 7] introduced subjective logic as a framework for artificial reasoning. Unlike traditional binary logic or probabilistic logic (the former can only consider true or false, and the latter can consider degrees of truth or falseness), subjective logic explicitly represents the amount of ‘lack of information (ignorance) on the degree of truth about a proposition’ in a model called an opinion, and it comes with a rich set of operators for the manipulation of opinions [7]. The idea of explicitly representing ignorance is taken from belief theory, and an opinion can be interpreted from a Bayesian perspective by mapping it to a beta distribution. Subjective logic also differs from fuzzy logic: while fuzzy logic maps quantitative measures to non-crisp premises called fuzzy sets (e.g. ‘fast’, ‘slow’, ‘cold’, ‘hot’, etc.), subjective logic deals with the uncertain belief itself about a crisp premise (e.g. ‘intrusion happened’, ‘accident happened’, etc.). However, for the purpose of interpretation, an opinion can also be mapped to a linguistic certainty fuzzy set (i.e. ‘very certainly true’, ‘less certainly true’, etc.). In general, subjective logic is suitable for modeling real situations under partial ignorance about whether a proposition is true or false. Known application areas include trust network modeling and decision support. However, to the best of our knowledge, the application of subjective logic in computer vision related domains has been limited to [46], which demonstrated the capability of default reasoning and bidirectional interpretation of conditional rules. In this section, we give a brief introduction to subjective logic theory.

Definition 8.1

(Opinion) [7] Let \(\varTheta =\{x,\overline{x}\}\) be a state space containing \(x\) and its complement \(\overline{x}\). Let \(b_x\), \(d_x\), \(i_x\) represent the belief, disbelief and ignorance in the truth of x satisfying the equation: \(b_x+d_x+i_x=1\) and let \(a_x\) be the base rate of \(x\) in \(\varTheta \). Then the opinion of an agent \(ag\) about \(x\), denoted by \(w^{ag}_x\), is the tuple \(w^{ag}_x = (b^{ag}_x,d^{ag}_x,i^{ag}_x,a^{ag}_x)\).

Definition 8.2

(Probability expectation) [7] Let \(w^{ag}_x = \{b^{ag}_x,d^{ag}_x,i^{ag}_x,a^{ag}_x\}\) be an opinion about the truth of x, then the probability expectation of \(w^{ag}_x\) is defined by: \(E(w^{ag}_x)=b^{ag}_x+a^{ag}_xi^{ag}_x\).
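Definitions 8.1 and 8.2 can be illustrated with a minimal Python sketch. The class name `Opinion`, its field names and the numeric values of `likely` are illustrative assumptions and are not part of the system described in this chapter; the vacuous opinion with base rate 0.7 corresponds to the full-ignorance case discussed with Fig. 8.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion (b, d, i, a) in the sense of Definition 8.1."""
    b: float  # belief
    d: float  # disbelief
    i: float  # ignorance
    a: float  # base rate

    def __post_init__(self):
        # Definition 8.1 requires b + d + i = 1 (checked up to rounding error).
        if abs(self.b + self.d + self.i - 1.0) > 1e-9:
            raise ValueError("b + d + i must equal 1")

    def expectation(self) -> float:
        """Probability expectation E = b + a * i (Definition 8.2)."""
        return self.b + self.a * self.i


# Full ignorance: the expectation falls back to the base rate.
vacuous = Opinion(b=0.0, d=0.0, i=1.0, a=0.7)
# Hypothetical, mostly believed opinion with some residual ignorance.
likely = Opinion(b=0.7, d=0.1, i=0.2, a=0.5)

print(vacuous.expectation(), likely.expectation())
```

With full ignorance the expectation equals the base rate, which is why a vacuous opinion can still appear biased towards ‘True’.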

Fig. 8.1 Opinion triangle and beta distribution (Colour figure online)

Opinions can be represented on a so-called opinion triangle, as shown in Fig. 8.1. A point inside the triangle represents a \( (b_x,d_x,i_x) \) triple. The corner points marked Belief, Disbelief and Ignorance represent the extreme cases, i.e., full belief \((1,0,0)\), full disbelief \((0,1,0)\) and no knowledge \((0,0,1)\). The base rate \(a_x\) represents the prior knowledge about the tendency of a given proposition to be true and can be indicated along the base line (the line connecting Belief and Disbelief). The probability expectation \(E\) is then formed by projecting the opinion onto the base line, parallel to the base rate projector line (see the blue line) that is built by connecting the \(a_x\) point with the Ignorance corner (see the red line). An interesting property of subjective opinions is their direct mapping to beta distributions. Beta distributions are normally denoted as \(Beta(\alpha ,\beta )\), where \(\alpha \) and \(\beta \) are the two parameters (\(\alpha \) represents the amount of positive observations and \(\beta \) the amount of negative observations about a crisp proposition). The beta distribution of an opinion \(w_x = (b_x,d_x,i_x,a_x)\) is the function \(Beta(\alpha ,\beta )\) where \(\alpha =2b_x/i_x+2a_x\) and \(\beta =2d_x/i_x+2(1-a_x)\). In Fig. 8.1, Example (1) shows an agent’s opinion about a proposition that can be interpreted as ‘seems likely and slightly uncertain true’, and Example (2) shows full ignorance (a.k.a. a vacuous opinion) about a proposition at the time of judgement. Assuming a base rate of 0.7 in the latter example, the expectation value is also 0.7, and the beta distribution appears biased towards ‘True’ even though the opinion represents full ignorance.
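The mapping to beta distributions can be checked numerically. The following Python sketch applies the two formulas above; the function names `opinion_to_beta` and `beta_mean` are assumptions introduced for illustration. It also shows that the mean of the resulting beta distribution, \(\alpha /(\alpha +\beta )\), coincides with the probability expectation \(b_x + a_x i_x\) of Definition 8.2.

```python
def opinion_to_beta(b, d, i, a):
    """Map an opinion (b, d, i, a) to the parameters of Beta(alpha, beta).

    Uses alpha = 2b/i + 2a and beta = 2d/i + 2(1 - a), which requires i > 0.
    """
    if i <= 0:
        raise ValueError("a dogmatic opinion (i = 0) has no finite parameters")
    alpha = 2.0 * b / i + 2.0 * a
    beta = 2.0 * d / i + 2.0 * (1.0 - a)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Vacuous opinion with base rate 0.7, as in Example (2) of Fig. 8.1.
alpha_vac, beta_vac = opinion_to_beta(0.0, 0.0, 1.0, 0.7)
# Hypothetical opinion with b = 0.7, d = 0.1, i = 0.2 and base rate 0.5.
alpha_ex, beta_ex = opinion_to_beta(0.7, 0.1, 0.2, 0.5)

print(alpha_vac, beta_vac, beta_mean(alpha_vac, beta_vac))
print(alpha_ex, beta_ex, beta_mean(alpha_ex, beta_ex))
```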

4 Modeling Uncertain Rule Using Subjective Logic

The proposed uncertain rule modeling approach relies mainly on a rule-based system that enables logic programming. The traditional rule-based system, which can only handle binary logic, is extended to allow the representation of uncertainty using subjective opinions and operators. For a given piece of propositional knowledge, we assume a fuzzy-like membership function that grades the degree of truth. We then observe that the interpretation of such a membership function can be dogmatic, so that when the function is projected onto the opinion space, it lies only on the bottom line of the opinion space. Moreover, in many cases the exact shape of the function is hard to determine. To address this, we introduce a reputational function that evaluates the trustworthiness of the fuzzy-like membership function. We then introduce the accumulation of the resulting opinions over time. In this section, we first give a brief overview of how rules are expressed in logic programming, followed by further details of the uncertain rule modeling.

4.1 Logic Programming

Logic programming mainly consists of two types of logical formulae, rules and facts. Rules are of the form \(A \leftarrow f_0,f_1,\ldots ,f_m\), where \(A\) is the rule head and the right-hand side is called the body. Each \(f_i\) is an atom and ‘,’ represents logical conjunction. Each atom is of the form \(p(t_1,t_2,\ldots ,t_n)\), where \(t_i\) is a term and \(p\) is a predicate symbol that takes \(n\) terms (i.e. arity \(n\)). Terms can be either variables or constant symbols. Negation is represented with the symbol \(\lnot \) such that ‘\(\mathrm{A}=\lnot \lnot \mathrm{A}\)’. Both positive and negative atoms are referred to as literals. Given a rule \(head \leftarrow body\), we interpret its meaning as IF body THEN head. Traditionally, the set of resolved facts that match a rule is called its extension. In the extensional approaches [11, 12, 10, 46] mentioned in Sect. 8.2, rules have been used to define and reason about various contextual events or activities.

4.2 Logic Programming Extended Using Subjective Logic

To extend logic programming with subjective logic, the CLIPS [23] rule engine was used as a basis to provide flexibility for defining complex data structure as well as for providing a rule resolving mechanism. To extend this system, a data structure opinion(agent,proposition,b,d,i,a) was defined that can be interpreted as a fact of arity 6 with the following terms, agent (opinion owner), proposition, belief, disbelief, ignorance, and atomicity. To represent propositions, we extended the structure so that it can take arity \(n\) properties as well. Therefore, given a predicate \(p\) the proposition can be described as \(p(a_1,a_2,\ldots ,a_n)\). In our system, therefore, each fact is represented as the form of \(w^{agent}_{p(a_1,a_2,\ldots ,a_n)}\). Namely, rules are defined with the opinion and proposition structure. Additionally, functions of subjective logic operators taking opinions as parameters were defined. In this way, uncertainty in the form of opinion triangle is attached to rules and facts. This aspect is depicted as follows:

Definition 8.3

(Opinion Assignment) Given a knowledge base \(\fancyscript{K}\) in the form of a declarative language and a Subjective Opinion Space \(O\), an opinion assignment over sentences \(k \in \fancyscript{K}\) is a function \(\phi : k \rightarrow O\) such that:

  1. \(\phi _{ fact}: \textit{Fact} \rightarrow O\), e.g. \(w^{a}_{p(a_1,a_2,\ldots ,a_n)} = (b,d,i,a)\)

  2. \(\phi _{ Rule}: \textit{Rule} \rightarrow O\), e.g. \((w^{a_c}_{p_c(a_{c1},\ldots ,a_{cn})} \leftarrow w^{a1}_{p_1(a_{11},\ldots ,a_{1n})},\ldots , w^{ai}_{p_n(a_{i1},\ldots ,a_{in})}) = (b,d,i,a)\)

  3. An assignment to the rule head that is calculated from the opinions in the rule body, where \(\bigcirc \!\!\!\!\!\!*\) indicates one of subjective logic’s operators. For example, for a given rule \(w^{a_c}_{p_c(a_{c1},\ldots ,a_{cn})} \leftarrow w^{a1}_{p_1(a_{11},\ldots ,a_{1n})},\ldots , w^{ai}_{p_n(a_{i1},\ldots ,a_{in})}\), the head opinion is \(w^{a_c}_{p_c(a_{c1},\ldots ,a_{cn})} = w^{a1}_{p_1(a_{11},\ldots ,a_{1n})} \bigcirc \!\!\!\!\!\!* \cdots \bigcirc \!\!\!\!\!\!* \; w^{ai}_{p_n(a_{i1},\ldots ,a_{in})}\).

  4. \(\phi _{ inference}\), denoted \(cl(\phi )\): \(q \rightarrow O\), where \(\fancyscript{K} \models q\), called Closure.

It is important to note that there are different ways of opinion assignment. While Definition 8.3—2 assigns an opinion to a whole rule sentence itself, Definition 8.3—3 assigns an opinion to the consequence part of the rule (rule head). The assigned opinion is functionally calculated out of opinions in the rule body using appropriate subjective logic operators. Definition 8.3—2 especially plays an important role for prioritizing or weighting rules for default reasoning [6]. Given the initial opinion assignment by Definition 8.3—1 and 2, the actual inference is performed by Definition 8.3—3 and 4, where Definition 8.3—4 is further defined as follows:

Definition 8.4

(Closure) Given a knowledge base \(\fancyscript{K}\) in form of declarative language and an opinion assignment \(\phi \), labeling every sentence \(k \in \fancyscript{K}\) into Subjective Opinion Space \(O\), then the closure over \(k \in \fancyscript{K}\), is the opinion assignment function \(cl(\phi )(q)\) that labels information \(q\) entailed by \(\fancyscript{K}\) (i.e. \( \fancyscript{K} \models q\)).

For example, if \(\phi \) labels sentences \(\{a,b,c \leftarrow a,b\} \in \fancyscript{K}\) as \(\phi _{ fact}(a)\), \(\phi _{ fact}(b)\) and \(\phi _{ Rule}(c \leftarrow a,b)\), then \(cl(\phi )\) should also label \(c\), as it is information entailed by \(\fancyscript{K}\). The assignment can be principled by the definition of closure. For example, an opinion assignment to \(c\) in a simple conjunctive sense can be \(\phi _{ fact}(a) \cdot \phi _{ fact}(b) \cdot \phi _{ Rule}(c \leftarrow a, b)\), where \(\cdot \) represents conjunction in subjective logic. In our system, to support the rich set of subjective logic operators, we specify Definition 8.3—3 in the rule description as follows (note that most rule-based systems also support describing actions in the head part of a rule):

(8.1)

Due to the redundancy that arises when describing rules at the opinion structure level, we will use abbreviated rule formulae as follows:

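$$\begin{aligned} w^{a_c}_{p_c(a_{c1},\ldots ,a_{cn})} \leftarrow w^{a1}_{p_1(a_{11},\ldots ,a_{1n})} \bigcirc \!\!\!\!\!\!* \cdots \bigcirc \!\!\!\!\!\!* \; w^{ai}_{p_n(a_{i1},\ldots ,a_{in})} \end{aligned}$$
(8.2)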

where \(\bigcirc \!\!\!\!\!\!*\) indicates one of subjective logic’s operators. By representing rules in this way, we can build propositional rules that take opinions about predicates as facts, check the existence of the involved opinions by logical conjunction, and finally define the resulting predicate with an attached opinion whose values are calculated with subjective logic operators. To realize this concept, a prototype system integrating binary logic programming and subjective logic calculus has been implemented on top of the CLIPS [23] rule engine.
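The closure example above, in which \(c\) is labeled from \(\phi _{ fact}(a)\), \(\phi _{ fact}(b)\) and \(\phi _{ Rule}(c \leftarrow a, b)\), can be sketched in Python as follows. The sketch is not the CLIPS prototype. It assumes a simplified conjunction in which beliefs multiply and the probability expectation of the result is the product of the operand expectations; the conjunction operator actually used in the system may distribute ignorance and base rate differently. All names and numeric values are hypothetical.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    b: float  # belief
    d: float  # disbelief
    i: float  # ignorance
    a: float  # base rate


def expectation(w: Opinion) -> float:
    """Probability expectation E = b + a * i."""
    return w.b + w.a * w.i


def conj(x: Opinion, y: Opinion) -> Opinion:
    """Simplified conjunction (an assumption of this sketch).

    Keeps b + d + i = 1 and makes E(x AND y) = E(x) * E(y).
    """
    b = x.b * y.b
    d = x.d + y.d - x.d * y.d
    i = x.b * y.i + x.i * y.b + x.i * y.i
    if i > 0:
        a = (x.b * y.i * y.a + y.b * x.i * x.a + x.i * x.a * y.i * y.a) / i
    else:
        a = x.a * y.a  # both operands dogmatic: base rate no longer matters
    return Opinion(b, d, i, a)


# Hypothetical opinion assignments phi_fact(a), phi_fact(b), phi_Rule(c <- a, b).
phi_a = Opinion(0.9, 0.0, 0.1, 0.5)
phi_b = Opinion(0.8, 0.1, 0.1, 0.5)
phi_rule = Opinion(0.9, 0.0, 0.1, 0.5)

# Closure: c is entailed by the knowledge base, so it receives an opinion too.
phi_c = conj(conj(phi_a, phi_b), phi_rule)
print(phi_c, expectation(phi_c))
```

Other operators can be substituted for `conj` in the same position, which is the role played by \(\bigcirc \!\!\!\!\!\!*\) in the abbreviated rule form.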

4.3 Uncertain Propositional Rules

In logic programming, a conditional proposition \(y \leftarrow x\) is interpreted as IF \(x\) THEN \(y\). However, there are often cases in which we may want to interpret the meaning as ‘the more \(x\), then the more \(y\)’ or ‘the more \(x\), then the less \(y\)’, etc. In such cases, the opinion attached to the consequence of the rule should be a function of the elements within the rule body. The opinion assignment suited to this interpretation is therefore Definition 8.3—3. In terms of the intrinsic linguistic uncertainty of the rule, it resembles the fuzzy rules of Anderson et al. [13, 21]. In that work, quantitative low-level features of human detection results such as ‘centroid’, ‘eigen-based height’ and ‘ground plane normal similarity’ are linguistically mapped into non-crisp premises (i.e. fuzzy sets) such as ‘(H)igh’, ‘(M)edium’, ‘(L)ow’ and ‘(V)ery Low’. Fuzzy rules then define conjunctive combinations of those linguistic symbols to derive higher-level semantics such as ‘Upright’, ‘In Between’ and ‘On the ground’ (e.g. \(Upright(L) \leftarrow Centroid(H), EigenHeight(M), Similarity(H)\) [13]). Introducing appropriate fuzzy membership functions for each linguistic term, and handling those membership functions properly, is therefore an important issue. In this respect, Mizumoto et al. [24] compared sophisticated mathematical treatments of ambiguous concepts such as ‘more or less’ with various shapes. Another point worth noting about fuzzy logic is that, although there are Zadeh’s original logical operators, there are also other ways of defining logical operators. For example, for two given quantitative variables \(x\) and \(y\) with corresponding membership functions \(\mu _a\) and \(\mu _b\), Zadeh’s AND operator is defined as \(x\) AND \(y = min(\mu _a(x),\mu _b(y))\). In so-called ‘t-norm fuzzy logic’, any t-norm can be considered as an AND operator.
For example, when the product t-norm is used, the AND operator can be defined as \(x\) AND \(y = \mu _a(x) \cdot \mu _b(y)\) [25]. This aspect remains controversial among most statisticians, who prefer Bayesian logic [26]. In contrast, as explained in Sect. 8.3, subjective logic can be interpreted in a Bayesian sense, and the final quantitative opinion space can also be interpreted in the sense of fuzziness (i.e. ‘very certainly true’, ‘less certainly true’, etc.). In this way, we believe that subjective logic can better bridge the interpretation of fuzzy intuitive concepts with a Bayesian sense. The basic idea of our approach is as follows:

  1. For a given propositional rule ‘the less (more) \(y\) \(\leftarrow \) the more \(x\)’ we could introduce a membership-like function \(\mu _i : x \rightarrow y\).

  2. It is clear that the function \(\mu _i\) should be monotonically decreasing (increasing), but its exact shape is not clear.

  3. Considering the multiple membership-like functions \(\mu _i\) that are potentially possible, the values of \(\mu _i(x)\) at the two extreme points of \((min_x \le x \le max_x)\) tend to converge, whereas the values in between diverge. The values in between are therefore more uncertain.

  4. Considering the aspect of 3, we introduce a so-called reputational opinion function on the function \(\mu _i\) and combine it with the raw opinion obtained from \(\mu _i\) using subjective logic’s reputation operator.

Fig. 8.2 Uncertain rule modeling using subjective logic’s reputation operator

This idea is depicted in Fig. 8.2, where the actual reputational operation is defined as follows:

Definition 8.5

(Reputation) [27] Let \(A\) and \(B\) be two agents where \(A\)’s opinion about \(B\)’s recommendations is expressed as \(w^A_B=\{b^A_B,d^A_B,u^A_B,a^A_B\}\), and let \(x\) be a proposition where \(B\)’s opinion about \(x\) is recommended to \(A\) with the opinion \(w^B_x=\{b^B_x,d^B_x,u^B_x,a^B_x\}\). Let \(w^{A:B}_x=\{b^{A:B}_x,d^{A:B}_x,u^{A:B}_x,a^{A:B}_x\}\) be the opinion such that:

$$\begin{aligned}\left\{ \begin{array}{ll} b^{A:B}_x=b^A_B b^B_x&d^{A:B}_x=b^A_B d^B_x\\ u^{A:B}_x=d^A_B + u^A_B + b^A_B u^B_x&a^{A:B}_x=a^B_x \end{array}\right. \end{aligned}$$

then \(w^{A:B}_x\) is called the reputation opinion of \(A\). By using the symbol \(\otimes \) to designate this operation, we get \(w^{A:B}_x=w^A_B \otimes w^B_x\).
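The reputation operator can be sketched in a few lines of Python. The sketch follows Jøsang’s uncertainty-favouring discounting, in which the discounted disbelief is \(b^A_B d^B_x\), so that \(b^{A:B}_x+d^{A:B}_x+u^{A:B}_x=1\) is preserved. The function name `discount` and the numeric values are illustrative assumptions.

```python
def discount(trust, recommended):
    """Reputation (discounting) operator: w^{A:B}_x = w^A_B (x) w^B_x.

    Both arguments are (b, d, u, a) tuples. `trust` is A's opinion about
    B's recommendations, `recommended` is B's opinion about x.
    """
    b_t, d_t, u_t, _ = trust
    b_r, d_r, u_r, a_r = recommended
    b = b_t * b_r              # belief survives only as far as B is trusted
    d = b_t * d_r              # the same holds for disbelief
    u = d_t + u_t + b_t * u_r  # distrust and ignorance about B become uncertainty
    return (b, d, u, a_r)      # the base rate of x is taken from B


# Hypothetical values: A largely trusts B, and B mostly believes x.
w_A_B = (0.8, 0.1, 0.1, 0.5)
w_B_x = (0.6, 0.3, 0.1, 0.5)
w_AB_x = discount(w_A_B, w_B_x)
print(w_AB_x)
```

Full trust `(1, 0, 0, a)` returns B’s opinion unchanged, while a vacuous trust opinion `(0, 0, 1, a)` yields a vacuous result regardless of what B recommends.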

For the actual evaluation of a given function \(\mu _i\), an opinion assignment function on the given \(\mu _i\) needs to be defined. Although such a function could be defined in other ways, in our approach it is modeled as follows:

$$\begin{aligned} w^{reput^{\mu _i(x)}}_{\mu _i(x)} = \left\{ \begin{array}{l} b_x=k+4(1-k)\left( \mu _i(x)-\frac{1}{2}\right) ^2 \\ d_x=\frac{1-b_x}{Dratio} \\ u_x=1-b_x-d_x \end{array} \right. \end{aligned}$$
(8.3)

where \(k\) represents the minimum boundary of belief about the value from \(\mu _i(x)\), and \(Dratio\) indicates the ratio for dividing the residue \(1-b_x\) between disbelief and uncertainty. This is depicted in Fig. 8.2d.
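Equation (8.3) can be evaluated directly. In the Python sketch below, the function name `reputation_opinion` and the default values \(k=0.5\) and \(Dratio=2\) are assumptions chosen only for illustration; the sketch additionally assumes \(0 \le k \le 1\) and \(Dratio \ge 1\), so that all three components stay within \([0,1]\).

```python
def reputation_opinion(mu, k=0.5, dratio=2.0):
    """Reputational opinion (b, d, u) about a membership value mu, Eq. (8.3).

    k      : minimum belief, reached at mu = 0.5
    dratio : ratio used to split the residue 1 - b into disbelief and uncertainty
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    if not 0.0 <= k <= 1.0 or dratio < 1.0:
        raise ValueError("expected 0 <= k <= 1 and dratio >= 1")
    b = k + 4.0 * (1.0 - k) * (mu - 0.5) ** 2
    d = (1.0 - b) / dratio
    u = 1.0 - b - d
    return b, d, u


# Belief in the membership value is highest at the extremes and lowest at 0.5.
for mu in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(mu, reputation_opinion(mu))
```

At \(\mu _i(x)=0\) and \(\mu _i(x)=1\) the belief is 1, and at \(\mu _i(x)=\frac{1}{2}\) it drops to \(k\), which reflects the observation that membership values in the middle of the range are the least reliable.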

5 Case Study I

5.1 Scenario Setting for Case Study

At this stage, we focused on evaluating the modeling approach itself rather than the reliability of the person detection algorithm. Therefore, we manually annotated a test video from one of the i-LIDS [28] data samples with ground truth metadata for human detection, comprising bounding boxes and timing information (shown in Fig. 8.3). In total, one minute of test video containing six people was annotated. For our purposes, we intentionally marked one person as the suspect. We then encoded the following linguistic contextual knowledge according to the approach explained in Sect. 8.4. (1) (At least) two distinct people are required for building a relationship. (2) The closer the distance between two people is, the higher is the chance that they can identify each other. (3) If two persons approach each other directly (face-to-face), then there is a higher chance that they can identify each other. We then calculate subjective opinions between the person marked as the suspect and the other human instances over time.

5.2 Uncertainty Modeling

5.2.1 Distance

The distance between a pair of people is one of the typical clues for reasoning about whether one person could serve as a witness of another. This relates to the general human knowledge that ‘the closer two people are, the higher the chance that they perceive each other’. Humans are well adapted to operating on this type of uncertain and ambiguous knowledge. Modeling such a relation exactly is not trivial, but we can approximate it with a monotonically decreasing function describing the possibility of perceiving each other. This is depicted as three possible curves in the middle of Fig. 8.4a, where \(x\) represents the distance between the persons as calculated from the person detection metadata, \(\mu _i\) represents the likelihood that two persons at this distance would perceive each other, \(maxdist\) is the maximum possible (i.e. diagonal) distance in a frame, and \(a_i\) is the estimated probability that two humans could have recognized each other at the distance \(maxdist\). However, the value derived from such a function is not fully reliable, owing to the variety of the real world, uncertainty about the correctness of the function and uncertainty in the distance value itself. With respect to distance, it is clear that the two extreme cases, i.e. very close or very far, are much more certain than the middle of the range. Thus, to better model the real-world situation, the reputational opinion function needs to be applied to any chosen function \(\mu _i\). This is modeled as an opinion on the reliability of \(\mu _i(x)\) by applying Eq. (8.3). In order to evaluate the impact of choosing different functions in Fig. 8.4a, three different types of \(\mu _i\) functions (concave, convex and linear) have been applied. The derived reputational opinions behaved similarly, with peaks of certain belief at each extreme case, as shown in Fig. 8.5.
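The distance-based modeling can be sketched end to end in Python. The exact curves of Fig. 8.4a are not reproduced here; the family \(\mu (x) = 1-(1-a)(x/maxdist)^p\) is a hypothetical stand-in that is linear for \(p=1\), concave for \(p>1\) and convex for \(p<1\). The raw opinion taken from \(\mu \) is assumed to be dogmatic, \((\mu , 1-\mu , 0)\), and it is discounted by the reputational opinion of Eq. (8.3). All function names and parameter defaults are assumptions.

```python
def mu_dist(x, maxdist, a=0.0, p=1.0):
    """Hypothetical monotonically decreasing membership-like function.

    Returns 1 at distance 0 and a at distance maxdist.
    """
    t = min(max(x / maxdist, 0.0), 1.0)  # normalised distance, clamped to [0, 1]
    return 1.0 - (1.0 - a) * t ** p


def reputation_opinion(mu, k=0.5, dratio=2.0):
    """Reputational opinion (b, d, u) about the value mu, following Eq. (8.3)."""
    b = k + 4.0 * (1.0 - k) * (mu - 0.5) ** 2
    d = (1.0 - b) / dratio
    return b, d, 1.0 - b - d


def witness_opinion_from_distance(x, maxdist, a=0.0, p=1.0, k=0.5, dratio=2.0):
    """Discount the dogmatic raw opinion (mu, 1 - mu, 0) by its reputation."""
    mu = mu_dist(x, maxdist, a, p)
    b_r, d_r, u_r = reputation_opinion(mu, k, dratio)
    # Discounting: b = b_r * mu, d = b_r * (1 - mu), u = d_r + u_r + b_r * 0
    return b_r * mu, b_r * (1.0 - mu), d_r + u_r


maxdist = 100.0  # hypothetical frame diagonal, in pixels
for x in (0.0, 50.0, 100.0):
    print(x, witness_opinion_from_distance(x, maxdist))
```

With these assumed settings the resulting opinion is certain at both extremes and most uncertain in the middle of the range, which is the qualitative behavior described above.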

Fig. 8.3 Scenario setting for case study I

Fig. 8.4 Candidate uncertainty functions regarding distance and direction

5.2.2 Direction

Similarly, we also used direction information between two persons. The linguistic knowledge to be modeled is ‘if two persons approach each other directly (face-to-face), then there is a higher chance of perceiving each other’. The corresponding direction-based relevance function is shown in Fig. 8.4b, where \(\varTheta \) represents the angle between the persons’ heading directions as calculated from the person detection metadata, \(\mu _i\) represents the likelihood that two persons at this angle would perceive each other, and \(a_i\) is the expected minimum probability that two humans could have recognized each other at any angle. Again, however, the trustworthiness of the values from such functions \(\mu _i\) is uncertain, especially in the middle range of \(\varTheta \). To roughly model this aspect, the same reputational function from Eq. (8.3) was applied to the chosen function \(\mu _i(\varTheta )\). The impact of choosing different \(\mu _i\) was similar to that observed for the distance-based opinions shown in Fig. 8.5.

Fig. 8.5 Samples of reputational opinion according to distance and Eq. (8.3)

5.3 Rule Encoding

In addition to the uncertainty modeling, logic programming is used to represent the given contextual rules, as explained in Sect. 8.4.2. The encoded rules, in the form of Eq. (8.2), are as follows:

$$\begin{aligned} w^{Rule1}_{witness(H_1,H_2,T_1)} \leftarrow&\;\Big (w^{Human\_Detector}_{human(H_1,T_1)}\wedge w^{Human\_Detector}_{human(H_2,T_1)}\Big )\nonumber \\&\; \otimes \Big (w^{\mu _{dist}(d)}_{witness(H_1,H_2,T_1)} \otimes w^{reput^{\mu _{dist}(d)}}_{\mu _{dist}(d)}\Big ). \end{aligned}$$
(8.4)
$$\begin{aligned} w^{Rule2}_{witness(H_1,H_2,T_1)} \leftarrow&\;\Big (w^{Human\_Detector}_{human(H_1,T_1)}\wedge w^{Human\_Detector}_{human(H_2,T_1)}\Big )\nonumber \\&\; \otimes \Big (w^{\mu _{dir}(d)}_{witness(H_1,H_2,T_1)} \otimes w^{reput^{\mu _{dir}(d)}}_{\mu _{dir}(d)}\Big ). \end{aligned}$$
(8.5)
$$\begin{aligned} w^{Rule3}_{witness(H_1,H_2,T_1)} \leftarrow \Big (w^{Rule1}_{witness(H_1,H_2,T_1)} \wedge w^{Rule2}_{witness(H_1,H_2,T_1)}\Big ). \end{aligned}$$
(8.6)
$$\begin{aligned} w^{Rule4}_{witness(H_1,H_2,T_n)} \leftarrow \oplus _{i=1}^{n} w^{Rule3}_{witness(H_1,H_2,T_i)}. \end{aligned}$$
(8.7)

The first rule (8.4) starts with the necessary condition that there must be a distinct pair of two people. Therefore the conjunction operation \(\wedge \) on two opinions [29] is used, which is very similar to the operation \(P(A) \cdot P(B)\) except that in subjective logic the opinion can additionally represent ignorance. Then, for the resulting set of frames, the reputational opinion about the distance opinions is calculated as described in Sect. 8.5.2. Each result is assigned to a new opinion with a predicate of the appropriate arity, labeled with the name of the agent and carrying the final belief values. In this case, the final opinion value represents an opinion, held by an agent named \(Rule1\), about two persons being potential witnesses of each other. The second rule (8.5) is almost the same as rule (8.4); the only difference is that the reputational opinion is about direction. The third rule (8.6) combines the evidence coming from rules (8.4) and (8.5). The conjunction operator \(\wedge \) is used to reflect that, for a reliable positive resulting opinion, both pieces of evidence should have appeared with a certain amount of belief. The last rule (8.7) accumulates the belief over time using the consensus operator \(\oplus \), which is defined as follows:

Definition 8.6

(Consensus) [30] Let \(w^{A}_{x}=(b^{A}_{x},d^{A}_{x},i^{A}_{x},a^{A}_{x})\) and \(w^{B}_{x}=(b^{B}_{x},d^{B}_{x},i^{B}_{x},a^{B}_{x})\) be opinions respectively held by agents \(A\) and \(B\) about the same state \(x\), and let \(k=i^{A}_{x}+i^{B}_{x}-i^{A}_{x}i^{B}_{x}\). When \(i^{A}_{x}, i^{B}_{x} \rightarrow 0\), the relative dogmatism between \(w^{A}_{x}\) and \(w^{B}_{x}\) is defined by \(\gamma \) so that \(\gamma = i^{B}_{x} / i^{A}_{x}\). Let \(w^{A,B}_{x}=(b^{A,B}_{x},d^{A,B}_{x},i^{A,B}_{x},a^{A,B}_{x})\) be the opinion such that:

$$\begin{aligned}&k \ne 0 : \left\{ \begin{array}{l} b^{A,B}_{x} = ( b^{A}_{x} i^{B}_{x} +b^{B}_{x} i^{A}_{x} ) / k \\ d^{A,B}_{x} = ( d^{A}_{x} i^{B}_{x} +d^{B}_{x} i^{A}_{x} ) / k \\ i^{A,B}_{x} = ( i^{A}_{x} i^{B}_{x} ) /k \\ a^{A,B}_{x} = \frac{ a^{A}_{x} i^{B}_{x} + a^{B}_{x} i^{A}_{x}- (a^{A}_{x} + a^{B}_{x}) i^{A}_{x} i^{B}_{x} }{ i^{A}_{x} +i^{B}_{x} - 2 i^{A}_{x} i^{B}_{x} }\end{array}\right.\nonumber\\& k = 0 :\left\{ \begin{array}{l} b^{A,B}_{x} = \frac{\gamma b^{A}_{x} +b^{B}_{x} }{\gamma + 1} \\ d^{A,B}_{x} = \frac{\gamma d^{A}_{x} + d^{B}_{x} }{\gamma + 1} \\ i^{A,B}_{x} = 0 \\ a^{A,B}_{x} =\frac{\gamma a^{A}_{x} + a^{B}_{x} }{\gamma +1 } .\\ \end{array}\right. \end{aligned}$$

Then \(w^{A,B}_{x}\) is called the consensus opinion between \(w^{A}_{x}\) and \(w^{B}_{x}\), representing an imaginary agent \([A,B]\)’s opinion about \(x\), as if that agent represented both \(A\) and \(B\). By using the symbol \(\oplus \) to designate this operator, we define \(w^{A,B}_{x}=w^{A}_{x} \oplus w^{B}_{x}\).
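A minimal Python sketch of the consensus operator may help to make Definition 8.6 concrete. It takes the atomicity term in its symmetric form \(a^{A}_{x} i^{B}_{x} + a^{B}_{x} i^{A}_{x}\), mirroring the belief and disbelief lines. The names `Opinion`, `consensus` and `accumulate` are hypothetical, the caller must supply \(\gamma \) for the dogmatic case \(k = 0\), and averaging the atomicities when both opinions are vacuous is an assumption of this sketch rather than part of the definition.

```python
from functools import reduce
from typing import Iterable, NamedTuple


class Opinion(NamedTuple):
    b: float        # belief
    d: float        # disbelief
    i: float        # ignorance
    a: float = 0.5  # relative atomicity


def consensus(wa: Opinion, wb: Opinion, gamma: float = 1.0) -> Opinion:
    """Consensus of two opinions about the same state (Definition 8.6)."""
    k = wa.i + wb.i - wa.i * wb.i
    if k == 0.0:
        # Both opinions are dogmatic (i = 0); gamma is the relative dogmatism.
        g = gamma
        return Opinion((g * wa.b + wb.b) / (g + 1.0),
                       (g * wa.d + wb.d) / (g + 1.0),
                       0.0,
                       (g * wa.a + wb.a) / (g + 1.0))
    b = (wa.b * wb.i + wb.b * wa.i) / k
    d = (wa.d * wb.i + wb.d * wa.i) / k
    i = (wa.i * wb.i) / k
    denom = wa.i + wb.i - 2.0 * wa.i * wb.i
    if denom == 0.0:
        # Only reachable when both opinions are vacuous (i = 1).
        # Assumption of this sketch: fall back to the mean atomicity.
        a = (wa.a + wb.a) / 2.0
    else:
        a = (wa.a * wb.i + wb.a * wa.i - (wa.a + wb.a) * wa.i * wb.i) / denom
    return Opinion(b, d, i, a)


def accumulate(opinions: Iterable[Opinion]) -> Opinion:
    """Fold the consensus operator over per-frame opinions, as in rule (8.7)."""
    return reduce(consensus, opinions)
```

Folding the operator over a sequence of per-frame opinions, as `accumulate` does, reduces the ignorance component with every additional frame that carries some evidence, which is the intended effect of accumulating belief over time.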

Figure 8.6 shows a graphical representation of the rules in a tree form.

Fig. 8.6 Tree representation of rules

5.4 Experimental Result

Using the rules described in Sect. 8.5.3, we calculated subjective opinions between a person marked as suspect and the other human instances over time. Figure 8.7 shows a snapshot of the visualization in the prototype, comprising a video player and an opinion visualizer. While the video is being played, the corresponding metadata is transformed into the corresponding opinion representation. The translated opinions are fed into the rule engine, which automatically evaluates the rules. The right part of Fig. 8.7 shows the opinion about the proposition ‘human 5 is a witness for the suspects marked red’ and its corresponding mapping to a beta distribution. To verify these results, a questionnaire was prepared to collect scores about the witnessing chances for each of the ‘pairs’ in the scene (e.g. human1 and suspect, human2 and suspect, etc.). Seven people from our lab took part in the questionnaire. Then, by changing the uncertainty functions on the uncertain rules, we tested the behavior of the proposed approach to check whether it models human intuition well. Although there are 9 possible combinations of uncertainty functions (i.e. 3 distance functions and 3 direction functions), to better contrast the impact of changing such uncertainty functions we fixed the direction function to the type \(\mu _3\) defined in Fig. 8.4b and tested with the 3 different distance functions shown in Fig. 8.4a. Then the mean, standard deviation, \(min\) and \(max\) of the ‘human opinions’ were calculated and compared to the computed results. According to [7], the following criteria should be applied to the computed results.

  1. The opinion with the greatest probability expectation is the greatest opinion.
  2. The opinion with the least uncertainty is the greatest opinion.
  3. The opinion with the least relative atomicity is the greatest opinion.
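The three criteria can be read as a lexicographic sort key. The sketch below assumes that they are applied in priority order (expectation first, the other two only as tie-breakers) and uses the probability expectation \(E = b + a \cdot i\); the names `Opinion`, `expectation` and `rank` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Opinion(NamedTuple):
    b: float        # belief
    d: float        # disbelief
    i: float        # ignorance (uncertainty)
    a: float = 0.5  # relative atomicity


def expectation(w: Opinion) -> float:
    """Probability expectation of an opinion."""
    return w.b + w.a * w.i


def rank(opinions: Dict[str, Opinion]) -> List[str]:
    """Names ordered from the greatest to the least opinion.

    Sort key: highest expectation, then least uncertainty,
    then least relative atomicity.
    """
    return sorted(
        opinions,
        key=lambda name: (-expectation(opinions[name]),
                          opinions[name].i,
                          opinions[name].a),
    )
```

With only a handful of candidate pairs, ties in the expectation are unlikely, so the second and third keys rarely come into play.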

Fig. 8.7 Visualization of the experiment

Fig. 8.8 Experimental result

In the described experiment, due to the small number of possible pairs, only the first criterion was applied, and the final expectation values of each opinion for the candidate pairs were plotted jointly with the questionnaire-based result, as shown in Fig. 8.8. The final result follows the tendency of the questionnaire-based human ‘opinions’. Changing the uncertainty function does not seem to introduce critical differences. However, the differences between the expectation values were larger when the final expectation values were low; for instance, although the difference was slight, \(\mu _3\) tended to yield larger expectation values than \(\mu _2\) and \(\mu _1\). The differences were smaller when the final expectation values were higher. In all cases, however, the ranking of the witnesses was the same. Therefore, in the sense of human-like reasoning, the proposed approach seems to model human intuition well.

6 Case Study II

In this section, we further explore the proposed case study scenario for more complex contextual forensic reasoning. In particular, we consider a situation that needs to be modeled in the sense of so-called default reasoning [2].

6.1 Scenario Setting for Case Study II

Consider a conceptual scenario in which security personnel want to get suggestions for the most probable witnesses of a selected suspect in a scene. Assuming that automatic vision analytics are running and extracting basic semantics, we will also assume two virtual situations as shown in Fig. 8.9, where witnesses are reasoned about according to the uncertain spatio-temporal rules demonstrated in Sect. 8.5. In all situations we will assume that ‘witness2’ has a higher opinion than ‘witness1’. In addition to this, we will assume optional cases in which additional evidential cues are detected. In Fig. 8.9a, ‘witness2’ is talking on the phone. In Fig. 8.9b, the optional case is the detection of the license plate of a car that seems to belong to ‘witness1’, while ‘witness2’ comes with face detection.

Fig. 8.9 Scenario setting for case study II

6.2 Reasoning Examples

Given the scenario with optional cases, we will also assume that (1) people usually do not recognize others well when they are talking on the phone, (2) an identifiable witness is a good witness, and (3) a license plate is a better source of identification than face detection, because the personal information of the owner can easily be retrieved from it. Therefore, under the optional assumptions, ‘witness1’ should be the better witness in Fig. 8.9a, and ‘witness1’ should also be suggested as the better witness in Fig. 8.9b. This kind of nonmonotonic reasoning under inconsistent information is called default reasoning and is defined as follows:

Definition 8.7

(Default theory) [2] Let \(\varDelta =(D,W)\) be a default theory, where W is a set of logical formulae (rules and facts) also known as the definite rules and D is a set of default rules of the form \(\frac{\alpha :\beta }{\gamma }\), where \(\alpha \) is known as the precondition, \(\beta \) is known as the justification and \(\gamma \) is known as the conclusion.

Han et al. [6] showed that this aspect can also be modeled using subjective logic under the opinion assignment defined in Definition 8.3 in Sect. 8.4.2. Here, it is important to note that, unlike the case of uncertain rule modeling, the type of opinion assignment used for prioritization belongs to item 2 of Definition 8.3 and the default inference scheme belongs to item 4 of Definition 8.3. As shown in [6], we set \(T \simeq (1,0,0)\) (full truth), \(DT_1 \simeq (0.5,0,0.5)\) (weak default true), \(DT_2 \simeq (0.8,0,0.2)\) (strong default true), \(F \simeq (0,1,0)\) (full false), \(DF_1 \simeq (0,0.5,0.5)\) (weak default false), \(DF_2 \simeq (0,0.8,0.2)\) (strong default false), \(*\simeq (0.33,0.33,0.34)\) (contradiction), \(U \simeq (0,0,1)\) (full uncertainty) and \(\bot \simeq (0.5,0.5,0)\) (full contradiction). For the remaining truth values we will use the opinion triple representation (b,d,i). The default inference scheme using subjective logic is as follows:

Definition 8.8

(\({Default\;inference}_{sl}\)) [6] Given a query sentence \(q\) and given \(S\) and \(S^{\prime }\) that are sets of sentences such that \(S \models q\) and \(S^{\prime } \models \lnot q\), then the default inference is the truth value assignment closure \(cl_{sl_{di}}(\phi )(q)\) given by:

$$\begin{aligned} cl_{sl_{di}}(\phi )(q)=\underset{S \models q}{\bigoplus } u \sqcup \left[\underset{p \in S}{\prod }cl_{sl} (\phi )(p)\right] \oplus \lnot \underset{S^{\prime } \models \lnot q}{\bigoplus } u \sqcup \left[\underset{p \in S^{\prime }}{\prod }cl_{sl}(\phi )(p)\right]. \end{aligned}$$
(8.8)

Example 1

(Witness talking on the phone) Assume the following set of rules for determining a good witness, including the uncertain spatio-temporal relation based witness reasoning rule described in Sect. 8.5.3. Also assume the following opinion assignment, in which \(witness2\) (denoted as \(wit\_2\)) has a higher opinion of being the witness than \(witness1\) (denoted as \(wit\_1\)).

$$\begin{aligned} \phi _{Rule} \left[w^{Rule4}_{witness(H_1)} \leftarrow \oplus _{i=1}^{n} w^{Rule3}_{witness(H_1,H_2,T_i)} \right]&= DT_1. \nonumber \\ \phi _{Rule}\left[\lnot w_{witness(H_1)} \leftarrow w_{talking\_on\_phone(H_1)} \right]&= DT_2. \nonumber \\ \phi _{RuleEval} \left[w^{Rule4}_{witness(wit\_1)}\right]&= (0.6,0.15,0.25). \nonumber \\ \phi _{RuleEval}\left[w^{Rule4}_{witness(wit\_2)}\right]&= (0.7,0.10,0.20). \end{aligned}$$

Given the two default-true and default-false rules, and the facts that can be regarded as definitely true, the inference for determining the better witness using default logic with subjective logic is as follows.

$$\begin{aligned} cl_{sl_{di}}&(\phi )(w_{witness(wit\_1)}) = [U \sqcup ( (0.6,0.15,0.25) \cdot DT_1)] .\nonumber \\&= [U \sqcup (0.44,0.15,0.41)] = (0.44,0.15,0.41) \sim (Expectation = 0.54) .\nonumber \\ cl_{sl_{di}}&(\phi )(w_{witness(wit\_2)}) = [U \sqcup ( (0.7,0.10,0.20) \cdot DT_1)] .\nonumber \\&= [U \sqcup (0.50,0.10,0.40)] = (0.50,0.10,0.40) \sim (Expectation = 0.60) . \end{aligned}$$
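The products above can be checked numerically. The sketch below assumes that ‘\(\cdot \)’ denotes the binomial multiplication (conjunction) operator of subjective logic and that every operand carries a relative atomicity of 0.5. Neither assumption is stated explicitly here, but together they reproduce the printed triples and expectation values to two decimals. The names `Opinion`, `multiply` and `expectation` are hypothetical.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    b: float        # belief
    d: float        # disbelief
    i: float        # ignorance
    a: float = 0.5  # relative atomicity (assumed default)


def multiply(x: Opinion, y: Opinion) -> Opinion:
    """Binomial multiplication (conjunction) of two independent opinions.

    Requires x.a * y.a < 1, otherwise the denominator vanishes.
    """
    denom = 1.0 - x.a * y.a
    b = x.b * y.b + ((1.0 - x.a) * y.a * x.b * y.i
                     + x.a * (1.0 - y.a) * x.i * y.b) / denom
    d = x.d + y.d - x.d * y.d
    i = x.i * y.i + ((1.0 - y.a) * x.b * y.i
                     + (1.0 - x.a) * x.i * y.b) / denom
    return Opinion(b, d, i, x.a * y.a)


def expectation(w: Opinion) -> float:
    """Probability expectation E = b + a * i."""
    return w.b + w.a * w.i


DT1 = Opinion(0.5, 0.0, 0.5)  # weak default true
DT2 = Opinion(0.8, 0.0, 0.2)  # strong default true

wit1 = multiply(Opinion(0.6, 0.15, 0.25), DT1)
wit2 = multiply(Opinion(0.7, 0.10, 0.20), DT1)
print([round(v, 2) for v in wit1[:3]], round(expectation(wit1), 2))
print([round(v, 2) for v in wit2[:3]], round(expectation(wit2), 2))
```

Note that the disbelief component is unaffected by multiplying with a default-true rule opinion, because the rule opinions carry zero disbelief; the rule weight only shifts mass from belief to ignorance.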

The above result shows that, given the weak rules, ‘witness2’ is a more probable witness candidate than ‘witness1’. Now let us consider a weak opinion assignment to the additional contextual cue that witness2 is using the phone. This can be interpreted as ‘the witness seems to be using a phone, but this is not quite certain’.

$$\begin{aligned} \phi _{fact}[w_{talking\_on\_phone(wit\_2)}] = (0.6,0.15,0.25). \end{aligned}$$

Given the additional information, the inference on witness2 being a witness is as follows.

$$\begin{aligned} cl_{sl_{di}}&(\phi )(w_{witness(wit\_2)}) \nonumber \\&= [U \sqcup ( (0.7,0.10,0.20) \cdot DT_1)] \oplus \lnot [U \sqcup ( (0.6,0.15,0.25) \cdot DT_2)] \nonumber \\&= [U \sqcup (0.50,0.10,0.40)] \oplus \lnot [U \sqcup (0.59,0.15,0.26)] \nonumber \\&= (0.50,0.10,0.40) \oplus \lnot (0.59,0.15,0.26)\nonumber \\&= (0.50,0.10,0.40) \oplus (0.15,0.59,0.26)\nonumber \\&= (0.34,0.47,0.19) \sim (Expectation = 0.39). \end{aligned}$$
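The last step of this derivation can be expanded with Definition 8.6, taking \(w^{A}=(0.50,0.10,0.40)\) and \(w^{B}=(0.15,0.59,0.26)\). The values below are rounded, and the expectation assumes a relative atomicity of \(a = 0.25\) for the combined opinion, which is inferred from the printed expectation rather than stated in the text:

$$\begin{aligned} k&= 0.40 + 0.26 - 0.40 \cdot 0.26 = 0.556 \\ b^{A,B}&= (0.50 \cdot 0.26 + 0.15 \cdot 0.40) / 0.556 \approx 0.34 \\ d^{A,B}&= (0.10 \cdot 0.26 + 0.59 \cdot 0.40) / 0.556 \approx 0.47 \\ i^{A,B}&= (0.40 \cdot 0.26) / 0.556 \approx 0.19 \\ E&= b^{A,B} + a \, i^{A,B} \approx 0.34 + 0.25 \cdot 0.19 \approx 0.39 . \end{aligned}$$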

The resulting opinion (0.34, 0.47, 0.19) on witness2 being a good witness is now weaker than (0.44, 0.15, 0.41), which is the opinion on witness1 being a good witness. The expectation values also capture this aspect. Thus, this result shows that the inference scheme models human intuition well.

Example 2

(Witness with face detection vs. license plate detection) Consider the following set of rules for determining a good witness and the following opinion assignment, which capture the scenario described in Sect. 8.6.1 and depicted in Fig. 8.9b.

$$\begin{aligned} \phi _{Rule}\left[w^{Rule4}_{witness(H_1)} \leftarrow \oplus _{i=1}^{n} w^{Rule3}_{witness(H_1,H_2,T_i)}\right]&= DT_1 .\\ \phi _{Rule}\left[w_{witness(H_1)} \leftarrow w^{Rule4}_{witness(H_1)} \cdot w_{hasFaceDetectInfo(H_1)}\right]&= DT_1 .\\ \phi _{Rule}\left[w_{witness(H_1)} \leftarrow w^{Rule4}_{witness(H_1)} \cdot w_{hasLicenseDetectInfo(H_1)}\right]&=DT_2 .\\ \phi _{RuleEval}\left[w^{Rule4}_{witness(wit\_1)}\right]&= (0.6,0.15,0.25) .\\ \phi _{RuleEval}\left[w^{Rule4}_{witness(wit\_2)}\right]&= (0.7,0.10,0.20) .\\ \phi _{fact}\left[w_{hasLicenseDetectInfo(wit\_1)}\right]&= (0.6,0.15,0.25) .\\ \phi _{fact}\left[w_{hasFaceDetectInfo(wit\_2)}\right]&= (0.6,0.15,0.25) .\\ \end{aligned}$$

Given these default rules and the facts that can be regarded as definitely true, the inference for determining the better witness using default logic with subjective logic is as follows.

$$\begin{aligned} cl_{sl_{di}}&(\phi )(w_{witness(wit\_1)}) \\&= [U \sqcup ( (0.6,0.15,0.25) \cdot DT_1 \cdot (0.6,0.15,0.25) \cdot DT_2)] \\&= [U \sqcup ( (0.44,0.15,0.41) \cdot (0.59,0.15,0.26))] \\&= (0.33,0.28,0.39) \sim (Expectation = 0.36) . \end{aligned}$$
$$\begin{aligned} cl_{sl_{di}}&(\phi )(w_{witness(wit\_2)}) \\&= [U \sqcup ( (0.7,0.10,0.20) \cdot DT_1 \cdot (0.6,0.15,0.25) \cdot DT_1)] \\&= [U \sqcup ( (0.5,0.1,0.4) \cdot (0.44,0.15,0.41))] \\&= (0.3,0.24,0.47) \sim (Expectation = 0.33) .\\ \end{aligned}$$

The above result shows that, given the evidence, ‘witness1’ is a slightly more probable witness candidate than ‘witness2’, because license plate information is more informative and is therefore weighted more strongly than face-related information by the opinion assignment. However, because the opinions at the fact level are not certain, these values did not strongly force the belief but rather increased the uncertainty in the final opinion. The expectation values also capture this aspect. Thus, this result shows that the inference scheme models human intuition well.

7 Discussions and Conclusion

Intelligent forensic reasoning upon metadata acquired from automated vision analytic modules is an important aspect of surveillance systems with high usage potential. The knowledge expressive power of the reasoning framework and its ability to handle uncertainty are critical issues in such systems. In this chapter, based on our previous work on the use of logic programming with subjective logic, we extended the framework so that it can also handle uncertain propositional rules. The approach is mainly based on a fuzzy-like membership function and the reputational operation on it. Although we still need to extend this concept to large-scale data, we believe that this work shows the potential of the proposed approach. The main advantage of the proposed approach is that it offers more choices for modeling complex contextual human knowledge by enriching the expressive power of the framework. Another advantage is that the modeled uncertain rules can be used with other principled reasoning schemes. In this chapter, in particular, we have demonstrated how the reasoning results from uncertain spatio-temporal rules can be used with default reasoning. Another interesting property of the system is that, unlike traditional probability-based conditional reasoning, this approach allows for representing a lack of information about a proposition. We can also roughly assign our subjective priors with a lack of information, and observations can be represented with any degree of ignorance; we therefore believe this better reflects human intuition and real-world situations. Another beneficial property is the flexibility of assigning opinions to formulae. In particular, a rule can embed its own opinion calculation scheme, thereby allowing for sophisticated propagation of opinions through the inference pipeline.
There are, however, still several open issues, such as how to better model the reputational function and how to automatically assign proper prior opinions to rules. Our future research will cover extending and applying the presented approach to more complicated scenarios using automatically generated large-scale data.