Trajectory Anomaly Detection Based on the Mean Distance Deviation

Hu, Xiaoyuan; Xu, Qing; Guo, Yuejun

doi:10.1007/978-3-030-63820-7_16


Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1332))

Included in the following conference series:

International Conference on Neural Information Processing


Abstract

With the development of science and technology and the explosive growth of data, large numbers of trajectories are generated every day, and detecting the abnormal ones among them has become a hot issue. To study trajectory anomaly detection better, we analyze the Sequential conformal anomaly detection in trajectories based on Hausdorff distance (SNN-CAD) method and propose a new trajectory distance measure, Improved Moved Euclidean Distance (IMED), which replaces the Hausdorff distance and reduces the computational complexity. In addition, we propose a removing-updating strategy to enhance conformal prediction (CP). We also put forward our own Non-conformity Measure (NCM), Mean Distance Deviation, which enlarges the difference between trajectories more effectively and detects abnormal trajectories more accurately. Finally, based on these techniques and under the framework of enhanced conformal prediction, we build our own detector, called the Mean Distance Deviation Detector (MDD-ECAD). Experiments with the two detectors on a large number of synthetic and real-world trajectory datasets show that MDD-ECAD is much better than SNN-CAD in both accuracy and running time.

This work has been funded by Natural Science Foundation of China under Grants No. 61471261 and No. 61771335. The author Yuejun Guo acknowledges support from Secretaria d'Universitats i Recerca del Departament d'Empresa i Coneixement de la Generalitat de Catalunya and the European Social Fund.



1 Introduction

With the rapid proliferation of closed-circuit television cameras, satellites and mobile devices, massive amounts of trajectory data have been generated for different kinds of moving objects such as people, hurricanes, animals and vehicles [11] (Fig. 1). Trajectory data analysis undoubtedly plays a vital role, and abnormal trajectory detection is one of the key issues in this topic. Methods based on distance similarity [10] are relatively mature. Although their complexity is relatively high on large-scale data, they work well for small and medium-scale trajectory anomaly detection. A well-known method, Sequential conformal anomaly detection in trajectories based on Hausdorff distance (SNN-CAD), was proposed by Laxhammar et al. [6]. Their detection method is mainly based on conformal prediction (CP) theory [8]. First, the Hausdorff distance [1] between trajectories is calculated as the trajectory similarity. Then a Non-conformity Measure (NCM) [9] is defined in terms of the k-nearest neighbors [2]. Finally, conformal prediction is used to determine whether a trajectory is abnormal. However, this NCM cannot distinguish abnormal trajectories very well, and the detection accuracy is not high.

In view of this, we introduce a new NCM, Mean Distance Deviation (MDD), and present a removing-updating strategy to enhance conformal anomaly detection. Accordingly, we propose the Mean Distance Deviation based Enhanced Conformal Anomaly Detector (MDD-ECAD), which handles trajectory anomaly detection very well.

Also importantly, in this paper we propose a new distance measure that improves on the Euclidean Distance (ED), called Improved Moved Euclidean Distance (IMED). It characterizes the trajectory distance efficiently. Moreover, IMED does not require the trajectories to have the same length, and its computational complexity is low.

The rest of this paper is organized as follows. Section 2 introduces the relevant background knowledge of our work. Section 3 presents the details of our method MDD-ECAD. Experimental data and results are described in Sect. 4. Finally, the paper is concluded in Sect. 5.

2 Background

In this section, we introduce the basic concept of a trajectory and the details of CP theory.

2.1 Trajectory Type

In general, the trajectory data studied in this paper are sequences of coordinate points in a Cartesian coordinate system. A trajectory can be simply represented as $ T = (a_{1},a_{2},\cdots ,a_{n})$, where each $a_{i}$ is a coordinate point.

2.2 Conformal Prediction

Conformal prediction (CP) uses past experience to determine precise levels of confidence in new predictions. Generally speaking, assume a training set $\left\{ \left( x_{1},y_{1} \right) ,\left( x_{2},y_{2} \right) ,\cdots ,\left( x_{l},y_{l} \right) \right\} $, where $x_{i}\left( i = 1,\cdots ,l \right) $ is the input data, that is, data observed or collected by some means, and $y_{i}\left( i = 1,\cdots ,l \right) $ is the output data, that is, the label predicted by some method. For example, in trajectory anomaly detection, $x_{i}$ is a trajectory collected by a sensor and $y_{i}$ is its label, which is either abnormal or normal. Given a newly observed datum $x_{l+1}$, the basic idea of conformal prediction is to estimate the p-value $p_{l+1}$ of $x_{l+1}$ with a designed NCM according to the training data. Finally, $p_{l+1}$ is compared with a pre-defined threshold $\epsilon $ to determine the label of $x_{l+1}$.

If $p_{l+1}< \epsilon $, $x_{l+1}$ is identified as a conformal anomaly; otherwise, it is determined to be normal. The key to estimating the p-value of a new example is designing an effective NCM. Next, we introduce the concept of a Non-Conformity Measure (NCM), whose purpose is to measure the difference between a new example and a set of observed data.

Formally, an NCM is a mathematical function. A given NCM yields a score $\alpha _{i}$ that measures the difference between the example $x_{i}$ and the rest of the dataset. The score of $x_{i}$ is given by

$$\begin{aligned} \alpha _{i} = A\left( X_{j\ne i},x_{i} \right) \end{aligned}$$

(1)

where X is a set of data, $x_{i}$ is an example in X, and A(.) is some NCM.

Based on formula (1), the scores $\left( \alpha _{1},\alpha _{2},\cdots ,\alpha _{l+1} \right) $ are obtained. Then the p-value of $x_{l+1}$, $p_{l+1}$, is determined as the ratio of the number of trajectories whose nonconformity scores are greater than or equal to that of $x_{l+1}$ to the total number of trajectories. The p-value is defined as follows:

$$\begin{aligned} p_{l+1} = \frac{\left| \left\{ \alpha _{i}|\alpha _{i}\ge \alpha _{l+1},1\le i\le l+1 \right\} \right| }{l+1} \end{aligned}$$

(2)

where $\left| \left\{ \cdot \right\} \right| $ counts the number of elements in the set. CP estimates a set of p-values to predict the labels of new examples and works excellently with an effective NCM, especially when $\epsilon $ is close to the proportion of abnormal data in the dataset.
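The p-value in formula (2) and the threshold test can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `conformal_p_value` and `is_anomaly` and the toy scores are assumptions, and the last element of the score list is taken to be the new example $x_{l+1}$.

```python
def conformal_p_value(scores):
    """Return the p-value of the LAST score in `scores`, following formula (2).

    `scores` holds the nonconformity scores (alpha_1, ..., alpha_{l+1});
    the last entry belongs to the new example (an assumption of this sketch).
    """
    if not scores:
        raise ValueError("scores must contain at least one value")
    alpha_new = scores[-1]
    # Count scores that are at least as nonconforming as the new one
    # (the new score itself is always counted).
    at_least = sum(1 for alpha in scores if alpha >= alpha_new)
    return at_least / len(scores)


def is_anomaly(scores, epsilon):
    """Flag the new example as a conformal anomaly when p < epsilon."""
    return conformal_p_value(scores) < epsilon


# Toy scores (hypothetical): the new example has by far the largest score.
toy_scores = [0.2, 0.3, 0.25, 0.4, 5.0]
print(conformal_p_value(toy_scores))      # 0.2
print(is_anomaly(toy_scores, 0.25))       # True
```

Because the new score is always counted in the numerator, the smallest possible p-value is $1/(l+1)$, so a threshold below that value can never flag an anomaly.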

The Sequential Hausdorff Nearest Neighbor Conformal Anomaly Detector (SNN-CAD) was developed by Laxhammar et al. [6]. Their main contribution is to use the Hausdorff distance to calculate the trajectory distance and the k-nearest neighbors as the NCM. Suppose there are two point sets $ T_{a} = \left\{ a_{1},a_{2} ,\cdots ,a_{m} \right\} $ and $ T_{b} = \left\{ b_{1},b_{2} ,\cdots ,b_{n} \right\} $. For the definition of the Hausdorff distance, we refer the reader to that article [6]. The NCM is defined as follows:

$$\begin{aligned} \alpha _{i}= \sum _{T_{b}\in Neig(T_{a})}d\left( T_{a},T_{b} \right) \end{aligned}$$

(3)

where d(.) is a trajectory distance and Neig($T_{a}$) represents the k nearest neighbors of $T_{a}$.
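As a rough illustration of formula (3), the sketch below sums the distances from one trajectory to its k nearest neighbors. It assumes the classic symmetric Hausdorff distance between point sets; the variant used in [6] may differ, and the names `hausdorff` and `knn_nonconformity` are hypothetical.

```python
import math


def hausdorff(ta, tb):
    """Classic symmetric Hausdorff distance between two point sequences.

    ASSUMPTION: this is the textbook definition, not necessarily the
    exact variant used by SNN-CAD.
    """
    def directed(p, q):
        # Largest distance from a point of p to its closest point in q.
        return max(min(math.dist(a, b) for b in q) for a in p)

    return max(directed(ta, tb), directed(tb, ta))


def knn_nonconformity(target, others, k, dist=hausdorff):
    """Formula (3): sum of distances from `target` to its k nearest neighbors."""
    if not 0 < k <= len(others):
        raise ValueError("k must be between 1 and the number of other trajectories")
    distances = sorted(dist(target, t) for t in others)
    return sum(distances[:k])


# Toy trajectories (hypothetical): horizontal segments at different heights.
target = [(0.0, 0.0), (1.0, 0.0)]
others = [
    [(0.0, 1.0), (1.0, 1.0)],   # Hausdorff distance 1 from target
    [(0.0, 2.0), (1.0, 2.0)],   # Hausdorff distance 2 from target
    [(0.0, 5.0), (1.0, 5.0)],   # Hausdorff distance 5 from target
]
print(knn_nonconformity(target, others, k=2))  # 3.0
```

The `dist` parameter is kept pluggable so that another trajectory distance can be swapped in without changing the NCM itself.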

3 Our Method

3.1 Improved Moved Euclidean Distance

To measure the distance between two trajectories effectively, researchers have put forward various methods. The most commonly used ones are the Euclidean distance (ED), the Hausdorff distance (HD), and dynamic time warping (DTW). Comparing the advantages and disadvantages of these three distances, we come to the following conclusions: (1) DTW and HD can handle trajectories of unequal length, but their computational complexity is too high for large and medium-sized data. (2) ED is simple to implement and calculates the trajectory distance quickly, but it cannot handle trajectories of unequal length.

This raises the question of how to calculate the distance quickly while still handling trajectories of unequal length. For this purpose, based on ED, we propose a new distance measure, Improved Moved Euclidean Distance (IMED), which enlarges the difference between trajectories for better trajectory anomaly detection. The proposed measure handles both equal-length and unequal-length trajectories. The basic idea is to fix the longer trajectory and move the shorter trajectory backward along it until the longer trajectory is completely matched. Given two trajectories $ T_{a} = \left\{ a_{1},a_{2} ,\cdots ,a_{m} \right\} $ and $ T_{b} = \left\{ b_{1},b_{2} ,\cdots ,b_{n} \right\} $ with $n \ge m$, the IMED is defined as follows:

$$\begin{aligned} d_{IME}(T_{a},T_{b}) = \frac{\sum _{j = 0}^{n-m}\sqrt{\sum _{i = 1}^{m}\left\| b_{i+j}-a_{i} \right\| ^{2}}}{n-m+1} \end{aligned}$$

(4)

In particular, when n = m, $T_{a}$ and $T_{b}$ have the same length, the outer sum has a single term (j = 0), and IMED reduces to ED.
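Formula (4) translates almost directly into code. The following is a minimal sketch under the assumption that a trajectory is a list of coordinate tuples; the function name `imed` is hypothetical, and the swap of the arguments simply enforces $n \ge m$.

```python
import math


def imed(ta, tb):
    """Improved Moved Euclidean Distance between two trajectories, formula (4).

    Each trajectory is a sequence of coordinate tuples of equal dimension.
    """
    # Make `ta` the shorter trajectory (length m) and `tb` the longer (length n).
    if len(ta) > len(tb):
        ta, tb = tb, ta
    m, n = len(ta), len(tb)
    if m == 0:
        raise ValueError("trajectories must not be empty")

    total = 0.0
    # Slide the shorter trajectory along the longer one: offsets j = 0 .. n - m.
    for j in range(n - m + 1):
        squared = 0.0
        for i in range(m):
            # Squared Euclidean norm of b_{i+j} - a_i.
            squared += sum((b - a) ** 2 for a, b in zip(ta[i], tb[i + j]))
        total += math.sqrt(squared)
    # Average over the n - m + 1 alignments.
    return total / (n - m + 1)


short = [(0.0, 0.0), (1.0, 0.0)]
long_ = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(imed(short, long_))                        # 0.7071067811865476
print(imed([(0.0, 0.0), (0.0, 0.0)],
           [(3.0, 4.0), (0.0, 0.0)]))            # 5.0 (equal lengths: plain ED)
```

In the first example the two alignments give distances 0 and $\sqrt{2}$, so the average is $\sqrt{2}/2$. Each call compares $m(n-m+1)$ point pairs.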

3.2 Mean Distance Deviation

An appropriate NCM is critical for anomaly detection in general. If a trajectory is similar to its neighboring trajectories, we can regard it as normal; if it differs from the trajectories around it, we can judge it to be abnormal. The use of the local neighborhood is a fundamental consideration in many anomaly detection methods, such as the classic KNN. SNN-CAD uses the sum of the distances from a trajectory to its k nearest neighbors as the indicator for comparison with other trajectories: the larger this value, the greater the difference between the behavior of the trajectory and that of the surrounding trajectories. However, KNN is not ideal for judging whether a trajectory is abnormal. For this reason, we propose a new NCM, Mean Distance Deviation (MDD). The experimental results (Sect. 4) show that this method is much better than KNN in trajectory anomaly detection. Its definition is as follows (Fig. 2):

$$\begin{aligned} \alpha \left( T_{a} \right) = \sqrt{\frac{\sum _{T_{b}\in Neig(T_{a})}(MD(T_{a})-MD(T_{b}))^{2}}{k}} \end{aligned}$$

(5)

where

$$\begin{aligned} MD\left( T_{a} \right) = \frac{\sum _{T_{b}\in Neig(T_{a})}d(T_{a},T_{b})}{k} \end{aligned}$$

(6)
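Formulas (5) and (6) can be illustrated with a small sketch that works on a precomputed pairwise distance matrix. The names `k_nearest` and `mean_distance_deviation` are hypothetical, ties between equally distant neighbors are broken by index order (an assumption of this sketch), and the one-dimensional toy points stand in for trajectories.

```python
import math


def k_nearest(dist, a, k):
    """Indices of the k nearest neighbors of item `a`, excluding `a` itself."""
    others = [b for b in range(len(dist)) if b != a]
    others.sort(key=lambda b: dist[a][b])   # stable sort: ties keep index order
    return others[:k]


def mean_distance_deviation(dist, k):
    """MDD score of every item, given a symmetric distance matrix `dist`."""
    n = len(dist)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of items")
    neighbors = [k_nearest(dist, a, k) for a in range(n)]
    # Formula (6): mean distance from each item to its k nearest neighbors.
    md = [sum(dist[a][b] for b in neighbors[a]) / k for a in range(n)]
    # Formula (5): root mean squared deviation between MD(T_a) and the MD
    # values of T_a's own neighbors.
    return [
        math.sqrt(sum((md[a] - md[b]) ** 2 for b in neighbors[a]) / k)
        for a in range(n)
    ]


# Toy data (hypothetical): three close points and one far away.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(p - q) for q in points] for p in points]
print(mean_distance_deviation(dist, k=1))  # [0.0, 0.0, 0.0, 7.0]
```

In this toy case the three clustered points share the same mean distance and score 0, while the isolated point has a mean distance of 8 against its neighbor's 1, giving a score of 7.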

3.3 Removing-Updating Strategy

Anomaly detection based on CP calculates the p-value of each datum and compares it with the given threshold $\epsilon $ to determine whether the trajectory is abnormal. Because abnormal data that have already been detected interfere with subsequent detection, we propose a removing-updating strategy for CP.

Specifically, once the p-values of all data have been calculated, the most abnormal datum is removed. The threshold is then updated, and the process is repeated on the remaining data until the threshold reaches 0.
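The loop described above might look roughly like the sketch below. The text does not state how the threshold is updated, so the rule used here is purely an assumption: the threshold is tracked as a removal budget of round($\epsilon n$) items, each removal lowers the budget by one, and the updated threshold is the remaining budget divided by the number of remaining items. The item with the largest nonconformity score is treated as the most abnormal one, since it has the smallest p-value under formula (2). All names, including the stand-in scoring function, are hypothetical.

```python
import statistics


def remove_and_update(items, score_fn, epsilon):
    """Repeatedly remove the most abnormal item and update the threshold.

    ASSUMPTION: the update rule (budget / remaining size) is illustrative;
    the source only says the threshold is updated until it reaches 0.
    """
    remaining = list(items)
    anomalies = []
    if not remaining:
        return anomalies, remaining
    budget = round(epsilon * len(remaining))
    threshold = budget / len(remaining)
    while threshold > 0 and len(remaining) > 1:
        # Rescore on the remaining data only, so that items removed earlier
        # no longer influence the scores of the others.
        scores = score_fn(remaining)
        worst = max(range(len(remaining)), key=lambda i: scores[i])
        anomalies.append(remaining.pop(worst))
        budget -= 1
        threshold = budget / len(remaining)   # updated threshold
    return anomalies, remaining


def deviation_from_median(values):
    """Hypothetical stand-in NCM for one-dimensional toy data."""
    med = statistics.median(values)
    return [abs(v - med) for v in values]


toy = [1.0, 1.1, 0.9, 1.05, 50.0, 0.95, 1.02, 0.98, -40.0, 1.0]
flagged, kept = remove_and_update(toy, deviation_from_median, epsilon=0.2)
print(flagged)    # [50.0, -40.0]
print(len(kept))  # 8
```

In practice `score_fn` would be an NCM such as MDD computed over trajectory distances; the median-based function is only there to keep the example self-contained.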

4 Experiment

In this section, to evaluate the effect of MDD-ECAD and IMED, we compare the MDD-ECAD algorithm with the SNN-CAD algorithm, and compare the distances IMED, HD and DTW, on synthetic and real-life data.

4.1 Data Sets

Synthetic trajectories I [5], presented for anomaly detection, was created by Laxhammar et al. [6] using trajectory generator software. It includes 100 datasets with 2000 trajectories each, about $1\%$ of which are abnormal. Each trajectory is composed of a series of two-dimensional coordinate points. To expand the data for the experiments, we use another collection of synthetic trajectories [3], comprising synthetic trajectories II, III and IV. Each of the three contains 100 trajectory datasets, with $\epsilon $ equal to 0.05, 0.01, and 0.02, respectively. Each dataset has 2000 trajectories, with the number of sample points ranging from 20 to 100.

The aircraft trajectories dataset [7] has 470 two-dimensional trajectories in all, 450 normal and 20 abnormal. The trajectory length in the set varies from 12 to 171 sampling points.

4.2 Performance Measure

Trajectory anomaly detection is a binary classification problem. Therefore, we can use the following evaluation indicators: true positive (TP), false positive (FP), false negative (FN), and true negative (TN). Precision (P), Recall (R) and F1 are used to test the classification accuracy. The F1 score is used to evaluate all experiments: the larger the F1 value, the better the algorithm performs.

$$\begin{aligned} P = \frac{TP}{TP+FP},\,R = \frac{TP}{TP+FN},\,F1 = \frac{2*P*R}{P+R} \end{aligned}$$

(7)
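The three measures in formula (7) can be computed from the confusion counts as in the sketch below. The function name `precision_recall_f1` and the counts in the example are hypothetical, and returning 0 when a denominator is zero is a convention assumed here, not something the text specifies.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion counts, following formula (7).

    ASSUMPTION: a measure with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 anomalies found, 2 false alarms, 4 anomalies missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.8 0.6667 0.7273
```

TN does not appear in any of the three formulas, which is why F1 remains informative when normal trajectories vastly outnumber abnormal ones.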

4.3 Experimental Results and Analysis

Table 1. The F1 results (%) of synthetic trajectory datasets with two methods.

Table 2. The F1 results (%) of real-life trajectory datasets with different methods.

In the experiments, we mainly compare the performance of SNN-CAD and MDD-ECAD. Table 1 shows that the F1 of MDD-ECAD is 89.11%, 86.35%, 90.76% and 92.31%, respectively, higher than that of SNN-CAD. On complex real-life data, Table 2 shows that MDD-ECAD still outperforms SNN-CAD and iVAT+ [4]: the F1 of MDD-ECAD is as high as 95%, while that of SNN-CAD is only 75% and that of iVAT+ is 90%. SNN-CAD may not perform well because its NCM cannot sufficiently amplify abnormal trajectory behavior, a defect that our MDD makes up for. As for the detection framework, we use the removing-updating strategy to avoid the secondary interference of obviously abnormal trajectories with the others.

Table 3. The F1 results (%) of MDD-ECAD with different distance measures.

Table 4. Runtimes (s) of MDD-ECAD with different distance measures.

To compare the performance of IMED, HD and DTW, we use each of the three distances in the MDD-ECAD method. Table 3 shows that the F1 of IMED is higher than that of HD and DTW, which indicates that IMED measures the distance between trajectories better. In addition, the running times of IMED, HD and DTW are given in Table 4, and the running time of IMED is clearly the shortest. From the theoretical analysis, IMED has the lowest computational complexity and is therefore expected to run the fastest; the experiment verifies this.

5 Conclusion

In this paper, to improve on the performance of SNN-CAD, we propose a new method to calculate the trajectory distance. An effective non-conformity measure and a removing-updating strategy are also used in our anomaly detector. A large number of experiments show that our detector is better than SNN-CAD.

References

1. Dubuisson, M.P., Jain, A.K.: A modified Hausdorff distance for object matching. In: Proceedings of 12th International Conference on Pattern Recognition, vol. 1, pp. 566–568. IEEE (1994)
2. Güting, R.H., Behr, T., Xu, J.: Efficient k-nearest neighbor search on moving object trajectories. VLDB J. 19(5), 687–714 (2010)
3. Guo, Y., Bardera, A.: SHNN-CAD+: An improvement on SHNN-CAD for adaptive online trajectory anomaly detection. Sensors 19(1), 84 (2019)
4. Kumar, D., Bezdek, J.C., Rajasegarar, S., Leckie, C., Palaniswami, M.: A visual-numeric approach to clustering and anomaly detection for trajectory data. Vis. Comput. 33(3), 265–281 (2015). https://doi.org/10.1007/s00371-015-1192-x
5. Laxhammar, R.: Synthetic trajectories (2013)
6. Laxhammar, R., Falkman, G.: Sequential conformal anomaly detection in trajectories based on Hausdorff distance. In: 14th International Conference on Information Fusion, pp. 1–8. IEEE (2011)
7. Leader, D.S.G.: Aircraft trajectories. https://c3.nasa.gov/dashlink/resources/132/
8. Shafer, G., Vovk, V.: A tutorial on conformal prediction. J. Mach. Learn. Res. 9(March), 371–421 (2008)
9. Smith, J., Nouretdinov, I., Craddock, R., Offer, C., Gammerman, A.: Anomaly detection of trajectories with kernel density estimation by conformal prediction. In: Iliadis, L., Maglogiannis, I., Papadopoulos, H., Sioutas, S., Makris, C. (eds.) AIAI 2014. IAICT, vol. 437, pp. 271–280. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44722-2_29
10. Toohey, K., Duckham, M.: Trajectory similarity measures. Sigspatial Special 7(1), 43–50 (2015)
11. Zheng, Y.: Trajectory data mining: an overview. ACM Trans. Intell. Syst. Technol. (TIST) 6(3), 1–41 (2015)


Author information

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, China
Xiaoyuan Hu & Qing Xu
Graphics and Imaging Laboratory, Universitat de Girona, Girona, Spain
Yuejun Guo


Corresponding author

Correspondence to Qing Xu.

Editor information

Editors and Affiliations

Department of AI, Ping An Life, Shenzhen, China
Haiqin Yang
Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Kitsuchart Pasupa
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi-Sing Leung
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
James T. Kwok
School of Information Technology, King Mongkut's University of Technology Thonburi, Bangkok, Thailand
Jonathan H. Chan
The Chinese University of Hong Kong, New Territories, Hong Kong
Irwin King


Cite this paper

Hu, X., Xu, Q., Guo, Y. (2020). Trajectory Anomaly Detection Based on the Mean Distance Deviation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1332. Springer, Cham. https://doi.org/10.1007/978-3-030-63820-7_16


DOI: https://doi.org/10.1007/978-3-030-63820-7_16
Published: 17 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63819-1
Online ISBN: 978-3-030-63820-7
eBook Packages: Computer Science, Computer Science (R0)
