1 Introduction

In 2012, public awareness of maritime safety was heightened by two high-profile maritime incidents, the sinking of the Costa Concordia off the shore of Italy and the loss of the Rabaul Queen ferry off Papua New Guinea, both of which led to loss of life at sea. In less than 50 years, cargo has grown 14 times, fleet capacity has grown eight times, oil tankers have become twenty times bigger, average tankers seven times bigger and dry bulk vessels ten to fifteen times bigger, while ultra large cruise ships now dwarf the Titanic, as their capacity has risen to 6000 passengers (Stopford 2009; ANAVE 2013). The growth of vessels in size, speed and volume relative to shipping lane size has made vessels difficult to maneuver, especially around major ports and inland waterways, requiring a change in collision avoidance techniques (Westrenen and Praetorius 2012). Recent developments indicate a significant increase in traditional maritime risks, but also highlight the introduction of unique challenges in maritime shipping at various levels.

For many years, practitioners and researchers in the field of maritime safety have turned to information and communication technologies (ICT) in order to reduce risk. To this end, the automatic identification system (AIS) was developed, primarily as a tool for maritime safety and vessel collision avoidance, and it is an integral component of various vessel traffic services (VTS), vessel traffic management systems (VTMS) and vessel traffic monitoring information systems (VTMIS). A number of vessel tracking systems are open to the public through the Internet; one of them, MarineTraffic (MT), is part of an open, community-based project that provides real-time information regarding vessel movements and port traffic across the coastlines of many countries around the world. At any time, MT tracks more than 460,000 vessels and processes more than 50 million position reports per day, covering more than 10,000 ports and marinas across the globe. Information regarding these vessels is collected from over 1600 AIS receivers located along the coastlines of more than 150 countries. A wide range of maritime stakeholders access such information on a daily basis, in an attempt to increase the efficiency and the safety of their operations at sea. For example,

  • Port authorities, coast guard, border controls, search and rescue teams combine vessel tracking information with other proprietary solutions to accurately monitor and assess threats at sea;

  • Pilots, tug operators, towage and salvage services make use of vessel tracking information for assistance when navigating to a distress call or to accurately track a vessel entering a port;

  • Insurance companies use vessel historical data for incident investigation. The current condition, route and port calls of a vessel may also affect the insurance policy applied.

  • Crewmembers, families of seafarers, recreational sailors and even passengers frequently access such information to learn about a specific vessel’s position, route, and estimated time of arrival.

Overall, stakeholders and a variety of end users make use of vessel tracking information to increase their own understanding and perception of reality at sea, and to support their decision making and situation management. At a cognitive level, situation management is a goal-directed process of (a) collecting information, (b) perceiving and recognizing situations, (c) analyzing past situations and predicting future situations and (d) realistic reasoning, planning and implementing actions so that a desired goal situation is reached within some pre-defined constraints (Rothblum 2002). Intelligent systems have great potential for addressing decision-making problems, because they can model the involved players and produce good results in low computational time (Gomes et al. 2014). Intelligent decision support systems use data and mathematical models that possess the characteristics of flexibility, adaptability, comprehension and the capacity to manage uncertain and constantly changing information (Krishnakumar 2003), so as to support stakeholders’ decision making. They aim at automating steps (a)–(c) while providing human operators with proposals in support of their own decision making (d).

Maritime domain awareness (MDA) is the effective understanding of activities, events and threats in the maritime environment that could impact global safety, security, economic activity or the environment (Santos and Lunday 2009). The major challenge faced today by MDA is developing the ability to identify patterns emerging within huge amounts of data, fused from various sources (information fusion) and generated from monitoring thousands of vessels a day, so as to act proactively to minimize the impact of possible threats. Recent advancements in ICT have created opportunities for increasing MDA through better monitoring and, most importantly, better understanding of vessel movements. Statistical inference and machine learning algorithms can provide crucial help in this process. Achieving situational awareness, that is, perceiving and comprehending elements and their contextual meaning in the environment within a given volume of time and space while projecting their status into a future timeframe (Endsley 1988), is a critical element of MDA (US Department of Homeland Security 2005). Increasing MDA in light of safety and efficiency can be viewed as a three-step process:

  • Accurately assessing the maritime environment: assessment of objects and their relations, amongst themselves and with their environment, to provide a better understanding of the current situation. This supports an operator’s processes (a) and (b).

  • Impact assessment: projections of possible future situations and evaluations of evolving situations in an attempt to determine possible threats. This supports a human operator’s process (c), the analysis of past situations and prediction of future situations.

  • Proactive hazard prevention and increased efficiency through process optimization. This supports a human operator’s decision making (d).

Deploying tools aimed at increasing MDA can potentially lead to significant improvements in safety and security, but also in energy and economic efficiency (forecasting congestion at ports, route emissions and others).

In this manuscript, we describe our work on employing machine learning methods, specifically artificial neural networks (ANNs), as a basis for accurately predicting a vessel’s future behavior with an emphasis on solution practicality. To this end, we focus on deploying a web-based infrastructure that can produce good results in low computational time. This work is meant to sit on the boundary between theoretical computer science and software engineering, providing practical solutions to everyday problems (applied soft computing). Predicting a vessel’s behavior with ANNs raises a number of unique design challenges. A balance needs to be sought between prediction accuracy and training times. Another challenge is processing data of such volume and velocity (AIS messages regarding vessels are received every 20–90 s). As the predictive capacity could potentially be added to a VTMIS such as MarineTraffic, a system operating constantly and tracking thousands of vessels at any given time, special attention needs to be paid to data-related design choices. Our overall objective is to design and develop a system that exhibits the following characteristics:

  • Is capable of learning a vessel’s behavioral patterns from historical data available from MarineTraffic

  • Is capable of real-time vessel behavior prediction on user request, in low computational time

  • Is publicly accessible through the World Wide Web and can overlay vessel predictions on an interactive map for visualization purposes

  • Has the capacity to operate as the foundation for vessel collision avoidance and anomaly detection systems

To overcome various obstacles and address our stated requirements, a number of design choices were made that are documented throughout this manuscript. In the following sections, we first describe our steps towards data preparation and model building. In the subsequent section we present our training and evaluation results. Following this, we provide design and deployment details of our prototype system. We conclude our manuscript with an example case study investigating vessel collision detection.

2 Predicting a vessel’s behavior

2.1 Model selection

Research efforts on forecasting date as far back as 1969 and have since explored methods of increasing forecasting accuracy. Approaches used throughout the literature implement a wide variety of forecasting methods, such as ANNs, ARIMA/ARMA models, Box-Jenkins and others (Makridakis and Hibon 2000). In an attempt to evaluate their forecasting capacity, Makridakis, Hibon et al. published a number of papers reporting on the effectiveness of a variety of well-documented models tested on real-world datasets (Makridakis et al. 1982, 1993; Makridakis and Hibon 2000). According to this analysis, automatic ARIMA modeling with intervention analysis (AAM1) and automated artificial neural networks head the ranking, followed by automated Parzen’s methodology with autoregressive filter (ARARMA) and robust ARIMA univariate Box-Jenkins (Lopez et al. 2011). Forecasting has long been considered a field of research in the domain of linear statistics. Traditional approaches, such as the Box-Jenkins or ARIMA method (Box and Jenkins 1976; Pankratz 1983), assume that real-world observations are generated from linear processes (Zhang et al. 1998). However, they may be totally inappropriate if the underlying mechanism is nonlinear. It is unreasonable to assume a priori that a particular realization of a given time series is generated by a linear process (Zhang et al. 1998). Modeling nonlinear systems is also far more difficult than modeling linear systems.

An artificial neural network (ANN) is a machine learning information-processing paradigm inspired by biological nervous systems. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons), which work in unison to solve specific problems (Bevilacqua 2006). In general, a neural network is a parallel system capable of resolving problems that linear computing cannot (Verber 2012). ANNs have broad applicability to various real-world problems, including classification and pattern recognition, data processing, control and robotics, but also prediction. The unique characteristics of ANNs (adaptability, nonlinearity and arbitrary function mapping ability) make them quite suitable and useful for forecasting tasks. According to Karlaftis and Vlahogianni (2011), ANNs have been mainly used as data analytic methods because of their ability to work with massive amounts of multi-dimensional data, their modeling flexibility, their learning and generalization ability, their adaptability and their good predictive ability.

In the maritime domain, ANNs have been employed for tasks such as forecasting traffic flow at the Suez Canal (Mostafa 2004), predicting wave influence on the yaw motion of a ship (Nicolau et al. 2004) and vessel classification (Lagerweij et al. 2009). Lagerweij et al. analyze moving object trajectories from maritime vessels and classify vessels into three categories based on AIS data. They perform clustering, classification and outlier detection on vessel trajectory data with the goal of identifying irregular vessel behaviors. Perera, Oliveira and Soares propose an ANN as the mechanism for detecting and tracking multiple vessels based on radar/laser tracking data (Perera et al. 2012).

Less work, however, has been conducted in the field of vessel movement prediction using an ANN, mostly due to the lack of data. Ebada developed an artificial intelligence system capable of accurately predicting the turning tracks of ships (Ebada 2005). The physical and operational data of a ship are described and used as inputs into the system in order to predict the turning maneuvers. Closely related to this work is the study by Simsir and Ertugrul (2009), which aimed at predicting the future coordinates of a manually controlled vessel in the Bosporus Straits using a trained ANN. The ANN was trained using position and speed data collected from vessels that navigated manually in the Strait. They were able to accurately predict vessel positions in a 3-min-ahead window. Rhodes, Bomberger, Seibert and Waxman developed a Fuzzy ARTMAP classification neural network architecture where normal vessel speeds for different regions in a port area are learned by clustering (Bomberger et al. 2006). New data that is not recognized by the network during online operation is considered anomalous. The same research group has also proposed and implemented associative learning of motion patterns for anomaly detection, where associative neural networks learn to predict future vessel locations in a port given a current (Rhodes et al. 2007). Anomaly detection can be defined as a method that supports the situation assessment process by indicating objects and situations that deviate from the expected, known or “normal” behavior and thus may be of interest for further investigation (Laxhammar et al. 2015). In the academic literature, the proposed models and algorithms are more or less data driven, in the sense that normalcy is determined by machine learning algorithms analyzing a relatively large set of historical data assumed to reflect normalcy (Zandipour et al. 2008). Learning techniques concerning this issue involve both supervised and unsupervised learning paradigms. As supervised learning methods require a representative dataset to train the predictive model, unsupervised learning approaches, such as the SOM-based spatial outlier detection method and others, are implemented in many cases throughout the literature (Cai et al. 2013).

2.2 Exploration and data preparation

Neural networks are only as good as the data they are given and the questions that are asked of them (Azoff 1994). One of the major constraints on applying machine learning techniques to vessel position prediction in the past has been the lack of data necessary for training the ANN. The data used for this study is provided by MarineTraffic.com and is based on AIS. AIS transmissions can be defined as spatial time series (Rhodes et al. 2007), describing the movements of vessels across geographic regions. An AIS message contains the vessel’s maritime mobile service identity (MMSI), a unique nine-digit identification number; navigation status; rate of turn; speed over ground; positional accuracy; course over ground; true heading; true bearing at own position; and UTC seconds. Additionally, messages may contain the radio call sign, vessel name, vessel type, vessel dimensions, vessel draught, vessel destination and the vessel’s estimated time of arrival. Ships broadcast original position reports at time intervals that vary between 3 s and more than 30 s, depending on their speed and the type of their AIS transponder. These reports are used for the real-time map display, but MT only archives them every 2 min for each vessel, as this interval is enough for tracking purposes and the applications are not used for navigation purposes (Table 1).

Table 1 An instance of the dataset provided by MT

For the neural network training, we decided that an even lower frequency of position reports is sufficient for prediction accuracy. An interval of 15 min is enough to detect an anomaly or collision course in vessel tracks, while at the same time ensuring that the time series remains uninterrupted even in areas where AIS signal reception is poor and the collected data is not as frequent as in fully covered areas. The following SQL query is used to achieve the described data down sampling:
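The same down-sampling idea can also be illustrated outside the database. The following is a minimal Python sketch, not the query used in this work: it assumes that position reports are available as (mmsi, timestamp, lat, lon) tuples with naive UTC timestamps, and it keeps the first report of each vessel in every 15-min bucket. The function name, record layout and sample values are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def downsample(reports, interval=timedelta(minutes=15)):
    """Keep the first report of each vessel in every fixed time bucket.

    `reports` is an iterable of (mmsi, timestamp, lat, lon) tuples;
    this record layout is an assumption made for illustration.
    """
    seen = set()
    kept = []
    # Sort by vessel and time so that "first in bucket" is well defined.
    for mmsi, ts, lat, lon in sorted(reports, key=lambda r: (r[0], r[1])):
        bucket = (ts - EPOCH) // interval  # integer index of the time bucket
        if (mmsi, bucket) not in seen:
            seen.add((mmsi, bucket))
            kept.append((mmsi, ts, lat, lon))
    return kept

# One hypothetical vessel reporting every 2 min for one hour.
start = datetime(2014, 6, 1, 12, 0)
raw = [(240000001, start + timedelta(minutes=m), 37.0 + m / 1000, 25.0)
       for m in range(0, 61, 2)]
sampled = downsample(raw)
```

Bucketing by a fixed epoch, rather than by the first report of each vessel, keeps the retained timestamps aligned across vessels, which is convenient when tracks are later compared step by step.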

2.3 Model building

The structure of the neural network needs to be able to capture the complexity of the problem at hand. One of the best known and most widely used architecture types is the multi-layer perceptron. Although there has been extensive research on the optimal design of neural networks, it is still largely an art and a matter of experimentation, as each problem presents a distinct challenge (Zhang et al. 1998). Besides selecting the number of input and output neurons, consideration needs to be given to the number of hidden neurons and hidden layers. For neural networks implemented to predict future values in a time series dataset, the number of input neurons corresponds to the number of lagged observations required to discover the underlying pattern (Zhang et al. 1998). Neural networks can be trained for one-step-ahead forecasting (the prediction is made for the value following the last observation, one collection interval ahead) or multi-step-ahead forecasting. The conventional approaches for multi-step-ahead forecasting include iterative and direct methods. In iterative approaches, forecast values are used as inputs for the next forecast, while direct methods require neural networks to have several output nodes to directly forecast each step into the future (Zhang et al. 1998).
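The iterative approach can be sketched in a few lines of Python. This is an illustration only: `predict_next` stands for any one-step-ahead model (a trained network in practice), and the linear extrapolation used here is a hypothetical stand-in chosen so that the example runs on its own.

```python
def iterative_forecast(history, predict_next, steps):
    """Forecast `steps` values ahead by feeding each prediction back as input.

    `predict_next` is a hypothetical one-step model: it receives the most
    recent window of values and returns the next value.
    """
    window = list(history)
    forecasts = []
    for _ in range(steps):
        nxt = predict_next(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # slide the window over the new prediction
    return forecasts

def linear_extrapolation(window):
    """Stand-in one-step model: continue the last observed difference."""
    return 2 * window[-1] - window[-2]

print(iterative_forecast([1, 2, 3, 4], linear_extrapolation, steps=3))  # [5, 6, 7]
```

Because every forecast becomes an input for the next one, errors can accumulate over the horizon; a direct method avoids this feedback at the cost of one output node per forecast step.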

We exploit the fact that a number of behavioral parameters can be implied and computed based on the predicted vessel positions. Thus, our forecasting neural network is required to predict a vessel’s latitude and longitude at a future point in time, while we programmatically calculate other values (such as bearing and speed, as discussed in following sections). For our given dataset we experimented with a backward window size of four, thus requiring a structure that could handle eight input neurons. For short-term prediction (15 min), we trained the neural network to output a single prediction (latitude and longitude), while for long-term prediction we experimented with iterative and direct prediction approaches.
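As an illustration of how a backward window of four positions yields eight inputs, the following Python sketch turns a track of (latitude, longitude) pairs into training samples. The function name and the sample track are hypothetical, and the ordering of the flattened inputs (oldest position first) is an assumption.

```python
def make_windows(track, window=4):
    """Turn a list of (lat, lon) positions into (inputs, target) pairs.

    Each input vector flattens `window` consecutive positions
    (2 * window values); the target is the position that follows them.
    """
    samples = []
    for i in range(len(track) - window):
        past = track[i:i + window]
        inputs = [coord for position in past for coord in position]
        target = track[i + window]
        samples.append((inputs, target))
    return samples

# Hypothetical track sampled every 15 min.
track = [(37.00, 25.00), (37.02, 25.03), (37.04, 25.06),
         (37.06, 25.09), (37.08, 25.12), (37.10, 25.15)]
samples = make_windows(track)
```

With six positions and a window of four, two samples are produced, each with eight inputs and one two-valued target.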

In our approach, we make use of incremental pruning. This enables the ANN to autonomously select the optimal hidden layer structure based on its capacity to learn best. In such an approach, we predetermine the number of input and output neurons while providing a range of minimum to maximum numbers of hidden neurons and layers. The algorithm incrementally increases the size of the neural network and retrains at each increment until it reaches the maximum limits. When the maximum is reached, the configuration that trained best is considered the optimal network configuration. Incremental pruning led us to a proposed structure of one hidden layer with 53 hidden neurons, although good results were also achieved with two hidden layers with 50 and 14 hidden neurons (Fig. 1).
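The search loop described above can be sketched generically in Python. This is not the library routine used in this work; it only shows the grow, retrain and keep-the-best pattern for a single hidden layer. The callback `train_and_score` is hypothetical: in practice it would build a network with the given number of hidden neurons, train it and return its error. The toy error function below is a stand-in with no relation to the results reported here.

```python
def incremental_search(train_and_score, min_hidden, max_hidden):
    """Try every hidden-layer size in [min_hidden, max_hidden] and keep the best.

    `train_and_score(hidden)` is a hypothetical callback that trains a network
    with `hidden` neurons and returns its training error (lower is better).
    """
    best_size, best_error = None, float("inf")
    for hidden in range(min_hidden, max_hidden + 1):
        error = train_and_score(hidden)
        if error < best_error:
            best_size, best_error = hidden, error
    return best_size, best_error

def toy_error(hidden):
    """Stand-in for real training: an error curve with its minimum at 7 neurons."""
    return abs(hidden - 7) / 100

best_size, best_error = incremental_search(toy_error, min_hidden=1, max_hidden=20)
```

Extending the sketch to several hidden layers amounts to iterating over tuples of layer sizes instead of a single integer.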

Fig. 1 The proposed ANN architecture. Our proposed solution makes use of a backward window size of four; thus requiring a structure that could handle eight input neurons (latitude and longitude for four positions). The output is the predicted position in a future point in time

An important element of a neural network’s structure is the activation function: neurons transform their net inputs by using a scalar-to-scalar function called the “activation function”, “threshold function” or “transfer function”, and output a result value called the “unit’s activation” (Karlik and Olgac 2010). In general, the activation function introduces a degree of nonlinearity that is valuable for most ANN applications. The predicted output of our ANN is in the range [−1, 1]; thus we selected the hyperbolic tangent function as the activation function for the hidden and output layers (Gomes and Ludermir 2013).
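Since the hyperbolic tangent produces values between −1 and 1, coordinates have to be mapped into that range before training and mapped back afterwards. The Python sketch below assumes simple min-max scaling; the scaling scheme, the function names and the latitude bounds are illustrative assumptions, not details taken from the implementation.

```python
import math

def normalize(value, low, high):
    """Map `value` from [low, high] linearly onto [-1, 1]."""
    return 2.0 * (value - low) / (high - low) - 1.0

def denormalize(scaled, low, high):
    """Inverse of `normalize`: map a value in [-1, 1] back onto [low, high]."""
    return (scaled + 1.0) * (high - low) / 2.0 + low

# Hypothetical latitude bounds for the area of interest.
LAT_MIN, LAT_MAX = 35.0, 40.0

scaled = normalize(37.5, LAT_MIN, LAT_MAX)         # midpoint of the range
restored = denormalize(scaled, LAT_MIN, LAT_MAX)   # back to degrees
activation = math.tanh(0.8)                        # tanh output lies in (-1, 1)
```

The bounds must be the same for training and prediction, otherwise the de-normalized positions will be shifted.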

Training is the means by which neural network weights are adjusted to give desirable outputs. The propagation training algorithm goes through a series of iterations, each of which will most likely improve the neural network’s error rate by some degree (Heaton 2011). The error rate is the percentage difference between the actual output from the neural network and the ideal output provided by the training data (Heaton 2008). The mean square error (MSE) is an error calculation method used in describing how well a machine learning method, typically a regression model, represents the data being modeled (Heaton 2011). This process is repeated until the error for each training pattern drops under a certain accepted level. A large number of different algorithms have been proposed to solve the problem of updating the weights in an appropriate way, by adapting the parameters during the learning process (Braga et al. 2000). Training algorithms are divided into two categories, global and local adaptation strategies. Global adaptation strategies utilize the state information of the entire network to modify global parameters, whereas local adaptation strategies make use of weight-specific information, such as the local gradient, to adjust each weight parameter individually. To train our model we experimented with a number of training algorithms, including Back Propagation, Resilient Propagation, Quick Propagation, Manhattan Propagation and Levenberg-Marquardt training. The best training times were achieved with resilient backpropagation (RPROP). RPROP is based on the traditional backpropagation method with just one difference: weight updating is done by evaluating the behavior of the error function. With RPROP, the value of the weight update is calculated by evaluating the sign of the partial derivative from one iteration to another, improving the learning process, eliminating some problems encountered in the backpropagation algorithm and making the method faster than the traditional one (Riedmiller and Braun 1993; Souza et al. 2004). RPROP was able to achieve a target MSE of 0.01 in only a few seconds. Vessel data is so voluminous that it is impossible to train the ANN on raw past vessel data. To this end, we exploit the fact that specific vessel types follow repetitive patterns in short periods. We concentrate our study on passenger vessels that perform repetitive voyages within a given timeframe (such as around the Aegean Islands, where voyages are performed within a few hours or days). Predicting the behavior of a vessel engaged in tramp trade or similar trade would require a totally different approach, as on many occasions these vessels may not have performed a similar voyage in their recent past data. This tradeoff allowed for much shorter training times (5–10 s) and a desired MSE of 0.01 (Fig. 2).

Fig. 2 Error rate (MSE) reduction through training iterations (Epochs)
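The sign-based update at the heart of RPROP can be sketched in Python on a toy error function. The sketch uses the variant without weight backtracking (often called iRPROP−), which is an assumption and not necessarily the variant provided by the library; the step-size constants are the commonly quoted defaults, and the quadratic error is a hypothetical stand-in for a network’s error surface.

```python
def rprop_minimize(grad_fn, weights, epochs=100,
                   eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_min=1e-6, step_max=50.0):
    """Minimize a function using only the sign of its partial derivatives.

    Each weight keeps its own step size: it grows while the gradient sign is
    stable and shrinks when the sign flips (a minimum was stepped over).
    """
    weights = list(weights)
    steps = [step_init] * len(weights)
    prev_grad = [0.0] * len(weights)
    for _ in range(epochs):
        grad = grad_fn(weights)
        for i, g in enumerate(grad):
            if g * prev_grad[i] > 0:       # same sign as before: accelerate
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif g * prev_grad[i] < 0:     # sign flipped: overshoot, slow down
                steps[i] = max(steps[i] * eta_minus, step_min)
                g = 0.0                    # skip this weight's update once
            if g > 0:
                weights[i] -= steps[i]
            elif g < 0:
                weights[i] += steps[i]
            prev_grad[i] = g
    return weights

def toy_gradient(w):
    """Gradient of the toy error (w0 - 3)^2 + (w1 + 1)^2."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

solution = rprop_minimize(toy_gradient, [0.0, 0.0])
```

Note that the magnitude of the gradient never enters the update, only its sign, which is what makes the method insensitive to the scale of the error surface.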

2.4 Model evaluation

We implement our ANN in C# using the Encog3 Machine Learning Library (Heaton 2008). Encog is an open source advanced machine learning framework that supports a variety of algorithms, as well as support classes to normalize and process data. Most Encog training algorithms are multi-threaded and scale well to multicore infrastructure. To train and test our ANN we made a selection of passenger vessels and loaded data regarding these from the previous 48–72 h (depending on prediction).

The conventional approach to evaluating an ANN’s accuracy usually involves randomly setting aside a portion of the dataset, e.g. 70 % for training and 30 % for testing. The training data set is used exclusively for model development and then the test sample is used only to assess the trained network. After training was completed, we evaluated the trained ANN by feeding it data that was excluded from the original dataset. This data was collected in the following 24 h and was previously unseen by the ANN during training. This data was pre- and post-processed in the same way as the training data. In the following figures we report evaluation results regarding predictions (Figs. 3, 4, 5, 6).

Fig. 3 Future (15 min ahead steps) latitude prediction for a vessel sailing around the Aegean, Greece

Fig. 4 Future (15 min ahead steps) longitude prediction for a vessel sailing around the Aegean, Greece

Fig. 5 Future (4 h ahead steps) latitude prediction for a vessel sailing around the Aegean, Greece

Fig. 6 Future (4 h ahead steps) longitude prediction for a vessel sailing around the Aegean, Greece

In the following tables, predicted values and true values are provided for a number of different vessels travelling across the sea. For each of these, the error rate is calculated (Tables 2, 3, 4).

Table 2 Predicted position (latitude and longitude) for a vessel previously unseen by the ANN
Table 3 Predicted position (latitude and longitude) for a vessel previously unseen by the ANN
Table 4 Predicted position (latitude and longitude) for a vessel previously unseen by the ANN

3 Solution web architecture and deployment

3.1 Deployment and design choices

Following the successful testing and evaluation of the ANN model, we chose to design and deploy a prototype web application that would meet the previously identified requirements. Due to the nature of our application, the web application was built using the ASP.NET MVC 5 (model-view-controller) framework. ASP.NET MVC 5 is a framework for building scalable, standards-based web applications using well-established design patterns; it places an emphasis on a loosely coupled application architecture and highly maintainable code (Chadwick et al. 2012; Galloway et al. 2012). The model-view-controller pattern is an architectural pattern that encourages strict isolation between the individual parts of an application (loose coupling). The MVC pattern splits an application into three layers: the model, the view, and the controller. The model represents core business logic and data. Models encapsulate the properties and behavior of a domain entity and expose properties that describe the entity. The proposed solution contains models encapsulating vessel data. The view in MVC is responsible for transforming a model or models into a visual representation. In web applications, this most often means generating HTML to be rendered in the user’s browser, although views can manifest in many forms, such as the AJAX interactive map mashups in our solution. The controller, essentially C# code, controls the application logic and acts as the coordinator between the view and the model. Controllers receive input from users via the view, and then work with the model to perform specific actions, passing the results back to the view. A user can select any vessel from the interactive map, initiating the prediction procedure (Fig. 7).

Fig. 7 Code map of MVC5 Home Controller and prediction function (generated by MS Visual Studio 2013 code analysis)

The “Prediction” function retrieves vessel data from the database, normalizes it, trains the ANN and returns the predicted geographical vessel position. Predictions from the neural network are transformed back to the original data scale before being positioned on the map (http://mob0.marinetraffic.com/). The system architecture employs technologies that support interoperability between loosely coupled components. In particular, the system design follows the principles of service-oriented architectures, exposing SOAP or REST web service interfaces. Data from the vessel position database, as well as vessel predictions, are returned by a web service as JSON or XML (Fig. 8).

Fig. 8 Passenger vessel future position (latitude and longitude) prediction overlaid on an interactive map for user visualization (http://mob0.marinetraffic.com/)

As opposed to going through a lengthy training process and only deploying the neural network after the process has been successful, we chose to train our network per vessel, on user request, in real time. When a user selects a vessel or specific geographic area, previous data is loaded according to the required prediction. This data is fed into the neural network for training until its error rate is below 0.01. As this is a CPU-demanding process, we have deployed a cloud infrastructure to support scaling on demand. Following this, the system stores the trained vessel ANN and recalls it in the near future (24 h) for subsequent predictions. To guarantee the solution’s quality of service and practical applicability, a number of functional requirements had to be met. Several design choices were made to address these, as presented in the following table (Table 5).

Table 5 Requirements mapped to design choices
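The train-on-request behavior with reuse of a trained network for 24 h can be sketched as a small time-limited cache. The Python sketch below is an assumption about how such a store might look, not the deployed design: the class name, the in-memory dictionary and the `train` callback are all hypothetical, and the clock is injectable only so that the expiry logic can be exercised without waiting.

```python
import time

class ModelCache:
    """Keep one trained model per vessel and retrain once it is too old."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # mmsi -> (trained_at, model)

    def get_or_train(self, mmsi, train):
        """Return the cached model for `mmsi`, calling `train(mmsi)` if needed."""
        now = self._clock()
        entry = self._entries.get(mmsi)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        model = train(mmsi)  # expensive step: only on a miss or after expiry
        self._entries[mmsi] = (now, model)
        return model

# Usage with a fake clock and a stand-in training function.
fake_now = [0.0]
training_runs = []

def fake_train(mmsi):
    training_runs.append(mmsi)
    return "model-for-%d" % mmsi

cache = ModelCache(clock=lambda: fake_now[0])
first = cache.get_or_train(240000001, fake_train)   # trains
second = cache.get_or_train(240000001, fake_train)  # served from the cache
fake_now[0] = 25 * 3600                             # more than 24 h later
third = cache.get_or_train(240000001, fake_train)   # trains again
```

In a multi-instance cloud deployment the dictionary would have to be replaced by a shared store, which is outside the scope of this sketch.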

3.2 Solution case study

Our proposal can potentially be used as the predictive foundation for various intelligent systems, including vessel collision prevention, vessel route planning, operation efficiency estimation and even anomaly detection. In this section, we present a case study investigating vessel collision detection. A user is permitted to select a geographical area containing a number of passenger vessels. If no trained ANN is present for a vessel, the training process is initiated, and a predicted vessel track is returned for each vessel. We compute a vessel’s bearing and speed between geographical points (latitude and longitude). We call a C# implementation of the Haversine formula that calculates the distance and bearing between two geographical points. If at any point the calculated distance between two vessels’ predicted positions is smaller than a predetermined minimum, an alert is generated by the system, informing the human operator. The same function is capable of predicting a vessel’s collision with ground when overloaded with the vessel prediction array and nautical information (an array of land coordinates). The following UML sequence diagram depicts the flow of messages between functions and other components in such a scenario (Fig. 9).

Fig. 9 Sequence diagram depicting flow of messages in vessel collision detection scenario (generated by Visual Studio 2013 code analysis)
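The distance and bearing computation used in this case study can be sketched in Python as follows. The sketch shows the Haversine great-circle distance, the standard initial-bearing formula and a step-by-step comparison of two predicted tracks. The function names, the mean Earth radius of 6371 km, the 2 km threshold and the sample tracks are illustrative assumptions, not values taken from the implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def first_conflict(track_a, track_b, min_km):
    """Index of the first step at which two tracks are closer than `min_km`.

    Both tracks are lists of (lat, lon) predicted for the same time steps.
    Returns None if the tracks never come that close.
    """
    for i, ((lat_a, lon_a), (lat_b, lon_b)) in enumerate(zip(track_a, track_b)):
        if haversine_km(lat_a, lon_a, lat_b, lon_b) < min_km:
            return i
    return None

# Two hypothetical predicted tracks converging along the same meridian.
vessel_a = [(37.00, 25.0), (37.10, 25.0), (37.20, 25.0)]
vessel_b = [(37.40, 25.0), (37.30, 25.0), (37.21, 25.0)]
alert_step = first_conflict(vessel_a, vessel_b, min_km=2.0)
```

Dividing the distance between two consecutive predicted positions by the 15-min sampling interval gives an estimate of speed, and passing an array of land coordinates as the second track would turn the same comparison into a simple grounding check.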

4 Conclusion

In this manuscript, we report on our ongoing work in adding predictive capacity to VTMIS, specifically MarineTraffic.com. We describe our work on employing machine learning methods, specifically neural networks, as a basis for accurately predicting a vessel’s future behavior with an emphasis on solution practicality. To this end, we focus on deploying a web-based infrastructure that can produce good results in low computational time. Further improvements are currently being made in order to accommodate contextual information during the machine learning process.