1 Introduction

Maritime traffic situation awareness aims to provide both static and dynamic navigation information (e.g., traffic density, speed, trajectory) to enhance maritime safety, especially in the autonomous ship era. Significant attention has been paid to obtaining high-fidelity maritime traffic situation awareness (MTSA) information under typical yet complicated traffic scenarios [1, 2]. Many studies implement the MTSA task with the support of automatic identification system (AIS) data. Indeed, ship crews and maritime traffic regulators can determine real-time traffic situations from instant AIS data. Radar data are employed to further enhance MTSA information by identifying ship echoes, which is particularly useful during ship berthing and unberthing procedures. However, AIS data may be contaminated by various duplicated samples, whilst radar may fail to detect wooden ships sailing in the channels [3, 4].

The rapid development of computer vision techniques has shown their potential for implementing the traffic situation awareness task, owing to advantages such as intuitive interpretability, low cost, and easy deployment. Previous studies focus on extracting traffic spatial-temporal information from maritime surveillance videos through ship detection, tracking, and related procedures [5, 6]. Note that MTSA can be readily fulfilled with consecutive trajectories derived from visual ship tracking results, which supports trajectory clustering, navigation decision-making, and collision avoidance analysis [7, 8].

Many MTSA-related studies have been conducted with the support of varied maritime data sources (e.g., AIS, radar). Overall, previous studies were mainly implemented on high-fidelity maritime data (i.e., the input maritime data did not need to be cleansed). However, raw maritime data sources may contain unexpected outliers introduced during the data collection procedure. In addition, many imaging interferences can be observed in MTSA tasks. For instance, maritime images under different visibility conditions (strong lighting, low visibility, etc.) may significantly degrade MTSA model performance. To address the above-mentioned disadvantages, we propose an MTSA-oriented framework consisting of fog interference removal, ship tracking, ship trajectory outlier rectification, and traffic situation analysis.

Our contributions can be summarized as follows: (1) we propose a systematic framework to extract ship trajectories from maritime images and analyze maritime traffic situations; (2) we introduce the dark channel prior model to remove fog in maritime videos, and thus provide fog-free images for accurate ship trajectory extraction; (3) we extract ship imaging trajectories from the fog-free images via ship tracking and trajectory rectification, and then implement maritime traffic situation awareness from a pixel-wise perspective; (4) we verify the proposed framework's performance on three typical yet common maritime traffic scenarios. The remainder of this study is organized as follows. Section 2 reviews the literature on MTSA with varied maritime data sources. Section 3 details the methodology for image defogging and ship trajectory extraction. Section 4 provides the experimental results, and Section 5 briefly concludes the study.

2 Literature review

2.1 Ship trajectory data extraction and outlier removal

High-resolution ship trajectories are vital for maritime traffic control and management, and thus much research effort has been devoted to obtaining accurate ship trajectories from varied maritime data sources (e.g., AIS data, radar, maritime surveillance videos) [9, 10]. Previous studies suggest that raw AIS data may contain unexpected outliers due to limited data transmission capacity, fake ship maritime mobile service identities, deliberately shut-down AIS facilities, etc. [11, 12]. Guo et al. introduced an improved kinematic interpolation framework to reconstruct ship trajectories from AIS data via the steps of AIS data preprocessing, time interval distribution equalization, outlier removal, and kinematic interpolation [13]. Chen et al. proposed an alternative method to detect ships in coastal channels with high-frequency coastal array radar, aiming to help maritime traffic regulators obtain accurate real-world ship trajectory information when ship AIS data is missing [14]. Maritime surveillance videos provide on-site ship trajectories in an easy yet informative manner, considering that the imaging principle is close to the human visual perception mechanism [15,16,17].

2.2 Maritime traffic situation awareness

Maritime traffic situation awareness has attracted significant attention in the maritime traffic community due to increasing waterway activities (e.g., traffic volume surging significantly in inland waters). Sui et al. introduced complex network theory to map ship relationships (under varied typical traffic scenarios) into a complex network by establishing macro indices [18]. Szlapczynski et al. proposed a novel framework to recognize and visualize ship collision avoidance warning information via the collision threat parameters area technique, which aims to enhance maritime traffic safety under adverse weather conditions [19]. Sharma et al. proposed a goal-directed task analysis model to assess the maritime traffic situation via maritime domain information and pilotage operations with the help of ship AIS data [20]. Du et al. estimated the give-way ship's intention to establish a basic framework for the stand-on ship as a second line of defense, considering that little attention had been paid to this maritime traffic situation [21]. Similar studies can be found in [22,23,24]. MTSA is an active topic in the maritime community, implemented by exploring varied data sources to obtain both kinematic and static maritime traffic information.

2.3 Visual analysis under complex scene

Many studies have been conducted to fulfill the maritime traffic situation awareness task with the help of cutting-edge computer vision techniques. Yang et al. proposed varied detectors to highlight features of small, cluttered, and rotated objects for robust object detection [25,26,27]. Hong et al. allocated four anchor boxes to each detection scale in their Gaussian-YOLO layer to tackle the multi-scale detection problem [28]. Shi et al. proposed a metric-based few-shot method to mitigate the disadvantage of insufficient ship training samples in the object detection task [29]. Zhang et al. argued that the volume and diversity of a dataset can be expanded by including both original and enhanced images; moreover, they proposed a hybrid training method that synthetically trains on degraded images to enhance model adaptability for the small target detection challenge [30]. In comparison, we address the maritime traffic situation awareness problem through the dark channel prior fog removal model and the SAMF tracking model.

3 Methodology

The proposed framework aims to achieve the MTSA task and support autonomous navigation for smart ships. To that aim, the framework is implemented via the steps of fog removal, ship tracking, and abnormal ship imaging trajectory correction. More specifically, the first step obtains fog-free maritime images, which fulfills the lighting radiance recovery task. After that, we obtain raw ship positions from the fog-free images via the scale adaptive kernel correlation filter (SAMF) tracking model. A curve fitting model is further introduced to remove abnormal ship positions, and thus the MTSA is analyzed with the support of pixel-wise ship imaging trajectory data. A schematic overview of the proposed framework is shown in Fig. 1.

Fig. 1 Overview of the proposed maritime traffic situation awareness framework

3.1 Fog removal with dark channel prior model

Adverse weather conditions may introduce imaging interference, which further degrades MTSA performance and accuracy. For foggy conditions, the classic dark channel prior model presents satisfactory performance in the fog removal task; it removes fog through the atmospheric degradation model formulated as Eq. (1). We obtain the output fog-free image \(\mathrm{O}\left(\mathrm{m}\right)\) based on the estimation of \(\mathrm{t}\left(\mathrm{m}\right)\) and \(\mathrm{G}\) by rearranging Eq. (1) into Eq. (2).

$$\mathrm{R}\left(\mathrm{m}\right)=\mathrm{O}\left(\mathrm{m}\right)\mathrm{t}\left(\mathrm{m}\right)+\mathrm{G}\left(1-\mathrm{t}\left(\mathrm{m}\right)\right)$$
(1)
$$\mathrm{O}\left(\mathrm{m}\right)=\frac{\mathrm{R}\left(\mathrm{m}\right)-\mathrm{G}}{\mathrm{t}\left(\mathrm{m}\right)}+\mathrm{G}$$
(2)

where \(\mathrm{R}\left(\mathrm{m}\right)\) is the foggy image, \(\mathrm{O}\left(\mathrm{m}\right)\) is the output fog-free image, \(\mathrm{G}\) is the atmospheric light, and \(\mathrm{t}\left(\mathrm{m}\right)\) is the transmission rate. The symbol \(\mathrm{m}\) denotes a pixel coordinate.

3.1.1 Estimation of G and t(m)

First, we estimate the atmospheric light \(\mathrm{G}\). The assumption underlying the dark channel prior model is that, within a fog-free image patch, the minimum intensity over the color channels is obviously lower than that of its foggy counterparts. More specifically, the fog-free patch intensity is quite close to zero, which can be approximated as Eq. (3).

$${\mathrm{O}}^{\mathrm{d}}\left(\mathrm{m}\right)=\underset{\mathrm{c}\in \{\mathrm{r},\mathrm{g},\mathrm{b}\}}{\mathrm{min}}\left(\underset{\mathrm{n}\in {\Psi }\left(\mathrm{m}\right)}{\mathrm{min}}\left({\mathrm{O}}^{\mathrm{c}}\left(\mathrm{n}\right)\right)\right)\approx 0$$
(3)

where \({\mathrm{O}}^{\mathrm{c}}\left(\mathrm{n}\right)\) is the color channel of the ship image \(\mathrm{O}\) at pixel \(\mathrm{n}\), and \({\Psi }\left(\mathrm{m}\right)\) is a local image patch centered at pixel \(\mathrm{m}\). \({\mathrm{O}}^{\mathrm{d}}\left(\mathrm{m}\right)\) is the dark channel of the output image at pixel \(\mathrm{m}\), and its value is quite close to zero when the ship image \(\mathrm{O}\) is taken under good visibility conditions. The parameter \(\mathrm{n}\) iterates over the pixels in \({\Psi }\left(\mathrm{m}\right)\), and \(\mathrm{c}\) denotes the color channel (i.e., the r (red), g (green) and b (blue) channels).
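For concreteness, Eq. (3) reduces to a per-pixel minimum over the color channels followed by a local minimum filter. Below is a minimal Python/OpenCV sketch, assuming a uint8 BGR input image; the 7 × 7 patch matches the kernel size reported in Section 4.1.

```python
import cv2
import numpy as np

def dark_channel(image, patch_size=7):
    """Dark channel of Eq. (3): min over color channels, then min over Psi(m)."""
    min_channel = np.min(image, axis=2)  # min over c in {r, g, b}
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch_size, patch_size))
    return cv2.erode(min_channel, kernel)  # grayscale erosion = local min filter
```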

Based on the above assumption, we reformulate Eq. (1) into Eq. (4) by applying the dark channel method.

$${\mathrm{R}}^{\mathrm{d}}\left(\mathrm{m}\right)=\left(1-\mathrm{t}\left(\mathrm{m}\right)\right)\mathrm{G}$$
(4)

Since \(\mathrm{t}\left(\mathrm{m}\right)\) decreases as the scene distance becomes larger, we can use \({\mathrm{R}}^{\mathrm{d}}\left(\mathrm{m}\right)\) to estimate \(\mathrm{G}\). Note that fog-free zone intensity is obviously lower than that of its foggy counterparts, which indicates that the intensity of an image block taken under foggy conditions is significantly higher than that under normal weather (e.g., sunshine). In other words, we select the pixels whose intensity values rank in the top 0.1% of the dark channel image (as the fog density at these pixels is usually significantly higher than that of the remaining pixels). Among these candidates, the pixel with the maximum intensity in the input maritime image is selected as the estimate of the atmospheric light \(\mathrm{G}\).
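This selection rule can be sketched as follows, reusing `dark_channel` from the previous snippet; the 0.1% fraction follows the text.

```python
import numpy as np

def estimate_atmospheric_light(image, dark):
    """Estimate G from the top 0.1% brightest dark-channel pixels."""
    flat_dark = dark.ravel()
    n_top = max(1, flat_dark.size // 1000)  # top 1 permille of pixels
    idx = np.argpartition(flat_dark, -n_top)[-n_top:]
    candidates = image.reshape(-1, 3)[idx].astype(np.float64)
    # Among the candidates, keep the pixel with the highest overall intensity.
    return candidates[np.argmax(candidates.sum(axis=1))]
```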

After that, we estimate the transmission rate \(\mathrm{t}\left(\mathrm{m}\right)\) with the support of Eq. (5). A previous study suggested that the transmission coefficient within each local patch of the maritime ship image is constant, and thus \(\mathrm{t}\left(\mathrm{m}\right)\) is replaced with \(\stackrel{\sim}{\mathrm{t}}\left(\mathrm{m}\right)\). We obtain Eq. (6) with the support of Eq. (3). Lightly fog-interfered maritime images help ship officers better perceive on-site traffic situations (e.g., the distance between their own ship and its neighbors). To that aim, we retain partial fog during the fog-removal procedure by introducing a convergence factor \({\upbeta }\), and Eq. (6) is reformulated as Eq. (7). Note that a larger \({\upbeta }\) may distort the raw foggy maritime image, while a smaller \({\upbeta }\) may degrade the fog-removal performance.

$${\mathrm{R}}^{\mathrm{d}}\left(\mathrm{m}\right)=\underset{\mathrm{c}}{\mathrm{min}}\left(\underset{\mathrm{n}\in {\Psi }\left(\mathrm{m}\right)}{\mathrm{min}}\left(\frac{{\mathrm{R}}^{\mathrm{c}}\left(\mathrm{n}\right)}{{\mathrm{G}}^{\mathrm{c}}}\right)\right)=\stackrel{\sim}{\mathrm{t}}\left(\mathrm{m}\right)\underset{\mathrm{c}}{\mathrm{min}}\left(\underset{\mathrm{n}\in {\Psi }\left(\mathrm{m}\right)}{\mathrm{min}}\left(\frac{{\mathrm{O}}^{\mathrm{c}}\left(\mathrm{n}\right)}{{\mathrm{G}}^{\mathrm{c}}}\right)\right)+1-\stackrel{\sim}{\mathrm{t}}\left(\mathrm{m}\right)$$
(5)
$$\stackrel{\sim}{\mathrm{t}}\left(\mathrm{m}\right)=1-\underset{\mathrm{c}}{\mathrm{min}}\left(\underset{\mathrm{n}\in {\Psi }\left(\mathrm{m}\right)}{\mathrm{min}}\left(\frac{{\mathrm{R}}^{\mathrm{c}}\left(\mathrm{n}\right)}{{\mathrm{G}}^{\mathrm{c}}}\right)\right)$$
(6)
$$\stackrel{\sim}{\mathrm{t}}\left(\mathrm{m}\right)=1-{\upbeta }\underset{\mathrm{c}}{\mathrm{min}}\left(\underset{\mathrm{n}\in {\Psi }\left(\mathrm{m}\right)}{\mathrm{min}}\left(\frac{{\mathrm{R}}^{\mathrm{c}}\left(\mathrm{n}\right)}{{\mathrm{G}}^{\mathrm{c}}}\right)\right)$$
(7)
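In code, Eq. (7) amounts to computing the dark channel of the image normalized by \(\mathrm{G}\) and scaling it by \({\upbeta }\). A minimal sketch reusing `dark_channel` from above, with β = 0.91 as later set in Section 4.2:

```python
import numpy as np

def estimate_transmission(image, G, beta=0.91, patch_size=7):
    """Transmission estimate of Eq. (7); beta < 1 retains a thin fog layer."""
    normalized = image.astype(np.float64) / G  # R^c(n) / G^c, per channel
    return 1.0 - beta * dark_channel(normalized, patch_size)
```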

3.1.2 Obtaining the output fog-free image

With the estimates of \(\mathrm{G}\) and \(\mathrm{t}\left(\mathrm{m}\right)\) in hand, we come to the last step of fog removal. The raw fog-removal image (i.e., the output of the previous step) may contain unexpected noisy pixels, so a threshold \({\mathrm{t}}_{\mathrm{t}}\) is introduced to suppress them. The fog-removed maritime image is obtained with Eq. (8), which is formulated from Eq. (2).

$$\mathrm{O}\left(\mathrm{m}\right)=\frac{\mathrm{R}\left(\mathrm{m}\right)-\mathrm{G}}{\mathrm{m}\mathrm{a}\mathrm{x}\left(\mathrm{t}\left(\mathrm{m}\right), {\mathrm{t}}_{\mathrm{t}}\right)}+\mathrm{G}$$
(8)
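A sketch of the recovery step in Eq. (8); the lower bound \({\mathrm{t}}_{\mathrm{t}}\) = 0.1 (the value used in Section 4.2) keeps the division stable in dense-fog regions.

```python
import numpy as np

def recover_radiance(image, G, t, t_t=0.1):
    """Scene radiance recovery of Eq. (8)."""
    t = np.maximum(t, t_t)[..., np.newaxis]  # clamp, then broadcast over channels
    O = (image.astype(np.float64) - G) / t + G
    return np.clip(O, 0, 255).astype(np.uint8)
```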

3.2 Ship imaging trajectory determination with SAMF tracking model

Ship trajectory extraction accuracy heavily depends on ship tracking model performance. In our study, we introduce the SAMF tracking model (an improved kernel correlation filter) to obtain ship imaging trajectories from fog-free maritime images. We initialize the SAMF tracker with the ship position in the first frame (which is manually labeled). The raw input fog-free maritime image is transformed into one-dimensional data \(\mathrm{b}=\left[{\mathrm{b}}_{1},{\mathrm{b}}_{2},\cdots ,{\mathrm{b}}_{\mathrm{n}}\right]\) for the purpose of fulfilling the ship tracking task. Moreover, we obtain cyclically shifted ship training samples from the base ship sample, e.g., \({\mathrm{D}}_{\mathrm{b}}= \left[{\mathrm{b}}_{\mathrm{n}},{\mathrm{b}}_{1},\cdots ,{\mathrm{b}}_{\mathrm{n}-1}\right]\).
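To make the cyclic-shift construction concrete, the toy numpy snippet below builds the full circulant sample matrix from a four-element base vector; the values are illustrative only.

```python
import numpy as np

b = np.array([1.0, 2.0, 3.0, 4.0])                    # toy base ship sample
D = np.stack([np.roll(b, k) for k in range(len(b))])  # row k = b shifted k times
print(D[1])  # [4. 1. 2. 3.], i.e., [b_n, b_1, ..., b_{n-1}] as in D_b above
```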

The goal of the tracking model is to score the ship candidate samples. In fact, the problem can be regarded as a linear ridge regression model, whose objective function is formulated as Eq. (9). The closed-form solution of Eq. (9) is given in Eq. (10).

$$\underset{\mathrm{u}}{\mathrm{min}}\sum\nolimits_{\mathrm{j}=1}^{\mathrm{v}}{(\mathrm{f}\left({\mathrm{b}}_{\mathrm{j}}\right)-{\mathrm{y}}_{\mathrm{j}})}^{2}+{{\upalpha }\parallel \mathrm{u}\parallel }^{2}$$
(9)
$$\mathrm{u}={\left({\mathrm{B}}^{\mathrm{T}}\mathrm{B}+{\upalpha }\mathrm{I}\right)}^{-1}{\mathrm{B}}^{\mathrm{T}}\mathrm{y}$$
(10)

where \({\upalpha }\) determines the regularization level of the ship tracking model, \(\mathrm{f}\left({\mathrm{b}}_{\mathrm{j}}\right)\) is the model response to the jth ship sample, \({\mathrm{y}}_{\mathrm{j}}\) is the jth regression target, and \(\mathrm{u}\) is the solution vector. The function \(\mathrm{f}\left(\mathrm{b}\right)={\mathrm{u}}^{\mathrm{T}}\mathrm{b}\) is a linear combination of the input ship base samples. Note that \(\widehat{\mathrm{u}}\) (\(\widehat{\mathrm{b}}\)) is the discrete Fourier transform of \(\mathrm{u}\) (\(\mathrm{b}\)), \(\mathrm{B}\) is the circulant matrix, and \(\mathrm{I}\) is an identity matrix.

We employ the circulant property of the matrix \(\mathrm{B}\), shown in Eq. (11), to simplify Eq. (10), which yields the Fourier-domain solution in Eq. (12).

$$\mathrm{B}= {\mathrm{F}}^{\mathrm{H}}\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left(\mathrm{F}\mathrm{b}\right)\mathrm{F}$$
(11)

where \(\mathrm{F}\) is the discrete Fourier transform matrix, which transforms ship sample data from the spatial domain into the frequency (i.e., Fourier) domain. \({\mathrm{F}}^{\mathrm{H}}\) is the Hermitian transpose of \(\mathrm{F}\). The diagonal matrix of \(\mathrm{F}\mathrm{b}\) (the Fourier transform of \(\mathrm{b}\)) is written as \(\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left(\mathrm{F}\mathrm{b}\right)\).

$$\widehat{\mathrm{u}}=\left({\widehat{\mathrm{b}}}^{\mathrm{*}}\odot \widehat{\mathrm{y}}\right)/\left({\widehat{\mathrm{b}}}^{\mathrm{*}}\odot \widehat{\mathrm{b}}+{\upalpha }\right)$$
(12)

The symbol \(\mathrm{*}\) denotes the complex conjugate, and \(\odot\) denotes the element-wise product. The operator \({\mathrm{F}}^{-1}\) is the inverse of the Fourier transform \(\mathrm{F}\).
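The following is a minimal numpy transcription of Eq. (12) under the stated definitions; the element-wise Fourier-domain division replaces the matrix inversion of Eq. (10), reducing the cost from roughly O(n³) to O(n log n).

```python
import numpy as np

def solve_filter(b, y, alpha=1e-4):
    """Fourier-domain ridge regression solution of Eq. (12)."""
    b_hat = np.fft.fft(b)  # \hat{b}: DFT of the base sample
    y_hat = np.fft.fft(y)  # \hat{y}: DFT of the regression targets
    u_hat = (np.conj(b_hat) * y_hat) / (np.conj(b_hat) * b_hat + alpha)
    return np.real(np.fft.ifft(u_hat))  # F^{-1}: back to the spatial domain
```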

To locate accurate ship positions in maritime images, we employ the bilinear interpolation method to exploit ship features at varied scales. Given a ship template \({\mathrm{T}\mathrm{D}}_{\mathrm{T}}=\left({\mathrm{T}\mathrm{D}}_{\mathrm{x}},{\mathrm{T}\mathrm{D}}_{\mathrm{y}}\right)\) in the first maritime image, we define a scale pool \(\mathrm{S}\mathrm{P}=\left\{{\mathrm{s}\mathrm{p}}_{1},{\mathrm{s}\mathrm{p}}_{2},\cdots ,{\mathrm{s}\mathrm{p}}_{\mathrm{r}}\right\}\) for obtaining ship training samples at different scales, and \({\mathrm{T}}_{\mathrm{w}}\) is the window size for identifying ship visual features in the maritime image. We sample the ship at different cosine window sizes to identify potential ship candidates in the current image. The candidate samples are resized to the same scale, and the maximum response between the input training sample and the ship candidates is obtained with Eq. (13).

Here \({\mathrm{s}\mathrm{p}}_{\mathrm{r}}\) is the rth ship sampling scale, and \(\widehat{\mathrm{f}}\left({\mathrm{z}}^{{\mathrm{s}\mathrm{p}}_{\mathrm{r}}}\right)\) is the response between the input ship sample and the ship candidate at scale \({\mathrm{s}\mathrm{p}}_{\mathrm{r}}\) in the Fourier domain. The final ship candidate is the result of Eq. (13):

$$\underset{{\mathrm{sp}}_{\mathrm{r}}\in \mathrm{SP}}{\mathrm{arg}\,\mathrm{max}}\ {\mathrm{F}}^{-1}\widehat{\mathrm{f}}\left({\mathrm{z}}^{{\mathrm{sp}}_{\mathrm{r}}}\right)$$
(13)
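The scale search of Eq. (13) then amounts to keeping the scale and pixel location with the largest response peak. A hedged sketch, assuming the tracker has already produced one response map \({\mathrm{F}}^{-1}\widehat{\mathrm{f}}\left({\mathrm{z}}^{{\mathrm{sp}}_{\mathrm{r}}}\right)\) per scale in SP:

```python
import numpy as np

def best_candidate(responses):
    """Arg-max of Eq. (13) over the scale pool SP and over pixel positions."""
    peaks = [float(np.max(r)) for r in responses]  # peak response per scale
    r_star = int(np.argmax(peaks))                 # best scale index in SP
    pos = np.unravel_index(np.argmax(responses[r_star]), responses[r_star].shape)
    return r_star, pos
```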

3.3 Ship trajectory outlier removal

We carefully checked the raw ship imaging trajectories (i.e., the per-frame ship positions output by the previous step) and found many trivial yet abnormal oscillations in the trajectory dataset. The main reason is that the target ship is temporarily occluded by neighboring ships, so the SAMF tracking model may fail to obtain distinguishable ship visual features from the fog-free maritime images (i.e., it extracts features from the occluding ships instead). To address the issue, we employ a curve fitting model to correct ship trajectory outliers, considering that ship movement over a short period is approximately constant. More specifically, a significant ship displacement is considered a trajectory outlier (see Eq. (14)), and the curve fitting model and a down-sampling mechanism are employed to suppress such outliers. The curve fitting models used for correcting x- and y-axis outliers are shown in Eqs. (15) and (16), respectively. Moreover, we note that ship trajectory positions (i.e., x and y coordinates) in neighboring frames can show an abnormal back-and-forth variation tendency (e.g., the real-world ship moves forward while the imaging trajectory in neighboring frames indicates backward movement). We suppress these outliers by down-sampling the ship trajectory data at a given frame interval (see Eq. (17)).

$$\left\{\begin{array}{c}\left|{x}_{i}-{x}_{i+p}\right|>{T}_{1}\\ \left|{y}_{i}-{y}_{i+p}\right|>{T}_{2}\end{array}\right.$$
(14)
$${\mathrm{CF}}_{\mathrm{x}}=\sum\nolimits_{\mathrm{e}=0}^{\mathrm{E}}{\mathrm{g}}_{\mathrm{e}}{\mathrm{D}}_{x}^{e}$$
(15)
$${\mathrm{CF}}_{\mathrm{y}}=\sum\nolimits_{\mathrm{e}=0}^{\mathrm{E}}{\mathrm{v}}_{\mathrm{e}}{\mathrm{D}}_{y}^{e}$$
(16)
$$\begin{array}{cc}CF=\left[{\mathrm{CF}}_{x}^{i}, {\mathrm{CF}}_{y}^{i}\right], & s.t. i\%p=0 \end{array}$$
(17)

where the ship position on the x-axis (y-axis) in the ith frame is \({\mathrm{x}}_{\mathrm{i}}\) (\({\mathrm{y}}_{\mathrm{i}}\)). The ship positions in the \((\mathrm{i}+\mathrm{p})\)th frame on the x- and y-axis are denoted as \({\mathrm{x}}_{\mathrm{i}+\mathrm{p}}\) and \({\mathrm{y}}_{\mathrm{i}+\mathrm{p}}\), respectively. The parameters \({\mathrm{T}}_{1}\) and \({\mathrm{T}}_{2}\) are the thresholds for identifying ship position outliers on the x- and y-axis, respectively. The parameter \(\mathrm{e}\) is the curve-fitting order, and \(\mathrm{E}\) is the maximum order. \({\mathrm{g}}_{\mathrm{e}}\) and \({\mathrm{v}}_{\mathrm{e}}\) are the curve-fitting coefficients for the x- and y-axis, respectively. The symbol \({\mathrm{D}}_{\mathrm{x}}^{\mathrm{e}}\) (\({\mathrm{D}}_{\mathrm{y}}^{\mathrm{e}}\)) is the eth power of the ship x-axis (y-axis) position for the Dth frame, and \({\mathrm{C}\mathrm{F}}_{\mathrm{x}}\) (\({\mathrm{C}\mathrm{F}}_{\mathrm{y}}\)) is the outlier-removed counterpart obtained with the curve-fitting model. \(\mathrm{C}\mathrm{F}\) is the final trajectory, which comprises the ship x-axis (i.e., \({\mathrm{C}\mathrm{F}}_{\mathrm{x}}^{\mathrm{i}}\)) and y-axis (i.e., \({\mathrm{C}\mathrm{F}}_{\mathrm{y}}^{\mathrm{i}}\)) data. The condition \(\mathrm{i}\mathrm{\%}\mathrm{p}=0\) selects ship position data at a given frame interval p, and we set the default value of p to 5 in our study.
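A compact sketch of Eqs. (14)-(17) is given below; p = 5 follows the text, while the thresholds T1 = T2 = 20 pixels and the fitting order E = 3 are illustrative assumptions rather than values reported here.

```python
import numpy as np

def rectify_trajectory(frames, xs, ys, T1=20.0, T2=20.0, p=5, E=3):
    """Eqs. (14)-(17): flag outliers, fit polynomials, down-sample."""
    t = np.asarray(frames, float)
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    # Eq. (14): mark positions with abnormal displacement over p frames.
    ok = np.ones(len(xs), bool)
    ok[:-p] = (np.abs(xs[:-p] - xs[p:]) <= T1) & (np.abs(ys[:-p] - ys[p:]) <= T2)
    # Eqs. (15)-(16): E-th order least-squares fits on the inlier positions.
    fit_x = np.polynomial.Polynomial.fit(t[ok], xs[ok], E)
    fit_y = np.polynomial.Polynomial.fit(t[ok], ys[ok], E)
    # Eq. (17): keep fitted positions only at frames with i % p == 0.
    keep = t[t.astype(int) % p == 0]
    return np.stack([fit_x(keep), fit_y(keep)], axis=1)
```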

4 Experiment

4.1 Data description and experimental platform

We collected three ship video clips covering three typical maritime traffic situations to evaluate the efficiency of the proposed framework. Video #1 originated from a publicly accessible maritime dataset: https://sites.google.com/site/dilipprasad/home/singapore-maritime-dataset [5, 31]. Videos #2 and #3 were collected by our colleagues when they served as ship crew on the Yuming (an internship training ship). The frame rate of video #1 was 30 frames per second (fps), the image resolution was 1920 × 1080, and the video length was 9 s. The frame rate, resolution, and length of video #2 were the same as those of video #1. The frame rate of video #3 was 24 fps, the image resolution was 640 × 368, and the length was 22 s. Video #1 was captured in mist, with the target ship fully sheltered by neighboring ships for a short period. Video #2 was shot under heavy mist, with the target ship partially sheltered by obstacles in the image sequences. Video #3 involved a ship encountering traffic scenario.

We set the filter kernel size of the fog-removal module to 7, and the brightest-pixel fraction for estimating atmospheric light was fine-tuned to 0.0001. Moreover, the proposed SAMF ship tracking model employed a Gaussian kernel. The padding value and regularization parameter were set to 1.5 and 0.0001, respectively. Besides, the default spatial bandwidth (interpolation factor) was set to 0.1 (0.01) in this study. Detailed information on the collected videos is shown in Table 1. The proposed framework was implemented on Windows 10 with an Intel(R) Core(TM) i7-7500U CPU @ 2.70 GHz and 12 GB RAM. The GPU was an NVIDIA GeForce GTX 940MX with 2 GB of memory. The experiments were run with Python 3.6 and Matlab 2020a.
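For reference, the settings above can be gathered into a single configuration; the key names are our own shorthand, while the values are those reported in this subsection.

```python
CONFIG = {
    "defog": {"patch_size": 7,            # filter kernel size
              "bright_fraction": 1e-4},   # brightest-pixel portion for G
    "tracker": {"kernel": "gaussian",
                "padding": 1.5,
                "regularization": 1e-4,
                "spatial_bandwidth": 0.1,
                "interp_factor": 0.01},
}
```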

Table 1 Details of the three collected maritime videos

4.2 Results

To evaluate the proposed model's performance, we first removed fog from the three maritime videos with the dark channel prior module of the proposed framework. To obtain optimal fog-removal results, we set the convergence factor \({\upbeta }\) and threshold \({\mathrm{t}}_{\mathrm{t}}\) to 0.91 and 0.1, respectively. Note that we do not present the parameter fine-tuning procedure owing to page limitations. Figure 2 provides typical images before and after the fog-removal procedure for video #1, with the regions of interest (ROIs) labeled in each image with red rectangles. The fog in Fig. 2a for frames #1, #191 and #265 was successfully suppressed, as can be clearly observed in the counterparts in Fig. 2b. The ship- and sky-related pixels in Fig. 2a were slightly contaminated by fog (see the top third of each subplot in Fig. 2a), and the corresponding pixels in Fig. 2b show that the fog interference was successfully removed by the proposed fog-removal model. Note that the contours of the two ships involved in the ship occlusion situation (see the embedded red rectangle in Fig. 2) were better identified in Fig. 2b (i.e., the fog-removed images).

Fig. 2 Typical ship images for video #1 before and after fog removal

We present the imaging trajectory extraction and outlier removal results for ship #1 in video #1 to evaluate the proposed framework's performance. Figure 3 shows typical ship tracking results for video #1. Ships were successfully tracked under the ship occlusion challenge (see the tracking results for frames #35, #111 and #260 in Fig. 3). More specifically, the target ship #1 was successfully tracked both before and after the occlusion. The main reason is that the proposed SAMF ship tracking model retains the raw ship template (i.e., the initial ship training sample), which is further employed to track ship #1 in the later sequences of video #1. Moreover, target ships #2 and #3 were successfully tracked without obvious tracking outliers, which indicates that the proposed ship tracking module achieves satisfactory performance.

Fig. 3 Typical ship tracking results for video #1

Figure 4 provides the raw ship tracking positions and the curve-fitted data for ship #1 (in video #1). Anomalous raw ship positions along the x-axis (see Fig. 4a) were clearly found approximately from frame #115 to frame #250. Note that the raw x (and y) position data are the tracking positions obtained by the SAMF module of our proposed framework. The x-axis ship positions obtained by the curve-fitting model (see the red curve in Fig. 4a) show a smooth yet consistent variation tendency compared with the raw x data series. Figure 4b shows both the raw and curve-fitted ship positions along the y-axis. Ship movement along the y-axis is much smoother than along the x-axis, and the difference between the raw and curve-fitted data is smaller for the y-axis than for the x-axis. Moreover, the maximum data variation along the y-axis was less than five pixels, which suggests that ship movement in the y direction was quite trivial.

Fig. 4 Position distributions for both x and y axis of ship #1 in video #1

To further remove trivial ship trajectory data, we provide both the 2D and 3D imaging trajectories of ship #1 in video #1, as shown in Fig. 5. More specifically, Fig. 5a shows the raw ship imaging trajectory (i.e., ship movement along the x- and y-axis) in each maritime frame, which is denser than the down-sampled trajectory (see Fig. 5c). In that manner, the abnormal yet trivial trajectory oscillations can be successfully removed. The 2D trajectory distributions before and after down-sampling (see Fig. 5b and d) confirm the above analysis.

Fig. 5 The 2D and 3D imaging trajectory distribution for ship #1 in video #1

Fig. 6 Trajectory distributions for the three ships of video #1

Ships #1, #2 and #3 showed obvious movement in the ROI of video #1, while the other ships were anchored (i.e., moving at quite slow speeds). Accordingly, we implemented the maritime traffic situation awareness task by exploring the spatial-temporal relationships among the trajectories of ships #1, #2 and #3. We focused on the trajectory relationships along the x-axis, considering that ship movement along the y-axis was quite trivial; for video #1, the traffic situation can thus be obtained by exploiting the ship spatial-temporal relationships along the x-axis. More specifically, ships #1 and #2 were involved in an overtaking situation, and the distance between the two ships showed an increasing tendency (which can be inferred from the vertical distance between the red and blue curves in Fig. 6). The traffic situation between ships #2 and #3 was the same as that between ships #1 and #2 (i.e., overtaking). It can be inferred that ship #3 may overtake ship #2 in the near future, considering that the distance between ships #2 and #3 showed an obviously decreasing tendency. To sum up, on-site traffic participants (e.g., ship crews, maritime traffic officials) need to pay more attention to the traffic situation between ships #2 and #3 (i.e., their spatial-temporal relationship).

The proposed framework was applied to video #2 for further performance evaluation and traffic situation analysis. Figure 7 demonstrates the fog-removal results for video #2, which suggest that the target ship contours in the ROI became clearer (compare the subplots in Fig. 7a and b). We can thus conclude that the proposed model successfully removes fog interference under both mist and strong fog conditions. The ship tracking positions are shown in Fig. 8, which indicates that each target ship was successfully tracked by the proposed model. More specifically, the framework tackled the partial ship-occlusion challenge and obtained a high-fidelity ship imaging trajectory dataset (see the tracking results for frames #68, #135 and #270 in Fig. 8). To further explore the maritime traffic situation in video #2, we provide the imaging trajectories of the two ships (in terms of x-axis distributions, considering that movement along the y-axis was negligible) in Fig. 9. The two ships experienced the same traffic situation as in video #1 (i.e., overtaking), and the displacement between them became smaller in the later image sequences. In other words, ship #2 was overtaking ship #1 in video #2.

Fig. 7 Typical ship images for video #2 before and after fog removal

Fig. 8 Typical ship tracking results for video #2

Fig. 9 Trajectory distributions for the two ships of video #2

We also verified model performance on another typical traffic scenario (i.e., a ship encountering situation). The fog removal results for video #3 are shown in Fig. 10, which demonstrates that ship contours in the defogged images are more distinct than those in the fog-polluted maritime images. Moreover, the ship tracking results in Fig. 11 demonstrate that the proposed framework can accurately track ships, and the consecutive ship trajectories are shown in Fig. 12. The obtained trajectories indicate that the two ships can travel safely in the channel without additional maneuvering operations. It can therefore be safely concluded that the on-board ship crews performed appropriate ship operations.

Fig. 10 Typical ship images for video #3 before and after fog removal

Fig. 11 Typical ship tracking results for video #3

Fig. 12 Trajectory distributions for the two ships of video #3

5 Conclusion

Accurate maritime traffic situation awareness provides useful early-warning information to varied maritime traffic participants (e.g., ship crews, ship companies, ship owners). However, complex maritime traffic environments may significantly challenge MTSA model performance owing to low visibility, ship imaging occlusions, and similar disadvantages. This study aimed to develop a novel framework for maritime traffic situation awareness with the support of maritime surveillance videos. First, the proposed framework removes fog interference via the dark channel prior model. Second, raw ship positions in each maritime image are obtained via the SAMF ship tracking module. Finally, ship imaging trajectory outliers are removed with the help of the curve-fitting and down-sampling models. We applied the proposed framework to extract high-resolution ship imaging trajectories from three typical maritime videos, and further analyzed maritime traffic situation variation tendencies to help traffic participants obtain early-warning information.

Though the proposed framework achieved satisfactory performance on the MTSA task, further work can be explored to enhance model performance. First, we can exploit additional adverse navigation environments to verify model robustness (e.g., rainy days, snowy weather conditions). Second, the maritime videos were shot under relatively stable conditions (i.e., the camera shooting angle was constant); we can evaluate the framework's performance under camera shooting angle vibration constraints. Third, the proposed framework was implemented in an off-line manner, and we will further explore real-time ship trajectory extraction modules along with additional maritime situation challenges (e.g., wave imaging interference, blurred maritime images).