Introduction

The emergence of machine learning and its substitution for several classical statistical models have led to better problem-solving, which in turn has led various fields of study to redirect their research to take advantage of this new method. Transportation systems have been influenced by the growth of machine learning, particularly in intelligent transportation systems (ITS). With the proliferation of data and advancements in computational hardware such as graphical processing units (GPUs), a specific class of machine learning known as deep learning (DL) has gained popularity. The capability of DL models to handle large amounts of data and extract knowledge from complex systems has made them a powerful and viable solution in the domain of ITS. The variety of network architectures in DL has helped researchers formulate their problems in ways that can be solved with one of these neural network techniques. Traffic signal control for better traffic management, improved transportation security via surveillance sensors, traffic rerouting systems, health monitoring of transportation infrastructure, and several other problems now have a strong new approach, and new solutions have been created for several challenging problems in transportation engineering.

There have been several surveys of the literature on the application and enhancement of ITS using DL techniques. However, most of these have tended to focus on a specific aspect of DL or a specific aspect of ITS. For instance, Zhu et al. (2018a) conducted a survey of big data analytics in ITS, and Loce et al. (2013) reviewed the key role computer vision plays in roadway transportation systems. While Nguyen et al. (2018) reviewed DL models across the transportation domain, theirs is not a comprehensive survey encompassing all current research publications on the ITS domain and DL. One dedicated review on enhancing transportation systems via DL was conducted by Wang et al. (2018a); it included substantial research but focused primarily on traffic state prediction and traffic sign recognition tasks. The ITS domain includes other tasks, such as public transportation, ride-sharing, vehicle re-identification, and traffic incident prediction and inference, all of which are represented in this paper to make its extent more comprehensive and holistic. The transportation research community has long taken notice of pivotal research directions; the earliest review of neural networks applied to transportation (Dougherty 1995) critically spanned the classes of problems, the neural networks applied, and the challenges in addressing various problems. This motivates the question we address in this paper: How effective and efficient are current DL research applications for the domain of ITS? To the best of the authors’ knowledge, the literature in this field has suffered from the lack of a holistic survey that takes a broader perspective of ITS as a whole and its enhancement using DL models.

The purpose of this paper is, therefore, to present the systematic review we have conducted on the existing state of research on ITS and its foray into DL. In “Research Approach and Methodology”, we discuss the approach taken to identify relevant literature. In “Background on Techniques in Deep Learning”, we describe different classes of DL networks and breakthrough research on those methods. In “Applications in Transportation”, we discuss different applications of DL methods in transportation engineering, specifically six major application categories in ITS.

In “Discussion and Conclusion”, we investigate the available embedded systems and devices that can facilitate the running of neural network experiments, and we provide a summary and an outlook for future research.

The research methodology followed in this paper is PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) (Moher et al. 2009). Following this method, we first produced a questionnaire, and in each paper we reviewed, we looked for answers to its questions. These questions focus on the gap each paper tries to address, the proposed solution, and the performance of that solution on its datasets.

Research Approach and Methodology

This paper performs a detailed analysis of existing studies on intelligent transportation systems (ITS) and deep learning (DL). Articles were searched in multiple databases using the search strategy described below. The collected articles were then reviewed and organized. The scope of this review was restricted to conference proceedings and journal articles, including existing literature reviews.

Relevant articles were primarily obtained by querying the TRID TRB database (Home—transport research international documentation 2017), where the search terms included “deep learning” and “convolutional”. These terms were sought in the title, abstract, and notes. The references of the identified papers were then examined to trace other trusted journals and papers. Online searches of other databases, such as Scopus, Science Direct, IEEE, and ArXiv, were also conducted. All papers obtained were included in this review if they met the following criteria:

  • Describe solutions to ITS problems using DL, as identified by methodology sections that include DL-based model development

  • Published between January 2015 and October 2019 (during which period the majority of research so far using DL in ITS has been conducted)

  • Not a book, book chapter, dissertation, thesis or technical report

  • Not a general introduction to ITS

  • Not in the domain of autonomous vehicles

Though the DL boom was spawned by the ImageNet project in 2012 (Russakovsky et al. 2015) and applications of DL in ITS first appeared in 2013, substantial growth in ITS research by means of DL methodologies did not start until 2015, as illustrated in Fig. 1. Since then, there has been steady growth in the prominence of DL-based ITS studies across journals and conferences. In 2019, up until October, 43 papers had been published across various ITS applications. In light of the markedly increasing importance of DL as an ITS research method, in the following sections we discuss and review the various DL structures and then their key applications in the ITS domain.

Fig. 1: Year-wise publication growth in ITS domains

Background on Techniques in Deep Learning

Deep Neural Networks (DNN)

Deep learning (DL) is a specific subcategory of machine learning in which several stacked layers of parameters are used for the learning process (Ketkar 2017). These parameters are component representations of the different aspects that can affect the result of the network. Each layer contains several perceptrons (also known as neurons or hidden units), which carry the layer's weights. The input of each layer is multiplied by these parameters, so the output represents the impact of each parameter on the input. Usually, after each layer or every few layers of neurons, a nonlinear function such as tanh, sigmoid, or the rectified linear unit (ReLU) (Glorot et al. 2011) is applied to generate the layer's output. All these layers combine to form a deep neural network (DNN) (Schmidhuber 2015). There are two major challenges in building a DNN: first, designing the structure of the network, which includes the number of layers, the number of neurons in each layer, and the type of nonlinear function; and second, adjusting the weights of the parameters to train the network on how it should perceive the input data and calculate the output. For the first challenge, what usually helps most is trial and error and overall experience. For the second challenge, back-propagation is the most popular method for training the weights in a supervised manner; more details can be found in Schmidhuber (2015). Although all the techniques discussed in the rest of this paper can be classified as subcategories of DNNs, here DNN is defined as the simplest structure of a network, in other words, fully connected layers. In this fully connected model, there is a connection from every neuron in one layer to every neuron in the next layer, and each connection has a weight that should be determined through the back-propagation method.
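To make this concrete, the following is a minimal sketch of such a fully connected network and one back-propagation training step, written in PyTorch; the layer sizes and learning rate are illustrative assumptions, not values prescribed by any of the surveyed work.

```python
import torch
import torch.nn as nn

# A fully connected (dense) network: every neuron in one layer connects
# to every neuron in the next. Layer sizes here are hypothetical.
model = nn.Sequential(
    nn.Linear(10, 64),   # input features -> first hidden layer
    nn.ReLU(),           # nonlinearity (ReLU; Glorot et al. 2011)
    nn.Linear(64, 32),   # second hidden layer
    nn.ReLU(),
    nn.Linear(32, 1),    # output layer
)

# One supervised training step: back-propagation computes the gradient of
# the loss with respect to every connection weight, and the optimizer
# adjusts the weights accordingly.
x, y = torch.randn(8, 10), torch.randn(8, 1)   # dummy batch of inputs/targets
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss = nn.MSELoss()(model(x), y)
optimizer.zero_grad()
loss.backward()    # back-propagate the error through all layers
optimizer.step()   # update the connection weights
```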

Convolutional Neural Networks (CNN)

One of the major applications of neural networks has been computer-aided detection (CAD), which aims to improve classification accuracy and inference time. A revolutionary method called convolutional neural networks (CNN) was proposed in LeCun et al. (1989). Inspired by the cat visual system, whose cells are locally sensitive and orientation-selective (Hubel and Wiesel 1962), LeCun et al. (1989) suggested that instead of using fully connected layers, a single kernel with shared weights can sweep the entire image and extract its local features. The proposed method enhanced detection effectiveness both in terms of accuracy and memory requirements when compared with traditional methods, which required handcrafted feature extraction (LeCun et al. 1998).

A CNN is a detection architecture that automatically learns spatial hierarchies of features using back-propagation through the network. A schematic of this architecture is presented in Fig. 2a. These networks usually contain three types of layers: convolution, pooling, and fully connected, where the first two are used to extract features and the last is used as a classifier (Bengio et al. 2015).

Fig. 2: Figures depicting CNN and RNN schematics

The convolution layer consists of a convolution kernel, which constitutes the linear part of the layer, combined with a nonlinear activation function. The main advantage of using a kernel with shared weights is that it extracts local features and learns the spatial hierarchies of features efficiently while reducing the number of required parameters. The nonlinear activation function then maps the results onto the feature map. To reduce the number of parameters further, a pooling layer usually follows a few convolutional layers to downsample the data, taking the maximum unit (max pooling) or the average (average pooling) of a collection of units and substituting it as a representative of that collection. After the convolution and pooling layers have extracted features and downsampled the data, fully connected layers map them onto the final output. The output of these layers usually has the same size as the number of classes, and each element indicates the probability of the input belonging to that class. Finally, this vector is mapped onto the final result by an activation function, which can be a sigmoid for binary classification, a softmax for multiclass classification, or the identity for continuous values in the case of regression (Yamashita et al. 2018).
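As an illustration of this convolution, pooling, and fully connected pipeline, a minimal image classifier might look as follows; this is a sketch in PyTorch, and the input resolution, channel counts, and number of classes are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Convolution + pooling layers extract features; a fully connected layer classifies."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # shared-weight kernel
            nn.ReLU(),                                   # nonlinear activation
            nn.MaxPool2d(2),                             # downsample by max pooling
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Fully connected classifier; output size equals the number of classes.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):                        # x: (batch, 3, 32, 32)
        feats = self.features(x).flatten(1)      # feature maps -> feature vector
        return self.classifier(feats)            # one score per class

logits = SmallCNN()(torch.randn(4, 3, 32, 32))
probs = logits.softmax(dim=1)                    # per-class probabilities
```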

Because training a deep model requires a large amount of data, the popularity of CNNs and other models only began to rise when a large quantity of labeled data was provided for the ImageNet challenge (Russakovsky et al. 2015). Since then, many architectures using these CNN blocks have been proposed to enhance the efficiency of CAD, including AlexNet, Inception, VGGNet-16/19, and ResNet. To increase detection accuracy further, other concepts have been incorporated into the process. One of these is transfer learning, which uses the knowledge a network gains from pretraining on a large dataset in order to train it on a smaller dataset (Yamashita et al. 2018). Another is training with an equal prior instead of a biased prior when the dataset is biased towards one of the classes (an imbalanced dataset); in this case, different sampling or resampling rates are applied to balance the dataset. The effects of changing the architecture, using transfer learning, and balancing the dataset are investigated across various datasets in Shin et al. (2016).

Recurrent Neural Networks (RNN)

Recurrent neural networks (RNNs), another class of supervised DL models, are typically used to capture dynamic sequences of data. RNNs can store representations of recent inputs and capture the sequence of the data by introducing a feedback connection. This ability acts as a memory, passing information selectively across sequence steps while processing the data at a given time. Thus, each state depends on both the current input and the state of the network at the previous time step; in this respect, a traditional, simple RNN resembles a Markov model (Lipton et al. 2015). In 1982, the first algorithm for recurrent networks was used by Hopfield (1982) for pattern recognition. In 1990, Elman (1990) introduced his architecture, which is known as the most basic RNN. A schematic of this architecture is presented in Fig. 2b. In this architecture, each hidden unit has an associated context unit, which takes the exact state of the corresponding unit at the previous time step as input and re-feeds it, with a learned weight, to the same unit at the next step.

Although training RNNs seems straightforward, vanishing and exploding gradients remain the two main difficulties. These problems can occur while learning from previous states once the chain of dependencies becomes long, making it difficult to choose which information should be learned from past states. To solve the problem of exploding gradients in recurrent networks, which can result in oscillating weights, Williams and Zipser (1989) suggested Truncated Back-Propagation Through Time (TBPTT), which sets a fixed number of time steps as a propagation limit. Here, to prevent the gradient from exploding, only a small portion of the previously analyzed data is used during the training phase. However, this means that in cases with long-range dependencies, the earlier information related to those dependencies ends up lost.

The Long Short-Term Memory (LSTM) architecture was suggested by Hochreiter and Schmidhuber (1997) to solve both of these problems together. Its primary idea is a memory cell with only two gates: an input gate, which decides when to keep information in the cell, and an output gate, which decides when to allow the memory cell to affect other units. In recent years, several corrections and improvements have been made to the LSTM architecture.

As described above, an LSTM contains a memory cell that holds its state over time and, through its gating, controls how this cell affects the network. The most common type of LSTM cell was suggested by Graves and Schmidhuber (2005); several gates and components added to this cell differentiate it from the basic LSTM of Hochreiter and Schmidhuber (1997). A logistic sigmoid function is usually used as the gate activation, while, following the design of Graves and Schmidhuber (2005), a tanh function is usually used as the block input and block output activation. The forget gate and peephole connections were first suggested by Gers and Schmidhuber (2001); these enable the cell to reset itself by forgetting its current state, and to pass the internal state directly to all gates without passing it through an activation function.

Finally, it is notable that Cho et al. (2014) proposed the gated recurrent unit (GRU), inspired by the LSTM block, in which the peephole connections and output activation function are eliminated. They also coupled the input and forget gates into a single gate called the update gate, and the output gate passes only recurrent connections to the block input. This architecture is much simpler than the LSTM and, despite what it eliminates, avoids a significant reduction in performance, which has made it popular.
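To contrast the two recurrent units in code, the sketch below (PyTorch; the input and hidden sizes are arbitrary choices for illustration) runs the same sequence through an LSTM, which maintains a separate memory cell, and a GRU, which does not:

```python
import torch
import torch.nn as nn

seq = torch.randn(5, 1, 8)   # (time steps, batch, input features)

# LSTM: returns the outputs plus both a hidden state and a memory cell state.
lstm = nn.LSTM(input_size=8, hidden_size=16)
out_lstm, (h_n, c_n) = lstm(seq)

# GRU: input and forget gates are merged into an update gate,
# so there is no separate memory cell, only a hidden state.
gru = nn.GRU(input_size=8, hidden_size=16)
out_gru, h_g = gru(seq)

# Both produce one 16-dimensional state per time step; the GRU does so
# with fewer parameters per unit.
```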

Autoencoders (AE)

One of the most important requirements in DL is access to a large amount of data to train the model. Usually, such a dataset is not readily available, and producing a rich dataset is expensive. In this situation, unsupervised methods show their value: instead of training models using labeled data, they extract features from unlabeled data and use these extracted features to train the model. Autoencoders (AEs) are one such method; they aim to reconstruct the input data and in this manner resemble principal component analysis. AEs are composed of two networks concatenated to each other. The first network extracts and encodes the input data into its main features, and the second network uses these features to reconstruct something similar to the input data. A schematic of this architecture is presented in Fig. 3a. Although the concept of AEs had previously been used for denoising (Vincent et al. 2008) and data construction (Tan and Eswaran 2008), it found a new application in variational AEs (Kingma and Welling 2013). To minimize the difference between input and output, Kingma and Welling (2013) used the variational inference method: they introduced a lower bound on the marginal likelihood and maximized it to minimize the error between input and output. Doersch (2016) and Le (2015) explain exactly how a variational AE can be built.
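A minimal version of this encoder–decoder pair is sketched below in PyTorch; the input and bottleneck sizes are assumptions. Note that the reconstruction loss requires no labels, which is what makes the method unsupervised.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())      # compress input to 32 features
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())   # reconstruct from those features

x = torch.rand(16, 784)           # e.g., a batch of flattened unlabeled images
x_hat = decoder(encoder(x))       # encode, then decode

loss = nn.MSELoss()(x_hat, x)     # reconstruction error between output and input
loss.backward()                   # trains both networks without any labels
```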

Fig. 3: Figures depicting AE and GAN schematics

Usually, an AE’s hidden layer is smaller than its input layer, although the opposite can occur as well. The horizontal composition of AEs, defined as combining two or more AEs side by side, can have different motivations, such as different learning algorithms (e.g., RBM, neural network, or Boolean) or different initializations and learning rates. These configurations, along with linear and nonlinear AEs, have been studied by Baldi (2012). It has been shown that a Boolean AE, as a nonlinear type, has the ability to cluster data, and that an AE layer on top can be used as a pretrainer for a supervised regression or classification task.

Deep Reinforcement Learning (DRL)

Reinforcement learning (RL) attempts to train a machine to act as an agent that interacts with the environment and learns to optimize these interactions from the responses it receives (Arulkumaran et al. 2017). In RL, the agent observes the environment, receives a state signal, and chooses an action that impacts the environment to produce a new state. In the next step, a reward from the environment, along with the new state, is fed to the agent to help it decide more intelligently. The goal of the agent in this setup is to gain the maximum reward over the long term by following an optimal policy. RL algorithms are usually based on the Markov Decision Process (MDP) (Silver 2015). The problems that can be solved by RL algorithms are divided into episodic and non-episodic MDPs. In an episodic MDP, the state resets at the end of each episode and the return (the accumulation of rewards over the episode) is calculated. In a non-episodic MDP, there is no end of episode, and a discount factor is vital to prevent the return values from exploding (Arulkumaran et al. 2017).

Two functions are commonly used in RL: the state-value function, also known as the value function, which is the expected return if the agent starts at a given state (with no action restriction), and the action-value function, also known as the quality function (Q-function), which is the expected return of starting at a given state and taking a particular action. Usually, one of two methods is implemented to solve an RL problem. In the first approach, the Q-function is estimated using temporal-difference control methods such as state–action–reward–state–action (SARSA), which iteratively improves the estimate of Q. The second approach is Q-learning, which directly approximates the optimal Q. Both of these methods use bootstrapping and learn from incomplete episodes.
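For reference, the tabular versions of these two updates can be written in a few lines; this is a generic sketch (NumPy, with assumed state/action counts and hyperparameters), the table being what the deep models in the next paragraphs replace with a network:

```python
import numpy as np

n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))   # Q-table: expected return per (state, action)
alpha, gamma = 0.1, 0.99              # learning rate and discount factor

def q_learning_update(s, a, r, s_next):
    """Q-learning: bootstrap from the best action available in the next state."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

def sarsa_update(s, a, r, s_next, a_next):
    """SARSA: bootstrap from the action actually taken in the next state."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
```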

Deep reinforcement learning (DRL) is an approach to solving the RL problem using a DNN. Although the history of DRL began in the 1990s, when Tesauro (1995) developed a neural network that reached an expert level in backgammon, its rebirth can be attributed to Mnih et al. (2015), who introduced Deep Q-Networks (DQNs): DNNs that approximate Q instead of reading its value from a Q-table that lists, for each state, the Q value of taking each action. With this new method, complex and high-dimensional problems can potentially be addressed more easily (Mnih et al. 2015). The model of Mnih et al. (2015) extracted images from Atari games and used a combination of a CNN and a fully connected layer on the data extracted from the images to estimate the Q value.

However, because of its complexity, DRL can be unstable, so much research has focused on solutions able to defeat this instability. Experience replay (Lin 1992) and target networks (Mnih et al. 2015) are the two most widely used techniques for making RL stable. Other techniques, such as Double Q-learning (Hasselt 2010) and dueling DQN (Wang et al. 2015), have also been proposed to make DRL more robust and stable. In Double Q-learning, a second estimator is used to estimate an extra Q′ in order to approximate the Q value more precisely. Dueling DQN (Wang et al. 2015), on the other hand, learns a state-value baseline together with the relative advantage of each action instead of calculating the Q value directly.
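The two main stabilizing techniques can be sketched as follows (PyTorch; the network shape, buffer, and hyperparameters are illustrative assumptions): experience replay decorrelates updates by sampling random past transitions, and the target network is a periodically refreshed frozen copy that supplies stable bootstrap targets.

```python
import copy
import random
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))  # state -> Q per action
target_net = copy.deepcopy(q_net)      # frozen copy used only for bootstrap targets
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = []                            # experience replay: (s, a, r, s_next, done) tensors
gamma = 0.99

def train_step(batch_size=32):
    # Sample a random, decorrelated minibatch of past transitions.
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)   # Q(s, a)
    with torch.no_grad():   # targets come from the frozen network
        target = r + gamma * target_net(s2).max(1).values * (1 - done)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# Every few hundred steps, refresh the frozen copy:
# target_net.load_state_dict(q_net.state_dict())
```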

Generative Adversarial Networks (GAN)

Generative adversarial networks (GANs) are a specific class of deep learning networks that learn to extract the statistical distribution of training data in order to synthesize new data similar to real-world data. These synthetic data can be used for several applications, such as producing high-resolution images (Ledig et al. 2017), denoising low-quality images, and image-to-image translation (Isola et al. 2017). Most generative models use the maximum likelihood principle to build a model that estimates the probability distribution of the training data and synthesizes a dataset that maximizes the likelihood of the training data (Dougherty 1995). Although computing the maximum likelihood directly can yield the best model, these calculations are sometimes so difficult that it is more practical to estimate this quantity implicitly. In the case of explicit density calculation, three main types of models are popular:

  • Fully visible belief networks

  • Variational AEs

  • Markov chain approximations

All of these models, however, suffer from the problems of low speed, low quality, and early stoppage (Goodfellow 2016). To overcome these problems, Goodfellow et al. (2014a) suggested a method that does not require an explicit definition of the density function. This model can generate samples in parallel, no Markov chain is needed to train it, and no variational bound is needed to make it asymptotically consistent.

This method has two models: the generative model, which is responsible for passing random noise through a multilayer network to synthesize samples, and the discriminative model, which is responsible for passing real and artificial data through a multilayer network to detect whether the input is fake or real. A schematic of this architecture is presented in Fig. 3b. Both models use back-propagation and dropout algorithms: the generative model to create more realistic data, and the discriminative model to distinguish better between real and fake data.
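A bare-bones version of this adversarial loop is sketched below (PyTorch; the noise dimension, network sizes, and the stand-in "real" distribution are assumptions): the discriminator is pushed to score real data as 1 and generated data as 0, while the generator is pushed to make the discriminator output 1 on its samples.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))               # noise -> sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())  # sample -> P(real)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(32, 2) * 0.5 + 3.0      # stand-in for real training data
    fake = G(torch.randn(32, 16))              # generator maps noise to samples

    # Discriminator step: real -> 1, fake -> 0 (fake detached so G is not updated).
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: fool the discriminator into labeling fakes as real.
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```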

When GANs were first proposed, fully connected networks were used in both the generative and discriminative models. Later, in 2015, Radford et al. (2015) suggested a new architecture named the deep convolutional GAN (DCGAN), which uses batch normalization in all layers of both models except the last layer of the generator and the first layer of the discriminator; no pooling or unpooling layers are used in this architecture. A DCGAN allows the model to understand operations in the latent space meaningfully and to respond to these operations by acting on the semantic attributes of the input (Goodfellow 2016).

Another improvement on the GAN architecture has been the conditional GAN (Mirza and Osindero 2014), in which both networks are class-conditional: the generator tries to generate image samples for a specific class, and the discriminator is trained to distinguish real data from fake data conditional on that particular class. The advantage of this architecture is better performance in multimodal data generation (Creswell et al. 2018).

In the next section, we discuss and review the applications of deep learning models to transportation.

Applications in Transportation

Performance Evaluation

Before reviewing papers that have used DL methods to investigate ITS applications, it is necessary to make clear the model evaluation criteria used. The classification metrics are accuracy (AC), precision (PR), recall (RL), top-1 accuracy, and top-5 accuracy; the detection metrics are mean average precision (mAP) and intersection over union (IoU); and the regression metrics are mean absolute error (MAE), mean absolute percentage error (MAPE), root mean squared error (RMSE), and mean squared relative error (MSRE):

$$\mathrm{AC}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{TN}+\mathrm{FN}}$$
(1)
$$\mathrm{PR}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(2)
$$\mathrm{RL}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(3)

where TP = true positive, TN = true negative, FP = false positive, FN = false negative.

Top-1 accuracy means the model’s highest-probability answer must match the expected answer, while top-5 accuracy means at least one of the model’s five highest-probability answers must match it.

mAP is the mean of the average precision (AP) scores over all queries, where AP is the area under the PR vs. RL curve.

IoU is the ratio between the area of overlap and the area of union between the predicted and the ground-truth bounding boxes. The regression metrics are computed as:

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-\bar{y}_{i}\right|$$
(4)
$$\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{{y}_{i}-\bar{y}_{i}}{{y}_{i}}\right|$$
(5)
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-\bar{y}_{i}\right)}^{2}}$$
(6)
$$\mathrm{MSRE}=\frac{1}{n}\sum_{i=1}^{n}{\left(\frac{{y}_{i}-\bar{y}_{i}}{{y}_{i}}\right)}^{2}$$
(7)

where $y_i$ is the actual value of observed travel time, $\bar{y}_i$ is the predicted value of travel time, and $n$ is the number of observations.
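For reference, these metrics translate directly into code; the following is a minimal NumPy sketch of Eqs. 1–7 and the IoU definition above:

```python
import numpy as np

def classification_metrics(tp, tn, fp, fn):
    ac = (tp + tn) / (tp + fp + tn + fn)   # Eq. 1: accuracy
    pr = tp / (tp + fp)                    # Eq. 2: precision
    rl = tp / (tp + fn)                    # Eq. 3: recall
    return ac, pr, rl

def regression_metrics(y, y_pred):
    err = y - y_pred
    mae = np.mean(np.abs(err))             # Eq. 4
    mape = np.mean(np.abs(err / y))        # Eq. 5
    rmse = np.sqrt(np.mean(err ** 2))      # Eq. 6
    msre = np.mean((err / y) ** 2)         # Eq. 7
    return mae, mape, rmse, msre

def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2): area of overlap divided by area of union."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```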

We now discuss different applications of deep learning in ITS. The included topics have been selected based on the functional areas in ITS as mentioned in Sussman (2008) and have been studied substantially over the period of 2012–2019.

Traffic Characteristics Prediction

One of the most widely considered applications of DL in transportation is traffic characteristics prediction. Traffic characteristics information can help drivers choose their routes more wisely and traffic management agencies manage traffic more efficiently. The main characteristics of interest are traffic flow, traffic speed, and travel time. Since these characteristics are not mutually exclusive, methods used to predict one of them can also be used to predict the others; for this reason, the methods used to make these predictions are discussed together below.

Based on the duration of the prediction for each traffic characteristic, a forecast is usually classified as short-term (S) for predictions of less than 30 min, medium-term (M) for a prediction window between 30 and 60 min, and long-term (L) for more than 60 min (Yu et al. 2017a). Since driving behavior and traffic characteristics vary across locations, results from one dataset are difficult to transfer to other datasets (Wang et al. 2018a). Traffic feature prediction has traditionally relied on parametric and statistical methods, such as autoregressive integrated moving average (ARIMA) models, but these methods have mostly been incapable of predicting irregular traffic flows (Wang et al. 2018a). With the emergence of machine learning and, subsequently, DL methods, nonparametric methods are now being used in traffic characteristics prediction to achieve higher accuracy.

Some of the first attempts to predict traffic characteristics used deep belief networks (DBNs) as unsupervised feature learners. Chen et al. (2017a), Huang et al. (2014), and Khajeh Hosseini and Talebpour (2019) implemented DBNs for traffic flow prediction, while Siripanpornchana et al. (2016) and Hou and Edara (2018) used the same concept for predicting travel time and traffic speed. Along with traffic data, weather data have been fed into DBNs using data fusion techniques to predict traffic flow more accurately (Koesdwiady et al. 2016).

However, because the traffic features mentioned above depend on past traffic conditions, several studies have used RNNs to discover these correlations and predict traffic characteristics. For instance, Zhang and Kabuka (2018) used a gated RNN unit to predict traffic flow with respect to weather conditions, whereas Jia et al. (2016) used an LSTM to address the same challenge. Liu et al. (2017) and Tian and Pan (2015) used LSTMs to predict travel time as well as traffic flow while also taking weather conditions into account. Finally, Ma et al. (2015) implemented a combination of a deep RBM and an RNN to predict congestion on transportation network links.

Polson and Sokolov (2017) tried to increase the accuracy of traffic flow prediction, especially for nonrecurrent traffic congestion such as special events or harsh weather, by paying more attention to the spatiotemporal features of traffic. This approach is grounded in the assumption that predicting any traffic characteristic requires both historical data for that particular location and current traffic data from the neighboring areas. To accomplish this, Wang et al. (2016a) combined an RNN with a CNN to attend to both the temporal and spatial aspects of traffic, and Fouladgar et al. (2017), Du et al. (2017), and Goudarzi et al. (2018) combined LSTM and CNN to capture both temporal and local dependencies when predicting different traffic characteristics. Yao et al. (2018a) considered two further challenges. The first is the dynamic dependency of traffic on temporal features: at different hours of the day, this dependency may differ from one direction of traffic flow to another. The second is the possibility of shifting time periods in relation to traffic density; in other words, a periodic temporal dependency may shift from one time to another (e.g., on different days of the week). As a result, Yao et al. (2018a) designed a network consisting of a flow-gated local CNN to capture the dynamics of the spatial dependencies and an LSTM with a periodically shifted attention mechanism to handle the periodic dependencies. Another approach to accounting for both types of dependencies was taken by Ma et al. (2017), who converted their data matrices into images representing the two dimensions of time and space; this let them use a CNN model to extract image features and predict network-wide traffic speed. Yu et al. (2019) later improved this approach by adding a temporal gated convolution layer to extract temporal features.
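The CNN-plus-LSTM pattern recurring in these studies can be sketched generically as follows (PyTorch). This is not the architecture of any specific paper above: the grid size, history length, and layer widths are illustrative assumptions. A CNN summarizes the spatial traffic map at each time step, and an LSTM models the sequence of those summaries.

```python
import torch
import torch.nn as nn

class SpatioTemporalNet(nn.Module):
    """CNN per time step for spatial features; LSTM across steps for temporal ones."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(             # encodes one traffic "image"
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.lstm = nn.LSTM(input_size=8 * 4 * 4, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, 1)          # e.g., predicted flow/speed for the next interval

    def forward(self, x):                     # x: (batch, time, 1, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1))     # apply the CNN to every time step
        feats = feats.flatten(1).view(b, t, -1)
        out, _ = self.lstm(feats)             # temporal dependencies across the steps
        return self.head(out[:, -1])          # predict from the last time step

pred = SpatioTemporalNet()(torch.randn(2, 12, 1, 20, 20))  # 12 past 20x20 traffic maps
```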

To extract both spatial and temporal features, Cui et al. (2018a) used a deep model called the stacked bidirectional and unidirectional LSTM (SBU-LSTM), in which the bidirectional LSTM considers both the backward and forward dependencies in time-series data. Since traffic conditions are periodic, analyzing both backward and forward features can increase the accuracy.

Another model able to consider the spatiotemporal properties of traffic has been the AE, first proposed for this purpose by Lv et al. (2014) and improved by Duan et al. (2016) using a denoising stacked AE (dSAE) and by Yu et al. (2017a), who combined LSTM and AE to predict traffic conditions at peak hours and in post-accident situations. To predict post-accident situations, they extracted a latent representation of the static features common to all accidents from stacks of AEs and combined it, using a linear regression (LR) layer, with the temporal correlation of traffic flow obtained from stacks of LSTMs.

Table 1 summarizes all these papers, with the columns from left to right describing, for each study, the traffic characteristics investigated, the DL model, the dataset, the experiment results (best results achieved), the baseline model, the baseline model’s best results, the prediction window length, a hyperlink to the given paper, and its year of publication.

Table 1 Overview of papers using deep learning techniques for traffic characteristic prediction

To the best of the authors’ knowledge, all studies matching the meta-analysis criteria described in “Research Approach and Methodology” that relate to travel time, traffic speed, traffic flow, traffic conditions, and traffic density have been tabulated here. For traffic conditions, the goal is to predict whether the road is congested. Results obtained on multiple datasets are also represented in Table 1. For uniformity, the best results reported are those achieved when the window length is ‘S’ (short-term). This table structure is followed across all tables in this paper.

Traffic Incident Inference

The goals of predicting traffic incident risk for a given location, and of detecting incidents based on traffic features, are to help traffic management agencies reduce incident risk in hazardous areas and traffic jams at incident locations. Although some parameters, such as driver behavior, are not very predictable, several key features can help predict traffic incidents.

Human mobility (Chen et al. 2016), traffic flow, geographical position, weather, time period, and day of the week [97] are some of the features that can be investigated as indicators of a traffic incident. However, a single model cannot generally be used in different places, because accident factors in metropolitan areas, where population and vehicles are generally dense, are completely different from accident factors in a small town with a scattered population (Yuan et al. 2017). The prediction and detection of incidents themselves is more challenging than the prediction of incident risk, since the data for the former are usually highly imbalanced (traffic incidents happen rarely compared to the amount of data for cases with no incident). To overcome this issue, Yuan et al. (2017) changed only one feature of the data at each step (hour, day, or location) and then checked whether the resulting data point was negative; negative cases were added to the pool of data to be considered.

Different approaches have been used to measure traffic incident risk based on surveillance camera data. For example, Chen et al. (2016) used a stacked denoising AE (SDAE) to learn the hierarchical features of human mobility and their correlation with traffic incidents. In contrast, Ren et al. (2017) and Bao et al. (2019) implemented LSTM models to evaluate risk, with Ren et al. (2017) achieving better performance by learning from more features.

To predict traffic incidents in a macroscopic manner, Yuan et al. (2017) and Pan et al. (2017) implemented DNN models, with Yuan et al. (2017) considering the curvature of the road, the number of intersections, and the density of the area in order to overcome the spatial heterogeneity problem. With the same concern, Dong et al. (2018) used an AE that considers both continuous and categorical variables, and Yuan et al. (2018) used a Conv-LSTM that breaks regions into smaller subregions to overcome spatial heterogeneity.

If, following Yuan et al. (2018), we consider the macroscopic prediction of traffic incidents as predicting the probability of an accident between any pair of vehicles in a wider region, rather than focusing on any single vehicle, then microscopic incident prediction studies can also be introduced: using data about the location, speed, and direction of each vehicle in the surrounding area, they predict the probability of an incident between a particular pair of vehicles in the near future. In this regard, Chen et al. (2018b) and Theofilatos et al. (2019) trained DNNs to predict likely collisions. Theofilatos et al. (2019) used a simple NN with four layers which, though it does not compare well with the baseline machine learning (ML) techniques, is still preferred, as the ML techniques have poor sensitivity.

Suzuki et al. (2018) have annotated their large dataset of near-miss traffic accidents to train a quasi-RNN model. The innovation of their work was introducing an adaptive loss function for early anticipation (AdaLEA), which gives their model the ability to predict a collision 3.65 s before it happens.

Another challenge in traffic incident inference is detecting an accident by processing only raw data. To address this, Hatri and Boumhidi (2018) and Singh and Mohan (2018) used stacked AEs (SAEs) to extract the features of traffic patterns in the context of an accident; Hatri and Boumhidi (2018) also used a fuzzy DNN to control the learning of traffic-incident-related parameters. Zhang et al. (2018a) trained a DBN model on a dataset that includes tweets related to traffic accidents, showing that non-traffic features can be used alongside traffic features to validate traffic incident detection.

Incident severity prediction based on recorded incident features has been studied in Wang et al. (2016a), Sameen and Pradhan (2017), and Alkheder et al. (2017). The artificial neural network (ANN) trained in Alkheder et al. (2017) showed an improvement over baseline performance compared with the LSTM model with fully connected layers in Sameen and Pradhan (2017).

Table 2 summarizes all these papers, showing each model, the dataset on which it was trained, its evaluation on the testing dataset, and a comparison of its performance to that of the baseline model. The first section of this table lists studies on the parameters effective in predicting increased incident risk and the manner in which incident risk is affected. In the next section, macroscopic studies on incident prediction are categorized as “traffic incident prediction,” whereas microscopic studies are categorized as “collision prediction.” In the incident detection section, all studies focused on detecting incidents by analyzing raw traffic data have been gathered, and, finally, the last section lists investigations predicting the severity of the incident.

Table 2 Overview of papers using deep learning techniques for traffic incident inference

Vehicle Identification

Applications of vehicle re-identification (Re-ID) vary from calculating travel time to automatic ticketing. Since license plates are unique to each vehicle, the first task in Re-ID is recognizing them.

Zang et al. (2015) and Abedin et al. (2017) implemented DL models to recognize license plates using a visual attention model that first generates a feature map from a combination of the colors most commonly used in license plates, extracts data from the plates using a CNN model, and ultimately runs an SVM on the extracted data. However, bad lighting, blurriness due to vehicle movement, low camera quality, and even traffic occlusion, where the plate is hidden behind other cars, can make reading license plate characters impossible. To overcome this, Liu et al. (2016) proposed a CNN layer to extract conspicuous features such as the color and model of the vehicle and used a Siamese neural network to distinguish similar plates (this network had previously been used in signature verification tasks). Note that for some feature extraction tasks, such as vehicle color recognition, solutions like that of Hu et al. (2015), which combined a CNN for feature extraction with an SVM for categorization, are also available. Tang et al. (2018) similarly used a histogram-based adaptive appearance model, as Zheng et al. (2017) did for target re-identification, detecting and saving other features of each car besides the scheme of the license plate to perform Re-ID. Yu et al. (2017b) used Faster RCNN to detect vehicles in images. In addition, Arabi et al. (2020) employed a modified version of the Single Shot Detection (SSD) method, with MobileNet as the feature extraction network, to localize and classify different types of construction equipment. Wu et al. (2018b) worked on the same idea but trained their model more on spatiotemporal data, pruning their results with the facts that (1) a vehicle cannot be in two places at one time and (2) a vehicle that has already passed a section is unlikely to pass it again. However, their model could not compete with the model of Tang et al. (2018), which proposed Markov chain random fields to prepare several queries based on a visual spatiotemporal path and then used a combined Siamese-CNN and path-LSTM model.

Table 3 summarizes all these papers, showing each model, the dataset on which it was trained, its performance on that dataset, and a comparison to the baseline model.

Table 3 Overview of papers using deep learning techniques for vehicle id tasks

Traffic Signal Timing

One of the main tasks of ITS management based on multiple types of data is controlling traffic via traffic signal lights. For many years, optimizing signal light timing for the best performance has been one of the great challenges in the transportation field. Studies in this area have endowed traffic agencies with analytical models that use mathematical methods to address this optimization problem. With emerging DL studies, however, modeling the dynamics of traffic to achieve the best performance has taken a new path, because the nature of RL makes it well suited to finding the best traffic signal timing.

Li et al. (2016) used DRL to tackle traffic light timing. In DRL, a DL model is usually used to implement the Q-function in a complex system so as to capture the dynamics of traffic flow. A dSAE network takes the state as input and gives the Q-function for each possible action as the output of the network. Li et al. (2016) showed a 14% reduction in cumulative delay when using an SAE to predict the Q-function instead of conventional prediction.

Gao et al. (2017) suggested an alternative idea for choosing RL states. They argue that instead of taking raw data as the state, it can be more effective to have a CNN extract important features from the raw data (e.g., the positions of the cars and their speeds) and feed them to a DRL network with a fully connected head that predicts the Q-value for each of the four states of green, yellow, red, and protected left-turn light, with cumulative staying time as the reward. They also used the experience replay and target network techniques to stabilize the algorithm and converge it to the optimal policy, as suggested in Tan and Eswaran (2008).
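As an illustration of this formulation (a sketch only, not Gao et al.'s exact network; the grid size and layer widths are assumptions), the state can be encoded as position and speed matrices of the intersection, the four signal phases form the action set, and the reward is the reduction in cumulative staying time:

```python
import torch
import torch.nn as nn

N_PHASES = 4   # green, yellow, red, protected left turn

class SignalQNet(nn.Module):
    """CNN over position/speed matrices, fully connected head producing Q-values."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Sequential(
            nn.Linear(16 * 4 * 4, 64), nn.ReLU(), nn.Linear(64, N_PHASES),
        )

    def forward(self, state):   # state: (batch, 2, H, W) position and speed maps
        return self.head(self.cnn(state).flatten(1))   # one Q-value per phase

def reward(prev_cumulative_wait, new_cumulative_wait):
    """Reward sketch: how much the action reduced cumulative staying time."""
    return prev_cumulative_wait - new_cumulative_wait

q_values = SignalQNet()(torch.randn(1, 2, 24, 24))   # pick the phase with max Q
```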

Liang et al. (2018) also used a CNN to map states, along with several state-of-the-art techniques, such as target networks, experience replay, double Q-learning, and dueling networks, to increase the performance of the network and make it stable. Their results showed a large reduction in waiting time (more than 30%) relative to a fixed-time scenario.

Genders and Razavi (2018) investigated the importance of choosing the states used to estimate delay time. The main goal of this study was to determine whether data from conventional sensors, such as occupancy and average speed, are satisfactory, or whether more precise data, such as vehicle density and queue length, are needed, or even data at the highest resolution, obtained by discretizing each incoming lane into cells and considering the presence of a vehicle in each cell separately. The results showed that using high-resolution data is not substantially more effective and that conventional data are good enough for their model. However, one reason that may have contributed to this conclusion is that they used a simple fully connected model that could not extract deep features from the more precise states very well.

Finally, Wei et al. (2018) tested their model on real-world traffic data to see how effective its results could be. They suggest that instead of studying only the reward, we need to consider the different policies that may result in the same reward and then take the most feasible one. The final results of this study showed great performance in reducing queue length, delay time, and trip duration compared with other methods.

Table 4 summarizes all these papers, showing each model, the dataset on which it was trained, its performance on the testing dataset, and a comparison of its performance to that of the baseline model.

Table 4 Overview of papers using deep learning techniques for traffic signal timing

Ride Sharing and Public Transportation

Public transportation systems (including bus and metro systems, taxis, etc.) are one of the main means of moving passengers within cities. To improve city planning performance and passenger satisfaction, the nature of DNNs has endowed companies with increasingly optimal routing maps that take into account data such as passenger demand for a given mode of travel at particular places and times. DL has been adopted to make these predictions even more accurate than existing ML techniques.

Saadi et al. (2017) investigated the performance of several ML techniques alongside a fully connected DL model with only two hidden layers and showed that their very simple DL model outperforms almost all the other techniques except a boosted decision tree. Besides the simple DNN models in Dominguez-Sanchez et al. (2017), Jung and Sohn (2017), Wan et al. (2018), and Zhu et al. (2018b), a hybrid model containing a stacked AE and a DNN was implemented by Liu and Chen (2017) to predict hourly passenger flow.

To capture all the related features, including the spatial, temporal, and exogenous features impacting passenger demand, a fusion convolutional LSTM network (FCL-Net) has been proposed (Ke et al. 2017). This network includes stacked Conv-LSTM layers to analyze spatiotemporal variables, such as historical demand intensity and travel time, and LSTM layers to evaluate nonspatial time-series variables, such as weather, day of the week, and time of day. With the same idea, Zhang et al. (2017) proposed a spatiotemporal ResNet (ST-ResNet) consisting of several convolutional layers. Liao et al. (2018) implemented both of these techniques on a New York City taxi record dataset, and their comparison showed that ST-ResNet achieves better performance with a faster training time. The authors suggest two reasons for this. First, the LSTM captures fine temporal dependencies, which are not as fundamental as the coarse-grained dependencies from the convolutional layers. Second, spatial features may be more important than temporal ones, and since ST-ResNet focuses more on spatial features, it outperforms the FCL-Net. Zheng et al. (2017) and Lin et al. (2018b) work directly on graph structures to leverage structural information, considering the nodes as stations and the edges as dependencies among stations. Finally, Yao et al. (2018b) and Ma et al. (2018) proposed deep multiview spatiotemporal networks to capture all the dependencies separately.

Another research area related to public transportation deals with travel mode selection. Nam et al. (2017) implemented a simple fully connected DNN on Swiss Metro data to reveal demand by mode. Another issue for transportation network companies is scheduling routes for their drivers to pick up passengers so as to minimize passenger waiting time as well as costs for the driver and the company. Shi et al. (2018) suggested a DRL model aiming to give drivers the best route; it considers factors such as the current location of vehicles, time of day, and competition between drivers, resulting in significantly shorter search times and more long-term revenue for drivers.

Table 5 summarizes all these papers, showing each model, the dataset on which it was trained, its evaluation on the testing dataset, and a comparison of its performance to that of the baseline model. (In this table, “travel mode” refers to studies that tried to predict the mode of transportation that passengers would choose at each time point, and “passenger flow” is defined as the number of passengers flowing into or out of a given location at a certain time point.)

Table 5 Overview of papers using deep learning techniques for ride sharing and public transportation

Visual Recognition Tasks

One of the most significant applications of DL is in nonintrusive recognition and detection systems, such as camera-image-based systems. These applications range from providing suitable roadway infrastructure for driving vehicles to endowing autonomous vehicles with a safe and reliable driving strategy.

One of the first visual recognition challenges tackled was obstacle detection by exploiting vehicle sensors, for which a variety of networks with unique architectures have been implemented. Kim and Ghosh (2016) merged data from an RGB camera and LIDAR sensors to increase obstacle detection performance under different illumination conditions. Dairi et al. (2018a, b), on the other hand, treated obstacle detection as an anomaly detection problem. They used a hybrid encoder model, extracting features with a Deep Boltzmann Machine (DBM) and then using an autoencoder to reduce the dimensionality and obtain vertical disparity (V-disparity) map data from the images. The key property of V-disparity data is that they are mostly stable, with only small variations from noise, and change drastically only when an obstacle appears in an image.

Wang et al. (2016b) and Cai et al. (2016) used data from far-infrared sensors to improve vehicle detection at night. While the former used only far-infrared data, the latter used both camera and far-infrared data in order to decrease the false-positive percentage. Wang et al. (2016c) addressed requirements related to vehicle following, which include detecting brake lights. They used the Histogram of Oriented Gradients (HOG) approach implemented with LIDAR and camera data, along with the vanishing point technique to decrease the false-positive rate and speed up the process. They then used AlexNet to detect whether the rear middle brake light was on or off.

Another important task in navigating safely is traffic sign detection; these signs obligate, prohibit, or alert drivers. One of the most common DL models for detecting traffic signs is the CNN. Qian et al. (2015), Yang et al. (2015), Lin et al. (2016, 2019), Lim et al. (2017), Zeng et al. (2016), Hu et al. (2017b), Yuan et al. (2016), Arcos-Garcia et al. (2018), Natarajan et al. (2018), Lee and Kim (2018), Li et al. (2018b), and You et al. (2018) have all used a CNN as their main feature extractor, each tuning their model to get the best results. Qian et al. (2015) used an RCNN to derive regions of interest from RGB images. Lim et al. (2017) focused on low-illumination images: they used a classifier to detect regions of interest and an SVM to verify whether any traffic signs were present inside each region, after which a CNN model using the Byte-MCT technique classified the traffic sign. Experiments have shown that this method is robust in deficient lighting, outperforming other methods in cases of low illumination.

Zeng et al. (2016) suggested that the RGB space cannot provide as much useful data as the perceptual Lab color space. Therefore, after changing the color space, they extracted deep perceptual features using a CNN and fed these features to a kernel-based ELM classifier to identify the traffic sign. This classifier used a radial basis function to map the features into a higher-dimensional space in order to make them separable and obtain the best outcome.

Arcos-Garcia et al. (2018) tried different optimization methods on a CNN model containing several convolutional layers and spatial transformer networks (STNs), which make the CNN spatially invariant, removing the need for supervised training, data augmentation, or even normalization. In contrast, Li and Yang (2016), instead of using a CNN, used a DBM boosted with canonical correlation analysis for feature extraction and then an SVM for classification. They also used certain conventional image-processing techniques, such as image drizzling and gray-scale normalization, to reduce noise.

Weber et al. (2016), Behrendt et al. (2017), and Kim et al. (2018a) have focused more on traffic light detection and classification. This task plays a very significant role in managing traffic, and correct detection is highly correlated with reduced risk. Weber et al. (2016) proposed their deep traffic light recognition (DeepTLR) model, which first classifies each fine-grained pixel of the input data, calculating a probability for each class; then, for the regions with a higher probability of containing a traffic light, a CNN is used to classify the status of the light. (In this model, temporal data were not used, and each frame was analyzed separately.) Behrendt et al. (2017), by contrast, used traffic speed information as well as stereovision data to track detected traffic lights. Lin et al. (2016) used a combination of region-of-interest (ROI) extraction, CNN feature extraction, and an SVM classifier to detect arrow signs on the roadway and classify their direction. Gurghian et al. (2016) used a CNN to detect lane position on the road.

Finally, the monitoring of civil infrastructure has always been a focus for engineers and researchers. Various monitoring techniques have been used for infrastructure performance evaluation, ranging from conventional short-term (Arabi et al. 2018) and long-term (Arabi et al. 2019, 2017; Constantinescu et al. 2018) sensor-based monitoring to nondestructive and noncontact techniques (Moll et al. 2018). Among the applications of nondestructive damage detection, pavement crack detection in particular has received attention, due to its importance in civil infrastructure management. For instance, Hosseini et al. (2020) and Hosseini and Smadi (2020) developed pavement prediction models that can help agencies devise more accurate maintenance and rehabilitation activities. Zhang et al. (2018c) proposed a unified pavement crack detection approach that can distinguish between cracks, sealed cracks, and background regions; through this approach, they were able to effectively separate different cracks having similar intensity and width. Moreover, Bang et al. (2019) proposed pixel-level pavement crack detection in black-box images using an encoder-decoder network and found that ResNet-152 with transfer learning outperformed other networks. Additionally, CrackNet, which performs pixel-level pavement crack detection on laser-based 3D asphalt images, was introduced by Zhang et al. (2018d). In a separate study, Zhang et al. (2018d) extended their previous work to CrackNet-R, which utilizes an RNN with a gated recurrent multilayer perceptron (GRMLP) to update the memory of the network, showing that their model outperforms models based on LSTM and GRU. Also, Nhat-Duc et al. (2018) investigated pavement crack detection performance using metaheuristic-optimized Canny and Sobel edge detection algorithms, comparing these algorithms with their proposed CNN and confirming the superior performance of DL over conventional edge detection models.

Table 6 summarizes all these papers, showing each model, the dataset on which it was trained, its evaluation on the testing dataset, and a comparison of its performance to that of the baseline model.

Table 6 Overview of papers using deep learning techniques for visual recognition tasks

Discussion and Conclusion

Hardware

Generally, there are two types of intelligent decision-making: cloud-computing-based and edge-computing-based. Whereas computing services are delivered over the internet in the cloud-computing approach, they are performed at the edge of the network in the edge-computing approach. Edge computing offers several advantages, such as efficient and fast intelligent decision-making and reduced data transfer costs, and emerging technologies such as DL have significantly increased the importance of edge computing devices. Though a detailed discussion of edge computing devices is beyond the scope of this paper, we briefly overview and compare the devices popularly used for DL. Figure 4 illustrates the edge computing platforms discussed in this section, and Table 7 summarizes the technical specifications of the covered hardware.

Fig. 4 Hardware (left to right): NVIDIA Jetson Xavier (Jetson AGX Xavier Developer Kit 2020), NVIDIA Jetson TX2 (Jetson TX2 - Elinux.Org 2020), NVIDIA Jetson Nano (Jetson Nano Developer Kit 2020), Raspberry Pi (Raspberry 2020), Intel NCS 2 (Intel® Neural Compute Stick 2 Product Specifications 2020)

Table 7 Detailed specifications of the popular edge-computing devices used for DL

The Jetson Xavier is the high-end system-on-a-chip (SoC) computing unit in the Jetson family and exploits the Volta GPU. An integrated GPU with Tensor Cores and dual Deep Learning Accelerators (DLAs) makes this module well suited to deploying computationally intensive DL-based solutions. The NVIDIA Jetson Xavier delivers 32 TeraOPS of computing performance with a configurable power consumption of 10, 15, or 30 W.

Another widely used embedded SoC is the NVIDIA Jetson TX2, which takes advantage of the NVIDIA Pascal GPU. Although it delivers less computing performance than the NVIDIA Xavier, it can be a reliable edge computing device for certain applications, providing more than 1 TFLOPS of FP16 computing performance at under 7.5 W of power consumption. The Jetson Nano, which utilizes the Maxwell GPU, is the newest product in the Jetson family introduced by NVIDIA. It is suitable for deploying computer vision and other DL models and delivers 472 GFLOPS of FP16 computing performance with 5–10 W of power consumption.

Another family of edge computing devices is the Raspberry Pi family, which offers affordable SoCs capable of high performance on basic computing tasks. The Raspberry Pi 3 Model B+ is the latest version of the Raspberry Pi; it uses a 1.4-GHz 64-bit quad-core processor and can be paired with deep learning accelerators to achieve high performance on computationally expensive tasks.

Finally, the Intel Neural Compute Stick 2 (NCS 2) is a USB-sized fanless unit that utilizes the Myriad X Vision Processing Unit (VPU), which is capable of accelerating computationally intensive inference at the edge. Very low power consumption, along with support for popular DL frameworks such as TensorFlow and Caffe, has made the NCS 2 ideal for use with resource-restricted platforms such as the Raspberry Pi 3 B+. There have been limited studies investigating the inference speed of this hardware, though Arabi et al. (2020) compared the inference speed of an SSD-MobileNet model on the abovementioned embedded devices using a construction vehicle dataset, achieving 47 FPS on the Jetson TX2 and 8 FPS on a Raspberry Pi and NCS combination.
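
A minimal sketch of how such a throughput benchmark can be set up is shown below, using torchvision's SSDlite-MobileNet (randomly initialized, since only speed is measured) as a stand-in for the SSD-MobileNet model of Arabi et al. (2020); the figures they report come from their own model, hardware, and dataset.

```python
import time
import torch
import torchvision

# Randomly initialized detector: no pretrained weights are needed
# because only inference speed, not accuracy, is being measured.
model = torchvision.models.detection.ssdlite320_mobilenet_v3_large(weights=None)
model.eval()

frames = [torch.rand(3, 320, 320) for _ in range(100)]  # stand-in camera frames
with torch.no_grad():
    model([frames[0]])                       # warm-up pass
    start = time.perf_counter()
    for frame in frames:
        model([frame])                       # one frame per forward pass
    elapsed = time.perf_counter() - start
print(f"Throughput: {len(frames) / elapsed:.1f} FPS")
```

A warm-up pass is included because the first forward pass typically incurs one-time setup costs that would otherwise skew the measured frame rate.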

Summary

Below, we provide a summary of the studies cited in this paper, classified according to our six ITS application categories in relation to the DL models they use (see Fig. 5). The following are our observations:

  • Traffic characteristics: CNN, RNN, and CNN-RNN hybrid models are used most frequently, undoubtedly because traffic has two main dependencies: spatial and temporal. Because various datasets and performance evaluation metrics have been used, it is hard to compare studies of traffic characteristics, though in traffic flow studies the PeMS dataset has been widely used. The majority of research has used hybrid CNN-RNN models, which can identify both long temporal dependencies and local trend features (a minimal sketch of this hybrid pattern follows Fig. 5). Although most papers have defined their own CNN model rather than using an existing architecture, CNN has generally shown better performance across papers than RNN, along with lower computation/training time.

  • Traffic incidents: the most widely used model is RNN, since the effects of an incident appear at specific times and require a model that can capture temporal dependencies. Autoencoders are also popular, since they can learn normal traffic patterns and then detect and isolate accident conditions from regular conditions (an autoencoder sketch also follows Fig. 5).

  • Vehicle ID: CNN is the most widely used model, given its power in inference from images, as detection and tracking are the main tasks in license plate and vehicle type/color identification. The existing CNN architectures most commonly utilized are AlexNet and VGG models pretrained on ImageNet.

  • Traffic signal timing: RL has been the most commonly used approach, given the control-strategy nature of the traffic signal timing task. Hybrids of CNN and SAE have been used to approximate or learn Q-values to improve DRL performance.

  • Ride-sharing and public transportation: CNN, RNN, and DNN have been the most frequently used models in this domain. Most researchers have built their own DL architectures for tasks in this category, and public transportation demand and traffic flow prediction tasks have generally been addressed with either CNN or hybrid CNN models.

  • Visual recognition tasks: CNN has been the most commonly used DL model for visual recognition tasks, again because detection and tracking are efficient via CNN. Especially in traffic sign recognition tasks, the GTSRB dataset has been one of the most frequently used benchmarks. Existing architectures such as ResNet, AlexNet, VGG, and YOLO have been used extensively, with AlexNet and ResNet being the most popular to build on. This can be attributed to the fact that visual recognition tasks are not limited to ITS, so research done in other domains can be utilized to accomplish ITS-related visual recognition tasks.

Fig. 5 ITS vs DL models: (a) traffic characteristics, (b) traffic incidents, (c) vehicle ID, (d) traffic signal, (e) public transport, (f) visual recognition
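
As referenced in the traffic characteristics bullet above, the CNN-RNN hybrid pattern can be made concrete with a short sketch. The following PyTorch model is an illustrative assumption of such an architecture (the layer sizes and sensors-as-channels layout are ours, not any cited paper's): 1-D convolutions extract local spatial features across sensors at each time step, and an LSTM captures the temporal dependencies.

```python
import torch
import torch.nn as nn

class ConvLSTMForecaster(nn.Module):
    """Illustrative CNN-RNN hybrid for traffic state prediction."""
    def __init__(self, n_sensors: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(32 * n_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_sensors)  # next-step speed/flow

    def forward(self, x):
        # x: (batch, time, sensors), e.g. 5-min aggregated PeMS readings
        b, t, s = x.shape
        z = self.conv(x.reshape(b * t, 1, s))   # per-step spatial features
        z = z.reshape(b, t, -1)
        out, _ = self.lstm(z)                   # temporal dependencies
        return self.head(out[:, -1])            # forecast for the next step
```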
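
Similarly, as referenced in the traffic incidents bullet, autoencoder-based detection can be sketched as follows; the layer sizes and thresholding rule are illustrative assumptions. The model is trained to reconstruct incident-free traffic data only, so a high reconstruction error flags a potential incident.

```python
import torch
import torch.nn as nn

class TrafficAutoencoder(nn.Module):
    """Autoencoder trained on incident-free traffic feature vectors
    (e.g., speed/flow/occupancy readings); anomalies reconstruct poorly."""
    def __init__(self, n_features: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                     nn.Linear(32, 8))
        self.decoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                                     nn.Linear(32, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def flag_incident(model, x, threshold: float) -> torch.Tensor:
    # The threshold would be calibrated on held-out incident-free data.
    error = ((model(x) - x) ** 2).mean(dim=-1)
    return error > threshold
```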

Based on all the studies reviewed in this paper, deep learning has undeniably achieved better results than existing techniques for addressing intelligent transportation problems. The major growth has occurred in the past three years, which account for more than 70% of all ITS-related DL research performed so far.

Future Work and Challenges

In recent years, DL methods have achieved state-of-the-art results in various visual recognition and traffic state prediction tasks. The majority of visual recognition work, such as vehicle and pedestrian detection and traffic sign recognition, has focused on autonomous driving or in-vehicle cameras. However, a significant number of overhead cameras have been installed by city traffic agencies and state Departments of Transportation, and these are mostly used for human-evaluated surveillance. To date, only a few studies have used these cameras for determining traffic volumes and speeds on freeways and arterials, or for surveillance purposes such as automatically detecting anomalies or traffic incidents (particularly at a large-scale, citywide level). Currently, the majority of traffic intersections rely on loop detectors for vehicle counting and for actuating traffic signals, but installing loop detectors is intrusive in that it requires road closures. Cameras, on the other hand, can serve as a cheap, nonintrusive detection technology for counting traffic volume in all directions as well as turning movements, the presence of pedestrians, etc., thereby facilitating smart traffic signal control strategies. Two main challenges, however, must be addressed in developing DL techniques that treat cameras as sensors. First, such methods need to handle the large volume of data collected from the hundreds or thousands of cameras installed at a citywide or statewide level; efficiently providing real-time or near-real-time inference from this volume of data is currently one of the primary challenges of using cameras as sensors. Second, the methods developed need to perform with minimal or no calibration so that they are feasible to apply and maintain at scale.

The ITS community also needs to focus on creating more benchmark datasets for the different research tasks related to DL applications. Although PeMS has been a popular dataset for traffic state prediction, as shown in Table 1, the absence of comparable benchmark datasets for traffic incident inference and ride-sharing studies has led most of these studies to use original datasets, making it difficult to compare algorithms and determine the state-of-the-art model. Indeed, the lack of a recognized benchmark dataset is likely one of the reasons these research areas have not yet been significantly explored using DL models.

While this study has shown that DL models have been successfully applied to traffic state prediction, vehicle ID, and visual recognition tasks, significant improvements are needed in the use of DL models for other research topics, such as traffic incident inference, traffic signal timing, ride-sharing, and other public transportation concerns. These topics have not yet been fully explored using DL models, and there remains significant scope for improving detection and prediction accuracy in these areas.

While DL models are becoming increasingly popular among researchers as the most effective classification method for visual recognition tasks in the ITS domain, privacy and security are extremely important, and the potential for adversarial attacks, and thus the need to robustify DL models, has been receiving greater attention. (Adversarial attacks in this domain are, in most cases, small changes to the input that are imperceptible to the human eye but cause the classifier to misclassify.) For example, self-driving cars use DL algorithms to recognize traffic signs (Cireşan et al. 2012), other vehicles, and related objects for navigation. If a DL model fails to detect a stop sign because a few pixels have been slightly modified, this can create a serious impediment to the adoption of self-driving cars. Adversarial attacks are, therefore, an increasing area of focus across DL application research topics such as natural language processing, computer vision, speech recognition, and malware detection (Najafabadi et al. 2015; Collobert and Weston 2008; LeCun et al. 2010; Deng et al. 2013; Hardy et al. 2016; Tan et al. 2020).

Biggio et al. (2013) have called into question the advisability of using neural networks and SVMs in security-sensitive applications, demonstrating the legitimacy of this concern by attacking arbitrary PDF files and the MNIST dataset with a gradient-descent evasion attack algorithm of their own design; their suggested defense is to employ regularization terms in classifiers. In the same vein, Szegedy et al. (2013) have shown that accuracy on adversarially perturbed input is far lower than on input corrupted by high-magnitude random noise. Another downside of DL classification methods is that adversarial attacks can be independent of the classification model, meaning that an attacker can fool a machine learning system without any access to the model. These are called black-box attacks, a concept first introduced by Papernot et al. (2016); in white-box attacks, by contrast, the attacker has access to all relevant information, such as the training dataset and the model. Madry et al. (2017), for example, have used a projected gradient descent (PGD) attack, in contrast to related work that has mostly used attacks based on the Fast Gradient Sign Method (FGSM). Moosavi-Dezfooli et al. (2017) have devised a systematic way to compute universal attacks: small, image-agnostic perturbations with a high probability of breaking most classifiers. Concurrently with research on designing attacks and understanding the vulnerability of neural networks to them, researchers have studied ways to defend against adversarial attacks and make DNNs robust to them. One of the most popular defenses is to add the adversarial examples generated by an attack algorithm to the training set and train the neural network on the augmented dataset (Fawcett 2003). Goodfellow et al. (2014b) have shown that although this method works for specific perturbations, networks trained this way are not robust to all adversaries. For example, while working to mitigate the effect of adversaries using denoising autoencoders (DAEs), Gu and Rigazio (2014) found that the resulting DNN became even more sensitive to perturbed input data.
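
To make the attack concrete: FGSM, mentioned above, perturbs an input by a single signed-gradient step, and PGD iterates such steps with projection back onto the ε-ball. Below is a minimal PyTorch sketch of FGSM, a generic formulation rather than the exact implementation of any cited paper.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps: float) -> torch.Tensor:
    """Fast Gradient Sign Method: one signed-gradient step that increases
    the classification loss within an L-infinity ball of radius eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in a valid range
```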

Around the same time, Bastani et al. (2016) designed metrics to measure the robustness of networks, approximating them by encoding robustness as a linear program, which can then be used to improve the robustness of the overall DNN. Defense against adversarial attacks can also be framed as a robust optimization problem: Shaham et al. (2018) have shown that adversarial training with their proposed algorithm, derived from robust optimization theory, increases both the accuracy and the robustness of the DNN. Likewise viewing robust learning through a robust optimization lens, Esfandiari et al. (2019) developed an algorithm that achieves accuracy comparable to state-of-the-art methods while avoiding much of the computational overhead of computing worst-case adversarial attacks. Another recent method for hardening DNNs against adversarial attacks is defensive distillation, which showed outstanding preliminary results, reducing the adversarial attack success rate from 95% to 0.5% (Papernot et al. 2016); however, Carlini and Wagner (2017) defeated this defense by designing a more powerful attack able to break it. Thus, defense and design against adversarial attacks remain an open problem in DL applications.
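
The min-max structure of adversarial training discussed above can be sketched schematically as follows. This single-step variant reuses the FGSM sketch from earlier as the inner maximization, whereas Madry et al. (2017) use multi-step PGD; the function and parameter names are our own illustrations.

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps: float) -> float:
    """One min-max training step: attack the batch (inner maximization),
    then descend on the loss of the perturbed batch (outer minimization)."""
    x_adv = fgsm(model, x, y, eps)   # fgsm defined in the sketch above
    optimizer.zero_grad()            # clear gradients left by the attack
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```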

As mentioned above, most studies on the application of DL models in transportation have paid no attention to robustness. In light of emerging malicious attacks, however, defending models against such attacks has become increasingly important. These attacks typically corrupt the input data by adding noise and can thus disturb the control unit by causing it to infer wrong information from the data, potentially resulting in serious accidents. Weather conditions such as rain or snow are another source of noise. Increasing the robustness of detection models will enable ITS models to operate better in severe conditions and thus improve their performance.

In summary, though much research is underway across the various domains of ITS using a variety of DL models, future research in DL for ITS should focus on the following: how to develop DL models that can efficiently use the heterogeneous data ITS generates, how to build robust detection models, and how to ensure security and privacy in the use of these models.