1 Introduction

The emergence of software-defined cloud infrastructures and integrated platforms, along with pioneering digital technologies such as machine and deep learning, streaming analytics, microservices architecture (MSA), container management solutions, distributed and decentralized IoT architectures, fog or edge data analytics, and 5G communication, is driving digital disruption, innovation, and transformation for corporations and cities worldwide. Nations setting up and sustaining smarter cities are empowered by the faster maturity and stability of these game-changing technologies and tools. With the continued advancements in the information and communication technologies (ICT) space, the speed with which smarter cities are being established is praiseworthy. However, the complexity arising from the use of multiple, heterogeneous technologies for realizing smart cities is also on the rise. Therefore, adopting complexity-mitigating and value-adding technologies helps planners, decision-makers, and administrators surmount those complications and quickly bring forth people-centric, extensible, adaptive, knowledge-driven, innovation-filled, cloud-enabled, and safe cities.

1.1 The transformative technologies for transport and traffic domains

Traffic management is a central concern in smarter city projects. With the growing population of cars and other vehicles, connectivity infrastructures such as roadways, expressways, tunnels, bridges, and underground passages are under increasing stress. Traffic snarls, congestion, and blockages damage people’s productivity. A great deal of fuel is wasted because of frequent stops and slow vehicle movement at junctions on the way to the destination. Research scholars and scientists have made concerted efforts to bring forth strategically sound solutions for real-time intelligent traffic management. However, these have been found insufficient for various reasons. Now, with the emergence of path-breaking technologies, automated tools, optimized processes, and integrated platforms, researchers across the globe have started to focus on breakthrough solutions to minimize traffic congestion and road blockages. There is a unified view that real-time, actionable, data-driven insights are needed to regulate and rectify traffic issues. That is, capturing all kinds of vehicle movement data, road capacities, driver intention, destination, and local traffic information, and subjecting them to a variety of mining, processing, and analytics, is the way forward for smarter traffic management.

The continuous maturing of artificial intelligence (AI) technologies such as machine and deep learning also contributes to the smartness of traffic management. Finally, the recent concepts of fog or edge analytics, digital twin, and blockchain are attracting a lot of attention. This paper describes a new automated transport management solution that is built on the seamless and smart integration of the above-mentioned transformative technologies.

1.2 Research problem description

Traffic statistics across cities indicate that the number of road accidents is on the rise, traffic congestion is becoming alarming, the car population is growing fast, the time spent on the roads is increasing, and the fuel and time wastage due to traffic snarls is higher. Pleasure trips and joyrides also contribute to more vehicles on the roads. Roadside hotels and motels are increasing in number. The number of traffic signals is steadily growing to regulate the escalating traffic. There is a growing family of traffic management systems that automate several aspects of traffic regulation.

There is a realization that big data and streaming data analytics are the viable approach to further and deeper automation. There are integrated platforms (commercial-grade and open source) for both activities, and these platforms are readily available in cloud environments. The major components of the workflow are collecting all kinds of road, car, and traffic data; carrying them to cloud platforms; subjecting the collected, curated, and cleansed data to a variety of investigations to arrive at decision-enabling insights; taking decisions on time; and plunging into appropriate actions. However, with clouds being operated at remote locations, real-time data capture, communication, processing, decision enablement, and actuation are out of the question. Therefore, analytics professionals recommend edge device clouds, rather than off-premise, online, and on-demand cloud infrastructure, as the best fit for real-time data collection and crunching to facilitate real-time decision-making and actuation. Thus, the faster IoT edge/fog computing matures and stabilizes, the more advanced the traffic management capabilities become.

That is, cloud-based big data analytics and IoT edge data analytics through edge device clouds are closely synchronized. Even so, the advantages of this design should not be overstated, because big data analytics typically performs deterministic, diagnostic, and historical processing and mining. That is, the processing and analytics logic has to be coded manually and deployed. Challenges therefore remain in arriving at competent traffic management systems. This paper proposes a fresh and futuristic approach to producing viable, self-learning, and automated traffic management systems.

1.3 Embarking on next-generation intelligent transport systems (ITS)

Conventional IT-enabled ITSs are found insufficient and obsolete in the increasingly connected and complicated transport world. The fast-growing traffic conundrum calls for highly sophisticated and technology-intensive solutions. Fortunately, the technology domain is also on the fast track, producing breakthrough technologies and tools that simplify and streamline the path toward highly competitive and cognitive transport systems and services. This section describes the prominent technologies contributing to the faster realization of next-generation transport solutions.

Traffic lights have become prominent and pervasive in urban areas for enabling the smooth flow of pedestrians as well as vehicles. High-fidelity video cameras are plentiful along roads, expressways, tunnels, etc., supporting a variety of real-time tasks for pedestrians, traffic police, and vehicle drivers. Wireless access technologies such as Wi-Fi, 3G, and 4G, along with roadside units and smart traffic lights, have been deployed along the roads. Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) interactions enrich this scenario. All kinds of connected vehicles and transport systems need actionable insights in time to derive and deliver a rich set of context-aware services. Safety is an important factor for car and road users, and additional temporal as well as spatial services are being worked out. With driverless cars under intense development and testing, insights-driven decisions and knowledge-centric actions are vital for next-generation transport.

Every vehicle is connected. An in-vehicle infotainment system is being fitted in every kind of vehicle on the road. This in-vehicle system acts as the centralized controller and the gateway to the outside world. Its communication module captures and communicates all kinds of operational, health, and performance parameter values of every significant module of the vehicle to faraway cloud environments. A cloud-hosted intelligent traffic system (ITS) has to be in place to act as the data cruncher, decision-maker, and actuator. The ITS has to be highly introduced.

1.3.1 Fog/edge analytics through device clouds

Typically, cloud computing prescribes centralized, consolidated, and sometimes federated processing through a variety of cloud models (public, private, hybrid, and community clouds) to fulfill new-generation computing needs. Now, with the accumulation of distributed and dissimilar devices emerging as a new viable source for data generation, collection, storage, and processing, the cloud idea is expanding substantially toward the era of edge or fog clouds, which are distributed yet local clouds for proximate processing. That is, the growing ecosystem of resource-constrained as well as powerful fog devices (smartphones, device and sensor gateways, single-board computers such as Raspberry Pi, etc.), in close collaboration with traditional clouds, is emerging as a formidable force for accomplishing the strategic goal of precision-centric data analytics.

Next-generation data analytics is expected to be achieved through extended clouds, which are a hybrid of conventional and edge clouds. That is, sophisticated analytics happens not only at faraway cloud servers but also at the edge devices, so that the security of data is ensured and scarce network bandwidth is saved. The benefits of such enhanced clouds are vast and varied. Primarily, insights-filled applications and services will be available everywhere, all the time, to be dynamically discovered and used for building and delivering sophisticated applications to people. There are convincing business, technical, and use cases for edge clouds and analytics for discovering and disseminating real-time knowledge.

1.3.2 Relevant and real-time vehicle and traffic information through edge clouds

Edge analytics is gaining a lot of momentum these days. With edge devices being embedded with sufficient processing, storage, and I/O power, they are individually as well as collectively ready to participate in mainstream computing. These devices can collect and process any incoming data and emit useful information in real time. The shared information helps the participating sensors and actuators plan and perform their activities with cognition, clarity, and confidence. Vehicles on the road are being fitted with a number of purpose-specific and purpose-agnostic sensors and actuators to proactively capture all the right and relevant data. The centralized infotainment system or OBD dongle contributes immensely to making vehicles smarter. The road infrastructure is also fitted with various cameras, sensors, Wi-Fi gateways, and other electronics to enable data gathering, aggregation, and communication. The in-vehicle infotainment system communicates, cooperates, and correlates with the road infrastructure modules so that, in sync with one another, they collectively perform real-time and secure data capture, cleansing, filtering, decision enablement, and actuation.

Vehicles talk to one another as well as to the roadside IT and electronics equipment to recognize and relay the real-time situation on the road. The roadside infrastructure also comprises a variety of sensors to measure the distance and the speed of approaching vehicles from every direction. Other requirements include detecting pedestrians and cyclists crossing the street so that a smart traffic light can proactively issue “slow down” warnings to incoming vehicles and instantaneously modify its own cycle to prevent collisions. Besides ensuring utmost safety and the free flow of traffic, all kinds of traffic data need to be captured and stored for specific analytics to accurately predict and prescribe ways of substantially improving the traffic system. Ambulances need to be given a way through on traffic-free open lanes in the midst of chaotic traffic.
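
As an illustration only, the “slow down” warning logic can be sketched from the measured distance and speed of each approaching vehicle. The names `Approach`, `time_to_arrival`, and `slow_down_targets`, as well as the 4-second threshold, are assumptions made for this sketch and do not come from the proposed system.

```python
from dataclasses import dataclass


@dataclass
class Approach:
    """One approaching vehicle as seen by a roadside sensor (hypothetical schema)."""
    vehicle_id: str
    distance_m: float   # distance to the crossing, in metres
    speed_mps: float    # speed toward the crossing, in metres per second


def time_to_arrival(a: Approach) -> float:
    """Seconds until the vehicle reaches the crossing; infinite if it is not moving."""
    if a.speed_mps <= 0:
        return float("inf")
    return a.distance_m / a.speed_mps


def slow_down_targets(approaches, crossing_occupied, threshold_s=4.0):
    """Return the IDs of vehicles that should receive a "slow down" warning.

    threshold_s is an assumed safety margin, not a value from the paper.
    """
    if not crossing_occupied:
        return []
    return [a.vehicle_id for a in approaches if time_to_arrival(a) < threshold_s]


approaches = [
    Approach("car-1", distance_m=30.0, speed_mps=15.0),   # arrives in 2.0 s
    Approach("car-2", distance_m=120.0, speed_mps=12.0),  # arrives in 10.0 s
]
warned = slow_down_targets(approaches, crossing_occupied=True)
```

With a pedestrian on the crossing, only `car-1` falls under the assumed threshold and would be warned; a real deployment would also have to account for sensor noise and braking distance.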

1.3.3 Digital twin

This is the latest buzz in the IT space. The ground-level entities (physical elements) are being integrated with cloud-based applications (cyber applications). This formal integration accordingly empowers the physical entities to join in the mainstream computing. This is the overall gist of cyber-physical systems (CPSs) and the Internet of things (IoT). Primarily, scores of industrial and manufacturing machines get integrated with remotely held applications and data sources. This setup enables the machines to be extremely and elegantly sensitive, responsive, and adaptive in their actions.

Now, the idea of the digital twin is to have a corresponding virtual image for each physical asset on the ground. That is, the virtual entity has the same structural and behavioral properties as the corresponding physical element. The digital twin provides a dynamic virtual/digital representation of each physical system. This cloud-based virtual representation helps to gain a better and deeper understanding of all kinds of ground-level physical, mechanical, electrical, and electronics systems and of how they collaborate, corroborate, and correlate with one another in the vicinity. The actions and reactions of these ground-level elements can be easily visualized, modeled, studied, and articulated through their corresponding virtual entities. There are other benefits of having a virtual replica of physical things. Ultimately, the concept of the digital twin takes the current IoT capability to the next level.

1.3.4 The machine and deep learning methods

Machine and deep learning are among the most active topics in IT at this point in time. The data being generated and collected from different and distributed sources are growing exponentially; that is, this is the big data era. The data are multi-structured. The data size, speed, scope, structure, and schema vary, and hence it is a tremendous challenge for data engineers and management professionals to extract useful and usable information out of big data. There are a number of standardized big data analytics solutions in the form of enabling tools and integrated platforms. These analytical solutions typically perform batch processing, which many users find inadequate. The trend is toward real-time analytics of big data. That is, extracting actionable intelligence in time out of big data is the motive behind the recent advancements in the analytical space. Another interesting trend is automated analytics. That is, next-generation analytics platforms are being equipped with a variety of learning algorithms that empower them to self-learn, reason, train, model, understand, and articulate newer evidence-based hypotheses.

1.3.5 Data lake for transport and traffic data heaps

Data lakes are becoming commonplace across industry verticals. All kinds of multi-structured data are stored in a centralized place to be found, accessed, and used for extracting useful insights out of data heaps. Data scientists use data lakes extensively in their everyday work. For setting up and sustaining insights-driven transport management systems, data lakes are essential. Object storage facilities in cloud environments facilitate the realization of data lakes. Application programming interfaces (APIs) are attached so that the outside world can find and bind with the data collections to build futuristic applications.

1.3.6 Blockchain technology

Blockchain is a relatively new paradigm that is gaining a lot of momentum and has found followers across various industry sectors. This technology has brought in newer possibilities and opportunities for the transport sector. There are forecasts that as many as 54 million autonomous vehicles will be on the road by 2035. As the number of vehicles increases, so too will the volume of data. Also, by 2020, there will be 8.6 million connected features in cars, and there are estimates that today’s cars contain up to 100 electronic control units, which equates to 100 million lines of code. There are strategic use cases for the automotive industry through the fast-evolving blockchain paradigm. Smart contracts are being coded to bring the required intelligence to vehicles, traffic systems and databases, drivers, owners, etc. All kinds of interactions and transactions between the various participants are securely stored in the blockchain database. Thus, in the days ahead, there will be closer and tighter integration between vehicles and blockchain technology.
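
To make the idea of tamper-evident storage of vehicle transactions concrete, the following is a minimal hash-chain sketch. It is not a blockchain platform: there is no consensus, networking, or smart contract execution, and the record fields (`vehicle`, `event`, `toll`) are hypothetical.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a record that is linked to the previous block by its hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "prev": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})


def is_valid(chain):
    """Recompute every hash and link; any edited record breaks the chain."""
    prev = "0" * 64
    for i, block in enumerate(chain):
        expected = block_hash(i, prev, block["payload"])
        if block["index"] != i or block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


chain = []
append_block(chain, {"vehicle": "car-1", "event": "toll_paid", "toll": 3})
append_block(chain, {"vehicle": "car-2", "event": "lane_change"})
valid_before = is_valid(chain)

chain[0]["payload"]["toll"] = 0      # simulate tampering with a stored record
valid_after = is_valid(chain)
```

Because each block's hash covers the previous block's hash, altering an earlier record invalidates the chain from that point on, which is the property that makes such a ledger suitable for recording interactions among vehicles, drivers, and traffic systems.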

The noteworthy factor here is that the smarter traffic system has to learn, decide, and act instantaneously to avert accidents. That is, real-time reaction is the crucial need and, hence, the concept of edge clouds formed out of edge devices, which collaboratively collect different data and process them instantaneously to produce insights, is gaining widespread momentum. Another point here is that data flows in streams. Thus, all kinds of discrete/simple as well as complex events need to be precisely captured and combined and then subjected to a bevy of investigations so that appropriate actions can be taken. The whole process has to be initiated at the earliest through a powerful knowledge discovery and dissemination platform to avoid any loss of life or property. Here, collecting and sending data to remote cloud servers to arrive at decisions is found inappropriate for real-time and low-latency applications. However, the edge data can be aggregated and transmitted to powerful cloud servers in batches for historical, diagnostic, and deterministic analytics at a later point in time.
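
The split between instantaneous edge reaction and deferred batch upload can be sketched as follows. The class `EdgeNode`, its thresholds, and the event fields are assumptions for illustration; the list `uploaded_batches` merely stands in for a remote cloud endpoint.

```python
from collections import deque


class EdgeNode:
    """Hypothetical edge node: reacts locally, defers bulk upload to the cloud."""

    def __init__(self, batch_size=3, speed_limit_mps=20.0):
        self.batch_size = batch_size            # assumed batch size
        self.speed_limit_mps = speed_limit_mps  # assumed local alert threshold
        self.buffer = deque()
        self.uploaded_batches = []              # stands in for a cloud endpoint

    def on_event(self, event):
        """Handle one streamed event; return an immediate local action or None."""
        action = None
        if event["speed_mps"] > self.speed_limit_mps:
            # Decided at the edge, without a round trip to the remote cloud.
            action = f"warn:{event['vehicle_id']}"
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()
        return action

    def flush(self):
        """Ship buffered events as one batch for later historical analytics."""
        if self.buffer:
            self.uploaded_batches.append(list(self.buffer))
            self.buffer.clear()


node = EdgeNode()
actions = [
    node.on_event({"vehicle_id": "v1", "speed_mps": 12.0}),
    node.on_event({"vehicle_id": "v2", "speed_mps": 27.5}),
    node.on_event({"vehicle_id": "v3", "speed_mps": 18.0}),
    node.on_event({"vehicle_id": "v4", "speed_mps": 9.0}),
]
```

The warning for `v2` is produced as soon as its event arrives, while the first three events reach the cloud only once the batch is full; the fourth stays buffered until the next flush.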

2 The proposed solution approach

We propose a real-time and cognitive traffic congestion avoidance solution. Having studied the current lacunae in traffic management solutions, we have devised an advanced, extensible, and AI-inspired solution that precisely measures the traffic situation and the driver intention in real time by leveraging localized fog analytics and the power of the digital twin, along with big data processing using competent machine learning methods. The reference architecture for our solution is shown in Fig. 1.

Fig. 1 Solution approach diagram

2.1 The solution architecture description

There are six principal ingredients for enabling congestion discovery and dispersal, avoidance, and prediction.

  1. Gathering situational information in real time (the current road and vehicle data) through fog or edge data analytics.
  2. Gaining driver history, behavior, and intention through machine learning (ML) and deep learning (DL).
  3. Data lake at cloud for stocking historical information.
  4. Intelligent transport system (ITS).
  5. The virtual vehicle (VV) model (digital twin).
  6. Blockchain as a service for vehicles.

The situational details are captured through a variety of multifaceted cameras deployed along the road and route. Secondly, the driver intention is captured and decided through the VV model, which is described in the following sections. The key device is the vehicle telematics system, which acts as the primary gateway between the car and the outside world.

2.2 Edge analytics-based virtual vehicle (VV) networks

To address the traffic challenges, we make the following proposal. With the availability of powerful cameras and sensors along the roads, bridges, expressways, tunnels, signals, etc., a massive amount of real-time as well as historical data is captured, collected, cleaned, and stored to be crunched. One of the decision-enabling factors for proactively and preemptively avoiding traffic congestion is the driver intention. Figure 2 illustrates how the driver intention is deduced from the various data collections and the digital twin, which is formed through a virtual vehicle (VV) model. The need here is to formulate a flexible and futuristic VV model that enables machine and deep learning algorithms to predict the driver intention accurately.

Fig. 2 VV architecture

Since the proposed VV model makes decisions, it needs detailed driver information, such as preferences as to which lane the driver or automated vehicle is likely to select and route plans, which together are considered the ‘intention’. The VV model can obtain scalable, real-time driver intention data captured locally from the vehicle as well as from the edge cloud and the remote cloud. By processing these data in the edge and by interacting with other VVs, a VV can predict other drivers’ intentions, and this intention information can be used in a variety of scenarios. The VV is a virtual state of the vehicle and driver, which is processed in the edge and exists in the cloud.

The VV can interact with other VVs in the edge, where it is not limited by communication and computation resources. VVs for driverless vehicles can make decisions about path planning and interaction with other vehicles, while VVs for non-autonomous vehicles can help drivers make decisions by mining other drivers’ intentions. By obtaining data directly from the cloud and actively communicating with other VVs, the VV can coordinate with others to form a VV network (VVN). The physical vehicle or traffic controller behaves like an actuator on the road, acting upon directions from the VVN to the edge.

The role of the digital twin, as a virtual representation of various physical, mechanical, electrical, and electronics assets and artefacts, is set to grow further in the days to come. Cloud centers are emerging as the best-in-class IT environment for activating and accelerating the digital twin capability to produce actionable insights in time. The VV model, the digital twin for the transport industry vertical, is to be realized through integration with various contributing systems so that it can be adaptive.

That is, the VV model orchestrates through several entities to be accurate and authentic. A variety of parameters are incorporated to make the VV approach viable and venerable.

The localized data, captured and filtered through edge or fog devices, convey the real-time situation at the ground level. The traffic scenario, the road and vehicle data, and other useful information are collected by fog devices and subjected to a variety of investigations to extract usable information, which can be communicated to the faraway and powerful clouds and synchronized with the historical data to enhance the accuracy of the decisions. The VV model comes in handy in contributing to knowledge discovery and dissemination. Finally, the ITS acts based on the insights accrued.

2.3 Virtual vehicle (VV) model

Having discussed the various ingredients of the solution, this section describes the VV model. In this VV model, we describe various variables and parameters to arrive at a competent VV model.

For more than a decade, multi-agent systems have been an active area of research [1–5, 6, 7]. Agent technology, which relies on distribution, provides a natural solution to the highly distributed and dynamically changing problem of traffic management and control. Although some existing approaches utilize multi-agents to solve traffic congestion, these methods rely on traffic control centers; traditional agents cannot make decisions for users. In our proposed model, however, the VV is a virtual state of both driver and vehicle and has personalized knowledge of each vehicle, so that it can make more effective decisions than the existing agent-based models.

We have designed and developed approaches based on multi-agent systems for several related problems [8–10] and will leverage these approaches for the proposed VV model. These agents can provide distributed data mining and autonomous data-to-decision processing using minimal computing and networking resources, without moving huge amounts of data to the analytic code.

One of the major challenges in the design of the VV model is how to capture the intent information needed to make decisions and predictions for real-time control. We will leverage the existing approaches [11–20, 21–29] to capture the vehicle and driver intentions and improve upon them to ingest the intention data for the proposed VV model. Most of the challenges in VV modeling are technical problems that come from handling large amounts of information and modeling highly dynamic interactions. For example, the model constructed should satisfy all potential VVs; the same vehicle with different drivers, and different vehicles with the same driver, form different VVs. There are, therefore, a large number of VVs to track and model. An additional layer of complexity comes from the fact that each VV must intelligently make decisions according to dynamic traffic information, such as the flow of the vehicles and their related information. Therefore, each of the multitude of VVs should have personalized knowledge about the driver and the vehicle, which is another source of high information load. The need for VVs to interact with each other and achieve cooperation only increases the model’s complexity. Another key challenge is how to describe the data: both the objective information a VV needs to operate and the subjective driver preferences.
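
One consequence of the observation that each vehicle and driver combination forms its own VV is that a VV is naturally identified by the pair rather than by either element alone. A minimal sketch of such a registry follows; `VVKey`, `VirtualVehicle`, and `VVRegistry` are hypothetical names, not part of the proposed implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VVKey:
    """Identity of a virtual vehicle: one (vehicle, driver) combination."""
    vehicle_id: str
    driver_id: str


@dataclass
class VirtualVehicle:
    key: VVKey
    knowledge: dict = field(default_factory=dict)  # personalized driver/vehicle knowledge


class VVRegistry:
    """Tracks one VirtualVehicle per distinct (vehicle, driver) pair."""

    def __init__(self):
        self._vvs = {}

    def get_or_create(self, vehicle_id, driver_id):
        key = VVKey(vehicle_id, driver_id)
        if key not in self._vvs:
            self._vvs[key] = VirtualVehicle(key)
        return self._vvs[key]

    def __len__(self):
        return len(self._vvs)


registry = VVRegistry()
vv_a = registry.get_or_create("car-1", "alice")
vv_b = registry.get_or_create("car-1", "bob")     # same vehicle, different driver
vv_c = registry.get_or_create("car-2", "alice")   # same driver, different vehicle
vv_again = registry.get_or_create("car-1", "alice")
```

Two vehicles and two drivers already yield three distinct VVs here, which illustrates why the number of VVs to track grows faster than the number of vehicles.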

To address these challenges, we have designed the VV model as shown in Fig. 3, with our rule engine at the core. In the proposed VV model, network data include the information related to the edge network, such as bandwidth and boundary data. Sensor data include the current GPS coordinates, current speed, and average speed of the vehicle. This dynamic vehicle information is captured in the fact database, and the cache of the facts needed by the knowledge session is stored in the working memory. The knowledge base includes the interest reference, such as scenic route; the path preference, such as fastest route; and driving characteristics information, such as rash driving. The fact database and the knowledge base information are used by the learning agent to learn about the VV and capture the vehicle/driver intentions. Examples of the features captured in the individual action learning (IAL) phase include the current speed, make, model, and year of the vehicle. An example of the features captured in the joint action learning (JAL) phase is the emergency situation. The features captured in the fact database and knowledge base are processed in the rule engine to generate actions. An example of the actions generated by the rule engine is informing other VVs in the VVN of known traffic accidents, lane closures, etc. The VVs coordinate with edge devices, the cloud, and other VVs through the execution agent (where the decision/action is generated) and the interaction interface (where the decision/action is executed through the network). Please refer to the other resource section of the facilities, equipment and other resources section for one or more features that we identified as important for a given action based on the preliminary study.

Fig. 3 VV model
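
The interplay of fact database, knowledge base, and rule engine described for the VV model can be illustrated with a very small rule matcher. This is a sketch under stated assumptions: the `Rule` type, the field names, the two example rules, and the action strings are all hypothetical and are not the rule engine used in the proposed system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict, dict], bool]  # (facts, knowledge) -> matched?
    action: str


def run_rules(rules, facts, knowledge):
    """Return the actions of all rules whose conditions match the current state."""
    return [r.action for r in rules if r.condition(facts, knowledge)]


rules = [
    # Inform other VVs in the VVN about a known accident.
    Rule("report_accident",
         lambda f, k: f.get("accident_ahead", False),
         "notify_vvn:accident"),
    # Assumed rule: a driver preferring the fastest route is rerouted when the
    # current speed drops below half of the vehicle's average speed.
    Rule("suggest_fast_route",
         lambda f, k: (k.get("path_preference") == "fastest"
                       and f["current_speed"] < 0.5 * f["average_speed"]),
         "reroute:fastest"),
]

# Fact database: dynamic sensor data for one vehicle (illustrative values).
facts = {"gps": (50.94, 6.96), "current_speed": 4.0,
         "average_speed": 12.0, "accident_ahead": True}
# Knowledge base: driver preferences and driving characteristics.
knowledge = {"interest": "scenic", "path_preference": "fastest",
             "driving_style": "rash"}

actions = run_rules(rules, facts, knowledge)
```

Changing only the knowledge base (for example, a driver who prefers scenic routes) changes which actions fire for the same facts, which is how different VVs can act differently in the same scenario.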

There are some key solutions needed to construct the VV model, including the technology to translate the different formats of incoming data, such as data from the roadside sensor and the network data from the intelligent transportation systems (ITS). To meet these technological needs, we plan to introduce the ontology needed to model the enormous dataset that the proposed project will handle. Similarly, the VV model requires technology to accurately describe the driver’s knowledge. In the VV model, knowledge provides the matching rules, and the VV makes decisions according to the result of fact matching. However, different drivers may have different knowledge, and drivers’ knowledge may change from interaction with other VVs. The matching rule must be accurate in describing the driver’s knowledge, mining the user’s intentions, and capturing them in the VV information space. Finally, VV interaction is another key technology. In our interactive VV model, each VV may take different actions in the same scenario, depending on each driver’s intentions.

2.4 Virtual vehicle and driver intention learning model

Since the VV is a virtual state of both vehicle and driver, it must learn the features of each through interaction; it must do the same with other VVs, interactively learning their current state and their intentions. Thus, the higher the number of VVs, the easier and more effective the learning would be. Machine learning (ML) algorithms have been traditionally applied for learning [30–38, 39].

We use a deep learning technique, recurrent neural networks (RNNs), to harness the knowledge needed for route selection. Recent studies have shown that deep learning techniques such as long short-term memory (LSTM)-based sequence-to-sequence RNNs perform better for connected vehicle applications [40–42]. For this reason, we utilized LSTM-based RNNs for VV learning. In this algorithm, a fixed number of neural networks is set, and the networks are trained on new sample data each time. One proposed research task is to deduce the generalizable theory that underlies our already developed algorithm and verify it; another is to leverage the learning algorithms and techniques developed earlier for related problems [43, 44–47].

2.4.1 Research challenges and future work recommendations

The VV must interactively learn large amounts of information both accurately and quickly. However, the present approaches to learning all have high time complexity and cannot be directly adopted for VV knowledge acquisition. For this reason, fast, efficient learning is a key research challenge for this objective. To solve this challenge, we divide the process of VV learning into two phases: the individual action learning (IAL) phase, which uses an RNN model for acquisition of knowledge of common functions, and the joint action learning (JAL) phase, which adapts the incremental learning model to allow VVs to acquire other vehicles’ intentions and knowledge through real-time interactions. The IAL model consists of three layers: (1) the input layer, where the fact database of the vehicle acts as the input for the RNNs; (2) the hidden layer, where we set an activation function and a threshold to solve the nonlinear problem; the knowledge base of the driver initializes the activation function and the threshold in the RNNs, and additional hidden layers can be set to improve learning accuracy; and (3) the output layer, where the VVs can obtain common knowledge, which can be stored to give newly created VVs immediate knowledge. In the JAL phase, VVs must quickly and effectively acquire knowledge from the corresponding vehicle and driver and from other VVs through online interaction using the incremental learning model. The JAL model consists of four layers: (1) in JAL layer one, each node represents an input variable and directly transmits the input signal to layer two; (2) in JAL layer two, each node represents the membership value of each input variable; (3) in JAL layer three, each node represents the “if” part of if–then rules obtained by the sum-product composition and the total number of such rules; (4) in JAL layer four, each node corresponds to an output variable that is given by the weighted sum of the output of each normalized rule. This model allows the VV to dynamically learn from other VVs. Moreover, we have real data, a set of taxicab traces containing recorded GPS trajectories from more than 7000 taxicabs in November of 2012 [48], that we can use to train our model. We will use the drivers’ experience and the taxis’ GPS trajectories as the input for the IAL model and traffic conditions, such as vehicle speed and weather, as the input for the JAL model.
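
The four JAL layers read like the forward pass of a fuzzy rule network. The sketch below is one possible reading, not the authors' implementation: it assumes Gaussian membership functions, takes the firing strength of a rule as the product of its membership values, and uses constant rule outputs. The inputs (speed and rain intensity), labels, and numbers are illustrative only.

```python
import math


def gaussian(x, center, width):
    """Assumed membership function; the paper does not specify its shape."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))


def jal_forward(inputs, memberships, rules, rule_outputs):
    # Layer 1: each node passes one input variable through unchanged.
    x = list(inputs)
    # Layer 2: membership value of each input for each of its fuzzy labels.
    mu = [{label: gaussian(x[i], c, w) for label, (c, w) in memberships[i].items()}
          for i in range(len(x))]
    # Layer 3: firing strength of each rule's "if" part (product of memberships).
    strengths = []
    for antecedent in rules:
        s = 1.0
        for i, label in enumerate(antecedent):
            s *= mu[i][label]
        strengths.append(s)
    total = sum(strengths)
    if total == 0.0:
        return 0.0
    # Layer 4: weighted sum of rule outputs using normalized firing strengths.
    return sum(s / total * out for s, out in zip(strengths, rule_outputs))


# Inputs: vehicle speed (m/s) and rain intensity in [0, 1]; values are illustrative.
memberships = [
    {"slow": (5.0, 5.0), "fast": (25.0, 5.0)},
    {"dry": (0.0, 0.3), "wet": (1.0, 0.3)},
]
rules = [("slow", "wet"), ("fast", "dry")]
rule_outputs = [0.9, 0.1]   # assumed congestion-risk score per rule

risk_slow_wet = jal_forward((5.0, 1.0), memberships, rules, rule_outputs)
risk_fast_dry = jal_forward((25.0, 0.0), memberships, rules, rule_outputs)
risk_between = jal_forward((15.0, 0.5), memberships, rules, rule_outputs)
```

Slow traffic in rain yields a score close to 0.9, fast traffic in dry weather a score close to 0.1, and an input halfway between the two rules lands at their average; in an incremental learning setting, the membership parameters and rule outputs would be the quantities updated online.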

This research objective aims to produce a fast, deep learning approach that allows VVs to make correct decisions for the driver, and we will design several algorithms to accomplish this goal. To evaluate the outcome, we plan to test the approach on an existing dataset.

2.5 The intelligent ITS: VV coordination

Some existing vehicle cooperation approaches, such as vehicular ad hoc networks (VANET) and navigation-based approaches, cannot efficiently coordinate automated vehicles or support communication between vehicles and traffic infrastructure [6, 49, 50, 51–54]. Furthermore, data collection, a key requirement for enabling routing and coordinating services in the vehicular network, has recently attracted considerable research interest.

3 Experimentation and results

To reduce data redundancy, we propose a VV cooperation approach that is based on the use of a coalition game algorithm in the cloud. In this approach, as shown in Fig. 2, each VV need not upload its (possibly redundant) captured data directly to the data center; instead, each VV interacts with other VVs, forming a coalition to collect data cooperatively. In our coalition algorithm, VVs first ascertain how captured data can be gathered and then form coalitions by exchanging this data gathering information. This coalition formation means that members can individually contribute to a scalable view of the data.

Our preliminary experiments were driven by data from TAPAS [55], a system that computes mobility plans for an area population, generated from information about Germans’ traveling habits and the infrastructure of the areas in which they live. We used the traffic simulation software SUMO [56] to generate vehicle traces from real data. We divided all the roads into 100 segments in an area of 600 m × 600 m, numbering each segment so that individual vehicles could be linked to their trace. We compared our approach against the following three related algorithms to evaluate its effectiveness: (i) the Max Greedy algorithm, where the sensing center selects the virtual vehicle that has the highest number of non-repeated data blocks; (ii) the Min Greedy algorithm, where a virtual vehicle is selected by the sensing center if it has both the least number of non-repeated data blocks for the sensing center and the least number of data blocks compared with the last vehicles; and (iii) the random algorithm, where virtual vehicles gather data and individually transmit it to the sensing center in a random manner.
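
To make the Max Greedy baseline concrete, the sketch below shows one plausible reading of it in Python: the sensing center repeatedly picks the virtual vehicle that contributes the most data blocks it does not yet hold. The function name `max_greedy`, the representation of blocks as integer ids, and the tie-breaking by insertion order are assumptions for illustration and are not taken from the experiments.

```python
def max_greedy(vehicle_blocks, total_blocks):
    """Greedy selection by number of non-repeated data blocks.

    vehicle_blocks -- dict mapping a vehicle id to the set of block ids it holds
    total_blocks   -- number of blocks M that make up the complete data
    Returns the list of selected vehicles and the set of blocks still missing.
    """
    needed = set(range(total_blocks))   # blocks the sensing center still lacks
    remaining = dict(vehicle_blocks)    # vehicles not yet selected
    selected = []

    while needed and remaining:
        # Pick the vehicle offering the most blocks not collected so far.
        best = max(remaining, key=lambda v: len(remaining[v] & needed))
        gain = remaining[best] & needed
        if not gain:
            break                       # nobody can contribute anything new
        selected.append(best)
        needed -= gain
        del remaining[best]

    return selected, needed


# Hypothetical example with M = 4 blocks and three virtual vehicles.
chosen, missing = max_greedy({"a": {0, 1, 2}, "b": {2, 3}, "c": {3}}, 4)
print(chosen, missing)
```

In this reading, every selected vehicle uploads all of its blocks, so the repeated blocks among the selected vehicles are what the ratio of redundancy would measure.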

We evaluated our algorithm against the related algorithms using two metrics: the ratio of redundancy and the success rate. The ratio of redundancy is defined as follows:

$$ \xi = \frac{\left( \sum_{i=1}^{n} M_{i} \right) - M}{\sum_{i=1}^{n} M_{i}}, $$

where \( n \) denotes the number of virtual vehicles that can provide complete data with \( M \) blocks, and \( M_{i} \; (\le M) \) denotes the number of blocks that vehicle \( i \) can gather. The success rate is defined as \( \rho = 1 - \frac{n_{c}}{N} \), where \( n_{c} \) denotes the number of virtual vehicles in the stable coalition and \( N \) denotes the total number of virtual vehicles in our experiments. This is based on the rationale that coalitions with fewer vehicles can help achieve coordination faster. The results of our preliminary experiments, in which \( M \) and \( N \) are set to 100 for the ratio of redundancy and success rate metrics, respectively, are shown in Fig. 4, which demonstrates that our approach is an effective way to reduce data redundancy without central control. Data collection is a relatively simple task for VVs in coordination, but traffic management requires a huge number of coordinated vehicles, and the coordinating process is more complex than data collection.
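
The two metrics can be computed directly from their definitions. The short Python sketch below does so for made-up numbers; the function names `redundancy_ratio` and `success_rate` and the sample values are illustrative assumptions, not figures from the experiments.

```python
def redundancy_ratio(blocks_per_vehicle, total_blocks):
    """Ratio of redundancy: (sum of M_i - M) / (sum of M_i)."""
    gathered = sum(blocks_per_vehicle)
    if gathered == 0:
        raise ValueError("no blocks were gathered")
    return (gathered - total_blocks) / gathered


def success_rate(coalition_size, total_vehicles):
    """Success rate: 1 - n_c / (total number of virtual vehicles)."""
    if total_vehicles <= 0:
        raise ValueError("total_vehicles must be positive")
    return 1 - coalition_size / total_vehicles


# Hypothetical case: M = 100 blocks covered by four vehicles that together
# upload 125 blocks, and a stable coalition of 20 out of 100 vehicles.
xi = redundancy_ratio([40, 35, 30, 20], total_blocks=100)
rho = success_rate(coalition_size=20, total_vehicles=100)
print(xi, rho)
```

In this made-up case, 25 of the 125 uploaded blocks are repeats, so the ratio of redundancy is 0.2, and the coalition of 20 vehicles yields a success rate of 0.8.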

Fig. 4 Experimental results for VV coordination. (a) Number of vehicles vs. ratio of redundancy; (b) number of vehicles vs. success rate

This paper presents a novel intelligent traffic management framework. Intelligent traffic management is gaining significance as the number of smart cities worldwide grows steadily. The required intelligence is realized by accurately predicting traffic congestion at specific places and by prescribing ways to ease traffic jams and snarls.

The technologies and tools used include software-defined cloud environments, the digital twin, artificial intelligence (AI) in the form of machine and deep learning algorithms, data lakes, real-time data capture, storage, processing, analytics, decision-making and action through IoT edge analytics, and edge and public cloud integration. By leveraging these technologies, we arrive at a framework that targets the accuracy needed in decision-making and subsequent actions. The digital twin is the virtual and logical representation of physical assets and processes. Physical and digital systems communicate directly so that the digital twin always holds the latest data.

Machine and deep learning algorithms can analyze big data in real time to extract actionable insights. The discovered knowledge is then disseminated to the relevant junctions and locations to keep traffic moving smoothly and avoid wasted time at those places.

3.1 Research challenges and future work recommendations

There are two key challenges to the coordination research objective: (1) many VVs must achieve coordination with each other in a short time; and (2) we must consider the intention of every VV in the process of coordination. In our VV architecture, as shown in Fig. 2, VVs can cooperate and send the cooperative results to the vehicles to provide a safe and pleasant experience for vehicles on the road. We, therefore, propose an approach based on the contract net protocol to overcome the challenges of a virtual transportation network. In our approach, we first assign weights between two VVs; vehicles that may produce traffic congestion are assigned traffic dispersion tasks. When a VV accepts the task, it becomes a manager and is responsible for sending and allocating the task to other VVs. These vehicles can then communicate with other VVs to make decisions. We plan to assess the proposed approach by validating the decisions that the vehicles make.
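
The announce, bid, and award cycle of a contract net protocol can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions, not the proposed approach itself: the class `VirtualVehicle`, the `detour_cost` attribute used as the bid, the `max_cost` field of the task, and the rule that the lowest bid wins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualVehicle:
    name: str
    detour_cost: float  # hypothetical cost of taking over a dispersion task

    def bid(self, task):
        """Return a bid (lower is better), or None to decline the task."""
        if self.detour_cost > task["max_cost"]:
            return None
        return self.detour_cost


def contract_net_round(manager, vehicles, task):
    """Run one announce/bid/award round and return the winner's name.

    The manager announces the task to every other VV, collects the bids,
    and awards the task to the lowest bidder. Returns None if nobody bids.
    """
    bids = {}
    for vv in vehicles:
        if vv is manager:
            continue                 # the manager does not bid on its own task
        offer = vv.bid(task)         # announcement and bid in one call
        if offer is not None:
            bids[vv.name] = offer
    if not bids:
        return None
    return min(bids, key=bids.get)   # award to the cheapest offer


# Hypothetical fleet: vv0 accepted the dispersion task and acts as manager.
fleet = [
    VirtualVehicle("vv0", 0.0),
    VirtualVehicle("vv1", 5.0),
    VirtualVehicle("vv2", 2.0),
    VirtualVehicle("vv3", 9.0),
]
task = {"description": "take alternative route", "max_cost": 6.0}
winner = contract_net_round(fleet[0], fleet, task)
print(winner)
```

A fuller version would let the winner become a manager for sub-tasks and would fold the pairwise weights between VVs into the bid, which this sketch leaves out.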

4 Conclusion

The transport sector is poised to accomplish bigger and better things in the years ahead, driven by a consistent flow of path-breaking technologies and tools. Pioneering technologies in information, communication, sensing, perception, vision, integration, knowledge discovery and dissemination, and decision enablement are collectively bound to benefit the automotive industry. Intelligent transport systems (ITS) already exist and, with the addition of real-time information gathering and analytics, we can expect ground-breaking accomplishments for the transport and logistics industry verticals. The faster proliferation of machine and deep learning algorithms, along with the evolving concepts of digital twin and blockchain, goes a long way toward bringing more sophisticated and smarter cars, trucks, buses, ships, trains, rockets and satellites, aeroplanes, and other transport solutions. In short, the world is becoming technology-rich and software-defined, which promises substantial benefits for citizens everywhere.