1 Introduction

The fifth generation of mobile and wireless communications (so-called “5G”) represents a complete revolution of mobile networks, aimed at accommodating the ever-growing demands of users, services and applications. In contrast to previous transitions between mobile network generations, 5G will bring much more complex management requirements, driven by the softwarisation of network resources [1]. This will ultimately lead to a “system” that requires real-time management based on a hierarchy of complex decision-making techniques that analyse historical, temporal and frequency network data. At the same time, softwarisation will contribute to automating processes, optimising costs, reducing time-to-market and providing better-quality services.

Modern 5G networks represent a “shift” in networking paradigms, implying a transition from today’s “network of entities” to a “network of functions”. Indeed, this “network of (virtual) functions”, which in some cases results from the decomposition of current monolithic network entities, can become the basic unit of networking for next-generation systems. These functions should be composable on an “on-demand”, “on-the-fly” basis. In fact, one research challenge in managing virtual network functions (VNFs) is to design solutions that identify a set of elementary functions (or blocks) from which network functions can be composed, whereas today they are usually implemented as monoliths. In this framework, uniform management and operations for VNFs are becoming part of the dynamic design of software architectures for 5G. 5G will not only be an evolution of mobile broadband networks; it will also create a set of novel and unique network and service capabilities, forming a sustainable and scalable technology. Consequently, it will help establish a proper ecosystem for technical and business innovation [2]. Current 5G priorities also include incorporating advanced automation, autonomicity and cognitive management features [3] to improve operators’ efficiency, with a positive impact on the broader competitiveness of the European ICT industry [4].

During 5G-PPP Phase 1, the SESAME project [5] evolved the Small Cell (SC) concept by integrating processing power (i.e., a low-cost micro-server) and by enabling the execution of applications and network services, in accordance with the Mobile Edge Computing (MEC) paradigm [6]. It also provided network intelligence and applications by leveraging the Network Function Virtualisation (NFV) concept [7]. Within the scope of 5G-PPP Phase 2, the ongoing 5G ESSENCE project [8] leverages results from SESAME, as well as from other Phase 1 projects, in order to cover the specific network needs of the vertical sectors and their inter-dependencies. The 5G ESSENCE project “addresses” the paradigms of Edge Cloud computing and Small Cell as-a-Service (SCaaS) by fuelling the drivers and removing the barriers in the Small Cell market, which is forecast to play a major role in the 5G ecosystem [6]. Thus, the 5G ESSENCE project provides a highly flexible and scalable platform, able to support new business models and revenue streams by creating a “neutral” host market and by reducing operational costs through new opportunities for ownership, deployment, operation and amortisation. The 5G ESSENCE enhances the processing capabilities for data that have immediate value beyond locality; it also addresses processing-intensive small cell management functions, such as Radio Resource Management (RRM)/Self Organising Network (SON); and, finally, it culminates in real-life demonstrations. In all these respects, the project proposes clear breakthroughs in the research fields of wireless access, network virtualisation and end-to-end (E2E) service delivery.

The 5G ESSENCE’s technical approach exploits the benefits of centralising Small Cell functions as scale grows, through an edge cloud environment based on a two-tier architecture: a first, distributed tier for providing low-latency services and a second, centralised tier for providing high processing power for compute-intensive network applications. This allows decoupling the control and user planes of the Radio Access Network (RAN) and achieving the benefits of Cloud-RAN (C-RAN) without its enormous fronthaul latency restrictions [9]. The use of E2E network slicing mechanisms will allow the 5G ESSENCE infrastructure to be shared among multiple operators/vertical industries, with its capabilities customised on a per-tenant basis. The versatility of the architecture is enhanced by high-performance virtualisation techniques for data isolation, latency reduction and resource efficiency, and by orchestrating lightweight virtual resources that enable efficient Virtualised Network Function placement and live migration.

2 Telemetry and Analytics in the 5G Ecosystem

Among other issues, modern service-driven 5G network architectures intend to cover a wider set of diversified mobile service requirements in a flexible and reliable way. With Software Defined Networking (SDN) and Network Functions Virtualisation efficiently supporting the underlying physical infrastructure, the 5G environment can “cloudify” access, transport and core networks in a comprehensive and reliable manner [10, 11]. Cloud adoption allows better support for diversified 5G services and, furthermore, enables the “key technologies” of end-to-end network slicing, on-demand deployment of service anchors and component-based network functions. The extended adoption of SDN, NFV, Cloud and Edge Computing in actual 5G architectures results in environments that are primarily software-driven in nature. In turn, this transformation towards “softwarisation” raises a great variety of challenges in how these environments can be effectively managed and monitored. These challenges are further “amplified” when the software functions need to “run” on infrastructure environments built from standard high-volume servers hosting multi-layer software functions.

On the other hand, 5G network management is a non-trivial endeavour that faces a host of new challenges beyond those of 3G and 4G, covering all radio and non-radio segments of the network [12]. Several factors, such as the number of nodes, the heterogeneity of access technologies, conflicting management objectives, the minimisation of resource usage, and the division between limited physical resources and elastic virtual resources, call for a critical change in the methodology for realising efficient network management. In frameworks covering previous generations, a distinction was typically made between the control and data planes of the network. The model of actual 5G networks, however, can be extended with a “Service and Softwarisation plane”, in which the management of network services and virtualised devices is an integral and indispensable part of the overall network. This approach can be used to propose a model in which network management relies on an increased overall capacity of computational resources, creating a robust solution.

Within this novel framework, telemetry and analytics are expected to play a fundamental role in the realisation and management of 5G network architectures [13]. Telemetry is an automated communications process (recording and transmission) by which measurements and other data are collected at remote or inaccessible points and transmitted to receiving equipment for monitoring. Telemetry data may be relayed via radio, infrared, ultrasonic, GSM, satellite or cable links, depending on the application (in fact, telemetry is used not only in software development but also in meteorology, intelligence, medicine and other fields). Telemetry provides visibility into what is actually happening, in real time, within the compute, network and storage infrastructures and at the service layers. Thus, telemetry can reveal which features end-users use most, help detect bugs and issues, and offer better visibility into performance without the need to solicit feedback directly from users [14]. Analytics, in turn, provides actionable insights based on the telemetry data collected; it is the discovery and communication of meaningful patterns in data. Especially valuable in areas rich in recorded information, analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance, and it often favours data visualisation to communicate insight.

The quality and appropriate exploitation of these insights determine the level of intelligence within a 5G system architecture, particularly its ability to automate and to provide fine-grained management of the network infrastructures. Based on current trends and on the results of ongoing research, it appears that 5G architectures will be highly distributed (in particular with small cell (SC)-based edge deployments), heterogeneous and dynamic in nature [15]. These characteristics can have significant implications for the design and use of telemetry and analytics systems. For example, when and where analysis takes place will be constrained by the available network bandwidth, processing power, etc. Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialised systems and software. Data analytics will be required to support a wide variety of usage scenarios, such as identifying the bandwidth required to deliver a given level of user experience, based on application or user profiles. Analytics will also evolve beyond looking at what has happened historically within the system to provide views on “what will happen in the future”, e.g. predictive service KPIs. This type of capability can support smarter and more dynamic capacity planning in order to “size” the infrastructure resources, leading to more efficient asset utilisation in line with the needs of the business and customers. Besides, analytics will be required to underpin faster service deployments and more responsive service behaviours based on changes in service context (e.g. bubble loads).
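As an illustration of the predictive use of analytics mentioned above, the following minimal sketch fits a straight-line trend to recent KPI samples and extrapolates it for capacity planning. It is not taken from the 5G ESSENCE implementation: the function names, the hourly utilisation figures and the choice of a plain least-squares line are all assumptions made for the example, and a real deployment would likely use a richer model.

```python
import math
from statistics import mean


def fit_linear_trend(samples):
    """Least-squares fit of y = slope * t + intercept for t = 0, 1, ..., n-1."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = mean(samples)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def forecast(samples, steps_ahead):
    """Extrapolate the fitted trend steps_ahead intervals past the last sample."""
    slope, intercept = fit_linear_trend(samples)
    return slope * (len(samples) - 1 + steps_ahead) + intercept


def steps_until(samples, threshold):
    """Intervals until the trend reaches threshold (0 if already there, None if never)."""
    slope, intercept = fit_linear_trend(samples)
    current = slope * (len(samples) - 1) + intercept
    if current >= threshold:
        return 0
    if slope <= 0:
        return None
    return math.ceil((threshold - current) / slope)


# Hypothetical hourly link utilisation (percent) observed at one edge site.
utilisation = [10, 20, 30, 40]
predicted = forecast(utilisation, steps_ahead=2)      # trend value two hours ahead
hours_left = steps_until(utilisation, threshold=75)   # hours until 75% is reached
```

The second helper is the part that relates to "sizing" the infrastructure: it turns a predicted KPI into a lead time for adding capacity.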

The use of telemetry for infrastructure and network monitoring is a well-established discipline. Yet, until today, telemetry solutions have been based on a variety of open source or proprietary tools and protocols, such as IP Flow Information Export (IPFIX)/NetFlow [16] or the Simple Network Management Protocol (SNMP) [17] (e.g. traps and polling). In fact, SNMP is the de facto telemetry protocol for collecting data from network devices. Nevertheless, its applicability diminishes significantly in NFV/SDN environments, where compute infrastructure metrics have more significance, although not completely at the expense of standard network-oriented metrics. In forthcoming 5G environments there will be significantly more diversity in the sources of telemetry data that must be consumed, due to a large network edge footprint and a wider variety of use cases. The situation is exacerbated in multi-vendor environments by the heterogeneity of devices in different network segments. The variety and diversity of potential data sources result in different data formats, data models, resolutions, non-standard interfaces, etc., making it challenging to obtain a fully correlated, end-to-end view of the infrastructure environment and of the services running in it. Addressing these challenges requires a telemetry fabric that provides an end-to-end platform for metrics collection and processing. The fabric also needs multiple hierarchical analytical and actuation points in order to exploit the metrics efficiently [18].

In 5G, approaches such as network slicing, push data strategies, embedded metadata, dimensional reduction and in-situ data processing will be used to provide a better level of control over the availability and quality of telemetry data. These capabilities lay the foundation for supporting critical analytics, making it possible to achieve real-time, automated intelligence that can seamlessly travel from the cloud to a large number of distributed end-points [19]. To date, the application of telemetry metrics and analytics has focused primarily on the general health and performance of services and host infrastructures. In 5G, this focus is expected to be extended to support the efficient deployment, management and optimisation of services and infrastructures. Thus, the range of expected use cases can include [20], inter alia: (i) infrastructure and network health; (ii) troubleshooting, forensic system observations and resolution; (iii) SLA (Service Level Agreement) compliance, KPI monitoring, prediction and performance tuning; (iv) dynamic reconfiguration (e.g. scaling, migration, load balancing); (v) security threat identification/remediation; (vi) capacity planning; (vii) resource/feature allocation optimisation; and (viii) deployment model performance and topology confirmation.

While many of these use cases are not necessarily new, their operational implementation will require different behaviours from the telemetry and analytics systems in order to “meet” the significantly different needs of 5G. First is the need to “address” the big data challenges of volume, variety and velocity. Telemetry will also need to become more software-defined: the ability to dynamically change the behaviour of telemetry systems (i.e. enable/disable metrics, change data resolution, add/remove metadata, etc.) will become important, following a trend that has already been embraced within the network domain in the form of software-defined networking. This capability is necessary in order to adapt to dynamic service environments, particularly at the network edge, in response to new services, architectural changes, changes in user behaviour or service provider business needs. While this can initially be a primarily human-driven activity, telemetry systems, in concert with management and orchestration (MANO) systems, are expected to evolve to higher levels of autonomy in order to “dynamically adapt”, based on analytically driven behaviour evolution [21]. Telemetry systems will also need to become hierarchical in nature, following the wider 5G evolutionary trends. While there is a trend towards centralisation, this seems unlikely to scale, particularly in highly distributed multi-domain edge networks.
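The runtime reconfiguration described above (enabling or disabling metrics, changing data resolution, adding or removing metadata) can be sketched as a small agent that accepts plain command dictionaries. This is only an assumed shape for such an interface: the class names, the command vocabulary and the metric name are hypothetical and do not come from any specific telemetry product.

```python
from dataclasses import dataclass, field


@dataclass
class MetricConfig:
    enabled: bool = True
    interval_s: float = 10.0                      # data resolution (sampling period)
    metadata: dict = field(default_factory=dict)  # labels attached to each sample


class TelemetryAgent:
    """Holds per-metric settings that can be changed while the agent is running."""

    def __init__(self):
        self._metrics = {}

    def register(self, name, **settings):
        self._metrics[name] = MetricConfig(**settings)

    def config(self, name):
        return self._metrics[name]

    def active_metrics(self):
        return sorted(n for n, c in self._metrics.items() if c.enabled)

    def apply(self, command):
        """Apply one reconfiguration command, e.g. sent by an orchestrator."""
        cfg = self._metrics[command["metric"]]
        action = command["action"]
        if action == "enable":
            cfg.enabled = True
        elif action == "disable":
            cfg.enabled = False
        elif action == "set_interval":
            if command["interval_s"] <= 0:
                raise ValueError("interval must be positive")
            cfg.interval_s = float(command["interval_s"])
        elif action == "add_metadata":
            cfg.metadata.update(command["labels"])
        elif action == "remove_metadata":
            for key in command["keys"]:
                cfg.metadata.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action}")


agent = TelemetryAgent()
agent.register("cpu_util_percent")
agent.apply({"metric": "cpu_util_percent", "action": "set_interval", "interval_s": 1})
agent.apply({"metric": "cpu_util_percent", "action": "add_metadata",
             "labels": {"tenant": "operator-a"}})
```

Keeping the commands as data rather than method calls is a deliberate choice in this sketch: the same commands could be issued by a human operator at first and later by an automated MANO component.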

Furthermore, telemetry systems are more likely to be domain-specific, with first-level data centralisation within each domain. Selected data from lower-level domains can then be aggregated at higher-level domains within a dedicated hierarchical systems architecture, as shown in Fig. 1. The higher-level analytics can also provide specific inputs into lower-level analytics (or vice versa) in order to meet either local or global optimisation requirements. A hierarchical approach also breaks down issues such as data normalisation, transport mechanisms, data models and timestamps into more manageable blocks.

Fig. 1. General view of hierarchical telemetry domains.
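The hierarchical arrangement of telemetry domains can be sketched as follows: each domain stores everything it receives locally and forwards only an aggregate of selected metrics to its parent. The class, the domain and metric names, and the use of a simple mean as the aggregate are assumptions made for illustration.

```python
from statistics import mean


class TelemetryDomain:
    """One level of the hierarchy: stores all local data, exports a subset upward."""

    def __init__(self, name, exported=(), parent=None):
        self.name = name
        self.exported = set(exported)   # metric names forwarded to the parent
        self.parent = parent
        self.store = {}                 # metric name -> list of (source, value)

    def ingest(self, source, metric, value):
        self.store.setdefault(metric, []).append((source, value))

    def flush_upward(self):
        """Send one aggregated sample per exported metric to the parent domain."""
        if self.parent is None:
            return
        for metric in sorted(self.exported):
            samples = self.store.get(metric)
            if samples:
                aggregate = mean([value for _, value in samples])
                self.parent.ingest(self.name, metric, aggregate)


# Hypothetical two-level hierarchy: one edge domain below a central domain.
central = TelemetryDomain("main-dc")
edge = TelemetryDomain("light-dc-1", exported={"cpu_util_percent"}, parent=central)

edge.ingest("cesc-1", "cpu_util_percent", 40.0)
edge.ingest("cesc-2", "cpu_util_percent", 60.0)
edge.ingest("cesc-1", "rsrp_dbm", -95.0)   # stays local, not exported
edge.flush_upward()
```

After the flush, the central domain holds a single averaged CPU sample attributed to the edge domain, while the radio measurement remains available only at the edge.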

3 Telemetry, Analytics and Orchestration Framework in the Revised 5G ESSENCE Architecture

Figure 2 provides a high-level architecture of the 5G ESSENCE according to the two-tier approach explained in detail in [22]. At the network’s edge, each Cloud Enabled Small Cell (CESC) is able to host one or more service VNFs, directly serving the users of a specific operator. Similarly, VNFs can be instantiated inside the Main DC (Data Centre) and be part of a Service Function Chaining (SFC) procedure. The Light DC can be used to implement different functional splits of the Small Cells as well as to support the mobile edge applications of the end-users. At the same time, the 5G ESSENCE proposes the development of small cell management functions as VNFs, which run in the Main DC and coordinate a fixed pool of shared radio resources, instead of considering that each small cell station has its own set of resources. The CESC offers virtualised computing, storage and radio resources, and the CESC cluster is seen as a cloud by the upper layers. This cloud can also be “sliced” to enable multi-tenancy. The execution platform is used to support VNFs that implement the different features of the Small Cells as well as the mobile edge applications of the end-users. The 5G ESSENCE architecture allows multiple network operators (tenants) to provide services to their users through a set of CESCs deployed, owned and managed by a third party.

Fig. 2. The 5G ESSENCE refined high level architecture.

The CESC Manager (CESCM) is responsible for coordinating and supervising the use, the performance, and the delivery of both radio resources and services. It controls the interactions between the infrastructure (CESCs, Edge DC) and the network operators. In addition, it handles Service Level Agreements (SLAs) while, on an architectural basis, CESCM encompasses telemetry and analytics as fundamental tools for efficiently managing the overall network. The Virtualised Infrastructure Manager (VIM) is responsible for controlling the NFV Infrastructure (NFVI), which includes the computing, storage and network resources of the Edge DC.

It should be mentioned that the 5G ESSENCE does not only propose the development and adaptation of the multitenant CESC platform, the virtualisation infrastructure and the centralisation of the software-defined radio resource management described above; it also “addresses” several aspects that affect performance in 5G virtualised environments, such as virtual switching, VNF migration and Machine Learning (ML) algorithms, which allow orchestrating diverse types of lightweight virtual resources. Last but not least, the two-tier architecture of the 5G ESSENCE is well aligned with the current views on 5G architecture described by 5G-PPP in [23], where infrastructure programmability and the split of control and user planes are identified as two key logical architecture design paradigms for 5G. First, the 5G ESSENCE achieves infrastructure programmability by leveraging the virtualised computation resources available at the Edge DC. These resources will be used for hosting VNFs tailored to the needs of each tenant, on a per-slice basis. Second, the Main DC allows centralising and softwarising control plane SC functions to enable more efficient utilisation of radio resources, coordinated among multiple CESCs. In addition, the 5G ESSENCE contributes to other 5G architectural concepts identified in [23], such as the realisation of network slicing, which is a fundamental requirement of the 5G ESSENCE for enabling multiple tenants and vertical industries to share the same CESC infrastructure.

The 5G ESSENCE system presents a high degree of dynamicity, due to the constantly changing behaviour of the services and workloads to be supported by the radio and cloud infrastructure. This calls for a proper monitoring system, able to adapt to the different supported scenarios. The data collected by this system are used for visualisation purposes (for human consumption) and are also provided to a set of analytics techniques capable of extracting insights from the data and, via a feedback loop, enabling efficient resource allocation across the infrastructure through the orchestration system. These are the functionalities provided by the 5G ESSENCE CESCM.

In the scope of the revision and extension of the original 5G ESSENCE architecture, below we discuss the proposed components of telemetry, analytics and orchestration that can further enrich the original architectural approach.

3.1 Telemetry Module

One of the challenges for telemetry in the 5G small cell deployments considered in the 5G ESSENCE is the degree of distribution, since infrastructure functionalities are spread across different nodes of the network, from a centralised location (the Main DC) to the very edge of the network (the Light DC), where the small cells are deployed (see also Fig. 2). To achieve a full end-to-end view of both infrastructure and services, the proposed telemetry system should be able to instrument and monitor the different devices composing the overall infrastructure, as well as to provide a single, simple-to-access view of the system that can be exposed to both dashboards and analytical techniques. Another challenge is the high complexity of the overall telemetry system: the huge number of metrics and the volume of data to be collected, processed and analysed increase the complexity of the decision-making modules, slowing down the reaction time of the system and implicitly increasing service latency. The telemetry system is therefore required to be distributed in nature and to adapt to the constantly changing needs of the infrastructure. For this purpose, two “key” desired characteristics of the 5G ESSENCE telemetry platform are: (i) the capability to generate aggregated and derived metrics; and (ii) the capability to store and, consequently, access the data locally by using a distributed monitoring approach.
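To make the first of these characteristics concrete, the sketch below computes a derived metric (a per-second rate obtained from a monotonically increasing counter) and an aggregated metric (a combination of per-node values). The function names and the byte counts are assumptions chosen for illustration, and counter resets are deliberately ignored.

```python
def counter_rate(samples):
    """Derived metric: per-second rate from (timestamp_s, counter_value) samples.

    Assumes the counter did not reset between the first and last sample.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) / (t1 - t0)


def aggregate(per_node, how="sum"):
    """Aggregated metric: combine one value per node into a single value."""
    values = list(per_node.values())
    if how == "sum":
        return sum(values)
    if how == "max":
        return max(values)
    if how == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unsupported aggregation: {how}")


# Hypothetical transmitted-byte counters sampled 10 s apart on two CESCs.
tx_bytes = {
    "cesc-1": [(0, 0), (10, 5000)],
    "cesc-2": [(0, 1000), (10, 4000)],
}
throughput = {node: counter_rate(s) for node, s in tx_bytes.items()}  # bytes/s per node
cluster_throughput = aggregate(throughput, how="sum")                 # bytes/s for the cluster
```

Computing such values close to where the raw samples are stored means that only the compact derived and aggregated figures need to leave the local domain.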

The telemetry functionality is implemented through a number of agents distributed across different parts of the architecture, at both the Light DC and the Main DC, while the analytics functions are placed at the Main DC. Something similar occurs with the OSM (Open Source MANO) orchestrator, which is logically associated with the CESCM but physically resides at the Main DC. The telemetry module is organised in two main components: Network Service Monitoring and the Landscaper. The former is in charge of collecting counters and events, in the form of metrics, from the NFVI, the services running on the 5G ESSENCE infrastructure and the SD-RAN (Software Defined – Radio Access Network) controller. This is supported by the Prometheus monitoring tool and a set of exporters, which extract the required information from the system and expose it to Prometheus. In fact, Prometheus natively supports hierarchical federation, which gives the 5G ESSENCE monitoring system the required flexibility and reliability. Another very important functionality of Network Service Monitoring is the alerting system which, based on the continuous monitoring of metrics, can identify deviations from the normal behaviour of the services and the infrastructure and send a signal to the orchestration system. An important aspect of the 5G ESSENCE monitoring framework is its capability to monitor the RAN, in contrast to most existing telemetry platforms, which have typically been used to collect measurements related to Information Technology (IT) infrastructure components, e.g., CPU (Central Processing Unit) usage, memory, operating systems, etc. Monitoring the RAN, by contrast, involves collecting radio interface-related measurements from the CESCs.
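The idea behind the alerting function, namely detecting a deviation from normal behaviour in a continuously monitored metric, can be illustrated with a rolling-baseline check. This is a generic sketch and not the mechanism used in the project: in a Prometheus-based deployment, alert conditions would normally be written as alerting rules evaluated by Prometheus itself. The class name, window size and threshold are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class DeviationAlert:
    """Flags a sample that lies far from the mean of the recent window."""

    def __init__(self, window=5, threshold=3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold       # allowed distance, in standard deviations

    def observe(self, value):
        """Return True if value deviates from the recent baseline."""
        firing = False
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            spread = pstdev(self.history)
            if spread > 0:
                firing = abs(value - baseline) > self.threshold * spread
            else:
                firing = value != baseline
        if not firing:
            self.history.append(value)   # keep anomalies out of the baseline
        return firing


# Hypothetical latency samples (ms) for one service; the last one is anomalous.
detector = DeviationAlert(window=5, threshold=3.0)
flags = [detector.observe(v) for v in [10, 11, 9, 10, 10, 10.5, 30]]
```

A sample that fires would be the point at which a signal is sent to the orchestration system; how that signal is transported is outside the scope of this sketch.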

The 5G ESSENCE considers multi-RAT CESCs that support 4G, 5G and Wi-Fi technologies. The 4G LTE small cells collect a number of measurements at the physical (PHY) layer [24] and at Layer 2 [25]. These measurements can be collected by the small cell itself or by the User Equipment (UE), which reports them to the small cell. Similarly, for 5G NR (New Radio) the list of PHY measurements is given in [26], while Wi-Fi measurements are described in [26].

Although the above measurements are available at the small cells, the capability of a telemetry platform such as Prometheus to collect them depends on the measurement configuration that each small cell exposes to an external system, typically through management interfaces defined between the small cell and the EMS (Element Management System) and/or NMS (Network Management System). The interface between the small cells and their EMS is typically vendor-specific, but there have been some efforts to define open standard interfaces, such as TR-196 [27], supported by multiple vendors. Similarly, 3GPP has standardised different Performance Measurement (PM) metrics [28], and Key Performance Indicators (KPIs) combining these metrics [29], to be transferred from the small cells (or their EMS) to the NMS. These PM metrics are provided in the form of XML files following the format of [30] and are produced according to a configured reporting interval. Each file can contain one or more granularity periods, which define the time over which measurements are collected and aggregated. In the 5G ESSENCE, the PM files are generated by the cSD-RAN controller. The controller exposes the relevant metrics through a Representational State Transfer (REST) Application Programming Interface (API) as JavaScript Object Notation (JSON) documents, which a dedicated exporter consumes and translates into a form understandable by Prometheus. Telemetry data obtained from the RAN are then subject to analytics approaches to support the decisions made by the different RRM/SON functions for which the cSD-RAN controller is in charge (the core objective of WP3 of the 5G ESSENCE effort), as seen in Fig. 3. The monitoring framework of the 5G ESSENCE is also designed to collect information about the cloud resources.
First, the framework must support the identification of the available physical resources, which will be allocated to VNF Components; second, it must monitor the interdependency between virtual and physical resources. In the 5G ESSENCE monitoring framework, this functionality is covered by the Landscaper shown in Fig. 3. The Landscaper is responsible for collecting information related to the resources available in the system. It gathers all the available resources on the NFVI and structures them in the form of a graph, where nodes represent the resources and edges represent the relationships between them. Nodes are organised in layers (physical, virtual and service) in order to enable a logical classification of the resources.
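The layered resource graph described for the Landscaper can be illustrated with a small in-memory structure in which each node belongs to one of the three layers and each edge records a dependency of an upper-layer resource on a lower-layer one. The class, method and node names are assumptions for this sketch and do not reflect the Landscaper's actual data model.

```python
LAYERS = ("physical", "virtual", "service")


class Landscape:
    """Resource graph: nodes are resources, edges are 'runs on' relationships."""

    def __init__(self):
        self.nodes = {}   # node id -> layer
        self.edges = {}   # node id -> set of node ids it depends on

    def add_node(self, node_id, layer):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.nodes[node_id] = layer
        self.edges.setdefault(node_id, set())

    def add_edge(self, upper, lower):
        """Record that resource `upper` runs on resource `lower`."""
        if upper not in self.nodes or lower not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[upper].add(lower)

    def physical_hosts(self, node_id):
        """Follow dependencies downward and return the physical nodes reached."""
        hosts, seen, stack = set(), set(), [node_id]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if self.nodes[current] == "physical":
                hosts.add(current)
            stack.extend(self.edges[current])
        return hosts


# Hypothetical landscape: a firewall VNF on a virtual machine on an edge server.
landscape = Landscape()
landscape.add_node("server-1", "physical")
landscape.add_node("vm-1", "virtual")
landscape.add_node("vnf-firewall", "service")
landscape.add_edge("vm-1", "server-1")
landscape.add_edge("vnf-firewall", "vm-1")
hosts = landscape.physical_hosts("vnf-firewall")
```

A query of this kind is what allows a metric observed on a physical server to be related to the services that depend on it.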

Fig. 3. Components of the telemetry, analytics and orchestration framework in the 5G ESSENCE refined high level architecture.

3.2 Analytics Module and Orchestration Module

The analytics module in Fig. 3 is envisioned as a collection of tools and approaches for executing Analytics/Machine Learning (ML) techniques on the telemetry data, in order to generate models that provide insights to the orchestration system (i.e., OSM) and to the cSD-RAN controller of the small cells. The analytics module consists of two main components: the Contextual Information component and the Analytics Framework. The Contextual Information component is responsible for storing contextual data for services and the NFVI; it receives its input from the Landscaper (resource mapping), the Network Service Monitoring (telemetry metrics) and the Analytics Framework (models generated over time). This information is exposed to the Analytics Framework, where it is processed to generate actionable knowledge, in the form of models, that will support orchestration decisions. The resulting knowledge is sent to the Orchestration system, which exploits it for efficient placement and intelligent resource allocation, and to the cSD-RAN controller, which, as mentioned, makes informed and intelligent decisions on radio resource allocation.

The Orchestration module includes the Service Orchestrator (SO) and the Resource Orchestrator (RO) as part of Open Source MANO (OSM), and the Virtual Infrastructure Manager (VIM) as the combination of OpenStack and OpenDaylight (ODL). Moreover, two extra modules are envisioned in the final implementation of the overall orchestration system: the Placement Assistant and the Alert Mitigation Manager. The Placement Assistant intercepts the deployment requests sent to the Orchestrator from the 5G ESSENCE User Portal and translates them into an efficient deployment of components, based on hints received from the Analytics Framework. The Placement Assistant is responsible for decisions related to the resource allocation for the service components at deployment time (i.e. where to allocate resources, Main DC vs. Light DC). The Alert Mitigation Manager is instead responsible for managing the service at runtime, according to anomalous behaviour identified by the Network Service Monitoring. The Network Service Monitoring sends triggers to the Alert Mitigation Manager when something anomalous is measured, and the latter decides how to react to the event on the basis of hints received from the Analytics Framework. The Alert Mitigation Manager is responsible for decisions related to the dynamic adjustment of service configuration and resource allocation.
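The two decision points described above can be sketched as a pair of functions that consume analytics hints: one chooses a site at deployment time, the other chooses a reaction at runtime. The structure of the hints, the field names, the action labels and the simple rules (respect a latency bound, prefer the site with the most spare compute) are all assumptions made for illustration; the actual decision logic of the Placement Assistant and the Alert Mitigation Manager is not specified here.

```python
def choose_site(component, hints):
    """Deployment-time decision: pick a site for one service component.

    `hints` maps a site name to assumed analytics output for that site:
    expected latency (ms) and currently free virtual CPUs.
    """
    candidates = [
        site for site, h in hints.items()
        if h["latency_ms"] <= component["max_latency_ms"]
        and h["free_vcpus"] >= component["vcpus"]
    ]
    if not candidates:
        return None
    # Among feasible sites, prefer the one with the most spare compute.
    return max(candidates, key=lambda site: hints[site]["free_vcpus"])


def mitigate(alert, hints):
    """Runtime decision: react to an alert raised for a service at alert['site']."""
    site, needed = alert["site"], alert["extra_vcpus"]
    if hints[site]["free_vcpus"] >= needed:
        return ("scale_out", site)
    others = [s for s in hints if s != site and hints[s]["free_vcpus"] >= needed]
    if others:
        return ("migrate", max(others, key=lambda s: hints[s]["free_vcpus"]))
    return ("escalate", None)


# Hypothetical hints for the two tiers of the architecture.
hints = {
    "light-dc": {"latency_ms": 5, "free_vcpus": 4},
    "main-dc": {"latency_ms": 30, "free_vcpus": 64},
}
edge_choice = choose_site({"vcpus": 2, "max_latency_ms": 10}, hints)
reaction = mitigate({"site": "light-dc", "extra_vcpus": 8}, hints)
```

In this example a latency-sensitive component is placed at the Light DC, while an overload that the Light DC cannot absorb results in a migration towards the Main DC.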

4 Conclusion

In the present work we have identified the importance of developing and including network softwarisation features within the wider 5G evolutionary scope, which, in turn, can lead to more effective network management promoting cognitive features and related behaviours. Based on the specific context of the 5G ESSENCE research project, part of the 5G-PPP Phase 2 effort, we have summarised the proposed innovative features of its architecture, which allow for dynamic management of the involved network resources as well as for the promotion of a variety of facilities via dedicated VNFs. The wide implementation of SDN, NFV, Cloud and Edge Computing in present 5G architectures, and within the 5G ESSENCE in particular, leads to network operating environments that are software-driven to a great extent. The transition towards “softwarisation” brings greater challenges regarding the management and monitoring of the related infrastructures.

In this framework, both telemetry and analytics are expected to play a very important role, as they can offer insights into a variety of features affecting both network and service performance within the intended 5G architectures. This can be a very important factor in the ability to automate and provide fine-grained management of the related 5G network infrastructures.

Within the context of the updated and revised 5G ESSENCE architecture, the CESCM is the central service management and orchestration component. More generally, the CESCM integrates all the traditional network management elements and the novel recommended functional blocks to realise NFV operations, and it is responsible for coordinating and supervising the use, performance and delivery of both radio resources and services. In addition, it is responsible for controlling interactions between the infrastructure and the network operators, and it handles SLAs. The CESCM incorporates telemetry and analytics as necessary tools for properly managing the overall network. Aiming to extend and refine the original architectural scope of the 5G ESSENCE, and to achieve a full end-to-end view of both infrastructure and services, we have proposed the inclusion of a suitable telemetry module. The related functionality can be implemented through a number of agents, while the telemetry module is organised in two main components (the Network Service Monitoring and the Landscaper), each one serving specific purposes.

In the same spirit, we have proposed the inclusion of an analytics module, with the purpose of applying Analytics/Machine Learning (ML) techniques to the telemetry data in order to produce models that offer insights to the orchestration system and to the cSD-RAN controller of the small cells. Aiming to “widen” all potentially expected benefits, we have also proposed an orchestration module for better supporting the dynamic adjustment of service configuration and resource allocation. These modules can make the original 5G ESSENCE architecture more reliable and enable more effective cognitive management of the network resources and of the proposed facilities.