1 Introduction

The digital universe is expected to keep growing at about 40% a year into the next decade, driven not only by people's use of online services but also by things such as smart devices communicating via the Internet. By 2020, the digital universe is expected to reach 44 zettabytes of data [1]. By then, 1.7 MB of new data will be created every second for every human being, and there will be 6 billion smartphones and 50 billion smart connected devices in the world [2]. Social media usage and video-based services have also contributed to the exponential growth in data volume. In addition, present-day smartphones offer a much better Web-surfing experience, which has increased the consumption of media such as live high-definition video, and content-based services contribute to the continuous growth in data volume [3]. Traditional data analysis methods will not be sufficient to store and analyse such a high volume of data. Hence, it is essential to look for an alternative way to analyse the data with real-time capability.

Big data is a broad term for data sets so large and complex that traditional data processing applications and techniques cannot handle them. The main challenges are analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying and information privacy [4]. The telecommunication industry is one of the most important industries that need big data strategies to store and analyse data arriving with high volume, velocity, variety, etc.

1.1 Big Data Challenges for Telecom Operators

The volume of data generated has been increasing with access technologies such as 4G/LTE, FTTH and ADSL, because the volume of operational data generated for sessions is increasing dramatically. Also, most users are now always connected with data rather than making traditional voice calls. Hence, real-time operational capabilities are required to provide better services.

The variety of data is growing because mobile devices and sensors are flooding telecom networks with data in myriad formats. Also, present call detail records (CDR) are enriched [5]. The velocity of data has been increasing day by day due to the dramatic increase in the number of active devices connected to telecommunication networks, and the number of mobile service subscriptions can exceed the population of a country.

1.2 Current Issues and Objectives

Due to intense competition at present, all telecom service providers want to be innovative to maximize profit. However, most of them struggle to make decisions in real time due to the volume, variety, velocity, etc. of the incoming data [3]. Since the application of big data strategies is still in an early phase of deployment for many telecom operators [6], it is necessary to design a proper solution that supports real-time decisions to achieve the goals of telecom operators.

The main objective is to design a real-time business intelligence model that provides insights into the telecommunication industry using big data strategies. The proposed conceptual model needs to address the following objectives:

  • To identify the variety of data that can be collected from the telecom industry.

  • To develop a suitable platform to handle the large volume of structured and unstructured data.

  • To explore real-time business insights based on the consumer data collected.

  • To identify churn in advance so that preventive action can be taken.

  • To develop an innovative business model by using information such as payment data to increase sales and to match demand with nearby offerings.

  • To enhance the customer experience with dynamic profiling and enhanced customer segmentation.

2 Conceptual Design of Real-Time Business Intelligence Model

Figure 1 illustrates the conceptual design of a real-time business intelligence model that provides insights for telecommunication operators.

Fig. 1. Real-time business intelligence model

Revenue from voice traffic, which was dominant before 2000, is declining, whereas non-voice traffic revenue has been increasing rapidly for all telecommunication operators globally. Therefore, it is appropriate to consider CDR for non-voice traffic, which will be dominant in the near future, in addition to traditional CDR.

As an example of CDR, the following main parameters were captured at the broadband remote access server (BRAS) for ADSL-based broadband access:

  • Date/Time

  • Status of packet: Start or Stop

  • Username for access

  • Package subscribed

  • Output octet for session (only for stop)

  • Session-Id = “0817596553”

  • PPPoE-Description = e.g.: “pppoe a4:7e:39:de:75:b8”

  • Port-Id = “XXX_MSAG5200_28 atm 1/2/11/28:8.35”

  • Framed-Protocol = PPP

  • Event-Timestamp = “Apr 21 2016 04:55:05 IST”

  • NAS-Port = 3758096459

  • Unique-Session-Id = “c3dbb18e0ded8301”

  • Timestamp = 1461194705.
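
As an illustration of how such a record could be read programmatically, the following minimal sketch parses a few of the attributes listed above and checks that the Timestamp value agrees with Event-Timestamp. The `Name = value` line layout, the function names and the interpretation of IST as UTC+05:30 are assumptions made for this example; they are not taken from the BRAS or RADIUS configuration.

```python
import re
from datetime import datetime, timedelta, timezone

# Attribute values copied from the list above; the line layout is an assumption.
RAW_RECORD = '''
Session-Id = "0817596553"
Framed-Protocol = PPP
Event-Timestamp = "Apr 21 2016 04:55:05 IST"
NAS-Port = 3758096459
Unique-Session-Id = "c3dbb18e0ded8301"
Timestamp = 1461194705
'''

# Matches `Name = value`, with optional double quotes around the value.
ATTR_LINE = re.compile(r'^\s*([\w-]+)\s*=\s*"?(.*?)"?\s*$')

# Assumption: IST here denotes UTC+05:30.
IST = timezone(timedelta(hours=5, minutes=30))


def parse_record(text):
    """Return the attributes of one log record as a dict of strings."""
    record = {}
    for line in text.splitlines():
        match = ATTR_LINE.match(line)
        if match:
            record[match.group(1)] = match.group(2)
    return record


def event_time(record):
    """Convert Event-Timestamp to a timezone-aware datetime."""
    text = record["Event-Timestamp"].rsplit(" ", 1)[0]  # drop the zone label
    return datetime.strptime(text, "%b %d %Y %H:%M:%S").replace(tzinfo=IST)


record = parse_record(RAW_RECORD)
epoch_time = datetime.fromtimestamp(int(record["Timestamp"]), tz=timezone.utc)
print(event_time(record) == epoch_time)
```

Under the stated time zone assumption, the two timestamp fields describe the same instant, so either could serve as the event time in later processing.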

For each session of a single user, the RADIUS server creates two such logs, one at the commencement and one at the completion of the session. Hence, the data volume of these log files is very high.
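
Because every session produces a Start and a Stop log, the two can be paired to obtain per-session figures such as duration and transferred octets. The sketch below shows one possible way to do this. The records are made-up illustrative data, and the dictionary keys (`Status`, `Username`, `Output-Octets`) are hypothetical names that mirror the parameters listed above rather than the exact attribute names in the logs.

```python
from collections import defaultdict

# Made-up illustrative records; key names are hypothetical.
records = [
    {"Unique-Session-Id": "s1", "Username": "user_a", "Status": "Start",
     "Timestamp": 1461194705},
    {"Unique-Session-Id": "s2", "Username": "user_b", "Status": "Start",
     "Timestamp": 1461194800},
    {"Unique-Session-Id": "s1", "Username": "user_a", "Status": "Stop",
     "Timestamp": 1461198305, "Output-Octets": 52428800},
]


def summarize_sessions(records):
    """Pair Start and Stop logs by session ID and summarize completed sessions."""
    by_session = defaultdict(dict)
    for rec in records:
        by_session[rec["Unique-Session-Id"]][rec["Status"]] = rec

    completed = {}
    for session_id, pair in by_session.items():
        # Sessions without a Stop log are still open and are skipped here.
        if "Start" in pair and "Stop" in pair:
            completed[session_id] = {
                "username": pair["Stop"]["Username"],
                "duration_s": pair["Stop"]["Timestamp"] - pair["Start"]["Timestamp"],
                "output_octets": pair["Stop"].get("Output-Octets", 0),
            }
    return completed


sessions = summarize_sessions(records)
print(sessions)
```

In this sketch a session with only a Start log is treated as still active and is left out of the summary; a real-time pipeline would keep such sessions in state until the matching Stop log arrives.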

A block diagram of the BRAS architecture for PPPoE authentication is given in Fig. 2 [7]; this architecture is used by most broadband access technologies, such as ADSL, VDSL and FTTH.

Fig. 2. Block diagram of PPPoE authentication [10]

Network management systems (NMS) are used to monitor the network nodes connected to telecommunication networks, and they feed this business intelligence model. When further analysis is required, the output log at the uplink port of an access node, such as an access gateway, has to be obtained. Network-related log outputs are generated at the NMS, covering environmental issues, board-related exceptions and port-related issues such as abnormal conditions in uplink ports.

The contact centre provides customer feedback on service restoration or delivery, such as the customer satisfaction index. The proposed conceptual model uses this feedback to explore issues in the network in real time, so that a solution can be proposed or the issue escalated to the relevant section of the telecom operator.

Operation support system (OSS), customer relationship management (CRM) and enterprise resource planning (ERP) systems are used to keep customer-related static data. The OSS holds the customer name and ID, services subscribed, customer contact details, service address, relevant port and circuit details for the services subscribed, etc. The CRM holds details related to customers and the services they subscribe to. The ERP mainly holds cost-related information, such as the cost of service provision.
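
One way the static data could be combined with session data is a simple lookup that attaches customer details to each session record. The following is only a sketch under stated assumptions: the OSS extract, its field names (`customer_id`, `package`, `port_id`) and all values are hypothetical, and the access username is assumed to be the join key.

```python
# Hypothetical OSS extract keyed by access username; all values are made up.
OSS_CUSTOMERS = {
    "user_a": {"customer_id": "C001", "package": "Package-1", "port_id": "node1 1/2/11"},
}


def enrich_session(session, oss_customers):
    """Attach static OSS details to a session record; unknown users are flagged."""
    customer = oss_customers.get(session["username"])
    enriched = dict(session)  # copy so the input record is left unchanged
    if customer is None:
        enriched["customer_known"] = False
    else:
        enriched["customer_known"] = True
        enriched.update(customer)
    return enriched


session = {"username": "user_a", "duration_s": 3600, "output_octets": 52428800}
enriched = enrich_session(session, OSS_CUSTOMERS)
unknown = enrich_session({"username": "user_x", "duration_s": 10}, OSS_CUSTOMERS)
print(enriched["customer_id"], enriched["package"], unknown["customer_known"])
```

Flagging unknown users instead of dropping them keeps records that may point to stale OSS data, which is itself useful to escalate.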

3 Data Processing

Distributed computing platforms such as Hadoop and Spark will play an important role in processing data in real time so that management decisions can be taken quickly [8]. Apache Spark is a powerful data analysis engine that builds on the Hadoop MapReduce model and assists the rapid processing of big data. It overcomes limitations of Hadoop MapReduce and is emerging as the most popular framework for analysis. Considering performance and advanced features, Spark version 2.0 or later with HDFS is recommended at present; more than 500 organizations, including Yahoo, Amazon and eBay, are using it [9].
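
The map and reduce-by-key pattern that such platforms distribute across a cluster can be illustrated on a single machine. The sketch below is written in plain Python and deliberately does not use the Hadoop or Spark APIs; the record fields and values are made up for illustration. It computes the total output octets per user from Stop records.

```python
from collections import defaultdict

# Made-up illustrative session logs; field names are hypothetical.
logs = [
    {"username": "user_a", "status": "Start", "output_octets": 0},
    {"username": "user_a", "status": "Stop", "output_octets": 500},
    {"username": "user_b", "status": "Stop", "output_octets": 200},
    {"username": "user_a", "status": "Stop", "output_octets": 300},
]


def map_phase(records):
    """Emit (key, value) pairs: one (username, octets) pair per Stop record."""
    for rec in records:
        if rec["status"] == "Stop":
            yield rec["username"], rec["output_octets"]


def reduce_by_key(pairs):
    """Sum the values of all pairs that share the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


usage_per_user = reduce_by_key(map_phase(logs))
print(usage_per_user)
```

On a cluster, the map step would run in parallel on partitions of the log files and the per-key sums would be merged across nodes, which is possible because addition is associative and commutative.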

4 Expected Outcome from the Model

In the present highly competitive environment, telecom operators are focusing on managing customer expectations positively and in real time, and a unified approach inside the organization is the need of the hour. This can be achieved as the main outcome of the above conceptual model. Predictive analytics using dynamic profiling can analyse existing data such as usage, interests, location, economic status, intent to churn and relationships with users of other operators. At present, customer segmentation is mostly performed by considering revenue generation. Big data strategies make dynamic profiling and segmentation possible, which will help retain customers with the telecom operator at higher satisfaction because all factors beyond revenue are considered.

Further, based on usage or usage patterns, stakeholders can take initiatives to develop a particular area by considering customer expectations and requirements. From the business perspective, proper investment decisions can be taken, operational efficiency can be increased and OPEX reduced through better decision-making. It will also be possible to sharpen campaign management, manage churn preemptively, maximize revenue potential and reduce fraud and revenue leakage. As an important outcome, it becomes possible to anticipate otherwise unexpected events that might be an opportunity or a threat for telecommunication operators.

5 Conclusion

The need for big data analytics in the telecommunication industry is well accepted. To fulfil this need, a conceptual model with real-time capabilities using big data strategies is important. The model first needs to be validated with past data, which will prepare it for validation in a real environment. Finally, the model needs to be validated with real-time data obtained from a telecommunication service provider.