1 Introduction

A decision support system (DSS) plays a vital role in organizational decision making, and the intricacy of modern business scenarios calls for robust and effective automation software to support growth-oriented decisions [1]. For this purpose, advanced technologies such as business intelligence (BI) are employed [2]. BI applications offer several benefits: they disseminate information, provide growth-oriented features for interacting with customers, support the associated decision making needed to retain markets, and offer flexible data access along with versatility and adaptability when BI is adopted in organizational decision making [3]. BI also provides well-calibrated sales information and guidance for service innovation and discovery, helping an organization meet market demands, attract new customers, and retain existing customers with value-added products [4]. The BI paradigm further facilitates optimal solutions by integrating information flows from clients and suppliers and by supporting service innovation and the redesign and formalization of business processes. Decision support also benefits significantly from appropriate and relevant data collection, processing, and objective-oriented information retrieval. Data retrieved only from local resources may not be sufficient to ensure optimum results for certain DSS utilities. Therefore, the data must be collected from varied, heterogeneous data sources. In such circumstances, the datasets must be stored in a well-structured medium or infrastructure, a data warehouse (DW), where they can be effectively mined and processed to support the DSS.

The data storage infrastructure or framework called a data warehouse is a “goal-oriented, data-centric and integrated, time-variant, nonvolatile collection of gigantic datasets which are implemented for effective decision support for BI utilities.” It supports online analytical processing (OLAP), which places greater performance demands on the system than online transaction processing (OLTP). Data warehousing is a collection of decision support technologies aimed at enabling knowledge workers (executives, managers, and analysts) to make better and faster decisions. It serves as a physical implementation of a decision support data model and stores the information on which an enterprise bases its strategic decisions [5, 6]. In such situations, the consideration of data warehouses and OLAP becomes inevitable. To ensure efficient QoS and effective data processing, DWs and OLAP systems need to be structured in a multidimensional data model (MDDM) [7]. Hence, data security and redundancy play a significant role in ensuring precise BI functions for DSS utilities.

In this paper, a robust and effective BI system has been developed employing a C5.0 decision tree algorithm-based mining module [8] that ensures optimal classification of data for MDDM and DWs. To obtain a privacy-preserving mining model for DWs, a hybrid system model based on the Commutative RSA (CRSA) cryptosystem and the C5.0 decision tree algorithm has been developed and further employed to retrieve feedback data for service innovation discovery and service redesign (SIDRD) in BI applications. It not only strengthens secure data mining and processing but also reduces computational overhead.

Thus, the BI application for SIDRD can be employed for service prototyping, blueprinting, innovation, and redesign to meet customers’ expectations and remain sustainable in a competitive market. The system performance has been evaluated by increasing the user base and reducing risk factors, and results have been obtained in terms of accuracy, coverage, and F1 score. The developed system ensures optimal performance on these measures. The remainder of the manuscript is organized as follows: Sect. 2 presents related work, Sect. 3 describes the proposed system model and our contributions, Sect. 4 presents the results and analysis, and Sect. 5 draws the conclusions. The references considered for this research are provided at the end of the manuscript.

2 Related Work

A hybrid mining model for DSS was advocated in [9], where a unified BI architecture was developed using the stream mining concept. For e-commerce utilities, a BI-driven data mining (BidDM) framework was developed in [10] using a four-layered architecture based on mining processing. Chang et al. [11] proposed an integrated mining model with Web information, which was further enhanced in [12], where a bankruptcy prediction approach was developed using qualitative as well as quantitative optimization with GA [13]. For DSS utilities, data clustering using data derived from the UCI machine learning repository was advocated in [14]. To ensure data security in mining, [15] advocated a k-anonymity-based privacy preservation approach, which was further optimized in [16] using classification and weighted attributes. A privacy-preserving model for decision tree mining was developed in [17] based on homomorphic encryption.

3 Our Contributions

In this paper, a BI model has been developed that emphasizes service innovation discovery and service redesign (SIDRD). The system is distinctive in its classification efficiency and accuracy as well as its data security under the multidimensional data model (MDDM). Multiple data warehouses in the form of MDDM are considered, where each individual data cube behaves like a data source. To implement SIDRD, customer feedback is collected from these multiple data warehouses, and the datasets under consideration represent a heterogeneous environment. In such a scenario, data security plays a significant role in ensuring optimal accuracy and decision support. Hence, to secure the multiple data sources, we have employed a novel cryptosystem called Commutative RSA (CRSA). Compared with the traditional RSA cryptosystem, CRSA has the following advantages: it does not introduce large computational overheads for key computation, distribution, and management, and it reduces computational complexity. Since an organization might hold huge datasets spread across multiple DWs, the data mining approach and the resulting classification accuracy are of decisive significance. Therefore, the C5.0 decision tree algorithm (DTA) has been used for mining data. The combination of MDDM, C5.0 DTA, and the CRSA cryptosystem makes the overall system robust and efficient for DSS utilities and provides a solution for business intelligence (BI) applications. A brief description of the proposed research model is given in the following sections.

The overall functional procedure of the system is illustrated in Fig. 1.

Fig. 1 Proposed system model for DSS-oriented BI solution for SIDRD

A. CRSA-Enriched C5.0 Mining Model for SIDRD Business Intelligence

In this paper, a novel scheme for BI applications has been developed using the Commutative RSA (CRSA) cryptosystem combined with the C5.0 decision tree algorithm for data mining in an MDDM-based BI environment. The implementation of CRSA for the mining process is accomplished in three consecutive steps:

  • Step 1: Data security is provided for the locally generated rule sets. This is followed by generation of the combined secure rule set, which is the collection of all rules generated across the participating MDDMs, held as encrypted data elements.

  • Step 2: Once the combined secure rule set has been generated for the C5.0 DTA to perform data mining, the individual data sources propagate the secure rule sets to the participating data elements or MDDM components. On receiving the secure rule sets, the MDDM components decrypt the data elements. Note that the initiating data element or MDDM component performs the decryption of the secure rule set that yields the combined rule set.

  • Step 3: In this phase, data classification for unified BI representation takes place: the C5.0 algorithm is applied to mine the MDDM elements and to classify data elements for BI utilities and DSS functions.
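Steps 1 and 2 rely on the commutative property of the cipher: a rule encrypted by several sources yields the same ciphertext whatever the order of encryption, so identical rules from different sources collapse to one entry in the combined secure set. The following is a minimal sketch of that idea, not the CRSA construction of this paper (which is specified in the figures). It assumes a Pohlig-Hellman/SRA-style exponentiation cipher over a shared prime modulus as a stand-in, and the names `make_key`, `encode`, `decode`, and `combined_rule_set` are hypothetical.

```python
import math
import secrets

# Shared public modulus: the Mersenne prime 2^521 - 1 (an assumption of this sketch).
P = 2**521 - 1


def make_key():
    """Return a hypothetical key pair (e, d) with e * d = 1 mod (P - 1)."""
    while True:
        e = secrets.randbelow(P - 3) + 2      # random exponent in [2, P - 2]
        if math.gcd(e, P - 1) == 1:           # e must be invertible mod P - 1
            return e, pow(e, -1, P - 1)


def encode(rule):
    """Map a short text rule to an integer in [1, P - 1]."""
    m = int.from_bytes(rule.encode("utf-8"), "big")
    if not 0 < m < P:
        raise ValueError("rule is empty or too long for the modulus")
    return m


def decode(m):
    """Inverse of encode."""
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


def combined_rule_set(local_rule_sets, keys):
    """Union the local rule sets through commutative encryption."""
    n = len(keys)
    secure = set()
    # Step 1: each rule is encrypted by every source, starting with its owner.
    for i, rules in enumerate(local_rule_sets):
        for rule in rules:
            c = encode(rule)
            for j in range(n):
                e, _ = keys[(i + j) % n]
                c = pow(c, e, P)
            secure.add(c)  # identical rules collapse whatever the key order
    # Step 2: every source removes its own layer; any order works.
    combined = set()
    for c in secure:
        for _, d in keys:
            c = pow(c, d, P)
        combined.add(decode(c))
    return combined
```

For example, with three sources where two of them hold the same rule, `combined_rule_set` returns three distinct rules rather than four. The combined set would then be passed to the classifier as in Step 3. A production system would need padding and a vetted parameter choice, which this sketch omits.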

B. System Model

The overall system model for the C5.0 DTA implementation with CRSA for SIDRD is described in the following subsections.

a. System Initialization

The pseudoalgorithm for initialization is given in Fig. 2.

Fig. 2 System initialization for CRSA

Expressed formally, the combined rule set \( \left( {YC_{RSet} } \right) \) can be stated as \( YC_{RSet} = \left\{ {Nj_{1} \cup Nj_{2} \cup Nj_{3} \cup \cdots \cup Nj_{y} } \right\}\forall y \), which is employed for achieving data mining results. Here, the data analysis can be done for the individual MDDM elements.

b. Secure Rule Set Generation for Classification

The pseudocode for securing rule sets generated is given in Fig. 3.

Fig. 3 Pseudocode for rule set generation

c. Combined Rule Set Generation

The pseudoalgorithm for combined rule set generation is given in Fig. 4.

Fig. 4 CRSA cryptosystem combined rule set generation

The developed system ensures the optimal performance in terms of mining efficiency, execution time, accuracy, and data security which are the key requirements for service innovation and redesign (SIDRD)-oriented BI utilities.

d. Amalgamation of C5.0 with CRSA for SIDRD-Oriented BI Application

In our previous work [8], a comparative analysis of CRSA with C4.5 and C5.0 was carried out, in which C5.0 performed much better than the C4.5 algorithm. The robustness of C5.0 has therefore been incorporated in this research, and the SIDRD objective has been achieved by hybridizing CRSA with the C5.0 decision tree algorithm. The combined rule set generated for C5.0 data mining for y feedback datasets is given by \( C_{RSet\,y}^{C5.0} = C_{{{\mathcal{R}}jx{\mathcal{R}\mathcal{G}}}}^{C5.0} \left( {C_{RSet} ,y} \right) \). In other words, \( C_{RSet\,y}^{C5.0} = C_{{{\mathcal{R}}jx{\mathcal{R}\mathcal{G}}}}^{C5.0} \left( {\left\{ {Nj_{1} \cup Nj_{2} \cup Nj_{3} \cup \cdots \cup Nj_{y} } \right\} ,y} \right) \), which expands to

$$ \begin{aligned} C_{RSet\,y}^{C5.0} & = C_{{{\mathcal{R}}jx{\mathcal{R}\mathcal{G}}}}^{C5.0} \left( {\left\{ {\left\{ {{\mathcal{R}}j_{11} ,{\mathcal{R}}j_{21} , \ldots ,{\mathcal{R}}j_{1xCub1} } \right\} \cup \left\{ {{\mathcal{R}}j_{12} ,{\mathcal{R}}j_{22} , \ldots ,{\mathcal{R}}j_{1xCub2} } \right\}} \right.} \right. \\ & \quad \left. {\left. { \cup \cdots \cup \left\{ {{\mathcal{R}}j_{1y} ,{\mathcal{R}}j_{2y} , \ldots ,{\mathcal{R}}j_{1xCuby} } \right\}} \right\},y} \right) \\ \end{aligned} $$

Here, \( {\mathcal{R}}j_{1xCuby} \) denotes the highest rules generated by a given feedback source. Similarly, \( C_{{{\mathcal{R}}jx{\mathcal{R}\mathcal{G}}}}^{C5.0} \) denotes the C5.0 mining model-based function for combined rule generation. The derived algorithm for the final model, which combines the C5.0 algorithm and CRSA for BI applications in the multidimensional data model (MDDM) scenario, is given in Fig. 5.

Fig. 5 C5.0 algorithm-based CRSA approach for SIDRD utilities in MDDM scenario

e. Service Innovation Discovery and Service Redesign (SIDRD) Implementation

Considering the requirement for a robust and effective decision support-based business intelligence (BI) utility, the system model has been implemented for service innovation discovery and service redesign (SIDRD). The overall system implementation is illustrated in Fig. 1. The SIDRD objective is accomplished in three main phases, as follows:

  • Phase 1: Service Feedback Mining

In this phase, the feedback data from various heterogeneous data sources are processed for robust classification. We have applied our C5.0 and CRSA-based mining model to service feedback mining to ensure optimal classification accuracy and genuine classified data. The model crawls the service feedback data retrieved from the various data sources, and the mining is then carried out in the MDDM warehouse infrastructure. The contribution of this phase is to provide precise and accurate classified data that can be used by the decision support system (DSS).

  • Phase 2: Service Modeling

Since SADT flow-based service descriptions are not optimal and cannot be used for DSS-oriented reasoning, the retrieved and processed data are converted into a structured and classified form that supports reasoning. In this paper, we have used the strategic rationale model (SRM) to present the service flow architecture. The strategic rationale model is a graph-based representation in which the nodes can be goals, feedback data resources, or tasks. In our work, the nodes are linked by means–end relationships and task decomposition relationships.

  • Phase 3: Reasoning

This phase can be regarded as the justifying element of the SIDRD model, as it performs reasoning over the service model and the mined service feedback. The significance of reasoning lies in refining service optimization. In our system model, the reasoning component learns C5.0 decision trees from the strategic rationale model described above. In DSS-oriented BI utilities, managers or business analysts define the organizational goals or objectives. On the basis of the objectives defined, we have constructed a decision tree using the SRM. The data elements in the privacy-preserved decision tree (CRSA+C5.0) are analyzed against the mined data, and thus the validity of a decision is verified against the mined and classified data.
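The strategic rationale model used in Phase 2 is described only as a graph whose nodes are goals, tasks, or feedback data resources, joined by means–end and task decomposition links. One way such a structure might be represented is sketched below. The class `SRMGraph`, its method names, and the link direction conventions (a means points to its end, a task points to its parts) are assumptions made for illustration and are not taken from the implementation reported in this paper.

```python
from dataclasses import dataclass, field

NODE_KINDS = {"goal", "task", "resource"}
LINK_KINDS = {"means-end", "decomposition"}


@dataclass
class SRMGraph:
    """Hypothetical container for a strategic rationale model."""
    nodes: dict = field(default_factory=dict)   # node name -> node kind
    links: list = field(default_factory=list)   # (source, target, link kind)

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def add_link(self, source, target, kind):
        if kind not in LINK_KINDS:
            raise ValueError(f"unknown link kind: {kind}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.links.append((source, target, kind))

    def means_for(self, goal):
        """Nodes linked to `goal` as a means of achieving it."""
        return [s for s, t, k in self.links if t == goal and k == "means-end"]

    def parts_of(self, task):
        """Nodes that `task` is decomposed into."""
        return [t for s, t, k in self.links if s == task and k == "decomposition"]
```

A small model could then be built by adding a goal such as "reduce delivery complaints", a task "redesign delivery service" linked to it as a means, and a resource "delivery feedback data" as one part of that task. The reasoning phase would walk these links to decide which mined feedback bears on which goal.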

4 Results and Discussion

In this paper, a robust hybrid system for business intelligence (BI) applications has been developed. To accomplish the objective of the research, we have built on the contributions of our previous work [1, 2, 31, 32]. The developed system draws on the effectiveness of the C5.0 decision tree algorithm [2] and Commutative RSA (CRSA)-based privacy preservation to ensure data genuineness in the multidimensional data model [1]. Further, to ensure effective performance for service innovation discovery and service redesign (SIDRD), we have run the hybrid system on multiple datasets in which the data are heterogeneous customer feedback. Service refactoring is evaluated in two consecutive stages: first by increasing the user base and then by reducing user risk. The overall system performance has been analyzed in terms of overall accuracy (that is, the BI-oriented classification accuracy), coverage, and F1 score. The simulation framework has been developed on the Java platform (Figs. 6 and 7).

Fig. 6 Analysis for accuracy with respect to number of mined data

Fig. 7 Analysis of coverage with respect to number of mined data for change

Figure 8 presents the F1 score, the harmonic mean of precision and recall, for the system under evaluation. The results show that the F1 score approaches unity (1) as the number of mined data items increases. This indicates that the system is robust for larger data samples and performs well even with high data counts. The higher F1 score reflects the effectiveness of the developed system for information retrieval and query classification. Thus, the results indicate that the proposed system can be a potential solution for BI utilities, meeting the requirements of DSS-oriented BI while ensuring classification quality, security, accuracy, and low computational complexity.

Fig. 8 Analysis for estimate F1 score
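For reference, the three reported measures can be computed from true and predicted labels as sketched below. The paper does not define coverage, so this sketch assumes it is the fraction of samples for which the classifier returns any prediction (a prediction of `None` meaning that no rule fired), and it computes accuracy over the covered samples only. The function name `evaluate` and these conventions are assumptions, not the evaluation code used for Figs. 6 to 8.

```python
def evaluate(y_true, y_pred, positive):
    """Return (accuracy, coverage, f1) for one positive class.

    Assumptions of this sketch: a prediction of None means "not covered";
    accuracy is measured over covered samples only.
    """
    pairs = list(zip(y_true, y_pred))
    covered = [(t, p) for t, p in pairs if p is not None]
    coverage = len(covered) / len(pairs) if pairs else 0.0
    accuracy = (
        sum(1 for t, p in covered if t == p) / len(covered) if covered else 0.0
    )

    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, coverage, f1
```

With true labels `["pos", "pos", "neg", "neg", "pos"]` and predictions `["pos", "neg", "neg", None, "pos"]`, four of five samples are covered, three of those four are correct, and precision 1.0 with recall 2/3 gives an F1 score of 0.8.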

5 Conclusion

In the present-day competitive business scenario, business intelligence (BI) is of great significance in providing decision support to managers and organizational decision makers. Organizations use huge datasets to make decisions on growth-oriented service redesign so as to retain or sustain their market position. To analyze such datasets, businesses maintain numerous data warehouse infrastructures or online analytical processing systems, where data are collected from various sources and stored in a heterogeneous environment. In such an environment, data mining and classification for BI utilities play a significant role; meanwhile, because the application scenario is heterogeneous and multiuser, data security is essential for precise classification and for decision-oriented data retrieval and presentation. To accomplish these objectives, an efficient BI paradigm has been developed in this paper using the C5.0 decision tree algorithm, which is recognized for its classification accuracy and mining efficiency. To ensure robust security and privacy preservation of datasets in the multidimensional data model (MDDM) or multiple data warehouses, Commutative RSA (CRSA) has been employed. The hybrid system combines the C5.0 decision tree algorithm with the CRSA cryptosystem for MDDM or multiple data warehouses and applies it to service innovation discovery and service redesign (SIDRD). The resulting system benefits from the effectiveness of the C5.0 data mining algorithm and a robust privacy preservation approach that together ensure optimal data processing and classification. The final implemented system has exhibited optimal performance in terms of accuracy, coverage, and F1 score. Unlike other existing paradigms, the developed system emphasizes overall system optimization and decision support-oriented BI efficiency.