1 Introduction

Language is the medium humans use for communication, and computers can now process it well enough to support search. Search Engine Optimization (SEO) is the practice of optimizing websites to increase their visibility in the organic rankings of Google and other search engines. It can model how individuals acquire and discover information on practically any topic. Keyword search determines the relevance of words, queries, and phrases to a website and its pages so that users can find the page that best answers their query in real-world applications; the goal behind a query is known as search intent (see Fig. 1 for more details).

Fig. 1 Keyword search representation flow chart

One of the most valuable uses of pattern recognition (PR), machine learning (ML), artificial intelligence (AI), social computing (SC), and recommender systems (RS) is to support informed decisions and provide a more realistic representation of the multiple relations that characterize an entity in a system. However, optimizing content or anticipating likely searches is possible only if what people are searching for, and what they want to see, can be accessed easily (Han et al., 2022; Aggarwal, 2016). Another approach to finding co-occurrence (CO) patterns comes from corpus linguistics and statistical analysis, in which Extensible Markup Language (XML) and graph structures in hypertext corpora are used to extract specific data attributes.

Co-occurrence networks, sometimes called semantic networks (Segev, 2021), are graphical methods for resolving ambiguity and analyzing text, including potential relationships among entities, concepts, and organisms such as bacteria (Freilich et al., 2010). A co-occurrence network is a collection of terms that are connected because they occur together in a certain text, concept, or structure. Such networks are created by linking words according to a set of co-occurrence strategies, drawing on the format of scientific communication, co-citation analysis, multinomial models, and graph neural networks (Han et al., 2022; Aggarwal, 2016; Yang et al., 2021; Garg, 2021); these techniques have improved significantly but still have flaws. There is great interest in keyword search over relational databases (Yang et al., 2021; Garg, 2021; Bast et al., 2016), where the most critical means of data access is Structured Query Language (SQL). Because using SQL requires that the relational schema be well known, accessing a large volume of relational data has become more challenging for prospective users. Graphs, including social graphs, are used in social media for information organization, structure, storage, and retrieval, as well as for node categorization, link prediction, clustering, and visualization (Cai et al., 2018; Goyal & Ferrara, 2018). Graph clustering groups the nodes of a graph into clusters using the graph structure or node attributes. Numerous works (Ma et al., 2021) propose node distribution approaches in which the represented nodes can be passed to traditional clustering algorithms. Search engines such as Google still represent an influential and trustworthy resource for discovering practical website information.

The contextual relevance of the user query to the search engine’s indexed pages was the primary factor used by early search engines to return pertinent pages to the user. Information retrieval (IR) techniques were implemented directly in the retrieval and ranking algorithms. Conventional IR presumes that the fundamental unit of information is a document and that a large collection of documents is available to form the text database. Researchers have also used IR to extract knowledge from structured data for community identification and search. A list of keywords, sometimes referred to as terms, is the most widely used query format. Information in text is unstructured, whereas data in databases is highly structured and kept in relational tables; thus, retrieving information from text differs from retrieving data from databases with SQL queries. Because text retrieval lacks a structured query language like SQL, the primary interest is in retrieval and related activities that can increase its accuracy or efficiency, and the IR community has not focused much on real-world data applications such as false news.

Keyword research is the first and most crucial step in any search engine optimization strategy (Yang et al., 2021; Garg, 2021). The most popular approach to the keyword search problem is Graph-Based Keyword Search (GBKS), which identifies a set of closely linked nodes in the graph that match the keywords of a query (Bhalotia et al., 2002; Kacholia et al., 2005; He et al., 2007). BANKS-I (Bhalotia et al., 2002) considers the shortest route from a tree’s root to a node that contains keywords, BANKS-II (Kacholia et al., 2005) uses a forward search to approximate a solution, and BLINKS (He et al., 2007) improves on BANKS-II by trying to identify all distinct subtrees with the best scores. These retrieval techniques are node-centered, whereas keyword search engines and semantic relationships (Wang et al., 2008) can link keyword queries to formal queries. Classical manual reading for information extraction and knowledge acquisition therefore cannot keep up with the needs of the complex data age.

Researchers in machine learning (ML) and graph mining have drawn on various branches of artificial intelligence, including recommendation systems, computer vision, and natural language processing, to solve standard tasks through graph-based machine learning. In conventional ML, researchers have worked on alternative clustering problems on graphs, and comparing the similarity of objects of the same kind is crucial in many applications (Han et al., 2022; Aggarwal, 2016). A sustainable cluster is a collection of nodes in a multiplex network that are concurrently connected to one another across all of the distinct layers (Baxter et al., 2016). Sustainability here corresponds to several paths that connect the same pair of nodes in the feasible cluster, each on a different multiplex layer. Therefore, understanding fundamental search co-occurrence correlations through multi-layer graph representations is an essential methodology, from literature to intelligence analysis (Fig. 2).

Fig. 2 Multi-layer graph representation

Multiple layers are a feature of realistic systems. Multi-layer graphs (MLGs), which are widely accepted as models of such systems (Boccaletti et al., 2014; Kivelä et al., 2014; Kumar et al., 2020), differ from single-layer graphs (SLGs) in their multi-relational structure: entities can have different types of relationships between them, which provides a range of resources for decision-making. When modeling several real-world relations among the same group of people, for example, MLGs provide an expressive method in which layers represent various online and offline relations (e.g., following, co-authorship, and co-working relations). Academia and the business community have used keyword search to help users make the most of network resources. Label Propagation (LP) (Nickel et al., 2015; Alimadadi et al., 2019), Random Walks (RW) (Bojchevski et al., 2018; Valdeolivas et al., 2019), E-Commerce Recommendation (E-CR) (Aggarwal, 2016), Multi-layer Graph Embedding (MLGE) (Rossi et al., 2021; Makarov et al., 2021), and deep neural networks have been well studied for forecasting relational links between entities and for keyword search on multi-layer graphs that represent complex relationships accurately (Wu et al., 2020; Perozzi et al., 2014). However, common MLG representations of vertices and edges, together with keyword search methods, are used to find relevant components in a network system, and current methods focus on specific multi-layer graphs, such as multiplex and heterogeneous structures of interconnected complex systems.
At the same time, most of the approaches mentioned above have their merits and demerits. Despite challenges such as memory cost and time complexity, graph embedding, also known as network representation learning (Grover & Leskovec, 2016; Hamilton et al., 2017), offers an effective solution by changing the representation form and mapping nodes into a low-dimensional space, which maintains consistency and enhances the understanding of network entities. The increasing availability of complex networks with billions of vertices and edges has significantly advanced network analysis. Multi-layer Graph Embedding (MLGE) attempts to describe the vertices and edges in vector space while maintaining the structure of the graph and the information within and across layers, thereby overcoming the representation and analysis challenges that complex networks pose for graph embedding.

Diverse techniques have been proposed to learn graph representations. Graph Neural Networks (GNNs) (Battaglia et al., 2018), among the best-known of these and recently popularized in part by Google researchers, extend popular networks such as RNNs and CNNs to graph-structured data (Scarselli et al., 2008; Duvenaud et al., 2015; Niepert et al., 2016; Defferrard et al., 2016). One line of study builds neural networks as RNN variants that operate on graphs; Li et al. (2015) extended the original GNN model with a new RNN-based algorithm. A significant body of work that has attracted rapidly growing interest is graph convolutional networks (GCNs) (Kipf & Welling, 2016), which are based on spectral graph theory, were initiated by Bruna et al. (2013), and were then extended by Defferrard et al. (2016) with fast localized convolution. Most neural networks go deep to achieve outstanding performance. Recent GNNs for node classification on graphs cannot achieve high performance on a variety of data sets because they are shallow networks and tend to concentrate on node-wise scores.

GNNs are becoming popular in multi-layer learning (Wu et al., 2020; Hamilton, 2020). However, prior methods have yet to investigate these graph interactions thoroughly because they do not combine information from several links concurrently. To address this problem, researchers have proposed multi-omics data analysis that embeds multiple kinds of knowledge into graph neural networks (Xiao et al., 2023). To exploit the benefit of structural diversity and deep GNN architectures, Feng et al. (2023) proposed a GNN pipeline with a novel two-stage space, which aims to achieve high performance. In contrast, transferable deep GNN models have been used in a block-wise manner, and Liang et al. (2021) and He et al. (2021) use the multilevel embedding framework MILE and a distributed multilevel framework (DistMILE) for scalable graph embedding. Our proposed keyword search co-occurrence multi-layer graph mining approach (EKSCOMLGs) implements association based on multi-layer graph embedding and on graph neural networks with multiple kinds of knowledge for mining network features. Its fundamental aim is to learn co-occurrence relations in real-world data sets.

Fig. 3 Graphs using keyword co-occurrence and graph neural network representation

Figure 3a considers a scenario in which a community contains researchers, and recommendations of individuals who have never cooperated appear more valuable. Suppose ten researchers are skilled in different fields, and assume there is a talent search for a project requiring expertise in Mathematics, Architecture, and Computer Analysis. Since a graph can serve as a pictorial illustration, a social graph mapping based on co-membership can be used to model the bringing together of information from two or more people who belong to the same community of researchers but to different expertise groups (G). In Fig. 3a, Red (\(\alpha\)) represents researchers skilled in Mathematics (G1), Green (\(\beta\)) represents researchers skilled in Architecture (G2), and Purple (\(\gamma\)) represents researchers skilled in Computer Analysis (G3).

A graph neural network representation example is shown in Fig. 3b, where the circles indicate nodes and the edges represent the weights or information passed between them; some layers may be hidden. The structural role of a node can be represented by the Red, Green, Purple, and Gold colors. When there are few hidden layers, the network is called a shallow neural network; when there are many, it is called a deep neural network. For proper execution, there must be a mutual linkage or interest between nodes and edges in Fig. 1.

We are particularly interested in two research questions: (1) What is the relatedness between nodes and edges within the same community type or across different community types in real-world data? (2) Does the proposed model perform better than existing benchmark methods? To address these questions, we build on search engine optimization (SEO) based on content information properties, for which elements of Multi-layer Graph Embedding (Rossi et al., 2021; Makarov et al., 2021) and Graph Neural Networks (Wu et al., 2020; Hamilton, 2020) have provided helpful information. However, practical keyword search co-occurrence multi-layer graph mining (EKSCOMLG) is an NP-complete problem. Thus, the proposed EKSCOMLGs are driven by enhanced multi-layer graph embedding and graph neural networks, which could revolutionize practical keyword search co-occurrence tasks in real-world applications by fully utilizing the network’s capabilities to enhance user experience. The contributions of this paper are as follows:

  • An effective keyword search co-occurrence multi-layer graph mining approach is proposed. The proposed method is built on multi-layer graph embedding and graph neural networks with highly adaptive real-world processes to build intelligent solutions.

  • We performed extensive experiments using four evaluation metrics on distinct data sets against other benchmark methods. Our proposed model shows improved performance and offers the advantage of providing links that guide the classification process, which enhances existing techniques by examining and learning co-occurrence relations, social association, deformity prediction, and recommendation.

The remainder of the manuscript is organized as follows: Sect. 2 gives the preliminaries and problem definition, Sect. 3 describes the materials and methods, Sect. 4 presents the experiments, Sect. 5 presents the results and discussion, Sect. 6 reviews related work, and Sect. 7 concludes.

2 Preliminaries and problem definition

This section introduces the preliminaries, including the definitions and notations used (Table 1), followed by the problem definition, in which a real-world network is represented by a graph with directed or undirected edges. For a graph G, the node set is denoted by N and the edge set by E; thus \(G=(N, E)\), where N is the vertex (node) set of size \(n =|N|\) and E is the edge list of size \(m =|E|\). Nodes are written as \(N_u= \{u_1,u_2,...,u_n\}\) and \(N_v= \{v_1,v_2,...,v_n\}\), and the set of edges between these vertices as \(E = \{e_{11}, e_{12},...,e_{nn}\}\), where \(e_{uv} =(u_i,v_j)\in E\) and \(1 \le i,j \le n\).

Another way to describe a graph G is as an adjacency matrix A with \(A(u,v)=1\) if \((u,v)\in E\) and 0 otherwise. If \(A(u,v) \ne A(v,u)\) for some pair of nodes, G is a directed network; if the graph is undirected, the matrix is symmetric, i.e., \(A(u,v)=A(v,u)\) for all nodes \(u,v \in N\). If \(A(u,v)\) is weighted by \(w(u,v) \in W\), \(G=(N, E, W)\) is a weighted network; otherwise, it is an unweighted network. Graphs that carry richer information than simple graphs include attributed graphs, multi-relational graphs (Hamilton, 2020), and multi-layer graphs (Kivelä et al., 2014).
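The adjacency-matrix description can be illustrated with a minimal Python sketch. The helper names `adjacency_matrix` and `is_symmetric` and the toy edge list are assumptions made for illustration and are not part of the original method.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix A with A[u][v] = 1 if (u, v) is an edge."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        if not directed:
            A[v][u] = 1  # an undirected edge is stored in both directions
    return A


def is_symmetric(A):
    """True if A(u, v) == A(v, u) for all node pairs (undirected graph)."""
    n = len(A)
    return all(A[u][v] == A[v][u] for u in range(n) for v in range(n))


edges = [(0, 1), (1, 2)]  # hypothetical toy edge list on nodes 0..2
A_undirected = adjacency_matrix(3, edges)
A_directed = adjacency_matrix(3, edges, directed=True)
```

Here `is_symmetric(A_undirected)` holds while `is_symmetric(A_directed)` does not, which mirrors the directed versus undirected distinction in the text.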

Definition 2.1

Attributed graphs extend simple graphs by adding node attributes X and edge attributes \(X^e\). For example, \(X\in R^{n \times d}\) represents a node feature matrix, and \(X^{e}\in R^{m\times c}\) represents an edge feature matrix, with \(x^e_{u_i,v_j}\in R^c\) representing the feature vector of an edge \(e_{uv}\).

Definition 2.2

Multi-relational graphs extend basic graphs with edges that carry one of several relation types \(\tau\); that is, each edge \(e_{uv}=(u_i,v_j)\in E\) becomes \(e_{uv}=(u_i, \tau , v_j)\in E\). One adjacency matrix \(A^{\tau }\) exists for each relation type, and the complete graph can be represented as an adjacency tensor \(A\in R^{n\times r\times n}\), where r is the number of relation types. Heterogeneous and multiplex graphs are two sub-types of multi-relational graphs.

Table 1 List of notations

Definition 2.3

Multi-Layer Graphs (MLGs) have multiple edges between nodes, one set per layer. An MLG with m layers is denoted \(G_1, G_2, ..., G_m\), or \((N, E_1, E_2, ..., E_m)\) when the layers share the node set N. Following Ma et al. (2021) and Bhalotia et al. (2002), it can likewise be modeled as an EKSCOMLG \(M = (G, C)\). Here \(G = \{G^{\alpha };\ \alpha \in \{1,2,...,m\}\}\) is the family of graphs \(G^{\alpha }=(N^{\alpha },E^{\alpha })\), where \(G^{\alpha }\) is layer \(\alpha\) of M. The co-occurrence (CO) links among nodes of different layers are \(C =\{E^{\alpha ,\beta }\subseteq N^{\alpha } \times N^{\beta };\ \alpha ,\beta \in \{1,2,...,m\},\ \alpha \ne \beta \}\). The element \(E^{\alpha }\) is the set of connections within layer \(\alpha\), and the element \(E^{\alpha ,\beta }\) is the set of edges linking layers \(\alpha\) and \(\beta\). The nodes and edges that comprise layer \(\alpha\) are called \(N^{\alpha }\) and \(E^{\alpha }\), respectively.
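As a rough illustration of Definition 2.3, the pair \(M = (G, C)\) can be held in a small container that stores one node and edge set per layer plus the inter-layer edge sets. The class name `MultiLayerGraph`, its methods, and the toy data are hypothetical; this is a sketch of the data structure only, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MultiLayerGraph:
    """M = (G, C): per-layer graphs plus inter-layer edge sets."""
    layers: dict = field(default_factory=dict)      # alpha -> (N^alpha, E^alpha)
    interlayer: dict = field(default_factory=dict)  # (alpha, beta) -> E^{alpha,beta}

    def add_layer(self, alpha, nodes, edges):
        self.layers[alpha] = (set(nodes), set(edges))

    def add_interlayer_edge(self, alpha, u, beta, v):
        # C only contains edges between distinct layers (alpha != beta).
        if alpha == beta:
            raise ValueError("inter-layer edges require alpha != beta")
        if u not in self.layers[alpha][0] or v not in self.layers[beta][0]:
            raise ValueError("endpoints must belong to their layers")
        self.interlayer.setdefault((alpha, beta), set()).add((u, v))


M = MultiLayerGraph()
M.add_layer(1, {"a", "b"}, {("a", "b")})
M.add_layer(2, {"a", "c"}, {("a", "c")})
M.add_interlayer_edge(1, "b", 2, "c")  # an element of E^{1,2}
```

Keeping intra-layer and inter-layer edges in separate maps follows the split between \(E^{\alpha }\) and \(E^{\alpha ,\beta }\) in the definition.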

Definition 2.4

Graph Embedding (GE). Graph embedding is defined functionally through a mapping function F with \(f \in N\times R\times R\). MLGs thus include both an object mapping function for nodes, \(f: N \rightarrow X\), and a link mapping function, \(p: E\rightarrow Y\). Each object node \(n\in M\) corresponds to a particular object type \(f(n)\in X\), and each link \(e\in M\) corresponds to a particular link type \(p(e)\in Y\). When two links belong to the same relationship type, they share the same start and end object types.

Definition 2.5

Graph Neural Networks (GNNs) are developed by applying deep learning models to graph-structured data. Whereas conventional deep learning models work with data in Euclidean space, GNNs operate in non-Euclidean domains. Consider a graph \(G = (N, E)\) with adjacency matrix A and node feature matrix X (or edge feature matrix \(X^e\)). Given A and X as inputs, the goal of a GNN is to produce an output such as node embeddings or node classifications. The output after the m-th layer is \(H^{(m)}= F(A, H^{(m-1)};\theta ^{(m)})\), where F is a mapping (propagation) function, \(\theta ^{(m)}\) denotes the parameters of F, and m is the layer index, with \(H^{(0)}=X\). Let \(\sigma (\cdot )\) be a non-linear function, e.g., ReLU, and \(W^{(m)}\) the weight matrix of layer m. A simple form of the mapping function is often used: \(F(A, H^{(m-1)}; W^{(m)})=\sigma (AH^{(m-1)} W^{(m)})\). The mapping function can be enhanced for specific GNN tasks such as node classification and node embedding (Kipf & Welling, 2016; He et al., 2021). More information on general graph representation using embedding and GNNs can be found in Hamilton (2020).
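The simple propagation rule \(\sigma (AH^{(m-1)} W^{(m)})\) can be sketched in plain Python as follows. The function names and the toy matrices are illustrative assumptions; practical GNNs typically also normalize A and add self-loops, which this sketch omits.

```python
def matmul(P, Q):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def relu(M):
    """Element-wise ReLU, used here as the non-linearity sigma."""
    return [[max(0.0, x) for x in row] for row in M]


def gnn_layer(A, H, W):
    """One propagation step: H_next = ReLU(A H W)."""
    return relu(matmul(matmul(A, H), W))


A = [[0, 1], [1, 0]]           # two nodes joined by one undirected edge
X = [[1.0, -2.0], [3.0, 4.0]]  # H^(0) = X, one feature row per node
W1 = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen for readability
H1 = gnn_layer(A, X, W1)       # each node receives its neighbor's features
```

With identity weights, each node simply takes over its neighbor's feature row, and ReLU clips the negative entry to zero.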

3 Materials and methods

3.1 Overview of keyword search

Keyword search provides a friendly interface for information retrieval from complex data structures. Likewise, information retrieval suggests content to users of web services during interactions. Over the years, tags have become increasingly popular for categorizing web and online social network content, a practice known as folksonomy (Bai et al., 2009), which is a well-studied topic in information retrieval, computer science, and recommender systems.

A folksonomy can be represented as a 3-dimensional array \(F= [f_{uvk}]\) of tagged items. It is defined over the set of users \(U=(u_1, u_2,..., u_n)\), the set of items \(I=(I_1, I_2,..., I_m)\), and the set of tags \(T=(t_1, t_2,..., t_k)\), where the element \(f_{uvk}\) is a unary value indicating whether user u has tagged item v with the kth tag. Two tags may be strongly related if their co-occurrence frequency is high, whereas the co-occurrence frequency of two unrelated tags should be low.
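A minimal sketch of tag co-occurrence counting is given below. It stores the folksonomy sparsely as (user, item, tag) triples rather than as a dense array, and it counts two tags as co-occurring when the same user applies both to the same item; that counting rule, the function name, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (user, item, tag) assignments; f_uvk = 1 exactly for these triples.
assignments = [
    ("u1", "i1", "graph"), ("u1", "i1", "search"),
    ("u2", "i1", "graph"), ("u2", "i1", "search"),
    ("u2", "i2", "graph"), ("u2", "i2", "music"),
]


def tag_cooccurrence(triples):
    """Count how often two tags are applied to the same item by the same user."""
    tags_per_post = {}
    for user, item, tag in triples:
        tags_per_post.setdefault((user, item), set()).add(tag)
    counts = Counter()
    for tags in tags_per_post.values():
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts


cooc = tag_cooccurrence(assignments)
```

In this toy data the pair ("graph", "search") co-occurs more often than ("graph", "music"), so the first pair would be treated as more strongly related.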

ML algorithms such as probabilistic topic models (PTMs) extract meaningful themes from a corpus of documents. PTM is a common semantic representation method for social network nodes. A straightforward approach uses Latent Dirichlet Allocation (LDA) to infer topics from the generative model. Search strategies can also be divided into ranked search and conventional search. Most algorithms used in conventional search are conjunctive keyword searches, which return all documents containing the search terms without considering the semantic linkages between them or node interactions. Because conventional search is inadequate for ranked search, link prediction based on keyword search over structural similarity or dynamic correction has been presented to estimate the propensity of a connection between two nodes; however, it has its flaws (Han et al., 2022; Aggarwal, 2016; Yang et al., 2021; Garg, 2021; Kumar et al., 2020).
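Conjunctive keyword search, as described above, can be sketched with a small inverted index. The function names and the three sample documents are hypothetical, and tokenization is simplified to lowercase whitespace splitting.

```python
def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def conjunctive_search(index, query):
    """Return the ids of documents that contain every query term (logical AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    1: "graph keyword search",
    2: "keyword research for SEO",
    3: "graph embedding",
}
index = build_inverted_index(docs)
hits = conjunctive_search(index, "graph keyword")
```

The result is an unranked set: every document that contains all terms is returned, regardless of how the terms relate to one another, which is the limitation noted in the text.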

Limitation of Keyword Search using Social Tag and Probabilistic Model

Homonyms, polysemy, synonyms, and inconsistent user tagging practices in the users’ language can make the recommendation process challenging. Moreover, social networking services like Flickr may have hundreds of millions of users, objects, and tags. Most topic modeling research does not explicitly employ multi-layer graphs, and several studies disregard categorical delivery and cross-validation outcomes from the balanced population data sets presented.

3.2 Keyword search using graphs embeddings and multi-layer graphs (MLGs)

MLGs allow users to enter several search terms to obtain the most relevant results. Although it can be complex, SEO has three main components: keyword research, content creation, and link development. Of these, keyword research is the most crucial. For instance, we may produce the best content and generate excellent links that propel us to the top of Google results; still, if the wrong keyword is targeted for a real-world application, there will be no benefit in terms of e-commerce growth or technology. Effective keywords can make or break an SEO application in the real world. The key steps for initiating keyword research are as follows:

Step 1: Using important terms and related searches, develop keyword ideas.

Step 2: Determine the actual keyword difficulty and searches.

Step 3: As shown in Fig. 1, ascertain the user’s goal.

Cao et al. (2013) proposed a privacy-preserving method that ranks documents using coordinate matching. Documents are searched using dictionary-scale vectors that describe the keywords, but the links between different keywords in a document are not considered; thus, the retrieval results obtained by the scheme lack accuracy. Aggarwal (2016) discussed the influence limiter algorithm for trustworthy recommender systems. A global measure of each user’s reputation is used in the recommendation process, but it cannot expressly endorse trustworthiness without user feedback; thus, this method needs more input along the trustworthiness dimension. GE and CO graphs used as features can support updates to the data set; to use CO graphs as features, the interrelationship is needed, and it is often encoded as a Boolean feature.
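Coordinate matching scores a document by the number of query keywords it contains, which equals the inner product of binary dictionary-scale vectors. The sketch below shows only this plaintext scoring step; the encryption layer of the privacy-preserving scheme is deliberately omitted, and the dictionary, documents, and function names are assumptions made for illustration.

```python
def coordinate_match_score(doc_keywords, query_keywords, dictionary):
    """Inner product of binary keyword vectors = number of matched keywords."""
    doc_vec = [1 if word in doc_keywords else 0 for word in dictionary]
    query_vec = [1 if word in query_keywords else 0 for word in dictionary]
    return sum(d * q for d, q in zip(doc_vec, query_vec))


def rank_documents(docs, query_keywords, dictionary):
    """Sort document ids by descending score, breaking ties by id."""
    scored = [
        (coordinate_match_score(keywords, query_keywords, dictionary), doc_id)
        for doc_id, keywords in docs.items()
    ]
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


dictionary = ["graph", "keyword", "search", "seo"]  # hypothetical dictionary
docs = {
    "d1": {"graph", "search"},
    "d2": {"keyword", "search", "seo"},
    "d3": {"seo"},
}
ranking = rank_documents(docs, {"keyword", "search"}, dictionary)
```

Because each keyword contributes independently to the score, relations between keywords play no role, which is the accuracy limitation noted above.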

GE is a family of ML and DL approaches that exploit the inherent graph structure of data to transform high-dimensional discrete variables into low-dimensional continuous vector representations. To capture structural information, GE models (Rossi et al., 2021; Makarov et al., 2021) offer a global picture of latent relationships. For instance, node-wise methods compute \(e_{uv} = h(y_u, y_v)\), where \(y_u\) and \(y_v\) are the node-wise embeddings and h is a decoder function ranging in complexity from a parameter-free inner product to a multi-layer perceptron (MLP). The node embeddings themselves are typically computed with some form of trainable GNN encoder model g of the form \(y_u =g(x_u, G_u)\) and \(y_v =g(x_v, G_v)\), where \(G_u\) and \(G_v\) are the subgraphs containing nodes \(u_i\) and \(v_j\), respectively. In edge-wise methods, by contrast, the edge representation \(e_{uv}\) relies on the subgraph \(G_{uv}\) defined by both \(u_i\) and \(v_j\). In this case \(e_{uv}=h_e(u_i,v_j, G_{uv})\), where \(h_e\) is an edge encoder GNN whose predictions generally cannot be decomposed into a function of individual node embeddings. For ML systems, we note that the node-wise embeddings for all nodes in the graph can be produced by a single GNN forward pass, which suits node classification, node clustering, link prediction, and community discovery. Keyword search, however, focuses on finding a group of nodes in the graph that match the keywords and is more of a search task; edge-wise methods require a separate subgraph and corresponding forward pass, together with multi-layer linkage graphs, to make predictions for each candidate edge.
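The simplest node-wise decoder, the parameter-free inner product, can be sketched as follows. The embeddings are hypothetical fixed vectors standing in for the output of an encoder g, and squashing the score with a sigmoid is an added assumption to obtain a value in (0, 1).

```python
import math


def inner_product_decoder(y_u, y_v):
    """Parameter-free decoder h(y_u, y_v): the dot product of two node embeddings."""
    return sum(a * b for a, b in zip(y_u, y_v))


def edge_probability(y_u, y_v):
    """Assumed post-processing: map the raw score to (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-inner_product_decoder(y_u, y_v)))


# Hypothetical embeddings standing in for y = g(x, G).
y_u = [1.0, 0.5]
y_v = [0.8, 0.4]    # points in a similar direction to y_u
y_w = [-1.0, -0.5]  # points in the opposite direction

p_uv = edge_probability(y_u, y_v)
p_uw = edge_probability(y_u, y_w)
```

The decoder is symmetric in its arguments and needs no subgraph \(G_{uv}\), which is what separates node-wise from edge-wise methods.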

Node embeddings are implemented by DeepWalk (Perozzi et al., 2014) and node2vec (Grover & Leskovec, 2016); both rely on node co-occurrence in random walks to train their models. Since their objective function is non-convex, initializations of this kind may become trapped in local optima. Thus, using node embeddings directly in keyword search is not natural. Most node embedding methods rely on network distance, whereas the resulting edge-wise embedding specifies a relationship between nodes.
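The co-occurrence statistics that such methods train on can be sketched with uniform random walks, as in DeepWalk; node2vec uses biased walks, which are not shown. The function names, parameters, and toy graph are illustrative assumptions, and the embedding training step itself is omitted.

```python
import random
from collections import Counter


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        walk.append(rng.choice(neighbors))
    return walk


def cooccurrence_counts(adj, walks_per_node, length, window, seed=0):
    """Count unordered node pairs that appear within `window` steps in a walk."""
    rng = random.Random(seed)
    counts = Counter()
    for start in adj:
        for _ in range(walks_per_node):
            walk = random_walk(adj, start, length, rng)
            for i, u in enumerate(walk):
                for v in walk[i + 1 : i + 1 + window]:
                    if u != v:
                        counts[frozenset((u, v))] += 1
    return counts


adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}  # hypothetical graph
counts = cooccurrence_counts(adj, walks_per_node=5, length=4, window=2)
```

Nodes that are close in the graph co-occur often, while the isolated node `d` never co-occurs with anything, so its embedding would receive no training signal.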

In graph-based machine learning, shallow embedding methods have proven effective in capturing the relationships between nodes. We now turn to multi-relational graphs, which are complex networks consisting of nodes and edges in which each edge represents a specific relation between two nodes.

Formally, a knowledge graph is denoted as \(G=(N, E, R)\), where R is the set of relation types, \(u_i\in N\) are the entities, and \((u_s, \tau , v_o)\in E\) are the edges. The task is to assign scores to candidate triples (subject, relation, object) to ascertain the likelihood that they belong to E. Since these graphs hold factual information as tuples of the form \((u, \tau ,v)\), which represent a relation \({\tau }\) between nodes u and v, they are frequently referred to as knowledge graphs, and their encoders can be selected from a range of GNNs. Numerous decoder functions, such as ComplEx, RotatE, RESCAL, TransE, and TransX, have been proposed. Each decoder has its own method for encoding and decoding relations between nodes, and each has advantages and disadvantages (Hamilton, 2020).
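As one concrete instance of such a decoder, TransE scores a triple \((u, \tau , v)\) by how closely the head embedding translated by the relation embedding lands on the tail embedding. The sketch below uses the Euclidean norm and hand-picked embeddings, both of which are assumptions for illustration; in practice the embeddings are learned.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility: negative Euclidean distance between head + relation and tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Hypothetical 2-dimensional embeddings.
u = [0.0, 0.0]
tau = [1.0, 0.0]
v = [1.0, 0.0]  # u + tau lands exactly on v
w = [0.0, 3.0]  # u + tau lands far from w

score_true = transe_score(u, tau, v)
score_false = transe_score(u, tau, w)
```

A higher (less negative) score marks a more plausible triple, so here \((u, \tau , v)\) would be ranked above \((u, \tau , w)\).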

The extension of graph mining has led to multi-layer graphs. Liu et al. (2017) suggested three techniques to embed a multi-layer network into a continuous vector space: "layer co-analysis," "results aggregation," and "network aggregation." To find a vector space for a multi-layer network, "network aggregation" and "results aggregation" apply a conventional network embedding method to the merged graph or to each layer, respectively; our proposed method differs from these approaches.

3.3 Proposed method

3.3.1 Keyword search co-occurrence model

Our model uses both directed and undirected multi-layer graphs. We use an edge-wise approach for a multi-layer graph denoted by MLGs \(= (G, N, E, M)\), where \(G_M = (N_M, E_M)\), N denotes the nodes, E the edges or links, and M the layer. In a graph \(G=(N, E)\) with node set N, \(|N|=m\), the link set E is directed if \((u,v)\in E\) does not imply \((v,u)\in E\), whereas undirected edges satisfy \((u,v)\in E \Longleftrightarrow (v,u)\in E\). Suppose information is observed on a selected subset of nodes \(N_o \subseteq N\) with \(|N_o|=m_o\), and let \(G_o\) be the subgraph of G induced by \(N_o\). Let X and Y be a set of nodes and edges such that, if \(x\in X\) and \(y\in Y\), then \((x,y)\) represents the pair of x and y. A variable \(Y_{u,v}\), where \(u,v = 1,2,...,m\) and \(u\ne v\), indicates whether a link exists between nodes u and v in G; Y is thus the adjacency matrix of G. For any edge \((x,u_i)\), its equivalent edge can be represented by \(e=(x,y)\). Note that for undirected edges, \(Y_{u,v} = Y_{v,u}\). In the edge-wise approach, the edge representation is \(e_{uv} = h_e(u_i,v_j, G_{uv})\), where \(h_e\) is an edge encoder GNN whose predictions cannot be decomposed into node embeddings as in the node-wise approach mentioned previously. This basic idea can be illustrated through a query on a synthetic example.

Synthetic Example. Suppose the community consists of the set of researchers \(R =\{M, A, C\}\), organized into expertise groups \(G_E =\{G_m, G_a, G_c\}\), as shown in Fig. 3a. The researchers can be grouped according to their area of expertise and modeled as a bipartite graph \(G =(N_r, N_e, E)\). The bipartite graph is used in the MLG to organize entities based on their relationships, where \(N_r\) is the set of researchers, \(N_e\) is the set of expertise groups, and each edge is \((r,e)\in E\).

A complete bipartite graph on the researchers (nodes) R and expertise groups \(G_E\) contains all possible edges between researchers and groups; an edge \((r,e)\in E\) is established from r to e if r performs an action in e. An edge between r and e carries a relation \(R_{re}\) based on the performed action or the weight between them. Assume that each researcher in an expertise group is associated with a programming language. The idea is to compute relationship scores with respect to focus areas that demand a certain specialty and location; these are passed to the relationship algorithm as a keyword query \(Q = \{rq_1, rq_2,...,rq_n\}\). The mechanism for extracting the query Q from the research profile is not detailed in this work.
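Since the relationship algorithm is not specified in the text, the following sketch substitutes a deliberately simple stand-in: each researcher is scored by the fraction of query keywords found in a profile. The profiles, keyword sets, and the scoring rule are all hypothetical and serve only to show how a bipartite researcher and expertise-group structure and a keyword query Q fit together.

```python
# Hypothetical researchers, each with an expertise group and profile keywords.
profiles = {
    "r1": {"group": "G_m", "keywords": {"algebra", "python"}},
    "r2": {"group": "G_a", "keywords": {"design", "cad"}},
    "r3": {"group": "G_c", "keywords": {"python", "graphs"}},
}

# Bipartite edge set E: each edge (r, e) links a researcher to an expertise group.
edges = {(r, profile["group"]) for r, profile in profiles.items()}


def relationship_scores(profiles, query):
    """Stand-in score: fraction of query keywords present in each profile."""
    q = set(query)
    return {r: len(q & profile["keywords"]) / len(q) for r, profile in profiles.items()}


Q = ["python", "graphs"]  # keyword query rq_1, ..., rq_n
scores = relationship_scores(profiles, Q)
best = max(scores, key=scores.get)
```

Any measure that maps a query and a profile to a number could replace this overlap ratio, for example a weighted score that uses the relation \(R_{re}\).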

In knowledge graph representation, a multi-relational graph can be represented as \(G = (N, E, R)\), where the edges are modeled as tuples \(e =(u, \tau , v)\) signifying that a relation of type \(\tau \in R\) occurs between two entities. Let X be the \(m \times n\) matrix of real-world network expression values from n samples and R the \(m \times q\) matrix of relationship links; then \(Z_{u,v,k} = S_k(Y, X_{u_i}, X_{v_j}, R_{u_i}, R_{v_j})\), where \(X_{u_i}\) and \(R_{u_i}\) denote the ith rows of X and R. This notation emphasizes that the kth co-occurrence graph is a function of the network features (Y), the expression levels of the corresponding network (\(X_{u_i}\) and \(X_{v_j}\)), and other network linkages (\(R_{u_i}\) and \(R_{v_j}\)); the function \(S_k\) denotes any sequential measure based on different data sources. Our goal is to study the KSCOMLGs in a simple framework that relates the values of \(Y_{uv}\) to \(Z_{u,v,k}\) in settings where \(Y_{uv}\) can be of different types. A link \(R(u,v,k)\) in each layer denotes the association among nodes u, v, and k in a given community, and the sequence of the distribution serves as training data, since the real-world data contain vital information useful for the analysis.

3.4 Computational complexity of the proposed EKSCOMLGs

EKSCOMLGs could deduce the navigation graphs recorded in search engine query logs to comprehend the relationships between search engine inquiries. We investigate structures with layers in addition to nodes and edges to describe networks with many types of edges (or with other comparable features). A graph problem can be posed as a coloring problem: in "graph coloring" on an undirected graph G, the nodes serve as the colored regions and the edges as the neighboring pairs.
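To illustrate the coloring view, the sketch below applies a simple greedy heuristic that gives each node the smallest color index not used by its already colored neighbors. This is a generic textbook procedure shown under the assumption of a small hypothetical graph; it is not claimed to be the coloring step used by EKSCOMLGs.

```python
def greedy_coloring(adjacency):
    """Assign each node the smallest color index not used by its colored neighbors."""
    colors = {}
    for node in sorted(adjacency):
        used = {colors[nb] for nb in adjacency[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


# Hypothetical undirected graph given as adjacency sets.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coloring = greedy_coloring(adjacency)

# A coloring is proper when no edge joins two nodes of the same color.
is_proper = all(coloring[u] != coloring[v] for u in adjacency for v in adjacency[u])
```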

Consider the scenario in the keyword search co-occurrence representation in Fig. 3a, where a community of experts is to set up an activity that requires at least one additional expert from Group 1, \(Red = \{N_3, N_2, N_6\}\), Group 2, \(Green = \{N_5, N_8\}\), and Group 3, \(Purple=\{N_1, N_4, N_7, N_9, N_{10}\}\). Since \(N_2\) personally knows \(N_3\) and \(N_6\) from previous collaboration (reflected by a social relation), \(N_2\) is well connected to Group 1, the mathematical expertise group. \(N_2\) does not know any member of Group 3, the computer analysis group, but there is a link between the Group 3 member \(N_7\) and the mathematician \(N_3\), and likewise a link between the Group 1 member \(N_2\) and the Group 2 member \(N_5\) of the architecture expertise group.

In this scenario, \(N_2\) of the mathematical group may collaborate with \(N_5\), to which it is linked; hence \(N_2\) may act as the invitation to join the architecture group. Likewise, since \(N_7\) of the computer analysis group has a link with \(N_3\) of the mathematician group, \(N_3\) will most likely serve as the invitation to join the computer analysis group, because \(N_7\), the focal node of that group, is linked with \(N_3\). The graph mining technique thus supports the discovery of emerging social relations, which is the logic used in the discovery of keyword search co-occurrence multi-layer graphs.

3.4.1 Scenario: a keyword by typing a URL and searching the co-occurrence graphs

Let researcher R suggest a URL, U, that has been previously visited. The system identifies the relationship between researcher R and the experts who have searched the URL U. The co-occurrence is represented as a Researcher Co-occurrence Matrix (RCM) and is evaluated based on the relatedness between researchers, as shown in Table 2. This stems from the fact that co-occurrences are considered a more general representation of the URLs, since they describe the project being addressed, than the URL addresses themselves. Each researcher is represented by a vector of the co-occurrences he or she has utilized, linked by the frequency vectors of each pair of expertise on a given research topic.

Table 2 Researcher co-occurrence matrix analysis
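One simple way to picture a researcher co-occurrence matrix is to count, for every pair of researchers, the URLs that both have searched. The sketch below does this for a hypothetical visit log; the researcher names, the URLs, and the choice of a shared-URL count as the relatedness score are assumptions made for illustration, not the exact construction behind Table 2.

```python
from itertools import combinations

# Hypothetical visit log: which researchers searched which URLs.
visits = {
    "R1": {"u1", "u2", "u3"},
    "R2": {"u2", "u3"},
    "R3": {"u3", "u4"},
}


def cooccurrence_matrix(visit_log):
    """Count, for every pair of researchers, the URLs both have searched."""
    names = sorted(visit_log)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        shared = len(visit_log[a] & visit_log[b])
        matrix[a][b] = shared
        matrix[b][a] = shared  # the matrix is symmetric
    return matrix


rcm = cooccurrence_matrix(visits)
```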

For instance, if \(n_1\) and \(n_2\) are the numbers of experts in groups 1 and 3 and N is the total number of researchers, the expected number of co-occurrences as proposed by Forbes is \(E(X)= n_1 n_2 / N\).
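The expected co-occurrence count can be worked through with the group sizes of the scenario in Sect. 3.4, where group 1 has 3 members, group 3 has 5 members, and there are 10 researchers in total. The function name below is hypothetical, and the formula is read as the product of the two group sizes divided by the total.

```python
def expected_cooccurrence(n1, n2, total):
    """Expected number of co-occurrences, E(X) = n1 * n2 / N."""
    if total <= 0:
        raise ValueError("total must be positive")
    return n1 * n2 / total


# Group sizes from the Fig. 3a scenario: 3 (Red) and 5 (Purple) out of 10 nodes.
expected = expected_cooccurrence(3, 5, 10)
```

With these sizes the expected count is 1.5, so observing clearly more than one or two co-occurrences between the two groups would suggest a stronger association than group sizes alone explain.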

3.4.2 Multi-layer activity

Assume the nodes are swapped between colors \(\alpha\), \(\beta\), and \(\gamma\) in a given community, as shown in Fig. 2. In this case, \(N_u\) in Layer \(\alpha\) can communicate with node \(N_v\) in Layer \(\beta\), and \(N_u\) in Layer \(\alpha\) can communicate with \(N_v\) in Layer \(\beta\) and \(N_k\) in Layer \(\gamma\), respectively. Let \(G_{\alpha ,\beta ,\gamma }\) represent the subgraph of G induced by the nodes colored \(\{ \alpha ,\beta , \gamma \}\). The operation of an \(( \alpha , \beta , \gamma )\) swap concerning G is as follows:

Lemma: Let \(G\in G'\) be properly colored and let x be any node of G. Suppose nodes \(y,z\in Adj(x)\) are colored from \(( \alpha , \beta , \gamma )\) with \(\alpha \ne \beta\) or \(\alpha , \beta \ne \gamma\). If an \((\alpha , \beta ,\gamma )\) link connects y and z in G, then y and z are connected by an \(( \alpha , \beta )\) or \(( \alpha ,\gamma )\) link in \(G_{Adj(x)}\).

Proof Let \(C = [y=x_0,x_1,x_2,\ldots,x_m=z]\) be an \([\alpha ,\beta , \gamma ]\) link in G, that is, a sequence of communication between y and z. Thus, if every edge has at least one end, m is a vertex cover.

We assert that \(\{x_0,x_1,x_2,\ldots,x_m\} \subseteq Adj(m)\). The statement is correct if m = 1 in either Layer \(\beta\) or Layer \(\gamma\). Suppose \(m \ge 3\) and the statement holds for all minimal \((\alpha ,\beta ,\gamma )\) links of length less than m. G is a k-edge-connected graph if the subgraph \(G'=(N, E \setminus S)\) is connected for all \(S\subseteq E\) with \(|S|< k\). The largest value of k such that G is k-edge-connected is the edge connectivity of G.
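The notion of k-edge connectivity used at the end of the proof can be checked by brute force on small graphs: remove every set of edges of increasing size and test whether the graph stays connected. The sketch below assumes a small simple undirected graph and uses hypothetical function names; it is exponential in the number of edges and is meant only to make the definition concrete.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search check that every node is reachable from the first one."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nb in adjacency[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(nodes)


def edge_connectivity(nodes, edges):
    """Smallest number of edges whose removal disconnects the graph (brute force)."""
    edges = list(edges)
    for size in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), size):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not is_connected(nodes, kept):
                return size
    return len(edges)  # reached only when the graph cannot be disconnected


# Hypothetical examples: a 4-cycle and a 3-node path.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
path = [("a", "b"), ("b", "c")]
cycle_k = edge_connectivity(["a", "b", "c", "d"], cycle)
path_k = edge_connectivity(["a", "b", "c"], path)
```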

4 Experiments

4.1 Experimental settings

This section assesses the performance of the EKSCOMLGs model, following a description of the experimental setup. The algorithms were implemented using Python 3.0 with Anaconda and UCINET 6.733. The tests were performed on a Legion system (GPU/RTX) running Windows 10, equipped with an Intel(R) Core(TM) i7-11800H processor clocked at 2.30 GHz (2304 MHz), six cores, 12 logical processors, 8 GB of RAM, and a 512 GB SSD.

4.1.1 Data acquisition and descriptions

Six real-world data sets with distinct qualities were utilized. We chose them primarily for their wide range of characteristics, their accessibility, and their potential to make the study results repeatable. They support both supervised and unsupervised learning on graph-structured data, where predictive, recommendation, and analytic approaches are flexible with real-world data.

4.1.2 Data description

  1. The Cora data set (Kipf & Welling, 2016), a citation network, comprises 2708 scientific publications. The nodes are categorized into one of seven subject classes. There are 5429 connections in the citation network. Nodes represent scientific articles, and when an edge connects two nodes, the left node cites the right node. Each publication in the data set is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 distinct words.

  2. The Dolphin data set, authored by Lusseau (2006), identifies bottlenose dolphin point locations in Doubtful Sound. It consists of 62 nodes and 159 undirected edges with three communities, where a link represents frequent associations between dolphins.

  3. The Fake News Detection data set (Kumar et al., 2021) was assembled by Jruvika on the Kaggle platform. Its four properties are site URL, Headline, Body, and Label (Real/Fake News). The data set initially contained 4009 news instances. After the first data cleaning steps, which included deleting entries with incorrect labels, missing headlines, and missing body content, 3988 rows containing 1867 Real and 2121 Fake news samples were obtained. The majority of articles focus on political and world news topics.

  4. Kyphosis is an unusually large convex curvature of the spine. The kyphosis data set (John & Trevor, 1992), retrieved from Kaggle (https://www.kaggle.com/abbasit/kyphosisdata), contains 81 records with four attributes for each patient who underwent corrective spinal surgery. A factor denoted present indicates that a type of deformation was present after the surgery, suggesting that the patient may be recommended to undergo another surgery. Key attributes of the data employed for analysis include the ages of patients, the number of patients involved, and the start date, which indicates the day a patient is operated upon.

  5. The supermarket data set from Kaggle contains historical sales data from three branches over 3 months, with 1000 rows and 17 columns. It includes information on invoice ID, branch, city, customer type, gender, product line, unit price, tax, total, date, time, payment, COGS (cost of goods sold), gross margin percentage, gross income, and rating.

  6. Zachary's Karate Club data, a university karate social network developed by Zachary (1977), is the final data set used in this investigation. Michelle Girvan's 2002 work, which makes use of a variant of Zachary's data, popularized multi-layer graphs for illustrating community structures in networks. The data has 34 nodes and 78 edges. Each node represents a karate club member, and an edge indicates that the two members had interacted.

The characteristics of each data set are listed in Table 3, which describes the six data sets used to assess the efficiency of the Keyword Search Co-occurrence technique. The sets N and E in Table 3 correspond to the MLGs' nodes and edges, with class denoting the respective class.

Table 3 Description of Multi-Layer Graphs Data Sets

The specific features of the data sets, extending Table 3, are listed in Table 4. Here N and E are the numbers of nodes (or rows) and edges (or columns), respectively; class represents the respective classes; k is the average degree of the graphs; and DD and DU denote the density of the directed and undirected graphs, respectively. The diameter (DIA), radius (RA), and average path length (APL) are grouped under Graph DD and Graph DU, which indicate the graph distance for directed and undirected operation, respectively. CCD and CCU are the clustering coefficients (CC) of the directed and undirected networks, respectively. Note that every network has both directed and undirected linkages.

Table 4 Characteristic of multi-layer graphs data sets
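As a rough guide to how quantities such as the average degree and the density relate to N and E, the sketch below uses the standard textbook definitions for a simple graph. These definitions are an assumption here; the values in Table 4 may have been produced by a tool with different conventions. The function name is hypothetical, and the node and edge counts are those of the Zachary data described above.

```python
def graph_summary(num_nodes, num_edges):
    """Average degree and density for a simple graph with N nodes and E edges."""
    ordered_pairs = num_nodes * (num_nodes - 1)
    return {
        # Each undirected edge contributes to the degree of two nodes.
        "avg_degree": 2 * num_edges / num_nodes,
        # Undirected density: E divided by N(N-1)/2 possible edges.
        "density_undirected": 2 * num_edges / ordered_pairs,
        # Directed density: E divided by N(N-1) possible arcs.
        "density_directed": num_edges / ordered_pairs,
    }


# Zachary's Karate Club: 34 nodes and 78 edges.
karate = graph_summary(34, 78)
```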

4.1.3 Baseline methods

  1. The principle of multi-layer embedding (Kumar et al., 2021), which proposes three methods for embedding a multi-layer network into a continuous vector space.

  2. Kumar et al. (2020) employ vertex attributes, using the degree of overlap between research keywords embedded with other features in co-authorship work.

  3. Deep learning belongs to the machine learning family of techniques, a subset of artificial intelligence, and builds on artificial neural networks, which are modeled after biological neural networks. Chauhan et al. (2023) presented a supervised machine learning and deep learning model for diagnosing kyphosis disease.

  4. Ma et al. (2021) explore, as a research direction, whether a multi-layer graph embedding method can be designed that naturally learns distance/similarity.

4.1.4 Model parameter settings and training

The size of the data sets can vary from hundreds to thousands of nodes, and edges can interact simultaneously; the elements u, v, and k (nodes) are represented in binary form with the values 1 and 0. Eighty percent of each data set was used to train the model, while the remaining twenty percent was used as the test set.

4.1.5 Hypothesis

The interdependence structure between nodes and edges might contain helpful information that leads to conclusive and supporting decision-making. Let P be the probability relation that meets certain requirements: a multi-layer graph may be a directed relation or a symmetric and transitive relation. The hypothesis is taken as True if the support \(H_\theta (x) \ge 0.5\) and False if \(H_\theta (x) < 0.5\), where \(0 \le H_\theta (x) \le 1\) for the proposed approach.

4.2 Evaluation metrics

Understanding the Effective Keyword Search Co-occurrence on Multi-Layer Graphs is the main goal of our evaluation. The performance of EKSCOMLGs and the baseline methods is validated using quantitative measurements. The Wilcoxon rank-sum test, accuracy, precision, recall, F-measure, and support are used to assess the quality of the trained classifiers.

The area under the curve (AUC) can be used to enumerate the vertices of a graph but cannot capture certain aspects of user satisfaction. Precision, recall, and F1 measures receive more attention than accuracy in our study. The measures derived from the confusion matrix of a binary classifier are given in Eqs. 1 to 4 (Yang et al., 2021; Kumar et al., 2020).

$$\begin{aligned} \textrm{Precision}= & {} \frac{\textrm{TP}}{\mathrm {TP+FP}} \end{aligned}$$
(1)
$$\begin{aligned} \textrm{Recall}= & {} \frac{\textrm{TP}}{\mathrm {TP+FN}} \end{aligned}$$
(2)
$$\begin{aligned} \text {F-measure}= & {} \frac{2 \times \textrm{Precision} \times \textrm{Recall}}{\textrm{Precision} + \textrm{Recall}} \end{aligned}$$
(3)
$$\begin{aligned} \textrm{Accuracy}= & {} \frac{\mathrm {TP+TN}}{\mathrm {TP+FP+FN+TN}} \end{aligned}$$
(4)

Intuitively, recall measures how well the search engine finds all the co-occurrence graph items for a query, and precision measures how well it rejects non-occurrence graph items.

True Negative (TN). A non-co-occurrence data item (Class 0) is predicted as 0, that is, not co-occurrence (the pattern does not link).

True Positive (TP). A co-occurrence data item (Class 1) is predicted as 1, that is, co-occurrence (a pattern corresponding to links exists).

False Negative (FN). A connection that is part of the graph's co-occurrence data items (Class 1) is predicted as 0, that is, not co-occurrence (the pattern does not link).

False Positive (FP). A non-co-occurrence data item (Class 0) is predicted as 1, that is, co-occurrence (a pattern corresponding to links exists).
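Eqs. 1 to 4 can be evaluated directly from the four confusion-matrix counts. The following sketch does so with hypothetical counts; returning zero when a denominator is zero is a convention chosen here for illustration and is not stated in the text.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, accuracy


# Hypothetical counts for a link / no-link classifier on 20 test items.
precision, recall, f_measure, accuracy = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```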

5 Results and discussion

5.1 Performance evaluation of EKSCOMLGs using Wilcoxon rank sum test

The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test that can be used to compare two networks using two matched samples, to assess a network using a single sample of data, or to carry out a paired difference test of repeated measurements on a single sample to ascertain whether the mean ranks differ.

5.1.1 Test procedure

Two versions of the signed-rank test exist. The one-sample test is the more fundamental, since the paired-sample test is obtained by converting the data to the setting of the one-sample test. Most practical applications of the signed-rank test, however, arise from paired data. For a paired-sample test, the data consist of samples \(\{(X_{1}, Y_{1}),\dots ,(X_{n}, Y_{n})\}\).

Every sample comprises two measurements; in the most basic scenario, these measurements are real numbers on an interval scale. The paired-sample test can be reduced to a one-sample test by replacing every pair of values (\(X_{u_i}, Y_{v_j}\)) with their difference, \(X_{u_i}-Y_{v_j}\). In general, it must be possible to rank the differences between the pairs. The data therefore require an ordered metric scale, which carries less information than an interval scale but more than an ordinal scale. Four real-world data sets (Dolphin, Kyphosis, Supermarket, and Zachary Karate) are used for the Wilcoxon test. Tables 5, 6, 7, 8, and 9 represent the general data set for all the real-world data.

Table 5 Wilcoxon rank sum test using dolphin data set

5.1.2 Wilcoxon rank sum test using dolphin data set

For the Dolphin data set, we sample the top 20 elements, as shown in Table 6.

Table 6 Wilcoxon rank sum test using dolphin data set

Claim: The probability of social dolphin co-occurrence is assessed using 20 samples per group, where the degree of each sampled dolphin is used to rank the entire sample from the lowest to the highest degree.

Claim: \(n_1 = n_2\); \(H_0\): \(n_1 = n_2\); \(H_A\): \(n_1 \ne n_2\); \(\alpha = 0.05\). The sample sizes for \(T_1\) and \(T_2\) are \(n_1=20\) and \(n_2=20\).

Claim: \(n_1\) (co-occurrence among social dolphins) = \(n_2\) (absence of co-occurrence among social dolphins)

\(H_0\): \(n_1\) (co-occurrence present) = \(n_2\) (co-occurrence absent)

\(H_A\): \(n_1\) (co-occurrence present) \(\ne\) \(n_2\) (co-occurrence absent).

Ranking the sample data with \(n_1 = 20\) and \(n_2 = 20\) from the lowest to the highest degree, we observe that the 20 samples from the top selected data are grouped into 2 communities. Several degrees are tied in the pooled ranking: degree 3 appears twice, degree 4 five times, degree 5 three times, degree 6 nine times, degree 7 seven times, degree 8 six times, degree 9 five times, and degree 12 three times. The ranks of the entire sample sum to 820, with \(T_1 = 491.5\) and \(T_2 = 328.5\) for \(n_1 = 20\) and \(n_2 = 20\). \(T_2 = 328.5\) is chosen for testing the two groups.

According to the Wilcoxon Rank-Sum Test Table of Critical Values, every pair of sample sizes (m, n) has one set of critical values for one-tail \(\alpha\) = 0.025 and two-tail \(\alpha\) = 0.05, and another set for one-tail \(\alpha\) = 0.05 and two-tail \(\alpha\) = 0.10. The size of the smaller sample is shown in column m and the size of the larger in column n; either sample can be named m if the sample sizes are equal. Here \(m = 20\) and \(n = 20\) for a two-tailed test at \(\alpha\) = 0.05. It is asserted that the social dolphins' probability distributions are comparable. The relevant row and column give 337 and 483, the lower and upper critical values for WX, the statistic testing \(H_0\): \(MX = MY\). \(H_0\) would be rejected if \(WX \le 337\) or \(WX \ge 483\). Fig. 4 shows the association between dolphin social networks and friends.
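The rank sums \(T_1\) and \(T_2\) are obtained by pooling both samples, ranking them from lowest to highest, and giving tied values the mean of the ranks they occupy. The sketch below shows this step on hypothetical degree values (not the values in Table 6); the function name is an assumption. A useful check is that the two rank sums always add up to \(N(N+1)/2\) for N pooled observations, which gives 820 for N = 40.

```python
def rank_sums(group_x, group_y):
    """Return (T_x, T_y): rank sums of two samples, ties sharing the mean rank."""
    pooled = sorted(
        [(value, 0) for value in group_x] + [(value, 1) for value in group_y],
        key=lambda item: item[0],
    )
    sums = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        j = i
        # Advance j to the end of the block of values tied with pooled[i].
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        for _, label in pooled[i:j]:
            sums[label] += mid_rank
        i = j
    return sums[0], sums[1]


# Hypothetical degree samples with ties.
t_x, t_y = rank_sums([3, 4, 6, 6], [4, 5, 7, 8])
total_ranks = 8 * 9 / 2  # N(N+1)/2 for the N = 8 pooled observations
```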

Fig. 4
figure 4

Dolphin social network representative-based associations

Table 7 Wilcoxon rank sum test using kyphosis data set

5.1.3 Wilcoxon rank sum test using kyphosis data set

Claim: The probability of kyphosis being present or absent is assessed using 20 samples from the kyphosis data set, with the age of the patient as the factor used to rank the entire sample from the minimum to the maximum age in Table 7.

Claim: \(n_1 = n_2\); \(H_0\): \(n_1 = n_2\); \(H_A\): \(n_1 \ne n_2\); \(\alpha = 0.05\). The sample sizes for \(T_1\) and \(T_2\) are \(n_1=10\) and \(n_2=10\).

Claim: \(n_1\) (kyphosis present) = \(n_2\) (kyphosis absent)

\(H_0\): \(n_1\) (kyphosis present) = \(n_2\) (kyphosis absent)

\(H_A\): \(n_1\) (kyphosis present) \(\ne\) \(n_2\) (kyphosis absent)

Ranking the sample data with \(n_1 = 10\) and \(n_2 = 10\) from the minimum to the maximum age, we observe that the ranks of the 20 samples sum to 210; the sample consists of 4 patients with kyphosis present and 16 with kyphosis absent. Summing the ranks gives \(T_1 = 105\) and \(T_2 = 105\) for the 20 samples. Using the Wilcoxon Rank-Sum Test Critical Values Table and assuming a two-tailed test at \(\alpha\) = 0.05, we have \(m = 10\) and \(n = 10\). The claim is that the probability distributions associated with kyphosis are equivalent. The appropriate row and column give 78 and 132, the lower and upper critical values for WX, the statistic testing \(H_0\): \(MX = MY\). \(H_0\) would be rejected if \(WX \le 78\) or \(WX \ge 132\). Fig. 5 shows the kyphosis representation based on the patients' age.
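The two-sided decision rule stated above can be written as a small helper and applied to the statistics and critical values quoted in the text. The function name is hypothetical, and the outcomes shown are only what the stated rule yields for the quoted numbers; they are a reading of the rule, not an additional result of the study.

```python
def rank_sum_decision(statistic, lower, upper):
    """Two-sided rule: reject H0 if W <= lower or W >= upper."""
    if statistic <= lower or statistic >= upper:
        return "reject H0"
    return "fail to reject H0"


# Values quoted in the text (two-tailed test, alpha = 0.05).
dolphin = rank_sum_decision(328.5, 337, 483)   # m = n = 20
kyphosis = rank_sum_decision(105, 78, 132)     # m = n = 10
```

Under this rule, the dolphin statistic of 328.5 falls at or below the lower critical value of 337, whereas the kyphosis statistic of 105 lies strictly between 78 and 132.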

Fig. 5
figure 5

kyphosis disease representative-based age

Table 8 Wilcoxon rank sum test using supermarket data set

5.1.4 Wilcoxon rank sum test using supermarket data set

Claim: The association denotes the co-purchase of products in a supermarket. The supermarket data set is grouped into three branches, A, B, and C, and the branch and product purchase are used as the factors to rank the entire data set from the minimum to the maximum, where only 10 samples were selected from Table 8. Across the three branches, Branch A has the highest number of the top 10 samples, followed by Branches C and B. Ranking the entire sample from the minimum to the maximum, we observe that purchases in Branch A appear 6 times, in Branch C 3 times, and in Branch B once. Among the product lines, Health and Beauty and Electronics and Accessories appear 3 times each, Home and Lifestyle appears twice, and Sports and Traveling and Food and Beverages appear once each.

Claim: \(n_1 = n_2\), \(n_3\); \(H_0\): \(n_1 = n_2\) or \(n_1 = n_3\); \(H_A\): \(n_1 \ne n_2\) or \(n_1 \ne n_3\); \(\alpha = 0.05\). \(T_1\) corresponds to the three branches of the supermarket and \(T_2\) to the products purchased from the three branches.

Claim: \(n_1\) (co-purchase of products exists in the three branches) = \(n_2\) (co-purchase does not exist among members); \(n_3\): there is a likelihood of mutual existence among buyers.

\(H_0\): \(n_1\) (co-purchase present) = \(n_2\) (co-purchase absent); \(n_3\): likelihood of mutual purchase of products.

\(H_A\): \(n_1\) (mutual purchase occurs) \(\ne\) \(n_2\) (mutual purchase is absent).

Ranking the sample data with \(n_1 = 10\), \(n_2 = 10\), and \(n_3 = 10\) from the minimum to the maximum, we observe that the ranks of the 10 samples sum to 55, where \(T_1 = 55\) and \(T_2 = 55\). Using the Table of Critical Values for the Wilcoxon Rank-Sum Test and assuming a two-tailed test at \(\alpha\) = 0.05 with \(m = 10\) and \(n = 10\), we find 78 and 132. The claim is that the probability distributions of sales in the three branches are identical. The lower and upper critical values for WX, the statistic testing \(H_0\): \(MX = MY\), are 78 and 132. \(H_0\) would be rejected if \(WX \le 78\) or \(WX \ge 132\). Fig. 6 shows the co-purchase between the three branches of the supermarket based on gender.

Fig. 6
figure 6

Supermarket representative based on three branches and genders

Table 9 Wilcoxon rank sum test using zachary karate data set

5.1.5 Wilcoxon rank sum test using Zachary’s data set

Claim: The association of Zachary's relationships is assessed using Zachary's data grouped into three communities, with the degree as the factor used to rank the entire data set from the lowest to the highest degree. Ranking the entire sample from the minimum to the maximum, as shown in Table 9, we observe that the values 1, 9, 10, 12, 16, and 17 each appear once in the sample, 2 appears eleven times, 3 appears 6 times, 4 appears 6 times, 5 appears 3 times, and 6 appears 2 times.

Claim: \(n_1 = n_2\), \(n_3\); \(H_0\): \(n_1 = n_2\) or \(n_1 = n_3\); \(H_A\): \(n_1 \ne n_2\) or \(n_1 \ne n_3\); \(\alpha = 0.05\). The sample sizes for \(T_1\), \(T_2\), and \(T_3\) are \(n_1=10\), \(n_2=10\), and \(n_3= 14\).

Claim: \(n_1\) (co-occurrence exists among members) = \(n_2\) (co-occurrence does not exist among members); \(n_3\): there is a likelihood of mutual existence among members.

\(H_0\): \(n_1\) (co-occurrence present) = \(n_2\) (co-occurrence absent); \(n_3\): likelihood of mutual occurrence. \(H_A\): \(n_1 \ne n_2\) (mutual relation is absent).

Ranking the sample data with \(n_1 = 17\), \(n_2 = 9\), and \(n_3 = 8\) from the minimum to the maximum, we observe that the value for the 34 samples is 605, where \(T_1 = 309.5\), \(T_2 = 165\), and \(T_3 = 130.5\). For a two-tailed test at \(\alpha\) = 0.05, we have \(m = 17\) with \(n = 9\) and \(n = 8\). The Table of Critical Values for the Wilcoxon Rank-Sum Test is used. It is asserted that the Zachary Karate Club's probability distributions are comparable.

For the two-tailed test, we first use \(n_1 = 17\) and \(n_2 = 9\) with \(T_1 = 309.5\) and \(T_2 = 165\), followed by \(n_1 = 17\) and \(n_3 = 8\) with \(T_1 = 309.5\) and \(T_3 = 130.5\), where the minimum value serves as the test value for the sample data. For \(n_1\) and \(n_2\), the appropriate row and column give 84 and 159, the lower and upper critical values for WX, the statistic testing \(H_0\): \(MX = MY\). \(H_0\) would be rejected if \(WX \le 84\) or \(WX \ge 159\).

For \(n_1\) and \(n_3\), we find 70 and 138, the lower and upper critical values for WX, the statistic testing \(H_0\): \(MX = MY\). \(H_0\) would be rejected if \(WX \le 70\) or \(WX \ge 138\). Fig. 7 shows the Zachary karate network representation using club members.

Fig. 7
figure 7

Zachary social network representative

5.2 Performance using graph machine learning and deep learning algorithm

Using machine learning models, namely (a) Logistic Regression (LR), (b) Gradient Boosting Classifier (GBC), (c) Random Forest (RF) classifier, and (d) K Nearest Neighbor (KNN) (Alimadadi et al., 2019), we examine the performance of EKSCOMLGs, focusing on logistic regression and the gradient boosting classifier. Machine learning relies heavily on LR, especially for classification problems. The algorithm performs well when the outcome is one of two possibilities, such as the diagnosis of a medical problem or the behavior of an application in the real world. In real-world applications, minimizing the loss function in logistic regression is often achieved through gradient-boosting classifiers. Eighty percent of each data set was used for training, and the remaining twenty percent was used as the test set to compare the performance of the proposed technique with that of the existing methods using precision, recall, and F1-score. These measures summarize the various machine learning models on the real-world data sets in Tables 10 through 16. The cross-validation is stratified, with ten and five folds.

Table 10 Analysis description using cora data set
Table 11 Analysis description using dolphin data set
Table 12 Analysis description using kyphosis data set
Table 13 Analysis description using fake news data set
Table 14 Analysis description using supermarket
Table 15 Analysis description using zachary karate datasets
Table 16 Cross-validation for keyword co-occurrence graph

In the GML analysis, Table 10 shows that LR has the highest precision, recall, F1-score, and accuracy, followed by GBC, KNN, and RFC. LR and GBC perform well using precision one and recall zero, while RFC and KNN have similar outcomes in three analyses. LR and GBC performed well on almost all the data sets, with LR performing excellently on the Cora and Zachary karate data sets. Overall, LR and GBC yield a 2–10% improvement over the other approaches, as shown in Tables 10 through 16. The Fake News data set ran for days to provide an overall result, but KNN with two neighbors did not yield an accuracy value, which is why the KNN result for the Fake News data set is not available. According to the experimental results, the keyword co-occurrence graph shows that the proposed strategy is helpful in accurately and consistently guiding the potential MLG linkages across data sets and methodologies.

6 Related work

6.1 Graph theory and vital application

A graph represents binary and multiple associations among a person's contents and is thus a prevalent data structure. Several essential tools are typically used for real-world applications, such as the greedy search, inductive database search, and graph clustering techniques for graph mining, which describe how to achieve further enhancements (Han et al., 2022; Scarselli et al., 2008). When well-informed heuristics are available to direct the search, greedy search is an effective technique for searching an intractably large space. Greedy searches are employed when an objective function must be maximized or minimized. Greedy algorithms, in contrast to backtracking, must commit to the best option at each step and cannot reverse their decision.

The idea of searching databases of graphs for (subgraph) patterns and the application of particular data structures that reflect the space of solutions define the inductive database technique for graph mining. For the former, it is required to have a query language for defining the patterns of interest. Although most applications of the latter focus on small molecule structure-activity relationships (SARs), they still attempt to provide a concise representation of the solution patterns. The graph mining strategy on multi-layer networks usually focuses on varying granularity depending on the job.

Graph clustering is an active technique for grouping data into different collections or clusters based on the similarity of the attributes and characteristics of the data points (Aggarwal, 2016; Boccaletti et al., 2014; Kivelä et al., 2014). Graph clustering is divided into two categories of tasks: (1) developing a model to forecast a graph's class, and (2) predicting node labels in big graphs. However, considering the vast diversity of graph types and the information they can convey, the labeling costs associated with graph data are relatively significant. Multi-layered networks represent intricate connections found in contemporary networked information technology systems. Each pair of nodes in such a network may have multiple edges connecting them, each representing a distinct user activity related to cooperation or communication. For instance, Huang et al. (2021) present multi-layered degree centrality for multi-layered social networks, and Bolorunduro and Zou (2023) describe a practical application of centrality and depth-first search for community detection on multi-layer graphs based on intra-layer and inter-layer linkage graphs.

Graph Neural Networks (GNNs) are special neural networks, or neural message-passing networks, that operate on graph data structures and were originally proposed for learning molecular graph representations (Wu et al., 2020). They are strongly influenced by Convolutional Neural Networks (CNNs) and graph embedding. GNNs have been extensively employed in graph representation learning, attaining cutting-edge results in node classification, link prediction, and graph-based tasks. The essential idea of most of these methods is to formulate GNNs as a framework of neural message passing among nodes, designed to learn node representations on fixed single graphs. Graph Convolutional Networks (GCNs), which use aggregations, are a distinct type of GNN, and other GNN models based on different aggregations exist, such as gated graph neural networks (Li et al., 2015) and graph attention networks (Velickovic et al., 2017). These limitations become especially problematic when learning representations on a multi-layer graph consisting of various nodes and edges (Hamilton, 2020). GNNs were introduced when CNNs failed to achieve optimal outcomes due to the arbitrary size and complex structure of graphs. Despite their enormous success in learning graph representations, both shallow and deep neural networks face challenges; existing GNN models have been shown to be susceptible to adversarial examples that may exist in graph-structured data. While Yang et al. (2020) use two kinds of network information, topology and node attributes, to collect semantic variance from the privileged group of actual and false samples, the over-fitting issue must still be addressed. Although proximity can represent underlying linkages within communities, there are not enough edges in sparsely connected real-world networks.

6.2 Keyword search co-occurrence graph

Searching over graphs has attracted much attention recently (Yang et al., 2021; Garg, 2021; Bast et al., 2016) because it gives helpful information without requiring awareness of the underlying entities, schema, or access techniques. The search for information over massive, complicated graphs and various sophisticated keyword search algorithms has established a connection between keyword search co-occurrence and artificial index classification (Han et al., 2022; Rossi et al., 2021; Makarov et al., 2021). Thus, keyword search is fundamental to retrieving the information most relevant to the query keywords. Latent Semantic Indexing (LSI) (Han et al., 2022; Aggarwal, 2016), which relies on Singular Value Decomposition (SVD), applies a matrix decomposition to the bipartite network of keywords and documents to assess similarity and generalize searches. However, LSI has two fundamental problems with vector space retrieval: (1) it cannot express negation, and (2) it cannot enforce Boolean conditions. In addition, the SVD has a high computational cost.

Machine learning combines computer science and statistics to build graph mining models that improve through exposure to relevant data rather than through explicitly programmed instructions. The benefit of co-occurrence is that it captures relatedness rather than similarity, where similarity expresses how many traits two items share. Feature extraction is a primary problem in classical machine learning models, because the programmer must precisely specify the features that the computer is to be trained to detect; these features then drive decision-making. Deep neural networks are an option when simple pattern recognition becomes inadequate as pattern complexity increases. To capture the latent information in keyword search co-occurrence analysis, our proposed method employs multi-layer graph embedding and a graph neural network for Effective Keyword Search Co-occurrence Multi-layer Graph Mining.
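
To make the notion of relatedness concrete, the following minimal sketch builds a weighted co-occurrence graph from per-document keyword lists: two keywords are linked whenever they appear in the same document, and the edge weight counts how often that happens. The function name `cooccurrence_edges` and the toy documents are assumptions for illustration, not the construction used in our experiments.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    """Count, for every unordered keyword pair, the documents containing both."""
    edges = Counter()
    for keywords in documents:
        # sorted(set(...)) removes duplicates and fixes the pair orientation
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical keyword lists, one per document.
docs = [
    ["graph", "embedding", "search"],
    ["graph", "search", "keyword"],
    ["keyword", "search"],
]
edges = cooccurrence_edges(docs)
# edges[("graph", "search")] is 2: the pair appears together in two documents.
```

Here "graph" and "search" are related because they co-occur, even though they share no traits as words; a similarity measure would not link them.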

6.3 Multi-layer graph embedding and graph neural networks

Graph embedding converts a property graph into a vector or a collection of vectors. Instead of focusing on local structure, embedding approaches offer a global picture of latent relationships (Rossi et al., 2021; Makarov et al., 2021). Three basic inference tasks can be implemented directly in the embedding space: (1) node classification, which starts by finding a query vertex's closest neighbors in the embedding space; (2) link prediction, which suggests links that are missing or that will form in the future; and (3) community detection, which uses potential edges found in the input graph to group related nodes. Before multi-layer representation learning was studied, graph embedding methods such as DeepWalk (Perozzi et al., 2014), LINE (Tang et al., 2015), and node2vec (Grover & Leskovec, 2016) characterized vertex neighborhoods, mostly using random-walk techniques. However, these techniques are based on a single graph. As far as we know, graph embedding for multi-layer graphs has yet to be investigated thoroughly.
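
The random-walk step that such single-graph methods rely on can be sketched as follows. This is a simplified, uniform walk generator in the spirit of DeepWalk; the biased transitions of node2vec and the subsequent skip-gram training are omitted, and the function name and toy graph are assumptions.

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks that start from every node.

    adjacency maps each node to a list of its neighbors. The resulting
    node sequences play the role of sentences for a skip-gram model.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adjacency:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:          # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical undirected toy graph.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
```

Nodes that frequently appear close together in these walks receive nearby vectors after training, which is what makes nearest-neighbor queries in the embedding space meaningful.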

An early decoder for learning multi-relational embeddings is RESCAL, as described by Hamilton (2020). An important family of decoders represents relations as translations in the embedding space; the first such model, TransE, was published in 2013. A second well-known line of research, rather than building a decoder on translated embeddings, generalizes the dot-product decoder from simple graphs to the multi-relational setting; this method is commonly referred to as DISTMULT. One major limitation of the DISTMULT decoder is that it can only encode symmetric relations, so it cannot represent the directed and asymmetric relations that make up most relation types in multi-relational graphs. Deep Multiplex Graph Infomax (DMGI), proposed for attributed multiplex network embedding (Park et al., 2020), embeds each relation type individually and then computes network embeddings that maximize globally shared features in order to detect communities. Through a discriminator, contrastive learning takes place on each layer between the original network and a corrupted network.
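
The three decoder families can be compared through their scoring functions. The sketch below follows the standard textbook forms (a bilinear form for RESCAL, a negative translation distance for TransE, and a relation-weighted dot product for DISTMULT); the function names and the random, untrained vectors are assumptions used only to show the symmetry property.

```python
import numpy as np

def rescal_score(head, relation_matrix, tail):
    """RESCAL: bilinear form head^T R tail with a full matrix per relation."""
    return float(head @ relation_matrix @ tail)

def transe_score(head, relation_vec, tail):
    """TransE: negative distance between the translated head and the tail."""
    return float(-np.linalg.norm(head + relation_vec - tail))

def distmult_score(head, relation_vec, tail):
    """DISTMULT: dot product with a diagonal (per-dimension) relation weight."""
    return float(np.sum(head * relation_vec * tail))

rng = np.random.default_rng(0)
u, v, r = rng.normal(size=(3, 4))      # two node embeddings and one relation
R = rng.normal(size=(4, 4))            # full relation matrix for RESCAL

# Swapping head and tail never changes the DISTMULT score ...
symmetric = np.isclose(distmult_score(u, r, v), distmult_score(v, r, u))
# ... whereas TransE and RESCAL generally distinguish the two directions.
transe_differs = not np.isclose(transe_score(u, r, v), transe_score(v, r, u))
rescal_differs = not np.isclose(rescal_score(u, R, v), rescal_score(v, R, u))
```

Because element-wise multiplication is commutative, the DISTMULT score of (u, r, v) always equals that of (v, r, u), which is why the decoder cannot tell a directed relation from its inverse.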

Knowledge graph embedding techniques generate random walks and embedding vectors based on meta-path schemes. The Meta-path Aggregated Graph Neural Network (MAGNN) (Fu et al., 2020) provides a better community discovery method, which uses meta-paths carrying multiple kinds of semantic information to identify multi-layer structures across graph layers. MAGNN applies an attention mechanism in its embeddings to aggregate semantic variants over nodes and meta-paths. Embeddings of network schemas and meta-paths are also obtained from heterogeneous information networks (Hamilton, 2020).

The two works mentioned above employ meta-paths to support community detection, but designing meaningful meta-paths requires considerable domain expertise. Extending graph mining to multi-layer networks, researchers have proposed a generic multi-layer graph embedding framework that can be applied to any embedding model for single-layer graphs. Three approaches have been modeled to project a multi-layer network into a continuous vector space: "network aggregation," "results aggregation," and "layer co-analysis" (Liu et al., 2017). Of these, "layer co-analysis" extends any single-layer network embedding technique to a multi-layer network so that the impact of interlayer interactions is taken into account. Our work differs from these approaches in that we study how to perform Effective Keyword Search Co-occurrence Multi-layer Graph Mining using enhanced multi-layer graph embedding and graph neural networks that provide insights about the data and explainable conclusions.
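
The difference between the first two projection strategies can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the framework of Liu et al. (2017): `spectral_embed` is a placeholder single-layer embedder (top eigenvectors of the adjacency matrix) standing in for any method such as DeepWalk or node2vec, the two toy layers are hypothetical, and "layer co-analysis" is not shown because it requires walks that cross layers.

```python
import numpy as np

def spectral_embed(adj, dim=2):
    """Placeholder single-layer embedding: top-`dim` adjacency eigenvectors."""
    vals, vecs = np.linalg.eigh(adj)              # adj is assumed symmetric
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top]

def network_aggregation(layers):
    """Merge all layers into one weighted graph by summing the adjacencies."""
    return np.sum(layers, axis=0)

def results_aggregation(layers, embed):
    """Embed each layer separately and concatenate the per-node vectors."""
    return np.concatenate([embed(adj) for adj in layers], axis=1)

# Two hypothetical layers over the same four nodes.
layer_a = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]], dtype=float)   # edges 0-1, 1-2
layer_b = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)   # edges 0-1, 2-3
layers = [layer_a, layer_b]

merged = network_aggregation(layers)              # single 4 x 4 weighted graph
single = spectral_embed(merged)                   # one embedding, shape (4, 2)
concat = results_aggregation(layers, spectral_embed)  # shape (4, 4)
```

Network aggregation discards which layer an edge came from, while results aggregation keeps the layers apart but ignores how they interact; neither models interlayer effects directly.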

7 Conclusions

While similarity search is helpful in many applications, multi-layer networks make meaningful similarity measures for objects of diverse types increasingly crucial. One compelling problem setting arises in real-world applications such as keyword search in large publication databases, where the database can be viewed as an entity-relation graph among papers, authors, and words, or can be used to characterize various kinds of connections (such as clicks, favorites, and adds). Graph embedding represents a graph as low-dimensional vectors, which is useful for network analysis tasks such as edge and node prediction. The rapid emergence of graph neural networks, a technical combination of deep learning and graph data mining, illustrates their ability to model and capture complex relationships in graph-based data. This paper therefore presented an effective keyword search co-occurrence approach based on graph mining, which considers the significance and relatedness between nodes and edges in real-world applications and utilizes multi-layer graph embedding and graph neural networks; with our proposed KSCOMLGs, users can locate the communities that are related to them. Furthermore, data relations from neighbors, edges, nodes, or multi-layer networks can be recognized concurrently, with a particular focus on deep learning to attain efficient outcomes. Graph data in real applications are frequently noisy and imprecise, so they are commonly described as uncertain graphs, in which each edge is assigned a value indicating the probability that it exists. Thus, a likely research direction is to extend the present work to handle uncertainty in medical data analysis.