1 Introduction

Online interaction among users has increased tremendously with the availability of social applications and networking sites such as Facebook, Twitter, WeChat, YouTube, Skype and many others. People with similar behaviors on these applications are linked with each other. Community structure in networks such as the Internet, email, transportation, biochemical, citation and social networks appears as sets of nodes with dense connections within a community and sparse links to the rest of the network (Newman and Girvan 2004). Detecting such structures, known as community detection, is one of the key problems in network analysis. The structures revealed are meaningful: examples include online and contact-based groups in social networks, groups of customers with similar purchasing interests in online social networks, and clusters of scientists in interdisciplinary collaboration networks (Fortunato 2010).

Modularity maximization has been one of the most popular methods for community detection by partitioning a network (Newman and Girvan 2004; Newman 2004, 2006; Leicht and Newman 2008). Many algorithms for modularity optimization have been proposed. Greedy algorithms include the fast general hierarchical method, a greedy optimization-based agglomeration algorithm, three forms of the CNM algorithm that integrate consolidation ratio metrics, and a heuristic method that optimizes modularity (Newman 2004; Clauset et al. 2004; Wakita and Tsurumi 2007; Blondel et al. 2008). Other approaches include a sampling technique that uses an unsupervised method comprising proximity estimation and validation of the hierarchical grouping of networks (Sales-Pardo et al. 2007); spectral methods, including eigenspectrum analysis, a spectral graph tri-partitioning algorithm, objective function maximization through two new spectral methods, the heuristic algorithm Qcut, the recursive algorithm HQcut and Kcut (Newman 2006; Richardson et al. 2009; White and Smyth 2005; Ruan and Zhang 2008; Ruan and Zhang 2007; Newman 2013); an extremal optimization algorithm (Duch and Arenas 2005); mathematical programming with two linear programming and vector programming algorithms (Agarwal and Kempe 2008); and simulated annealing, such as the cartographic method and Monte Carlo methods (Guimera et al. 2005a, b; Massen and Doye 2005; Medus et al. 2005). Modularity is a quality metric that measures the strength of community structure: it is the difference between the actual number of edges within communities and the number expected in a randomized graph with the same nodes and degrees, where the degree of a node is the number of edges connected to it. This paper focuses on divisive clustering by maximizing graph modularity, which adds up the scores of every pair of nodes placed together in a single community.

Divisive clustering algorithms are ‘top-down’: all nodes start in a single cluster, which is split recursively until each node forms its own cluster. The Girvan–Newman algorithm (Girvan and Newman 2002) is a common divisive method. It uses edge betweenness, which tends to be high on the sparse connections between vertices of different communities, to determine the strength of edges, and it repeatedly deletes the edge with the largest betweenness until no edge remains to delete. Another algorithm, Fast-Newman (Newman 2004), takes modularity as the objective function and returns the partition at which the objective function Q attains its highest value.

Community identification is the process of selecting dense clusters that have sparse connections to the rest of the graph. In social networks, many of these communities overlap, with each node participating in several communities, which reveals features of the network. Many approaches exist for community detection. The coupled-seed expansion method is reported to be effective compared with many existing algorithms such as Bigclam, OSLOM, SE, Demon, OMSTMO, LC and Ego-Splitting (Asmi et al. 2021). Modularity-based local community detection methods are widely used but have limitations regarding seed node selection and community instability. By considering the local modularity density and using the Jaccard coefficient, local communities can be formed in a core area detection stage followed by an extension stage, which also provides efficiency and precision (Guo et al. 2021). A more generalized modularity measure called f-modularity, applied to simulated networks and to real-world market networks, quantifies community structure by estimating the information existing between discrete random samples and a large value space (Guo et al. 2021). A recent algorithm for unsupervised network community detection using modularity optimization, which differs slightly from a graph neural network, has been proposed and is reported to be more efficient than the fast Louvain method (Sobolevsky 2021). Social network analysis also combines techniques such as the K-means clustering algorithm for novel predictions, for example of drug target interactions using Bayes network, Naïve Bayes and SVM (Aghakhani et al. 2018).

A social applications network comprises different communication applications that serve different purposes, including news sharing, marketing, entertainment, relationships, education and merchandising. Users often have more than one account and use these accounts for different purposes depending on the situation. In this study, Twitter is used for politics, YouTube for videos, WeChat for transactions, Facebook for profile information of products, WhatsApp for personal communications, Skype for meetings and interviews, and Instagram for pictures.

In this research, we derive insights from a social applications network by creating a cosine similarity weighted graph of users. The cosine similarity between two users is the number of applications they both use divided by the product of the square roots of the total number of applications used by each user. The r-neighborhood technique is used to prune the network: edges whose similarity reaches a particular value of r are kept and all other edges are removed. It is hard to group the web of customers in the r-neighborhood graph by inspection, or to determine whether a customer belongs to a single community or to several, so we use graph modularity maximization to make decisions about community assignments. Knowledge is extracted by analyzing the trends of social applications in order to forward advertisements, information, services and recommendations to users. The k-nearest neighbors technique is also implemented for deleting edges from the social applications network of users. Communities are detected from the r-neighborhood graph and the k-nearest neighbors graph using modularity maximization with a divisive clustering approach. The Gephi tool is also used to perform modularity maximization. All three techniques indicate that Twitter, YouTube and Facebook are the most popular applications. However, the modularity of the k-nearest neighbors graph has the highest value, 0.581, compared with 0.554 for r-neighborhood and 0.555 for Gephi.

2 Related work

2.1 Community detection

Chen et al. (2014) review various community detection metrics and propose an efficient algorithm that maximizes modularity density (Qds). In another study, ten algorithms are re-implemented and evaluated on real-world datasets within a proposed framework for community detection (Wang et al. 2015). A paradigm called HICODE is proposed to detect hidden community structures in many real-world domains; experiments show that hidden communities exist in networks (He et al. 2018). Community detection acts as a tool for analyzing network data; for example, communities in a social network define the nature of social interactions among people.

Many complex systems and social networks have natural divisions: they can be grouped into clusters with strong connections within the clusters and sparse links between them, known as community structure. In the context of social applications, the web has evolved into a source of information, and different models for analyzing web information have brought intelligence through the automation of web services (Cena et al. 2011). Adomavicius and Tuzhilin (2005) describe the approaches used for recommendation and suggest possible extensions that address their limitations and can enhance the performance of recommender systems, which forward services and contents through the web automatically. Exploring the hidden community structures in a social network is of great significance. To address this problem, graph compression-based community detection algorithms exist (Zhao et al. 2021), in which the number of communities in a compressed social network and their initial community seeds are found simultaneously. A new probabilistic c-means model addresses the heterogeneous properties of a vertex by using attribute and structural similarities; it acts as a fuzzy community detection method that resolves the overlapping community detection problem (Naderipour et al. 2021). For stream graphs, local overlapping communities are detected at the end points of a newly found edge with common communities (Panchal 2021).

Based on a review of empirical studies of the functionality and structure of a variety of networks, the task of community detection gives insight into the core structure of networks. Earlier work focused mainly on statistical characteristics of networks such as clustering, path lengths and degree distributions (Newman 2003). Because of the complexity of their internal structure, these networks are called complex networks. The mathematical models used to represent networks are called graphs. In modern graph theory, the problem of partitioning a graph is also known as community detection (Diestel 2012; Bollobás 1998). Typically, there are two types of graph clustering algorithms: the first type finds condensed regions of nodes within a graph, and the second type clusters different graphs using edges and structural characteristics (Aggarwal and Wang 2010). Proposed solutions include an efficient, scalable algorithm based on recursive shingling and clustering steps that identifies huge dense subnetworks, and a label propagation algorithm that assigns a unique label to each community, requires only linear time and is therefore less expensive (Gibson et al. 2005; Raghavan et al. 2007).

2.2 Modularity optimization

A method for detecting community structure in social and biological networks uses centrality indices to find the boundaries of communities (Girvan and Newman 2002). Modularity as a quality function has certain drawbacks: it may be unable to resolve modules below a scale that depends on the network size and degree. This drawback has been validated in real and artificial biological, technological and social networks (Fortunato and Barthelemy 2007; Wakita and Tsurumi 2007). Modularity is nevertheless widely used because it can automatically detect the optimal number of clusters, for example by constructing a k-nearest neighbor graph and applying distance modularity through a modified Louvain algorithm (Ruan 2009; Shakarian et al. 2013). A high modularity value indicates good-quality partitions and a good community structure. Many modularity maximization methods have been introduced. One hierarchical method that maximizes modularity is the Louvain algorithm (Blondel et al. 2008). On large-scale networks, this algorithm runs very fast, is easy to implement and is also reported to avoid the resolution limit of modularity. Fortunato (2010) recommended it as the best-performing modularity optimization algorithm for community detection.

2.3 Nearest neighbors

Neighborhood graphs model relationships among data points in various fields of machine learning, including clustering, semi-supervised learning and dimensionality reduction. The two popular techniques are the r-neighborhood graph, in which a point is connected to the other points that satisfy a particular value of r, and the k-nearest neighbor graph (kNN), in which a point is connected to its k nearest neighbors. kNN is a popular classification technique (Samanthula et al. 2014; Xu et al. 2018; Wu et al. 2008; Cover and Hart 1967) that is used in different fields: a Voronoi-based kNN approach in spatial databases that outperforms online distance-based methods (Kolahdouzan and Shahabi 2004), gene classification that combines a genetic algorithm with the kNN method (GA/kNN) for assessment (Li et al. 2001), and fault detection using the kNN method (FD-kNN) in semiconductors, developed to handle nonlinearity in operation data (He and Wang 2007).

3 Community detection from business perspective in social networks

Social network analysis is based on community detection, with nodes and edges representing the actors and their social connections, respectively, in a social graph whose nodes are commonly woven densely into highly related groups that are nevertheless separated from each other. A lot of work has been done in the field of social network analysis, and many methods have been proposed (Chunaev 2020). Businesses around the world are growing because of the social media boom: their target audiences join and use these social networks regularly, and businesses have to take advantage of platforms such as Facebook, Twitter or Instagram to reach their highly targeted potential customers. Social media users and customers log into their accounts regularly, with 70 percent of users logging in at least once per day (Pew Research Center 2021), which makes social media the best way of staying on top of customers’ minds with an effective digital marketing strategy.

With Facebook having roughly 2.7 billion active users across about 180 countries and Twitter having 1 billion active users per month worldwide, business owners should understand the relevance of social networks and design their communication strategies accordingly. The rapid growth from personal communities to business communities in online social networks shows that they are a highly cost-effective way of engaging with customers and bring significant value. Targeting the right customers on the right social media platforms should be an integral part of any business plan, with customer behaviors, demographics and trend analysis properly worked into the social media marketing strategy.

3.1 Contributions

The use of social media applications has changed business dynamics tremendously, making it the way forward for the future. This research is part of the new wave of making smarter business decisions by staying as close to customers as possible. Both internal and external communications are crucial for the survival and progress of businesses. The contributions of this research are as follows:

  • The r-neighborhood and k-nearest neighbors methods are used to remove edges from the network.

  • Modularity maximization using a divisive clustering approach is used to detect communities.

  • The Gephi tool is also used to detect communities.

  • The modularity scores obtained with r-neighborhood, k-nearest neighbors and Gephi are compared to determine which technique gives better detection of communities.

  • Knowledge is extracted according to the popularity of the social applications used in each community.

  • The aim is to improve the scope, quality, richness, depth, interactivity and reach of targeted contents using the popularity of social applications in a particular community. Effective decisions can also be taken in different fields, for example improving business by forwarding product contents through the social application with maximum usage in a community. Community detection is performed by maximizing modularity using r-neighborhood, kNN and Gephi, and the results are compared.

4 Methodology

4.1 Research framework

This research considers different social applications with different functionalities, such as transactions, politics, video and profile information, accessed through different mediums including mobile, tablet, computer and iPad for a particular purpose. A set of users of those social applications is considered. The similarity between users is determined using cosine similarity, and a network of similar users is constructed. The r-neighborhood and k-nearest neighbors graphs are constructed by removing unnecessary edges from the user similarity network. Communities are detected in the r-neighborhood and kNN graphs using modularity maximization with a divisive clustering approach. The Gephi tool is also used to detect communities using modularity maximization. Knowledge is extracted by determining which technique gives a better and clearer interpretation of communities.

The metadata consist of two types of data. The first describes each social application by its functionality, purpose, application number and medium of access. The second records which user used which application, identified by the application number. In this paper, we have 32 instances representing different social applications accessed through different mediums for different purposes, and a list of about 324 usages of these applications by 100 users.

4.2 Cosine similarity weighted graph construction

A user-to-user graph is constructed from a cosine similarity matrix that shows how similar users are to each other in terms of their usage of social applications. Consider two usage vectors (1, 1) and (1, 0), where 1 represents usage of an application by a user. The cosine similarity between two users u and v is calculated as (Foreman 2013):

$$\text{Simil}(u, v) = \frac{\text{number of applications used by both } u \text{ and } v}{\sqrt{\text{total applications used by } u} \times \sqrt{\text{total applications used by } v}}$$

For the two vectors above, one application is common, the first user uses two applications and the second user uses one, so

$$\cos(45^{\circ}) = \frac{1}{\sqrt{2} \times \sqrt{1}} = 0.707.$$

In the resulting weighted graph, every pair of users has either a zero or a nonzero cosine similarity value that gives the strength of the edge between them; the matrix of these values is the affinity matrix.
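The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names `cosine_similarity` and `affinity_matrix`, the dictionary layout of the usage data and the three example users are all assumptions made for the example. The diagonal is left at zero, as described for the affinity matrix.

```python
import math


def cosine_similarity(apps_a, apps_b):
    """Cosine similarity between two users, given the applications each one uses."""
    a, b = set(apps_a), set(apps_b)
    if not a or not b:
        return 0.0  # a user with no recorded usage is similar to nobody
    return len(a & b) / (math.sqrt(len(a)) * math.sqrt(len(b)))


def affinity_matrix(usage):
    """Build the user-to-user affinity matrix from {user: set of applications}."""
    users = sorted(usage)
    n = len(users)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # self-similarity is left at 0: no self-edges
                matrix[i][j] = cosine_similarity(usage[users[i]], usage[users[j]])
    return users, matrix


# Hypothetical usage data, chosen to mirror the vectors (1, 1) and (1, 0).
example_usage = {
    "User1": {"Twitter", "YouTube"},
    "User2": {"Twitter"},
    "User3": {"Facebook"},
}
users, S = affinity_matrix(example_usage)
print(round(S[0][1], 3))
```

For User1 and User2 this reproduces the value 0.707 from the worked example, while User3 shares no application with the others and gets an affinity of zero.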

4.3 r-Neighborhood graph construction

An r-neighborhood graph on a vertex set V connects each node v ∈ V to the nodes in V that are similar to it under a given similarity measure, here cosine similarity. For a given set of points x1, x2, x3, …, xn, the r-neighborhood graph Gn,r is described by an adjacency matrix A that keeps only edges of a certain strength: for an edge from point xi to xj, Aij is 1 if Simil (xi, xj) ≥ r and 0 otherwise, for all 1 ≤ i, j ≤ n, i ≠ j. In this case, the r-neighborhood graph is produced for r = 0.5, so edges between users with similarity less than 0.5 are removed.
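The thresholding rule can be illustrated with a short Python sketch. The function name `r_neighborhood_adjacency` and the small similarity matrix are hypothetical and serve only to show the rule Aij = 1 if Simil (xi, xj) ≥ r.

```python
def r_neighborhood_adjacency(similarity, r=0.5):
    """Keep an edge (value 1) only where the similarity reaches the threshold r."""
    n = len(similarity)
    return [
        [1 if i != j and similarity[i][j] >= r else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-user affinity matrix.
S = [
    [0.0, 0.707, 0.0],
    [0.707, 0.0, 0.4],
    [0.0, 0.4, 0.0],
]
A = r_neighborhood_adjacency(S, r=0.5)
print(A)
```

In this toy matrix the third user ends up with no edges, because its only nonzero affinity (0.4) is below r = 0.5.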

4.4 k-Nearest neighbors graph construction

In a k-nearest neighbors graph, each node is connected to its k nearest neighbors. Given a set of nodes P, the kNN graph is G (P, E), where E = {(u, v, Simil (u, v)) : v ∈ NN(u)simil} and NN(u)simil is the set of k nearest neighbors of each u ∈ P under the similarity measure. In this case k = 5: we construct the 5NN graph from the affinity matrix, keeping for each node the five outgoing edges with the highest affinities. The adjacency matrix is generated from the affinity matrix: if l denotes the fifth highest affinity of user u, then Auv is 1 if Simil (u, v) ≥ l and 0 otherwise, for all 1 ≤ u, v ≤ n, u ≠ v.
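A minimal Python sketch of this row-wise rule follows. The function name `knn_adjacency` and the example matrix are hypothetical. Two details are assumptions of the sketch rather than statements from the text: a pair with zero affinity never receives an edge, and ties at the k-th affinity are all kept, so a row may contain more than k ones. The resulting matrix is directed, since u may be among the nearest neighbors of v without the converse holding.

```python
def knn_adjacency(similarity, k=5):
    """Row-wise kNN adjacency: A[u][v] = 1 if Simil(u, v) >= k-th highest affinity of u."""
    n = len(similarity)
    adjacency = [[0] * n for _ in range(n)]
    for u in range(n):
        others = [similarity[u][v] for v in range(n) if v != u]
        if not others:
            continue
        # l: the k-th highest affinity in row u (the smallest one if fewer than k exist)
        threshold = sorted(others, reverse=True)[min(k, len(others)) - 1]
        for v in range(n):
            # Assumption: zero-affinity pairs are never linked, even if threshold is 0.
            if v != u and similarity[u][v] >= threshold and similarity[u][v] > 0:
                adjacency[u][v] = 1
    return adjacency


# Hypothetical 4-user affinity matrix, pruned with k = 2 to keep the example small.
S = [
    [0.0, 0.9, 0.5, 0.1],
    [0.9, 0.0, 0.2, 0.8],
    [0.5, 0.2, 0.0, 0.3],
    [0.1, 0.8, 0.3, 0.0],
]
A = knn_adjacency(S, k=2)
print(A)
```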

4.5 Modularity maximization using divisive clustering

Modularity maximization using divisive clustering is used for community detection. This method assigns a score to each pair of nodes in the r-neighborhood network. Divisive clustering splits the graph into two communities and uses an optimization algorithm to try different community assignments in order to obtain the maximum modularity score. The two communities are further divided into four, and so on, until the modularity score stops increasing, which gives the optimal communities. Mathematically,

$$Q = \sum_{c_{i} \in C} \left[ \frac{\left| E_{c_{i}}^{\text{in}} \right|}{\left| E \right|} - \left( \frac{2\left| E_{c_{i}}^{\text{in}} \right| + \left| E_{c_{i}}^{\text{out}} \right|}{2\left| E \right|} \right)^{2} \right] \tag{1}$$

In the above equation, C represents the set of all communities and ci refers to a particular community; \(\left|E_{c_{i}}^{\text{in}}\right|\) is the number of edges between nodes inside community ci, \(\left|E_{c_{i}}^{\text{out}}\right|\) is the number of links from ci to nodes of other communities, and \(\left|E\right|\) is the total edge count of the network.
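To make Eq. (1) concrete, the following Python sketch evaluates Q for a given community assignment on an undirected, unweighted graph. The function name `modularity` and the toy graph (two triangles joined by a single edge) are assumptions for illustration and are not taken from the data of this study.

```python
def modularity(adjacency, communities):
    """Evaluate Eq. (1) for a symmetric 0/1 adjacency matrix and one label per node."""
    n = len(adjacency)
    total_edges = sum(adjacency[i][j] for i in range(n) for j in range(i + 1, n))
    if total_edges == 0:
        return 0.0
    q = 0.0
    for c in set(communities):
        e_in = e_out = 0
        for i in range(n):
            for j in range(i + 1, n):
                if adjacency[i][j]:
                    ends_inside = (communities[i] == c) + (communities[j] == c)
                    if ends_inside == 2:
                        e_in += 1   # both endpoints in c
                    elif ends_inside == 1:
                        e_out += 1  # edge leaving c
        q += e_in / total_edges - ((2 * e_in + e_out) / (2 * total_edges)) ** 2
    return q


# Toy graph: triangle {0, 1, 2}, triangle {3, 4, 5}, and one bridge edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

print(round(modularity(A, [0, 0, 0, 1, 1, 1]), 3))
```

Each triangle has 3 internal edges and 1 outgoing edge out of 7 edges in total, so Q = 2 × (3/7 − (7/14)²) = 5/14 ≈ 0.357. Placing all six nodes in a single community gives Q = 0.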

4.6 Knowledge extraction

Knowledge is extracted by determining the social applications with maximum usage in a particular community, so that targeted contents can be forwarded to that community through those popular applications. Modularity maximization for community detection is also performed using the Gephi tool. The knowledge extracted with Gephi is compared with that from the r-neighborhood and k-nearest neighbors techniques for the same purpose of determining application popularity.

5 Results and discussion

This section presents the results and analysis. Table 1 lists 32 instances of social applications, including Twitter, WeChat, YouTube, Facebook, Instagram, WhatsApp and Skype, accessed for different purposes such as news sharing, merchandise, entertainment, marketing, brands information, educational/professional use and greetings/personal use.

Table 1 Social applications

For simplicity, Table 2 presents the metadata of only 13 randomly chosen users from the list of 100 users accessing social applications. User1 and User2 use Twitter through mobile, tablet and computer, and User10 accesses YouTube. User100 uses WeChat, Twitter, Facebook, YouTube and Instagram, like other users.

Table 2 List of users

Table 3 presents the user-to-user cosine similarity matrix, which shows user similarity in the range 0–1, with 1 being the highest similarity in terms of application usage. Every user has maximum similarity to himself, but our interest is in a graph of users who are similar to one another, not to themselves, so the diagonal values are set to 0. The other values show how similar the application usage of two users is. A value of 0.5 or above indicates that the two users share roughly half or more of their applications, measured against the geometric mean of their application counts.

Table 3 Cosine similarity weighted matrix

The social applications network of users, built from the affinity matrix, is shown in Fig. 1. Nodes representing users are connected whenever their value in the affinity matrix is above zero, and no node is connected to itself because the diagonal values are set to zero, representing no edge. The usage of applications by different users translates into thousands of edges. Even a single application common to two users, which gives a very small cosine similarity, is shown as an edge in the network.

Fig. 1 Social applications similarity network of users

5.1 r-Neighborhood graph construction

To produce the r-neighborhood graph, an adjacency matrix is created from the user similarity matrix, as shown in Table 4. It comprises only edges of a certain strength and excludes small cosine similarity values that may arise from the incidental usage of a single application. In this case r = 0.5, which retains the 20 percent of relationships among users that have the highest affinities. Values of at least r = 0.5 are set to 1, and values below are set to 0. A value of 1 represents an edge between users, whereas 0 means that no edge exists in the adjacency matrix.

Table 4 Adjacency matrix

Fig. 2 shows the network after the removal of unnecessary edges from the similarity network of users for r = 0.5. The number of edges is reduced, and one user is isolated: all of its similarity values are below r = 0.5, so it has no edges to other users.

Fig. 2 r-neighborhood graph

Communities are detected using modularity maximization by divisive clustering. We assign a score to each pair of nodes in the r-neighborhood graph, as shown in Table 5, and then perform divisive clustering until the modularity score stops increasing. A negative score indicates that two users do not share an edge, and placing them in the same community lowers the modularity score. A positive score indicates that they share an edge and are similar, and placing them in the same community raises the modularity score.

Table 5 Scores for pair of nodes

The modularity maximization problem is approached using divisive clustering by partitioning the graph. The modularity score of every user is calculated from Table 5: if a user is assigned to community 1, the scores in that user's row are added for all users who are also assigned to community 1. The total score is calculated by summing all the modularity scores in Table 6 and normalizing the sum by the total stub count of the network. The upper limits give the modularity score of each community: upper limit 1 is the modularity score of community 1, and upper limit 2 is the modularity score of community 2. The first division of the graph results in community 1 and community 0 with a total normalized modularity score of 0.464, as shown in Table 6.
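The scoring and the first division can be sketched in Python as follows. The sketch assumes that the pairwise scores take the standard modularity-matrix form, Aij minus (degree of i × degree of j) / (total stub count), which matches the sign behavior and the stub-count normalization described above but is not spelled out in the text. The optimizer used in this study is not specified either; the exhaustive search in `best_bisection` is a stand-in that is only practical for very small graphs. All function names and the toy graph are hypothetical.

```python
from itertools import product


def modularity_scores(adjacency):
    """Pairwise scores (assumed form): A[i][j] - deg(i) * deg(j) / stubs."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    stubs = sum(degrees)  # total stub count = 2 * number of edges
    if stubs == 0:
        raise ValueError("graph has no edges")
    scores = [
        [adjacency[i][j] - degrees[i] * degrees[j] / stubs for j in range(n)]
        for i in range(n)
    ]
    return scores, stubs


def normalized_modularity(scores, stubs, assignment):
    """Sum the scores of all pairs placed in the same community, divided by stubs."""
    n = len(scores)
    total = sum(
        scores[i][j]
        for i in range(n)
        for j in range(n)
        if assignment[i] == assignment[j]
    )
    return total / stubs


def best_bisection(scores, stubs):
    """Exhaustive search over 0/1 assignments (stand-in for the real optimizer)."""
    n = len(scores)
    best_assignment, best_q = None, float("-inf")
    for rest in product((0, 1), repeat=n - 1):
        assignment = (0,) + rest  # fix node 0 to avoid mirrored duplicates
        q = normalized_modularity(scores, stubs, assignment)
        if q > best_q:
            best_assignment, best_q = assignment, q
    return best_assignment, best_q


# Toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

B, stubs = modularity_scores(A)
split, q = best_bisection(B, stubs)
print(split, round(q, 3))
```

On this toy graph the search separates the two triangles. A further division would repeat the same search within each community, keeping a split only if the total normalized score increases.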

Table 6 Community assignment on first division

To maximize the modularity score, we split the communities further. Table 7 shows the further division of users into communities; the first partition of users is presented in the last assignment column, and the total modularity score increases from 0.464 to 0.554. The modularity score after the second division is calculated in the same way, by adding the scores of users placed in the same community. Two users are in the same community only if both their last and their current community assignments are the same, as for User100 and User11 in Table 7. In contrast, User10 and User100 are both currently assigned to community 1, but their last community assignments differ, so they are in different communities. The modularity score after the third division is calculated in the same way.

Table 7 Community assignment on second division

In Table 8, the users are divided further, but the total modularity score remains the same, so the communities are not divided any further. The second partition is shown as last Assignment1 in Table 8. To analyze the assignment of users to communities, the assignments are first encoded as decimal numbers, as shown in Table 9, where Div1 and Div2 represent the first and second community assignments.
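One way to perform such an encoding is to read the sequence of binary assignments of each user as a binary number. The sketch below assumes this scheme (code = 2 × Div1 + Div2 for two divisions); the exact coding used in Table 9 is not given in the text, so the function name `encode_communities` and the example values are illustrative only.

```python
def encode_communities(divisions):
    """Combine per-division 0/1 assignments into one decimal label per user.

    `divisions` is a list such as [div1, div2], each a list of 0/1 values with
    one entry per user. The first division is treated as the most significant bit.
    """
    labels = []
    for bits in zip(*divisions):
        code = 0
        for bit in bits:
            code = code * 2 + bit
        labels.append(code)
    return labels


# Hypothetical assignments of four users after two divisions.
div1 = [0, 0, 1, 1]
div2 = [0, 1, 0, 1]
labels = encode_communities([div1, div2])
print(labels)
```

Two users receive the same label only if they were placed together in every division, which is the condition for belonging to the same final community.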

Table 8 Community assignment on third division
Table 9 Coding communities

5.2 Knowledge extraction

To analyze the trends of social applications in different communities, we determine the popularity of social applications by counting the number of times each application is accessed by the users of every community. The partitioning of the graph results in a total of four communities, represented by C1–C4 in Table 10. Facebook is the most popular social application in the first community C1, Twitter has the highest values in the second community C2, and YouTube is the popular application in community C3. Community four has no clear profile, but its users access Facebook frequently along with other social applications.
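The counting step can be sketched with the Python standard library. The function name `application_popularity`, the (user, application) record layout and the example records are hypothetical and do not reproduce the data of Table 10.

```python
from collections import Counter, defaultdict


def application_popularity(usages, community_of):
    """Count how often each application is accessed within each community."""
    popularity = defaultdict(Counter)
    for user, application in usages:
        popularity[community_of[user]][application] += 1
    return popularity


# Hypothetical usage records and community labels.
usages = [
    ("User1", "Twitter"), ("User1", "Twitter"), ("User1", "YouTube"),
    ("User2", "Twitter"),
    ("User3", "YouTube"), ("User3", "Facebook"), ("User3", "YouTube"),
]
community_of = {"User1": "C2", "User2": "C2", "User3": "C3"}

popularity = application_popularity(usages, community_of)
for community in sorted(popularity):
    print(community, popularity[community].most_common(1))
```

The most frequent application of each community then indicates the channel through which targeted contents could be forwarded to that community.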

Table 10 Social applications popularity using r-neighborhood

Figs. 3, 4, 5 and 6 can be compared against Table 10. Facebook is used in community one, to which only a single member is assigned, as shown in the r-neighborhood graph, whereas community two is a Twitter community, which can be interpreted as a political community. The members of community three are fond of YouTube and are not interested in politics, so it can be named an entertainment community. The last community is not quite clear, but most of its members use Facebook and Instagram more than the rest of the applications. The members of this community are found to be socially interactive.

Fig. 3 Social applications popularity

Fig. 4 Social applications popularity

Fig. 5 Social applications popularity

Fig. 6 Social applications popularity

5.3 Community detection using gephi

In Table 11, communities are detected from the r-neighborhood adjacency matrix, with modularity maximization performed using the Gephi tool. Five different communities, numbered 0 to 4, are detected, but for simplicity the users presented in Table 11 are those assigned to communities zero to three. The total modularity score using Gephi is 0.555, whereas the modularity score was 0.554 using divisive clustering.

Table 11 Community assignment Using Gephi

The community assignment of users is shown in Fig. 7, where users are divided into five different communities. Red nodes are the members of community one, and the orange mesh represents community two. Community three users are shown in pink and community four users in gray. The single member of community five is the blue node; as Table 12 shows, this member uses only a few applications.

Fig. 7 Community assignment of users

Table 12 Social applications popularity using gephi

5.4 Knowledge extraction using Gephi

Knowledge is extracted by determining the popularity of social applications in the assigned communities, presented in Table 12. There are five communities showing social applications usage. Twitter is accessed in community one. YouTube and Facebook are the popular social applications in communities two, three and four, along with other social applications. Community five is assigned a single member who uses Facebook, Instagram and WeChat.

In Fig. 8, Facebook, YouTube and Twitter are clearly the main social applications used across all five communities, which can be compared against Table 12. The results are similar to those obtained before: community one can be regarded as a Twitter community following politics, and community two is a YouTube community. In communities three and four, Facebook is the most common social application, followed by Skype and WhatsApp.

Fig. 8 Social applications popularity

5.5 kNN graph construction

Different values of k can be evaluated, but in this research we compare k-nearest neighbors with r-neighborhood and the Gephi tool. Since r-neighborhood and Gephi resulted in a maximum of five different communities, a graph for k = 5 is constructed. Table 13 shows the 5NN adjacency matrix, in which each user's row has a value of 1 for the five users with the highest affinities in the corresponding row of the similarity matrix; these are considered the nearest neighbors of that user.

Table 13 5NN adjacency matrix

The kNN graph for k = 5, called the 5NN graph, is constructed from the user similarity graph using the adjacency matrix: edges are removed from each node until it is left with the five nearest neighbors that have the highest affinities, as shown in Fig. 9.

Fig. 9 5NN graph

Table 14 shows the assignment of users to five different communities, numbered 0 to 4, using the kNN graph, with a modularity score of 0.581.

Table 14 Community assignment of 5NN graph

The community assignment of the 5NN graph is visualized in Fig. 10, where users are divided into five different communities. Red nodes represent community zero, and the users of community one are shown in pink. Yellow nodes represent the users of community two, community three is shown in gray, and blue nodes belong to community four.

Fig. 10 Community assignment of users

5.6 Knowledge extraction using 5NN graph

The five communities are presented in Table 15 for knowledge extraction, showing for each community which applications are popular relative to the others. Community one is the Twitter community. Facebook has the highest usage in communities two and three, whereas YouTube is popular in communities four and five.

Table 15 Social applications popularity using kNN
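Popularity values of this kind are obtained by counting how often each application is accessed by the users of a community. A minimal sketch of that counting step is given below; the record layout (one user and application pair per access), the function names and the sample data are assumptions made for illustration and do not reproduce the dataset of this study.

```python
from collections import Counter, defaultdict


def application_popularity(access_log, community_of):
    """Count accesses per application within each community.

    `access_log` is an iterable of (user, application) pairs and
    `community_of` maps each user to a community label.
    """
    popularity = defaultdict(Counter)
    for user, application in access_log:
        popularity[community_of[user]][application] += 1
    return popularity


def most_popular(popularity):
    """Return the most accessed application of every community."""
    return {community: counts.most_common(1)[0][0]
            for community, counts in popularity.items()}


# Hypothetical access records and community labels.
toy_log = [
    ("u1", "Twitter"), ("u1", "Twitter"), ("u2", "Twitter"),
    ("u2", "Facebook"), ("u3", "YouTube"), ("u4", "YouTube"),
    ("u4", "Facebook"),
]
toy_communities = {"u1": 0, "u2": 0, "u3": 1, "u4": 1}
toy_popularity = application_popularity(toy_log, toy_communities)
toy_top = most_popular(toy_popularity)
```

In this toy example community 0 is dominated by Twitter and community 1 by YouTube, which mirrors the way a community is labeled by its most accessed application.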

Table 15 shows the social applications used in each community. The knowledge extracted is illustrated in the figures that follow, which can be compared against Table 15.

Figs. 11, 12, 13, 14 and 15 show that community one is again a Twitter community. In communities two and three, most users are Facebook followers. Communities four and five are clearly entertainment and social communities, with YouTube and Facebook accessed more than other applications.

Fig. 11 Social applications popularity

Fig. 12 Social applications popularity

Fig. 13 Social applications popularity

Fig. 14 Social applications popularity

Fig. 15 Social applications popularity

The social application popularity obtained for the different communities is almost the same across methods, which strengthens the knowledge extracted. All three methods identify the most-used social applications in each community, which can help in forwarding information, advertisements, services and recommendations to a particular community through that application. Comparing r-neighborhood, kNN and the Gephi tool, the kNN method has the highest modularity score, 0.581, and gives a better and clearer interpretation of the detected communities. The kNN technique identifies communities one, three, four and five as Twitter, Facebook and YouTube communities; only community two is blurred, although Facebook is still observed to be its most widely used application. The modularity score obtained with the Gephi tool is 0.555, which is lower than that of kNN, and the detected communities are correspondingly less clear. The users of communities one and two use Twitter and YouTube as their major applications, but the other three communities show no clear preference for any particular social application, as can be seen in Table 12. The modularity score obtained with r-neighborhood is 0.554, and four communities are detected. Among these four, community two is defined as a Twitter community, while the other three do not define social application popularity as clearly as the kNN communities do.

6 Conclusions

This paper extracts knowledge by detecting communities in a social applications network and determining the popularity of social applications within each community; using this popularity, content related to politics, business products, sports, movies and other services can be forwarded. Communities are detected using modularity maximization by divisive clustering. Once users are assigned to communities, the popularity of social applications is determined by counting the number of times each social application is accessed by the users of that community. Community detection using r-neighborhood, k-nearest neighbors and Gephi gives similar results: Twitter, YouTube and Facebook are the most common social applications across the communities, which suggests that the people in these communities are mostly political, entertainment-oriented and social in nature. Targeted content can be forwarded to these communities through those social applications. Also, one community may not be well aware of another because of the difference in core functionality; for example, YouTube followers may not be very active regarding the political changes taking place in the world. Thus, by knowing the social application popularity in a community, people can be informed by sharing news on YouTube and Facebook, advertising business products on Twitter and YouTube, and forwarding sports and movie content on Facebook and Twitter.