1 Introduction

Data analysis and data analytics are inherently aimed at revealing and describing interpretable and stable relationships among variables, as well as quantifying their changes over time and space. With large volumes of diverse data comes a genuine need for a flexible, user-centric, and computationally efficient environment that produces meaningful results.

The key research hypothesis is that, in realizing the above agenda of data analytics, the concept of a multiview perspective [1,2,3,4] on data built with information granules plays a pivotal role at both the methodological and the algorithmic level of the ensuing constructs. The multiview organization of data processing contributes in a tangible way to the efficient solution of a spectrum of data analysis tasks; in particular, it facilitates a thorough user-centric interpretation of results and produces readable yet fully legitimate outcomes supported by the available experimental evidence. The varying (adjustable) perspective delivered by information granules helps establish a sound tradeoff between the representation capabilities of the individual views of the data and the efficiency of the fundamental tasks of data science, such as association analysis, classification, prediction, and link analysis.

Another important research hypothesis is that by engaging a multiview perspective on the same data, we establish a coherent and holistic view of the data and of the ensuing models. The data are represented through a collection of information granules; the diversity of these granules stems from the fact that they are built on subsets of data and subsets of features, with the quality of each granule being assessed.

The term multiview data analysis has been used in the literature before; however, it comes with a different meaning. The study reported in [5] offers an interesting view focused on feature selection. In our case, the multiview character of information granules concerns the perspective established by the mutual organization of sections of the data and subsets of features. Furthermore, the multiview is formed in the conceptually appealing and computationally sound setting of information granules.

This study pursues a number of well-focused research aims, which subsequently lead to the formulation of a coherent and comprehensive methodological framework:

(i) A multiview formation of data subspaces leading to dimensionality reduction, enhanced readability (interpretability) of the data, and increased efficiency of the ensuing analysis (e.g., prediction, association analysis, or classification). The varying (adjustable) levels of detail captured by the individual views (perspectives) help reduce the computing overhead of the individual optimization tasks.

(ii) The multiview facets built for the data also concern the granulation of the feature space, leading to so-called meta-features (viz., collections of features that exhibit some semantics and offer a view of the data at a higher level of abstraction). The two categories of views outlined in (i)–(ii) give rise to information granules of meta-features and to information granules established in the joint data-feature space. Each view (perspective) yields its own focused perception of the same data and of the results produced in this setting.

(iii) Formation of optimization criteria quantifying the quality and practical relevance of the multiview perspective on the data. The essential criteria fall under the umbrella of the representation capabilities of the data (commonly linked to the inevitable compression error) and the relevance of the established cognitive perspective in solving the main categories of data analysis problems. An important and intriguing question is how to balance these two requirements and cope with their conflicting nature (higher representation capabilities do not directly translate into more efficient and computationally sound data analysis).

(iv) While numeric prototypes are sound initial descriptors of segments of data and features, they are elevated to granular counterparts, which in turn offer better abstract and holistic descriptors of the data.

The ultimate objective is to derive structural information [6] in the data and feature (attribute) space and to construct information granules on combinations of subsets of data and features. Their quality is evaluated in light of various criteria, depending on the further use of the information granules in system modeling (classification and prediction) and data representation. The constructed information granules are ranked with respect to the pertinent performance criteria (reconstruction-based, prediction-oriented, or classification-based). An overall scheme of processing, showing the way of moving from data to information granules, is displayed in Fig. 1; the main phases are highlighted along with the numeric and granular descriptors. The scheme also entails a significant level of originality, as the comprehensive concept and its algorithmic environment have not been investigated before.

Fig. 1 Overall processing scheme: from data to information granules; building data and feature views

The data-feature segmentation can be concisely captured in the following way:

$$\left( {D_{1} ,F_{1} } \right) \ldots \left( {D_{i} ,F_{j} } \right),\quad i = 1,2, \ldots ,c;\quad j = 1,2, \ldots ,r,$$
(1)

where the data and feature sets are exhaustive and mutually exclusive, namely

$$\begin{array}{*{20}c} {\mathop \cup \limits_{i = 1}^{c} D_{i} = \varvec{D},\quad D_{i} \cap D_{j} = \emptyset ,\;i \ne j} \\ {\mathop \cup \limits_{i = 1}^{r} F_{i} = \varvec{F},\quad F_{i} \cap F_{j} = \emptyset ,\;i \ne j} \\ \end{array},$$
(2)

where c and r are the numbers of segments present in the data and feature spaces, respectively.

The study offers some original insights into the problem of data description that have not been studied in the past: (i) the development of information granules in the data and feature space delivers a new, focused view of the essence of the overall data set; (ii) the ensuing information granules, built on the basis of numeric prototypes, establish a so-called granular blueprint of the data and help focus on the essence of the relationships present there; and (iii) classifiers and predictors are constructed at the granular level by engaging information granules as the backbone of such constructs.

To systematically organize the presentation of the concepts and their construction, the paper is structured as follows. Section 2 elaborates on the development of subsets of data and features (data views) with the use of fuzzy clustering, more specifically Fuzzy C-Means (FCM) [7]. Subsequently, in Sect. 3, these views are characterized through several performance indexes, namely the reconstruction error and the classification and prediction content of the numeric representatives of the data views. Section 4 is devoted to the construction of information granules through the principle of justifiable granularity. Information granules form a blueprint of classifiers and predictors; these topics are covered in Sect. 5. Experiments using publicly available data are presented in Sect. 6. Conclusions and directions of future research are included in Sect. 7.

2 Development of Subsets of Data (Clusters) Through Clustering Completed in Data Space and Feature Space

Information granules are commonly constructed with the help of clustering techniques [8], regarded as a prerequisite design vehicle. Clustering is a sound departure point for further constructs. Here, we consider the Fuzzy C-Means (FCM) algorithm as a representative clustering vehicle. While FCM is commonly used to cluster data, it can also be used to cluster features, viz., to build collections of features. In what follows, we consider patterns (data) x1, x2,…, xN expressed in the n-dimensional space of real numbers Rn. Recall that clustering realized by the FCM algorithm returns a collection of prototypes and a partition matrix. The number of clusters in the data space is set to c, and the number of clusters in the feature space is set to r. In what follows, we recall the essence of building data segments and feature segments.

2.1 Clustering in the Data Space

The FCM is guided by the following well-known objective function:

$$Q = \sum\limits_{i = 1}^{c} {\sum\nolimits_{k = 1}^{N} {u_{ik}^{m} } } ||{\varvec{x}}_{k} - {\varvec{v}}_{i} ||^{2} ,$$
(3)

where c stands for the number of clusters, m is a fuzzification coefficient (m > 1), and ||.|| is a weighted Euclidean distance [9, 10], namely \(||\varvec{a} - \varvec{b}||^{2} = \sum\nolimits_{j = 1}^{n} {\frac{{(a_{j} - b_{j} )^{2} }}{{\sigma_{j}^{2} }}}\), dim(a) = n, with the weights being the standard deviations of the corresponding variables. The optimization (viz., partitioning the data) carried out in the data space is realized iteratively: one starts with a randomly initialized partition matrix U and then updates the parameters being optimized, viz., the partition matrix and the collection of prototypes v1, v2,…, vc.
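A minimal sketch of this iteration is given below (Python with NumPy). The function name, the random initialization scheme, and the convergence tolerance are our assumptions, not part of the study; the membership and prototype updates are the standard FCM formulas associated with the objective function (3).

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Sketch of FCM minimizing Eq. (3) with the weighted Euclidean distance.

    X: (N, n) data matrix; c: number of clusters; m: fuzzification coefficient (m > 1).
    Returns prototypes V (c x n) and fuzzy partition matrix U (c x N).
    """
    rng = np.random.default_rng(seed)
    sigma2 = X.var(axis=0)                       # weights: variances of the variables
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                           # each column of U sums to 1
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)              # prototype update
        d2 = (((X[None, :, :] - V[:, None, :]) ** 2) / sigma2).sum(axis=2)
        inv = np.fmax(d2, 1e-12) ** (-1.0 / (m - 1))
        U_new = inv / inv.sum(axis=0)                             # membership update
        if np.abs(U_new - U).max() < tol:
            return V, U_new
        U = U_new
    return V, U

# Hardening (next paragraph): each datum joins the cluster of maximal membership
# labels_D = np.argmax(U, axis=0)
```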

Note that the partition matrix is fuzzy, viz., its entries assume values between 0 and 1. In other words, the fuzzy sets formed by the FCM embrace almost all data, but to some degrees of membership [11]. To form the constructed subsets of data, we make them Boolean (two-valued) by admitting those data which belong to the ith cluster to the highest extent, viz., Di = {xk | uik = maxj=1,2,…,c ujk}.

In conclusion, one can regard the prototypes vi as concise numeric descriptors of Di. The prototypes are a direct manifestation of the data composing the clusters and, as such, are the most meaningful outcomes of fuzzy clustering to be used in further investigations.

2.2 Clustering in the Feature Space

When it comes to revealing structure in the feature space, we reformulate the problem and reconsider the objects subject to clustering. Let us organize the original data into vectors positioned in RN, namely z1, z2,…, zn, where zj = [xj1 xj2 … xjN], j = 1, 2,…, n.

The objective function guiding the process of clustering the features (thus building subsets of features) is expressed as follows (note that the summation runs over the r feature clusters):

$$Q = \sum\limits_{i = 1}^{r} {\sum\limits_{j = 1}^{n} {g_{ij}^{m} } } ||\varvec{z}_{j} - \varvec{t}_{i} ||^{2} .$$
(4)

Here, the distance is expressed as follows:

$$||\varvec{z}_{j} - \varvec{t}_{i} ||^{2} = \sum\limits_{k = 1}^{N} {(z_{jk} - t_{ik} )^{2} } .$$
(5)

The partition matrix G conveys crucial information about the subsets of features forming the so-called meta-features. The features belonging to the jth cluster to the highest extent are denoted by Fj. By identifying the maximal elements of the partition matrix G, the subsets Fj, Fl, etc., are rendered mutually disjoint.

In summary, the results of clustering completed in the data and feature spaces come as data sets and feature sets. We form all possible combinations of the subsets produced by the clustering completed in the data space and in the feature space. For instance, the subset (Di, Fj) describes the data belonging to Di restricted to the features belonging to Fj. Having c and r clusters in the data and feature space, respectively, we obtain cr subsets (segments) in the Cartesian product of these two spaces. In what follows, we evaluate the quality of such subsets; by computing the pertinent measures, one can order the subsets and evaluate their distribution.
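The bookkeeping behind these combinations is straightforward; a short sketch follows (the variable names are ours, with labels_D and labels_F denoting the hardened cluster assignments in the data and feature spaces):

```python
import numpy as np

def data_feature_segments(X, labels_D, labels_F, c, r):
    """Enumerate the cr segments (Di, Fj): rows of X falling in Di,
    restricted to the columns (features) falling in Fj."""
    segments = {}
    for i in range(c):
        rows = np.flatnonzero(labels_D == i)
        for j in range(r):
            cols = np.flatnonzero(labels_F == j)
            segments[(i, j)] = X[np.ix_(rows, cols)]
    return segments
```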

3 Characterization of Data Views (Di, Fj)

The performance of each cluster (data view) can be evaluated in various ways. Depending on the application, there are three main indexes to consider: (i) reconstruction error, (ii) classification content, and (iii) prediction capabilities.

First, we elaborate on the reconstruction criterion. In total, there are cr information granules (clusters), each associated with a reconstruction error. The results obtained for the corresponding clusters are arranged in a matrix collecting the results for all combinations of Di and Fj.

3.1 Reconstruction Error

Denote by Vij the reconstruction error produced for (Di, Fj), i = 1, 2,…, c; j = 1, 2,…, r. This error expresses the representation capabilities of the prototype vij associated with (Di, Fj) and is computed as follows:

$$V_{ij} = \frac{1}{{{\text{card}}({\varvec{D}}_{i} )}}\frac{1}{{{\text{card}}({\varvec{F}}_{j} )}}\sum\limits_{\begin{subarray}{l} k = 1 \\ {\varvec{x}}_{k} \in D_{i} \end{subarray} }^{N} {||{\varvec{x}}_{k} - {\varvec{v}}_{ij} ||_{{F_{j} }}^{2} } .$$
(6)

As in the clustering algorithm, the distance ||.|| is the weighted Euclidean distance involving the standard deviations of the variables; let us emphasize that the calculations are completed only for the features forming Fj. The prototype vij appearing in (6) is computed coordinatewise as follows:

$$v_{ij,l} = \frac{1}{{N_{i} }}\mathop \sum \limits_{{\varvec{x}_{k} \in D_{i} }} x_{kl} ,$$
(7)

where l runs through the indexes of the features forming Fj; obviously, the coordinates of vij correspond to the variables forming Fj. We organize the reconstruction errors into a c by r matrix containing the values Vij. Furthermore, the values of Vij can be arranged in increasing order, thus ranking the numeric descriptors starting from the most relevant ones (viz., those with the smallest values of this criterion).
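A sketch of the computation of the matrix [Vij] under these definitions follows. The text does not fully pin down whether the weighting standard deviations are taken globally or within each segment; the within-segment variant used here is an assumption, as is the function name.

```python
import numpy as np

def reconstruction_errors(X, labels_D, labels_F, c, r):
    """Matrix of Vij, Eq. (6): mean weighted squared distance of the data of Di
    to the prototype vij (Eq. 7), computed over the features of Fj and
    normalized by card(Di) and card(Fj)."""
    V = np.zeros((c, r))
    for i in range(c):
        Di = X[labels_D == i]
        for j in range(r):
            block = Di[:, labels_F == j]
            v_ij = block.mean(axis=0)                      # Eq. (7)
            sigma2 = np.fmax(block.var(axis=0), 1e-12)     # assumed within-segment weights
            d2 = ((block - v_ij) ** 2 / sigma2).sum(axis=1)
            V[i, j] = d2.mean() / block.shape[1]           # 1/card(Di) * 1/card(Fj)
    return V
```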

Furthermore, depending on the nature of the data under consideration, the quality of the information granules can be assessed through their discriminatory or predictive content (abilities).

3.2 Classification Content of Information Granules

When dealing with a classification problem, one determines the class content of an information granule. Consider that the classification problem involves t classes ω1, ω2,…, ωt. The quality of the information granule (cluster) formed by (Di, Fj) is assessed by looking at the distribution of the data in (Di, Fj) across the classes. Since the class content depends only on Di, the result is the same across all columns of a given row of the matrix of segments.

We then calculate the probabilities of the classes present in this information granule, pi = [pi1 pi2 … pit], i = 1, 2,…, c. The less homogeneous the information granule is, the higher its vagueness becomes. The quantification is realized by means of the entropy measure [12] defined as follows:

$$h\left( u \right) = \left\{ {\begin{array}{*{20}l} {2u,u \in \left[ {0,1/2} \right]} \\ {2\left( {1 - u} \right),u \in \left[ {1/2,1} \right]} \\ \end{array} } \right..$$
(8)

The vagueness of the ith granule is expressed as follows:

$$C_{i} = \mathop \sum \limits_{l = 1}^{t} h\left( {p_{il} } \right).$$
(9)
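A sketch of the computation of pi and Ci is given below, assuming integer class labels 0,…, t−1 (the function name and this encoding are ours):

```python
import numpy as np

def classification_content(labels_D, y, c, t):
    """Class probabilities pi and vagueness Ci per data cluster, Eqs. (8)-(9)."""
    h = lambda u: np.where(u <= 0.5, 2.0 * u, 2.0 * (1.0 - u))   # entropy of Eq. (8)
    P = np.zeros((c, t))
    for i in range(c):
        yi = y[labels_D == i]
        if yi.size:
            P[i] = np.bincount(yi, minlength=t) / yi.size        # class frequencies
    C = h(P).sum(axis=1)                                         # Eq. (9)
    return P, C
```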

3.3 Predictive Content of Information Granules

The predictive content of the information granules is assessed by looking at the diversity of the output data falling within the bounds of the granules. The diversity is quantified by the variance of the output variable over the data falling within the bounds of (Di, Fj). In more detail, recall that the data come in the form (xk, yk), where yk is the output variable. We calculate the variance

$$\sigma_{iy}^{2} = \frac{1}{{N_{i} - 1}}\mathop \sum \limits_{{\left( {x_{k} ,y_{k} } \right) \in D_{i} }} \left( {y_{k} - \bar{y}_{i} } \right)^{2} ,$$
(10)

where

$$\bar{y}_{i} = \frac{1}{{N_{i} }}\sum\limits_{{\left( {x_{k} ,y_{k} } \right) \in D_{i} }} {y_{k} }$$
(11)

and

$$R_{i} = \sigma_{iy}^{2}$$
(12)

i = 1, 2,…,c.
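In code, the index Ri amounts to a per-cluster sample variance of the output; a minimal sketch (our names; clusters with fewer than two data are skipped) is:

```python
import numpy as np

def predictive_content(labels_D, y, c):
    """Ri of Eqs. (10)-(12): variance of the output within each data cluster."""
    R = np.full(c, np.nan)
    for i in range(c):
        yi = y[labels_D == i]
        if yi.size > 1:
            R[i] = yi.var(ddof=1)     # unbiased variance, Eq. (10)
    return R
```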

4 Construction of Information Granules

The data subsets (segments) (Di, Fj) embracing some data and formed over a certain collection of features give rise to information granules. The granules are built with the use of the principle of justifiable granularity [13,14,15,16].

In a nutshell, this principle produces an information granule in such a way that it meets the requirements of coverage and specificity, whose product is maximized; see Fig. 2.

Fig. 2 From data segment to information granules

The design of an information granule is realized in such a way that the granule is (i) experimentally justifiable and (ii) semantically sound. Experimental justification means that enough data are embraced (contained) in the constructed granule to make its existence legitimate in terms of the experimental data. Semantic soundness means that the granule has to exhibit some interpretation capabilities and that its precision needs to be sufficient. The coverage is expressed in the following way:

$$\text{cov} (G_{ij} ) = \frac{1}{{N_{ij} }}{\text{card}}\left\{ {{\varvec{x}}_{k} \in (D_{i} ,F_{j} )|||{\varvec{x}}_{k} - {\varvec{v}}_{ij} ||_{{{\varvec{F}}_{j} }} \le n_{j} \rho_{ij}^{2} } \right\}.$$
(13)

The interpretation of the coverage criterion requires some attention. This criterion quantifies the amount of experimental evidence behind the constructed information granule. In more detail, we count the data whose distance from the prototype, computed over the features present in Fj, namely \(||\varvec{x}_{k} - \varvec{v}_{ij} ||_{{F_{j} }} = \sum\nolimits_{l = 1}^{{n_{j} }} {\frac{{(x_{kl} - v_{ij,l} )^{2} }}{{\sigma_{il}^{2} }}}\) (with \(\sigma_{il}\) being the standard deviation of the data residing within the corresponding segment), is equal to or smaller than \(n_{j} \rho_{ij}^{2}\), the threshold implied by the radius \(\rho_{ij}\) of the constructed information granule.

The specificity, regarded as a measure of precision, is given as follows:

$${\text{sp}}(G_{ij} ) = 1 - \rho_{ij} .$$
(14)

Note again that the highest specificity is achieved for the radius set to zero; in this case, however, the coverage is practically equal to zero. On the other hand, the highest coverage implies a zero value of specificity. An increase of coverage implies a decrease of specificity and vice versa. As these conflicting criteria have to be optimized jointly, one either proceeds with a bi-criteria optimization or formulates the problem as a scalar optimization of an aggregate of the criteria; the product of coverage and specificity serves as a viable aggregate here.

An information granule Gij associated with (Di, Fj) is the pair (vij, ρij), where the radius ρij is optimized by solving the optimization problem

$$\rho_{{ij,{\text{opt}}}} = \arg {\text{Max}}_{{\rho_{ij} \in \left[ {0,1} \right]}} \left[ {\text{cov} \left( {G_{ij} } \right){\text{sp}}\left( {G_{ij} } \right)} \right].$$
(15)

The higher the value of the optimized product of coverage and specificity, the more suitable (relevant) the constructed information granule. Having constructed the information granules (after the maximization in (15)), we can conveniently display them in the coverage-specificity plane; see Fig. 3. The location of the information granules helps identify the best of them in terms of the specificity and coverage criteria.
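A sketch of the optimization (15) by a simple grid search over ρ ∈ [0, 1] follows; the grid resolution and the within-segment weighting are our choices, not prescribed by the paper.

```python
import numpy as np

def justifiable_granule(block):
    """Build Gij = (vij, rho_ij) for one segment by maximizing cov * sp, Eq. (15).

    block: data of Di restricted to the features of Fj (N_ij x n_j)."""
    v = block.mean(axis=0)
    sigma2 = np.fmax(block.var(axis=0), 1e-12)      # within-segment weights of Eq. (13)
    d = ((block - v) ** 2 / sigma2).sum(axis=1)     # weighted squared distances
    n_j = block.shape[1]
    best = (0.0, -1.0)                              # (rho, cov * sp)
    for rho in np.linspace(0.0, 1.0, 101):          # grid search over the radius
        cov = np.mean(d <= n_j * rho ** 2)          # coverage, Eq. (13)
        val = cov * (1.0 - rho)                     # times specificity, Eq. (14)
        if val > best[1]:
            best = (rho, val)
    return v, best[0]
```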

Fig. 3 Characterization of information granules in the coverage-specificity plane

5 Granular Predictors and Classifiers

The collection of information granules Gij = (vij, ρij), i = 1, 2,…, c; j = 1, 2,…, r, forming the concise description of the data, is regarded as a set of building modules (a blueprint) that gives rise to granular predictors and classifiers. We briefly outline the essence of the underlying architecture; noticeable is the role of the granules as the skeleton of the construct.

5.1 Predictors

Let us assume that each information granule has a numeric representative of the output variable. Any input x is matched vis-a-vis the individual information granules, giving rise to the corresponding activation (matching) levels u11, u12, …, ucr:

$$u_{ij} = \frac{1}{{\sum\limits_{\begin{subarray}{l} i_{1} = 1 \\ j_{1} = 1 \end{subarray} }^{c,r} {\left( {\frac{{||{\varvec{x}} - {\varvec{v}}_{ij} ||_{{{\varvec{F}}_{j} }} }}{{||{\varvec{x}} - {\varvec{v}}_{{i_{1} j_{1} }} ||_{{{\varvec{F}}_{{j_{1} }} }} }}} \right)^{2/(m - 1)} } }}.$$
(16)

The prediction result is computed by taking a linear combination of the numeric representatives of the individual information granules and of their radii, namely

$$\hat{y} = \mathop \sum \limits_{i = 1}^{c} \bar{u}_{i} \bar{y}_{i} ,\quad \hat{\rho } = \mathop \sum \limits_{i = 1}^{c} \bar{u}_{i} \bar{\rho }_{i} ,$$
(17)

where

$$\begin{aligned} \bar{u}_{i} = \mathop \sum \limits_{j = 1}^{r} u_{ij} , \hfill \\ \bar{\rho }_{i} = \mathop \sum \limits_{j = 1}^{r} \rho_{ij} . \hfill \\ \end{aligned}$$
(18)

Thus, the prediction result arises as the information granule \(\hat{Y} = (\hat{y},\hat{\rho }).\)
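A sketch of the predictor of Eqs. (16)-(18) follows. All names are ours, and for brevity the distances are unweighted here, whereas (16) uses the weighted norm over Fj.

```python
import numpy as np

def granular_predict(x, V, R, labels_F, y_bar, m=2.0):
    """Granular prediction (y_hat, rho_hat), Eqs. (16)-(18).

    V[i][j]: prototype of Gij over the features of Fj; R: (c, r) radii rho_ij;
    labels_F: feature-space cluster of each feature; y_bar: (c,) cluster outputs."""
    c, r = R.shape
    d2 = np.empty((c, r))
    for i in range(c):
        for j in range(r):
            d2[i, j] = ((x[labels_F == j] - V[i][j]) ** 2).sum()
    inv = np.fmax(d2, 1e-12) ** (-1.0 / (m - 1))
    U = inv / inv.sum()                    # activation levels u_ij, Eq. (16)
    u_bar = U.sum(axis=1)                  # Eq. (18)
    rho_bar = R.sum(axis=1)
    return float(u_bar @ y_bar), float(u_bar @ rho_bar)   # Eq. (17)
```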

5.2 Classifiers

As presented so far, each information granule comes with a vector of class probabilities pi = [pi1 pi2 … pit], i = 1, 2,…, c. These result from counting the numbers of patterns belonging to the individual classes [17]. More specifically, denoting by Ni the number of data contained in Di, the counts of data belonging to the corresponding classes are ni1, ni2,…, nit. The vector pi is composed of the ratios

$$p_{i} = \left[ {\frac{{n_{i1} }}{{N_{i} }}\frac{{n_{i2} }}{{N_{i} }} \cdots \frac{{n_{it} }}{{N_{i} }}} \right].$$
(19)

The process of class assignment proceeds in a way similar to that discussed for the predictors. The final class membership vector p is computed in the following way:

$$p = \mathop \sum \limits_{i = 1}^{c} \bar{u}_{i} p_{i} .$$
(20)

Using the maximum rule, one selects the class i0 for which the coordinate of p attains the highest value, viz. \(i_{0} = \arg {\text{Max}}_{l = 1,2, \ldots ,t} p_{l} .\)

In other words, i0 is the index of the largest coordinate of the vector p.
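A sketch of the class assignment of Eqs. (19)-(20) follows; u_bar is computed exactly as in the predictor above, and the function name is ours.

```python
import numpy as np

def granular_classify(u_bar, P):
    """Maximum-rule class assignment, Eq. (20).

    u_bar: (c,) activations through the data clusters (Eq. 18);
    P: (c, t) matrix whose rows are the class-probability vectors pi of Eq. (19)."""
    p = u_bar @ P              # aggregated class membership vector
    return int(np.argmax(p))   # index i0 of the largest coordinate
```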

5.3 Illustrative Example

In this example, we use synthetic data consisting of six data points with four features. The first three data points are drawn from one normal distribution (class 1), and the next three from another normal distribution (class 2).

$$X = \left[ {\begin{array}{*{20}c} { - 0.3748} & { - 0.6411} & {2.8948} & {0.8533} \\ {0.9164} & {1.0290} & {2.1972} & {4.0396} \\ {1.0432} & {1.3009} & { - 1.4089} & {2.4230} \\ {4.8278} & {2.9201} & {2.6086} & {6.5614} \\ {5.2599} & {2.2045} & {5.6848} & {7.7025} \\ {5.3587} & {1.3298} & {2.2939} & {9.2319} \\ \end{array} } \right]$$

X is clustered into c = 2 data clusters and r = 2 feature clusters, as shown in Fig. 4.

Fig. 4 Clustering the synthetic data into (c = 2) data clusters and (r = 2) feature clusters

Accordingly, we have four information granules: (D1, F1), (D1, F2), (D2, F1), and (D2, F2). The prototype of each information granule is computed by averaging all of its data points; for example, the computation of v11 is illustrated in Fig. 5 and sketched in the code below.
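As a hedged numeric check, assume (for illustration only; the actual split is read off Fig. 4) that D1 collects the first three rows and F1 the first two features; then:

```python
import numpy as np

X = np.array([[-0.3748, -0.6411,  2.8948, 0.8533],
              [ 0.9164,  1.0290,  2.1972, 4.0396],
              [ 1.0432,  1.3009, -1.4089, 2.4230],
              [ 4.8278,  2.9201,  2.6086, 6.5614],
              [ 5.2599,  2.2045,  5.6848, 7.7025],
              [ 5.3587,  1.3298,  2.2939, 9.2319]])

# Hypothetical split: D1 = rows 0-2, F1 = features 0-1 (the true split is in Fig. 4)
v11 = X[:3, :2].mean(axis=0)   # prototype of (D1, F1), cf. Eq. (7)
print(v11)                     # ~[0.5283, 0.5629]
```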

Fig. 5 Computation of prototype vij for information granule (Di, Fj)

Now, using Eq. (16), we compute the membership matrix uij; Fig. 6 illustrates its semantics.

Fig. 6 The membership values uij

Using Eq. (18), we compute the memberships aggregated through the data clusters (\(\bar{u}_{i}\)); the computation is shown in Fig. 7.

Fig. 7 The membership values \(\bar{u}_{i}\)

Using Eq. (19), we compute pi, the ratio of each class in each data cluster Di. Then, using Eq. (20), the class assigned to a given data point is computed as shown in Fig. 8.

Fig. 8 Classification class prediction

6 Experimental Studies

In this section, we elaborate on the development of information granules and their quality. Both classification and regression data sets are considered; see Table 1.

Table 1 Summary of data

We proceed with the clustering algorithm in the data space and in the feature space as described in Sect. 2; the number of clusters is c in the data space and r in the feature space. The clustering results are transformed to their binary versions. The numbers of segments in the data and feature spaces are selected on the basis of the changes in the performance indexes (objective functions) regarded as functions of c and r; see Fig. 9. These indexes tend to stabilize when moving toward higher values of c and r. We first demonstrate the scheme using one classification data set (Gender Voice) and one regression data set (Concrete), and then follow the same procedure for further data sets.

Fig. 9 Performance indexes (objective functions) for successive values of c and r for the Gender Voice data set

As we have cr information granules, the quality of the obtained information granules is reported by means of the reconstruction index (6). The values of Vij computed with the use of (6) for the individual granules are presented in Tables 2 and 3. It is clear from these tables, and from Fig. 9, that when the number of data clusters reaches 7 or more and the number of feature clusters reaches 4 or more, a low value of the reconstruction error is obtained; this can be verified by computing the average reconstruction error.

Table 2 Vij (Gender voice data set) for i = 1, 2,…, c; j = 1, 2,…,r
Table 3 Vij (Concrete data set) for i = 1, 2,…, c; j = 1, 2,…,r

Figures 10 and 11 (bar plots) display the values of Vij starting from the best information granules (viz., those with the lowest values of Vij). In general, the error is low for all information granules when the values of both c and r are relatively high (4 or higher for r, 7 or higher for c).

Fig. 10 Vij starting from the best information granules for the Gender Voice data set

Fig. 11 Vij starting from the best information granules for the Concrete data set

For classification data, the quality of information granules is evaluated with the aid of the entropy (9); the obtained values are shown in increasing order, starting from the lowest value (see Fig. 12). For regression data, the quality of information granules is evaluated with the aid of the variance (10), again shown in increasing order starting from the lowest value (see Fig. 13). When the number of information granules increases, the average variance and the average vagueness decrease.

Fig. 12 Entropy of information granules for the Gender Voice data set

Fig. 13 Variance of information granules for the Concrete data set

Proceeding with the characterization of the information granules built on the basis of the numeric prototypes, we display the optimal values of coverage and specificity (viz., the values obtained when the product of coverage and specificity achieves its highest value). Selected results are displayed in Figs. 14 and 15. It can be noted that when the total number of information granules is high, the average product of coverage and specificity becomes low.

Fig. 14 Characterization of information granules in the coverage and specificity space for the Gender Voice data set

Fig. 15 Characterization of information granules in the coverage and specificity space for the Concrete data set

For the remaining data sets, we report the results in a similar manner in Tables 4 and 5 and Figs. 16, 17, 18, 19, 20 and 21. These tables and figures support the conclusions reached by inspecting the two data sets above.

Table 4 Vij (Abalone data set) for i = 1, 2,…, c; j = 1, 2,…,r
Table 5 Vij (Wine data set) for i = 1, 2,…, c; j = 1, 2,…,r
Fig. 16 Vij starting from the best information granules for the Wine data set

Fig. 17 Vij starting from the best information granules for the Abalone data set

Fig. 18 Entropy of information granules for the Wine data set

Fig. 19 Entropy of information granules for the Abalone data set

Fig. 20 Characterization of information granules in the coverage and specificity space for the Wine data set

Fig. 21 Characterization of information granules in the coverage and specificity space for the Abalone data set

7 Conclusions

The study was devoted to the concise description of data by constructing their numeric representatives, followed by the augmentation of the prototypes into information granules. The developed optimization environment helps quantify the quality of the information granules (in terms of entropy and diversity) and of the numeric prototypes (evaluated by means of the reconstruction error). Information granules form a blueprint of the data and constitute an initial setting for a variety of constructs: classifiers, predictors, and association networks. It is worth stressing that information granules are functional building modules used as generic components in the development of a plethora of models, including predictors and classifiers. Equally important is the fact that the study delivered a way to quantify the quality of information granules with regard to their classification or prediction capabilities. Likewise, the multiview perspective on experimental data is essential when coping with massive data, as the constructed information granules are central to an efficient way of building classifiers and predictors.

While Sect. 5 elaborates on the fundamentals of the modeling constructs (which, owing to the use of information granules, can be referred to as granular predictors, granular classifiers, etc.), more detailed studies could follow, focusing on the detailed architectures and the ensuing learning schemes.