1 Introduction

In the last decade, formal concept analysis (FCA) has been applied in various research fields for knowledge processing tasks (Poelmans et al. 2013a, b). FCA was introduced by Wille (1982) using applied lattice theory. FCA processes an input data set, given in context format, to discover formal concepts and a concept lattice (Ganter and Wille 1999). A formal context represents a binary relation between a set of objects and the corresponding set of attributes as a row–column matrix; a cell of the matrix contains × if the object possesses the corresponding attribute and is empty otherwise. From the given context, FCA discovers patterns in the form of objects and their common attributes, called formal concepts. A formal concept is a maximal pair of a set of objects (extent) and their corresponding attributes (intent), closed under a Galois connection. All discovered formal concepts can be visualized in a hierarchically ordered structure called a concept lattice (Aswani Kumar and Prem Kumar 2014). There are numerous interesting extensions of the concept lattice in the fuzzy setting (Burusco and Fuentes-Gonzales 1994), fuzzy graphs (Ghosh et al. 2010), the interval-valued fuzzy setting (Prem Kumar et al. 2016a; Yao 2016), the bipolar fuzzy setting (Prem Kumar and Aswani Kumar 2014a, b), the three-polar setting (Prem Kumar 2016a, b), and other mathematical models (Macko 2013; Poelmans et al. 2013b; Ignatov et al. 2015). In each orientation, the concept lattice generated from a large number of attributes may contain some unimportant formal concepts, as demonstrated by Prem Kumar et al. (2016a, b). In this case, selecting some of the important concepts from the large number of generated concepts is a major concern for researchers. Recently, attention has been paid to reducing the size of the concept lattice using K-means clustering (Aswani Kumar and Srinivas 2010), non-negative matrix factorization (Aswani Kumar et al. 2015), stability index (Babin and Kuznetsov 2012), weight computation (Bělohlávek and Macko 2012; Bělohlávek and Trnecka 2012), Junction Based Object Similarity (JBOS) (Dias and Viera 2013), entropy (Li et al. 2013; Prem Kumar and Abdullah Gani 2015; Zhang et al. 2012), K-medoids (Li et al. 2016), and homomorphisms (Prem Kumar and Aswani Kumar 2014a, b). None of the available approaches provides a way to process a large context based on user-defined information granules (Bart et al. 2012; Dias and Viera 2015; Li et al. 2016). The reason is that a user or expert needs certain important concepts based on his/her own requirements, which may differ from expert to expert. To deal with this issue, a study on concept lattice reduction based on chosen information granules is much needed.

Recently, some researchers have turned their attention to concept lattice representation via defined information granules, processing a large context into several small contexts for precise analysis in knowledge processing tasks (Yao 2004; Pedrycz 2013; Li et al. 2015). Information granules have also been used to find frequent item sets based on tree (Vo et al. 2013; Yao 2016a), orthopair (Cucci 2016), and triarchy (Yao 2016a) structures for adequate analysis of situation awareness (Loia et al. 2013), intelligent systems (Pedrycz and Chen 2011), big data (Pedrycz and Chen 2015a), decision making processes (Pedrycz and Chen 2015b), neural networks (Song and Wang 2016), and other fields of human–data interaction (Wilke and Portmann 2016; Zadeh 2008). Recently, the properties of information granules have been extended to handle data with binary (Bělohlávek et al. 2014), fuzzy (Kang et al. 2012; Li et al. 2015), interval-valued (Yao 2016b), and bipolar fuzzy attributes (Prem Kumar and Aswani Kumar 2014a, b), refining some of the important concepts based on user-defined granulation (Prem Kumar and Aswani Kumar 2012; Prem Kumar and Abdullah Gani 2015). These recent studies have given an interactive (Skowron et al. 2016) and a new format of granular computing (Dubois and Prade 2016), which can be useful to bridge its gap with knowledge reduction tasks (Wu et al. 2009). These analyses motivated us to focus in this paper on another form of granular computing, namely subsets of attributes as information granules, in order to reduce the size of the concept lattice.

The purpose of using subsets of attributes as information granules is to find important patterns (i.e., concepts) through their closeness (Dias and Viera 2013), functional relationships (Prem Kumar and Aswani Kumar 2014a, b), crisp ordering (Prem Kumar and Aswani Kumar 2015), similarity (Prem Kumar and Abdullah Gani 2015), or defined complex granules (Skowron et al. 2016). The selection of granules is based on the shape and size of the given problem, which should be commensurate with the user's requirements for resolving it. The chosen level of granulation thus provides a way to process a large context efficiently by modularizing the complex problem into a series of well-defined sub-problems (modules) at minimal computation cost, as discussed by Loia et al. (2016). In this paper, shape refers to the formal context and size to its dimension, whereas subsets of attributes are considered as small information granules. The level of granulation for choosing a particular subset can be defined by the user based on his/her requirements. For example, suppose a context has three attributes \(\{ 1,2,3\}\). In this case, the following subsets can be generated: \(\phi\), \(\left\{ 1\right\}\), \(\left\{ 2\right\}\), \(\left\{ 3\right\}\), \(\left\{ 1,2\right\}\), \(\left\{ 1,3\right\}\), \(\left\{ 2,3\right\}\), \(\left\{ 1,2,3\right\}\). Among these, the user can choose any subsets as the level of granulation. The level of granulation indicates the number of attributes reduced by the chosen subsets, as given below:

  • Granulation level 0 means none of the attributes are reduced, so the user selects {1}, {2}, {3}.

  • Granulation level 1 means one attribute is reduced. In this case, the user may select one of the following combinations:

    1. ({1}, {2, 3})

    2. ({2}, {1, 3})

    3. ({1, 2}, {3})

  • Granulation level 2 means two attributes are reduced. In this case, the user can select the subset {1, 2, 3}.
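The enumeration of granulation levels above can be sketched in code. The following Python snippet (an illustrative sketch, not part of the formal method) generates all partitions of the attribute set {1, 2, 3} into granules and groups them by granulation level, taken here as the number of attributes merged away:

```python
def partitions(items):
    """Recursively yield all set partitions of a list."""
    if len(items) == 1:
        yield [items]
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):            # put `first` into an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p                # or give `first` its own block

attrs = [1, 2, 3]
by_level = {}
for p in partitions(attrs):
    level = len(attrs) - len(p)            # number of attributes "reduced"
    by_level.setdefault(level, []).append(p)
```

Running this yields exactly the enumeration above: one partition at level 0 (the three singletons), three at level 1, and one at level 2 (the single block {1, 2, 3}).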

The above illustration shows that choosing subsets of attributes as information granules provides a mechanism to reduce a large context by changing the size of the subsets. In this process, it is possible to hide or reveal a certain amount of detail through the chosen subsets of attributes, depending on the complexity and requirements of the particular problem. The reason is that each chosen subset of attributes, as an information granule, provides a specific way to describe a particular part of the problem. The chosen subsets of attributes can also be visualized as vertices of a graph (Berry and Sigayret 2004), as applied in mathematical searching (Nguyen et al. 2012), preference analysis (Obiedkov 2012), item set mining (Troiano and Scibelli 2014), AFS algebra (Wang and Liu 2008), and interval-set approximation (Yao 2016b). The complexity of concept lattice visualization and its processing time increase with the number of attributes in the given context. A problem therefore arises when a user wants to visualize the data using some potential subsets of attributes, so as to find important concepts which may or may not be detectable when using all the attributes. To achieve this goal, this paper aims at the following:

  1. To propose a method that processes a large context using chosen subsets of attributes as granulation.

  2. To reduce the size of the concept lattice based on the chosen granulation level for the subsets of attributes.

  3. To find some of the important concepts from the obtained context at different granulations of their computed weight using entropy.

  4. To provide an empirical comparison of the proposed method with the granular tree method given by Bělohlávek et al. (2014), which also provides a way to control the size of the concept lattice using spatial neighborhoods of attributes.

The rest of this paper is organized as follows: Sect. 2 provides a brief background on FCA. Section 3 contains the proposed method, and Sect. 4 illustrates it. The empirical comparison of the proposed method with the granular tree method is demonstrated in Sect. 5, followed by conclusions, acknowledgements, and references.

2 Formal concept analysis

Definition 1

(Formal context) A formal context F = (X, Y, R) consists of a set of objects (X), a set of attributes (Y), and a binary relation (R) between them, given as a row–column matrix. A cell of the matrix contains × if the object possesses the corresponding attribute and is empty otherwise.

Definition 2

(Concept-forming operators) The operators \(\uparrow\): \(2^{X} \rightarrow 2^{Y}\) and \(\downarrow\): \(2^{Y} \rightarrow 2^{X}\) are defined for every A \(\subseteq\) X and B \(\subseteq\) Y by

\(A^{\uparrow }\) = \(\left\{ y\in Y | \forall x \in A: (x, y) \in R \right\}\),

\(B^{\downarrow }\) = \(\left\{ x\in X | \forall y \in B: (x, y) \in R \right\}\),

\(A^{\uparrow }\) is the set of all attributes shared by all objects from A. Similarly, \(B^{\downarrow }\) is the set of all objects sharing all attributes from B.
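Definition 2 translates directly into code. The sketch below (assuming the relation R is stored as a set of (object, attribute) pairs; the toy context is hypothetical) implements the two concept-forming operators:

```python
def up(A, Y, R):
    """A↑: the attributes shared by every object in A."""
    return {y for y in Y if all((x, y) in R for x in A)}

def down(B, X, R):
    """B↓: the objects possessing every attribute in B."""
    return {x for x in X if all((x, y) in R for y in B)}

# Hypothetical toy context with two objects and two attributes.
X, Y = {"x1", "x2"}, {"y1", "y2"}
R = {("x1", "y1"), ("x1", "y2"), ("x2", "y1")}
```

For instance, up({"x1", "x2"}, Y, R) returns {"y1"}, the only attribute common to both objects, and down({"y1"}, X, R) returns both objects.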

Definition 3

(Formal concept) A formal concept is a pair (A, B) of maximal subsets of objects and attributes, respectively, where A \(\subseteq\) X and B \(\subseteq\) Y, closed as follows: \(A^{\uparrow }\) = B and \(B^{\downarrow }\) = A. This Galois connection pairs the objects and attributes sharing the same properties, called extent and intent. The collection of all such pairs forms a concept lattice under the closure operation.

Definition 4

(Concept lattice) The concept lattice determines the hierarchy of formal concepts under the partial order \((A_{1},B_{1})\le (A_{2},B_{2})\Longleftrightarrow A_{1}\subseteq A_{2}(\Longleftrightarrow B_{2}\subseteq B_{1})\). In this case, the concept \((A_{1}, B_{1})\) is more specific than \((A_{2}, B_{2})\) (i.e., \((A_{2}, B_{2})\) is more general than \((A_{1}, B_{1})\)). From this ordering it follows that every concept lattice contains two special nodes at its top and bottom boundaries, representing the most general and the most specific concepts, respectively. Generalized concepts contain more objects, while specialized concepts contain more attributes. The attributes of each formal concept are inherited from the most general maximum node, while the objects are inherited from the most specific minimum node, with infimum and supremum given by Ganter and Wille (1999):

  • \(\wedge _{j\in J}(A_{j}, B_{j})\) = \((\bigcap _{j\in J} A_{j}, (\bigcup _{j\in J}B_{j})^{\downarrow \uparrow })\),

  • \(\vee _{j\in J} (A_{j}, B_{j})\) = \(((\bigcup _{j \in J} A_{j})^{\uparrow \downarrow },\bigcap _{j\in J} B_{j})\).
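With these definitions, all formal concepts of a small context can be enumerated naively by closing every attribute subset. The following self-contained Python sketch (exponential in |Y|, so didactic only; the toy context is hypothetical) illustrates the idea:

```python
from itertools import chain, combinations

def concepts(X, Y, R):
    """Enumerate all formal concepts of (X, Y, R) by closing each B ⊆ Y."""
    up = lambda A: frozenset(y for y in Y if all((x, y) in R for x in A))
    down = lambda B: frozenset(x for x in X if all((x, y) in R for y in B))
    subsets = chain.from_iterable(combinations(sorted(Y), r)
                                  for r in range(len(Y) + 1))
    found = set()
    for B in subsets:
        A = down(frozenset(B))       # extent of the closure of B
        found.add((A, up(A)))        # (extent, intent) pair
    return found

# Hypothetical toy context.
X, Y = {"x1", "x2"}, {"y1", "y2"}
R = {("x1", "y1"), ("x1", "y2"), ("x2", "y1")}
```

On this toy context the routine finds two concepts: the top concept ({x1, x2}, {y1}) and the more specific ({x1}, {y1, y2}), consistent with the ordering of Definition 4.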

Definition 5

(Granular computing) Granular computing is an important tool for processing large chunks of information via small information granules, and for analyzing data sets with large attribute sets. An information granule is a collection of attributes grouped by similarity, functional adjacency, or indistinguishability; in one way or another, it quantifies the lack of numeric precision in a large attribute data set. In this paper, subsets of attributes are used as information granules to detect important patterns in the given data set. The level of granulation thus provides a way to process a large context efficiently by modularizing the complex problem into a series of well-defined sub-problems (modules) at minimal computation cost. The importance of a sub-module or information granule can be defined by its computed weight (w), where \(0\le w \le 1\), so that the user can select some of the concepts based on his/her requirements at different granulations \(\theta\) (\(0\le \theta \le 1\)). The selection of particular granules is based on the user's or expert's choice, or on the requirements of the problem.

The concept lattice provides a hierarchically ordered visualization of formal concepts to accelerate knowledge processing tasks using FCA. However, FCA discovers a large number of formal concepts even for a medium-sized formal context, so selecting some of the important formal concepts is a major concern for practical applications of FCA. To address this problem, a method is proposed in the next section based on chosen subsets of attributes as information granules and their computed weight at a defined granulation.

3 Proposed method

3.1 Granulation based subset of attributes

The method proposed in this paper focuses on controlling the size of the concept lattice based on chosen subsets of attributes, used as information granules. The step-by-step procedure is given below:

Step 1 Let us consider a formal context F = (X, Y, R) with n objects and m attributes.

Step 2 Find all the subsets of the attribute set (Y), i.e., \(2^{m}\) subsets of the given formal context.

Step 3 Now consider subsets of attributes (\(S_{j}\)) at a granulation level, where the level of granularity is defined as follows:

  • Granulation level 0 means none of the attributes are reduced by the chosen subsets of attributes.

  • Granulation level 1 means one attribute is reduced by the chosen subsets of attributes.

  • Granulation level 2 means two attributes are reduced by the chosen subsets of attributes.

  • Similarly, granulation level \(m-1\) means \(m-1\) attributes are reduced by the chosen subsets of attributes.

It can be observed that the level of granulation indicates only the number of attributes reduced by the chosen subsets. Choosing the right subsets is another issue; to resolve it, the next step provides a condition to verify the chosen level of granulation.

Step 4 The previous step shows that chosen subsets with equal numbers of attributes may share the same granulation level. In this case, the user should choose subsets of attributes satisfying the following equality: \(S_{1}\) \(\cup\) \(S_{2}\) \(\cup \cdots \cup\) \(S_{j}\) = Y, where the number of chosen subsets is at most \(2^{m}\).

Step 5 The chosen subsets of attributes, together with their relationship to the given object set, provide another formal context F\(_{S}\) = \((X, S_{j}, R_{1})\), where \(|S_{j}| <|Y| = m\) and \(|R_{1}| <|R|\). The size of the new formal context can be controlled through the chosen level of granulation for the subsets of attributes, as shown in Step 3.

Step 6 The concepts can now be generated from the newly obtained context F\(_{S}\) = \((X, S_{j}, R_{1})\) for knowledge processing tasks. Note that, in reducing the size of the formal context, the proposed method does not remove any attributes or objects; hence it incurs comparatively less information loss than other available approaches to FCA in the binary setting.
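Steps 4–6 can be sketched as follows. Note that the text does not fix how an object relates to a multi-attribute granule; this Python sketch assumes, as one plausible reading, that an object possesses a granule exactly when it possesses every attribute in it:

```python
def reduce_context(X, Y, R, granules):
    """Build the reduced context F_S = (X, S_j, R1) from chosen granules.

    `granules` is the list of chosen attribute subsets S_1, ..., S_j;
    the membership rule below (an object is related to a granule when it
    has ALL of its attributes) is an assumption, not stated explicitly
    in the method.
    """
    granules = [frozenset(S) for S in granules]
    # Step 4 check: the chosen subsets must jointly cover Y.
    assert frozenset().union(*granules) == frozenset(Y)
    R1 = {(x, S) for x in X for S in granules
          if all((x, y) in R for y in S)}
    return set(X), granules, R1
```

On a hypothetical context with Y = {y1, y2, y3} and granules {y1, y2} and {y3}, this produces a two-column reduced context without discarding any object or attribute.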

Table 1 Proposed algorithm for concept lattice reduction using chosen subset of attributes

The steps of the proposed algorithm are shown in Table 1. The algorithm first computes all the subsets of the attributes of the given context (Steps 1 and 2) and represents each subset of attributes as a defined set (Step 3). The computed subsets of attributes can be ordered by their level of granulation (Step 4). A user can choose any of the subsets of attributes as information granules to reduce the size of the concept lattice according to his/her requirements for solving the given problem; the chosen subsets should satisfy \(S_{1}\) \(\cup\) \(S_{2}\) \(\cup \cdots \cup\) \(S_{j}\) = Y (Step 5). The chosen subsets of attributes are then represented as a context with the given object set and the corresponding relation (Steps 6 and 7). This newly obtained context can be written as a formal context F\(_{S}\) = \((X, S_{j}, R_{1})\), where \(|S_{j}| <|Y| = m\) and \(|R_{1}| <|R|\), for further processing using the properties of FCA (Step 8). From this formal context, all formal concepts can be generated for knowledge processing tasks (Step 9). Similarly, the size of the concept lattice can be controlled by choosing different subsets of attributes as granulation (Step 10). In accomplishing these tasks, the proposed method does not remove any attributes or objects, which assures comparatively less information loss than other available approaches to FCA in the binary setting.

3.2 Proposed algorithm to choose some important concepts generated from subset of attributes

In this section, a method is proposed to find some of the important concepts generated from the context obtained via the chosen subsets of attributes as information granules (Table 1). Since this choice may introduce some randomness, Shannon entropy is utilized to measure it and to compute a weight for each chosen subset of attributes (\(S_{j}\)), as follows. Consider any object \(x_{i}\in X\) of the reduced context F\(_{S}\) = \((X, S_{j}, R_{1})\). The probability of object \(x_{i}\) possessing the chosen subset of attributes \(S_{j}\) is denoted P\((S_{j}/x_{i})\), where \(S_j\) and \(x_i\) represent the j-th attribute subset and the i-th object, respectively. The average information weight of the chosen subset of attributes \(S_{j}\) is then computed as \(E(S_{j})\), and from it the weight value \(w_{j}\) for the chosen subset of attributes. The weight of each generated concept is obtained by summing the weights of the subsets of attributes contained in its intent, and an average weight of concepts is computed by dividing by the total number of newly generated concepts, as given below:

  1. \(E(S_{j}) = -\sum _{i}\) P\((S_{j}/x_{i})\) log\(_{2}\)(P\((S_{j}/x_{i}))\), where i ranges over the objects and j over the subsets of attributes selected to make the new context F\(_{S}\); this selection is entirely based on the user's or expert's choice.

  2. \(w_{j} = E(S_{j})/ \sum _{j}E(S_{j})\).

  3. \(Weight(k) = \sum ^{k}_{j = 1}(w_j)/k\).
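A possible concrete reading of these formulas is sketched below in Python. The paper leaves the exact form of P(S_j / x_i) open; here it is taken, as an assumption, to be the fraction of objects related to granule S_j in the reduced context, and a concept's weight is taken as the sum of w_j over the granules in its intent:

```python
import math

def granule_weights(X, granules, R1):
    """E(S_j) and w_j; P(S_j) is assumed to be the fraction of objects
    possessing granule S_j (0 log 0 is taken as 0)."""
    E = []
    for S in granules:
        p = sum(1 for x in X if (x, S) in R1) / len(X)
        E.append(-p * math.log2(p) if 0.0 < p < 1.0 else 0.0)
    total = sum(E)
    return [e / total for e in E] if total else [0.0] * len(E)

def concept_weight(intent, granules, w):
    """Weight of a concept: sum of w_j over the granules in its intent."""
    return sum(w[j] for j, S in enumerate(granules) if S in intent)
```

For example, with four objects split evenly between two granules, each granule gets entropy 0.5 and normalized weight 0.5, so a concept whose intent contains one granule has weight 0.5.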

In this way, the proposed method helps in deciding the importance of the concepts generated from the context built from the chosen subsets of attributes, at different granulations of their weight. The selection of granules is entirely based on the user's requirements for finding important concepts, via the shape and size of the given problem.

Table 2 Proposed algorithm to reduce the concepts generated from Table 1 at different granulation

The steps of the proposed algorithm are shown in Table 2. The algorithm starts from the newly obtained context F\(_{S}\) = \((X, S_{j}, R_{1})\) and its generated concepts. Since the newly obtained context is based on choosing subsets of attributes (\(S_{j}\)) as granulation, these subsets are used for computing the weights (Step 1). The algorithm first computes the probability of the chosen subsets of attributes (\(S_{j}\)) possessing the corresponding objects (Step 2). The average information weight of the chosen subsets of attributes is then computed using entropy (Steps 3 and 4). The total weight of the concepts generated from F\(_{S}\) = \((X, S_{j}, R_{1})\) is computed in Steps 5 to 7. A user can select some of the important formal concepts based on a defined granulation of their computed weight (Steps 8–10). In this way, the proposed method provides an in-depth analysis of the concept lattice generated from the chosen subsets of attributes, which is another of its advantages.

Complexity Let n be the number of objects and m the number of attributes in the given formal context. Finding the subsets of attributes (\(S_j\)) takes \(O(2^{m})\) time, where \(j<m\). The proposed method provides a way to control the size of the concept lattice using subsets of attributes as information granules (Table 1) and to find important concepts based on their computed weight at different granulations (Table 2), which takes \(O(j \ln (j))\) time, where j is the number of chosen subsets of attributes, i.e., \(j<m\). In this way, the proposed method has lower complexity than the granular tree method (Bělohlávek et al. 2014), which is NP-hard. Moreover, the granular tree method is expert-based, as discussed by Bělohlávek et al. (2014), whereas the proposed method can be used by any expert or non-expert user, as illustrated in the next section.

4 Concept lattice reduction using granular based subset of attributes

Several methods have been proposed for concept lattice reduction to increase the applicability of FCA in various research fields (Bart et al. 2012; Dias and Viera 2015). In recent years, research trends have turned towards granulation-based concept lattice reduction using subsets (Prem Kumar et al. 2016a, b; Yao 2016). Pandey et al. (2016) attempted the classification of an Indian algae data set based on common subsets of attributes (http://indianalgae.co.in). In this paper, we focus on concept lattice reduction using subsets of attributes as information granules to reveal important patterns in the given data set. For this purpose, a method is proposed in Table 1 and illustrated by the following example:

Example 1

Let us consider the binary context shown in Table 3, where \(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}\) represent the objects and \(y_{1}, y_{2}, y_{3}, y_{4}\) the attributes. The concept lattice generated from this context is shown in Fig. 1 and contains nine concepts. Our aim now is to reduce the size of this concept lattice using granulation-based subsets of attributes to process the knowledge.

Table 3 A binary formal context
Fig. 1
figure 1

Concept lattice generated from context shown in Table 3

Table 4 16 possible subsets of attributes for \(\left\{ y_{1}, y_{2}, y_{3}, y_{4}\right\}\) shown in Table 3
Table 5 Possible selection of subset of attributes based on the level of granulation

Table 3 contains four attributes \(y_{1}\), \(y_{2}\), \(y_{3}\), \(y_{4}\), for which all \(2^{4} = 16\) possible subsets are shown in Table 4. The user can choose any of the subsets to control the size of the concept lattice according to his/her requirements at a defined level of granulation for the subsets of attributes, as shown in Table 5. Each chosen subset of attributes refines specific information based on the shape and size of the particular problem (Loia et al. 2016). In this paper, the levels of granulation shown in Table 5 are defined as follows:

  1. Granulation level 0 means none of the attributes are reduced by the chosen subsets.

  2. Granulation level 1 means one attribute is reduced by the chosen subsets when compared to the original context.

  3. Granulation level 2 means two attributes are reduced by the chosen subsets when compared to the original context.

To demonstrate the proposed method, let us consider granulation level 1 in Table 5, which offers combinations 2, 3, 4, 5, 6, and 7 of Table 5 to select from. The user can select any of them to process the context; suppose the user has chosen combination 2, \(\left\{ S_{2}, S_{3}, S_{11} \right\}\). The selected subsets of attributes reduce the original context to three attributes. The reduced context, based on the relationship with the corresponding object set, is shown in Table 6, and the concept lattice generated from it is shown in Fig. 2. It can be observed that Fig. 2 reduces the lattice of Fig. 1 from nine concepts to seven using granulation level 1. Similarly, the concept lattice can be reduced using the other levels of granulation shown in Table 5. In this process, the proposed method does not reduce or discard any of the objects or attributes, which assures a lower possibility of information loss. To verify this, the knowledge represented by the reduced concept lattice of Fig. 2 is compared with that of Fig. 1 in Table 7, which confirms that the reduced concept lattice of Fig. 2 preserves the knowledge represented by the original concept lattice of Fig. 1.

Table 6 Context shown in Table 3 based on chosen subset–\(\left\{ S_{2}, S_{3}, S_{11} \right\}\)
Fig. 2
figure 2

Concept lattice generated from the context shown in Table 6

Table 7 Knowledge discovered by Fig. 1 and its reduced lattice shown in Fig. 2

Furthermore, some important concepts can be selected from the reduced concept lattice shown in Fig. 2. To accomplish this task, the method shown in Table 2 provides a way to select some of the concepts based on their computed weight at different granulations. To illustrate this process, the context shown in Table 6 is considered. Table 8 shows the computed weight for the chosen subsets of attributes \(S_{2}\), \(S_{3}\), and \(S_{11}\), and Table 9 the computed weight of each concept of Fig. 2 generated from them. The selection of concepts based on their computed weight is shown in Table 10. In this way, the proposed method reduces the concept lattice using granulation-based chosen subsets of attributes and their computed weight at different granulations.

Table 8 Computed weight for each subset of attributes shown in Table 6
Table 9 Weight of formal concepts shown in Fig. 2 using their intent
Table 10 Some selected formal concepts from the Fig. 2 using granulation

It can be observed that the proposed method provides two successive ways to reduce the concept lattice and select some of the important data. To validate its results, the granular tree method (Bělohlávek et al. 2014), whose functionality is closest to that of the proposed method, is considered; for this purpose, the same car data set is adapted from Bělohlávek et al. (2014), as shown in the next section.

5 Empirical analysis

In this paper, a method is proposed to control the size of the concept lattice using subsets of attributes as granules, since the properties of granulation give a way to refine a large context into various smaller contexts. To exploit these advantages of granular computing, several methods have recently been proposed to discover specific patterns in a formal context based on closeness (Dias and Viera 2013), functional relationships (Wu et al. 2009), similar weight (Prem Kumar et al. 2016a), Huffman coding (Prem Kumar and Abdullah Gani 2015), crisp ordering (Prem Kumar and Aswani Kumar 2015), and interval-valued subsets (Yao 2016). The cognitive viewpoint of the concept lattice through granular computing is also discussed by Li et al. (2015). Bělohlávek et al. (2014) introduced another method to control the size of the concept lattice using a granular tree defined by an expert. Among these available methods, the proposed method and its analysis are closest to granular tree concept lattice reduction. To illustrate the difference between the proposed method and the granular tree, an example is given below:

Example 2

Let us consider the car accident data set shown in Table 11. It records information about accidents by car reference number, driver name, cause of accident (alcohol; priority, meaning office-time failure to yield; the driver not using the steering in the right way or the car steering not being correct; the car brakes not working properly), and time of accident. From this data set, the following important patterns were investigated by Bělohlávek et al. (2014) using the granular tree method:

  1. A significant number of “night accidents caused by alcohol”.

  2. A significant number of “morning accidents caused by priority (failure to yield way)”.

Our goal now is to compare the above findings with the analysis derived from the proposed method. The data shown in Table 11 are in raw format and may yield some unimportant or unusual patterns (concepts) along with the usual patterns, which can affect precise knowledge processing. For this purpose, the given data set needs to be prepared based on its object and attribute sets as given below:

Table 11 Data with binary attributes for the car accident

Table 11 contains the following distinct attributes to process for knowledge discovery tasks:

  1. Cause of accident: Alcohol, Brakes, Priority (like office time), Steering, and

  2. Time of accident: 1 AM, 6 AM, 7 AM, 9 AM, 10 AM, 12 AM, 8 PM, 9 PM, 10 PM.

For the above attributes, the corresponding subsets are shown in Tables 12 and 13, respectively.

Table 12 Possible subset for the attributes cause of accidents shown in Table 11
Table 13 Possible subset of attributes for time of accident shown in Table 11

Some of the possible combinations of subsets of attributes from Tables 12 and 13 which satisfy \(S_{1}\) \(\cup\) \(S_{2}\) \(\cup \cdots \cup\) \(S_{j}\) = Y are:

  1. \(\left\{ S_{2}, S_{3}, S_{4}, S_{5}, AM, PM \right\}\)

  2. \(\left\{ S_{4}, S_{5}, S_{6}, AM, PM \right\}\)

  3. \(\left\{ S_{3}, S_{5}, S_{7}, AM, PM \right\}\)

  4. \(\left\{ S_{3}, S_{4}, S_{8}, AM, PM \right\}\)

  5. \(\left\{ S_{2}, S_{5}, S_{9}, AM, PM \right\}\)

  6. \(\left\{ S_{2}, S_{4}, S_{10}, AM, PM \right\}\)

  7. \(\left\{ S_{2}, S_{1}, S_{11}, AM, PM \right\}\)

  8. \(\left\{ S_{6}, S_{11}, AM, PM \right\}\)

  9. \(\left\{ S_{7}, S_{10}, AM, PM \right\}\)

  10. \(\left\{ S_{8}, S_{9}, AM, PM \right\}\)

  11. \(\left\{ S_{2}, S_{15}, AM, PM \right\}\)

  12. \(\left\{ S_{3}, S_{14}, AM, PM \right\}\)

  13. \(\left\{ S_{4}, S_{13}, AM, PM \right\}\)

  14. \(\left\{ S_{5}, S_{12}, AM, PM \right\}\)

  15. \(\left\{ S_{16}, AM, PM \right\}\)

The user can now select any of the subsets from the above list to find important patterns according to his/her requirements, as given below:

  1. If the user wants to analyze the patterns of car accidents based on Alcohol, Brakes, Priority, and Steering, then the user can select \(\left\{ S_{2}, S_{3}, S_{4}, S_{5}, AM, PM \right\}\).

  2. If the user wants to analyze the patterns of car accidents based on \(\left\{ Alcohol, Priority \right\}\) and \(\left\{ Brakes, Steering \right\}\), then the user can select the subsets of attributes \(\left\{ S_{7}, S_{10}, AM, PM \right\}\). Similarly, other subsets can be chosen based on the user's requirements for analyzing patterns in the car accidents.

Example 2.1

Let us suppose the user has chosen combination 1, \(\left\{ S_{2}, S_{3}, S_{4}, S_{5}, AM, PM \right\}\), which includes the following attributes as per Table 12: \(\left\{ Alcohol, Brakes, Priority, Steering, AM, PM \right\}\). The formal context for the chosen subsets of attributes and the corresponding object set is shown in Table 14. The list of concepts generated from this context is shown in Table 15, and their hierarchically ordered visualization in the concept lattice in Fig. 3.

Table 14 The formal context for the car accident data set example of Table 11
Fig. 3

Concept lattice generated from the context shown in Table 14

Table 15 Extent and intent of the formal concepts shown in Fig. 3

From Table 15, the following important patterns can be discovered:

  1. Most of the accidents happen in the AM.

  2. A significant number of accidents that happen in the AM are due to Priority (e.g., office hours).

  3. A significant number of accidents happen in the PM due to Alcohol or Brakes.

Furthermore, if the user wants to refine some specific pattern among the concepts shown in Table 15, this can be done through their computed weight at different granulations as per the proposed algorithm shown in Table 2. Based on this algorithm, the computed weight for each attribute of Table 14 is given in Table 16. Table 17 shows the computed weight for each of the concepts listed in Table 15 based on their intent. Table 18 shows the selection of specific concepts at different granulations of their computed weight.
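The exact weight formula of the proposed algorithm (Table 2) is not reproduced in this excerpt; the following is a minimal sketch of the general workflow under stand-in assumptions: each attribute is weighted by its relative frequency in the context, a concept is scored by the mean weight of its intent, and a user-chosen granulation threshold selects the concepts to keep. Context values and concepts are illustrative, not those of Tables 14-17.

```python
# Hypothetical context; values are illustrative, not Table 14.
context = {
    "o1": {"Alcohol", "PM"},
    "o2": {"Brakes", "Priority", "AM"},
    "o3": {"Brakes", "AM"},
    "o4": {"Steering", "PM"},
}

def attribute_weights(ctx):
    """Stand-in weight: relative frequency of each attribute over objects."""
    counts = {}
    for attrs in ctx.values():
        for a in attrs:
            counts[a] = counts.get(a, 0) + 1
    return {a: c / len(ctx) for a, c in counts.items()}

def concept_weight(intent, w):
    """Score a concept by the mean weight of its intent attributes."""
    return sum(w[a] for a in intent) / len(intent) if intent else 0.0

def select(concepts, w, threshold):
    """Keep only concepts whose weight reaches the granulation threshold."""
    return [(ext, it) for ext, it in concepts
            if concept_weight(it, w) >= threshold]

w = attribute_weights(context)
concepts = [({"o2", "o3"}, {"Brakes", "AM"}),
            ({"o1", "o4"}, {"PM"}),
            ({"o2"}, {"Brakes", "Priority", "AM"})]
important = select(concepts, w, threshold=0.5)
```

Raising the threshold yields a coarser granulation (fewer, more general concepts); lowering it refines the view, which mirrors how Table 18 selects concepts at different granulation levels.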

Table 16 Computed weight for each attribute shown in Table 14
Table 17 Weight of formal concepts shown in Fig. 3 using their intent
Table 18 Some important concepts from Fig. 3 using granulation

From Table 18, the following information can be extracted:

  1. A significant number of accidents happen at night (PM) due to Alcohol or Brakes.

  2. A significant number of accidents happen in the morning (AM) due to Priority or Brakes.

We can observe that the analysis derived from the proposed method is in good agreement with the granular tree given by Bělohlávek et al. (2014) for the chosen subset of attributes. Further, if the user wants to analyze the patterns based on other subsets shown in Tables 12 and 13, another subset can be chosen as given below:

Example 2.2

If the user wants to analyze the patterns of car accidents based on \(\left\{ Alcohol, Priority \right\}\) and \(\left\{ Brakes, Steering \right\}\), then the user can select subset 9, \(\left\{ S_{7}, S_{10}, AM, PM \right\}\). The formal context based on this chosen subset is shown in Table 19. The concept lattice generated from this context is shown in Fig. 4, whereas the list of concepts is shown in Table 20.

Table 19 The formal context based on subset of attributes: \(\left\{ S_{7}, S_{10}, AM, PM \right\}\)
Fig. 4

Concept lattice generated from the context shown in Table 19

Table 20 Extent and intent of concepts shown in Fig. 4

From Table 20, we can find some interesting patterns as follows:

  1. Most of the accidents happen in the AM.

  2. A significant number of accidents happen due to \(\left\{ Alcohol, Priority \right\}\).

  3. A significant number of accidents happen in the PM due to Brakes and Steering.

It can be observed that the reduced concept lattice shown in Fig. 4 preserves the knowledge represented by the original concept lattice shown in Fig. 3. Further, the proposed method provides a way to refine some of the important concepts from Table 20 based on user requirements using their computed weight at different granulations (the proposed algorithm shown in Table 2). Table 21 shows the computed weight for each subset of the attributes shown in Table 19. Table 22 represents the computed weight for each of the concepts listed in Table 20. Some of the important concepts can then be selected based on their weight at different granulations, as shown in Table 23.

Table 21 Computed weight for each attribute shown in Table 19
Table 22 Computed weight for each concept of Fig. 4 shown in Table 20
Table 23 Choosing some important concepts from Table 22 based on granulation

We can observe that the analysis derived from the proposed method agrees with the granular tree method given by Bělohlávek et al. (2014). However, the proposed method can be used by any expert or non-expert user, whereas the granular tree method is more suitable for expert users. Furthermore, the proposed method provides a way to refine some of the specific or important concepts at different granulations of their computed weight based on user requirements, with complexity O(\(j \ln (j)\)), where j is the number of chosen attribute subsets. The number of chosen attribute subsets (j) is smaller than the number of attributes (m) in the original context, i.e., \(j<m\). Due to this fact, the proposed method reduces the size of the concept lattice at a lower computational cost than the recently published method by Prem Kumar et al. (2016a) as well as the granular tree given by Bělohlávek et al. (2014). For a deeper understanding, the proposed method is compared with the granular tree method on several parameters, as shown in Table 24.

Table 24 Comparison of granular tree and the proposed method

From Table 24, the following observations can be made:

  • The granular tree method is useful if the user is an expert, whereas the proposed method can be used by non-expert users as well.

  • The granular tree method focuses on partitions of attributes, whereas the proposed method focuses on subsets of attributes. Computing a subset is easier than finding a partition.

  • Computing the granular tree is an NP-hard problem, whereas the proposed method takes O(\(j \ln (j)\)) time, where \(j<m\).

  • The granular tree method can be applied on contexts having related attributes, whereas the proposed method can be applied on any binary context.

  • The granular tree method does not provide any way to encode the concepts or reduce the space complexity. In contrast, the proposed method provides a numerical representation of the formal concepts, which helps in encoding the data.

6 Conclusions and future work

This paper aimed at reducing the size of the concept lattice by considering subsets of attributes as information granules. The proposed method defines a level of granulation for each chosen subset, providing many ways of refining the knowledge. Furthermore, the proposed method gives another way to find specific patterns (concepts) in the obtained context by computing their weight at different granulations. In this process, none of the objects or attributes is removed by the proposed method. To complete these tasks, the proposed method takes O(\(j \ln (j)\)) time, which is computationally less expensive than the granular tree method (Bělohlávek et al. 2014). Moreover, the analysis derived from the proposed method agrees with the granular tree method and provides a more in-depth analysis to refine the knowledge. In future, the work will focus on applications of the proposed method beyond binary attributes and on its extension to interval-valued subset selection.