In this paper, we introduce the concept of exploring the feature space to aid learning in the context of design space exploration. The feature space is defined as the set of possible features, mapped onto a 2D plane with each axis representing a different interestingness measure, such as precision or recall. Similar to how a designer explores the design space, one can explore the feature space by observing how different features vary in their ability to explain a set of design solutions. We hypothesize that such a process helps designers gain a better understanding of the design space. To test this hypothesis, we conduct a controlled experiment with human subjects. The results suggest that exploring the feature space has the potential to enhance the user's ability to identify important features and predict the performance of a design. However, this effect was observed only for participants with some previous experience with design space exploration.

Introduction

Over the last two decades, the “design by shopping” paradigm [1] has become a popular approach to tackle early-phase (conceptual design or system architecting) engineering design problems. An important step in this approach is design space exploration (a.k.a. tradespace exploration), where the designer analyzes the structure of the design space and learns about the trade-offs in the system, the sensitivities of design criteria to design decisions, the couplings between design decisions, etc. For the remainder of this paper, “learning” in tradespace exploration refers to gaining knowledge about these relationships, and more generally about the mapping between design decisions and design criteria. Through the process of design space exploration, designers can make a more informed decision when selecting the final design.

However, design space exploration presents the challenge of information overload. This problem becomes more prominent in design tasks involving many design decisions, multiple objectives, and intricate couplings between them. It has been shown that as design problems get more complex, designers are overwhelmed by the size and complexity of the data, degrading their ability to understand the relationships between different variables [2,3,4].

To address this issue, various data visualization methods and tools have been developed for design space exploration [5,6,7,8,9,10,11,12,13,14]. Most of these tools focus on providing different views of designs defined in a multidimensional space, in some cases coupled with unsupervised machine learning methods such as clustering, feature selection, and manifold learning. However, because the knowledge remains implicit in the visualization, these methods require an additional step in which humans visually inspect the results and interpret them. The knowledge obtained through visualization can therefore be ambiguous and subjective. Moreover, visually inspecting and finding patterns may be challenging without sophisticated rearranging strategies [15, 16].

A complementary approach to learning about the design space is to extract knowledge explicitly, in the form of logical if-then rules, using data mining algorithms [17,18,19,20]. These methods can be used to extract driving features, i.e., the common features (specific values of design decisions, attributes, or combinations thereof) shared by a group of designs that exhibit similar objective values [21]. For example, Watanabe et al. use association rule mining to analyze hybrid rocket engine designs and find that 83% of all non-dominated (Pareto optimal) solutions had a similar initial port radius [22]. The major advantage of such knowledge is that it can be expressed relatively concisely and unambiguously through a formal representation [23].

While these methods have been used successfully in the past to analyze design spaces, they are not without limitations. One limitation of the current methods is that they impose a rigid structure on the mined features (the conditional “if” parts of the rules): all features are represented as predicates (i.e., binary features) joined by logical conjunctions (i.e., the “and” operator). From a mathematical point of view, this does not reduce expressivity, as any logical formula can be converted into disjunctive normal form, or DNF (i.e., an OR of conjunctive clauses) [24]. Therefore, any Boolean concept (a concept whose membership is determined by a combination of binary features [25]) can be represented using a set of rules (a disjunction of rules). From a human learning point of view, however, the conversion to DNF often results in longer features, which are harder for humans to understand.
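To make the cost of this conversion concrete, the following minimal sketch (using sympy, which is not part of the authors' tool) shows how a compact feature mixing conjunctions and disjunctions expands into a longer DNF:

```python
from sympy.logic.boolalg import to_dnf
from sympy.abc import a, b, c, d

# A compact feature mixing AND and OR: (a OR b) AND (c OR d)
feature = (a | b) & (c | d)

# Converting to DNF yields four conjunctive clauses instead of two binary ones.
print(to_dnf(feature))  # (a & c) | (a & d) | (b & c) | (b & d)
```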

Another limitation of these data mining methods is that they generate a large set of features without an easy way to identify the most useful and informative one [26]. Identifying a single feature that best explains the region of interest of the design space while remaining compact could improve learning. One approach to selecting a single feature is to sort all features using one measure, such as confidence or lift [27]. Intuitively, these interestingness measures quantify the predictive power of a feature. However, selecting a single metric from a large list of alternatives can be arbitrary and may not necessarily yield the right measure for the given design problem [28, 29].

In this paper, we present a new method, feature space exploration, to aid human learning in design space exploration, along with a tool that implements it. The aim of this method is to improve the designer's ability to identify important features and generate new insights. In order to foster learning, we enable designers to explore various forms of features and get immediate feedback on how well these features explain a certain region of the design space (e.g., a cluster, or the Pareto front). This is done by defining the space of all possible features (called the feature space), visualized on a 2D plane. Each axis of the plane represents one of the interestingness measures used in classification (e.g., precision and recall [29]) or association analysis (e.g., confidence, lift, and the Gini index [28]). If one selects conflicting goodness measures such as precision and recall [30], a Pareto front of the feature space can also be defined. The designer can then use the visualization to observe how the goodness measures change in response to a change in the feature, and elicit his or her preferences between the two measures. Because of its similarity to how a designer explores the design space, we refer to this process as “exploring the feature space”. Exploring the feature space helps the designer identify the driving features that shape the structure of the design space. The process takes advantage of the intuitive and fast nature of exploring options through visualization, as well as the ease of learning through formal representations that are clear and concise.

To demonstrate the effectiveness of this new method in improving learning, we conduct a controlled experiment with human subjects. The experiment tests whether exploring the feature space improves the designer's learning, measured as the ability to predict whether a given design will exhibit desirable performance and cost. The results show that exploring the feature space may indeed improve learning about the design space, but only under certain conditions: specifically, for subjects who have received some formal training in design space exploration.

Example Design Problem: Architecting Earth Observing Satellite System

Before explaining how the proposed method works, we first introduce an example design problem to help explain the methodology in the remainder of the paper. It should be noted that the proposed method is not specific to a type of design problem. However, there are some implementation details that are tailored to the structure of this problem. This point will be elaborated on after the design problem is outlined.

The design problem is a real-world system architecting problem previously studied in [31]. The goal of the design task is to architect a constellation of satellites to provide operational observations of the Earth's climate. There are two objectives: maximizing the scientific benefit and minimizing the lifecycle cost. The scientific benefit is a function of an architecture's satisfaction of 371 climate-related measurement objectives, generated based on the World Meteorological Organization's OSCAR (Observing Systems Capability Analysis and Review Tool) database. The level of satisfaction of each measurement objective is quantified based on the capabilities of each design and then aggregated to obtain a number that represents how much scientific benefit each design brings to the climate science community.

The design problem has been formulated as an assignment problem between a set of candidate measurement instruments (space-based sensors related to climate monitoring) and a set of candidate orbits (defined by orbital parameters such as altitude and inclination). Given a set P of candidate instruments and a set O of candidate orbits, the design space is defined as the set of all binary relations from P to O. Each instrument in P can be assigned to any subset of orbits in O, including the empty set. Therefore, the size of the design space is \(2^{\left| P \right|\left| O \right|}\), where \(\left| P \right|\) is the number of candidate instruments and \(\left| O \right|\) is the number of candidate orbits. In this work, we considered 12 candidate instruments and 5 candidate orbits, for a total of \(2^{60}\) possible designs. Each design is represented by a Boolean matrix M of size \(5 \times 12\), where \(M\left( {o,p} \right) = 1\) if instrument p is assigned to orbit o, and \(M\left( {o,p} \right) = 0\) otherwise. Graphically, this can be displayed as in Fig. 1, where each row represents a mission that will fly in each orbit and the columns represent the assignment of different instruments. Note that in the examples throughout this paper, we replace the names of the actual orbits and instruments with numbers (e.g., 1000, 2000) and letters (e.g., A, B, C) to simplify the presentation.
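As a concrete illustration, the short Python sketch below (not part of the original tool) encodes a design as the Boolean assignment matrix M and computes the size of the design space:

```python
import numpy as np

NUM_ORBITS, NUM_INSTRUMENTS = 5, 12  # |O| and |P| used in this paper

# One design: a 5x12 Boolean matrix M, where M[o, p] = 1 iff instrument p
# is assigned to orbit o.
rng = np.random.default_rng(0)
M = rng.integers(0, 2, size=(NUM_ORBITS, NUM_INSTRUMENTS), dtype=np.uint8)

# Every cell of M is an independent binary decision, so the design space
# contains 2^(|P||O|) designs.
design_space_size = 2 ** (NUM_ORBITS * NUM_INSTRUMENTS)
print(design_space_size == 2 ** 60)  # True
```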

Fig. 1 An example architecture representation. Each row represents a single spacecraft flying in a certain orbit. For example, a spacecraft carrying the cloud and precipitation radar (CPR_RAD) and the UV/VIS limb spectrometer (CHEM_UVSPEC) will fly in a sun-synchronous orbit at an altitude of 800 km and an afternoon local time of the ascending node.

Once the design decisions and the corresponding objective values are provided in a structured format, the proposed method mostly treats the design problem as a black box. At the implementation level, however, one critical step is necessary before data mining can be run: formulating the base features. The base features are predicates used to construct more sophisticated Boolean concepts related to the design space. In its simplest form, a base feature can be a single design decision set to 0 or 1. However, we introduce more complex base features to prespecify the structure of the patterns to be searched, thus biasing the search toward more promising regions of the search space. The base features used for the current system architecting problem are shown in Table 1. Formulating such base features requires some domain-specific knowledge and insights obtained by observing the structure of the design problem. Based on those insights, we can speculate which forms of features may drive the performance of a design.

Table 1 Base features

For example, Present is a base feature that describes whether an instrument \(i\) is used in at least one of the orbits. This feature is equivalent to a disjunction of five base features (instrument \(i\) being assigned to each one of the orbits). Present may potentially speed up the search, since the decision of whether to use an instrument at all has a bigger influence on the objective values than the decision of which orbit to assign it to. While this particular predicate may or may not be useful in capturing the driving features, introducing it helps search that hypothesis space effectively. In the remaining sections of this paper, we will use this predefined set of base features to build more complex features, as illustrated in the sketch below.
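The following sketch shows one hypothetical Python encoding of base features like those in Table 1; the names mirror the paper's notation (present, absent, inOrbit, notInOrbit), but the exact signatures are our assumptions:

```python
import numpy as np

def present(M, p):
    """Instrument p is used in at least one orbit (a disjunction over all orbits)."""
    return bool(M[:, p].any())

def absent(M, p):
    """Instrument p is not used in any orbit."""
    return not M[:, p].any()

def in_orbit(M, o, instruments):
    """Every listed instrument is assigned to orbit o."""
    return bool(M[o, list(instruments)].all())

def not_in_orbit(M, o, instruments):
    """No listed instrument is assigned to orbit o."""
    return not M[o, list(instruments)].any()

M = np.zeros((5, 12), dtype=np.uint8)
M[0, 3] = 1  # assign instrument 3 to orbit 0
print(present(M, 3), absent(M, 3), in_orbit(M, 0, [3]), not_in_orbit(M, 1, [3]))
# True False True True
```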

Exploring the Feature Space

In this paper, we propose exploring the feature space as a learning aid for design space exploration. We define the feature space as the set of all possible features, visualized by mapping features onto a coordinate system where each axis represents a different measure of the goodness of a feature (e.g., precision, recall, F score, confidence, lift, or mutual information [28, 29]). In the following sections, we introduce the graphical user interface that enables visualizing and exploring the feature space, and we explain how a designer can use it for insight generation and learning.

iFEED

The capability to explore the feature space is built as an extension to the interactive knowledge discovery tool iFEED [21]. Its goal is to help engineering designers learn, as they interact with the tool, the interesting features that drive designs toward a particular region of the objective space. A user of iFEED can select a group of target designs and run data mining algorithms to extract the common features shared by those designs. The main interface of iFEED is shown in Fig. 2. It consists of an interactive scatter plot showing the design space populated by thousands of alternative designs. When the user hovers the mouse over one of the points in the scatter plot, information about that design is displayed below the plot, including its objective values and design decisions.

Fig. 2 The main graphical user interface of iFEED, which consists of a scatter plot showing the objective space and a display of the design that is currently viewed. Dots highlighted in cyan represent the target region selected by the user.

The scatter plot can help the user select a region of interest in the objective space. When the user drags the mouse over the scatter plot, designs in the selected region are highlighted, and they are considered as target solutions when running the data mining process.

The data mining process is based on the Apriori algorithm, one of the earliest and most popular algorithms for association rule mining [32]. The algorithm has been extended to mine classification association rules, which follow the structure \(X \to C\). Here, X is a feature that describes a design, and C is a class label indicating whether a design belongs to the target region (the cyan area in Fig. 2) or not. The data mining returns a list of features shared by the target designs. For more details on the data mining algorithm, readers are referred to [21].
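To illustrate the rule structure \(X \to C\) (this is a brute-force enumeration over a toy database, not the Apriori implementation used in iFEED), the sketch below forms conjunctive antecedents from base features and keeps the classification rules that pass support and confidence thresholds:

```python
from itertools import combinations

# Toy database: each design is the set of binary base features it satisfies,
# paired with its class label C (True = design is in the target region).
designs = [
    ({"present_K", "absent_I"}, True),
    ({"present_K"},             True),
    (set(),                     False),
    ({"present_K", "absent_I"}, False),
]

MIN_SUPPORT, MIN_CONFIDENCE = 0.25, 0.5
all_features = sorted(set().union(*(f for f, _ in designs)))
n = len(designs)

# Enumerate candidate antecedents X (conjunctions of base features) and keep
# classification rules X -> C meeting the thresholds.
for size in (1, 2):
    for X in combinations(all_features, size):
        covered = [label for f, label in designs if set(X) <= f]
        support = len(covered) / n
        if covered and support >= MIN_SUPPORT:
            confidence = sum(covered) / len(covered)
            if confidence >= MIN_CONFIDENCE:
                print(" AND ".join(X), "-> TARGET",
                      f"(supp={support:.2f}, conf={confidence:.2f})")
```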

Visualization of Feature Space

The features extracted by the data mining algorithm have varying levels of “goodness” in explaining the target designs. Such goodness can be quantified using various metrics from binary classification and association rule mining [28, 29]. In this work, we use the two confidence measures defined as follows:

$$conf\left( {S \to F} \right) = \frac{{supp\left( {S \cap F} \right)}}{supp\left( S \right)}$$
$$conf\left( {F \to S} \right) = \frac{{supp\left( {S \cap F} \right)}}{supp\left( F \right)}$$

Here, S is the set of all designs that are in the target region, and F is the set of all designs that have the particular feature that is being considered. supp stands for support, which is defined as

$$supp\left( X \right) = \frac{\left| X \right|}{\left| U \right|}$$

where U is the set of all designs in the database and \(\left| \cdot \right|\) indicates the cardinality of the set. Confidence is often used in association rule mining to represent the strength of a rule [32]. \(conf\left( {S \to F} \right)\) represents how complete the feature is, in terms of the fraction of the target region that exhibits the feature, while \(conf\left( {F \to S} \right)\) represents how consistent or specific the feature is in explaining only the target region (the fraction of designs with the feature that are in the target region). In fact, because we extract only binary classification rules, \(conf\left( {S \to F} \right)\) and \(conf\left( {F \to S} \right)\) are equivalent to recall and precision, respectively.
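These definitions translate directly into code; the following minimal sketch computes both confidences from sets of design indices (an illustration, not the tool's implementation):

```python
def supp(X, U):
    """Support: fraction of all designs in U that belong to X."""
    return len(X) / len(U)

def conf_S_to_F(S, F, U):  # recall: fraction of the target region exhibiting the feature
    return supp(S & F, U) / supp(S, U)

def conf_F_to_S(S, F, U):  # precision: fraction of designs with the feature that are targets
    return supp(S & F, U) / supp(F, U)

U = set(range(10))
S = {0, 1, 2, 3}      # designs in the target region
F = {2, 3, 4, 5, 6}   # designs exhibiting the feature
print(conf_S_to_F(S, F, U), conf_F_to_S(S, F, U))  # 0.5 0.4
```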

After calculating both confidence measures for the extracted features, we can map the features onto a two-dimensional plane with each axis representing one of the confidence measures, as shown in Fig. 3. This visualizes the feature space as defined at the beginning of this section. In the figure, each triangle is a feature obtained from the data mining algorithm. The general trend in the mined features shows a trade-off between the two confidences, consistent with the relationship often observed between recall and precision.

Fig. 3 Feature space plot, where each axis is one of the confidence measures. Each triangle represents one feature. The red star (upper-right corner) represents the utopia point of the feature space.

The scatter plot displaying the feature space is also implemented as an interactive plot. When the user hovers the mouse over a feature in Fig. 3, the designs that have that feature are highlighted in the scatter plot, as shown in Fig. 4. From these figures, the user can get a quick and intuitive sense of how the feature is distributed within the design space. For example, Fig. 4a highlights a feature whose \(conf\left( {S \to F} \right)\) is high and whose \(conf\left( {F \to S} \right)\) is low. This feature explains most of the target designs, but it is too general: it also covers many designs that are not in the target region. In contrast, the feature highlighted in Fig. 4b has low \(conf\left( {S \to F} \right)\) and high \(conf\left( {F \to S} \right)\). The designs that have this feature fall mostly inside the target region, but only a small portion of the target region is explained by the feature.

Fig. 4 Design space highlighting different features: a designs that have a feature with high \({\text{conf}}\left( {S \to F} \right)\); b designs that have a feature with high \({\text{conf}}\left( {F \to S} \right)\). The cyan dots are the target designs. The pink dots are the designs that have the feature. The purple dots are the overlap of those two sets of designs. The Venn diagram depicts the proportions of the highlighted designs.

Representing and Modifying Features

When the user hovers the mouse over a feature in the feature space plot, a tooltip appears with the name of the feature. For example, the following text represents a feature that consists of two base features linked by a conjunction:

$$absent(I)\,AND\,present(K)$$

In natural language, this can be interpreted as: “Instrument I is not used in any orbit, and instrument K is used in at least one of the orbits.” However, in our tool, this textual representation can only be used to view the extracted features; it cannot be used to modify a given feature or input a new one.

To enable the user to modify and explore other features, we implemented a graphical representation of features, shown in Fig. 5. This representation uses a tree structure consisting of two types of nodes: a leaf node represents a base feature, and a logical connective node represents a logical connective (conjunction or disjunction) that links all of its child nodes. The feature shown in Fig. 5 can therefore also be written in text as: “absent(I) AND present(K) AND notInOrbit(4000, K, G, B) AND (inOrbit(1000, L) OR inOrbit(1000, G, A)).” This graphical representation allows the user to easily see the hierarchical structure of a logical expression when both conjunctions and disjunctions are used. Moreover, the user can modify the structure of a feature by moving nodes through a simple drag-and-drop. Being able to modify and test different features is important for quickly exploring the feature space and gathering information.
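In code, such a tree can be represented and evaluated as in the following minimal sketch (class and method names are our assumptions, not the iFEED implementation; the example feature is simplified from Fig. 5):

```python
import numpy as np

class Leaf:
    """A base-feature predicate applied to the assignment matrix M."""
    def __init__(self, predicate):
        self.predicate = predicate
    def matches(self, M):
        return bool(self.predicate(M))

class Connective:
    """Applies AND/OR over all of its child nodes."""
    def __init__(self, op, children):  # op is "AND" or "OR"
        self.op, self.children = op, children
    def matches(self, M):
        results = (child.matches(M) for child in self.children)
        return all(results) if self.op == "AND" else any(results)

# Hypothetical feature: absent(I) AND (present(K) OR present(L)),
# with instruments encoded as column indices of M.
I, K, L = 8, 10, 11
feature = Connective("AND", [
    Leaf(lambda M: not M[:, I].any()),                 # absent(I)
    Connective("OR", [Leaf(lambda M: M[:, K].any()),   # present(K)
                      Leaf(lambda M: M[:, L].any())])  # present(L)
])

M = np.zeros((5, 12), dtype=np.uint8)
M[0, K] = 1
print(feature.matches(M))  # True
```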

Fig. 5 Representation of a feature using a graph. The displayed feature consists of five base features linked using both conjunctions and disjunctions. The feature can be interpreted in text as “absent(I) AND present(K) AND notInOrbit(4000, K, G, B) AND (inOrbit(1000, L) OR inOrbit(1000, G, A))”.

Search in Feature Space

While the user can explore the feature space by modifying and testing individual features, we also implement a local search method to speed up the exploration process. The local search extends a given feature by adding one base feature using either a conjunction (AND) or a disjunction (OR). The set of possible base features is fixed by the user during the problem formulation step, and its size is kept small. Therefore, the system can test the addition of every possible base feature and return the new features that improve one of the goodness metrics.

To run the local search, the user selects a feature in Fig. 3 by clicking on it. The user then chooses whether a conjunction or a disjunction is used to link the new base feature to the selected feature. When a conjunction is used, the feature becomes more specific (it covers fewer designs), most likely increasing \(conf\left( {F \to S} \right)\). If a disjunction is used instead, the feature becomes more general (it covers more designs), thus increasing \(conf\left( {S \to F} \right)\). The newly generated features are compared with the existing set of features, and only the non-dominated ones are added to the visualization. This provides a quick and easy way for the user to explore the feature space effectively, advancing the Pareto front of the feature space.
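A self-contained sketch of this step is shown below. It represents each feature by the set of designs it covers, so extending with AND intersects cover sets while extending with OR unions them, and only the Pareto non-dominated extensions are kept. This is an illustration under our own simplifications, not the tool's code:

```python
def confidences(F, S):
    """(conf(S->F), conf(F->S)) for a feature cover set F and target set S."""
    inter = len(F & S)
    return inter / len(S), inter / len(F)

def local_search(selected, base_covers, S):
    candidates = [selected & b for b in base_covers]   # AND: intersect cover sets
    candidates += [selected | b for b in base_covers]  # OR: union cover sets
    scored = [(c, confidences(c, S)) for c in candidates if c]
    # Keep only extensions that no other extension dominates in both measures.
    return [(c, s) for c, s in scored
            if not any(t[0] >= s[0] and t[1] >= s[1] and t != s
                       for _, t in scored)]

S = {0, 1, 2, 3}                        # target designs
selected = {0, 1, 4}                    # designs covered by the selected feature
bases = [{1, 2, 3}, {0, 4, 5}, {2, 6}]  # covers of the base features
for cover, (recall, precision) in local_search(selected, bases, S):
    print(sorted(cover), round(recall, 2), round(precision, 2))
```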

Evaluation

To test the efficacy of exploring the feature space as a way to improve the user’s learning, we conduct a controlled experiment with human participants.

Hypothesis and Experiment Conditions

The aim of the experiment is to examine whether exploring the feature space improves learning, compared to when the user interacts only with the design space. Learning is defined here as learning the mapping between design decisions and objective values. Therefore, we set our hypothesis as the following:

  • H1: Exploring the feature space improves a designer’s ability to predict the performance of a design.

To test this hypothesis, we use a within-subject experiment design and compare the learning in two different conditions: design space versus feature space exploration. The capabilities of the tool in these two conditions are summarized in Table 2.

Table 2 The capabilities provided in each condition

In the first condition, called the design space exploration condition, we provide only the parts of the graphical user interface that are related to the design space. For example, the user can inspect each design shown in the design space (see Fig. 2) and observe the values of its design decisions and objectives. The user can also modify a design by adding, deleting, or moving instruments through drag-and-drop. After modifying a design, the new design can be evaluated to obtain its objective values. In addition, a local search in the design space has been implemented to mimic the local search in the feature space. The local search randomly samples four neighboring designs of the currently selected design, evaluates them, and displays the newly added designs in the scatter plot. A neighboring design is defined as a design that can be reached by changing a single design decision of the currently selected design.
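A minimal sketch of this neighbor sampling, using the matrix encoding introduced earlier (the function name and signature are assumptions):

```python
import numpy as np

def sample_neighbors(M, n=4, rng=None):
    """Sample n neighbors of design M, each differing in exactly one decision."""
    rng = rng or np.random.default_rng()
    neighbors = []
    for _ in range(n):
        N = M.copy()
        o = rng.integers(N.shape[0])  # random orbit
        p = rng.integers(N.shape[1])  # random instrument
        N[o, p] ^= 1                  # flip that single assignment decision
        neighbors.append(N)
    return neighbors
```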

The second condition is called the feature space exploration condition. Here, the user is still able to inspect individual designs in the design space. However, other interactions with the design space (evaluating new designs, local search) are not allowed. Instead, the user can run data mining to obtain an initial set of features, visualized in a manner similar to Fig. 3. Modifying, evaluating, and inspecting individual features are also enabled, through the interfaces shown in Figs. 4 and 5. Moreover, the user can run a local search to quickly explore the feature space.

The two conditions are designed to make the types of interaction as similar as possible: in both, the user can modify, evaluate, and inspect designs or features and run local searches in the respective spaces.

Experiment Protocol

Participants are first given an interactive tutorial that explains the design problem as well as all the capabilities of the tool. The tutorial is designed to take around 20–30 min to finish. After the tutorial, each participant is given two different tasks within the same system architecting problem described above (architecting a constellation of climate-monitoring satellites). The two tasks differ in the set of capabilities provided (the two experimental conditions). For each task, the participant is asked to find, and take notes on, the features that would be useful for identifying whether an arbitrary design will be in the target region. The tasks are designed to be representative of a designer's effort to find patterns within a group of designs. A different target region is specified for the participant to investigate in each task. The two treatment conditions are presented in random order, and a 10 min time limit is applied to each task to control how much time each participant spends learning.

After each 10 min session, the participants are given a short quiz to measure how much they learned during the interaction. For each question, a figure similar to Fig. 1 is shown, and the participant is asked to predict whether the given design will be located inside the target region. A total of 25 YES/NO questions are given.

Participants

We recruited 38 participants, all of whom were university students. The study was approved by the Cornell University Institutional Review Board, and written informed consent was obtained from all participants. The average age of the participants was 23.0, with a standard deviation of 4.05. There were 21 male and 17 female participants; 26 students identified themselves as majoring in a STEM field, and 12 as having majors outside STEM.

The recruitment was done through two channels. First, we recruited from the general student population on campus, offering $15 Amazon gift cards as compensation; 23 participants were recruited this way.

Second, we recruited students who were taking a graduate-level course on Systems Architecture. These students were offered a small amount of extra credit for the class as compensation. The reason for recruiting this second group was our previous experience running a pilot study and other experiments with similar interfaces. We have observed in the past that participants who had not previously been exposed to some basic concepts in design space exploration (such as design decisions, objectives, features, recall, and precision) often struggled to understand the task they were asked to perform and did not utilize all the capabilities of the tool. In addition to our main hypothesis, we therefore also wanted to test whether the participants' formal training in these concepts has any interaction effect with the experimental condition on performance. 15 participants were recruited from the class.

Result

The test scores of all participants are summarized in Table 3. The average scores shown in the table represent the percentage of questions answered correctly out of the 25 questions in each problem set. The average scores for the two conditions are effectively the same; a one-tailed paired samples t-test gives a p-value of 0.209.

Table 3 Descriptive statistics for all subjects. The mean score is the percentage of questions answered correctly out of the 25 questions in each test

A more interesting result emerges when the participants are grouped by whether they had formal training (a first-year graduate course on Systems Architecture) or not. We ran a two-way repeated-measures ANOVA to compare the mean scores of the two conditions while also considering the effect of the participants' formal training. Table 4 shows the within-subject effects, and Table 5 shows the between-subject effects. Neither the experimental condition nor the formal training (taking the Systems Architecture class or not) has a statistically significant main effect on its own. However, there is a significant interaction effect between the two factors (p = 0.002).

Table 4 Within-subject effects
Table 5 Between-subject effects

Figure 6 shows the average test scores after dividing the participants into two groups (received formal training or not). The two groups exhibit opposite trends. Those who had not received formal training scored higher, though not significantly (one-tailed paired samples t-test: t = 1.261, p = 0.890), when they explored the design space (M = 74.09, SD = 12.41) than when they explored the feature space (M = 70.26, SD = 11.11). In contrast, those who had received formal training performed significantly better (one-tailed paired samples t-test: t = 3.759, p < 0.001) when they explored the feature space (M = 82.13, SD = 6.906) than when they explored the design space (M = 71.20, SD = 8.029).
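For reference, paired comparisons of this form can be run with scipy (version 1.6 or later, for the alternative argument); the score arrays below are placeholders to show the shape of the analysis, not the study data:

```python
import numpy as np
from scipy import stats

# Placeholder per-participant scores (percent correct) under both conditions;
# NOT the study data, included only to illustrate the form of the test.
design_space = np.array([74.0, 70.0, 68.0, 80.0, 72.0])
feature_space = np.array([82.0, 78.0, 74.0, 84.0, 80.0])

# One-tailed paired samples t-test (H1: feature space > design space).
t, p = stats.ttest_rel(feature_space, design_space, alternative="greater")
print(f"t = {t:.3f}, p = {p:.4f}")
```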

Fig. 6 Test scores for each exploration condition, grouped by whether participants received formal training in system architecture design. The error bars show the standard error.

Discussion

From the experiment, we find an interaction effect between the exploration strategy and formal training. While the participants with no formal training performed equivalently in both tasks, the participants who had received formal training performed significantly better in the feature space exploration condition than in the design space exploration condition. This suggests that those who had previously been exposed to the basic concepts of engineering design and tradespace analysis found feature space exploration more useful. A possible explanation is that the tool for exploring the feature space is less intuitive and more difficult to learn. Exploring the feature space requires reasoning at a higher level of abstraction, as it deals with what groups of designs have in common rather than with individual designs. Moreover, it requires understanding how features are represented (as shown in Fig. 5) as well as the basic interestingness measures of binary classification (e.g., precision and recall). Although each subject completed the tutorial prior to the actual tasks, he or she may not have been able to grasp all the concepts needed to make full use of the tool's capabilities. This is also reflected in the qualitative feedback we obtained from the participants after each session. Many participants reported difficulty understanding how to use the feature space exploration capabilities effectively, and most thought that manually inspecting each design was more helpful.

While some of the participants who received formal training made similar reports, others thought that exploring the feature space was more practical and useful for answering the quiz questions. It is possible that the formal training helped them better understand the tool. In the Systems Architecture class, the lectures cover a wide range of topics related to tradespace analysis, including the decision space, the objective space, Pareto dominance, driving features, and sensitivity analysis, among others. While this does not guarantee a student's understanding of these subjects, we can assume that they have been exposed to, and are thus familiar with, these topics.

The current experimental results support our hypothesis that exploring the feature space improves learning, on the condition that the user is familiar with the key concepts of design space exploration and has been trained to reason in an abstract space. We believe this is a promising result, since the proposed method is mainly intended for professional engineering designers and systems engineers who are already familiar with design space exploration.

Conclusion

This paper introduced a new concept in design space exploration, namely exploring the feature space, where the feature space is defined as the set of possible features (combinations of values for various design decisions). The feature space is visualized on a 2D plane, with the axes representing \(conf\left( {S \to F} \right)\) and \(conf\left( {F \to S} \right)\), two measures equivalent to recall and precision, respectively. The designer can explore the feature space by modifying and testing different features and receiving immediate feedback on how the goodness measures change in response. Such interaction provides a chance to learn and compare how well different features explain a selected region of the design space. This contrasts with the conventional way of presenting and selecting features in data mining, where the mined features are usually sorted by a single goodness metric and only a handful are inspected by the user. By inspecting the feature space, the designer can readily identify the major features that drive the performance of a design and see how well they explain the data.

The results from a controlled human subject experiment showed that participants who had received formal training in the key concepts of design space exploration performed better when they had a chance to explore the feature space than when they explored only the design space. This shows that feature space exploration has the potential to enhance a designer's learning about the important features, reflected in the ability to predict the behavior of a design. For the purpose of this study, feature space exploration was tested separately from design space exploration. However, it is designed as a supplementary tool that helps engineering designers learn and gain new insights about what features constitute good designs. Designers can then leverage this knowledge to explore the design space more effectively.

A limitation of the results presented in this paper is that only the participants who had received formal training performed better under the feature space exploration condition. While we obtained unstructured qualitative feedback after each experiment, it is not clear how participants' previous exposure to design space exploration influenced the result. This will need to be investigated further in the future, with a larger sample size and with ways to measure how effectively each participant used feature space exploration.

There are also limitations in the current method for exploring the feature space. The local search method for creating and testing new features is simple and intuitive, but at the same time greedy and prone to overfitting. Using the local search, users can easily generate features with very high confidence metrics \(conf\left( {S \to F} \right)\) and \(conf\left( {F \to S} \right)\), but these features are often too complex (a large number of literals in a deeply nested structure of disjunctions and conjunctions). When a feature becomes too complex, it is very difficult to comprehend or draw any insights from it. To resolve this issue, the authors are currently investigating new ways to populate the feature space.