1 Introduction

Recently, there has been increasing interest in business analytics and big data tools for understanding and driving the evolution of industries. The healthcare industry is likewise interested in new methods to analyze data and provide better care. Given the wealth of data that various institutions are accumulating, it is natural to take advantage of data-driven decision-making solutions. Modern computing techniques, including machine learning, intelligent data analysis and decision support systems, provide a promising new way to better understand, improve and support treatment. The main motivation for this research is to study the possibilities of applying modern information technologies and machine learning methods in medicine. Machine learning and data exploration methods should help in understanding the relationships among treatment factors and audiological measurements, and thus in better understanding tinnitus treatment. Understanding the relationships and patterns among treatment factors would help to optimize the treatment process. Additionally, different preprocessing techniques are used to transform the tinnitus dataset into a form more suitable for machine learning.

2 Background

Tinnitus, popularly known as “ringing in the ears”, affects a significant portion of the population: according to some estimates, about 10–20% of the general population. Its causes are often unclear; it is associated with hearing loss, ear infections, acoustic neuroma, Ménière’s syndrome, aging and side effects of some drugs. There is no cure for it, and treatment methods prove ineffective in many cases: some methods work well for some patients but not necessarily for others, so treatment must be highly personalized.

Tinnitus Retraining Therapy is a highly successful method of treatment proposed and developed by Dr. Jastreboff. The patients are categorized into one of four groups of tinnitus based on interview, audiological and medical evaluation (see Table 1). The therapy consists of a series of counseling sessions accompanied by use of devices called sound generators. Treatment progress and results were historically collected by Dr. Jastreboff resulting in a database of demographic and medical data of patients, as well as a series of metrics measuring treatment progress for each visit.

Our motivation was to further study factors behind therapy’s effectiveness in order to collect actionable knowledge gathered by Dr. Jastreboff over several years of treatment (1999–2005). This would allow for further proliferation of therapy, introducing objectivity and standardization of the therapy in places lacking expertise in the field.

Table 1 Determining categories of tinnitus patients [1]

3 Approach

The approach based on action rules is a new method of machine learning that solves problems which traditional methods, such as classification or association rules, cannot handle. The purpose here is to analyze data in order to improve the understanding of the data and to seek specific actions that enhance the decision-making process. In contrast to learning association rules, the action rule approach mines actionable patterns that can be employed to reach a desired goal (such as increasing treatment progress) instead of merely extracting passive relations between variables. Since their introduction in 2000 [2], action rules have been successfully applied in many domains, including business [2], medical diagnosis and treatment [3], and automatic music indexing and retrieval [4].

Action rules seem to be especially promising in the field of medical data, as a doctor can examine the effect of treatment decisions on a patient’s improved state. For example, in the tinnitus dataset, such an indicator for tracking improvement progress would be a Total score attribute, calculated by the sum of the responses from the interview form.

3.1 Origins in Rough Set Theory

The concepts of Action Rule, Reducts, Decision Table and Information System have their origins in the theory of Rough Sets, developed by Professor Zdzisław Pawlak at the beginning of the 1980s [5]. The theory proposed a novel approach to the formal representation of knowledge and, since its introduction, has been developed extensively all around the world, confirming its usefulness in practical settings.

3.2 Decision Rules

A decision rule, for a given decision table, is a rule of the form \((\phi \rightarrow \delta )\), where \(\phi \) is called the premise (or assumption) and \(\delta \) is called the conclusion (or thesis) of the rule. The premise of an atomic rule can be a single term or a conjunction of n elementary conditions, \(\phi = p_1 \wedge p_2 \wedge \ldots \wedge p_n\), and \(\delta \) is a decision attribute. A decision rule describing a class \(K_j\) means that objects which satisfy (match) the rule’s premise belong to \(K_j\).

Each rule can be characterized by the following features:

  • length(r) = number of descriptors in the premise of the rule,

  • [r] = a set of objects from U matching the rule’s premise,

  • support(r) = number of objects from U matching the rule’s premise: |[r]| (relative support divides this number by the total number of objects N),

  • confidence(r) = reliability of the rule: \(\frac{|[r] \cap DEC_k|}{|[r]|}\), that is, the number of objects matching both the rule’s premise and its conclusion (\(DEC_k\) being the set of objects in the decision class named by the conclusion), divided by the absolute support.
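The rule features listed above can be illustrated with a minimal sketch. The toy decision table, the attribute names (`hearing_loss`, `hyperacusis`, `category`) and the helper names are hypothetical and chosen only for illustration; premises are restricted to equality descriptors for brevity.

```python
from typing import Any, Dict, List, Set

Table = List[Dict[str, Any]]


def matching_objects(table: Table, premise: Dict[str, Any]) -> Set[int]:
    """Return [r]: indices of the objects that satisfy every descriptor of the premise."""
    return {
        i for i, obj in enumerate(table)
        if all(obj.get(attr) == value for attr, value in premise.items())
    }


def rule_features(table: Table, premise: Dict[str, Any],
                  decision_attr: str, decision_value: Any) -> Dict[str, float]:
    """Compute length, support, relative support and confidence of one rule."""
    covered = matching_objects(table, premise)
    support = len(covered)  # |[r]|
    # Objects matching both the premise and the conclusion: |[r] ∩ DEC_k|
    hits = sum(1 for i in covered if table[i][decision_attr] == decision_value)
    return {
        "length": len(premise),
        "support": support,
        "relative_support": support / len(table) if table else 0.0,
        "confidence": hits / support if support else 0.0,
    }


# Hypothetical toy decision table (not taken from the tinnitus dataset).
TOY_TABLE = [
    {"hearing_loss": "high", "hyperacusis": "low", "category": 2},
    {"hearing_loss": "high", "hyperacusis": "low", "category": 2},
    {"hearing_loss": "high", "hyperacusis": "high", "category": 3},
    {"hearing_loss": "low", "hyperacusis": "low", "category": 1},
    {"hearing_loss": "high", "hyperacusis": "low", "category": 1},
]

FEATURES = rule_features(
    TOY_TABLE, {"hearing_loss": "high", "hyperacusis": "low"}, "category", 2
)
```

In this toy table three objects match the premise and two of them belong to category 2, so the rule has support 3 and confidence 2/3.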

3.2.1 Classification Rules

In the context of a prediction problem, decision rules generated from a training dataset are used for classifying new objects (for example, assigning a new patient to a tinnitus category). New objects are understood as objects that were not used for rule induction (new patients coming to the doctor). The new objects are described by attribute values (for instance, a patient with a completed audiological evaluation and form responses). The goal of classification is to assign a new object to one of the decision classes. Prediction is performed by matching the object description against the rule antecedents.
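Matching a new object against rule antecedents can be sketched as follows. The text does not say how conflicts between several matching rules are resolved; picking the matching rule with the highest confidence is an assumption made here, and the rules and attribute names are hypothetical.

```python
from typing import Any, Dict, List, Optional

Rule = Dict[str, Any]


def classify(obj: Dict[str, Any], rules: List[Rule], default: Optional[Any] = None) -> Any:
    """Assign a decision class by matching the object against rule premises.

    Assumption: when several rules match, the most confident one wins.
    """
    matching = [
        rule for rule in rules
        if all(obj.get(attr) == value for attr, value in rule["premise"].items())
    ]
    if not matching:
        return default  # no rule covers this object
    return max(matching, key=lambda rule: rule["confidence"])["decision"]


# Hypothetical rules, for illustration only.
RULES = [
    {"premise": {"hearing_loss": "high"}, "decision": 2, "confidence": 0.7},
    {"premise": {"hyperacusis": "high"}, "decision": 3, "confidence": 0.9},
]

NEW_PATIENT = {"hearing_loss": "high", "hyperacusis": "high"}
PREDICTED = classify(NEW_PATIENT, RULES)
```

Other strategies, such as voting weighted by support, would fit the same interface.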

3.3 Action Rules

An action is understood as a way of controlling or changing some of the attribute values in an information system to achieve desired results [6]. An action rule is defined [2] as a rule extracted from an information system that describes a transition that may occur within objects from one state to another, with respect to a decision attribute defined by the user. Formally, an action rule is written as a term \([(\omega ) \wedge (\alpha \rightarrow \beta ) \rightarrow (\varPhi \rightarrow \varPsi )]\), where \(\omega \) denotes a conjunction of fixed condition attributes, \((\alpha \rightarrow \beta )\) are proposed changes in the values of flexible features, and \((\varPhi \rightarrow \varPsi )\) is the desired change of the decision attribute (the action effect). Action rule discovery applied to the tinnitus dataset could, for example, suggest a change in a flexible attribute, such as the type of sound generator instrument, to help “reclassify” or “transit” an object (patient) to a different category and consequently attain better treatment effectiveness.

An action rule is built from atomic action sets.

Definition 1

Atomic action term is an expression \((a, a_1 \rightarrow a_2)\), where a is attribute, and \(a_1, a_2 \in V_a\), where \(V_a\) is a domain of attribute a.

If \(a_1 = a_2\) then a is called stable on \(a_1\).

Definition 2

By action sets we mean the smallest collection of sets such that:

  1. If t is an atomic action term, then t is an action set.

  2. If \(t_1, t_2\) are action sets, then \(t_1 \wedge t_2\) is a candidate action set.

  3. If t is a candidate action set and for any two atomic actions \((a, a_1 \rightarrow a_2)\), \((b, b_1 \rightarrow b_2)\) contained in t we have \(a \ne b\), then t is an action set. Here b is another attribute \((b \in A)\), and \(b_1, b_2 \in V_b\).

Definition 3

By an action rule we mean any expression \(r=[t_1 \Rightarrow t_2]\), where \(t_1\) and \(t_2\) are action sets.

The interpretation of the action rule r is, that by applying the action set \(t_1\), we would get, as a result, the changes of states in action set \(t_2\).

Example 1

Assuming that a, b and d are a stable attribute, a flexible attribute and a decision attribute, respectively, in S, the expressions \((a, a_2)\), \((b, b_1 \rightarrow b_2)\), \((d, d_1 \rightarrow d_2)\) are examples of atomic action sets. Expression \((a, a_2)\) means that the value \(a_2\) of attribute a remains unchanged, and \((b, b_1 \rightarrow b_2)\) means that the value of attribute b is changed from \(b_1\) to \(b_2\). Expression \(r = [\{(a, a_2) \wedge (b, b_1 \rightarrow b_2) \} \Rightarrow \{(d, d_1 \rightarrow d_2)\}]\) is an example of an action rule meaning that if value \(a_2\) of a remains unchanged and the value of b changes from \(b_1\) to \(b_2\), then the value of d is expected to transition from \(d_1\) to \(d_2\). Rule r can also be perceived as the composition of two association rules \(r_1\) and \(r_2\), where \(r_1 = [\{(a,a_2) \wedge (b,b_1)\} \Rightarrow (d,d_1)]\) and \(r_2 = [\{(a,a_2) \wedge (b,b_2)\} \Rightarrow (d,d_2)]\).

In other words, if we apply action rule r to a patient satisfying rule \(r_1\), then it is expected that this patient will also satisfy rule \(r_2\). The confidence of action rule r is defined as (confidence of \(r_1\)) \(\times \) (confidence of \(r_2\)).
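The product definition of action rule confidence can be checked on a small example. The sketch below decomposes an action rule into its two association rules \(r_1\) and \(r_2\) and multiplies their confidences; the toy information system and the function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Table = List[Dict[str, Any]]


def confidence(table: Table, premise: Dict[str, Any], decision: Tuple[str, Any]) -> float:
    """Confidence of an association rule premise => (attribute, value)."""
    covered = [obj for obj in table
               if all(obj[attr] == value for attr, value in premise.items())]
    if not covered:
        return 0.0
    d_attr, d_value = decision
    return sum(1 for obj in covered if obj[d_attr] == d_value) / len(covered)


def action_rule_confidence(table: Table,
                           stable: Dict[str, Any],
                           flexible: Dict[str, Tuple[Any, Any]],
                           decision_change: Tuple[str, Any, Any]) -> float:
    """conf(r) = conf(r1) * conf(r2) for r = [stable ∧ flexible changes => decision change]."""
    d_attr, d_from, d_to = decision_change
    # r1: state before the action; r2: state after the action.
    premise_r1 = {**stable, **{a: old for a, (old, _) in flexible.items()}}
    premise_r2 = {**stable, **{a: new for a, (_, new) in flexible.items()}}
    return (confidence(table, premise_r1, (d_attr, d_from))
            * confidence(table, premise_r2, (d_attr, d_to)))


# Hypothetical information system S with attributes a (stable), b (flexible), d (decision).
S = [
    {"a": "a2", "b": "b1", "d": "d1"},
    {"a": "a2", "b": "b1", "d": "d1"},
    {"a": "a2", "b": "b1", "d": "d2"},
    {"a": "a2", "b": "b2", "d": "d2"},
    {"a": "a2", "b": "b2", "d": "d2"},
    {"a": "a2", "b": "b2", "d": "d1"},
    {"a": "a1", "b": "b1", "d": "d1"},
]

CONF_R = action_rule_confidence(S, {"a": "a2"}, {"b": ("b1", "b2")}, ("d", "d1", "d2"))
```

Here both \(r_1\) and \(r_2\) have confidence 2/3, so the action rule has confidence 4/9.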

3.4 Meta Actions

Action rules are mined on the entire set of objects in S. Meta-actions, on the other hand, are chosen based on the action rules. They are formally defined as higher level concepts used to model a generalization of action rules in an information system [2]. They trigger actions that cause transitions in values of some flexible attributes in the information system. These changes, in turn, result in a change of decision attributes’ values.

Definition 4

Let M(S) be a set of meta-actions associated with an information system S. Let \(a \in A\), \(x \in X\), and \(M \subset M(S)\). Applying the meta-actions in the set M on object x will result in \(M(a(x)) = a(y)\), where object x is converted to object y by applying all meta-actions in M to x.

Example 2

Let M(S), where \(S = (X, A)\), be a set of meta-actions associated with an information system S. In addition let \(T = \{v_{i,j}: j \in J_i, x_i \in X\}\) be the set of ordered transactions, patient visits, such that \(v_{i,j} = [(x_i, A(x_i)_j)]\), where \(A(x_i)_j\) is a set of attribute values \(\{a(x_i): a \in A\}\) of the object \(x_i\) for the visit represented uniquely by the visit identifier j. Each visit represents the current state of the object (patient) and current diagnosis. For each patient’s two consecutive visits \((v_{i,j}, v_{i, j+1})\), where meta-actions were applied at visit j, it is possible to extract an action set. In this example, an action set is understood as an expression that defines a change of state for a distinct attribute that takes several values (multivalued attribute) at any object state. For example \(\{a_1, a_2, a_3\} \rightarrow \{a_1, a_4\}\) is an action set that defines a change of values for attribute \(a \in A\) from the set \(\{a_1, a_2, a_3\}\) to \(\{a_1, a_4\}\), where \(\{a_1, a_2, a_3, a_4\} \subseteq V_a\) [7].
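Extracting action sets from pairs of consecutive visits, as in Example 2, can be sketched as below. The data layout (a mapping from patient to visit number to multivalued attribute values) and all names are assumptions; adjacent visit numbers in sorted order are treated as consecutive.

```python
from typing import Dict, FrozenSet, List, Tuple

Visit = Dict[str, FrozenSet[str]]
ActionSet = Tuple[str, int, str, FrozenSet[str], FrozenSet[str]]


def extract_action_sets(history: Dict[str, Dict[int, Visit]]) -> List[ActionSet]:
    """For each pair of consecutive visits, record every attribute whose value set changed.

    Each result is (patient, visit j, attribute, values at j, values at the next visit).
    """
    changes: List[ActionSet] = []
    for patient in sorted(history):
        visits = history[patient]
        ordered = sorted(visits)
        for j, j_next in zip(ordered, ordered[1:]):
            before, after = visits[j], visits[j_next]
            for attr in sorted(set(before) | set(after)):
                old = frozenset(before.get(attr, ()))
                new = frozenset(after.get(attr, ()))
                if old != new:
                    changes.append((patient, j, attr, old, new))
    return changes


# Hypothetical visit history for one patient x1.
HISTORY = {
    "x1": {
        0: {"a": frozenset({"a1", "a2", "a3"}), "d": frozenset({"d1"})},
        1: {"a": frozenset({"a1", "a4"}), "d": frozenset({"d1"})},
        2: {"a": frozenset({"a1", "a4"}), "d": frozenset({"d2"})},
    }
}

ACTION_SETS = extract_action_sets(HISTORY)
```

For this history the first visit pair yields the action set \(\{a_1, a_2, a_3\} \rightarrow \{a_1, a_4\}\) on attribute a, and the second pair yields a change of d from \(d_1\) to \(d_2\).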

These action sets resulting from the application of meta-actions represent the actionable knowledge needed by practitioners. However, not every patient reacts in the same way to the same meta-actions, because patients may have different preconditions. In other words, some patients may be only partially affected by the meta-actions and may experience other side effects. There is therefore a need to personalize meta-actions when executing action rules. The problem of personalized meta-actions is a fairly new topic, and little work has been done on it so far [8]. Action sets have to be additionally mined for historical patterns. To evaluate these action set patterns, some frequency measure over all patients has to be used (for example support or confidence). There is room for improvement in personalized meta-action mining as well. In healthcare, for instance, meta-actions representing patients’ treatments can be mined from doctors’ prescriptions. In addition to action rule mining in healthcare, meta-actions present an interesting area for mining personalized treatments.

4 Experiments

4.1 Dataset

The progress of treatment with Tinnitus Retraining Therapy (habituation of tinnitus) was monitored and recorded at the Tinnitus and Hyperacusis Center at Emory University School of Medicine. The original sample of 555 patients, described by forms during initial or follow-up visits and collected by Dr. Jastreboff, was used. Additionally, the Tinnitus Handicap Inventory was administered to individuals during their visits to the Center. The database consists of tuples identified by patient and visit numbers and has been developed over the years by entering patients’ information from paper forms (devised by Dr. Jastreboff).

The raw dataset was organized into 11 tables including data on:

  • Demographics: all the demographic information, such as address, age, gender, occupation and work status.

  • Pharmacology: information on medications taken by a patient.

  • Visits: the main inventory of visits and their outcomes, timestamped.

  • Audiological measurements: carried out by a physician at visits.

  • Initial and follow-up forms: questions on tinnitus, sound tolerance and hearing problems; the answers are mostly on a Likert scale.

  • Newman form: questions capturing the patient’s subjective opinion on the impact of tinnitus on three areas of their life (emotional, functional and catastrophic), along with summary values and the total score of all of them.

  • Instruments: sound generators used within the therapy, with details such as type, model, etc.

  • REM: settings used for sound generators as a part of the therapy.

4.2 Preprocessing and Feature Extraction

The raw dataset was preprocessed: tables were merged into one dataset of visits. The dataset was found to be incomplete and inconsistent in terms of visits’ numbering and timestamps, which had to be fixed manually. Some columns contained too many missing values, so they had to be discarded.

New features were introduced as described below.

4.2.1 Tinnitus Background

Binary features related to tinnitus background, such as STI (Stress Tinnitus Induced) and NTI (Noise Tinnitus Induced), were developed based on the textual descriptions in the T Induced and H Induced columns of the Demographics table. For example, STI was identified with keywords such as ‘divorce’ and ‘excessive work’, and NTI with ‘noise exposure’ and ‘shooting guns’. Other binary attributes developed to indicate a tinnitus/hyperacusis cause were related to specific medical conditions:

  • HLTI (Hearing Loss Tinnitus Induced): covers patients who associated their tinnitus with a hearing loss.

  • DETI (Depression Tinnitus Induced): relates tinnitus symptoms to depression.

  • AATI (Auto Accident Tinnitus Induced): whether tinnitus emerged as a result of an auto accident that involved head injuries.

  • OTI (Operation Tinnitus Induced): patients after surgeries.

  • OMTI (Other Medical): patients whose tinnitus was related to medical conditions other than a hearing loss, depression or an operation, such as acoustic neuroma, Lyme disease, ear infections, obsessive-compulsive disorder and others.
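Keyword-based derivation of such binary flags can be sketched as follows. Only the STI and NTI keywords quoted in the text are used; the keyword lists for the remaining flags are not given in the source and are therefore omitted. Simple case-insensitive substring matching is an assumption about how the descriptions were scanned.

```python
from typing import Dict, Optional, Tuple

# Keywords quoted in the text; the lists actually used were presumably longer.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "STI": ("divorce", "excessive work"),
    "NTI": ("noise exposure", "shooting guns"),
}


def background_flags(description: Optional[str]) -> Dict[str, bool]:
    """Derive binary background features from a free-text 'induced' description."""
    text = (description or "").lower()  # a missing description yields all-False flags
    return {
        flag: any(keyword in text for keyword in keywords)
        for flag, keywords in KEYWORDS.items()
    }


FLAGS = background_flags("Started after divorce and a period of Excessive Work")
```

In practice such automatically derived flags would still need manual review, since free-text descriptions vary widely.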

4.2.2 Temporal Features

Given a patient’s date of birth and the date of the first visit, a column giving the patient’s age at the start of treatment can be derived. Temporal information could also be extracted from the T induced (or H induced) column, which often states how long ago, or on what date, the tinnitus (or hyperacusis) appeared. In this way the DTI/DHI columns were developed, by checking each patient tuple and calculating the date manually. With this information it was possible to derive a number of new features: the age of a patient when tinnitus started, as well as the time elapsed between the tinnitus onset and the initial visit to the doctor. This can potentially reveal how the patient’s age at the start of treatment, the age when tinnitus began, and the time elapsed from symptom onset to the start of treatment affect the effectiveness of particular treatment methods in TRT.

To summarize the work on temporal feature development, the following new columns were added to the original database:

  • DTI (Date Tinnitus Induced): date column derived from text columns,

  • DHI (Date Hyperacusis Induced): analogous to the above, but derived from the H induced column; both new attributes convey general information about when “the problem” started, and both were developed manually,

  • AgeInd: patient’s age when the problem (tinnitus or hyperacusis) was induced, derived from the DOB and DTI/DHI columns,

  • AgeBeg: patient’s age when they started TRT treatment (first visit to Dr. Jastreboff), derived from the DOB and Date (of visit 0) columns,

  • numerical columns DAgo, WAgo, MAgo, YAgo: how many days, weeks, months and years ago the problem started,

  • binary columns calculated on the basis of the columns above: Y30, Y20, Y10, Y5, Y3, Y1, M6, M3, M1, W2, W1, D1, indicating to which group of elapsed time between the tinnitus onset and the start of treatment a patient belongs (Y for years, M for months, W for weeks, D for days, followed by a numerical value). For example, a “True” value in the Y5 column for a given patient means that the problem was induced between 5 and 10 years before starting TRT treatment.
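The temporal columns above can be derived along the following lines. This is a sketch under stated assumptions: the average lengths of a year (365.25 days) and a month (30.4375 days) are approximations chosen here, each bin is taken to run from its own threshold up to the next larger one (consistent with the Y5 example), and the function names are hypothetical.

```python
from datetime import date
from typing import Dict, Union

DAYS_PER_YEAR = 365.25     # assumed average year length
DAYS_PER_MONTH = 30.4375   # assumed average month length

# Bin name and lower bound in days, ordered from the longest elapsed time down.
BINS = [
    ("Y30", 30 * DAYS_PER_YEAR), ("Y20", 20 * DAYS_PER_YEAR), ("Y10", 10 * DAYS_PER_YEAR),
    ("Y5", 5 * DAYS_PER_YEAR), ("Y3", 3 * DAYS_PER_YEAR), ("Y1", DAYS_PER_YEAR),
    ("M6", 6 * DAYS_PER_MONTH), ("M3", 3 * DAYS_PER_MONTH), ("M1", DAYS_PER_MONTH),
    ("W2", 14), ("W1", 7), ("D1", 1),
]


def age_in_years(born: date, on: date) -> int:
    """Completed years between a date of birth and a reference date."""
    return on.year - born.year - ((on.month, on.day) < (born.month, born.day))


def elapsed_bin_flags(days_ago: int) -> Dict[str, bool]:
    """Set exactly one bin flag: the largest threshold not exceeding days_ago."""
    flags = {name: False for name, _ in BINS}
    for name, lower_bound in BINS:
        if days_ago >= lower_bound:
            flags[name] = True
            break
    return flags


def temporal_features(dob: date, onset: date, first_visit: date) -> Dict[str, Union[int, bool]]:
    """Derive AgeInd, AgeBeg, DAgo/WAgo/MAgo/YAgo and the binary bin columns."""
    days = (first_visit - onset).days
    features: Dict[str, Union[int, bool]] = {
        "AgeInd": age_in_years(dob, onset),
        "AgeBeg": age_in_years(dob, first_visit),
        "DAgo": days,
        "WAgo": days // 7,
        "MAgo": int(days // DAYS_PER_MONTH),
        "YAgo": int(days // DAYS_PER_YEAR),
    }
    features.update(elapsed_bin_flags(days))
    return features


# Hypothetical patient: born 1950-06-15, onset 1994-03-01, first visit 2001-03-01.
FEATURES = temporal_features(date(1950, 6, 15), date(1994, 3, 1), date(2001, 3, 1))
```

For this hypothetical patient the onset lies seven years before the first visit, so only the Y5 flag is set.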

4.2.3 Binary Features for Medications Taken by Patients

Instead of maintaining a list of medications for each patient, the medications were pivoted into binary features. Pivoting the data on the medication column yields a single row per patient. This row lists all the medications taken by the patient, with the medication names as column names and a binary value (True/False) in each column. The pivot transformation was implemented with PL/SQL procedures. Each distinct value in the Medication column of the Pharmacology table was turned into an additional column. Bit values in the column indicate, for each patient-visit tuple, whether the medication denoted by the column name was taken. As a result, 311 additional features were developed, one for each distinct medication. A similar approach was taken for the Application column in the Pharmacology table. Values in this column describe the patients’ medical problems associated with the medications taken. As a result, an additional 161 columns were developed, one for each separate medical condition (for example “anxiety”, “asthma”, “insomnia”, “ulcers”, etc.).
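The pivot described above was done in PL/SQL; the same idea is shown here in Python purely as an illustration, not as a reconstruction of the original procedures. The record layout `(patient, visit, medication)` and the sample rows are assumptions.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, str]  # (patient id, visit number, medication name)


def pivot_medications(records: List[Record]) -> Tuple[List[str], Dict[Tuple[int, int], Dict[str, bool]]]:
    """Turn a long list of medication records into one binary row per patient-visit."""
    # One column per distinct medication name.
    columns = sorted({medication for _, _, medication in records})
    rows: Dict[Tuple[int, int], Dict[str, bool]] = {}
    for patient, visit, medication in records:
        row = rows.setdefault((patient, visit), dict.fromkeys(columns, False))
        row[medication] = True
    return columns, rows


# Hypothetical Pharmacology rows.
RECORDS = [
    (1, 0, "Ativan"),
    (1, 0, "Prozac"),
    (2, 0, "Ativan"),
    (1, 1, "Klonopin"),
]

COLUMNS, ROWS = pivot_medications(RECORDS)
```

The same function could be applied to the Application column to obtain the binary medical-condition features.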

4.3 Feature Selection

Feature selection experiments were performed along with classification (in WEKA) to choose the most relevant subset of features. The most important features for the diagnosis classification purposes proved to be audiological measurements.

5 Diagnostic Rule Extraction

5.1 Methodology

The associations of interest are those between a patient’s category of tinnitus and the factors affecting it, such as audiological measurements, demographic data, form answers and pharmacology. The simplified process of diagnosis is presented in Fig. 1. The treatment approach varies according to category; thus, accurate placement of patients into these categories is critical to providing proper treatment.

Fig. 1 Factors and data flow in the process of determining patient’s category and problem

Experiments on diagnostic rule discovery (association rules) were carried out with the LISp-Miner system, which offers exploratory data analysis implemented by its own procedures, called GUHA procedures, which use a highly optimized algorithm for rule generation [9]. The GUHA method, an original Czech data mining method with a strong theoretical background, uses a definition of the set of association rules (or G-rules in Ac4ft-Miner) to generate and verify particular rules on the data provided to the system. The algorithm does not use an Apriori-like approach but a bit-string approach to mine rules. The premise and conclusion of a GUHA rule (relevant pattern) are defined in terms of boolean attributes, which are, in turn, defined as conjunctions or disjunctions of boolean attributes or literals.
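The general idea of a bit-string approach can be illustrated with a small sketch. This is not LISp-Miner’s implementation; it only shows how boolean attributes stored as bit strings (one bit per object) allow a conjunction to be evaluated with a bitwise AND and a frequency to be read off by counting set bits. The attribute names and data are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def to_bits(flags: Iterable[int]) -> int:
    """Pack one boolean value per object into an integer bit string (bit i = object i)."""
    bits = 0
    for i, flag in enumerate(flags):
        if flag:
            bits |= 1 << i
    return bits


def popcount(bits: int) -> int:
    """Number of objects whose bit is set."""
    return bin(bits).count("1")


def evaluate_rule(premise: Sequence[int], conclusion: int, n_objects: int) -> Tuple[float, float]:
    """Return (confidence, support) of the rule: conjunction of premise literals => conclusion."""
    covered = (1 << n_objects) - 1  # start with every object covered
    for literal in premise:
        covered &= literal          # conjunction is a bitwise AND
    both = covered & conclusion
    n_covered = popcount(covered)
    confidence = popcount(both) / n_covered if n_covered else 0.0
    support = popcount(both) / n_objects if n_objects else 0.0
    return confidence, support


# Hypothetical boolean attributes over 8 objects.
HL_HIGH = to_bits([1, 1, 1, 0, 1, 0, 0, 1])
T_SV_HIGH = to_bits([1, 1, 0, 0, 1, 1, 0, 1])
CATEGORY_2 = to_bits([1, 1, 0, 0, 0, 0, 0, 1])

CONFIDENCE, SUPPORT = evaluate_rule([HL_HIGH, T_SV_HIGH], CATEGORY_2, 8)
```

Here four objects satisfy the premise and three of them also satisfy the conclusion, giving confidence 0.75 and support 3/8 of the whole dataset.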

5.2 Results

The results from the system are printed as general hypotheses (all factors were used by the algorithm). The following listings show examples of generated comprehensive rules along with confidence and support values for each rule (the attribute abbreviations are explained in Table 4 in the Appendix). Rules obtained from experiments targeting the best confidence can be interpreted as more accurate but less general. On the other hand, rules extracted with settings chosen to obtain the best support hold true more generally (in a greater population).

Besides, a series of experiments was run for each area of interest separately in relation to patient’s category (to obtain more detailed results), which are discussed in the subsequent subsections:

  • Interview \(\implies \) Category

  • Audiology \(\implies \) Category

  • Demographics \(\implies \) Category

  • Pharmacology \(\implies \) Category

  • Pharmacology \(\implies \) Tinnitus

Some interesting findings were obtained. We show only the most interesting examples of rules with the best support and confidence among the many rules generated. In the listings below, the confidence and support values of each rule are given as the subscript pair (confidence; support) of the implication arrow, where support is defined as the fraction of objects in the whole dataset satisfying that rule.

5.2.1 Comprehensive—Most General rules (names of attributes are explained in Table 4)

Examples of rules for each category, mined from all the relevant attributes, with the highest support:

Hypotheses 1

H EL \(\text {<}\) 1 \(\implies _{0.52;0.04}\) C(0)

L SD \(\ge \) 100 \(\wedge \) LL4 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.5;0.04}\) C(0)

L SD \(\ge \) 100 \(\wedge \) LL4 \(\ge \) 999 \(\wedge \) LL8 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.5;0.04}\) C(0)

Hypotheses 2

LL12 \(\ge \) 999 \(\wedge \) LR12 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.58;0.11}\) C(1)

LR12 \(\ge \) 999 \(\wedge \) T EL \(\ge \) 8 \(\implies _{0.57;0.09}\) C(1)

T An \(\ge \) 8 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.56;0.09}\) C(1)

Hypotheses 3

L4 \(\ge \) 65 \(\implies _{0.62;0.1}\) C(2)

HL pr \(\ge \) 5 \(\implies _{0.54;0.14}\) C(2)

Hypotheses 4

H An \(\ge \) 8 \(\wedge \) H Sv \(\ge \) 7.5 \(\implies _{0.55;0.1}\) C(3)

H Sv \(\ge \) 7.5 \(\implies _{0.5;0.11}\) C(3)

H EL \(\ge \) 8 \(\implies _{0.5;0.11}\) C(3)

Hypotheses 5

L SD \(\ge \) 100 \(\wedge \) L4 \(\text {<}\) 10 \(\wedge \) LL3 \(\text {<}\) 75 \(\implies _{0.67;0.02}\) C(4)

L3 \(\text {<}\) 5 \(\wedge \) LL3 \(\text {<}\) 75 \(\implies _{0.59;0.02}\) C(4)

L4 \(\text {<}\) 10 \(\wedge \) LL3 \(\text {<}\) 75 \(\implies _{0.53;0.02}\) C(4)

5.2.2 Comprehensive—Most Accurate

Examples of rules for each category, mined from all the relevant attributes, with the highest confidence:

Hypotheses 6

DST(N) \(\implies _{0.78;0.01}\) C(0)

Concert(0) \(\implies _{0.75;0.01}\) C(0)

Rest(0) \(\implies _{0.75;0.01}\) C(0)

Hypotheses 7

R3(\(\text {<}\)15;20)) \(\wedge \) T An \(\ge \) 8 \(\implies _{0.94;0.03}\) C(1)

LL2 \(\ge \) 999 \(\wedge \) LR12 \(\ge \) 999 \(\wedge \) R4(\(\text {<}\)15;20)) \(\wedge \) T EL \(\ge \) 8 \(\implies _{0.94;0.03}\) C(1)

LR12 \(\ge \) 999 \(\wedge \) R4(\(\text {<}\)15;20)) \(\wedge \) T EL \(\ge \) 8 \(\implies _{0.94;0.03}\) C(1)

R4(\(\text {<}\)15;20)) \(\wedge \) T Sv \(\ge \) 8 \(\implies _{0.94;0.03}\) C(1)

Hypotheses 8

LR8 \(\ge \) 999 \(\wedge \) R6 \(\ge \) 75 \(\wedge \) T Sv \(\ge \) 8 \(\implies _{0.96;0.04}\) C(2)

LL8 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R6 \(\ge \) 75 \(\wedge \) T Sv \(\ge \) 8 \(\implies _{0.96;0.04}\) C(2)

LR6 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R2 \(\ge \) 45 \(\wedge \) R3 \(\ge \) 60 \(\wedge \) R6 \(\ge \) 75 \(\implies _{0.95;0.03}\) C(2)

LR6 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R2 \(\ge \) 45 \(\wedge \) R4 \(\ge \) 65 \(\wedge \) R6 \(\ge \) 75 \(\implies _{0.95;0.03}\) C(2)

LR6 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R2 \(\ge \) 45 \(\wedge \) R4 \(\ge \) 65 \(\wedge \) R8 \(\ge \) 75 \(\implies _{0.95;0.03}\) C(2)

L2 \(\ge \) 50 \(\wedge \) L3 \(\ge \) 60 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R6 \(\ge \) 75 \(\implies _{0.95;0.03}\) C(2)

Hypotheses 9

LL3(\(\text {<}\)85;91)) \(\wedge \) H pr \(\ge \) 7 \(\wedge \) H Sv \(\ge \) 7.5\(\,\implies _{1;0.03}\) C(3)

LL3(\(\text {<}\)85;91)) \(\wedge \) H An \(\ge \) 8 \(\wedge \) H EL \(\ge \) 8 \(\wedge \) H Sv \(\ge \) 7.5 \(\implies _{1;0.03}\) C(3)

LL3(\(\text {<}\)85;91)) \(\wedge \) H EL \(\ge \) 8 \(\wedge \) H Sv \(\ge \) 7.5 \(\implies _{1;0.03}\) C(3)

LR1 \(\text {<}\) 74 \(\wedge \) LR2 \(\text {<}\) 74 \(\wedge \) LR6 \(\text {<}\) 78 \(\wedge \) H pr \(\ge \) 7 \(\wedge \) H An \(\ge \) 8 \(\implies _{0.94;0.03}\) C(3)

5.2.3 Interview \(\implies \) Category

Hypotheses 10

H EL \(\text {<}\) 1 \(\implies _{0.52;0.04}\) C(0)

H An \(\text {<}\) 1.5 \(\wedge \) H EL \(\text {<}\) 1 \(\wedge \) H Sv \(\text {<}\) 1.5 \(\implies _{0.51;0.03}\) C(0)

Hypotheses 11

HL pr \(\text {<}\) 0.5 \(\wedge \) T EL \(\ge \) 8 \(\implies _{0.55;0.06}\) C(1)

H pr \(\text {<}\) 0.5 \(\wedge \) HL pr \(\text {<}\)0.5 \(\implies _{0.58;0.04}\) C(1)

Hypotheses 12

HL pr \(\ge \) 5 \(\wedge \) T EL \(\ge \) 8 \(\implies _{0.57;0.07}\) C(2)

HL pr \(\ge \) 5 \(\wedge \) T Sv \(\ge \) 8 \(\implies _{0.57;0.06}\) C(2)

HL pr \(\ge \) 5 \(\wedge \) T An \(\ge \) 8 \(\implies _{0.55;0.07}\) C(2)

Hypotheses 13

H An \(\ge \) 8 \(\wedge \) H EL \(\ge \) 8 \(\wedge \) H Sv \(\ge \) 7.5 \(\implies _{0.58;0.09}\) C(3)

H pr \(\ge \) 7 \(\wedge \) H An \(\ge \) 8 \(\wedge \) H EL \(\ge \) 8 \(\wedge \) H Sv \(\ge \) 7.5 \(\implies _{0.58;0.08}\) C(3)

The obtained rules seem to confirm the expert’s (medical) knowledge:

  • patients categorized into group 0 have a problem with a low impact on life (H EL is low),

  • category-1 patients have significant tinnitus problem, but without hyperacusis (H pr is low) and there is no significant hearing loss (HL pr is low),

  • category 2 is characterized on the other hand with significant hearing loss (HL pr \(\ge \) 5),

  • category 3 is associated by the expert with significant hyperacusis problem—obtained hypotheses show association of high values of H An, H Sv and H EL with this category.

5.2.4 Audiology \(\implies \) Category

Hypotheses 14

L SD \(\ge \) 100 \(\wedge \) LL4 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.5;0.04}\) C(0)

L SD \(\ge \) 100 \(\wedge \) LL4 \(\ge \) 999 \(\wedge \) LL8 \(\ge \) 999 \(\wedge \) LR8 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.5;0.04}\) C(0)

Hypotheses 15

LL12 \(\ge \) 999 \(\wedge \) LR12 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.58;0.11}\) C(1)

LL12 \(\ge \) 999 \(\wedge \) R SD \(\ge \) 100 \(\implies _{0.55;0.12}\) C(1)

Hypotheses 16

LR8 \(\ge \) 999 \(\wedge \) R4 \(\ge \) 65 \(\implies _{0.78;0.08}\) C(2)

L2 \(\ge \) 50 \(\implies _{0.7;0.1}\) C(2)

Hypotheses 17

LR6 \(\text {<}\) 78 \(\implies _{0.63;0.07}\) C(3)

LR2 \(\text {<}\) 74 \(\implies _{0.62;0.07}\) C(3)

Hypotheses 18

L SD \(\ge \) 100 \(\wedge \) L4 \(\text {<}\) 10 \(\wedge \) LL3 \(\text {<}\) 75 \(\implies _{0.67;0.02}\) C(4)

L3 \(\text {<}\) 5 \(\wedge \) LL3 \(\text {<}\) 75 \(\implies _{0.59;0.02}\) C(4)

For the second tested area, the generated hypotheses indicate that a basic audiogram with LDLs is the crucial test for diagnosis. Based on the obtained rules, it can be concluded that the lower the tolerance, the more severe the category of tinnitus that should be assigned to a patient. According to our medical expertise, these results are interesting, and in theory a strong correlation of THI with LDL is also expected.

5.2.5 Demographics \(\implies \) Category

Hypotheses 19

Country(USA) \(\wedge \) MedNr(\(\text {<}\)3;4)) \(\wedge \) State(GA) \(\implies _{0.56;0.02}\) C(0)

Hypotheses 20

AgeBeg(\(\text {<}\)50;55)) \(\wedge \) Country(USA) \(\wedge \) G(m) \(\implies _{0.58;0.02}\) C(1)

AgeBeg(\(\text {<}\)50;55)) \(\wedge \) G(m) \(\implies _{0.56;0.02}\) C(1)

Country(USA) \(\wedge \) G(m) \(\wedge \) M6(yes)\(\implies _{0.5;0.02}\) C(1)

Hypotheses 21

AgeBeg \(\ge \) 68 \(\implies _{0.58;0.03}\) C(2)

G(m) \(\wedge \) MedNr \(\ge \) 5 \(\wedge \) T side(yes) \(\implies _{0.53;0.03}\) C(2)

Hypotheses 22

Work(h) \(\implies _{0.69;0.02}\) C(3)

Country(USA) \(\wedge \) G(f) \(\wedge \) M1(yes)\(\implies _{0.83;0.01}\) C(3)

AgeBeg \(\ge \) 40 \(\wedge \) Country(USA) \(\wedge \) AgeInd(\(\text {<}\)30;38)) \(\implies _{0.71;0.01}\) C(3)

Occup(homemaker) \(\implies _{0.71;0.01}\) C(3)

G(f) \(\wedge \) STI(yes)\(\implies _{0.5;0.01}\) C(3)

Hypotheses 23

Country(USA) \(\wedge \) G(m) \(\wedge \) MedNr(3) \(\wedge \) Y10(yes) \(\implies _{0.8;0.01}\) C(4)

Some relevant demographic patterns of patients in particular categories were also found. For example, as a rule, patients whose tinnitus had a low effect on their lives (that is, category 0) came from the state of Georgia in the USA (that is, near the clinic) and were affected by 3 other afflictions (they were taking three types of medications to treat them). According to our medical expertise, this probably just reflects the fact that long-distance patients with a low level of severity did not bother to come, as it would involve cost and effort; coming was much easier for people from Georgia. Another common pattern, for patients in category 1, was a male aged 50–55 from the USA whose tinnitus had started 6–12 months before he began TRT.

It could also be observed that category-2 patients are typically older (age when they began treatment typically higher than 68 years old, as a rule), they had been taking more medications (5 and more) and their tinnitus was associated with taking these medications (T side(yes)). According to our medical knowledge, we can confirm that older patients are taking more medications. Also, hearing loss, which has to be present for Category 2, is strongly correlated with the age.

Relevant patterns for Category 3 included:

  • the patients who worked at home (and also their tinnitus was induced by medications),

  • the patients occupied with homemaking,

  • the female patients with the tinnitus induced 1–3 months before they went to a doctor,

  • females whose tinnitus was associated with stressful situations,

  • patients relatively young (younger than 40 years old, whose problem started at 30–38 years old), living in the USA.

The pattern found for patients in the fourth category included males treating three other afflictions with the corresponding medications, whose tinnitus had started 10–20 years earlier.

It should be noted that the demographic-based rules must not be primarily used in diagnosis, and the medical knowledge confirms it. Patient’s category should not be based on their age, place of residence, occupation, etc., but rather on more objective medical factors, such as, audiological measures or interview. Nevertheless, they reveal some common demographic patterns in categories of patients treated in the past, which may bring additional knowledge, used as heuristics or hints in the rule-based decision support system.

5.2.6 Pharmacology \(\implies \) Category

Another series of experiments focused on discovering patterns relating patients’ additional afflictions, and the medications taken to treat them, to the category of tinnitus treatment.

Hypotheses 24

Ativan(yes) \(\wedge \) Anxiety disorder(yes) \(\implies _{0.58;0.01}\) C(1)

Klonopin(yes) \(\wedge \) Panic disorder(yes) \(\wedge \) Seizures(yes)\(\implies _{0.53;0.01}\) C(1)

Depression disorder(yes) \(\wedge \) Panic disorder(yes) \(\wedge \) Seizures(yes)\(\implies _{0.5;0.02}\) C(1)

Preliminary results have shown that patients with accompanying depression, anxiety or panic disorders were assigned to Category 1, while patients with hypertension, for example, belonged to Category 2. A relevant group of patients treated for anxiety, panic/seizure or depression disorders (with Ativan/Klonopin) was diagnosed with the first category of tinnitus. According to our medical expertise, these drugs are routinely prescribed by physicians treating tinnitus in order to decrease anxiety or depression.

Hypotheses 25

Angina(yes) \(\wedge \) Hypertension(yes) \(\implies _{0.69;0.02}\) C(2)

Patients with hypertension and angina can be hypothetically classified into the second category of tinnitus (with 69% confidence). According to our medical expertise, these conditions are typically associated with aging, which in turn is strongly associated with hearing loss.

5.2.7 Pharmacology \(\implies \) Tinnitus

The last group of experiments aimed to find out which drugs might cause tinnitus as a side-effect:

Hypotheses 26

Norvasc(yes) \(\wedge \) T side(yes) \(\implies _{0.67;0.01}\) C(2)

Prozac(yes) \(\wedge \) T side(yes) \(\implies _{0.6;0.01}\) C(1)

Synthroid(yes) \(\wedge \) T side(yes) \(\implies _{0.6;0.01}\) C(2)

Atenolol(yes) \(\wedge \) T side(yes) \(\implies _{0.56;0.01}\) C(2)

Celebrex(yes) \(\wedge \) T side(yes) \(\implies _{0.56;0.01}\) C(2)

Klonopin(yes) \(\wedge \) T side(yes) \(\implies _{0.56;0.01}\) C(1)

Norvasc is used for hypertension and angina, and Prozac for depression, bulimia nervosa and OCD. Synthroid is used in thyroid hormone therapy, and Atenolol reduces blood pressure (treats hypertension). Celebrex is an anti-inflammatory drug, and Klonopin is an anti-panic and anti-seizure drug.

The conclusion from the experiment is that the side-effects of these medications should be investigated further. Patients taking them and seeking help for their tinnitus might recover simply after they stop taking them or switch to other pharmaceuticals with no such side-effects. This might also save time on complex tinnitus therapy by avoiding unnecessary actions. As for depression, however, it is not clear whether this disorder is a cause or an effect of tinnitus. According to our medical expertise, it can be both.

6 Treatment Rule Extraction

6.1 Methodology

As stated earlier, action rules should help in choosing the right course of treatment within Tinnitus Retraining Therapy. The treatment process and the data flow within it are shown in Fig. 2.

Fig. 2 The process and data flow of treatment actions and TRT tracking

Appropriate tasks were defined in LISp-Miner [9] based on an analysis of the process. Treatment actions include: the treatment protocol (relevant for each category), applying a sound generator, and setting the generator (REM). The attribute chosen to track treatment and improvement is the Total score, which indicates the severity of tinnitus according to the following scale: 0–16 slight, 18–36 mild, 38–56 moderate, 58–76 severe, and 78–100 catastrophic handicap. The aim of extracting the action rules is to find treatment actions that lead to a change in the severity of a patient’s tinnitus from higher to lower. The Total Score attribute (Tsc) was missing for about half of the visits registered in the tinnitus database, and the same held for the Tinnitus Awareness attribute (Taw). Even when either Tsc or Taw was accepted, about 40% of the values were still missing. To handle this problem and retain all the tuples for visits that may contain useful information about treatment actions, an algorithm for the imputation of missing values was developed and applied.
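The severity scale quoted above can be written down as a small lookup. The following is only an illustrative sketch: the names `SEVERITY_BANDS`, `severity` and `is_improvement` are hypothetical, and the sketch assumes that scores falling between the listed bands (for example 17) are simply invalid, which the text does not state.

```python
from typing import Optional

# Bands as listed in the text: inclusive lower bound, inclusive upper bound, label.
SEVERITY_BANDS = [
    (0, 16, "slight"),
    (18, 36, "mild"),
    (38, 56, "moderate"),
    (58, 76, "severe"),
    (78, 100, "catastrophic"),
]


def severity(total_score: Optional[int]) -> Optional[str]:
    """Return the severity label for a Total score, or None if the score is missing."""
    if total_score is None:
        return None
    for low, high, label in SEVERITY_BANDS:
        if low <= total_score <= high:
            return label
    # Assumption: values outside every listed band are treated as errors.
    raise ValueError(f"Total score {total_score} is outside the listed bands")


def is_improvement(before: int, after: int) -> bool:
    """True if the severity band moved from a higher to a lower one."""
    order = [label for _, _, label in SEVERITY_BANDS]
    return order.index(severity(after)) < order.index(severity(before))
```

For instance, a patient moving from a score of 60 to a score of 30 would pass from "severe" to "mild", which is the kind of change the action rules are meant to capture.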

6.2 Decision Attribute Development

In order to find action rules that indicate improvement, the decision attribute was further preprocessed and new derived features were developed:

  • ChTsc—Change in Total Score

  • ChTaw—Change in Tinnitus Awareness

  • PerChTsc—Percentage Change in Tinnitus Score

  • PerChTaw—Percentage Change in Tinnitus Awareness

On top of these, one decision attribute was developed: a new change attribute for indicator X at visit v of a given patient, defined in the following way:

Definition 5

\(CH_{X, v} = \)

  • NULL, for \(X_n = NULL\) or \(X_{n+1}=NULL\)

  • 0, for \(X_n = 0\) and \(X_{n+1} = 0\)

  • \(\frac{-100}{dist_{n+1,n}}\), for \(X_n = 0\) and \(X_{n+1}> 0\)

  • \((100 \% * \frac{X_{n+1} - X_n}{X_n}) / (dist_{n+1,n})\), for \(X_n \ne 0\)

where:

  • \(X= Tsc\) or \(X=Taw\)

  • \(X_n\) is a measurement of X at v, or the closest previous measurement of X from v: \(DATE(v) \ge DATE(X_{n})\)

  • \(X_{n+1}\) is the closest next measurement of X since the visit v: \(DATE(v)< DATE(X_{n+1})\)

  • dist is a distance defined as below:

Definition 6

\(dist_{n+1,n} = \)

  • NULL, for \(CH_{X, v} = NULL \)

  • \(DATEDIFF(weeks, DATE(X_{n+1}), DATE(X_{n}))\), for \(DATE(X_{n+1}) > DATE(X_{n})\)

The algorithm calculates the change and distance values for each visit based on the definitions presented above.
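The case analysis of Definition 5 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: the function name `change` is hypothetical, NULL is represented by `None`, and `dist` is simply taken as a given non-zero number of weeks, so the sketch makes no claim about how its sign is derived from the dates.

```python
from typing import Optional


def change(x_n: Optional[float], x_next: Optional[float],
           dist: Optional[float]) -> Optional[float]:
    """Per-week percentage change of an indicator X (Tsc or Taw).

    Follows the four cases of Definition 5 literally. `dist` is assumed to be
    supplied by the caller as a non-zero number of weeks.
    """
    if x_n is None or x_next is None:
        return None                      # one of the measurements is missing
    if x_n == 0 and x_next == 0:
        return 0.0                       # zero baseline, zero follow-up
    if dist is None:
        return None                      # assumption: no distance, no change value
    if x_n == 0:
        return -100.0 / dist             # zero baseline, non-zero follow-up
    return (100.0 * (x_next - x_n) / x_n) / dist
```

As a usage example, a score that drops from 40 to 20 over 5 weeks yields `change(40, 20, 5) == -10.0`, that is, a 50% decrease spread over five weeks.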

One final change attribute is a combined change attribute defined as follows:

Definition 7

\(CH = ChTsc\) and \(distCh = distTsc\)

in the following cases (in order of priority):

  • \(Ch_{Tsc}\) is not NULL and \(Ch_{Taw}\) is NULL: the most obvious case, in which the change of the only available indicator is chosen,

  • ScT is not NULL and AwT is NULL: both change features are available for the tuple, but the change for ScT is accurate, while ChTaw is approximated from the “neighboring” previous and next measurements,

  • \(Ch_{Tsc}\) is not NULL and \(Ch_{Taw}\) is not NULL and \(distTsc < distTaw\): change values exist for both indicators, as do the current values of the indicators themselves (ScT and AwT), so the change value associated with the lower distance is chosen (it is assumed that treatment effectiveness measured over a shorter time distance is more accurate).

Analogously:

\(CH = ChTaw\) and \(distCh = distTaw\)

when:

  • \(Ch_{Tsc}\) is NULL and \(Ch_{Taw}\) is not NULL,

  • ScT is NULL and AwT is not NULL

  • \(Ch_{Tsc}\) is not NULL and \(Ch_{Taw}\) is not NULL and \(distTsc > distTaw\).

The last case, not resolved by the two above, is when \(Ch_{Tsc}\) is not NULL and \(Ch_{Taw}\) is not NULL and \(distTsc = distTaw\). Then a “combined” change is calculated as an average of both:

\(CH = \frac{Ch_{Tsc} + Ch_{Taw}}{2}\) and \(distCh = distTaw = distTsc\).
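The priority order of Definition 7 can be read as a small selection procedure. The sketch below is one possible reading, not the authors' code: the function name `combined_change` and its parameter names are hypothetical, NULL is represented by `None`, and the case in which both change values are missing (not covered by the definition) is assumed to yield NULL.

```python
from typing import Optional, Tuple

Pair = Tuple[Optional[float], Optional[float]]  # (CH, distCh)


def combined_change(ch_tsc: Optional[float], dist_tsc: Optional[float],
                    ch_taw: Optional[float], dist_taw: Optional[float],
                    sc_t: Optional[float], aw_t: Optional[float]) -> Pair:
    """Choose the combined change attribute CH and its distance distCh."""
    # Assumption: nothing to combine when both change values are missing.
    if ch_tsc is None and ch_taw is None:
        return None, None
    # Only one change value is available: take it.
    if ch_taw is None:
        return ch_tsc, dist_tsc
    if ch_tsc is None:
        return ch_taw, dist_taw
    # Both change values exist; prefer the indicator measured at this visit.
    if sc_t is not None and aw_t is None:
        return ch_tsc, dist_tsc
    if sc_t is None and aw_t is not None:
        return ch_taw, dist_taw
    # Otherwise prefer the change measured over the shorter distance.
    if dist_tsc < dist_taw:
        return ch_tsc, dist_tsc
    if dist_tsc > dist_taw:
        return ch_taw, dist_taw
    # Equal distances: average both changes.
    return (ch_tsc + ch_taw) / 2, dist_tsc
```

For example, with equal distances of 4 weeks and changes of -10 and -20, the combined value is -15 with a distance of 4.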

A new attribute Ch (with the corresponding distCh attribute) was introduced into the LISp-Miner environment under the Temporal group of attributes (see Table 2).

Figure 3 shows the categories defined for the Ch attribute as intervals, along with their balanced frequency (absolute, relative and cumulated).

Table 2 Ch and treat len attributes definition in LISp-Miner

Fig. 3 Frequency distribution of categories for Ch attribute in tinnitus dataset

Table 3 Category names and corresponding intervals for Ch attribute

There are 5 categories for a change value: “worse” for positive values of Ch, “about the same” for no change, and three categories for different magnitudes of negative values: “slightly better”, “better” and “much better”. Corresponding intervals for each category are shown in Table 3.

6.3 Distance Features

An additional column indicating the length of treatment for a given measure (distCh) was defined as an interval attribute, treat len. In order to relate a patient’s visits temporally, the following columns were additionally developed:

  • distPrev: the time difference (in weeks) between the current and the previous visit of a patient (for the initial visit the distance is 0),

  • dist0: for each visit, the time elapsed (in weeks) since the initial visit; the dist0 of the last visit gives the total time of a patient’s treatment.
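The two distance columns can be derived from a patient's visit dates as sketched below. The function name `visit_distances` is hypothetical, and measuring a week as seven calendar days (rather than, for example, counting week boundaries as a database DATEDIFF might) is an assumption of this sketch.

```python
from datetime import date
from typing import List, Tuple


def visit_distances(visit_dates: List[date]) -> List[Tuple[float, float]]:
    """Return (distPrev, dist0) in weeks for each visit of one patient.

    Visits are sorted chronologically first; the initial visit gets (0, 0).
    """
    ordered = sorted(visit_dates)
    result = []
    for i, current in enumerate(ordered):
        # Assumption: one week = 7 calendar days.
        dist_prev = 0.0 if i == 0 else (current - ordered[i - 1]).days / 7
        dist0 = (current - ordered[0]).days / 7
        result.append((dist_prev, dist0))
    return result
```

The dist0 value of the last tuple then corresponds to the total treatment time of the patient.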

After the additional attributes were defined in LISp-Miner, they were used for defining relevant patterns. It is assumed that actions that generally lead to a “better” condition are interesting (for now, regardless of whether an improvement is slight, moderate or significant). The procedure is constrained to generate only action rules that are interesting for treatment purposes, that is, rules describing effective treatment actions.

With a new, accurate change attribute Ch for the succedent part, developed as described above, the final choice of the most reliable rules can be made.

Besides the Ch attribute, temporal dependencies between actions and their effects were also considered in the experimental setup, suggesting a change in the length of treatment with a particular method (treat attribute).

6.4 Instrument Fitting

The following action rules relate to instrument fitting (names of all attributes are explained in Table 4):

Hypotheses 27

Instr(GHI): Freq LE(\(\text {<}\)3000;3150)) \(\rightarrow \) Freq LE \(\ge \) 3775 \(\implies _{0.32;37;8}\) Ch(better/much better/slightly better)

Instr(SG): Mix R SL(\(\text {<}\)9;10)) \(\rightarrow \) Mix R SL(\(\text {<}\)11;12)) \(\implies _{0.27;8;11}\) Ch(better/much better/slightly better)

Instr(SG): Mix R SL(\(\text {<}\)9;10)) \(\rightarrow \) Mix R SL(\(\text {<}\)15;17)) \(\implies _{0.27;8;8}\) Ch(better/much better/slightly better)

Instr(GHI): Mix L SL(\(\text {<}\)7;8)) \(\rightarrow \)Mix L SL\(\text {<}\)2 \(\implies _{0.27;8;8}\) Ch(better/much better/slightly better)

Instr(GHI): Mix L SL(\(\text {<}\)7;8)) \(\rightarrow \) Mix L SL(\(\text {<}\)11;12)) \(\implies _{0.27;8;8}\) Ch(better/much better/slightly better)

Instr(SG): Freq LE(\(\text {<}\)2670;2800)) \(\wedge \) Freq RE(\(\text {<}\)2670;2800)) \(\rightarrow \) Freq LE(\(\text {<}\)2500;2670)) \(\wedge \) Freq RE(\(\text {<}\)2500;2670))\(\implies _{0.23;6;7}\) Ch(better/much better/slightly better)

Instr(GHI): Th L SPL(\(\text {<}\)36;37)) \(\rightarrow \) Th L SPL(\(\text {<}\)37;38)) \(\implies _{0.17;8;9}\) Ch(better/much better/slightly better)

Instr(GHI): Mix R SL(\(\text {<}\)6;7)) \(\rightarrow \) Mix R SL(\(\text {<}\)9;10)) \(\implies _{0.17;9;8}\) Ch(better/much better/slightly better)

Instr(SG): Freq RE(\(\text {<}\)3000;3150)) \(\rightarrow \) Freq RE(\(\text {<}\)2500;2670)) \(\implies _{0.11;9;12}\) Ch(better/much better)

Instr(SG): Freq RE(\(\text {<}\)3000;3150)) \(\rightarrow \) Freq RE(\(\text {<}\)2500;2670)) \(\implies _{0.03;5;6}\) Ch(slightly better) \(\rightarrow \) Ch(better/much better)

Instr(SG): Freq LE(\(\text {<}\)2670;2800)) \(\rightarrow \) Freq LE(\(\text {<}\)2500;2670)) \(\implies _{0.1;12;9}\) Ch(better/much better/slightly better)

Instr(SG): Freq LE(\(\text {<}\)2670;2800)) \(\rightarrow \) Freq LE(\(\text {<}\)3000;3150)) \(\implies _{0.1;8;11}\) Ch(better/much better)

Instr(SG) \(\wedge \) Model(TR COE): Freq RE(\(\text {<}\)2500;2670)) \(\rightarrow \) Freq RE(\(\text {<}\)2670;2800))\(\implies _{0.09;10;10}\) Ch(better/much better/slightly better)

Instr(SG) \(\wedge \) Model(TR COE): Freq RE(\(\text {<}\)2500;2670)) \(\rightarrow \) Freq RE(\(\text {<}\)3000;3150)) \(\implies _{0.08;10;12}\) Ch(better/much better/slightly better)

Instr(SG): Freq RE(\(\text {<}\)2670;2800)) \(\rightarrow \) Freq RE(\(\text {<}\)2500;2670)) \(\implies _{0.08;11;12}\) Ch(better/much better/slightly better)

Instr(GHS): Freq RE(\(\text {<}\)2800;3000)) \(\rightarrow \) Freq RE(\(\text {<}\)2670;2800)) \(\implies _{0.07;11;12}\) Ch(better/much better/slightly better)

Instr(SG): Th R SPL(\(\text {<}\)33;34)) \(\rightarrow \) Th R SPL(\(\text {<}\)36;37)) \(\implies _{0.02;8;9}\) Ch(better/much better/slightly better)

Type(GHH): Freq RE(\(\text {<}\)2670;2800)) \(\rightarrow \) Freq RE(\(\text {<}\)3000;3150)) \(\implies _{0.02;8;13}\) Ch(better/much better/slightly better)

FU(A) \(\wedge \) Instr(GHI) \(\wedge \) Freq RE(\(\text {<}\)3000;3150)): treat(\(\text {<}\)6;8)) \(\rightarrow \) treat(\(\text {<}\)5;6))\(\implies _{0.1;9;8}\) Ch(better/much better/slightly better)

The above action rules, which relate to fitting the instruments with REM, cover the following types of instruments: “SG” (sound generators in general), “GHI” (a general type of sound generator that includes both the GHI hard and GHI soft models), the particular types “GHS” (GHI soft) and “GHH” (GHI hard), and specific models, such as “TRI-COE”. The following instrument settings were considered in the variable antecedent parts of the rules: Freq RE, Freq LE, Mix R SL, Mix L SL, Th R SPL, Th L SPL. These constitute quite a significant subset of the settings used for fitting the instruments.

For example, the last action rule from Hypotheses 27 states that the probability of a successful treatment increases by 10 percentage points when the “Audiological/counseling” treatment, combined with setting the “GHI” instrumentation to “Freq RE” in \(\text {<}\)3000;3150), is shortened from 6–8 weeks to 5–6 weeks.
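To make the structure of such an action rule concrete, the last rule of Hypotheses 27 can be encoded as a stable part, a "before" state and an "after" state. The sketch below is illustrative only: the class `ActionRule`, the helpers `matches` and `suggest`, and the dictionary encoding of a patient are all hypothetical, and only the first subscript of the rule (the gain of 0.1) is modeled. Intervals written as <a;b) are read as half-open intervals from a (inclusive) to b (exclusive).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

# A condition is either a category label or a half-open interval [low, high).
Condition = Union[str, Tuple[float, float]]


def matches(condition: Condition, actual) -> bool:
    """Check one attribute value against a category or an interval condition."""
    if isinstance(condition, tuple):
        low, high = condition
        return low <= actual < high
    return condition == actual


@dataclass
class ActionRule:
    stable: Dict[str, Condition]   # attributes that do not change
    before: Dict[str, Condition]   # flexible attributes, current state
    after: Dict[str, Condition]    # flexible attributes, suggested state
    gain: float                    # first subscript of the rule


def suggest(rule: ActionRule, patient: Dict[str, object]) -> Optional[Dict[str, Condition]]:
    """Return the suggested new state if the patient matches the rule, else None."""
    required = {**rule.stable, **rule.before}
    if all(name in patient and matches(cond, patient[name])
           for name, cond in required.items()):
        return rule.after
    return None


# Last rule of Hypotheses 27, encoded by hand for illustration.
RULE_27_LAST = ActionRule(
    stable={"FU": "A", "Instr": "GHI", "Freq RE": (3000, 3150)},
    before={"treat": (6, 8)},
    after={"treat": (5, 6)},
    gain=0.10,
)
```

A patient record such as `{"FU": "A", "Instr": "GHI", "Freq RE": 3100, "treat": 7}` would match this rule, and `suggest` would return the shorter treatment interval.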

6.5 Treatment Protocol

The second rule from the listing below states that changing the treatment of a patient in Category-1 from the treatment protocol “0” lasting 12–16 weeks to the treatment protocol “1” for more than 32 weeks should increase the chance of improvement by 61 percentage points.

Hypotheses 28

Cat(0): CC(0) \(\rightarrow \) CC(1) \(\implies _{0.33;42;9}\) Ch(better/slightly better)

Cat(0): CC(0) \(\wedge \) treat(\(\text {<}\)12;16)) \(\rightarrow \) CC(1) \(\wedge \) treat \(\ge \) 32 \(\implies _{0.61;42;9}\) Ch(better/slightly better)

Cat(3): Instr(GHH) \(\wedge \) FU(T) \(\wedge \) CC(3) \(\rightarrow \) Instr(TCI-C) \(\wedge \) FU(A) \(\wedge \) CC(2) \(\implies _{0.33;8;9}\) Ch(better/much better/slightly better)

Cat(3): Instr(Viennatone) \(\wedge \) CC(3) \(\rightarrow \) Instr(TCI-C) \(\wedge \) CC(2) \(\implies _{0.25;8;8}\) Ch(slightly better/better/much better)

Cat(3): CC(0) \(\rightarrow \) CC(2) \(\implies _{0.18;8;22}\) Ch(slightly better/better/much better)

Cat(1): CC(0) \(\rightarrow \) CC(1) \(\implies _{0.14;9;491}\) Ch(slightly better/better/much better)

Cat(3): CC(0) \(\rightarrow \) CC(3) \(\implies _{0.08;8;239}\) Ch(slightly better/better/much better)

6.6 Treatment Personalized for Demographics

The following hypotheses were generated for the tasks that were defined in order to maximize treatment personalization.

Hypotheses 29

AgeBeg(\(\text {<}\)50;55)) \(\wedge \) G(m) \(\wedge \) Cat(1) \(\wedge \) T side(yes): MedNr \(\ge \) 5 \(\rightarrow \) MedNr(\(\text {<}\)2;3)) \(\implies _{0.55;14;8}\) Ch(slightly better/better)

G(m) \(\wedge \) Cat(1) \(\wedge \) OMTI(yes) \(\wedge \) T side(yes): MedNr(\(\text {<}\)3;4)) \(\rightarrow \) MedNr(\(\text {<}\)4;5)) \(\implies _{0.41;9;18}\) Ch(slightly better/better/much better)

AgeBeg(\(\text {<}\)50;55)) \(\wedge \) G(m) \(\wedge \) AgeInd(\(\text {<}\)50;56)) \(\wedge \) T side(yes): CC(2) \(\rightarrow \) CC(1) \(\implies _{0.55;14;11}\) Ch(slightly better/better/much better)

G(m) \(\wedge \) Cat(1) \(\wedge \) OMTI(yes) \(\wedge \) T side(yes): F(T) \(\rightarrow \) F(A) \(\implies _{0.23;9;16}\) Ch(slightly better/much better)

AgeBeg(\(\text {<}\)55;60)) \(\wedge \) G(m) \(\wedge \) Cat(1) \(\wedge \) T side(yes): Instr(GHS) \(\rightarrow \) Instr(GHH) \(\implies _{0.19;8;8}\) Ch(better/much better)

6.7 Treatment Personalized for Tinnitus Background

Hypotheses 30

OMTI(yes) \(\wedge \) T side(yes): Instr(Viennatone) \(\wedge \) FU(T) \(\rightarrow \) Instr(GHH) \(\wedge \) FU(A) \(\implies _{0.56;8;8}\) Ch(slightly better/better)

NTI(yes) \(\wedge \) G(m): Instr(GHS) \(\rightarrow \) Instr(GHH) \(\implies _{0.33;28;8}\) Ch(slightly better/better/much better)

G(m) \(\wedge \) OMTI(yes) \(\wedge \) M6(yes) \(\wedge \) Cat(1): FU(T) \(\rightarrow \) FU(A) \(\implies _{0.3;5;10}\) Ch(slightly better/better/much better)

OMTI(yes) \(\wedge \) T side(yes): Work(h) \(\rightarrow \) Work(w)\(\implies _{0.3;13;11}\) Ch(slightly better/better)

OMTI(yes) \(\wedge \) G(f): Instr(GHS) \(\rightarrow \) Instr(GHH) \(\implies _{0.28;10;8}\) Ch(slightly better/better)

OMTI(yes) \(\wedge \) T side(yes) \(\wedge \) Cat(1): Instr(GHS) \(\rightarrow \) Instr(GHH) \(\implies _{0.3;13;11}\) Ch(slightly better/better)

OMTI(yes) \(\wedge \) G(f): Instr(Viennatone) \(\rightarrow \) Instr(GHH) \(\implies _{0.25;8;8}\) Ch(slightly better/better/much better)

OMTI(yes) \(\wedge \) T side(yes) \(\wedge \) Cat(1): Instr(Viennatone) \(\wedge \) FU(T) \(\rightarrow \) Instr(GHS) \(\wedge \) FU(A) \(\implies _{0.24;8;8}\) Ch(slightly better/better)

G(m) \(\wedge \) NTI(yes) \(\wedge \) M3(yes) \(\wedge \) Cat(3): FU(A) \(\rightarrow \) FU(T) \(\implies _{0.18;6;8}\) Ch(slightly better/better/much better)

OMTI(yes) \(\wedge \) Instr(GHS): treat \(\ge \) 32 \(\rightarrow \) treat(\(\text {<}\)5;6))\(\implies _{0.06;9;6}\) Ch(slightly better/better/much better)

OMTI(yes) \(\wedge \) FU(T): treat(\(\text {<}\)21;32)) \(\rightarrow \) treat(\(\text {<}\)8;10)) \(\implies _{0.01;11;14}\) Ch(slightly better/better/much better)

The last two rules hypothesize that, in the case of medically induced tinnitus (OMTI(yes)), it should be advantageous to shorten the treatment with “GHS” instrumentation from “above 32 weeks” to 5–6 weeks, and to shorten the telephone-based treatment from 21–32 weeks to 8–10 weeks.

6.8 Treatment Personalized for Medical Condition

The relevant action rules that take a patient’s other diseases into account cover patients with ulcers, hypertension, seizures, and depression/anxiety disorders. The treatment actions include reducing the number of medications (which are also associated with tinnitus as a side-effect) and changing the instrumentation (for example, from “GHS” to “HA”, or from “Viennatone” to “GHS”), but also changing the place of residence (for example, from state “NY” to “WI”, or from “GA” to “IL”).

Hypotheses 31

G(m) \(\wedge \) T side(yes) \(\wedge \) Ulcers(yes): Med(\(\ge \)5) \(\wedge \) State(GA) \(\rightarrow \) Med(\(\text {<}\)2;3)) \(\wedge \) State(IL) \(\implies _{0.73;10;8}\) Ch(slightly better/better)

G(m) \(\wedge \) T side(yes) \(\wedge \) Ulcers(yes) \(\wedge \) Erosive arthritis(yes) \(\wedge \) GERD(yes): Med(\(\ge \)5) \(\wedge \) State(GA) \(\rightarrow \) Med(\(\text {<}\)2;3)) \(\wedge \) State(IL) \(\implies _{0.71;10;8}\) Ch(slightly better/better)

Cat(1) \(\wedge \) T side(yes) \(\wedge \) Hypertension(yes): Med(\(\ge \)5) \(\wedge \) FU(T) \(\rightarrow \) Med(\(\text {<}\)4;5)) \(\wedge \) FU(A) \(\implies _{0.56;12;11}\) Ch(slightly better/better/much better)

G(m) \(\wedge \) T side(yes) \(\wedge \) Seizures(yes): Instr(GHS) \(\wedge \) State(NY) \(\rightarrow \) Instr(HA) \(\wedge \) State(WI) \(\implies _{0.53;9;8}\) Ch(slightly better/much better)

G(m) \(\wedge \) T side(yes) \(\wedge \) Depression(yes) \(\wedge \) Panic disorder (yes) \(\wedge \) Seizures(yes): Instr(GHS) \(\wedge \) State(NY) \(\rightarrow \) Instr(HA) \(\wedge \) State(WI) \(\implies _{0.53;9;8}\) Ch(slightly better/much better)

Cat(1) \(\wedge \) T side(yes) \(\wedge \) Depression(yes) \(\wedge \) Anxiety disorder (yes): Instr(GHH) \(\wedge \) Med(\(\ge \)5) \(\rightarrow \) Instr(GHS) \(\wedge \) Med(\(\text {<}\)4;5)) \(\implies _{0.48;9;12}\) Ch(slightly better/better/much better)

G(m) \(\wedge \) T side(yes) \(\wedge \) Depression(yes): Med(\(\ge \)5) \(\wedge \) State(GA) \(\rightarrow \) Med(\(\text {<}\)2;3)) \(\wedge \) State(WI) \(\implies _{0.47;14;10}\) Ch(slightly better/much better)

G(m) \(\wedge \) T side(yes) \(\wedge \) Seizures(yes): Med(\(\ge \)5) \(\wedge \) FU(A) \(\rightarrow \) Med(\(\text {<}\)2;3)) \(\wedge \) FU(T) \(\implies _{0.47;9;8}\) Ch(slightly better/better/much better)

OMTI(yes) \(\wedge \) T side(yes) \(\wedge \) Depression(yes): Instr(Viennatone) \(\wedge \) Med(\(\ge \)5) \(\rightarrow \) Instr(GHS) \(\wedge \) Med(\(\text {<}\)4;5))\(\implies _{0.42;21;12}\) Ch(slightly better/much better)

6.9 Meta Actions

The following hypotheses show examples of meta actions generated for the patient with ID 01054 (that is, a set of effective actions for this particular patient).

Hypotheses 32

THC(01054) \(\wedge \) Cat(1): Instr(VSS) \(\wedge \) F(T) \(\rightarrow \) Instr(V - AMTI) \(\wedge \) F(A) \(\implies _{0.67;1;4}\) Ch(slightly better/better/much better)

THC(01054) \(\wedge \) Cat(1): Freq LE(\(\text {<}\)2500;2670)) \(\wedge \) Freq RE(\(\text {<}\)2500;2670)) \(\wedge \) Mix R SPL(\(\text {<}\)51;52)) \(\rightarrow \) Freq LE(\(\text {<}\)2120;2380)) \(\wedge \) Freq RE(\(\text {<}\)2380;2500)) \(\wedge \) Mix R SPL(\(\text {<}\)53;55)) \(\implies _{0.5;1;1}\) Ch(better/much better)

THC(01054) \(\wedge \) Cat(1): Freq LE(\(\text {<}\)3000;3150)) \(\wedge \) Freq RE(\(\text {<}\)3000;3150)) \(\wedge \) Mix R SL(\(\text {<}\)9;10)) \(\rightarrow \) Freq LE(\(\text {<}\)2500;2670)) \(\wedge \) Freq RE(\(\text {<}\)2500;2670)) \(\wedge \) Mix R SL(\(\text {<}\)14;15)) \(\implies _{0.5;1;1}\) Ch(slightly better/much better)

THC(01054) \(\wedge \) Cat(1): Freq LE(\(\text {<}\)3000;3150)) \(\wedge \) Freq RE(\(\text {<}\)3000;3150)) \(\wedge \) Mix R SL(\(\text {<}\)9;10)) \(\rightarrow \) Freq LE(\(\text {<}\)2120;2380)) \(\wedge \) Freq RE(\(\text {<}\)2380;2500)) \(\wedge \) Mix R SL(\(\text {<}\)11;12)) \(\implies _{0.5;1;1}\) Ch(slightly better/better/much better)

THC(01054) \(\wedge \) Cat(1): Mix R SL(\(\text {<}\)13;14)) \(\wedge \) Mix R SPL(\(\text {<}\)51;52)) \(\wedge \) Th R SPL(\(\text {<}\)38;39)) \(\rightarrow \) Mix R SL(\(\text {<}\)11;12)) \(\wedge \) Mix R SPL(\(\text {<}\)53;54)) \(\wedge \) Th R SPL(\(\text {<}\)42;43)) \(\implies _{0;1;1}\) Ch(better/much better)

The above sets of actions for the patient “01054” (meta-actions) are examples of effective treatment undertaken for this case (profile) of a patient. It can also be observed that the last set of actions (the last hypothesis) brought no results.

7 Discussion

7.1 Summary of Experiments on Rule Extraction

Experiments described in the two previous sections were conducted in order to extract knowledge on tinnitus diagnosis and treatment in the form of rules: decision rules and action rules, whose theoretical background was presented in Sect. 3. While the former should help in understanding the relations between different diagnosis factors, the latter suggest a course of treatment (action) leading to improvement in a patient’s condition. The experiments on finding association rules can also help in analyzing the collected data in terms of patients’ characteristics and in discovering patterns that are not obvious from a medical point of view.

However, the main advantage of the proposed approach based on rule extraction is the possibility of automatically retrieving knowledge in the form of rules without engaging the time of a medical expert. This seems promising, as experts are usually not widely available. Additionally, knowledge engineering based on interviewing experts is quite time-consuming. It is also often cumbersome for experts to formulate their knowledge in the form of specific rules, as they often make decisions intuitively, based on experience. This knowledge, on the other hand, is hidden in large databases, from which it can be extracted in the form of rules that imitate human behavior. This methodology is particularly interesting and useful for building a rule-based decision support system.

The discovered rules can be either exploited in a qualitative way by an expert or used to perform classification (scoring) of incoming objects. Ultimately, automatically extracted rules should be built into the rule engine of a decision support system. An appropriate mechanism of automatic rule execution (or, alternatively, an inference engine) should be implemented. The relevant rules could then be invoked by matching new data with the rules’ premises (antecedents), and their conclusions (succedents) could be presented to the system user.
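A minimal sketch of such a matching mechanism is given below, using two of the decision rules from Hypotheses 24 and 26 as hand-encoded examples. The names `DecisionRule`, `fire` and `explain`, and the representation of a patient as a dictionary, are hypothetical; the first subscript of each rule is treated as its confidence, as in the discussion of Hypotheses 25, and the second subscript is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DecisionRule:
    antecedent: Dict[str, str]  # attribute -> required value
    succedent: str              # conclusion, e.g. a tinnitus category
    confidence: float           # first subscript of the rule


# Two rules from the text, encoded by hand for illustration.
RULES = [
    DecisionRule({"Ativan": "yes", "Anxiety disorder": "yes"}, "C(1)", 0.58),
    DecisionRule({"Norvasc": "yes", "T side": "yes"}, "C(2)", 0.67),
]


def fire(rules: List[DecisionRule], patient: Dict[str, str]) -> List[DecisionRule]:
    """Return the rules whose antecedent matches the patient, best confidence first."""
    fired = [
        rule for rule in rules
        if all(patient.get(attr) == value for attr, value in rule.antecedent.items())
    ]
    return sorted(fired, key=lambda rule: rule.confidence, reverse=True)


def explain(rule: DecisionRule) -> str:
    """Render a fired rule as a human-readable explanation."""
    premises = " AND ".join(f"{attr}({value})" for attr, value in rule.antecedent.items())
    return f"{premises} => {rule.succedent} (confidence {rule.confidence:.2f})"
```

Returning the fired rules themselves, rather than only the conclusion, keeps the antecedents available, so the same structure can back the explanatory function of the system.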

Rule extraction, in contrast to the method based on building a classifier, provides better insight into different diagnostic and treatment factors. It also enables customizing the associations that are to be discovered. Moreover, when implemented in the knowledge base of a decision support system, the rules can potentially provide an explanatory mechanism: the decision the system arrives at can be explained by means of the antecedent parts of the rules that were triggered. The system can also potentially serve educational purposes. Personnel untrained in tinnitus treatment can learn tinnitus diagnosis and treatment by using the system and its explanatory functionalities, which imitate the behavior and decision-making processes of a human expert.

To sum up the experiments, the discovered rules, when confronted with expert knowledge, confirm the correctness of the approach and methodology. The newly discovered, previously unknown patterns provide an expert with additional knowledge that otherwise could not be easily noticed in a large and complex dataset. It is important to note that the discovered knowledge should be treated as hypotheses, which have to be confirmed either by an expert or by a controlled study designed to validate the hypothetical claims. In particular, the rules generated and presented in this work should not be used for any diagnosis or treatment decision, nor to suggest any particular course of treatment. The work within this research is experimental and aimed at presenting a potential application of action rules and meta actions in the area of medical diagnosis and treatment. The final validity check of the presented approach can be done by comparing clinical results with the extracted knowledge.

7.2 Conclusions

The work within this chapter verifies the possibility of applying traditional machine learning techniques, such as classification and association rules, as well as novel data mining methods, including action rules and meta actions, to a practical decision problem in the area of medicine.

The work included a series of data preprocessing steps, building new features and testing the proposed approach with the use of chosen methodologies and tools. The tests of the knowledge discovery approach were divided into: testing the classification model first (not presented in this chapter), extracting the decision rules, and generating action rules/meta actions. New temporal features were introduced to describe the sparse records of patients’ visits. Next, they were used in building a classification model and extracting the rules. Interesting and potentially novel rules relating treatment factors to symptoms were revealed.