
1 Introduction

Demographic shifts and increasing life expectancy have created an awareness that the health care system is, and will remain, increasingly difficult to control, organize and finance, especially where the ageing population is concerned. The senior citizen population is growing, along with the diversity of its health backgrounds and medico-social needs, which cannot easily be met because health aspects, social conventions and lifestyles are intertwined with the ageing process. Long-term care is a variety of services that includes medical and non-medical care for people who have a chronic illness or disability. This illness or disability may include problems with memory loss, confusion, or disorientation, known as cognitive impairment, which can result from conditions such as Alzheimer’s disease [15]. Care needs often increase with age or as a chronic illness or disability progresses. Long-term care helps meet health or personal needs. Most long-term care assists people with activities of daily living such as dressing, bathing, and using the toilet. Approximately 70 % of individuals over the age of 65 will require at least some type of long-term care services during their lifetime. Over 40 % will need care in a nursing home for some period of time. Nursing homes provide long-term care to people who need more extensive care, particularly those whose needs include nursing care or 24-hour supervision in addition to their personal care needs. We focus our interest on nursing homes.

This paper is organized in eight sections. After this introduction describing the scope of the study, the study context is presented in Sect. 2. The characteristics of the collected data are described in Sect. 3. The profiles of residents obtained by using cluster analysis are presented in Sect. 4. A brief review of previous works is presented in Sect. 5. Section 6 deals with the techniques used (regular probabilistic grammar inference) to model the automaton representing the changing profiles and their development over time. Starting from this automaton, a Markov model is deduced. This makes it possible to verify whether the system reaches a steady state. Section 7 presents the results obtained for four medical nursing homes situated in France (Bernadette, Soleil, Les Myosotis and Val Dorlay), with a focus on dementia. We conclude with directions for future work.

2 Study Context

This project is being carried out in close collaboration with a French mutual benefit organization called “Mutualité Française de la Loire” which manages several nursing homes.

To fully explore the opportunities for our approach, we propose a modelling environment. It allows one to manage resources, to draw up medico-social resource plans and to simulate these plans in order to evaluate the performance of each (Fig. 1).

Fig. 1. Modelling environment

The Decisional Information System is crucial in the methodological framework. The objective is to design an environment of decision-aid tools dedicated to health-care and social services professionals for strategic and tactical decisions such as:

  • Which activity should be developed, and at what cost?

  • Whether or not to open additional facilities - and if so which ones?

  • etc.

The aim is to provide a certain standard of care to dependent old people in nursing homes, through the sharing of resources (staff, finance and equipment) thanks to all the key indicators (via the managers’ choices).

The methodological framework is described in Fig. 2.

Fig. 2. Methodological framework

So, in order to do this, we have to identify different senior citizen profiles as well as the length of stay according to these profiles and to study their development over time. The collected data comes from administrative data, previous medical history, etc.

We evaluate the workload as a function of the profile of each elderly person in the nursing home. Based on nursing home pricing (for accommodation and care as a function of the degree of dependence…), we can verify that:

  • the budget will not be exceeded,

  • we have enough staff to carry out the workload.

Simulations are used to forecast population ageing. We can thus identify the workload in the short, medium and long term and estimate the resources needed. From the needs of potential elderly residents, we can also evaluate which profiles should be accepted to ensure that both quality and safety criteria are respected from a workload and financial point of view.

In this article, we present the study of the progression over time of the autonomy-disability of elderly people. The project consists of the following steps:

  1. The specification of elderly people’s profiles by using an unsupervised learning approach [9],

  2. The study of the development of these profiles over time by using a probabilistic graph of transitions between the clusters, inferred by the k-TSSI (k-Testable Languages in the Strict Sense Inference) algorithm. The objective is to deduce a Markov process with a discrete (finite or countable) state space.

  3. The use of discrete-time Markov chain simulation to forecast population ageing. It makes it possible to identify the care needs of elderly people and the workload in the short, medium and long term, and to predict future costs. An application is presented in [8].

3 Data Collected

The quantitative data come from the databases of the information system that deals with the evaluation of the autonomy/disability of elderly people. Dependence evaluation in France is carried out using a specific national grid called AGGIR: Autonomy-Gerontology-Group-Iso-Resources. The quantitative data concern 628 residents and more than 2,200 observations of independence evaluations. The evaluations are made by the resident doctor in collaboration with the medical staff. An item can be evaluated using four adverbs (see Fig. 3):

Fig. 3. A.G.G.I.R. grid

  • Spontaneously corresponding to the letter S,

  • Entirely corresponding to the letter E,

  • Correctly corresponding to the letter C,

  • Usually corresponding to the letter U.

The codification is as follows. If all four adverbs are checked, the code is C. If one, two or three adverbs are checked, the code is B. If no adverb is checked, the code is A.

The proposed algorithm uses numerical data. So, the corresponding values are:

  • 0 for code A meaning the person can do it alone,

  • 1 for code B meaning the person can partially do it,

  • 2 for code C meaning the person cannot do it alone.
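The coding rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this section: the function name `code_item` and the representation of the checked adverbs as a set of letters are assumptions made for the example, not part of the AGGIR grid itself.

```python
# Letters of the four adverbs: Spontaneously, Entirely, Correctly, Usually.
ADVERBS = frozenset("SECU")
# Numerical values used by the clustering step, as listed above.
CODE_VALUE = {"A": 0, "B": 1, "C": 2}


def code_item(checked):
    """Return (code, value) for one grid item, given the checked adverbs.

    `checked` is any iterable of adverb letters (hypothetical representation).
    """
    checked = set(checked)
    unknown = checked - ADVERBS
    if unknown:
        raise ValueError(f"unknown adverbs: {sorted(unknown)}")
    if len(checked) == len(ADVERBS):
        code = "C"      # all four adverbs checked
    elif checked:
        code = "B"      # one, two or three adverbs checked
    else:
        code = "A"      # no adverb checked
    return code, CODE_VALUE[code]
```

For instance, `code_item("SE")` yields code B with value 1, while an empty set of checked adverbs yields code A with value 0.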

The first step is to analyze the degree of autonomy-disability in order to identify clusters.

4 Identification of Residents’ Profiles

The aim is to find feature-patterns related to the autonomy-disability level of elderly people living in nursing homes. These levels correspond to profiles based on people’s ability to perform activities of daily living such as washing, dressing and moving. To achieve this aim, an unsupervised learning approach is proposed [9]. It is based on the principal component analysis technique, which guides the determination of the clusters with self-organizing partitions. Cluster analysis is performed on 8 variables: Transferring to or from bed or chair, Moving indoors, Washing, Toilet, Dressing, Food, Orientation, Coherence. The cluster analysis identifies two kinds of patterns (see Fig. 4):

Fig. 4. Contribution of the clustering

  • The decline in executive functions with regard to motor and functional abilities, called apraxia disorders,

  • The cognitive impairment and neuropsychological deficits.

By combining clustering with a machine learning process, we should be able to predict the development of physical or mental autonomy loss in elderly people over time. To reach this objective, we use a machine learning approach based on grammar inference in order to infer a probabilistic automaton. In this article, we only present the evolution of patients’ profiles with regard to upper body functional disorders (cognitive impairment).

5 Related Works

We want to obtain a probabilistic graph of transitions between states (clusters) with the length-of-stay in each state (temporal state representations). It is also interesting to study cluster successions of length k (for example, the last 3 cluster states of a resident). Probabilistic automata are used in various areas of pattern recognition or in fields to which pattern recognition is linked. Different concept learning algorithms have been developed for different types of concepts.

We are interested in the class of regular grammars, the simplest class of formal grammars in the Chomsky hierarchy; identifying them corresponds to learning deterministic finite automata (DFA).

The learning of DFA, also called regular inference, is based on the acceptance of regular languages, which make it possible to model the behaviour of systems. The aim is to construct a DFA from information about the set of words it accepts. There are many algorithms for learning DFAs, the best known being the algorithm due to Dana Angluin [4, 5]. There are many approaches for regular inference [6, 7, 11, 16, 17, 19]. For more information, the book [14] presents an overview of learning automata and grammar inference.

A finite automaton with transition probabilities represents a distribution over the set of all strings defined over a finite alphabet. The articles [18, 24] present a survey and a study of the relations and properties of probabilistic finite-automata and tree. The article [10] clarifies the links between probabilistic automata and Hidden Markov Models (HMM). In a first part of this work, the authors present:

  • the probability distributions generated by these models,

  • the necessary and sufficient conditions for an automaton to define a probabilistic language.

The authors show that, on the one hand, probabilistic deterministic finite automata (PDFA) form a proper subclass of probabilistic non-deterministic automata (PNFA) and, on the other hand, PNFA and HMM are equivalent.

However, there are almost no extensions of these algorithms to the setting of timed systems. A variant of a DFA that includes the notion of time is called a Timed Automaton (TA). These models are based on the fact that each symbol of a word occurs at a certain point in time. A TA can be seen as an automaton that generates strings of event-time value pairs, called timed strings. Very few works exist in this domain [13, 13, 20, 21]. Timed automata correspond to finite state models in which an explicit notion of time is taken into account and is represented by timed events. Time can be modelled in different ways, e.g. discrete or continuous. More recent works [22, 23] propose an algorithm for learning simple timed automata, known as Deterministic Real-Time Automata (DRTA), where the transitions can carry a temporal constraint on the time of occurrence of the current symbol relative to the previous symbol. The main drawback of such approaches is that they can result in an exponential blow-up of both the input data and the size of the resulting model, and the obtained models are quite difficult to interpret.

The main objective of the study is to identify automaton models from observations and to be able to analyze the identified model. We want to take an established method for learning a DFA and apply it to our timed sequences. Our problem can be modelled as a timed-state transition graph, i.e. a probabilistic deterministic finite automaton (PDFA) that takes timed events into account. We also have a set of positive timed strings (or time-stamped event sequences).

From a set of labelled (positive only) time-stamped event sequences, the problem is to find the automaton model that most likely produced the data. We do not want to learn (identify) a DRTA with time constraints as in [22], because our problem is not exactly that of a real-time system. We only have timed strings, from which we propose to automatically deduce a Markov chain model.

The k-TSSI (k-Testable Languages in the Strict Sense Inference) algorithm [11, 12] is useful, convenient and suitable for two reasons: the simplicity of implementation and the possibility of taking into account memory effects (macro-states). The inductive inference of the class of k-testable languages in the strict sense (k-TLSS) has been studied and adapted to local languages, N-grams and tree languages. A k-TLSS is essentially defined by a finite set of substrings of length k that are permitted to appear in the strings of the language. Given a memory size k, the objective is to find an automaton for the language. This subclass of languages, called k-testable languages, has the property that the next character depends only on the previous k-1 characters. In our case, it is interesting to be able to identify the substrings (memory) of length k.
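To make the idea of a language "defined by its substrings of length k" concrete, here is a simplified Python sketch. It is not the k-TSSI algorithm of [11, 12]: it only collects the prefixes and suffixes of length k-1 and the substrings of length k observed in a positive sample, and accepts a word when all of its own segments have been observed. The function names and the handling of words shorter than k are assumptions made for this illustration.

```python
def learn_ktss(sample, k):
    """Collect prefixes, suffixes and k-length segments seen in the sample."""
    prefixes, suffixes, segments, short = set(), set(), set(), set()
    for w in sample:
        if len(w) < k:
            short.add(w)  # too short to contain a segment of length k
            continue
        prefixes.add(w[:k - 1])
        suffixes.add(w[len(w) - (k - 1):])
        segments.update(w[i:i + k] for i in range(len(w) - k + 1))
    return prefixes, suffixes, segments, short


def accepts(model, w, k):
    """Accept w if its prefix, suffix and every k-segment were observed."""
    prefixes, suffixes, segments, short = model
    if len(w) < k:
        return w in short
    return (
        w[:k - 1] in prefixes
        and w[len(w) - (k - 1):] in suffixes
        and all(w[i:i + k] in segments for i in range(len(w) - k + 1))
    )
```

With k = 2 and the sample {ddb, dd, bbbbbaaaab}, the word dddb is accepted because it only contains the observed segments dd and db, whereas bd is rejected because the segment bd never occurs in the sample.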

However, our goal is to infer a timed-system model, and an automaton inferred by the k-TSSI algorithm does not take timed strings into account.

A further problem is that it is difficult to take into account a set of substrings of length k (k > 1), and the algorithm is not generalized to probabilistic timed automata. In this section we propose a model that takes the concept of time into account in the automaton inferred by the k-TSSI algorithm (i.e. the length of time a resident spends in a particular cluster), treating it as a timed-transition system. In the next section, we present the implementation of the model.

6 Development of Patients’ Profiles: Model Implementation

We present an algorithm for inferring a model of a timed transition system based on k-Testable Languages in the Strict Sense (k-TLSS).

The method consists of:

  1. Learning a Deterministic Finite Automaton (DFA) of timed-transition systems by using an extension of the k-TSSI algorithm.

  2. Transforming this DFA into a probabilistic DFA.

  3. Converting this probabilistic DFA into a Markov chain model.

6.1 Preliminaries

The aim of grammatical inference is to learn models of languages from sample sentences in these languages. A sentence can be any structured composition of primitive elements or symbols, though the most common type of composition is concatenation. We thus infer a grammar whose corresponding representation is an automaton.

A finite automaton is a 5-tuple A = (Q, Σ, δ, q0, F) with:

  • Σ: a finite input alphabet of symbols,

  • Q: a finite set of states with q0 as start state,

  • F: a set of final states (F ⊆ Q),

  • δ: a transition function Q × Σ → Q, so that qʹ = δ(q, σ) returns the state reached from the current state q on the input symbol σ ∈ Σ. Each transition is denoted by a 3-tuple (q, σ, qʹ).

If for all q ∈ Q and for all σ ∈ Σ, δ(q, σ) corresponds to a unique state of Q, then the automaton is said to be a Deterministic Finite Automaton (DFA). Grammatical inference refers to the process of learning rules from a set of labelled examples. It belongs to a class of inductive inference problems [4] in which the target domain is a formal language (a set of strings generated from some alphabet Σ) and the hypothesis space is a family of grammars. It is also often referred to as automata induction, grammar induction, or automatic language acquisition. The inference process aims at finding a minimum automaton (the canonical automaton) that is compatible with the examples. In regular grammar inference, we have a finite alphabet Σ and a regular language L ⊆ Σ* where:

  • Σ*: the set of all finite length strings generated from Σ,

  • L: a subset of Σ* corresponding to the words recognized by the automaton A.

Given a set of examples that are in the language (I+) and a (possibly empty) set of examples not in the language (I-), the task is to infer a deterministic finite automaton A that accepts the examples in I+ and rejects the examples in I-.
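As a minimal sketch of these definitions, a DFA can be stored as a dictionary mapping (state, symbol) pairs to states, and the inference task amounts to finding such a dictionary that is consistent with I+ and I-. The two-state automaton `DELTA` below is a hypothetical example chosen for illustration, and treating a missing transition as a rejection is an assumption of this sketch.

```python
def run_dfa(delta, q0, finals, word):
    """Return True if the DFA accepts `word`; a missing transition rejects."""
    q = q0
    for symbol in word:
        if (q, symbol) not in delta:
            return False
        q = delta[(q, symbol)]
    return q in finals


def consistent(delta, q0, finals, positives, negatives=()):
    """Check that the DFA accepts every word of I+ and rejects every word of I-."""
    return (
        all(run_dfa(delta, q0, finals, w) for w in positives)
        and not any(run_dfa(delta, q0, finals, w) for w in negatives)
    )


# Hypothetical DFA over {a, b} that accepts the words ending in 'b'.
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
```

Here `consistent(DELTA, 0, {1}, ["b", "ab"], ["a", ""])` holds, so this automaton would be an admissible answer for that (toy) pair of example sets.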

6.2 Automata Induction

The k-TSSI algorithm [11] allows us to infer k-Testable Languages in the Strict Sense. A language of this class is defined by a finite set of substrings of length k that are allowed to appear in the strings of the language. Given a positive sample I+ ⊆ L of strings of an unknown language L, a deterministic finite-state automaton that recognizes the smallest k-TLSS containing I+ is obtained. An automaton inferred by the k-TSSI algorithm is, by construction, non-ambiguous. Moreover, our choice is justified by the fact that k-testable languages (k > 1) can take into account a memory effect (i.e. N-grams). Indeed, we observed during data analysis that the evolution of the autonomy/disability state depends on the resident’s previous states and diseases (especially for chronic and disabling diseases such as osteoarticular degenerative diseases, anxio-depressive disorder, behavioural disorders…). To illustrate our approach and for the sake of simplicity, we present in this article the results obtained with 1-TSSL (the next state depends only on the previous state) in order to explain how we turn the timed transitions into sequences (time series). We choose to divide the length-of-stay in each cluster into discrete steps (for example, one discrete step T = 30 days).

Consequently, the corresponding automaton is a 6-tuple (Q, Σ, δ, q0, F, d) where d corresponds to the length-of-stay in the clusters. To each cluster, we assign a symbol σ ∈ Σ. Each symbol is represented by a pair (σ_i, d_i), where d_i is the delay on the symbol σ_i, which corresponds to a mapping d: Σ → ℕ with d(σ_i) = d_i.

In the following sections, we explain the implementation of the model through an example (on only six residents: 7, 12, 17, 14, 8, 44 corresponding to an excerpt of the collected data).

Setting Up the Alphabet. The assessment of elderly people’s autonomy/disability allows us to classify residents into five levels of mental dependence (5 to 1 in decreasing order of severity). Figure 5 presents the data collected from the database.

Fig. 5. Data and sequencing

The resident assessment is made on different dates. For example, resident number 7 was evaluated at level 3 (mental disorder) on 06/24/2002. From all the assessments concerning resident number 7, we can deduce the sequence: 3321111. But this sequence does not express the amount of time the person spends in each state (level of mental disorder). In this model, each symbol of a word occurs at a certain point in time.

From such observations, we only obtain positive data I+, and we now have to describe how we can obtain timed strings. The following paragraph presents the basic notions explaining how we take into account timed transitions between clusters.

Preliminary Mapping of the Set of Strings. The objective is to obtain a stochastic state transition graph taking into account the length-of-stay in each state. So we have to associate a time value with each occurrence of a symbol (event). In practice, we use the evaluation date.

The first step consists in the definition of the alphabet (the set Σ). The set Σ is based on an alphabet of 6 symbols {a, b, c, d, e, f}, which correspond to:

  • a: a length-of-stay in cluster number 1 during a given period T (example: 30 days), meaning d(a) = T,

  • b: a length-of-stay in cluster number 2 during a given period,

  • etc. (until the symbol e for cluster number 5).

The symbol f models the fact that a resident can leave the nursing home or corresponds to the last resident assessment during the last 30 days before the data extraction. It is only used when we want to deduce the Markov model. Consequently, in the following example, the symbol f does not exist in Fig. 6.

Fig. 6. The automaton inferred by the k-TSSI algorithm with q0 = 0

The second step concerns the identification of the words, i.e. the translation of the initial sequence so as to take into account the length of time spent in each cluster. Resident number 7 stayed in cluster number 3 from 06/24/2002 to 03/15/2004 (the date at which the resident was evaluated and changed to cluster number 2). Thus resident number 7 stayed in cluster number 3 for about 22 periods of 30 days. The symbol modelling cluster number 3 for 30 days is c; consequently the initial sequence “33” becomes “cccccccccccccccccccccc”. The resident then stayed in cluster number 2 for 9 periods, and so on.

The corresponding word is:

$$ ccccccccccccccccccccccbbbbbbbbbaaaaaaaaaaaaaaaaaa $$
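The translation of dated assessments into a word can be sketched as follows. The exact rounding rule used to obtain the number of periods is not stated here, so this sketch assumes that the number of periods is rounded up, with at least one period per assessment, and that the last assessment is counted up to a hypothetical `end_date` (for instance the date of data extraction). The function name `to_word` and the dates used below are illustrative only.

```python
from datetime import date
from math import ceil

# Cluster number -> symbol, following the alphabet defined above.
SYMBOL = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}


def to_word(assessments, end_date, period_days=30):
    """Expand dated cluster assessments into one symbol per elapsed period.

    `assessments` is a chronologically sorted list of (date, cluster) pairs.
    Rounding up and the use of `end_date` are assumptions of this sketch.
    """
    parts = []
    for i, (day, cluster) in enumerate(assessments):
        if i + 1 < len(assessments):
            next_day = assessments[i + 1][0]
        else:
            next_day = end_date
        periods = max(1, ceil((next_day - day).days / period_days))
        parts.append(SYMBOL[cluster] * periods)
    return "".join(parts)
```

For a hypothetical resident assessed in cluster 3 on 1 January 2020 and in cluster 2 on 1 March 2020, with an end date of 31 March 2020, the 60 days in cluster 3 give two periods and the 30 days in cluster 2 give one, so the word is ccb.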

So we obtain the set I+ ⊆ Σ*. I+ corresponds to the learning set from which the automaton is inferred. The initial set of sequences (Fig. 5)

  • {3321111, 42, 212, 56656, 243333, 4}

becomes:

$$ \begin{aligned} {\text{I}}_{ + } & = \{ ccccccccccccccccccccccbbbbbbbbbaaaaaaaaaaaaaaaaaa, \, ddb, \, bbbbbaaaab, \, \\ & \quad \quad \quad ddddeeeeeeeeddee, \, bbbdddddddccccccccccccccccc, \, dd\} \\ \end{aligned} $$

From the set I+, by using the k-TSSI algorithm (to simplify, we present the case corresponding to k = 1), we obtain the automaton described in Fig. 6. The algorithm [12] consists in building the sets Q, Σ, δ, q0, F by observation of the corresponding events in the training strings. From these sets, a finite-state automaton that recognizes the associated language is straightforwardly built.
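The construction "by observation of the training strings" can be illustrated with a simplified sketch for the case presented here, in which one state corresponds to one symbol. This is not the full k-TSSI algorithm of [12]: the sketch assumes that the state reached after reading a symbol is named after that symbol, and the names `infer_automaton` and `I_PLUS` are introduced only for the example.

```python
def infer_automaton(sample, q0="q0"):
    """Build (Q, alphabet, delta, q0, F) with one state per symbol read."""
    delta, finals, alphabet = {}, set(), set()
    for word in sample:
        q = q0
        for symbol in word:
            alphabet.add(symbol)
            delta[(q, symbol)] = symbol  # state named after the symbol read
            q = symbol
        finals.add(q)                    # state reached at the end of the word
    states = {q0} | alphabet
    return states, alphabet, delta, q0, finals


# The learning set I+ given above.
I_PLUS = [
    "ccccccccccccccccccccccbbbbbbbbbaaaaaaaaaaaaaaaaaa",
    "ddb",
    "bbbbbaaaab",
    "ddddeeeeeeeeddee",
    "bbbdddddddccccccccccccccccc",
    "dd",
]

states, alphabet, delta, q0, finals = infer_automaton(I_PLUS)
```

On this sample, the initial state has outgoing transitions for b, c and d only (the first symbols of the words), and every observed pair of consecutive symbols, such as a followed by b, yields one transition.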

6.3 Computation of Probabilities

The automaton is inferred by the k-TSSI algorithm. We then have to associate probabilities with its transitions and final states. In order to compute these probabilities, we use the learning set I+. While the words of I+ are recognized by the automaton inferred by k-TSSI, we count:

  • each use of the transition from the state q by the symbol σ: cp_δ(q,σ),

  • each passage through a state q: cp_q,

  • each time a state q is the final state (end of a word): cp_q_final.

These three counts are used to estimate the probabilities. The algorithm computing the probabilities from a learning set is the following.

From the automaton (Fig. 6) and the set I+, we count:

  • cp_q: the number of times the state q was used while generating the set of words of I+.

  • cp_δ(q,σ): the number of times the transition δ(q,σ) was used while generating the set of words of I+.

  • cp_q_final: the number of times that the state q is final (the state reached at the end of a word).

The obtained results from the sample presented in Fig. 5 are:

And afterwards, we deduce the probabilities by normalizing the counts: p_δ(q,σ) = cp_δ(q,σ) / cp_q and p_q_final = cp_q_final / cp_q.
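As a rough illustration of this counting step, the sketch below parses each word of a learning set, accumulates the three counts described above, and divides by the number of visits to each state, which is the usual relative-frequency estimate. It assumes the one-state-per-symbol automaton of the case presented here; the function name and the small sample in the usage note are hypothetical.

```python
from collections import Counter


def estimate_probabilities(sample, q0="q0"):
    """Estimate transition and final-state probabilities from a sample."""
    cp_q, cp_delta, cp_final = Counter(), Counter(), Counter()
    for word in sample:
        q = q0
        cp_q[q] += 1                     # the word visits the start state
        for symbol in word:
            cp_delta[(q, symbol)] += 1   # transition used from q on symbol
            q = symbol                   # one state per symbol (assumption)
            cp_q[q] += 1
        cp_final[q] += 1                 # the word ends in state q
    p_delta = {t: n / cp_q[t[0]] for t, n in cp_delta.items()}
    p_final = {q: cp_final[q] / cp_q[q] for q in cp_q}
    return p_delta, p_final
```

With the two words ddb and dd, the state d is visited four times: twice it loops on d, once it moves to b and once it ends a word, so the estimates are 0.5, 0.25 and 0.25 respectively.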

So we obtain the probabilistic deterministic automaton in which the time series are taken into account. The advantage of using 1-TSSL (k-TSSI algorithm with k = 1) lies in the fact that one state corresponds to one symbol. We have added a new symbol f and a final state q6 in order to facilitate the translation of the probabilistic automaton into a Markov process. For every state q where p_q_final > 0, we add a transition δ(q,f) = q6 with p_δ(q,f) = p_q_final, and we set p_q_final ← 0. We note that p_q6_final = 1.

From the files of patients living in the Soleil nursing home and suffering from Alzheimer’s disease, the probability matrix of transitions between states and the corresponding automaton are presented in Table 1 and in Fig. 7, respectively.

Table 1. The corresponding probability matrix of transitions between states (Fig. 7).
Fig. 7. The automaton inferred by the k-TSSI algorithm (Soleil nursing home: residents suffering from dementia).

6.4 Markov Model

The final state q6 represents not only the state of residents when they left the system but also the last resident assessment (for residents present in the system at the date of database extraction).

In order to obtain the Markov chain model, we have to compute the probabilities:

  • Pe_i: input probabilities (i.e. the initial resident assessments) in each cluster i (i = 1..5),

  • Ps_i: output probabilities (i.e. the last resident assessments when residents leave the system) from cluster i (i = 1..5) after d(cluster i) = 30 days (corresponding to the equidistant discrete time described in the automaton definition in Sect. 6.2).

We also have to modify the probabilities of staying in cluster i (i = 1..5), depending on whether the patient is still staying in the nursing home at the date of database extraction (these evaluations are taken into account in the transition with the symbol f to q6 in Table 1). We add the number of such evaluations to the corresponding cluster i. This is the reason why the probability of staying in cluster1 (initially 0.9738 in Table 1) becomes 0.9902 in the Markov matrix.

When a resident leaves the system, he or she is immediately replaced by a new resident. Consequently, two other probabilities, PE and PS, are taken into account. The Markov matrix is presented in Table 2.

Table 2. The Markov matrix obtained from the collected data - Soleil Nursing home: patient suffering from dementia.

We verify whether the system reaches a steady state. By definition, an eigenvector x is associated with the eigenvalue λ if: \( \qquad A x = \lambda x. \)

(A corresponding to the probability matrix presented in Table 2)

An eigenvector x associated with the eigenvalue 1, normalized so that its components sum to 1, is called a steady-state vector. If 1 is a simple eigenvalue and no other eigenvalue has modulus 1, as is the case when the chain is irreducible and aperiodic, this steady-state vector is unique and the system converges to it.

The eigenvector associated with the eigenvalue 1 has been computed. We have one eigenvalue 1 and the corresponding eigenvector x is the following:

0.00692 0.01263 0.01966 0.12108 0.03768 0.79510 0.00693.

The interpretation of this eigenvector is that the system evolves towards a state where, setting aside the resident turnover (about 0.69 % each for input to and output from the nursing home), the percentages of the population are:

  • 1.28 % are in cluster1,

  • 1.99 % are in cluster2,

  • 12.28 % are in cluster3,

  • 3.82 % are in cluster4,

  • 80.63 % are in cluster5.
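The percentages above can be recovered from the eigenvector by a simple renormalization. The sketch below assumes that the first and last components of the vector correspond to the input (PE) and output (PS) states and that the five middle components correspond to cluster1 to cluster5; this ordering is inferred from the values and is not stated explicitly.

```python
# Steady-state eigenvector reported above (7 components).
EIGENVECTOR = [0.00692, 0.01263, 0.01966, 0.12108, 0.03768, 0.79510, 0.00693]


def cluster_shares(vector):
    """Drop the first and last components and renormalize to percentages."""
    inner = vector[1:-1]   # assumed to be cluster1 .. cluster5
    total = sum(inner)     # share of the population that is not turnover
    return [round(100 * v / total, 2) for v in inner]


shares = cluster_shares(EIGENVECTOR)
```

The five middle components sum to 0.98615, and dividing each of them by this total gives the shares listed above, for example 0.79510 / 0.98615 ≈ 80.63 % for cluster5.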

7 Experiments

The experiment deals with the evolution of patients’ loss of cognitive autonomy over time. Table 3 presents the steady-state vectors for different samples; a steady state corresponds to a stable condition that does not change over time or in which change in one direction is continually balanced by change in another. We see that the decline is more significant for elderly people with dementia than for elderly people without dementia.

Table 3. Steady state: population staying in medical nursing homes.

We now simulate the evolution over time by using the transition matrix of the Markov chain for each population. The results concerning the progression of patients’ profiles over 2 years are presented in Tables 4 and 5.

Table 4. Evolution of patients’ profiles in 2 years (patients without dementia)
Table 5. Evolution of patients’ profiles in 2 years (patients suffering from dementia)

If the patient does not suffer from dementia and is initially in cluster5, then after 2 years the probability that the patient is in:

  • Cluster5 is 50.9 %,

  • Cluster4 is 16 %,

  • Cluster3 is 5.8 %…

  • and the patient leaves the system with a probability close to 23 %.

If the patient suffers from dementia, the probability that the patient is in:

  • Cluster5 is 9.7 %,

  • Cluster4 is 20.6 %,

  • Cluster3 is 27.1 %,…

  • and the patient leaves the system with a probability close to 10 %.
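Figures of this kind are obtained by propagating an initial distribution through the transition matrix for the number of 30-day steps that covers the horizon. The sketch below shows the mechanism only: the 3-state matrix `P` is hypothetical (it is not the matrix of Table 2), the matrix is assumed to be row-stochastic, the exit state is modelled as absorbing for simplicity, and 24 steps is an approximation of 2 years.

```python
def evolve(dist, matrix, steps):
    """Propagate a distribution through `steps` transitions (rows sum to 1)."""
    n = len(matrix)
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical chain: two clusters and an absorbing "left the home" state.
P = [
    [0.95, 0.03, 0.02],
    [0.05, 0.90, 0.05],
    [0.00, 0.00, 1.00],
]

STEPS_2_YEARS = 24  # roughly 2 years of 30-day periods (assumption)

# Distribution after 2 years for a resident who starts in the first cluster.
after = evolve([1.0, 0.0, 0.0], P, STEPS_2_YEARS)
```

Each component of `after` is the probability of being in the corresponding state after 24 periods; running the same function for a very large number of steps on a chain with replacement of departing residents approximates the steady state discussed in Sect. 6.4.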

8 Conclusion

A real case application of grammatical inference to identify the progression of a resident’s autonomy-disability over time has been presented. From profiles identified by using a clustering approach [9], we present preliminary results of an investigation in which regular grammars are used for modelling the evolution of ageing over time. The finite automaton is inferred by using the k-TSSI algorithm and afterwards modified in order to obtain a probabilistic graph of transitions between states (clusters) with the length-of-stay in each state. From this graph, we automatically deduce the corresponding Markov chain model. For the sake of simplicity, we only present in this article the case where k = 1. It is evident that in this case, we can use a bi-gram. But we have also studied the evolution with k = 2..n. The approach thus makes it possible to identify sub-sequences of n items in the sequences.

In future work, we will extend and validate the different models on other classes of diseases. Approximately 1–1.5 % of the French population suffers from dementia; its causes include neurological disorders such as Alzheimer’s disease (which causes 50 %–70 % of all dementia), blood flow-related (vascular) disorders such as multi-infarct diseases, inherited disorders such as Huntington’s disease, and infections such as HIV [15]. We would also like to simulate patients’ progress in order to forecast and analyze the need for long-, medium- and short-term care. This will allow us to evaluate the human, financial and physical resources needed in the future.