1 Introduction

Europe currently faces two societal challenges: the rising prevalence of overweight and obesity, and population aging. These problems have a tremendous impact on quality of life (e.g. poor health, social exclusion, increased need for assistance) and challenge the food industry to develop new strategies for producing nutritionally well-balanced products (e.g. with less fat, sugar and salt) while using sustainable transformation processes. It is therefore crucial to better understand the food production system. A particularly interesting issue is how to combine data and knowledge from different disciplines, such as the nutritional composition of food, food digestion as a physiological process and the sensory perception of food.

The Delicious project addresses the problem of analyzing the production and transformation processes of dairy gels using information available from different collaborative projects concerning food composition, food structure, mobility/bioavailability of flavor compounds and nutrients, sensory perception and digestibility. It involves domain experts and computer science researchers from INRA, the French National Institute for Agricultural Research. The expected result of the Delicious project is to collect and structure the available data and knowledge into a data warehouse in order to enhance the analysis of the production process according to different cross-domain criteria. However, it is very difficult to take advantage of all the data and knowledge available in the project. The main difficulty comes from the heterogeneity of the sources, the different inter-domain or cross-domain vocabularies, and the different formalisms used in each domain. A second challenge for the data integration task is the quantification of uncertainty (randomness, incompleteness, imprecision, vagueness) resulting from the natural variability of the domain and the lack of information. A relevant solution for integrating knowledge and data is the use of an ontology [4]. An ontology can be defined as a formal common vocabulary of a given domain, shared by the domain experts [7].

This paper presents the Process and Observation Ontology, called \(PO^2\), designed for the Delicious project. \(PO^2\) was built following Scenario 6 of the NeON methodology [2], i.e. reusing, merging and re-engineering ontological resources. The core component is implemented in OWL (Footnote 1) and the domain component is under development.

Using the \(PO^2\) vocabulary, the data sets available for the project were structured for the integration task. A use case is presented in order to show the complexity of this task.

This first step of building the \(PO^2\) ontology makes it possible to structure and organize the knowledge into a meaningful model at the knowledge level. This opens the way to designing more complex decision support systems that compare different production scenarios and thereby suggest improvements in product quality while reducing the environmental impact. It may also help the field by indicating which data should be collected in order to perform an analysis for a target population (e.g. children or elderly people) or a cause-and-effect analysis. It may also provide the French food industry with the necessary tools to anticipate and develop future food products.

The paper is organized as follows. In Sect. 2, we present the ontology specification. In Sect. 3, the conceptualisation of \(PO^2\) is detailed. In Sect. 4, we illustrate \(PO^2\) through a use case. Finally, we conclude in Sect. 5 and present our further work.

2 Ontology Specification

The ontology specification was carried out as an iterative process. The ontology developers and the domain experts held numerous meetings in order to identify (1) why the domain experts want to build an ontology (i.e. for what purpose), (2) who its intended users will be and (3) what the main entities are.

First, the purpose of building an ontology is to provide a consensual model of the production and transformation of dairy gels and to remedy the lack of communication between domain experts. The available data were gathered for many different purposes by different experts, each with their own experimental itineraries, vocabularies, and technical materials and methods. There is an obvious need to build a common and shared structured vocabulary.

Second, the intended users are researchers in several distinct domains: nutrition, microbiology, biochemistry, physico-chemistry, chemistry, process engineering, food science and sensory analysis. Reaching a consensus about a common vocabulary was therefore a hard task. The ontology developers and the 15 domain experts involved in the Delicious project spent about 20 hours using CMap Tool (Footnote 2) to identify a vocabulary common to all the experts involved. The resulting vocabulary was unstructured and composed of approximately 500 entities dealing with composition, structure, technical and physiological transformation processes, and the mobility and bioavailability of small molecules in relation to sensory perception and nutritional value. It provides a first representation of the explicit and implicit knowledge of all the domain experts involved.

Third and finally, in order to investigate how to structure the vocabulary, we focused on a small representative subset of data and knowledge concerning the In the mouth process. Taking into account the previously identified entities, relying on available documents [1, 5] and data, and working in close collaboration with experts of the target domain, we grouped the entities into three main parts (see Fig. 1):

  • the part concerning the production and transformation process which contains the concepts: process, itinerary and step;

  • the part concerning the participant which contains the concepts: product, mixture, material and sensing device;

  • the part concerning the observation which contains the concepts: observation, scale, sensor output, computed observation, method and measure.

Fig. 1. The three main parts of the ontology for Delicious

We therefore reached a consensus about a common structured vocabulary with the following specifications. An itinerary is an execution of a production or transformation process, i.e. a set of interrelated steps. A step is characterized by its participants and its temporal duration/interval. A participant may be a mixture, a material or a sensing device. Each participant is characterized by its experimental conditions. Moreover, a mixture is characterized by its composition. An observation observes a participant at a certain scale during a step. It is characterized by some participants, such as a given material or a sensing device, and implements a method. Its result is a sensor output and/or a computed observation, each of which can have as its value a function or a simple measure. A measure is characterized by either a quantity and a unit of measure, or a symbolic concept and a measurement scale.
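The specification above can be sketched as a small set of Python data classes. This is only an illustration of the vocabulary, not the OWL implementation of \(PO^2\): the class and field names are assumptions, materials and sensing devices are omitted, and sensor outputs and computed observations are collapsed into a single list of results. The example values are taken from the use case of Sect. 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Measure:
    """A quantity with a unit, or a symbolic concept with a measurement scale."""
    quantity: Optional[float] = None
    unit: Optional[str] = None
    symbol: Optional[str] = None
    measurement_scale: Optional[str] = None

    def __post_init__(self) -> None:
        numeric = self.quantity is not None and self.unit is not None
        symbolic = self.symbol is not None and self.measurement_scale is not None
        # Exactly one of the two forms must be complete.
        if numeric == symbolic:
            raise ValueError("a measure must be either numeric or symbolic")


@dataclass
class Participant:
    name: str
    experimental_conditions: Dict[str, str] = field(default_factory=dict)


@dataclass
class Mixture(Participant):
    # Maps a product name to the measure of its input attribute (e.g. Weight).
    composition: Dict[str, Measure] = field(default_factory=dict)


@dataclass
class Step:
    name: str
    participants: List[Participant] = field(default_factory=list)


@dataclass
class Observation:
    name: str
    observes: Participant
    step: Step
    scale: str
    methods: List[str] = field(default_factory=list)
    results: List[Measure] = field(default_factory=list)


# Values from the use case: mixture L20P28 observed during the Chewing sub-step.
mixture = Mixture(name="L20P28")
mixture.composition["Rennet casein"] = Measure(quantity=238.3, unit="g/kg of cheese model")
chewing = Step(name="Chewing", participants=[mixture])
observation = Observation(name="Observation1", observes=mixture,
                          step=chewing, scale="molecular scale")
```

Rejecting a measure that is neither fully numeric nor fully symbolic mirrors the "either ... or" wording of the specification; an OWL model would express the same constraint differently.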

3 Ontology Conceptualisation

The ontology conceptualization follows Scenario 6 of the NeON methodology [2], i.e. reusing, merging and re-engineering ontological resources. A number of existing ontologies were analyzed: the supply chain ontology [6], the business process ontology [9], the ontology for wine production [8], SSN (Footnote 3), BFO (Footnote 4), IAO (Footnote 5) and \([MS]^2O\) (Multi Scales and Multi Steps Ontology) (Footnote 6).

Fig. 2. \(PO^2\) core component

Based on our experience and after a careful analysis, we decided that the best way to build \(PO^2\), the Process and Observation Ontology, was to re-engineer the core component of \([MS]^2O\), an ontology designed for a project concerning the representation of the production of stabilized micro-organisms (see [3] for more details). This re-engineering of \([MS]^2O\) was guided by two main concerns:

  • establish a clear distinction between a process and its participants, which was achieved by reusing BFO;

  • link the observations to the step in which they occur, their participants, their materials and methods, and their measures, by reusing IAO (Information Artifact Ontology), an ontology of information entities.

The \(PO^2\) core component is given in Fig. 2. The concepts identified in Sect. 2 during the ontology specification are represented as nodes and the relations between the concepts are represented as arrows.

The \(PO^2\) core component is implemented in OWL and is available at http://agroportal.lirmm.fr/ontologies/PO2. The domain component is under development.

4 \(PO^2\) Use Case

This section presents a use case concerning the In the mouth process, in order to show the complexity of the representation task.

At the beginning of the Delicious project, data and knowledge concerning this process were available in different vocabularies and formats. By making use of the vocabulary from the \(PO^2\) core component presented in Sect. 3, the data concerning the studied use case were structured into 20 EXCEL files:

  • 2 files describe the In the mouth process (e.g. Fig. 3),

  • 11 files describe the mixture composition (e.g. Fig. 4),

  • 6 files describe experimental observations (e.g. Fig. 6), and

  • 1 file describes the materials and methods with 29 methods and 16 materials (e.g. Figs. 8 and 9).

Note that these EXCEL files allow the domain experts to collect and restructure the available data using the \(PO^2\) vocabulary. Moreover, these files can be automatically translated into instances of \(PO^2\) (see e.g. Figs. 5 and 7).

Fig. 3. The EXCEL file which describes the In the Mouth process

Figure 3 describes the In the mouth process: it contains one itinerary composed of two steps, the Before putting in the mouth step and the In the Mouth step. The latter is composed of two sub-steps: Chewing and Swallowing.
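The decomposition into steps and sub-steps can be pictured as a small tree. The following Python sketch encodes the itinerary as nested dictionaries and lists it as an indented outline; the dictionary keys are illustrative assumptions, and only the step names come from the use case.

```python
# Itinerary of the In the mouth process as a tree of steps.
# The keys "name", "steps" and "sub_steps" are hypothetical, chosen for illustration.
process = {
    "name": "In the mouth",
    "steps": [
        {"name": "Before putting in the mouth", "sub_steps": []},
        {"name": "In the Mouth", "sub_steps": [
            {"name": "Chewing", "sub_steps": []},
            {"name": "Swallowing", "sub_steps": []},
        ]},
    ],
}


def walk(step, depth=0):
    """Yield (depth, name) for a step and, recursively, for its sub-steps."""
    yield depth, step["name"]
    for sub in step["sub_steps"]:
        yield from walk(sub, depth + 1)


outline = [pair for step in process["steps"] for pair in walk(step)]
for depth, name in outline:
    print("  " * depth + name)
```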

The object studied in this process is a sample of the cheese model mixture identified by the code L20P28. This mixture is composed of ten products, as described in Fig. 4, each product being characterized by the input attribute Weight.

Fig. 4. The EXCEL file which describes the composition of the mixture L20P28

Figure 5 gives an example of an instance extracted from Fig. 4: the mixture L20P28 is composed of the product Rennet casein, whose input attribute Weight has as simple measure the value 238.3 with the unit of measure g/kg of cheese model.

Fig. 5. An example of instance concerning a mixture and its composition
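As a minimal sketch of how such a translation could work, the following Python code turns one row of a composition sheet into subject-predicate-object statements. It assumes the sheet has been exported as CSV; the column names and the property names (isComposedOf, hasInputAttribute, hasValue, hasUnit) are illustrative assumptions, not the actual \(PO^2\) identifiers or the tooling used in the project. The single data row reuses the Rennet casein value of mixture L20P28.

```python
import csv
import io

# Hypothetical CSV export of a composition sheet (column names are assumptions).
SHEET = """mixture,product,attribute,value,unit
L20P28,Rennet casein,Weight,238.3,g/kg of cheese model
"""


def rows_to_triples(text):
    """Translate each row into (subject, predicate, object) statements."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        # One attribute node per row, so that value and unit stay attached to it.
        attr = f"{row['mixture']}_{row['attribute']}_{i}"
        triples += [
            (row["mixture"], "isComposedOf", row["product"]),
            (row["product"], "hasInputAttribute", attr),
            (attr, "hasValue", float(row["value"])),
            (attr, "hasUnit", row["unit"]),
        ]
    return triples


triples = rows_to_triples(SHEET)
for triple in triples:
    print(triple)
```

Introducing a separate node for the attribute keeps the numeric value and its unit together, which follows the description of a simple measure as a quantity paired with a unit of measure.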

Fig. 6. The EXCEL file which describes an experimental observation during the sub-step chewing of the step In the mouth for the mixture L20P28

Fig. 7. An example of instance representing an experimental observation Observation1 during the sub-step Chewing for the mixture L20P28

Fig. 8. The EXCEL file which describes the Material 14 used in the experimental observation Observation1 of Fig. 7

Fig. 9. The EXCEL file which describes the Method 22 and the Method 23 used in the experimental observation Observation1 of Fig. 7

Let us now focus on an experimental observation of the In the Mouth process, as described in the EXCEL file of Fig. 6. This observation instance, called Observation1 in the following, has the following properties (see Fig. 7):

  • is observed during the sub-step Chewing of the step In the mouth;

  • observes the mixture L20P28;

  • has for participants the two materials: Material 1 and Material 14 as described in Fig. 8;

  • has for scale the molecular scale;

  • has for date 10/09/2012;

  • implements the two methods: Method 22 and Method 23, both described in Fig. 9;

  • has for observation result the sensor output Sodium concentration in the saliva, which is a function of the sodium concentration over time;

  • has for computed result the computed observation yield curve of the release which has for measure the value 2.75 of unit mM.
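The properties listed above can be written down as a set of statements and queried with a simple pattern match. The sketch below is purely illustrative: the property names are assumptions rather than the published \(PO^2\) identifiers, and the `match` helper is a hypothetical stand-in for a real query language such as SPARQL.

```python
# Observation1 as (subject, predicate, object) statements.
# Property names are illustrative assumptions, not the published PO2 identifiers.
OBSERVATION1 = [
    ("Observation1", "isObservedDuring", "Chewing"),
    ("Chewing", "isSubStepOf", "In the mouth"),
    ("Observation1", "observes", "L20P28"),
    ("Observation1", "hasParticipant", "Material 1"),
    ("Observation1", "hasParticipant", "Material 14"),
    ("Observation1", "hasScale", "molecular scale"),
    ("Observation1", "hasDate", "10/09/2012"),
    ("Observation1", "implements", "Method 22"),
    ("Observation1", "implements", "Method 23"),
    ("Observation1", "hasObservationResult", "Sodium concentration in the saliva"),
    ("Observation1", "hasComputedResult", "yield curve of the release"),
    ("yield curve of the release", "hasValue", 2.75),
    ("yield curve of the release", "hasUnit", "mM"),
]


def match(triples, s=None, p=None, o=None):
    """Return the statements matching the given pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Which methods does Observation1 implement?
methods = [o for _, _, o in match(OBSERVATION1, s="Observation1", p="implements")]
print(methods)  # ['Method 22', 'Method 23']
```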

The main lesson from our experience with this use case is that building the ontology is an iterative process. The EXCEL files of Figs. 3, 4, 6 and 8 contain well-structured data and knowledge, but the EXCEL file of Fig. 9, which describes the methods with a large amount of free text, is currently unusable for automatic querying. So far, the domain experts have not been able to express their querying needs concerning the different methods they use in the different domains. The lessons they learned while organizing and structuring their data and knowledge according to the concepts of \(PO^2\) give them the understanding needed to refine the specification concerning the methods. This is an ongoing process.

To conclude, we would like to stress that working through the complexity of the knowledge representation task in this use case allowed us to identify a common and shared structured vocabulary that encompasses almost all the domains involved in the Delicious project.

5 Conclusion

In this paper we presented the building of \(PO^2\), a Process and Observation Ontology designed for a cross-domain project concerning the production and transformation of dairy gels. The core component of \(PO^2\) is the result of re-engineering \([MS]^2O\), using BFO and IAO. A use case on the In the Mouth process was presented.

Further work is to express user requirements through competency questions and to prioritize those requirements. The domain component will then be developed and the ontology will be validated against the competency questions.

\(PO^2\) aims to play a key role as the representation layer of the querying and simulation system of the Delicious project. This opens the possibility of comparing different production systems and may also help to develop a decision support system that takes into account the uncertainty of data.

The developed ontology could be further adapted to other types of food products, such as bakery, vegetable or meat products. This may provide the French food industry with tools to develop food products that follow the nutritional recommendations for a healthy population while increasing efficiency and adopting an eco-design approach.