1 Introduction

Knowledge management is an established practice in many industrial areas and public authorities, with the general objective of contributing to the systematic identification, capture, integration and maintenance of knowledge important for the organization [1]. The IT support for knowledge management typically consists of different systems tailored to the organizational needs, such as knowledge management systems, decision support systems or knowledge portals. A feature common to these systems is that they incorporate or build upon a clearly defined terminology, often formalized as a taxonomy, dictionary, semantic net or ontology, which forms the basis for knowledge representation and the definition of rules in the systems. From an organizational perspective, it is important that this terminology is shared by both business and IT stakeholders, because sharing knowledge essentially depends on a common understanding of the terminology used. Knowledge management and a shared terminology thereby contribute to business and IT alignment (BITA). BITA is a concept that cuts through a number of dimensions of enterprises and can be addressed from different perspectives. In general, strategic, structural, social and cultural dimensions of BITA can be identified [2]. Another way to address BITA is to distinguish horizontal and vertical alignment between business and IT. Horizontal BITA applies a product lifecycle perspective, which includes activities such as product development, manufacturing, production design, production planning, production, logistics, time to market and market launch, sales, predictive maintenance and customer involvement. Vertical BITA, on the other hand, includes aspects such as strategy, vision, organization, business models, processes, infrastructures and technologies, all the way down to components, people, algorithms and microservices. Regardless of which dimensions are in focus for a given BITA effort, there is an apparent need for a shared and clear common terminology.

In this paper we therefore focus on a specific aspect of knowledge management and BITA: the development of ontologies representing the shared terminology in an enterprise or in a selected application domain within an enterprise. The paper investigates how the industrial context affects the development strategy in industrial ontology engineering projects.

2 Background

2.1 Ontology Engineering

There are a number of methodologies for ontology development. Ontology 101 proposes a seven-step method to create an ontology using an ontology editing environment, such as Protégé [3]. The method introduces competency questions and ontology reuse as well as practical advice on ontology design. Another well-known ontology development methodology is METHONTOLOGY [4]. It contributes a general framework for ontology development, which describes a series of activities for the development, management and support sub-processes. In addition, METHONTOLOGY proposes an ontology life cycle as a series of evolving prototypes. The On-To-Knowledge methodology is focused on application-driven ontology development [5]. According to this method, engineering and industrial experts should be actively involved in the development of an ontology, in particular during the early stages of ontology engineering. An agile approach to ontology development is reflected in the eXtreme Design (XD) methodology, which emphasises the use of ontology design patterns, customer involvement, task-oriented design, and collaborative and incremental development [6]. These methods focus on a collaborative, incremental, and iterative process of ontology development.

2.2 Business and IT Alignment

As mentioned in the introduction, BITA is a concept that cuts through a number of dimensions of enterprises and can be addressed from different perspectives. An important dimension of BITA is to bridge the gap between organizational context and technology. Several scholars have emphasized the need to capture both organization (business) and technology during the design and implementation of an Information System (IS) (cf. [7]). One way to approach this is to create a common vocabulary that serves as an abstraction of complex enterprises and supports communication, dissemination and reuse of knowledge [8]. Formalized common vocabularies are able to capture different aspects of enterprise practices in terms of procedures, operations, and management, and thus bridge the gap between organizational context and technology to facilitate BITA [9].

A promising way to link organizational context and technology is to view enterprises from an action perspective where actions are performed by humans and artefacts. Socio-instrumental pragmatism [10] incorporates human, organizational, and IS/IT-enabled actions within a single, coherent taxonomy or ontology. This concern of theorizing actions has also been acknowledged by actor-network theory (ANT) [11], where technology and people are both regarded as social actors. As identified by Goldkuhl and Ågerfalk [12], it is therefore necessary to acknowledge both the social in the technical and the technical in the social, a duality that is a main concern within BITA. Research has shown that alignment of business with IT is often addressed as a top concern of IT and business practitioners [2]. De Haes and Van Grembergen [13] discuss various practice-oriented frameworks such as Enterprise Governance of IT (EGIT), COBIT and Val IT. They argue, in particular, that implementation of EGIT impacts the achievement of specific IT goals, which in turn impacts the achievement of business goals.

2.3 Related Work

Previous work of relevance for our research originates from two areas of computer science: experience reports on ontology development and case descriptions reporting on industrial ontology engineering. In the scientific literature, there are only a few publications which systematically investigate experiences and reflect on practices from ontology engineering. Almeida Falbo [14] discusses strong points and weaknesses of the SABiO (Systematic Approach for Building Ontologies) ontology development approach and derives improvement opportunities. Park et al. [15] report on the development of an ontology based on the guidelines provided by METHONTOLOGY, examine the utility of the method and discuss its drawbacks. Mizoguchi [16] focuses on the practice of ontological engineering and presents results and experiences without addressing any specific method. Brusa et al. [17] reflect on experiences from merging different ontology development methods in software engineering and outline best practices. Finally, Hristozova et al. [18] report on lessons learnt during the development of an ontology using the EXPLODE method for value-added publishing. Furthermore, our previous work in [19, 20] reports on experiences from ontology construction in practice; it is substantially extended in this paper by including additional cases and another methodical approach for comparing cases and strategies.

3 Research Approach

Our research was driven by the following research question, which originated from the observation that ontologies in industrial development projects differed considerably in their construction processes:

What are the factors in industrial ontology engineering projects that affect ontology construction?

The research method used for working on this research question is a combination of literature study and multiple case study. Based on the research question defined, we started by identifying research areas with relevant work for this question and analyzed the literature in these areas. The purpose of the analysis was to find existing studies which systematically analyze experiences of ontology engineering and present theories about factors and their origin. Due to the focus on engineering processes, an additional area to investigate is methodologies for ontology construction, with their built-in possibilities to adapt to industrial requirements. A summary of the results is presented in Sect. 2.3.

Since the literature study returned only “candidates” for factors to be investigated rather than proven theories, we decided to perform a multiple case study in order to gather information pertinent for the subject area. Qualitative case study is an approach to research that facilitates exploration of a phenomenon within its context using a variety of data sources. This ensures that the subject under consideration is not explored from only one perspective, but rather from a variety of perspectives which allows for multiple facets of the phenomenon to be revealed and understood. Yin differentiates various kinds of case studies: explanatory, exploratory and descriptive [21]. The case studies presented in Sect. 4 have to be considered as descriptive, as they are used to describe the phenomenon of ontology engineering in the real-life context in which it occurs. Based on the case study results, we conclude that there are certain factors to consider. This argumentative-deductive part of our work is discussed in Sect. 5.

As we aim at investigating factors for industrial ontology engineering, our focus has to be on data sources containing very detailed reports and rich case descriptions. This type of report is quite sparse in the scientific literature on ontology engineering (see Sect. 2.3). Thus, we decided to use case studies of ontology development projects performed in our own research groups. We selected four projects (see next section) where not only the original project documentation was available to us but also the personnel involved in the project could be contacted and interviewed. The projects analysed originated from the School of Engineering at Jönköping University (Sweden), research group Computer Science and Informatics, and from Rostock University (Germany), research group Business Information Systems; in some projects, the two groups worked on the tasks jointly. The analysis of the projects was done in distributed teams using a joint list of aspects to be investigated.

4 Industrial Cases

4.1 OSTAG

The project Ontology-based Software Test Case Generation (OSTAG) was supported by the Knowledge Foundation and aimed at improving the automation of testing activities related to software systems. The technical purpose was to create a method for deriving test case data from an ontology representing the requirement specification and domain for a software system. One of the industrial cases was provided by the participating company, Saab Avionics, and originated from the avionics domain. A software requirements ontology was developed representing the requirements of a software component pertaining to an embedded system located in an airplane [22]. The developed ontology includes three specific pieces of knowledge: (1) a meta model of the software requirements, (2) the domain knowledge of the application, e.g. general knowledge of the hardware and software, electronic communication standards, etc., and (3) all the requirement specifications defined in the SRS documents provided by Saab. The ontology was created with Protégé and was written in OWL (Web Ontology Language). The final version of the ontology contained 43 concepts, 37 object properties, 12 datatype properties, and 206 instances in total. The ontology was utilised to create software test cases by using inference rules, coded in Prolog, that represented the expertise of an expert software tester [23].

During the development, the team of two ontology engineers worked as a pair and followed an iterative and incremental process. There were five iterations, which resulted in four major versions of the requirements ontology. The methodology followed was a combination of the steps in Ontology 101 [3] and the activities in the supporting process in METHONTOLOGY [4]. Lightweight competency questions (CQs) were also used to guide the creation of the ontology. A project workshop with the participating companies was conducted at the beginning of each iteration, which included a presentation of the current results and feedback from the industry experts. Moreover, four meetings with the developers and testers from Saab were arranged to discuss the issues encountered during the acquisition and specification steps. The evaluation of the requirements ontology consisted of its use in the test case generation in every iteration and an evaluation session with the developers and testers from Saab. The feedback was used by the knowledge engineering team to modify the ontology. The HermiT reasoner was used to check the consistency of the ontology.

4.2 CLICK

The second case was the CLICK project financed by the Vinnvård Programme. The focus of CLICK was on supporting networking among researchers in a number of Vinnvård-financed projects spread throughout Sweden. The technical objective of this project was to create an on-line service to facilitate the task of seeking collaborators for joint work on producing scientific artefacts [24]. Each researcher was modelled by a profile that represented the competences of the researcher and comprised major research areas, published papers, engagement in projects, and known co-workers. The search for potential collaborators was based on matching the researcher’s profile against others’ profiles. The resulting competence profiles were represented as sets of linked instances in the competence profile ontology saved in OWL. The ontology was created with TopBraid Composer and contained 52321 concepts, 11 object properties, 14 datatype properties, and 22790 instances in total. Most of the data needed for the creation of competence profiles was pulled from two publicly available data sources, PubMed (the database of citations for biomedical literature) and DiVA (Academic Archive On-line), through schema transformations. The rest of the data was imported from proprietary files.
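The case description does not state how profiles were matched against each other. As a purely illustrative sketch, assume that each profile is reduced to one set of identifiers per facet (research areas, papers, projects, co-workers) and that facet overlap is scored with Jaccard similarity. The names `Profile`, `jaccard`, `similarity` and `rank_collaborators`, as well as the equal weighting of facets, are hypothetical and not taken from the CLICK project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical competence profile: one set of identifiers per facet."""
    name: str
    areas: frozenset = frozenset()
    papers: frozenset = frozenset()
    projects: frozenset = frozenset()
    coworkers: frozenset = frozenset()


FACETS = ("areas", "papers", "projects", "coworkers")


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(p, q):
    """Unweighted mean of the per-facet Jaccard scores (an assumption)."""
    return sum(jaccard(getattr(p, f), getattr(q, f)) for f in FACETS) / len(FACETS)


def rank_collaborators(seeker, candidates):
    """Return (name, score) pairs sorted by descending similarity to the seeker."""
    scored = [(c.name, similarity(seeker, c))
              for c in candidates if c.name != seeker.name]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for illustration only.
alice = Profile("alice", areas=frozenset({"cardiology", "e-health"}),
                projects=frozenset({"p1"}))
bob = Profile("bob", areas=frozenset({"cardiology"}), projects=frozenset({"p1"}))
carol = Profile("carol", areas=frozenset({"oncology"}))

ranking = rank_collaborators(alice, [bob, carol])
```

In this toy example, `bob` shares one of two research areas and the only project with `alice`, so he is ranked ahead of `carol`, who shares nothing. A real service would likely weight facets differently and exploit the concept hierarchy of the ontology rather than exact set overlap.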

The ontology development was carried out as an inherent part of the software development process, which followed an agile approach. Consequently, the ontology development process was a combination of Ontology 101 [3] with elements of XD [6]: an iterative process with close involvement of the main stakeholder. The profile ontology creation started with the reuse of the MeSH (Medical Subject Headings) thesaurus. During the eight project iterations, five prototypes of the on-line service were developed, including five versions of the profile ontology. An ontology engineer and two domain experts worked on the ontology continuously. The representatives of the main stakeholder (the Vinnvård Programme) took part in several workshops, which focused on detailed project updates and in-depth discussions of the prototypes. These workshops contributed to gathering requirements for the ontology and to the elaboration of the ontology design. Moreover, two rounds of interviews with end users were carried out during the user study. The user interviews resulted in the creation of several user stories that contained requirements for the content of the ontology profiles. Between the interview rounds, a workshop with a bigger group of end users was arranged. This user workshop resulted in the identification of important expertise to model in a profile as well as in the prioritization of the tasks in the ontology development plan.

4.3 SEMCO

The third industrial case originates from the automotive industry and the SEMCO project (Semantic modeling of Components in automotive industries) [25]. Automotive manufacturers and suppliers have to manage a large number of product variations and their integration into a specific car model. Many products are designed as product families consisting of configurable components prepared for reuse across different product instantiations. In order to manage and control variety, manufacturers and suppliers increasingly recognize the need to manage project entities like models, documents, metadata, and classification taxonomies in such a manner that the integrative usage of these entities is supported. In SEMCO, the application scenario guiding the development of an ontology was the integration of different kinds of structures reflecting the artefacts and their interrelations. On the one hand, model hierarchies had to be captured, structured and implemented on different modelling levels (system, software, hardware, etc.), which also included model instances (artefacts) to be managed. On the other hand, networks of terms and taxonomies had to be considered as equally important. These networks represented organizational structures, product structures or taxonomies originating from customers closely related to artefacts. Explicit denotation of these relationships proved beneficial for the identification of reuse potential of components or artefacts. The ontology construction was performed at a Swedish automotive supplier of safety components implemented as software-intensive systems. The development process applied was an enhanced version of the METHONTOLOGY process [4]. The most important knowledge sources were (1) a description of the supplier's internal software development process, with defined procedures for all major aspects of software development and software project management, (2) documentation of two example cases for requirement handling, including original customer requirements as well as system and functional requirements, and (3) interviews and working sessions with members of the software development department, including the project manager, software developers and engineers. The resulting ontology consisted of 379 concepts with an average depth of inheritance of 3.5 [25].

4.4 SEMA

Object recognition in videos is the topic of a research project funded by the State Ministry for Economics of Mecklenburg-Western Pomerania. Partners in this project are Future TV, an innovative company from Rostock (Germany) in the field of media and entertainment, the Fraunhofer Institute IGD in Rostock and the Chair of Business Information Systems. The core intention of the project is to develop fully automatic tagging of arbitrary video films, which provides information about the objects visible in the video in different scenes and at different points in time. This information about the video content could be used to select the most relevant video for the interests of a consumer, or to decide which commercials would best fit the video content and the user watching the video, to give just two examples. From a technical perspective, the project aims at combining different techniques from artificial intelligence to improve the precision of the object recognition and at the same time reduce the computational workload. The sub-project of Rostock University is called “Semantic Assistance for Object Recognition in Videos (SEMA)” and has a planned runtime of 24 months starting in April 2018.

The requirement for the ontology in SEMA is that two quite diverse areas have to be covered and integrated. One area is the content of the videos, i.e. the application domains from which the video content is taken have to be reflected. For the initial phase of the project this includes economics, home décor, selected parts of men’s fashion, sports and cars. Later on, the content might be extended, which requires an ontology design prepared for change and extension. The other area is the recognition of situations in videos in the context of demand profiles of individual users or of companies targeting customers in this context. Although this area is to some extent dependent on the content of the videos (first area), it also includes its own structures and knowledge, for example those needed for structuring demands, targets and situations.

Development of the ontology is done in a mixed group of knowledge engineers from research, marketing people and domain experts from the company, and software engineers from the company who aim at integrating the ontology into the existing content management system. The development approach used is to first build an application ontology overarching both areas mentioned above and afterwards identify existing ontologies or conceptual models for the content areas under consideration, re-engineer and integrate them. To some extent, this approach is inspired by the methodology proposed in [3] and the ontology pattern based work in [26].

5 Comparison of the Cases

The comparison of the cases discussed in the previous section aims at tackling the research question formulated in Sect. 3. Some input to answering this question is already available from previous research on ontology development methodologies (Sect. 2.1) which identifies factors such as:

  • The possibility to reuse existing ontologies and how this changes the construction process,

  • The representation selected for the ontology and how this might change the way of coding it, or

  • The required level of formality which affects the time required for the development and potential application level of an ontology.

For the above aspects, our work aims at evaluating whether they can be confirmed in our industrial cases. What, from our observation, is not sufficiently addressed in existing research (see Sect. 2.3) are the effects of stakeholder involvement, ontology update cycles and technical constraints on the ontology development process. Ontology update cycle in this context refers to the expected need for changing or adapting the ontology to new usage scenarios or to changing domain knowledge. Furthermore, we designed our study based on the assumption that the effects on ontology engineering projects manifest not only in the development process but also in other aspects of the project organisation. More concretely, we consider:

  • The roles established in the engineering team (e.g. knowledge engineer, tool expert, knowledge analyst),

  • The stakeholders involved from the organisation under consideration (e.g. domain experts, project manager or process owner),

  • The steps and activities of the development process,

  • The artefacts created during the development process.

With the three factors mentioned in the literature and the four organizational aspects identified above, we have seven aspects in total to consider for each case. Table 1 compares the cases regarding these seven aspects.

Table 1. Comparison of the cases with respect to the factors

The observations made in our industrial cases confirmed that the possibility to reuse either ontologies or related conceptual models changes the construction process substantially. In SEMA, the project with the most intense reuse, a specific sub-procedure was designed and performed for managing reuse. This sub-procedure consisted of the essential activities of identifying potentially reusable models, checking their suitability in depth (from a content and a technical perspective), selecting what exactly to integrate, integrating it into the ontology and performing a quality check. This kind of sub-process was not required in the other projects. In CLICK, the reuse was determined by the domain and required only the integration activity.

The level of formality is mentioned in the literature as a factor influencing the construction process. The formality level also affects the potential applications of an ontology. However, the literature does not clearly define how to “measure” or determine the level of applicability in a “standardized” way. In our case comparison, we therefore applied Allen Newell’s idea of knowledge levels [27]. Newell’s view is that data structures (symbols) can be used to represent knowledge in a knowledge base, but those symbols cannot generate intelligent behaviour unless some process is applied to them. This means we have to distinguish the symbols in a knowledge base (knowledge representation) from the knowledge (capacity for rational behaviour) that the symbols can be used to generate. We cannot share knowledge bases if we do not also share the inference engines (or mental processes) that bring our knowledge bases to life. The level of applicability of ontologies can be differentiated according to the knowledge levels reflected in the knowledge bases. The simplest levels are a known representation format only (level 1) and, additionally, a shared vocabulary (level 2). In an ontology, we usually reach the next level (3), a shared conceptual model. If inference rules are properly and completely defined, the “process reuse” level (4) is reached, i.e. the process to be performed by an interpreter when using the shared conceptual model is defined. The most complex level (5) is the problem-solving level, which requires completeness of the knowledge base and inference rules with respect to the problem and solution space. The highest applicability level is observed in the OSTAG case due to the complexity of the task to be solved, namely test case generation.
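To make the five levels easier to compare across projects, they can be written down as an ordered enumeration. The following is a minimal sketch, assuming that the levels are strictly cumulative (each level presupposes all lower ones); the names `KnowledgeLevel` and `applicability_level` and the four yes/no criteria are hypothetical simplifications of the descriptions above, not part of the original analysis.

```python
from enum import IntEnum


class KnowledgeLevel(IntEnum):
    """The five levels described in the text, ordered by applicability."""
    REPRESENTATION_FORMAT = 1    # only a known representation format
    SHARED_VOCABULARY = 2        # additionally a shared vocabulary
    SHARED_CONCEPTUAL_MODEL = 3  # usually reached by an ontology
    PROCESS_REUSE = 4            # inference rules properly and completely defined
    PROBLEM_SOLVING = 5          # knowledge base and rules cover problem and solution space


def applicability_level(has_vocabulary, has_conceptual_model,
                        has_complete_rules, covers_problem_space):
    """Return the highest level whose prerequisites are all met.

    Assumption of this sketch: a level counts only if every lower level
    is satisfied as well, so evaluation stops at the first unmet criterion.
    """
    level = KnowledgeLevel.REPRESENTATION_FORMAT
    checks = [
        (has_vocabulary, KnowledgeLevel.SHARED_VOCABULARY),
        (has_conceptual_model, KnowledgeLevel.SHARED_CONCEPTUAL_MODEL),
        (has_complete_rules, KnowledgeLevel.PROCESS_REUSE),
        (covers_problem_space, KnowledgeLevel.PROBLEM_SOLVING),
    ]
    for satisfied, reached in checks:
        if not satisfied:
            break
        level = reached
    return level


# An ontology with vocabulary and conceptual model but without complete rules.
plain_ontology = applicability_level(True, True, False, False)
```

Because `IntEnum` members compare like integers, a check such as `plain_ontology > KnowledgeLevel.SHARED_CONCEPTUAL_MODEL` can be used directly to decide whether rule and inference expertise is needed in the team.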

Roles in the engineering team, stakeholder groups involved in the construction process and tasks/activities to perform seem to be mutually dependent. Whenever domain experts without an engineering background or end users have to be involved, there seems to be a need for a business analyst in the engineering team in addition to knowledge engineers. If there is a high formality requirement, senior experts in the representation used are required. In reuse situations, the role of a reuse engineer should be established. Moreover, in the CLICK project the intensive involvement of end users was necessary because the ontology requirements were not specified by the customer. Instead, the requirements were gradually refined through participative design during several iterations in line with the XD methodology, which differs from the other projects.

Regarding the representation of the ontology, there was little variation. Nevertheless, the choice of tools to use in the project needs to be taken into account by the project management. Finally, the artefacts to be delivered affected the ontology development in one case. The creation of competence profiles in CLICK required extensive population of the ontology, which was carried out through data integration from publicly available sources.

6 Conclusions

This paper presented the results of a literature study and a multiple case study aimed at determining the factors that affect ontology development strategy in the context of industrial projects. Seven factors were identified and subsequently used to compare the four industrial cases. The most influential factors are the reuse of ontologies/models, the stakeholder groups involved, and the level of applicability of the ontology. The results can be utilised to develop formalized common vocabularies capturing both organization (business) and technology to facilitate BITA. Our recommendations derived from the analysis of the case comparison are as follows.

  • Recommendation 1: In projects with substantial reuse of existing conceptual models or ontologies, establish the role of a reuse engineer and explicitly define a reuse sub-procedure to be followed.

  • Recommendation 2: In projects with many non-expert modellers, such as domain experts or end users, integrate a person/role with experience in explicating implicit knowledge or in participatory work. This is essential when ontology requirements are to be gathered in a series of project iterations.

  • Recommendation 3: Use Allen Newell’s proposal of knowledge levels to define the level of applicability of the required result. If the applicability level exceeds the shared conceptual model level, integrate a knowledge engineer specialising in rule engines and inference mechanisms into the team.

  • Recommendation 4: Whenever the creation of artefacts implies extensive population of an ontology, investigate early in the project which data sources are available and how they can be integrated in an automated manner.
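The four recommendations have the form of condition-action rules and can be read as a simple planning checklist. The sketch below encodes them that way; it is only an illustration under the assumption that a project can be characterised by a few yes/no properties and a numeric applicability level from 1 to 5. The names `ProjectContext` and `suggested_measures` are hypothetical.

```python
from dataclasses import dataclass

# Level 3 corresponds to the shared conceptual model level discussed in Sect. 5.
SHARED_CONCEPTUAL_MODEL_LEVEL = 3


@dataclass
class ProjectContext:
    """Hypothetical characterisation of an ontology engineering project."""
    substantial_reuse: bool = False
    many_non_expert_modellers: bool = False
    applicability_level: int = SHARED_CONCEPTUAL_MODEL_LEVEL  # 1..5
    extensive_population: bool = False


def suggested_measures(ctx):
    """Map project characteristics to Recommendations 1-4, in that order."""
    measures = []
    if ctx.substantial_reuse:
        measures.append("establish a reuse engineer role and define a reuse sub-procedure")
    if ctx.many_non_expert_modellers:
        measures.append("add a role experienced in explicating implicit knowledge "
                        "or in participatory work")
    if ctx.applicability_level > SHARED_CONCEPTUAL_MODEL_LEVEL:
        measures.append("add a knowledge engineer specialising in rule engines "
                        "and inference mechanisms")
    if ctx.extensive_population:
        measures.append("investigate available data sources and automated "
                        "integration early in the project")
    return measures


# Invented example: heavy reuse and rule-based inference on top of the ontology.
example = ProjectContext(substantial_reuse=True, applicability_level=4)
measures = suggested_measures(example)
```

For the invented `example` context, only the first and third rule fire. Such a checklist cannot replace project-specific judgement; it merely makes the conditions attached to each recommendation explicit.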