1 Introduction

Augmented reality (AR) plays an important role in the ongoing convergence of the physical and the digital world [28]. At its core, augmented reality enhances the user’s perception by superimposing visual information such as images, videos, or three-dimensional (3D) visualizations onto real-world environments in real time [2, 39]. It uses computer vision techniques to align objects in the virtual and physical worlds and displays the virtual information using see-through displays or screens, e.g., on smartphones or head-mounted displays [32]. AR relies on markers or detectors of real-world objects to determine their location and orientation in three-dimensional space and to accurately map visual information onto them. For realizing complex AR workflows in practical work scenarios, additional concepts become necessary, such as the integration of external data sources in combination with triggers, conditions, and actions to process this data.

Recent technological advances have made augmented reality affordable through its availability on standard smartphones and tablets [38]. In addition, the open W3C WebXR Device API is being developed as a future web standard for accessing AR devices on the web across a wide variety of hardware form factors [18]. In terms of industrial applications, market research by Gartner [27] and PwC [7] indicates that AR is a highly promising technology that allows for broad usage in industrial scenarios such as maintenance tasks or training [15].

Creating augmented reality applications today requires advanced programming skills, e.g., for platforms and APIs such as Vuforia, ARKit, Google ARCore, or MRTK. To ease the creation of AR applications, several proposals have been made in model-driven engineering (MDE) and conceptual modeling. These include, for example, XML and JSON schemas for describing AR scenes in generic, platform-independent formats [21, 30] or with a focus on learning experiences [37]; domain-specific languages for creating AR model editors using Vuforia, ARKit, or MRTK [6, 29, 33]; and a BPMN extension for representing process information in AR using the Unity platform [15]. In addition, commercial low-code and no-code tools such as UniteAR or Adobe Aero aim to empower non-technical users to create AR applications. However, these tools are mostly designed for creating a single AR scene or very simple workflows.

What is missing so far is a visual modeling approach that can represent complex AR workflows for diverse application scenarios, that can be easily adapted to new requirements, and that is based on open standards. To facilitate the creation of AR applications that take advantage of the accessibility, portability, interoperability, and openness of the web, we propose a domain-specific modeling language (DSML) based on models conforming to the W3C WebXR Device API recommendation, thereby enabling the definition of different scenarios such as assembly processes, maintenance tasks, or learning experiences. The development of the language follows guidelines for DSML development proposed by Frank [13]. The DSML has been implemented on the ADOxx metamodeling platform and applied to a furniture assembly use case [11]. For a first evaluation, we conduct a feature comparison with similar languages in the area of augmented reality [34].

The remainder of the paper is organized as follows. Section 2 describes fundamental concepts in AR and the most important development platforms to establish a common understanding. In Sect. 3, we analyze previous related work in MDE and conceptual modeling in the context of AR. From these insights, we derive generic and specific requirements for a domain-specific visual modeling language for AR applications and present its specification and implementation in Sect. 4. This is followed by a use case in Sect. 5. In Sect. 6, we evaluate the language through a feature comparison. Finally, in Sect. 7, we conclude the paper and point to future work.

2 Foundations

As augmented reality relies on a range of specific techniques from computer vision to achieve the intended user experience, we briefly explain the most important concepts in the following to ensure a common understanding.

2.1 Augmented Reality

Augmented reality is a technology that allows computer-generated virtual images to be embedded in the real environment [39], thereby creating a three-dimensional alignment between virtual and real objects that allows for interaction in real-time [2].

Augmented reality relies on three core concepts from the field of computer vision [32]: (1) Detectables/Trackables, (2) Coordinate Mappings, and (3) Augmentations. First, for determining the location and orientation of the real-world environment, computer vision algorithms estimate position and orientation based on two-dimensional (2D) or 3D sensor information, e.g., from a camera stream or a LiDAR scanner [9, 31]. This detection can rely on detectables in the form of natural features or on markers such as QR codes, which act as surrogates that simplify detection and tracking [32]. Coordinate mappings are then needed to align objects in the real and the virtual world with each other. Thereby, a real-world origin reference position, e.g., stemming from global positioning system (GPS) coordinates, must be mapped to the global coordinate system of the virtual environment. Further, local coordinate systems are used for every real-world or virtual object. These permit the definition of reference points for placing virtual objects relative to other objects, independent of the current global coordinates. Finally, virtual information is superimposed on the real world through so-called augmentations. These can be animations, 2D images, videos, audio, text labels, 3D objects, hyperlinks, checklists, or forms. By defining anchors, augmentations can be fixed at a particular position in real space.
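To make the coordinate-mapping idea concrete, the following minimal sketch uses three.js (the rendering library employed later in Sect. 4.4); the pose values and object names are illustrative assumptions, not part of our language:

```js
import * as THREE from 'three';

// A group acting as the real-world origin of the virtual scene. Its pose
// would be set by the tracking layer, e.g., from a detected marker.
const scene = new THREE.Scene();
const worldOrigin = new THREE.Group();
scene.add(worldOrigin);

// Placeholder pose estimated for the origin (meters, camera space).
worldOrigin.position.set(0.4, 0, -1.2);
worldOrigin.quaternion.set(0, 0, 0, 1);

// The augmentation is expressed in the origin's local coordinate system,
// so it keeps its position relative to the real object as the user moves.
const label = new THREE.Mesh(
  new THREE.PlaneGeometry(0.2, 0.1),
  new THREE.MeshBasicMaterial({ color: 0xffcc00 })
);
label.position.set(0, 0.15, 0); // anchored 15 cm above the origin
worldOrigin.add(label);
```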

For more complex AR scenarios, further concepts are necessary. This includes in particular the integration and processing of additional data that is acquired throughout the life-cycle of an AR scenario via sensors or user interactions. To enable dynamic changes in the AR environment, at least basic workflow concepts such as triggers, conditions, and actions need to be provided [37]. Triggers include click, detection, sensor, or timer events; voice commands; entry into or exit from defined spatial areas; and gestures. Conditions specify the branching into different process flows, and actions refer to any change applied to the virtual objects, such as the appearance and disappearance of objects or transformations, i.e., rotation, scaling, and positioning.
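As a minimal illustration of these workflow concepts (independent of the modeling language defined later), a single step could be encoded as a trigger-condition-action triple; all identifiers below are hypothetical:

```js
// One workflow step as a trigger-condition-action triple (illustrative).
const step = {
  trigger: { type: 'detection', target: 'marker-7' }, // could also be a click, timer, gesture, ...
  condition: (state) => state.currentTask === 'attach-leg-1',
  actions: [
    { type: 'show', target: 'leg-overlay' },
    { type: 'transform', target: 'leg-overlay', rotation: [0, 90, 0] }
  ]
};

// A runtime would evaluate the step on every incoming event.
function onEvent(event, state, apply) {
  const { trigger, condition, actions } = step;
  if (event.type === trigger.type &&
      event.target === trigger.target &&
      condition(state)) {
    actions.forEach(apply);
  }
}
```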

2.2 Implementation Platforms

For creating AR applications, several development platforms and software development kits (SDKs) are available. Most of them require significant programming skills and are either commercial or closed-source. Examples include the Unity runtime and development environment, Apple’s ARKit, Wikitude, Vuforia, Kudan, Unreal Engine, and Adobe Aero. In addition, open-source platforms and SDKs are available, such as Google ARCore, ARToolKit+, OpenXR, or Holokit.

An alternative to the above platforms and SDKs is the WebXR Device API [18]. It specifies a web Application Programming Interface (API) that provides browser-based access to handheld or head-mounted augmented reality and virtual reality devices, including sensors. This allows AR content to be rendered by any compatible WebXR-enabled browser without the need to install additional software or use SDKs. As of today, WebXR is supported, for example, by Chromium-based browsers on the Android operating system, including handheld smartphones and tablets, as well as head-mounted displays, e.g., the Microsoft HoloLens 2. Further, WebXR is already included in the WebKit engine used by iOS Safari and will be supported by the Apple Vision Pro. WebXR does not by itself simplify the technical development of applications, but applications developed with it are more accessible.
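The following sketch shows how such a browser-based AR session is requested via the WebXR Device API; the feature list and the rendering details are simplified assumptions:

```js
// Minimal WebXR bootstrap: request an AR session and a per-frame pose.
async function startAR(canvas) {
  if (!navigator.xr || !(await navigator.xr.isSessionSupported('immersive-ar'))) {
    throw new Error('immersive-ar is not supported by this browser/device');
  }
  const session = await navigator.xr.requestSession('immersive-ar', {
    requiredFeatures: ['local'] // local reference space for world tracking
  });
  const gl = canvas.getContext('webgl', { xrCompatible: true });
  session.updateRenderState({ baseLayer: new XRWebGLLayer(session, gl) });

  const refSpace = await session.requestReferenceSpace('local');
  session.requestAnimationFrame(function onFrame(time, frame) {
    const pose = frame.getViewerPose(refSpace);
    if (pose) {
      // Render augmentations for each view in pose.views here.
    }
    session.requestAnimationFrame(onFrame);
  });
}
```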

3 Related Work

Several approaches have explored the application of conceptual modeling and model-driven engineering for augmented reality applications. In a comprehensive literature analysis, we previously identified 201 relevant papers at the intersection of conceptual modeling and virtual reality/augmented reality and derived the major research streams in these areas [26]. From the results of this study, we selected the most important contributions in the area of model-driven engineering and conceptual modeling for AR which are related to our approach. These will be briefly characterized in the following.

Ruminski and Walczak [30] describe a text-based declarative language for modeling dynamic, contextual augmented reality environments called CARL. They claim that CARL can simplify the creation of AR experiences by allowing developers to create reusable, modular components. Their development approach is based on textual modeling and does not include a visual representation.

Wild et al. [37] focus on data exchange formats for AR experiences in manufacturing workplaces. They propose two textual modeling languages for the definition of learning activities (activityML) and the definition of workplaces (workplaceML). Based on this work, a new IEEE standard for Augmented Reality Learning Experience Models has been developed [36], which includes a reference implementation. It enables the direct definition of learning workflows within an AR context. However, the textual models for these workflows are stored only at runtime, precluding their definition outside the tool.

A similar approach has been developed by Lechner [21]. He proposes the XML-based Augmented Reality Markup Language (ARML 2.0) for describing virtual objects, their appearance, and anchors in an AR scene in relation to the real world. ARML 2.0 has been included in a standard issued by the Open Geospatial Consortium in the form of an XML grammar.

Ruiz-Rube et al. [29] proposed a model-driven development approach for creating AR-based model editors, aiming at more efficient means of creating and editing conceptual models in AR; the generated applications thus target modeling itself. They demonstrate their approach with a tool called ARE4DSL. However, it only allows for the definition of AR-based modeling applications, not of other types of AR applications.

Seiger et al. [33] presented Holoflows, a modeling approach for creating Internet of Things (IoT) processes in augmented reality environments. The approach includes an interface allowing non-experts to design IoT processes without process or modeling knowledge. The approach is specific to the IoT domain and modeling is only possible within the provided AR application.

Grambow et al. [15] introduced an approach called BPMN-CARX, a solution that integrates context awareness, visual AR support, and BPMN-based modeling of Industrial Internet of Things (IIoT) processes. The approach allows business process management software to be extended with AR and IIoT capabilities. Further, it supports the modeling of context-aware and AR-enabled business processes. BPMN-CARX extends BPMN with new elements, including a graphical notation. The approach is specific to business process modeling and does not seem applicable to other scenarios.

Campos-Lopez et al. [6] and Brunschwig et al. [5] proposed an automated approach for constructing AR-based interfaces for information systems using model-driven and software language engineering principles, without the need for coding knowledge. In their approach, the interface is automatically generated from a high-level domain metamodel of the system and includes AR features such as augmentations, a mechanism for anchors based on real-world position, and the recognition of barcodes and quick response (QR) codes. Additionally, it is possible to define API calls to be performed upon certain user interactions, e.g., the creation of objects. The approach is mainly designed for modeling systems that use AR; however, there is no possibility to define states or executable workflows. They demonstrate the feasibility of their approach through a prototypical iOS app called AlteR that is based on Apple’s ARKit.

In summary, approaches exist for (1) generating specific AR applications based on models and schemata, (2) generating AR-based modeling tools based on MDE, and (3) AR modeling applications based on conceptual modeling languages. However, to the best of our knowledge, no visual modeling approach is available so far that represents executable AR workflows for diverse application scenarios and is based on open AR standards. In the next section, we therefore define the requirements for such a modeling language, describe its implementation, and present an exemplary use case.

4 Derivation of the Visual Modeling Language

Domain-specific languages in general provide constructs that are tailored to a specific field of application with the goal of gaining expressiveness and ease of use to increase productivity [22]. In the area of model-driven software development, typically languages with a visual notation are proposed, which we will denote in the following as domain-specific visual modeling languages, cf. [13, 19]. A related trend in today’s industrial software development is the rise of low-code and no-code approaches, which aim at empowering users to develop software with little or no programming expertise [3, 8]. We will thus derive a domain-specific visual modeling language for creating augmented reality applications.

4.1 Methodology

Several guidelines and methodologies have been proposed for the development of domain-specific languages, cf. [13, 17, 20, 35]. We mainly follow the macro process proposed by Frank [13], who describes seven phases including details for each phase (see Fig. 1). For the language specification and the creation of the modeling tool, we further considered the methodology by Visic et al. [35], which focuses on the interplay between a modeling language and algorithms and on the deployment of the modeling tool.

Fig. 1. Seven phases for domain-specific language development [13, p. 8]

In terms of scope and purpose, we aim for a language that permits users with no programming expertise to create augmented reality applications that include complex workflows and run in a web browser without further plugins or software components on a broad range of devices.

4.2 Requirements

Frank distinguishes between generic and specific requirements that need to be analyzed prior to the language specification [13]. As Gulden and Yu point out, these requirements have to be carefully balanced to account for trade-offs between different design alternatives [16], especially in terms of simplicity, comprehensibility, and convenience of use of the language [13].

Thus, we defined the following seven generic requirements (GR\(_{1-7}\)) for our language as proposed by Frank [13] and, in a similar fashion, by Karsai et al. [20] as well as Jannaber et al. [17]: GR\(_1\): The language should allow the specification of AR applications of various types without programming skills, making AR application development more intuitive and user-friendly than traditional approaches. GR\(_2\): The modeling language shall use concepts that a potential user is familiar with, i.e., concepts that are either common in everyday life or related to AR environments. GR\(_3\): The modeling language shall contain special constructs that are tailored to the domain of augmented reality. These terms need to be understood in the same way in all situations and by all users. GR\(_4\): The constructs of the language should allow modeling at a level of detail sufficient for all foreseeable AR applications. GR\(_5\): The language shall provide different levels of abstraction to avoid overloading and thus compromising the proper interpretation of a model. GR\(_6\): There shall be a clear association between the language constructs and the constructs of the relevant target representations in the AR application. GR\(_7\): Finally, Frank describes the requirement of choosing an appropriate metamodeling language that is consistent with the generic requirements described above; we will consider this later for the language specification.

Further, we added twelve specific requirements SR\(_{1-12}\) that originate from: (a) our analysis of the domain of augmented reality in the form of fundamental concepts and existing software platforms and approaches – see Sect. 2, (b) previously identified academic approaches in the area of model-driven engineering for AR [26], and (c) requirements concerning the implementation of the language in terms of satisfying the purpose of platform-independent execution using WebXR [18]. The specific requirements have been further grouped into three categories: Domain, Abstraction, and Implementation.

The category Domain refers to specific requirements that emerge from the domain of augmented reality applications. SR\(_1\): Superimposing virtual objects on the real world (Augmentation) is the main functionality of augmented reality applications [6, 15, 21, 29, 30, 33, 37]. The domain-specific modeling language must allow the user to represent virtual augmentations in various forms such as images, text labels, animations, or 3D objects. SR\(_2\): To create a realistic AR experience, the digital augmentations superimposed on the physical world must align with the real world [6, 29, 37]. A virtual augmentation placed on a real object should remain in its original position relative to the real object, even as the user moves around. Therefore, the modeling language must provide a concept for creating a local real-world origin that serves as a reference point at application runtime (World Origin Reference). SR\(_3\): It must be possible to specify the location of virtual augmentations in relation to other objects or the world origin in real or virtual space during model specification (Reference Point) [6, 21, 37]. SR\(_4\): It must be possible to specify real-world objects that can be tracked during application runtime (Detectable/Trackable) [6, 15, 21, 29, 30, 37]. Therefore, a concept is required to create such detectable objects during modeling. These detectables should not only specify the existence of a real-world object, but also provide data to recognize these objects at runtime, for example using images or 3D object data. SR\(_5\): Specifying the modification of different objects based on different actions is a critical functionality of AR applications [21, 29, 30, 33, 37]. Thus, the modeling language should permit defining transitions to subsequent actions and directly manipulating and transforming augmentations. SR\(_6\): For realizing complex AR workflows [15, 33, 37], triggers and conditions are required to enable dynamic branching in AR applications [6, 15, 29, 30, 33, 37].

The category Abstraction refers to a general aspect for creating an AR modeling language and contains only one specific requirement, which details the generic requirement of different abstraction levels (GR\(_5\)). SR\(_7\): To reduce complexity and to separate the different roles required during the specification of AR scenarios, the modeling language shall include concepts for abstraction, e.g., model decomposition, and separation of concerns to allow task sharing among stakeholders with different responsibilities [21, 29, 30, 37]. For example, a designer could work on visualizing augmentations, while a domain expert could specify the application workflow.

The final category, Implementation, considers the requirements that must be supported in terms of language specification and implementation. SR\(_8\): Due to the nature of modeling languages, an abstract and a concrete syntax in textual notation need to be provided [13, 20], also for easing future interoperability with previous approaches [6, 15, 21, 29, 30, 33, 37]. In addition, as visual notations are more intuitive and user-friendly than text-based notations, a two-dimensional graphical notation needs to be specified [15]. Finally, since the AR domain relies largely on 3D content, specifying models directly in a 3D environment is useful to facilitate spatial imagination [6, 33, 37]. Thus, a domain-specific modeling language should consider concepts for text-based, 2D visual, and 3D spatial modeling. SR\(_9\): To allow for an easy and rapid adaptation of the language as requirements change, the modeling language shall be based on metamodeling [6, 13, 29]. SR\(_{10}\): It should be possible to directly feed the model into an AR application for the execution of the modeled AR scenario [15, 29, 30, 37]. Thus, a domain-specific modeling language for AR applications shall provide a data format that can be processed by an AR engine at runtime [10] or generate code for creating the AR application itself from the models [15]. SR\(_{11}\): AR applications are often built using commercial SDKs such as Apple ARKit, Wikitude, or Vuforia, most of which depend on the closed-source Unity development platform. To make the modeling language widely applicable on a large range of devices and to enable non-commercial long-term research, both the modeling language (specification) and the code generated from it (execution) shall be based on open standards, such as the WebXR Device API [18]. SR\(_{12}\): To ensure reproducibility and accessibility, the implementation of the domain-specific modeling language shall be made openly available [29, 33, 37].

4.3 Language Specification

According to Frank, the phase of language specification contains several parts [13]. The first step is to create a glossary containing all the concepts that are considered relevant to the domain of discourse. These terms were derived from the requirements shown above, e.g., augmentation, detectable, or condition. Next, for each concept in the glossary, it has to be decided whether it shall be part of the modeling language and how it will be expressed with the language during instantiation. Further, it needs to be decided which metamodeling language or meta\(^2\) model shall be used. Subsequent to the language specification, Frank foresees a separate phase for the design of the graphical notation. In the following, we first present an overview of the language concepts and the abstract syntax in the form of a metamodel; thereafter, we show the graphical notation and details on the semantics of the constructs.

Fig. 2. Metamodel of the DSML for augmented reality applications with the three modeltypes ObjectSpace, Statechange, and FlowScene, as well as a legend.

For the definition of the modeling language, we used the metamodeling language of ADOxx [11]. ADOxx was chosen due to its wide usage within projects of the OMiLAB network [14] and the availability of an open platform for the implementation of model editors. The main metamodeling concepts in ADOxx are [11, 12]: ModelType, Class, Relationclass, and Attribute. Modeltypes contain one or more classes, which may be connected by relationclasses. Modeltypes, classes, and relationclasses may have attributes. Instances of classes and relationclasses can only be contained in one particular instance of a modeltype. Special reference attributes act as pointers to other class instances or model instances. In the metamodel introduced in the following, each concept is marked with an icon indicating the corresponding meta\(^2\)-concept.

Figure 2 shows the metamodel of the new domain-specific modeling language. The modeling language is divided into three separate ModelTypes: ObjectSpace, Statechange, and FlowScene. This results from requirements GR\(_2\), GR\(_5\), and SR\(_7\). An ObjectSpace defines the real world of an AR environment. It contains the two classes Augmentation and Detectable as defined by requirements SR\(_1\) and SR\(_4\). Further, augmentations can include other augmentations, indicated by the child relationclass, and they may be connected to Detectables via anchored relations (SR\(_3\)). A Detectable has an attribute is_origin, specifying whether it references the world origin (SR\(_2\)).

Statechanges are described in the separate ModelType Statechange (SR\(_5\), SR\(_7\)). Within such models, Augmentations from the ObjectSpace model are referenced (Reference), and changes to their attributes, e.g., a rotation transformation, are expressed via the attribute statechange_list.

The FlowScene ModelType defines the workflow of the AR application and how it reacts to different environmental conditions (GR\(_4\), SR\(_6\)). Every FlowScene contains exactly one Start and one End instance (SR\(_6\)). Each FlowScene contains an ObjectSpace instance, which references an instance of the ObjectSpace ModelType. Inside this ObjectSpace class instance, the FlowScene model defines an Origin, one or multiple Statechanges, Conditions, and Resolves (SR\(_2\), SR\(_6\)). They are linked to the ObjectSpace with the is_inside relationclass, specifying that these concepts belong to one specific ObjectSpace. The Origin is used to define the world origin of the AR environment; thus, it references a Detectable in the ObjectSpace model. Conditions define requirements that are necessary to trigger the subsequent Statechanges, or to trigger Resolves if there are no consecutive Statechanges (SR\(_6\)). Thus, Statechanges and Resolves are connected to Conditions by the triggers relationclass. Conditions, in turn, follow an Origin or Statechange via the has_condition relationclass. Furthermore, Conditions can be associated with an Observer using the has_observer relationclass. Observers can be used to monitor sensor data or APIs (SR\(_6\)).
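To make the abstract syntax more tangible, the following sketch expresses instances of the three ModelTypes as plain JavaScript object structures; the property names are our own illustrative mapping, not the ADOxx-internal representation:

```js
// Illustrative instances of the three ModelTypes and their relations.
const objectSpace = {
  detectables: [{ id: 'marker-1', is_origin: true, data: 'marker-1.png' }],
  augmentations: [
    { id: 'plate', source: 'plate.gltf',
      anchored: 'marker-1',          // anchored relation to a Detectable
      children: ['plate-label'] },   // child relation to another Augmentation
    { id: 'plate-label', source: 'label.png' }
  ]
};

const initPlate = {                  // a Statechange model
  references: ['plate'],             // Reference into the ObjectSpace model
  statechange_list: [{ target: 'plate', rotation: [0, 90, 0], visible: true }]
};

const flowScene = {
  objectSpace: objectSpace,          // referenced ObjectSpace instance
  origin: { detectable: 'marker-1' },
  statechanges: { 'init-plate': initPlate },
  conditions: [{
    follows: 'origin',               // has_condition relation
    observer: null,                  // optional has_observer relation
    triggers: ['init-plate']         // triggers relation
  }],
  start: 'start', end: 'end'
};
```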

Table 1. Semantics and notation of the modeling language. For each ModelType, the semantic definition of the contained constructs is explained and the visual notation is shown.

For each of the classes and relationclasses, we added a graphical notation and details about the meaning of each construct in the form of a semantic definition, as shown in Table 1. Thereby, we considered principles of graphical notation design by Moody as far as possible [23]. In particular, we aimed for Semiotic Clarity, Perceptual Discriminability, Semantic Transparency, Complexity Management, Cognitive Integration, Visual Expressiveness, Dual Coding, Graphic Economy, and Cognitive Fit. The further development of the graphical notation, including more advanced methods such as those recently described by Bork and Roelens [4], is planned for the future.

4.4 Implementation and Execution

Subsequently, the modeling language has been implemented using the freely available and open ADOxx metamodeling platform and will be made available via Zenodo [25]. The platform allows the easy definition and adaptation of metamodels based on the ADOxx meta\(^2\) model and the creation of model instances in automatically generated model editors (SR\(_9\)). ADOxx provides several text-based formats for defining metamodels and models, as well as a DSL for the graphical notation (SR\(_8\)). In this way, models can be exported manually or programmatically in XML format for processing in other applications.

The ADOxx XML interface has been chosen as the basis for enabling the execution of the modeling language (SR\(_{10}\)). For this purpose, a software component has been designed in the form of an AR engine that interprets the models. The engine is implemented as a platform-independent web application using the 3D JavaScript library three.js and the VR/AR immersive web standard WebXR [18]. The application can be accessed through a WebXR-compatible web browser on any mobile device, such as smartphones or head-mounted displays, in line with requirement SR\(_{11}\). To start an AR experience, the engine processes the models selected by the user and monitors the user’s environment for potentially relevant changes. Based on these environmental changes and user interactions, the application adapts the environment according to the workflows specified through triggers, conditions, and actions (SR\(_6\)).
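As an illustration of the model intake, the sketch below fetches an exported model file and indexes its instances in the browser; the element and attribute names are assumptions chosen for illustration and do not reproduce the actual ADOxx export schema:

```js
// Parse an exported model file and index its instances (illustrative).
async function loadModel(url) {
  const xmlText = await (await fetch(url)).text();
  const doc = new DOMParser().parseFromString(xmlText, 'application/xml');
  const instances = {};
  for (const el of doc.querySelectorAll('INSTANCE')) {
    instances[el.getAttribute('name')] = {
      class: el.getAttribute('class'),
      attributes: Object.fromEntries(
        [...el.querySelectorAll('ATTRIBUTE')].map(
          (a) => [a.getAttribute('name'), a.textContent.trim()]
        )
      )
    };
  }
  return instances; // handed to the interpreter of the FlowScene workflow
}
```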

5 Use Case

To demonstrate the use of the modeling language and showcase a practical application, we have developed a use case involving augmented reality-assisted assembly of a bedside table. The goal of this use case is to guide a user through the assembly of a bedside table using an augmented reality application instead of traditional 2D instructions on paper. Figure 3 shows a screenshot of the implementation in ADOxx. It includes an excerpt of a FlowScene model (1), the referenced ObjectSpace model (2), and two Statechange models (3, 4).

Fig. 3. Screenshot of the ADOxx implementation showing model excerpts for supporting an assembly process in augmented reality: 1) FlowScene model of the assembly process. 2) ObjectSpace model of the necessary augmentations and detectables using markers. 3) and 4) Two exemplary Statechange models.

In the upper part of Fig. 3, the excerpt of the FlowScene model shows how the process for assembling the piece of furniture is defined step by step. This includes steps such as turning the pieces into the correct position and attaching them one by one. It is important to note that no static flows are defined here, but rather trigger-condition-action sequences. The FlowScene model references one ObjectSpace model (2) and several Statechange models (3 & 4).

In the lower left part of Fig. 3, the ObjectSpace model is shown (2). It includes ten Detectables that contain images of markers which are well-suited for computer vision detection algorithms. These act as surrogates for more advanced 3D object recognition algorithms that would permit the direct detection of physical objects. Further, the model includes Augmentation instances for each part of the furniture piece, e.g., “TopPlate 1”. These Augmentations are provided as GLTF files, a common format for 3D objects and their textures. The Augmentations are connected by is_child relations to facilitate positioning, and Detectables can be assigned to them as reference points via anchored relations. The Augmentations and Detectables defined in the ObjectSpace model are then referenced in the FlowScene model.
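For illustration, loading such a GLTF augmentation and attaching it to its parent could look as follows in three.js; the file name follows the use case, while the code itself is a simplified sketch rather than the engine implementation:

```js
import * as THREE from 'three';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

// Group of a parent augmentation; children are positioned relative to it.
const parentAugmentation = new THREE.Group();

// Load the "TopPlate 1" augmentation and attach it as a child.
new GLTFLoader().load('TopPlate1.gltf', (gltf) => {
  // Offsets are expressed in the parent's local coordinate system,
  // mirroring the is_child relation used to facilitate positioning.
  gltf.scene.position.set(0, 0.02, 0);
  parentAugmentation.add(gltf.scene);
});
```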

Furthermore, the FlowScene model (1) includes Statechange instances, e.g., “Init MiddlePlate”, which reference Statechange models. In the lower right of Fig. 3, two exemplary Statechange models, “Init MiddlePlate” (3) and “Leg 1 Positioned” (4), are shown. They reference one or more Augmentations from the ObjectSpace model and define the state of the position, rotation, and visibility parameters during the execution of the FlowScene model. These parameters are also displayed as a table. A detailed description of the semantics and notation of each language concept is available in Table 1.

Fig. 4. Illustration of the assembly process of a bedside table – cf. IKEA [1] (a–c), and the support through AR based on the visual models (d–f).

The execution of the models of the use case is shown in Fig. 4 using parts from an IKEA table [1]. Subfigures (a)–(c) illustrate the traditional 2D assembly instructions for (a) “attaching Leg 1”, (b) “turning MiddlePlate 90\(^\circ \) counterclockwise”, and (c) “attaching Leg 2”. Subfigures (d)–(f) illustrate the same steps of the instructions in augmented reality using the aforementioned models [25] and the WebXR AR engine. The screenshots were taken while using the WebXR AR engine in the Chrome browser on a Samsung Galaxy Tab S7 tablet. Subfigure (d) shows the Statechange “Leg 1 Positioned”. It superimposes an image of Leg 1 on top of the real MiddlePlate, whose existence, position, and orientation are detected via a marker (Detectable 10). Subfigure (e) shows the Statechange “Rotate MiddlePlate”, where the virtual object is rotated according to the desired position for further assembly of the table. Subfigure (f) shows the Statechange “Leg 2 Positioned”; the augmentation shows where the next leg shall be attached. As can be seen in subfigures (d), (e), and (f), several colored markers are placed on the real object at strategic points according to the ObjectSpace model. Once a marker is detected, the current state of the workflow defined by the FlowScene model determines whether the detection triggers an action. If an action is triggered, the workflow moves on and waits until the next detectable (marker) in line is detected. The flexible structure of the DSML allows multiple workflow paths to be active at the same time by checking for multiple detectables simultaneously; detectables are also tracked when they are not part of the FlowScene. To avoid making the use case unnecessarily complex, the concepts of Resolves and Observers were not used.
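A condensed sketch of this marker-driven execution logic, with all identifiers invented for illustration, could look as follows:

```js
// Several workflow paths can be active at once, each waiting for its
// next detectable (illustrative structures, not the engine internals).
const activePaths = [
  { next: { detectable: 'marker-3', statechange: 'Leg 1 Positioned' } },
  { next: { detectable: 'marker-5', statechange: 'Leg 2 Positioned' } }
];

function onMarkerDetected(markerId, applyStatechange, advance) {
  for (const path of activePaths) {
    // A detection only triggers an action if it matches the current
    // state of this path; other markers are tracked but ignored here.
    if (path.next && path.next.detectable === markerId) {
      applyStatechange(path.next.statechange);
      path.next = advance(path); // wait for the next detectable in line
    }
  }
}
```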

Table 2. Feature comparison of the new domain-specific visual modeling language ARWFML based on twelve specific requirements SR\(_{1-12}\). (Y): Requirement met. (N): Requirement not met. (-): Not specified.

6 Evaluation

Several techniques can be chosen to evaluate the new modeling language, including feature comparisons, theoretical and conceptual investigations, and empirical evaluations [34]. We opted for a feature comparison with previous approaches along the specific requirements that we had formulated. The approaches considered were those of Ruminski and Walczak [30], Grambow et al. [15], Seiger et al. [33], Lechner [21], Campos-Lopez et al. [6], Ruiz-Rube et al. [29], and Wild et al. [37].

For each specific requirement, we conducted a detailed comparison using multiple dimensions, as shown in Table 2. This provides a detailed overview of the features supported by previous approaches and our new modeling language in terms of augmented reality concepts, levels of abstraction, user interaction, metamodeling capabilities, model execution, support for open standards, and the availability of corresponding implementations. Thereby, we can show that our new modeling language, denoted ARWFML (AR Workflow Modeling Language), currently supports 26 out of 33 requirement dimensions, whereas the next best approach supports only 21.

With regard to Augmentations (SR\(_1\)), features such as animations, links, checklists, and forms are not yet supported by our language. However, this is more a technical than a conceptual issue and will be addressed in future versions; the same holds true for area triggers (SR\(_6\)). Concerning User Interaction (SR\(_8\)), the current implementation of our language only supports text-based and 2D visual modeling due to limitations of the ADOxx platform, which is also not yet available as open source. 3D spatial modeling, such as in a 3D-capable modeling tool or directly in AR, is not yet supported. Enabling 3D spatial modeling would require adapting current metamodeling platforms, e.g., to directly support open 3D standards such as WebXR [18] (SR\(_{11}\)). This would certainly ease the specification of models, as 3D modeling greatly supports spatial imagination.

7 Conclusion and Outlook

In this paper, we presented a domain-specific visual modeling language that is capable of representing complex augmented reality workflows for diverse application scenarios and that can be executed using the open WebXR standard. The modeling language allows designers to specify three different types of visual models: (1) the AR environment, (2) the AR workflow, and (3) the different statechanges within this workflow. Thus, the language emphasizes a high level of abstraction and separation of concerns. This abstraction compensates for potentially missing knowledge about the technical implementation of AR environments and allows the user to focus on the content and functionality of AR applications. The technical feasibility was demonstrated by implementing the modeling language on the ADOxx platform together with a prototypical web application for executing the models. A first evaluation has been conducted through a feature comparison with previous approaches and indicated a high coverage of the defined requirements.

In future research, we plan a further evaluation of the DSML and the AR application by means of a user study, which will allow us to identify bottlenecks or blind spots of the DSML. Furthermore, the 2D modeling approach presented here has some limitations that stem from modeling 3D environments in 2D modeling tools. For example, specifying the position of the table legs in the use case described above requires a good understanding of three-dimensional space; it is almost impossible to define position and rotation vectors in 3D space without visualizing them in 3D. Therefore, a new metamodeling platform is currently being developed that incorporates the third dimension during visual modeling, enabling 3D modeling in three-dimensional space [24]. Once the approach has gained further maturity, it will be possible to evaluate it empirically.