1 Introduction

Remote collaboration tools are well-established in present-day work environments. However, collaborative virtual environment groupware is still not common in workplaces. One reason is the high cost of engaging in collaboration: systems require a complex setup and usually focus on a small, homogeneous spectrum of hardware and software, excluding participants who do not satisfy all technical requirements. Another reason for the low utilization rate of collaborative virtual environment groupware is the loss of information and capabilities that people are used to when working in co-located settings.

Collaboration is a fundamental aspect of Design Thinking (DT). Team members generate ideas and communicate these to their teammates and other stakeholders. Ideas can be expressed in a tangible way, such as in three-dimensional hand-crafted prototypes using haptic materials (e.g. paper or bricks). As a dedicated representation of an evolving design (Houde and Hill 1997), a prototype serves to illustrate people's ideas and to externalize implicit knowledge (Buxton 2007). DT's team-based approach is often applied in co-located settings. Hence, the provided tools and materials mostly aim at supporting co-located teams. However, working at the same location is not always possible for all participants of a DT team. Dedicated software tools can help teams to work and prototype together over distances.

In this chapter, we present a web browser-based collaborative virtual environment for supporting geographically distributed design teams in their joint creation of three-dimensional low-fidelity prototypes. The application provides a shared 3D workplane surrounded by remote users' avatars. All participants can jointly create and modify 3D shapes while seeing and hearing each other in real time. While focusing on virtual reality (VR) hardware, such as a head-mounted display and its controllers, our cross-platform application can also be operated by mouse or touch on mobile as well as on traditional desktop devices.

2 Prototyper

Figure 1 shows the application with two remote participants. The user at location A views the provided prototyping space on a traditional computer screen, whereas the user at location B wears a head-mounted display that shows a three-dimensional stereoscopic visualization of the prototyping space. Both participants can equally interact with the generated 3D artifacts using their available input devices, such as a mouse (location A) or dedicated 3D controllers (location B). At the same time, both users can see each other's avatar, including its current position and viewing direction.

Fig. 1: The Prototyper web application being used simultaneously at two different locations. Users operate and view the application with traditional hardware, such as a computer screen and mouse (left), or with dedicated virtual reality hardware (right).

A shared workspace is a valuable part of distributed collaboration (Tang 1991; Whittaker et al. 1993) and is required for jointly manipulating 3D artifacts in our application's virtual prototyping space. Its conceptual setup is depicted in Fig. 2. The virtual prototyping space in Fig. 2b is modelled on the table setup schema shown in Fig. 2a.

Fig. 2: Prototyper table setup.

Remote users' virtual embodiments are placed around a table representing a workplane that serves as the shared workspace where 3D artifacts can be created, modified and assembled into more complex structures. The proportions between participants' body sizes and the table's dimensions shown in the schema are maintained in the setup's virtual counterpart. Users should be within arm's reach of the 3D objects on the workplane; this is important for the application's interaction concept described below.

Given this setup, our application implements the three distinct types of spaces proposed by Buxton (2009): a person space of the remote participants' avatars, a task space of the involved 3D artifacts on the workplane and a reference space for pointing. Participants are aware of others' activities since they can see who is present, where users look, and which part of the model they are working on.

2.1 3D Modeling Using a Basic Construction Kit

With Prototyper, we aim to build volumetric low-fidelity prototypes (e.g. the prototypes shown in Fig. 3). Such prototypes provide only the key elements of the underlying visual concept and place rather low demands on level of detail (Walker et al. 2002) and user skills (Babich 2017).

Fig. 3: Examples of low-fidelity prototypes.

These aspects are reflected in our application's 3D model construction and modification concept. Prototypes consist of a small set of building blocks, i.e. basic 3D shapes that can be transformed, colorized and textured. Figure 4 shows the four basic shapes our system offers. All shapes can be composed into more complex structures using Constructive Solid Geometry (CSG) based on Boolean set operations (Foley et al. 1990). The technique's results are shown in Fig. 4b: as a first step, the three scaled and rotated cylinder shapes in the middle are combined using a Boolean union operation; in a second step, the shape on the right is created by applying a difference operation to the cube on the left and the combined cylinder shapes.
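
To make the two-step construction concrete, the following sketch expresses it in code. The chapter does not name Prototyper's CSG implementation; the open-source @jscad/modeling package is used here purely as a stand-in, and all dimensions are illustrative.

```typescript
// Step 1: Boolean union of three scaled and rotated cylinders;
// Step 2: Boolean difference cuts that union out of a cube (cf. Fig. 4b).
import { primitives, booleans, transforms } from '@jscad/modeling';

const { cube, cylinder } = primitives;
const { union, subtract } = booleans;
const { rotateX, rotateY } = transforms;

const bar = cylinder({ radius: 0.2, height: 1.2 }); // axis along z by default

const cross = union(
  bar,                       // cylinder along the z axis
  rotateX(Math.PI / 2, bar), // cylinder rotated onto the y axis
  rotateY(Math.PI / 2, bar)  // cylinder rotated onto the x axis
);

const pierced = subtract(cube({ size: 1.0 }), cross);
```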

Fig. 4: Prototyper's set of basic shapes.

The combination of basic shapes and repetitive CSG allows the creation of complex 3D models. As described by Wenzel et al. (2016), users are able to recreate the real-world prototypes shown in Fig. 3. However, during initial user tests we learned that applying CSG to construct 3D models requires some experience, which challenges beginners in particular. Hence, we added freeform shapes (see Fig. 4c) whose creation is easier and more intuitive for beginners. Conceptually, creating Prototyper's freeform shapes in 3D relates to drawing lines with a marker on a sheet of paper in 2D. Our freeform approach is similar to the hand-held physical sketching device described by Agrawal et al. (2015), which proved to be useful for creative exploration. Creating freeform shapes in Prototyper is only possible in VR mode with the respective 3D controllers, which provide well-defined, continuous 3D positional data. This is our otherwise input-method-agnostic application's only limitation regarding 3D content generation.
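
A minimal sketch of how such freeform input can be captured, assuming a per-frame callback that delivers the tracked controller position; names and thresholds are illustrative, not Prototyper's actual API.

```typescript
// While the trigger is held, the controller position is sampled every frame
// and appended to a stroke, analogous to dragging a marker across paper.
type Vec3 = { x: number; y: number; z: number };

const MIN_SEGMENT = 0.005; // meters; ignore samples closer than 5 mm (jitter)

function distance(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

class FreeformStroke {
  readonly points: Vec3[] = [];

  addSample(p: Vec3): void {
    const last = this.points[this.points.length - 1];
    if (last && distance(last, p) < MIN_SEGMENT) return; // de-noise
    this.points.push(p);
  }
}

// Per-frame update with the tracked controller pose; on trigger release,
// the point list would be extruded into a tube mesh and added to the scene.
function onFrame(stroke: FreeformStroke, triggerPressed: boolean, pos: Vec3): void {
  if (triggerPressed) stroke.addSample(pos);
}
```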

When Prototyper is not used for creating 3D models from scratch, the system allows importing external 3D files and using the corresponding models just like any other shape within the system. An example of such an external model is shown in Fig. 4d. Even when an external model is not meant to be modified further, Prototyper can serve as a tool for jointly viewing and discussing later-stage high-fidelity prototypes.

2.2 Interfaces to the Analog World—Import and Export of Physical and Virtual Prototypes

Prototyper covers a specific use case. In order to preserve generated content and to embed Prototyper in larger contexts within users’ day-to-day work, a “bridge” to other already existing software systems is needed. Importing external 3D files into our system allows building upon existing 3D models (e.g. digital models created with a Computer Aided Design (CAD) tool). For the other direction, models created with Prototyper can in turn be exported into 3D files.
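
As an illustration of the export direction, the sketch below serializes a triangle mesh to the Wavefront OBJ format, which most CAD tools and slicers accept; the chapter does not specify which file formats Prototyper actually reads and writes.

```typescript
// Minimal OBJ writer: one "v" line per vertex, one "f" line per triangle.
type Vec3 = [number, number, number];

interface TriangleMesh {
  vertices: Vec3[];                  // vertex positions
  faces: [number, number, number][]; // zero-based vertex indices
}

function toObj(mesh: TriangleMesh): string {
  const v = mesh.vertices.map(([x, y, z]) => `v ${x} ${y} ${z}`);
  // OBJ face indices are 1-based, hence the +1 offsets.
  const f = mesh.faces.map(([a, b, c]) => `f ${a + 1} ${b + 1} ${c + 1}`);
  return [...v, ...f].join('\n') + '\n';
}

// Usage: a single triangle.
const tri: TriangleMesh = {
  vertices: [[0, 0, 0], [1, 0, 0], [0, 1, 0]],
  faces: [[0, 1, 2]],
};
console.log(toObj(tri)); // "v 0 0 0 ... f 1 2 3"
```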

The exchange interfaces also allow a connection to physical objects. Though not widespread at the moment, there are research efforts toward digitizing three-dimensional objects with 3D scanners built into smaller devices, such as smartphones (Naegeli 2013; Stoller-Conrad 2015). In order to gather feedback on a digital 3D model, a 3D printer can be used to create a physical object (see Fig. 5b).

Fig. 5: Virtual 3D model created with Prototyper, and the model's physical, 3D-printed version.

2.3 Awareness Through Audio-Visual Remote User Embodiment

In general communication scenarios, speech is an important instrument for explicitly exchanging information and for gathering evidence that a message has been understood as intended (Clark and Brennan 1991). However, in co-located, face-to-face settings, people’s communication is also based on rather implicit sources of information, such as the views of others’ faces, bodies, and actions; views of the task objects; and views of the environment (Kraut et al. 2003). These visual cues help provide awareness of other group members, which is crucial for successful collaboration (Gutwin and Greenberg 2004; Dourish and Bellotti 1992), especially when the design and use of artifacts are involved (Poppe et al. 2013).

Within collaborative virtual environments, it is challenging to coordinate and manage the actions and intentions of users (Domingues et al. 2010). Three-dimensional user embodiment within virtual environments is an approach for addressing these challenges. Benford et al. (1995) consider user embodiment a key issue for collaborative virtual environments:

…without sufficient embodiment, users only become known to one another through their (disembodied) actions; one might draw an analogy between such users and poltergeists, only visible through paranormal activity.

Gutwin and Greenberg (1999) and Benford et al. (1995) identified lists of design issues that are important for awareness in real-time groupware systems and collaborative virtual environments, respectively. These issues can be split into categories providing information about (1) who users are working with, (2) what others are doing, (3) where they are working, (4) when events happen, and (5) how those events occur (Gutwin and Greenberg 1999).

Within Prototyper, we seek to address all of these issues based on 3D user embodiment. Remote participants' avatars, as shown in Fig. 2b, provide information about a person's presence, location, gaze and field-of-view. The 3D artifacts on the workplane, together with remote users' virtual hands, show which artifacts are being modified by whom. The scale of the remote avatars, the workplane and its objects corresponds to the real-world setting shown in Fig. 2a. This way, a user can estimate a remote participant's reach within the virtual space.

A limitation of Prototyper is that the system does not provide an actual live image of a remote participant. However, in order to provide information about the identity of a remote user, we use additional metadata. Prototyper is a web-based system that includes, besides the Prototyper application itself, a web portal that provides user management, content organization and access control. Every Prototyper user therefore has a user account connected with a profile (see Fig. 6a), where users can upload a profile photo used within the web portal and, more importantly, a 3D file to be used as their personal avatar within Prototyper.

Fig. 6: Prototyper system web portal.

With the help of the person's gender and body height, we calculate the size and position of the 3D avatar (Medlej 2013). When no avatar file is specified, a default avatar based on the provided gender is displayed within Prototyper. This way, our system always provides enough avatar information to let users distinguish between remote participants and assign actions to specific persons.
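
The sizing step could look like the following sketch, which uniformly scales an uploaded avatar model so that its bounding-box height matches the stated body height; this uses three.js as a stand-in and is an assumption about, not a description of, Prototyper's internals.

```typescript
import { Box3, Object3D } from 'three';

// Scale an uploaded avatar so its height equals the user's body height (m)
// and place its feet on the floor plane (y = 0).
function fitAvatarToHeight(avatar: Object3D, bodyHeightM: number): void {
  const bounds = new Box3().setFromObject(avatar);
  const modelHeight = bounds.max.y - bounds.min.y;
  const factor = bodyHeightM / modelHeight;
  avatar.scale.multiplyScalar(factor);
  avatar.position.y -= bounds.min.y * factor; // lift/lower feet onto the floor
}
```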

Information regarding when and how events happen is currently not visualized directly within Prototyper. However, this information is automatically stored on the server for later use.

When using Prototyper, participants are connected via audio so they can talk to each other. To help users distinguish who is actually talking, we provide a visual hint: the respective remote avatar is highlighted in a distinct color while its user speaks. Visual cues regarding the user and remote participants as well as their activities in relation to the workspace materials are crucial but are oftentimes not sufficient for a sense of presence (Büscher et al. 2001). Especially when users are looking in another direction or the remote participant is not within a user's field-of-view, it can become difficult to know the speaker's identity. In co-located settings, people can locate another person by the sound of his/her voice. Within our system, we replicate this kind of out-of-sight localization. Based on the positions of the local user and the remote participant in the virtual space (this data is exchanged among all locations in real time), the remote user's voice is spatialized so that it appears to come from the remote participant's position.
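
In a web browser, this kind of voice spatialization maps naturally onto the standard Web Audio API, as the following sketch shows: each remote participant's audio stream is routed through a PannerNode whose position follows the synchronized avatar position. The function names are illustrative.

```typescript
const ctx = new AudioContext();

// Route a remote participant's voice through a 3D panner.
function spatializeVoice(remoteStream: MediaStream): PannerNode {
  const source = ctx.createMediaStreamSource(remoteStream);
  const panner = new PannerNode(ctx, {
    panningModel: 'HRTF',     // head-related transfer function for 3D cues
    distanceModel: 'inverse', // volume falls off with distance
  });
  source.connect(panner).connect(ctx.destination);
  return panner;
}

// Called whenever a synchronized position update for the remote avatar
// arrives; the listener pose (ctx.listener) is updated from the local
// user's head pose in the same way.
function onAvatarMoved(panner: PannerNode, x: number, y: number, z: number): void {
  panner.positionX.value = x;
  panner.positionY.value = y;
  panner.positionZ.value = z;
}
```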

2.4 A Web Browser-Based Cross-Platform Application for Immediate Access

The cost for a user to engage in a collaboration is a critical factor for the success of a remote collaboration system, i.e. users' effort must be kept to a minimum (Gutwin et al. 2008; Kraut et al. 2002).

User interface and user experience issues have to be considered when designing a remote collaboration system. However, in order to get to the point of experiencing a system, users have to actually use it, which might become problematic when it means telling the other person: “OK, remember what you wanted to say while I go and find a room, power up the system and install the software.”

Prototyper is a web browser-based application. This has two major advantages: (1) web browsers are available, and mostly pre-installed, on almost all user devices from mobile to traditional desktop hardware, and (2) people know web browsers from their daily lives; they know how to operate them and are familiar with the corresponding paradigms.

Hence, Prototyper's only requirement is a web browser; an installation of additional software is not necessary. This applies to operating Prototyper with mouse or touch input. For the intended usage with virtual reality hardware, additional (mostly driver) software installation is required. In the current version of our system, we support the HTC Vive as virtual reality hardware. Prototyper is designed in such a way that other VR hardware, e.g. mixed reality systems and mobile phones, can be added easily.

From a user perspective, the Prototyper system consists of the 3D modeling application and a web portal that serves as an entry point and administration interface for the system. Users can manage projects and associated prototypes in order to organize their work and control access rights. The menu on the left in Fig. 6a shows three prototypes contained in a project. User accounts are assigned to projects and therefore gain access to the projects’ prototypes.

The Prototyper application can be started from the web portal by any user who is assigned to the prototype's project. All participants are connected automatically when the Prototyper application is started. From that point, all user interactions are synchronized among all participants in real time via our central collaboration server. The server not only relays the synchronization messages but also stores the content data. When working with Prototyper, the data is stored automatically, so users do not have to press a "save" button. This way, the latest state is shown when starting Prototyper. Since all of a prototype's content data is stored from the beginning, this data can be viewed via a dedicated history browser (see Fig. 6b), allowing users to navigate through a prototype's course of development and to create a branch from any point in time if necessary.
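
Conceptually, the server behaves like the following sketch: every interaction arrives as an event, is appended to a per-prototype log (which doubles as the auto-save and as the data source for the history browser), and is relayed to all other participants. The message shape and field names are assumptions for illustration.

```typescript
interface PrototypeEvent {
  prototypeId: string;
  userId: string;
  timestamp: number;
  action: 'create' | 'transform' | 'style' | 'delete';
  payload: unknown; // e.g. shape type, transform matrix, color
}

const log: PrototypeEvent[] = [];     // persistent, append-only event store
const clients = new Set<WebSocket>(); // currently connected participants

function onClientMessage(sender: WebSocket, raw: string): void {
  const event: PrototypeEvent = JSON.parse(raw);
  log.push(event); // auto-save: no explicit "save" button needed
  for (const client of clients) {
    if (client !== sender) client.send(raw); // relay in real time
  }
}

// History browser: the state after the first n events; a branch can be
// started by copying this prefix into a new log.
function stateAt(n: number): PrototypeEvent[] {
  return log.slice(0, n);
}
```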

2.5 Interaction

The ability to manipulate objects within virtual environments, such as virtual reality, is a defining feature of such systems (Bowman et al. 1997). This raises the need for interface and interaction techniques focusing on spatial input in a physical three-dimensional context (Bowman et al. 2012).

There are three basic user interaction tasks within virtual environments: navigation or travel, selection and manipulation (Bowman et al. 1997; Bowman and Hodges 1999; Bowman et al. 2001). There are different technical approaches for each of these tasks. Each of them can be distinguished with regard to the “objective degree of exactness with which real-world interactions can be reproduced in an interactive system” (McMahan 2011). High levels of this naturalism can enhance performance and the overall user experience (Bowman et al. 2012).

2.5.1 Selection and Manipulation

Raycast and arm extension techniques are common approaches for the object selection and manipulation tasks (Bowman and Hodges 1997). These techniques usually originate from the need to reach and interact with objects that are further away and not within arm's reach. Without such techniques, users would need to change their avatar's position within the virtual space, i.e. they would have to travel. Compared to other techniques, the level of naturalism of raycasting and arm extension methods is lower since there is no exact, direct mapping between the user's arm and the resulting movement in the virtual world. A common and more natural technique is called simple virtual hand (Bowman et al. 2004). Here, users control a virtual hand directly mapped to their real hand movements. For selecting and manipulating, users touch a virtual object in a way that parallels how they interact with real-world objects. This technique requires the user to be within arm's reach. The intended interaction space within Prototyper is essentially the workplane, whose size is designed to be within arm's reach. Thus, we utilize the simple virtual hand technique in our system. Users' virtual hands are represented by real-world-sized virtual controllers held in the user's hands. Figure 7a shows the controller touching a virtual object.
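
A compact sketch of the simple virtual hand technique, with three.js as a stand-in scene graph (the chapter does not disclose the implementation): touch is detected by proximity, and grabbing re-parents the object to the controller so that it inherits all subsequent translations and rotations.

```typescript
import { Object3D, Vector3 } from 'three';

const TOUCH_RADIUS = 0.08; // meters; approximate "touch" range of a controller

// Proximity-based touch test between the controller and candidate objects.
function findTouched(controller: Object3D, objects: Object3D[]): Object3D | undefined {
  const hand = controller.getWorldPosition(new Vector3());
  return objects.find(
    (o) => o.getWorldPosition(new Vector3()).distanceTo(hand) < TOUCH_RADIUS
  );
}

// Grab: re-parent while preserving the world transform, rigidly coupling
// the object to the controller. Release: hand it back to the scene.
function grab(controller: Object3D, object: Object3D): void {
  controller.attach(object);
}

function release(scene: Object3D, object: Object3D): void {
  scene.attach(object);
}
```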

Fig. 7: Interaction techniques: (a) simple virtual hand and (b) raycasting. When touching a virtual object, a shortcut menu is shown on the virtual controller for fast reachability of common options (a). Raycasting is used to select options on a controller-aligned context menu (b).

When there are many objects in a small area, it becomes difficult to distinguish which object is actually being touched. Therefore, Prototyper provides a visual cue by means of a slight color change in the touched object. Interacting with virtual objects is also challenging for users due to the lack of haptic feedback (Mine et al. 1997). In order to provide at least a small part of such feedback, a force-feedback motor causes a slight controller vibration whenever an object is touched. Users preferred our visual/haptic hand approach over a simpler raycast version that we had implemented first.
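
A sketch of the combined visual/haptic touch cue, assuming a three.js material for the highlight and the WebXR/Gamepad haptics extension for the vibration; exact colors and pulse parameters are illustrative.

```typescript
import { Color, Mesh, MeshStandardMaterial } from 'three';

function onObjectTouched(object: Mesh, inputSource: XRInputSource): void {
  // Visual cue: a slight color change on the touched object.
  (object.material as MeshStandardMaterial).emissive = new Color(0x223322);

  // Haptic cue: a short, weak vibration pulse on the touching controller.
  // `hapticActuators` is a Gamepad extension missing from some TS DOM
  // typings, hence the cast.
  const actuator = (inputSource.gamepad as any)?.hapticActuators?.[0];
  actuator?.pulse(0.3 /* intensity 0..1 */, 50 /* duration in ms */);
}
```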

Moving and rotating virtual objects with the simple hand approach is quite straightforward: when an object is grabbed, it is attached to the user's hand, or controller respectively, so that all translation and rotation transformations of the controller are applied to the object in the same way. A different approach is necessary for changing the size of a virtual object. Our solution for scaling distinguishes horizontal, vertical and uniform scaling based on the controller's orientation and the number of involved controllers. Holding a special scale button on one controller scales the grabbed object in the horizontal dimensions by the controller's amount of translation, provided the controller is oriented horizontally. Vertical scaling is performed with a vertically oriented controller. For a uniform scale, the user grabs an object with both hands/controllers and pulls or squeezes it to increase or decrease its size.
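
The scaling rules can be summarized in code as follows; the thresholds and math types are illustrative assumptions, not Prototyper's actual values.

```typescript
type Vec3 = { x: number; y: number; z: number };

// A controller counts as "vertical" when its forward axis is within roughly
// 45 degrees of the world up-vector.
function isVertical(forward: Vec3): boolean {
  return Math.abs(forward.y) > 0.7;
}

// One-handed scaling with the scale button held: the controller's amount of
// translation scales the object vertically or horizontally, depending on
// the controller's orientation.
function oneHandedScale(scale: Vec3, forward: Vec3, translation: number): Vec3 {
  const s = 1 + translation;
  return isVertical(forward)
    ? { x: scale.x, y: scale.y * s, z: scale.z }      // vertical scaling
    : { x: scale.x * s, y: scale.y, z: scale.z * s }; // horizontal scaling
}

// Two-handed scaling: pulling the controllers apart grows the object,
// squeezing them together shrinks it, uniformly in all dimensions.
function uniformScale(scale: Vec3, prevDist: number, currDist: number): Vec3 {
  const f = currDist / prevDist;
  return { x: scale.x * f, y: scale.y * f, z: scale.z * f };
}
```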

2.5.2 System Control Using Menus

With Prototyper, objects can be created, transformed or changed in their visual appearance. This functionality is provided by different tools and options within the system. The task of changing a virtual system's state or mode of interaction is called system control (Bowman and Wingrave 2001). Changing an object's visual attributes or switching between different tools is realized with the help of different menus. Figure 7b shows a menu for changing an object's visuals. The menu is displayed on the user's hand and attached to the controller so that it moves and rotates with it. Since the selection icons are small, touching them directly is difficult; furthermore, selecting an option is just a binary "click". We therefore provide a ray originating from the other hand for making this selection.
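
The binary click selection could be implemented as in the sketch below, casting a ray from the free hand against the icons of the menu attached to the other controller; three.js's Raycaster stands in for whatever picking code Prototyper uses.

```typescript
import { Object3D, Raycaster, Vector3 } from 'three';

const raycaster = new Raycaster();

// Returns the menu icon currently under the ray from the free hand, if any;
// pressing the trigger while an icon is hit selects its option.
function pickMenuOption(
  handPosition: Vector3,
  handForward: Vector3,
  menuIcons: Object3D[]
): Object3D | undefined {
  raycaster.set(handPosition, handForward.clone().normalize());
  const hits = raycaster.intersectObjects(menuIcons, true);
  return hits[0]?.object; // closest icon along the ray
}
```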

Prototyper provides three different interaction modes: (1) object transformation, (2) object creation and (3) freeform drawing. Switching between these modes is shown in the left picture of Fig. 8. Pressing the mode selection button on a controller shows three working mode symbols that users can select by rotating the controller around its longitudinal axis until it points to the desired tool symbol.
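
One way to realize this rotational selection is to divide the controller's roll angle into three equal sectors, as in the following sketch; the sector mapping is an illustrative assumption.

```typescript
type Mode = 'transform' | 'create' | 'freeform';
const MODES: Mode[] = ['transform', 'create', 'freeform'];

// rollRadians: rotation of the controller around its longitudinal axis,
// normalized to [-PI, PI). Each mode occupies a 120-degree sector.
function modeFromRoll(rollRadians: number): Mode {
  const sector = (2 * Math.PI) / MODES.length;
  const shifted = (rollRadians + Math.PI) % (2 * Math.PI); // to [0, 2*PI)
  return MODES[Math.floor(shifted / sector)];
}
```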

Fig. 8: Menus for switching between different modes of interaction and setting options and parameters.

The touch-sensitive areas of the controllers are used to dynamically display and select different options, such as the color and thickness of freeform drawings (see middle and right picture of Fig. 8) or shortcut options for object duplication and deletion when touching an object (see Fig. 7a).

2.5.3 Travel

As already mentioned, the interaction area within Prototyper is relatively small, so travelling should rarely be necessary: objects are usually within arm's reach of the user. Since research shows that physical turning and walking can enhance spatial orientation and movement understanding (Bowman et al. 2012), users physically walk in order to change their avatar's position when using Prototyper. This requires precise six-degrees-of-freedom (6DoF) tracking within a larger area. Modern VR systems fulfill both requirements, providing a precise tracking space with, for example, a 5-meter diagonal.

2.5.4 Mouse and Touch Interaction

An important factor is the support of heterogeneous hardware with regard to user input devices. This way, users who do not have dedicated VR hardware can take part in Prototyper's collaborative virtual environment.

Hence, Prototyper provides a "fallback" mechanism for the aforementioned user interaction tasks. Figure 9 shows the user interface for transforming objects and changing their visual attributes.

Fig. 9: User interface for mouse- and touch-based object transformation and styling.

This approach enables synchronous collaboration with different hardware setups.

3 Conclusion and Future Work

In this chapter, we presented Prototyper, a web browser-based collaborative virtual environment that supports the joint real-time creation of three-dimensional low-fidelity prototypes. It is a cross-platform application that runs on a multitude of hardware devices. While the system focuses on usage with virtual reality hardware, users are free to participate with only traditional input and output devices. The system provides enhanced awareness through visual remote user embodiment combined with spatial audio communication.

In future work, we want to focus on mobile devices, such as smartphones, and the user interactions in VR when there are either no or only basic input devices available.