
1 Introduction

Projection-based technology has received increasing attention over the last few years, in part because it provides effective means to overcome the inherent information-display limitations associated with small screens. The driving idea behind this technology is that we can create interactive displays that go well beyond the confines of the device itself, for example, to encompass virtually any external object present in a given physical space.

The use of projection-based technology is not completely new. Yet, the level of interactivity offered by such technology is, at the moment, somewhat pedestrian. Most applications work much like an ordinary projector connected to a laptop: the projected image is a clone of the image on the device controlling it, and user interaction is essentially limited to interaction with the device itself. This implies that the display space and the user interaction remain separate; information does not adapt to the shape of physical objects and users cannot interact with the projection itself. In the end, the technology is only used to extend the area on which content is displayed, not to pave the way to new types of collaborative interaction and shared experiences that let users manipulate information visualised directly on the object or subject of interest using familiar real-world interaction metaphors.

Typically, interaction with the projection device (e.g., proximity, tilt gestures) or on the device (e.g., tap, touch) directly influences the display space rather than the projection space. Examples include the use of projection to facilitate social interaction in a workplace by displaying photos associated with a person in physical proximity [26], or to provide a community poster board that shows content either created by users or automatically sampled from the workplace intranet [10]. More recently, projection has also been used to create new ways of experiencing daily activities: SubliMotion [43], for example, uses projection mapping to provide an unparalleled techno-gastronomic experience that goes beyond taste. In most cases, the main advantage of using projections is that the projected image can easily be shared among multiple users, while interaction with the projection itself is somewhat limited to translation, rotation and scaling, as the situation requires. Yet there are more exciting ways of using projected content that capitalise on a new wave of projection-based devices, such as smartphones with an integrated camera and embedded projector, or digital cameras that can project their photos. Thanks to their portability, these devices can make projection-based interaction far more ubiquitous than the fixed projector-based setups of the past.

In this work, we introduce a new projective augmented reality paradigm aimed mostly at supporting creativity processes. The augmented reality systems available today for smartphones do not provide seamless integration between reality, user experience and information. On the one hand, they lack a proper communication infrastructure. On the other hand, most applications require users to wear head-mounted displays or to hold up a smartphone or tablet and switch on the camera view to overlay graphical elements onto reality, which is only well suited to solitary, single-user experiences. Hence, in many cases, they act more like a display or a portal onto reality than an immersive reality itself. Our interaction techniques consider the case where users have access to handheld projectors, so that the interplay between users and projectors can yield a rich design space for multi-user interaction and ultimately pave a new way to augment the surrounding environment with scenarios for collaborative experience.

2 Motivation and Objectives

In the previous section, we described the driving idea behind projection-based technology. In this section we present our motivation and enumerate open issues and research challenges related to our approach. In the next section, we survey the relevant literature, both to see whether similar studies have been done and to define the framework against which to evaluate the relevance and impact of this study. Then, in Sect. 4, we describe the proposed methodology as well as the implementation details. In Sects. 5 and 6 we present several use case scenarios as well as an evaluation of the framework. Lastly, in Sect. 7, we wrap up the results of our work with conclusions and directions for future work.

Our motivation to exploit projection-based technology is driven by three factors. First, in our framework, called c-Space, we capitalise on the ongoing move of smartphones towards near-ubiquity to create near real-time automatic 3D reconstructions of spatio-temporal events, e.g. concerts, moving objects, etc. Hence, we seek new ways of interacting with replicas of physical objects or events that resemble real-world interaction metaphors as we know them. Second, it is undeniable that, aside from limited memory and processing power, very small display sizes are the major bottleneck of smartphones. Therefore, we want to investigate whether the interactivity between users and Personal Pico Projection (PPP) technology (e.g. smartphones with integrated projectors) can yield a rich design space for multi-user interaction, mitigating one of the key drawbacks of interacting with small displays. Lastly, we want to investigate to what extent this new way of interacting can support creativity and collaborative experiences, including, but not limited to, the creation of spatio-temporal annotations, the combination of real objects with digital content, and content sharing and reuse. Figure 1 summarises the c-Space framework, which aims to provide a disruptive technology that unleashes users’ creativity to create and use 4D content in a completely new way.

Fig. 1. Overview of the c-Space framework

The research challenges that we identified while creating dynamic projections with mobile setups based on time-of-flight (ToF) cameras and PPP technology are:

  1. RC1

    How can we exploit the real-world scene depth to create projections that adapt intelligently to different types of surfaces, so that the projected image is not perceived as distorted?

  2. RC2

    How can we seamlessly compose (i.e. efficiently define projection mappings) and share new interactions that bridge digital content and real-world objects, independently of the projector’s location and orientation?

  3. RC3

    How can affective computing and recommendation systems be integrated into a projective augmented reality system, so that we can adapt the content to the emotional state or needs of the user?

  4. RC4

    What mechanisms can be put in place to foster or promote collaboration and content sharing among users?

  5. RC5

    The use of projective interfaces exposes the end-user to the risk of projecting sensitive information by mistake, e.g. a phone number or contact list, which opens up new privacy challenges. Hence, what mechanisms can we implement to tackle the issue of sensitive information disclosure?

  6. RC6

    The invasiveness of projected content can lead to “visual pollution” or annoy other people in the vicinity. Additionally, more powerful projection devices, capable of projecting over long distances, can even be dangerous for other users. Hence, what is the social impact of PPP technology? How can we create a normative policy that regulates the use and power of PPP technology in scenarios such as streets, where drivers or passengers could be temporarily blinded by the projection?

  7. RC7

    Manually setting the projection focus raises a critical barrier to mobile content projection because users have to readjust the focus every time they move. Hence, how can we reduce the out-of-focus effect in projections when using non-laser projectors?

3 Related Work

In this section we review works that relate to our 3D projective augmented environments concept. Afterwards, we discuss in detail how our solution advances the state of the art.

We consider the existing literature to be clustered into three major categories: projector-based augmented spaces, multi-user handheld projector systems, and projector tracking.

3.1 Projector-Based Augmented Spaces

Traditional techniques to augment the world with additional information require the use of head-mounted displays [1] or a portable device serving as a “magic lens” [3]. Their weakness resides at the hardware level: the hardware is not designed for simultaneous multi-user interaction, in contrast to projection metaphors that target multi-user experiences by augmenting objects in the user environment without hampering any existing collaboration.

In 1999, Underkoffler [42] described for the first time a system that uses projection (the I/O bulb co-located projector) and video-capture techniques for distributing digitally generated graphics throughout a physical space. Later, Hereld et al. [17] described how to build projector-based tiled display systems that incorporate cameras into the environment to automate the calibration process. Afterwards, the authors of [32] investigated the use of a steerable projector to explore content projection on arbitrary indoor surfaces. In 2003, Raskar et al. introduced iLamps, a system that creates distortion-free projections on various surfaces [35]. RFIGLamps [34] extended iLamps with the possibility of creating object-adaptive projections; one of the proposed use case scenarios consisted in visually identifying products closest to their expiry date.

Prototypes that mostly target interactive tabletop experiences include Play Anywhere [48]; the work of Hang et al. [15], which takes advantage of projected displays to explore large-scale information; Bonfire, which uses several handheld projectors mounted on a laptop to extend the desktop experience to the tabletop [20]; Map Torchlight, which enables the augmentation of paper map content [38]; and Marauder’s Light, which can be used to project onto a paper map locations retrieved from Google Latitude [24].

In 2005, Blasko et al. [4] investigated possible interactions with a wrist-worn projection display; a short-throw projector was used in their lab setup to simulate the mobile projector. A few years later, Mistry et al. [27] introduced Wear Ur World, an application that relies on a portable projector, a mirror and a camera to demonstrate that mobile projection can be integrated into daily-life interactions; fiducial markers attached to the fingertips were used to improve the precision and speed of the computer vision process. More recent works include SecondLight [18], which can be used to interact with projected imagery on top of real-life objects in near real time, and OmniTouch [16], which enables interactive multi-touch applications on arbitrary surfaces by employing a wearable depth-sensing and projection system.

3.2 Multi-user Handheld Projector Systems

Modern handheld projectors can produce relatively large public displays, often considered an important requirement in many multi-user interaction scenarios, and the possibilities of multi-user interaction with handheld projectors have been an active research field. In 2005, Sugimoto et al. described an experimental system that explored the concept of overlapping two projection screens to initiate a file transfer between different devices [41]. In 2007, Cao et al. presented a wide range of multi-user interaction techniques for managing virtual workspaces that relied on motion capture systems for location tracking [6, 7]; for example, they designed interaction techniques to visualise content, define content ownership, perform content docking, and initiate transfers.

Multi-user games have also received some attention in recent years. Hosoi et al. introduced a multi-user game in which users have to guide a small robot by lining up projected pieces of track so the robot can move around [19]; the prototype used a fixed camera placed above the interaction area to enable the interaction between the handheld projectors. Another example of a game that uses projection metaphors is the multi-user jigsaw game proposed by Cao et al., where users have to pick up and fit pieces of a puzzle together [8]; in this case, the interaction between multiple handheld projectors was enabled by means of a professional motion capture system.

3.3 Projector Tracking and Interaction

2D barcode-style fiducial markers have been widely used for tracking due to their robust and fast performance. A well-known issue with this type of fiducial marker is their unnatural appearance, which is not meaningful to users. Additionally, integrating barcode-style markers into the design of interactive systems also raises resistance due to some of their properties, e.g. their fixed aesthetic and intolerance to changes in shape, material and colour.

Numerous techniques have been developed to hide or disguise fiducial markers from the user. Park et al. used invisible inks to create markers that are visible to IR cameras [31]. Grundhofer et al. investigated the use of temporal sequencing of markers with high-speed cameras and projectors [14]. In 2007, Saio et al. created custom marker patterns disguised to look like normal wallpaper [37]. The use of IR lasers to project structured-pattern-style markers was investigated in [21, 45]. Nakazato et al. used retro-reflective markers together with lights and IR cameras [28]. Other works include the projection of IR [9, 40, 44] or hybrid IR/visible-light markers [22].

The use of natural markers was also proposed as a solution to overcome some of the limitations of 2D barcode-style fiducial markers, e.g. their fixed aesthetic. For this reason, much work has focused on the development of natural-marker detection techniques that use the natural features of the object as a marker, thereby removing the need to incorporate structured marker patterns [5, 30]. The issue with natural markers is that they normally require a training step for each object to be recognised, and they are computationally more expensive than structured-marker detection techniques.

Sensor-based projection tracking designs have also been proposed in many works. Dao et al. proposed the use of fixed positions [12]. A technique presented in [36] works by making assumptions about the user’s arm position. In 2011, Willis et al. [47] described a system that used motion sensor input and an ultrasonic distance sensor for pointing-based games, which was used to study users’ pointing behaviour; a different version of the system investigated a camera-based approach, where a customised camera+projector prototype with infrared fiducial markers was used for tracking [46]. Additional vision-based device tracking methods include projector-based pointing interaction [2, 33] and interaction by casting shadows onto the projected image [11].

Most methods described in the literature either require a pre-calibrated infrastructure to be installed in the physical environment [7] or limit the interaction between participants and the projection [39]. Additionally, most systems were designed to project on flat surfaces, thereby ignoring the depth of real scenes, which leads to distortions. In our prototype, we use a vision-based approach to track user-defined AR setups – based on natural markers – that lets the user interact and spontaneously change location, as the projection automatically adjusts to changes in the projector position.

4 Methodology

In this section, we describe the methodology that was developed to solve the research challenges in Sect. 2. The explanation is provided in parallel to the description of the application workflow.

4.1 Overview

An important consideration that we must bear in mind when projecting images on a surface that is not perpendicular to the projector view is how to acquire information about scene depth, which influences the distance between two projection points. Scene depth can either be extracted automatically with computer vision techniques or, alternatively, captured with dedicated hardware such as depth cameras. Many of these techniques either require computationally intensive algorithms or require the end-user to perform additional calibration steps.

The solution that we propose in this manuscript to RC1 (How can we exploit the real-world scene depth to create projections that adapt intelligently to different types of surfaces, so that the projected image is not perceived as distorted?) is deeply interlinked with the solution that we propose to RC2 (How can we seamlessly compose (i.e. efficiently define projection mappings) and share new interactions that bridge digital content and real-world objects, independently of the projector’s location and orientation?). Our methodology does not require the computation of a depth matrix to guarantee that projections will not be distorted by the depth of the scene objects.

Scene depth is extracted automatically from the transformation matrix that is computed for each user movement, together with any user-defined information about the scene. In the next sections, we describe in detail how our optical-flow-based tracking technique achieves this in the following steps: first, we search for distinctive invariant features in the video stream; then we use a user-friendly interface to define where and how content is projected; afterwards we rely on optical-flow-based techniques to track the user movements; and finally we integrate new ways of interacting with PPP technology.

4.2 Invariant Feature Detection

In this section we explain how to detect distinctive features in images – the first step that has to be performed in order to track the position of the projector’s camera.

In order to detect distinctive features in images, we need to understand the role of unknown variables such as lens distortion, illumination, viewing angle, and so forth, in the image formation process [25]. For example, the difference in perspective between two frames becomes a significant factor when the camera baseline between the two views is large. The feature matching process requires extracting key features from images whose properties remain invariant under large differences in viewing angle and camera translation. Additionally, the features have to be discriminative if we want the scene recognition process to be robust.

In this work, we use the BRISK (Binary Robust Invariant Scalable Keypoints) algorithm [23] to detect distinctive scale- and rotation-invariant features. BRISK provides robustness and performance comparable to the well-known SURF algorithm, but at a much lower computational cost [23]. Candidate pairs between frames are selected according to their descriptor distance, which for binary BRISK descriptors is computed with a brute-force Hamming matcher. The correspondences between features in different frames can then be used to estimate the camera pose.
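
As an illustration of this step, the sketch below shows how BRISK detection and brute-force Hamming matching could be wired together with OpenCV’s Python bindings; the function name and the ratio-test threshold are our own choices for illustration, not part of the original implementation.

```python
# Minimal sketch (not the authors' implementation): BRISK detection and
# brute-force Hamming matching between two grayscale frames with OpenCV.
import cv2

def match_frames(gray_prev, gray_curr, ratio=0.8):
    """Detect BRISK features in two frames and return the distinctive matches."""
    brisk = cv2.BRISK_create()
    kp1, des1 = brisk.detectAndCompute(gray_prev, None)
    kp2, des2 = brisk.detectAndCompute(gray_curr, None)
    if des1 is None or des2 is None:
        return [], kp1, kp2

    # BRISK descriptors are binary, so the Hamming distance is the natural metric.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(des1, des2, k=2)

    # Lowe-style ratio test keeps only sufficiently distinctive correspondences.
    good = [p[0] for p in knn if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return good, kp1, kp2
```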

4.3 Optical-flow-based Tracking

In this subsection we explain how pairs of BRISK descriptors can be used to track features in the real world without the use of fiducial markers.

Given our problem statement, several actions can be taken to reduce the computation time of the feature matching step. We know a priori the regions of interest, i.e. the areas around the control points that are used to define projection mappings, so we can use that information to reduce the search space for BRISK features before computing the list of pairs. The problem of tracking a shape between two consecutive frames is considered in the literature a small-baseline tracking problem, because the transformation from the image at time \(t\) to the image at time \(t + dt\) can be modelled with a translational model, given a small \(dt\).
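
A minimal sketch of this search-space reduction, assuming rectangular regions of interest around the control points (the margin value is an illustrative choice, not the one used in our prototype):

```python
# Sketch: detect BRISK features only inside rectangular regions around the
# shape's control points, using a binary mask passed to detectAndCompute.
import cv2
import numpy as np

def detect_around_control_points(gray, control_points, margin=40):
    """Build a mask around each control point and detect BRISK features only there."""
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for (x, y) in control_points:
        x0, y0 = max(0, int(x) - margin), max(0, int(y) - margin)
        x1, y1 = min(w, int(x) + margin), min(h, int(y) + margin)
        mask[y0:y1, x0:x1] = 255  # numpy indexing is [row, col] = [y, x]

    brisk = cv2.BRISK_create()
    return brisk.detectAndCompute(gray, mask)
```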

The core of our frame-to-frame feature tracking algorithm is the computation of the optical flow for the feature points that correspond to the control points of a user-defined shape, in which the translational model corresponds to the homography matrix between two frames. A homography is a projective transformation that relates a point in the camera space to a point in the world space.

The use of the homography matrix implies that, under special circumstances, a point in the reference image frame is related by a linear relation to the point that depicts the same information in a different image frame. These circumstances hold in the case of pure rotation or when the viewed scene is planar. In such cases, the 3\(\,\times \,\)4 matrix that represents the projective relation between a 3D point and its image on the camera reduces to a 3\(\,\times \,\)3 matrix.
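
In homogeneous coordinates, the planar case can be written compactly (generic notation for illustration, not symbols taken from our implementation):

\[
s \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \qquad
H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix},
\]

where \((x, y)\) and \((x', y')\) are corresponding points in the two frames, \(s\) is an arbitrary scale factor, and \(H\) is the 3\(\,\times \,\)3 homography, defined up to scale and therefore having eight degrees of freedom.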

In our case, each shape is defined to cover only a planar surface; nevertheless, our system supports multiple shapes, which can then be used to apply content to non-planar surfaces. Thus, a homography can be used, provided that we have matching points of sufficient quantity and quality. Before we can compute the homography with OpenCV’s findHomography function, we first need to find the feature points of interest and extract their descriptors in order to find good matches. To find good matches more reliably, we compute the homography after removing outliers with the RANSAC algorithm [13]. We use RANSAC to remove features that lie on non-planar objects, thus maintaining the planarity condition even for images that contain more complex geometry than a single working plane; we can do so because of the constraint that shapes can only be projected onto planar structures. We also use RANSAC to impose the epipolar constraint between different images, which helps us to reject false matches.

The function that computes the homography requires at least four point correspondences as input; otherwise, we cannot map the points in the first image to the corresponding points in the second image. Afterwards, we apply the resulting homography (or its inverse, depending on the direction of the mapping) with OpenCV’s perspectiveTransform function to map points in the reference image to the equivalent points in the destination image. Note that the homography works well if the BRISK descriptors are well distributed inside the shape; otherwise it may be too unstable for practical applications. To avoid the propagation of errors, we do not compute the tracking between two consecutive frames; instead, we match the current frame against the oldest frame possible. For that, our algorithm keeps the oldest frame for which the homography was successfully computed and matches the current frame against it.
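
A condensed sketch of this tracking step, reusing the matching function from the previous subsection; the function name and the RANSAC reprojection threshold are illustrative assumptions, not values from our implementation:

```python
# Sketch: estimate the frame-to-frame homography with RANSAC and re-project
# the four control points of a user-defined shape into the current frame.
import cv2
import numpy as np

def track_shape(kp_ref, kp_cur, matches, shape_pts_ref, min_matches=4):
    """Return the shape's control points in the current frame (or None) plus H."""
    if len(matches) < min_matches:
        return None, None  # findHomography needs at least four correspondences

    src = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_cur[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC rejects correspondences that violate the planarity assumption.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None, None

    # Map the four reference control points into the current frame.
    pts = np.float32(shape_pts_ref).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H), H
```

In practice, kp_ref and shape_pts_ref would only be refreshed when tracking against the stored oldest frame fails, matching the error-propagation strategy described above.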

Fig. 2. Feature tracking using an invariant feature algorithm

To make the user experience smoother, we implemented a threshold filter for low-quality pairs and we use the smartphone’s sensors to estimate the new pose. Our system shows a notification to the user if the homography cannot be computed within a few iterations. Tracking can fail for two reasons: first, the initial projection surface is not suitable for tracking because we cannot extract enough key points; second, the camera baseline between the two views is too large or the surface is no longer visible. In the latter case the shape becomes invisible; once the projection surface is visible again in the image frame, tracking restarts and the shape is drawn again in the right spot. Figure 2 provides an overview of the tracking process.

4.4 Creating and Maintaining a Virtual Scene

In this section, we describe how user-defined projection mappings are created and how they are visualised on top of real-world objects. User-defined projection mappings, or simply “shapes”, have the following properties: they are always defined by four points, they have graphical content associated with them (e.g. a video, an image, or some interactive 3D content), and they can store a user-defined depth correction.
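
The shape abstraction can be summarised with a small data structure; the field names below are illustrative and not the actual class used in our implementation:

```python
# Illustrative data structure for a user-defined projection mapping ("shape").
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Shape:
    # Exactly four corner points, in the reference (editor) view.
    corners: List[Tuple[float, float]]
    # Associated graphical content: a video, an image, or interactive 3D content.
    content_uri: str
    # User-defined depth correction (slope of the target surface).
    depth_correction: float = 0.0

    def is_valid(self) -> bool:
        return len(self.corners) == 4
```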

In our application, users can drag shapes into the virtual scene and then map the vertices of a shape onto the object they want to augment, using simple drag-and-drop gestures (Fig. 3). Afterwards, users can define which content to associate with that shape and the slope of the surface – the scene depth.

The physical setup consists of a smartphone that renders the content that will augment the physical space and then sends it to the projector, mapped into the projector’s perspective. We decided on this specific setup because our goal is to test the concept of interaction with portable technology that is both ubiquitous and accessible to everyone. The use of smartphones is not fundamental, but it is extremely useful for creating and designing new augmented scenes; for example, users can use the smartphone as a lens during the process of creating a projection mapping, which facilitates the modelling process. An issue in our initial approach was that the field of view of the projection and the field of view of the camera attached to the projector (used to compute the pose of the projector) were not the same. To solve this issue, we project a marker image in each corner of the projected image and use these markers to track the field of view of the projector. In this way, we can automatically calibrate the projector and the camera attached to it, which is essential for converting smartphone views into the projection view.
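
A sketch of the resulting projector-camera registration, assuming a hypothetical detect_corner_markers helper that locates the four projected markers in the camera image and an assumed projector resolution:

```python
# Sketch: once the four projected corner markers are found in the camera image,
# a homography maps camera coordinates into projector coordinates.
import cv2
import numpy as np

PROJECTOR_W, PROJECTOR_H = 1280, 720  # assumed projector resolution

def camera_to_projector_homography(camera_frame):
    # Pixel positions of the four markers as seen by the camera (4 x 2 array).
    cam_pts = detect_corner_markers(camera_frame)  # hypothetical helper
    # Their known positions in the projector's own image plane.
    proj_pts = np.float32([[0, 0], [PROJECTOR_W, 0],
                           [PROJECTOR_W, PROJECTOR_H], [0, PROJECTOR_H]])
    H, _ = cv2.findHomography(np.float32(cam_pts), proj_pts)
    return H  # apply with cv2.perspectiveTransform(points, H)
```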

Fig. 3. Mobile editor interface

Another issue that we had to address was RC7 (How can we reduce the out-of-focus effect in projections when using non-laser projectors?). This limitation raises a critical barrier to mobile content projection, since users have to readjust the focus whenever the distance to the projection surface changes; the whole purpose of a fully automated interaction is lost if the projector does not use a laser to keep the image in focus. The problem can be solved in hardware, using a rangefinder and a closed-loop motion system consisting of a micro motor such as the piezoelectric SQUIGGLE and a non-contact position sensor such as TRACKER. Alternatively, the out-of-focus projection blur can be reduced with image-based methods like the one proposed by Oyamada [29], which is well suited to reducing image blur in non-perpendicular projections.

Our solution to RC5 (What mechanisms can we implement to tackle the issue of sensitive information disclosure?) is based on the fact that our system does not project a clone of the mobile screen. The projector is identified and used as a separate display, on which we do not render any user interface elements, as they are not needed; only the shapes defined by the user are projected, together with any other information related to the collaborative task.

In our framework, we have also integrated in-house affective computing and recommendation systems, addressing RC3 (How can affective computing and recommendation systems be integrated into a projective augmented reality system, so that we can adapt the content to the emotional state or needs of the user?). The affective computing system detects user emotions, which are then combined with the user’s preferences to filter content or to change the way the user interacts. This is especially relevant to us, as our system was originally designed for urban planning and advertising scenarios: in the same way that our mood affects the type of music we listen to, this system helps users to reach their goals faster, for instance by finding relevant or appealing products or by suggesting design alternatives.

To address RC6 (How can we create a normative policy that regulates the use and power of PPP technology in scenarios such as streets, where drivers or passengers could be temporarily blinded by the projection?), we propose a system based on image analysis. The system analyses the content of a frame in order to understand what kind of elements are present in it; architectural elements and other entities such as people and streets can thus be easily identified. To test our system, we defined a rule that interrupts projection if a street is detected. Figure 4 shows the image analysis results, in percentages, for a given image frame.

Fig. 4. Image context analysis using cloud services
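
The rule itself is simple to express. The sketch below assumes a hypothetical analyze_frame_tags wrapper around the cloud tagging service we use, together with an example tag blacklist and confidence threshold; none of these names or values come from the actual implementation:

```python
# Sketch of the normative-policy rule: stream a frame to an image-tagging
# service and interrupt projection when blacklisted tags are detected.
BLOCKED_TAGS = {"street", "road", "car", "person"}  # example policy

def projection_allowed(frame_jpeg_bytes) -> bool:
    # analyze_frame_tags is a hypothetical wrapper returning e.g. {"street": 0.91, ...}
    tags = analyze_frame_tags(frame_jpeg_bytes)
    return not any(tag in BLOCKED_TAGS and score > 0.5
                   for tag, score in tags.items())
```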

In the next section, we explain how we addressed RC4 (What mechanisms can be put in place to foster or promote collaboration and content sharing among users?).

5 Use Cases

Although our technology is applicable to a wide range of scenarios, we describe here only three use cases: an architecture and urban planning scenario, an augmented mobile advertisement scenario, and a cultural heritage tourism scenario.

5.1 Architecture and Urban Planning

It is fundamental in scenarios like architecture and urban planning to have a system for decision-making that provides an overview of information relevant to the analysis (context) together with more detailed information for the various sub-tasks of interest.

Fig. 5. Architecture and urban planning

Interaction with handheld projectors can be designed to effectively support this type of activity. For example, one projector can be held far from the projection area to create a low-resolution, coarse-granularity context view, while another handheld projector is held close to the focus region to display more detailed information, since the user can achieve higher pixel densities as the projection area shrinks. Hence, we obtain an interaction technique based on image resolution gradation that capitalises on the distance between the projection surface and the projector itself, and that enables the visualisation of multiple information granularities. The viewing experience is similar to that of a focus plus context display. Figure 5 shows a multi-granularity city map: the context region shows main streets only, while the focus region shows augmented urban information.
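
As a back-of-the-envelope illustration of this gradation (generic symbols, not measurements from our prototype):

\[
w(d) = \frac{d}{R}, \qquad \rho(d) = \frac{W}{w(d)} = \frac{W R}{d},
\]

where \(d\) is the distance to the projection surface, \(R\) the throw ratio of the projector, \(W\) its horizontal resolution in pixels, \(w(d)\) the projected image width, and \(\rho(d)\) the resulting linear pixel density. Halving the distance therefore doubles the linear density (and quadruples the pixels per unit area), which is exactly what the close-held focus projector exploits.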

Fig. 6. Snapping multiple objects

The solution we propose is more flexible than previous focus plus context solutions, where the resolution and position of both the focus and the context display are fixed; in our solution, users can dynamically move around the environment and manipulate the resolution of the projections.

5.2 Cultural Heritage Tourism

In the previous use case, we described an interaction technique based on directly blending multiple views. However, we can go a step further in terms of interaction with projected content by using the intersection of different projections as a trigger to quickly view information that involves multiple objects. For example, we can think of an interaction where multiple objects projected by different handheld projectors snap to each other when they are close enough. When snapped together, they either change their appearance to disclose additional information or trigger the visualisation of more information; to unsnap them, the user only needs to move the objects a small distance apart again (Fig. 6).

As an example, suppose there are two users projecting information: the first is projecting a map and the second is exploring the 3D model of a monument. Intersecting the projections of these two users results in the visualisation of a map with the 4D model pinpointed on it. Then a third user projects another object that has a location associated with it; the intersection with the previous projections draws a route between the location of that user’s object and the position of the monument now snapped to the map. The application supports the projection of multiple objects per projector, so the limits of this technology lie in the creativity of its users. Additionally, the linkage between objects can be used as an authentication mechanism, where data is only disclosed when two objects are projected together.

A side effect of using mobile devices to process the visual information that will be projected is that, without projectors, they can work as traditional AR tools. For example, we use the mobile application to overlay historical pictures. Figure 7 depicts a smartphone overlaying the real world with a historical image.

Fig. 7. Tracking system: overlap of a historical picture using the smartphone

5.3 Augmented Mobile Advertisement

The advertisement market can also benefit from our technology to reach out to its audiences. Our mobile prototype allows simple authoring of “augmented” advertisement content, which can then be used to generate interaction with the user within the “real” scene (Fig. 8). In the example below, a smartphone product is placed on a tabletop; in Fig. 9, the smartphone is the object being tracked by the projector’s camera. In this case, we use the occlusion of features as an action trigger: for example, if the user puts a hand over the natural features of the “more info” button, our system triggers the action associated with that button.

Fig. 8. How a product was built

Fig. 9. Interaction with menus based on feature occlusion
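
The occlusion-based trigger described above can be sketched as follows; the thresholds and the function name are illustrative assumptions, not the values used in our prototype:

```python
# Sketch: the object is still tracked overall, but the matched features inside
# the button region disappear, so we assume a hand is covering the button.
import cv2
import numpy as np

def button_occluded(matched_pts_cur, button_polygon_cur,
                    min_total=30, max_in_button=2):
    """matched_pts_cur: (x, y) matched feature positions in the current frame;
    button_polygon_cur: the button's outline, already warped into that frame."""
    if len(matched_pts_cur) < min_total:
        return False  # tracking itself is unreliable, do not trigger anything
    poly = np.float32(button_polygon_cur).reshape(-1, 1, 2)
    inside = [p for p in matched_pts_cur
              if cv2.pointPolygonTest(poly, (float(p[0]), float(p[1])), False) >= 0]
    return len(inside) <= max_in_button
```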

We have also tested projection-based metaphors as a new way of transferring content to a mobile device without requiring connectivity. This technique is especially relevant for tourists, who often depend on roaming connectivity, which can be expensive. In our setup, we used a projector to display an animated QR code and a smartphone to read the animation as a download. The transfer rate achieved is not suitable for general file transfers, but it works well for small amounts of information such as text and small images, for example information related to a product.
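
A minimal sketch of how such an animated QR stream could be generated, using the open-source qrcode Python package; the chunk size and the header format are assumptions for illustration, not the encoding used in our setup:

```python
# Sketch: split a payload into chunks, encode each chunk as one QR frame with a
# sequence header, and loop the frames on the projector.
import base64
import qrcode

def payload_to_qr_frames(payload: bytes, chunk_size: int = 256):
    encoded = base64.b64encode(payload).decode("ascii")
    chunks = [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]
    frames = []
    for idx, chunk in enumerate(chunks):
        # Header "idx/total:" lets the receiver reorder frames and detect completion.
        frames.append(qrcode.make(f"{idx}/{len(chunks)}:{chunk}"))
    return frames  # PIL images, displayed in a loop by the projecting device
```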

6 Evaluation

In this section, we evaluate the relevance and impact of this study. As part of a preliminary user study, we asked 9 individuals to experiment with our prototype. First, we demonstrated the features of the system to each participant and then invited them to try out the techniques described in the previous section. Each interaction session lasted about 30 min. During the experiment we observed how participants used the system, and afterwards we conducted individual post-study interviews.

All participants managed to grasp the basic concepts of the prototype quite quickly, and they showed no difficulty learning the projection-based interaction techniques that were proposed. As we expected, the feature reported as the most appealing was the ability to easily exchange and combine information in a shared workspace, in addition to the user-friendly approach used to set up a projection mapping.

There are, however, some technical aspects that can be improved. First, the image analysis algorithm that we use to enforce normative policies cannot be executed at a real-time frame rate: our implementation streams an image frame to an image analysis service that returns a set of tags describing the image, and the projection is automatically blocked if the projector is pointing, for example, at a car, people, or a street. In the future, we want to restrict the projection so that it never hits people in the face while still allowing projection around the person. This can be achieved with face tracking algorithms, which are already available in OpenCV.
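
A possible sketch of that future face-avoidance step, using one of OpenCV’s bundled Haar cascades; the padding value and the assumption that camera and projector frames are already registered (see the corner-marker calibration in Sect. 4.4) are ours:

```python
# Sketch: detect faces in the camera view and black out the corresponding
# regions of the frame sent to the projector.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def mask_faces(projector_frame, camera_gray, pad=20):
    faces = face_cascade.detectMultiScale(camera_gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Assumes camera and projector coordinates are already registered.
        cv2.rectangle(projector_frame,
                      (x - pad, y - pad), (x + w + pad, y + h + pad),
                      (0, 0, 0), thickness=-1)  # filled black rectangle
    return projector_frame
```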

The computational cost of the tracking operations can also be improved; at the moment, the tracking algorithm processes only 17 frames per second. The method implemented for reducing out-of-focus blur helped us to achieve better results with projectors that do not support auto-focus. Yet the focus and the projection size still need to be calibrated manually, because the focus of the projector can only be adjusted by hand; this problem can be solved with a laser-based projector.

7 Conclusion

In this paper, we explored new perspectives on augmented reality systems built around new concepts of 3D projective mapping and interaction between multiple co-located users with handheld projectors. Interpersonal communication and collaboration may be supported more intuitively and efficiently than with current handheld devices, and informal user feedback indicated that our designs are promising. Our work is the first mobile authoring system for 3D projective mapping that uses computer vision tracking techniques to facilitate the design of live projections.

Current mobile projection technology has limitations in terms of light intensity, in addition to the fact that it can only provide image focus at a particular distance. Current handheld projectors have a luminance between 5 and 100 lumens, which we believe will increase considerably in the next few years – some low-cost fixed projectors can nowadays reach 2,500 lumens. This limitation implies that handheld projectors can only be used indoors or outdoors at night. For dynamic mobile projections with a variable distance between the projection surface and the projector, we advise the use of laser projectors, which seem better suited to projecting sharp images.

As future work, we are interested in empirically investigating how interaction between people may evolve with the use of handheld projectors and how the technology is used for creative purposes. We also plan to extensively explore other ways of interacting with handheld projectors, for example by integrating gamification strategies, which may change the way people currently think about them. Finally, we will investigate improvements in transferring data with QR codes through the visualisation of animated arrays, since projected spaces have the advantage of offering large surfaces.