1 Introduction

When Snap launched their AR art platform using a giant geo-located augment of Jeff Koons’ Balloon Dog, placed in iconic Central Park, New York, they were certainly not expecting any controversy. Within days, the artwork had been hacked by Sebastian Errazuriz, a media artist from New York, who had built an AR graffiti app to deface Koons’ masterpiece. Seemingly motivated by a desire to speak out against the increasing corporate control of AR space, the artist made a political statement against the privatisation of the virtual public sphere. At the same time, he was speaking out against gamification, in the form of Snapchat’s playable media experiences such as dogface and the vomit rainbow, images which have to date appeared in tens of millions of user-generated selfies.

Koons’ Balloon Dog appeared in Central Park through the World Lens filter, a physically non-invasive, software-generated means of geo-locating visual content at a specific physical site. The placement was part of Snap’s art programme, and Koons’ Popeye was subsequently placed virtually on the steps of the Sydney Opera House (17 October 2018), causing no controversy of note. So why does the virtual placement of a famous sculpture ignite the debate on AR and public space? And can a playful low-brow social media ‘game’ become art, even with the help of one of America’s most iconic and widely revered artists? Snap contends that it can, launching a new art platform for ‘lens’ creators that asserts AR art’s right to gamify. In this context, the notion of gamification also extends to products which entice consumers to participate in their ethos through the deployment of game-like elements. In the Central Park location chosen for the virtual display of Koons’ sculpture, it would be entirely illegal to erect a billboard advertising Snap or any other company. Yet it is the confluence of proprietary software and virtual space that has produced this unique and somewhat opportunistic marketing paradigm. However, it is this same ability to insert virtual objects into public space that has fuelled quite a different paradigm, that of the AR intervention, now a well-known technique in emergent artistic practice.

In a video posted to Instagram, Mr. Errazuriz explained why he had questioned Snapchat’s collaboration. “For a company to have the freedom to GPS tag whatever they want is an enormous luxury that we should not be giving out for free,” Mr. Errazuriz said. “The virtual public space belongs to us, we should charge them rent.” The idea of privatising or monetizing virtual space is also something Forbes magazine picked up on. For companies like Snap, a space where no rent is paid is a clear opportunity for free marketing. However, it is the overlay that is owned, as well as the means to access it—in the case of the Balloon Dog virtual sculpture, via the Snap proprietary platform, and in the case of Errazuriz’s artful intervention, his own proprietary app. The space itself remains in the public domain, free to use by all, even corporations in pursuit of free exposure to fuel their privatised profits. Here, virtual public space is a resource that can be extracted and harnessed for whatever purpose.

2 AR/MR Commercial and Technical Development

In a widely accepted technical definition from Ronald Azuma, AR is any technological system which combines real and digital elements, is interactive in real time, and registers in three dimensions (Azuma 1997, p. 355). AR experience is generally framed by a display screen, often either held by or attached to human users: for example, as a smartphone or head-worn display. These range from screens on smartphones, through to heads-up displays (HUDs), or large screens designed to reflect human scale and capture human movement. A sample of recent games and entertainment applications from the commercial world illustrates how AR continues to be confined to informatic overlay design. Wikitude (2008), for example, was the first application (app) for smartphone and tablet to use a Simultaneous Localization and Mapping (SLAM) algorithm to overlay three-dimensional coordinates in geographical space, through alignment with the accelerometer and gyroscope sensors. Cartographic and geo-locational information was held on a web server and transposed to appear as localised information on the screen space of the user. In another popular and commercially successful example from the mobile game industry, the massively multiplayer game Pokémon GO (2016) invites players to collect virtual avatars (Pokémon) which battle one another, and eventually cooperate to take over virtual bases called ‘gyms’ geo-located in real space. In the game’s AR mode, ‘trainers’ attempt to capture Pokémon that are visible as a layer on a smartphone screen. The game uses geo-location to spawn and track the Pokémon (Juhász and Hochmair 2017), in combination with an informatic overlay approach. In another context, Snapchat (2016) overlays novelty augments on the faces of users, such as a dog, a dancing elf, or a rainbow. The application uses a ‘features point detection’ technique to precisely locate augments—called ‘lenses’ by the company—over the faces of users (Pawade et al. 2018). A user’s face is captured through the front-facing camera of their smartphone, placed in a ‘mirror,’ adapted through the addition of an augmented overlay, then re-presented in the screen display as an altered image stream.
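The geo-located overlay logic described above can be illustrated with a short sketch. It is not the code of any app named here: the function names, the 100 m radius and the coordinates are hypothetical, and the haversine formula is a standard great-circle approximation that I am assuming stands in for whatever positioning a given platform uses. The sketch simply decides whether an augment anchored at a latitude/longitude should be shown to a viewer standing nearby.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def augment_visible(viewer, anchor, radius_m=100.0):
    """Return True when the viewer stands within radius_m of the augment's anchor.

    viewer and anchor are (latitude, longitude) tuples in decimal degrees.
    The radius is an illustrative assumption, not a documented platform value.
    """
    return haversine_m(*viewer, *anchor) <= radius_m


# Illustrative coordinates only: a point in Central Park and one in Sydney.
central_park = (40.7829, -73.9654)
sydney = (-33.8568, 151.2153)
print(augment_visible(central_park, central_park))  # viewer at the anchor
print(augment_visible(sydney, central_park))        # viewer on another continent
```

Note that nothing in such a check refers to ownership of the site: the anchor is only a pair of coordinates held on a server, which is why the overlay and the means of access can be proprietary while the space itself is not.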

In limited ways, these more recent examples of AR engage aspects of a user’s corporeality. It could be argued that applications like Snapchat provide new senses of embodiment for the participant by shifting the relation between the camera’s image stream and a second augmented image stream. Yet bodies here are highly delimited by their interaction with a screen. Captured in this ‘magic mirror,’ the body is re-constituted via a technical apparatus as components of computational vision, where a digital replica of an area of the user’s corporeality endlessly loops without variation. Additionally, many of the designs currently deployed in the mobile AR industry proceed from the assumption that the digital screen functions as an analogue to a window. Wikitude is literally a map overlaying physical space; in Pokémon GO, the smartphone screen becomes a ‘portal’ to look through. Others, such as Snapchat, function through the trope of the magic mirror. Playful media, such as Snapchat’s World Lens products, have shifted our perception of what ‘art’ might be in the context of AR as a commercial enterprise. At the same time, more high-brow precursors, such as serious art games, have expanded our understanding of what AR might do in the public domain. In each of the apps mentioned above, AR is focussed on what happens within the frame of the screen. The following section will elaborate on the engineering techniques that make the informatic overlay possible and indicate how—in the design of commercial products—those techniques also flow into MR as an emergent field.

The apparatus that started the HMD/HUD research trajectory in MR was invented by Ivan Sutherland: his Sword of Damocles (1968) took up an entire room, had functional vision from one eye, could only be comfortably worn for a few minutes, and showed wireframe line drawings (Sutherland 1968, pp. 757–759). Yet its radical concept—that human vision might combine with digital content through a display worn on one’s head—established a research trajectory for AR/MR/VR that has since proliferated across many devices. Fundamental to this research trajectory is a pragmatic approach that aimed for greater efficiency in task completion, helpful in military and industrial contexts that embrace the notion that the human body would be enhanced by augmented vision. Current commercial and industrial approaches incorporate the latest sensing technology to convey sophisticated situated and highly contextual data aimed at adding clarity to the digital experience (such as Ohta and Tamura 2014).

Engineering handbooks are replete with a wide range of tracking techniques for AR/MR, including marker-based and image-based tracking, model targets stored in the Cloud and geo-locational information (Carmigniani and Furht 2011; Kent 2012; Craig 2013; Peddie 2017). These techniques have facilitated a plethora of AR games for smartphone, with mobile AR being the largest category of commercial MR use (as in the gamified examples mentioned above). Product launches attach data, such as a new car, to real-world objects such as cubes, while QR codes on supermarket cereal boxes aim to tempt buyers with embedded links to product websites.

3 AR/MR Through Media Art Practice and Scholarship

The approaches described in this section are neither engineering-based nor confined by earlier media concepts. Artists involved with AR/MR have manifested different techniques for augmenting space, many of which pre-date the current thinking in computer science and engineering, such as that encapsulated by the Reality-Virtuality (RV) Continuum and other similarly restrictive models. We will be examining artistic interventions that complicate both the informatic overlay approach and the notion of a ‘seamless’ connection between ‘real’ and ‘virtual’ in MR spaces. The experimental artworks surveyed in this section are sympathetic to what I will later analyse as a materialist approach to MR that focuses on relations, intra-action and the agential reality of all elements of the software assemblage. It is not intended to be a comprehensive selection of artwork in the field: rather, the artworks mentioned here reveal an interest in processes that encourage self-organisation, emergent relational forces, iterative re-assembly, and an aesthetically expanded role where audience members become participants/artists.

Inquisitive writers/practitioners from the avant-garde of media art practice have drawn attention to the need for an alternative formulation of MR. In an analysis of the collective Blast Theory’s augmented and mixed reality artwork, Steve Benford and Gabriella Giannachi note that Milgram and Kishino’s Reality-Virtuality Continuum might be more useful if it were ‘more rhizomatic’, since the classification system tends to place physical and virtual in opposition to one another rather than fostering a more relational system (Benford and Giannachi 2011, p. 3). They describe the Reality-Virtuality Continuum as a ‘largely mathematical and technology-centric’ method of ‘constructing virtual spaces’ in order to align them with physical space (2011, p. 43). In response, they offer the notion of ‘trajectories’, where the participant enacting an artwork moves experientially through real and virtual online worlds that are partially pre-scripted, and partially self-generated. The pioneering MR participatory performances Uncle Roy All Around You (2003) and FlyPad (2009) by Blast Theory weave together theatrical performance, online and real-world environments, audience participation, role-playing, and data extraction in complex arrangements that unfold mutually across the digital as well as the physical (Benford and Giannachi 2011). Carving out a new genre of MR performance, Blast Theory’s contribution to a performative MR advances a less digitally privileged mix of realities, where participants are given cues by the artworks that send them off on exploratory trajectories that pass through ‘hybrid space’. Referencing Gilles Deleuze’s notion of the ‘fold’, they describe hybrid space as:

composed of different, adjacent, “enfolding” spaces, simultaneously occupying different points on the mixed reality continuum, which remain, however, in a heterogenous, discontinuous, unsynthesized, and changing relationship with one another (Benford and Giannachi 2011, p. 45).

For example, in FlyPad (2009), a camera placed over a gallery’s atrium framed a wide orthographic view of the space below. In this frame—a ‘flying area’—visitors were able to take on the identity of a winged avatar (a brightly coloured insect), whose movement they controlled using a footpad. Flying across and through the atrium, the flyers could join together, performing new movements by melding their avatars, or remain separate but fly with less vigour. Prior to playing the game, as they walked around the gallery, data was extracted from participants’ movement by way of a Radio Frequency Identification Device (RFID) tag, and this data was added to their digital avatar in the FlyPad game. Incorporating a participant-generated trajectory with pre-figured elements produced an iterative artwork that could never unfold the same way twice (2011, pp. 138–141). While Blast Theory’s work does not directly relate to my practice-based approach, since it relates more to the potential of MR as a transmedia storytelling medium, it illuminates the need for more performative versions of MR, influenced by relational assemblages rather than technocentric formulations.

Following the thread of dissent toward technology-driven methods that occlude the body and align corporeality with a pre-figured data system leads us to artist-generated approaches that use different techniques to combine corporeality with augmented digital worlds. In John McCormick and Adam Nash’s Reproduction—an artificially evolving performative digital ecology (2011), autonomous agents helped to generate embodied relations that re-sited image/colour data from human interactors to an adapting digital ‘life-form’. Extracted using motion capture technology, the results of these hybrid reproductions—as they adapt in real time in response to feedback—were presented as a full-scale projection in an immersive room. The impact of this mutation on the visitor is discussed as provoking a multi-sensory state of ‘contemplative interaction’ (Riley and Innocent 2014; Riley and Nash 2014), and it is noted that this understanding differs from conventions that frequently situate MR as either ‘reactive or distracting’ (Riley and Nash 2014, p. 260). Contemplative interaction offers itself as a method that examines ‘notions of affect that relate bodies, locations, spaces and codes across the physical and virtual’ (Riley and Nash 2014, p. 263). This is, again, a much more complex figuration than that offered by the Reality-Virtuality Continuum, and one that allies itself with Deleuzian notions of affect and embodiment (Riley and Nash 2014, p. 261).

Sandbox: Relational Architecture 17 (2010) by Rafael Lozano-Hemmer used augmented projections to form spontaneous visual connections between people at Santa Monica Beach. An area of 740 m² of the beach had been prepared in advance with a tracking system, surveillance cameras, and projectors that cast real-time images of giant hands across its surface. These giant hands were magnifications of human hands playing in one of two 69 × 93 cm sandboxes nearby. At the same time, participants on the beach were captured by the same tracking/surveillance/projection system, with their full body images shrunk to a tiny scale. Because these images were re-projected into the sandbox, people there were able to play with human miniatures using their hands. Choices emerged: participants could join the sandbox and see their own hands projected as giants; or they could remain in the expansive beach space and have their own full body images recorded, shrunk, and then projected back to the sandbox for hands there to explore.

As physical bodies passed by one another—in the illuminated darkness of the tactile and tractional sand—they passed over and against digital augments as light projections of the bodies of others. This relay of projections formed a recursive loop where new relations of overlapping bodies spontaneously and temporarily emerged, in oscillation between media environment and natural environment, corporeal bodies and projected bodies. Light gave these augmented bodies presence, through the relations it formed with other surfaces: the augmented sandbox, the participants’ skin, the luminous projection system, the tracking and surveillance software.

This is not Lozano-Hemmer’s first installation leveraging AR technology in a complex relational system. Indeed, it is clear that Lozano-Hemmer was a pioneer in the creative use of augmented material in media art. In Underscan (2007) as well as Body Movies (2002), an earlier version of the same tracking system used in Sandbox was deployed to track participants and activate augmented video that passed under their shadows as they walked. Ulrik Ekman has analysed emergence and embodiment in relation to Underscan:

This happens via the virtualization of their bodies, but also via the emergence of a complementary interplay, in their embodiments and the real environment, among locative media, signaletic telepresence, and ubiquitous computing (2012, p. 18).

Ekman notices a blurring of distinctions, such as between public and private space via the virtualization of bodies. Like Sandbox, the technical operation of Underscan involved luring people into a pre-prepared space embedded with sensors and surveillance equipment, where their movements would be tracked and used to trigger corresponding movement image sequences. In the case of Underscan, the projected sequences were from pre-recorded films, deployed on the pavement as augmented overlays. In the case of Sandbox, the projections on the sand unfold in real time, reflecting advances in tracking technology made in the seven years between these works. Looking over to the sandbox from the sand, participants could see others playing with their images, and they could do the same with the large-scale hands projected on them.

Sandbox’s project description lays bare the power relations and affective conjunctions of bodies (human and non-human) embedded in this artwork:

The project uses ominous infrared surveillance equipment not unlike what might be found at the US-Mexico border to track illegal immigrants, or at a shopping mall to track teenagers. These images are amplified by digital cinema projectors which create an animated topology over the beach, making tangible the power asymmetry inherent in technologies of amplification.

This ‘animated topology’ approached power relations as matters of scale: giant hands manipulated tiny people, yet the tiny people were also able to overturn that relation and become the giant hands. Spaces are alluring and carefully composed, drawing humans toward their dynamism—such as movements of light and data—latent with affective potential. DeLanda (2007) suggests that human experience in Lozano-Hemmer’s artworks is largely produced through ‘expressive spaces’ activated by underlying nonhuman energies. In such spaces, code is not simply executable but is affective, ‘long series of ones and zeros that ultimately embody the software animating the hardware’ (DeLanda 2007, p. 104). Establishing a giant structure populated with surveillance cameras, a bespoke tracking system, and ultra-high power projectors, Lozano-Hemmer generated an expressive space for humans to affectively perform within.

All the artist-led approaches described above convey the idea that the physical and digital spaces of MR are actually not separate but meet through oscillatory movements of bodies and data. In such assemblages, the materiality of the digital is not only affective, but further suggests that senses of embodiment operate between human and nonhuman. This affords the broad perspective that data and the corporeal might mutually co-constitute one another. Clearly, such approaches are markedly different from the engineering and computer science paradigms discussed earlier in this article, which would fix MR as a matter of technical arrangements on a display screen.

Lanfranco Aceti’s and Richard Rinehart’s (2013) edited special edition of Leonardo Journal, Not Here Not There, was the first comprehensive survey of AR/MR as an artistic category. Framing a quote from the Manifest.AR collective, Rinehart summarizes some of the provocative issues raised by mobile AR:

Sited art and intervention art meet in the art of the trespass. What is our current relationship to the sites we live in? What representational strategies are contemporary artists using to engage sites? How are sites politically activated? (Aceti and Rinehart 2013, p. 9)

This collection focussed on mobile AR deployed through geo-location, since this was a popular artistic movement at the time. Soon after Aceti and Rinehart’s collection, Augmented Reality Art: From an Emerging Technology to a Novel Creative Medium (2014), edited by Vladimir Geroimenko, became the first book to systematically analyse the artistic threads coming out of AR as a new medium. Contextualising that study, Geroimenko reproduced in full the Manifest.AR manifesto (Freeman et al. 2012), leading him to argue that what differentiates AR from other emergent media forms such as ‘virtual reality, Web art, video, and physical computing’ is that it is bound up with an activist politics that re-purposes technologies like mobile phones as radical artmaking devices (Geroimenko 2014, p. vii). Moving away from the restrictions imposed on AR as a medium defined by the informatic overlay or by remediating older media, artists working with mobile AR have forged new critical pathways, such as those described in Geroimenko’s collection. In media art, mobile AR—popularized by commercial products such as the smartphone games described earlier—takes geo-location to activist contexts.

Mobile augmented reality art (MARt) emerged as a cultural force in AR from about 2010, with the influential Manifest.AR group founded on January 25th, 2011, following their ground-breaking guerrilla exhibition/intervention in the Museum of Modern Art, New York, We AR in MoMA (2010). There, organisers Mark Skwarek and Sander Veenhof conspired to stage an exhibition of augmented art without the permission of the gallery, and got away with it. With the organisers holding tours of their artworks, the show went under the radar of the gallery authorities, and inaugurated a new movement in activist and interventionist installation. Subsequently, the group staged other interventions where geo-location was used to surreptitiously place augments at canonical and politically loaded sites, such as outside the New York Stock Exchange during Occupy Wall Street (Skwarek 2014, p. 3), and at the Venice Biennale 2011 and 2013 (Thiel 2011, 2014). During Occupy Wall Street, the area in front of the New York Stock Exchange was off limits for protestors, yet the Manifest.AR group were able to stage a ‘flash mob’ waving smartphones rather than obvious signage (Skwarek 2014, p. 17). While augments still operate as informatic overlays, they are skewed to critical ends by an activist culture. A deeper knowledge of how information operates via layering in AR also featured in some of this work. For example, by hosting their work on private ‘layers’ in the app Layar, Manifest.AR avoided the marketing noise made by commercial mediatic assemblages. Such strategic uses of augments in conjunction with mobile devices not only encouraged a critical turn in thinking about the informatic overlay, but also implied a more extensive concept of embodiment.

Moving around a site to discover augments on a smartphone involves a trajectory through physical space that engages meanings that are both pre-existing and inscribed. Space is not inert, waiting to be written on by the artist/activist: space can be considered as expressive, as will be examined shortly (DeLanda 2007). Re-thought as affective toward its human inhabitants, space can be conceived as a force of the nonhuman that makes connections with embodied actions, generating affects that adapt behaviours and practices. Artworks using mobile AR that likewise imply that space might be expressive, and that work alongside the embodied actions of participants to co-compose the experience, are Tamiko Thiel and Will Pappenheimer’s Biomer Skelters (Liverpool 2013 and various iterations), and Janet Cardiff and George Bures Miller’s The City of Forking Paths (Sydney, 2014–2017).

In Tamiko Thiel and Will Pappenheimer’s Biomer Skelters (Liverpool 2013 and various iterations), a participant walks around a pre-determined area of a city with their smartphone. As they move the camera sensor, they see an array of digital plants appear on their phone screen. The participant holds a Zephyr heart rate monitor to connect a bespoke smartphone app to their heartbeat. The frequency of the signal generated by the beating rhythm of their heart is converted into the augments—virtual plants—that populate the ‘biome’, a term borrowed from biology that describes a community of plants and animals living together in a ‘congruous ecology’ (Woodward 2009, p. 2). In this case, the biome is an urban landscape populated by data, plants and bodies.
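The conversion of a heartbeat signal into virtual plants can be pictured with a minimal sketch. This is not Thiel and Pappenheimer’s implementation, which the sources cited here do not document at code level: the function name, the resting and maximum rates, the cap of twelve plants, and the assumption that a faster heartbeat yields more plants are all hypothetical and chosen only to make the idea of a bodily signal driving augments concrete.

```python
def plants_to_spawn(bpm, resting_bpm=60.0, max_bpm=180.0, max_plants=12):
    """Map a heart rate in beats per minute to a number of virtual plants.

    All parameters are illustrative assumptions. The rate is normalised to
    the range 0..1 between resting_bpm and max_bpm, clamped, then scaled
    to a whole number of plants.
    """
    level = (bpm - resting_bpm) / (max_bpm - resting_bpm)
    level = min(1.0, max(0.0, level))  # clamp readings outside the range
    return round(level * max_plants)


# A walker whose pulse rises during the walk would populate more of the biome.
for reading in (55, 90, 120, 200):
    print(reading, plants_to_spawn(reading))
```

Even in so reduced a form, the mapping shows that the augment is not fixed content delivered to a screen: what appears depends on a physiological state that changes with the act of walking.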

The concept of combining an art-game form and an affective computing network with algorithmic botany produced through AR marks a physiological turn toward embodied action in mobile AR. Operating as a self-organising system tethered to the physical activity of walking—hence conjoining physical world with digital via enactment—the participant of this art-game becomes a vital part in speculatively generating a ‘natural’ rejuvenation of the city. As the game never unfolds the same way twice, each experience is highly differentiated and multiple meanings layer on top of one another at the same geographical sites. During this active movement across an urban landscape, tangible changes are made by the participant, and each player is involved in a lively botanical re-inscription of the city (Wright 2018a). Members of the public are accorded a meaningful role as ecological change makers in their own community. Beyond the game, Thiel and Pappenheimer’s real-time re-assembly of a digital biome across the topology of urban physical space perhaps contributes to a shift in thinking about urban design, introducing new design possibilities for the somewhat homogeneous urban ecology of the contemporary city.

In opposition to the more mainstream uses of augmentation, Lev Manovich has cited artist Janet Cardiff’s audio walks (dating back to 2005) as an exemplar of the poetic deployment of ubiquitous technology:

Their power lies in the interactions between the two spaces—between vision and hearing (what users are seeing and hearing), and between present and past… (Manovich 2006, p. 226).

By directing the participant toward conflicting perceptual zones, Cardiff’s work shifts conventional preconceptions about that space, transforming relations between participant and site. Janet Cardiff and George Bures Miller’s The City of Forking Paths (Sydney, 2014–2017) places the participant in a situation where they must follow the audio-visual logic of an AR-embedded video, along the exact cartography set out by the narrative. Participants play a video on their smartphone and follow along with the artists’ shamanic audio-visual narrative as it meanders through The Rocks district in Sydney. Required to trace multiple narrative flows at the same time, the participant must attune to the work as it unfolds: the video stream playing constantly on the phone’s screen; the binaurally recorded narrative playing through headphones; and the parallel ‘reality’ of the street experience during the walk (Wright 2018b). These nuances are less narrative than embodied: participants must pause and sit, walk, take multiple turns, follow a tunnel under a street, and so forth. Confronted with at times startling imagery on-screen—a phalanx of office workers dimly lit with mobile phones, a gagged man with duct tape over his mouth wearing a straitjacket in Miller’s Point—the participant must perceptually negotiate a parallel flow of experience between physical and digital. If they deviate from the ‘forking paths,’ they lose their place; for example, by taking a turn down the wrong street, they are cut adrift from the artwork, and the co-emergent experience of physical and digital is broken. In this way, the work operates alongside each person’s sensory apprehensions and habits, foregrounding the role of the body in producing an MR experience rather than being framed by technical constraints such as screen, transmission and resolution.

In these mobile AR artworks, the movements of and between participant and smartphone work together to extend the augmented space into geographical space. Highlighting the embedded physical entanglement between mobile device and user, physical and digital mobilities extend the augmented experience toward a more complex and heterogeneous material assemblage that does not exclusively reside in screen space. Art experiences that interpolate the performer or participant in extensive (outside the frame) and intensive (sensorial) compositional modes explore AR from the standpoint of an embodied intra-action engaging participant, augments and ecology.

However, as I have argued, the application of categories, taxonomies or criteria to delineate and create inclusions or exclusions, would restrict MR discourse and practice in media art. Such an approach would offer limited chances for new senses of embodiment, restrict a participant’s sense of agency in digital space, and overly program the shape of the MR experience for participants/performers. The informatic overlay approach was traced through diverse examples and revealed as a specific design approach where augments convey informatic content, shaped by Milgram and Kishino’s taxonomy in co-operation with other programmatic mechanisms. The problems caused by a taxonomic understanding of MR were not solely concerned with the informatic overlay. They also related to the hardware devices that materialized digital augments—such as the HUD and the smartphone—as well as the design practices applied to the deployment of augmented material.

Closer to the interests of this study are MR experiences as bespoke relational arrangements or assemblages, manifest in the work of the artists discussed here, such as Blast Theory, Rafael Lozano-Hemmer, McCormick and Nash, Thiel and Pappenheimer, as well as Cardiff and Miller. Through attention to the various modes of augmented interfacing expressed in these new artistic paradigms, we have explored AR’s capacity to generate interesting arrangements with code and bodies, encouraging experiences that beckon new modalities of embodiment.

As we saw through the analysis of Sandbox (2010), for example, augments must be given due consideration as nonhuman forces that prehend affect and beckon human bodies toward senses of embodiment that are spontaneous and emergent with a computational, algorithmic, and augmented network assemblage. Such a network assemblage has been configured using software as a responsive agent that co-creates in hybrid spaces and blended time scales with human interlocutors. Furthermore, I have argued that the ‘media assemblages’ (Fuller 2005, p. 13) that have spawned the contemporary approaches to AR/MR discussed in this chapter are always part of broader technosocial shifts; for example, the re-purposing of the phone as an entire medial space and a space for the emergence of new social behaviours. If we only see AR/MR as an information layer, we miss its capacity to provoke multimodal perceptions, and we miss the affective new senses of embodiment that emerge via the nuanced shifts in user/participant behaviours. Considering AR/MR as an aspect of algorithmic cultures in general, rather than as a purely technical medium for delivering overlaid content via a technical device, affords an approach that examines AR/MR beyond the informatic overlay. In fact, AR/MR is a kind of software assemblage that co-composes a set of evolving cultural behaviours and actions, taking a defining role in the shifting materiality of algorithmic culture.