1 Introduction

Since its emergence as an art medium, Augmented Reality has developed as a number of evidential sites. As an extension of virtual media, it merges real-time pattern recognition with media, finally realizing the fantasies of William Gibson through goggles or handheld devices. This creates a welding of a form of perceptual vision and virtual reality, or optically registered simulation overlaid upon actual spatial environments. And even though AR-based works can be traced back to the late 1990s, much of this work required at least an intermediate understanding of coding and tethered imaging equipment, from webcams to goggles. It was not until the advent of marker-based AR with lower barriers to entry, as well as geolocational AR-based media on handheld devices and tablets, that Augmented Reality as an art medium would begin to propagate. While one can argue that much AR-based art is a convergence between handheld device art and Virtual Reality, there are gestures specific to Augmented Reality that allow for its specificity as a genre. In this examination, we will look at some historical examples of AR, and critical issues of the AR-based gesture, such as compounding of the gaze, problematizing the retinal, and the representational issues of informatics overlays. This also generates five gestural vectors analogous to the four defined in The Translation of Virtual Art (see: Lichty 2014a, 445), which we will examine through case studies. Through these case studies, historical and recent to the time of this publication, we may determine the issues of the gestures and aesthetics of AR.

2 The Gaze, the Overlay, and the Retinal

In the creation and “performance” of AR works, there are often two actions in place in relation to the user, and those are of gaze and gesture/positionality. The reason why I separate the two, although related, is that in the five modalities/gestures that I wish to discuss (Fiducial, Planar, Locative, Environmental, and Embodied), each has different relationships between the user, the augment, and the environment. That is, in the experiencing/performance of AR, there is placement of one or many elements between the eye and the recognized target, and the gaze of the agent in experiencing the piece. I will refer to the AR media in question as a “piece” or “installation,” as the bulk of this discussion has to do with art, but some exceptional commercial examples will be included. In The Translation of Virtual Art, I defined the gestural lines of intent, or “vectoral gestures” as being a line of flight between the origin of the work and the site of the intended audience. These consisted of four modalities, being wholly in the physical or virtual, or gesturing from one to the other (or a combination). AR is a different set of configurations.

The difference inherent in AR from VR is that while there is virtual content, that content is overlaid upon a visual representation of the physical. It would be simple to theorize an intermediate plane of representation between the viewer and the target as in the case of the Planar modality, but unfortunately, AR is not that straightforward. Depending on said modality, there could be a space matrix of Locative or interactive media, a space imposed on a marker, as well as one or more spatial planes between the viewer and the target (as in print, which I discuss as the Fiducial and Planar). In addition, there are a number of cases in which modalities overlap strongly (Fiducial/Environmental, Embodied/Planar, etc., as I hope to show later).

AR consists of a space of positional overlays, whether Locative or recognized, and a performative gestural gaze, especially in the case of headset or handheld/tablet works, as we will observe in Darf Design’s Hermaton. In addition, I would like to put forth a proposition regarding Duchamp’s idea of the “retinal” and an argument for his Fountain being a predecessor to Augmented art in 1917 with his addition of the signature (Craft 2012, 202). The famous entry of Duchamp’s inverted porcelain urinal as a work of art inverts the notion of the art object, but his signature of “R. Mutt” as a form of augment to the gesture would echo with Manifest.AR’s interventions into art spaces like the MoMA and Guggenheim. This comes into play only after considering notions of the gaze and of what I will call overlay-space. Before aiming a camera of any sort in media art, the argument of the “gaze” emerges in critical discourse.

In order to address the notion of the lensed or gestural view (and perhaps I combine the two a little casually, but they are linked in the case of AR), Laura Mulvey’s seminal essay, Visual Pleasure and Narrative Cinema (Mulvey 2004, 837–849), comes to mind. In it, she established the concept of the all-objectifying “male gaze” that gendered the vector of the film lens as one between the subject (female) and objectifier (male). However, with the pervasiveness of personal imaging through mobile devices, Queer Theory and other theoretical frameworks have complicated this discourse. It is for this reason that I feel that the gaze has been democratized, yet remains manufactured by hegemony, and that the “Queering” of augmented space deserves its own essay (I am surprised that so little has been written on it to date). As such, I feel it is beyond the scope of this humble musing, but I will touch on the subject momentarily as an invitation for further discussion.

Since the writing of Visual Pleasure and Narrative Cinema, a number of aspects of the human use of imaging equipment have come to complicate the gendered subject/object relation. The first, and perhaps an alternative strategy to Mulveyan discourse, is that of personalization of the gaze. With the rise of personal imaging devices, such as iPads and smartphones, the politics of the gaze is bifurcated between the (relatively) “democratized” operator and the hegemonic institution of the manufacturer. While I feel it is more germane to consider the role of the operator in creating the gaze-vector or line of sight of the gaze, the manufacturer is important as well. For it is the manufacturer that designs, and if one still believes Bauhaus ideas of form and function, it also frames the narrative discourse of the device itself. And as a male-dominant culture, technology may reify Mulvey’s assertion of a phallocentric gaze, even in AR, but this may shift in that the design field is more gender equal than Silicon Valley culture. The approval of the design by the manufacturer reinscribes the agenda of the device, and here I believe Mulvey still wields much power. However, my first notion of the locus of the operator is where this discourse diverges from gendered film theory (or at least Mulveyan discourse).

3 Queering Augmentation

The closest approach to the notion of queering augmented space comes from identity-altering apps such as Meitu and FaceApp, and social media using overlay technology such as Snapchat and Facebook Messenger. Meitu’s “Cutifying” function places the user in a sentimental environment, enlarging the eyes, whitening the skin, and adding lipstick to the face. While the gesture of cuteness can be a symbol of endearment, attraction, or latent aggression as Ngai has suggested (Ngai and Adam 2012), online discussions have also questioned the embedding of racist narratives. However, as I have written in The Mutant Cute (Lichty 2017), the fact that Meitu is a Chinese program brings forth issues of politics and socioeconomics. This relates to the fact that the effects of Meitu are as much about Asian notions of class, thus making the politics of filtering and augmentation increasingly complex. Likewise, the gender/age switching of FaceApp is also problematic, as the results carry certain gender stereotypes of attractiveness and feature stereotypes for youth and age. But these are more filters than augments. Facebook and Snapchat, as such, are some of the first social media sites to incorporate augmentation into their apps. Snapchat, being the first of the two to offer social augmentation, would transform its users into demons or puppies, or crown them with fairy tiaras. In fact, my attention was drawn to it by the breadth of users, from porn stars to famous art curators. Personal augmentation creates an “other” space that calls into question gender, race, and species. It is notable that the alterity of personal semiotic space is expanding under the visual regimes of augmentation, and I call for more research in this area.

4 The Semiotics of AR

The semiotic space of AR is peculiar in that it is a potentially fluid one, dependent on any number of factors. Depending on modality (Fiducial, Planar, Locative, Environmental, or Embodied), the relationship of the viewer’s position to the subject can be quite relative, interactive, or locative. Consider, for example, a user in a geolocative installation with an iPad. Any media is relative to the viewer’s location, point of view, and how the infoset overlays itself on the “picture plane” of reality as represented by the device’s camera and the AR application. If that interactive media is intrinsically dynamic, the chain of signification separates from what Duchamp called the merely “retinal” and becomes haptic as well. The relationship of the viewer, landscape, and media infoset compounds the point of view through multiple points of interest (POIs) in the landscape, sliding into a Massumian constant state of becoming (Massumi 2002, 37), as the relation of the viewer and the multiple planes of subject constantly reconfigure into their new positionality. This is, at least in the case of locational and interactive AR, the problem of the fluidity of becoming-signification in relation to the landscape/mise en scene. In the case of the Planar mode of augmentation, the target is often static and the relation is a simple overlay of the augment over the given recognized signifier. Now that I have alluded to the complexities of the relation to media in augmented spaces, their modalities are subject to study.

5 The Structure of the Gesture in Augmented Reality Art: Fiducial, Planar, Locative/GPS, Environmental, and Embodied/Wearable

Augmented art is actually a catchphrase for a number of different technologies for overlaying virtual content on actual scenery since the term’s coinage by Caudell and Mizell at Boeing in 1992 (Caudell and Mizell 1992, 659–669). In this chapter, I will propose five categories of augmentation, and if any are overlooked, I hope it will be because of new developments since this writing. These techniques consist of the five categories mentioned above: Fiducial, Planar, Locative/GPS, Environmental/Spatial, and Embodied/Wearable. While some of these categories overlap or may have indistinct boundaries, such as the intersection of Fiducial and Planar recognition, it is hoped that they give the critical scholar studying augmentation a discursive toolset. Each of these modalities situates the viewer, content, and overlaid environment in ways that create specific gestures of media delivery.

When speaking about gestures in AR, I reference two of my other essays that take a similar analytical approach to examining situations involving virtual media, The Translation of Virtual Art (Lichty 2014a, 444–462), dealing with art in virtual reality, and Art in the Age of Dataflow (Lichty 2013a, 143–157), which examines the development of electronic literature since Joseph Frank’s theorizing the notion of Spatial Literature in the 1940s (Frank 1991). My contention is that there is an origin, content, and Arakawa and Gins’ concept of a “landing site” (Hughes 2012) for the augmented gesture, which is a destination in a process of communication, but not necessarily a basic sign/signifier relationship. The reason for this is that in AR, although there can be these simpler situations between the viewer and media, like Planar recognition calling forth video overlays, there are others such as dynamic media in GPS-based/Locative installations. These include AR like Richard Humann’s Ascension and Pappenheimer/Brady’s Watch the Sky, which I will discuss in the Environmental section. As in The Translation of Virtual Art, the AR gesture varies in its relationship between origin and receiver, from double signification in the case of Fiducial and Planar, to a dynamic semiotic matrix of constant becoming-meaning in the case of GPS/Locative applications. What I will attempt to do is to progress from a more basic/historical framing of AR mediations and 2D situations, unpacking the gesture into more complex sites of engagement, with the understanding that there will be some examples that overlap and double themselves within my categories. These categories are presented as propositions that are used as “handles” from which a discussion of the different forms of augmentation can be formed.

The “gesture”, as specific to AR, consists of a line of attention/flight between the interactor and the superimposed media overlaid on the given environment, such as attention given to a piece of media situated in 3-space, or by orientation as in the case of Fiducial tracking. As one can imagine, the semiotic relationship between the interactor, the environment, and the augment becomes complex, as simple media overlays give way to multi-faceted interactive experiences and to dynamic augmented spaces that can be updated on the fly.

5.1 Fiducial AR

One of the earlier forms of Augmented Reality is that which uses a specific digital, or Fiducial, marker that gives a unique signature to an object “seen” by a computer camera. This was the primary form of tracking for the works I first saw in the mid-to-late 1990s, especially the work using the ARToolKit and the work coming from ATR Kyoto. The Fiducial marker gives information for six degrees of freedom (XYZ position, plus pitch, roll, and yaw) and locates the AR content easily in 3-space. My first introduction to AR was Berry and Poupyrev’s Augmented Groove (Berry and Poupyrev 1999), developed at the ATR Kyoto research lab (Fig. 6.1). This work was, in essence, an augmented DJ station in which participants could make audiovisual mixes through the manipulation of vinyl albums with Fiducial markers printed on them. In the documentary video, the user is presented with a character sitting atop the dial on the record, which changes orientation/values through tilt, rotation, etc. As Berry and Poupyrev write in the work’s statement: “The performer modulates and mixes compositions by manipulating real LP records. The motions of the records control filters, effects, and samples dynamically mixed in and out of the groove. A composer can assign any element of composition to any record, and simply removing one record and bringing in another controls the song progression. Effects, filters and sample triggering are all assigned to any of the four record movements and can be controlled interactively using simple physical records rather than numerous dials and sliders” (Kaltenbrunner 20032014).

Fig. 6.1 Augmented Groove (Berry and Poupyrev 1999)
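
For readers who want a concrete picture of what a tracked marker supplies, the following is a minimal sketch in Python. It is not drawn from ARToolKit or from any of the works discussed; the function names, the angle convention (yaw about Z, then pitch about Y, then roll about X), and the camera-centered coordinate frame are all assumptions made for illustration. It shows how a position plus three rotation angles is enough to place a point of virtual content, defined relative to the marker, into the space seen by the camera.

```python
import math


def rotation_matrix(yaw, pitch, roll):
    """Return a 3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll).

    Angles are in radians. The axis convention is an assumption for this
    sketch; real tracking libraries each define their own.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def place_on_marker(local_point, position, yaw, pitch, roll):
    """Map a point given in marker coordinates into camera coordinates.

    `position` is the marker's XYZ location as seen by the camera; together
    with the three angles it forms the six values a tracked marker supplies.
    """
    r = rotation_matrix(yaw, pitch, roll)
    return tuple(
        sum(r[i][j] * local_point[j] for j in range(3)) + position[i]
        for i in range(3)
    )


# Hypothetical usage: a character standing one unit along the marker's X axis,
# on a record that sits five units from the camera and is turned a quarter turn.
character = place_on_marker((1.0, 0.0, 0.0), (0.0, 0.0, 5.0), math.pi / 2, 0.0, 0.0)
```

In a piece like Augmented Groove, the same six values could just as well be read as control signals rather than as a placement for imagery, which is the sense in which tilt and rotation of a record become musical parameters.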

Considering this work was conceived in 1999, it radically predates environments like the Music Technology Group’s Reactable in Fig. 6.2 (Jorda et al. 2005) and the work on “Hybrid UIs” being done by Feiner et al. at Columbia (Sandor et al. 2005). Augmented Groove used an overhead camera, as opposed to the Reactable’s use of cameras underneath a translucent table, as in the case of the Microsoft Surface tabletop computers. Feiner et al.’s Hybrid UI interface uses a combination of Microsoft Hololens, Leap Motion controller, and Perceptive Pixel desktop computer (formerly Microsoft Surface) to allow images to be used as markers for a vertical interface structure on the table from which the user can manually pick a hologram from the overlaid interface. Augmented Groove showed the use of Fiducial markers as controls, but one of the more popular demonstrations of 3D overlaid media would emerge through videos of ARToolKit proofs of concept using a particular animated character.

Fig. 6.2 Reactable (Jorda et al. 2005)

5.2 Fiducial AR: The Emergence of Miku

This viral example of a pop-cultural Fiducial AR application is the fusion of the free program Miku Miku Dance and ARToolKit. To understand the confluence of elements that led to the profusion of videos of “anime” character Hatsune Miku dancing on Fiducial marker cards, a little cultural unpacking is in order.

ARToolKit is the product of Hirokazu Kato of the Nara Institute of Science and Technology in Japan, created in 1999. However, it took 2 years for it to be released by the University of Washington’s HIT Lab, with over 150,000 downloads from SourceForge.net, according to that site’s statistical tracking (Kato and Billinghurst 1999). It is a series of libraries allowing programmers to orient media to a Fiducial marker relative to its appearance through a webcam or other optical input device. By the mid-2000s eligible media included animated 3D content as seen in Fig. 6.3, which leads to the Japanese virtual pop idol, Hatsune Miku.

Fig. 6.3 Miku Hatsune AR, late 2000s

In many ways, Hatsune Miku is the realization of William Gibson’s autonomous virtual pop idol Rei Toei from his Bridge Trilogy (Williams 2012) in that “she” was released as a character representing a text-to-song program called Vocaloid (Vocaloid.com 2014) by the company Crypton, released in 2007. Based on text-to-speech technology developed by Yamaha, Hatsune Miku is the first of a series of Vocaloids to utilize granular synthesis of sampled vocalists (Miku being modeled from the voice of Saki Fujita). What would follow is a series of music videos, especially after the release of Miku Miku Dance, a character animation program starring Vocaloid characters, released in 2008. High points for the Augmented persona in real space would be large-scale music concerts using imagery developed by UK company Musion, which would also reflect Digital Domain’s Virtual Tupac spectacle at Coachella 2012 (Fig. 6.4; Verrier 2012), and large-scale performances based on the Miku genre, such as “Still Be Here” at Berlin’s Transmediale festival in 2016 (Fig. 6.5).

Fig. 6.4 Virtual Tupac (Image courtesy Digital Domain 2012)

Fig. 6.5 Miku Hatsune Still Be Here (Courtesy Ars Electronica)

The virality of the Miku/Vocaloid technology made her an ideal subject for an AR companion. Since 2009, numerous Hatsune Miku demos based on Fiducial markers on paddles have arisen, even to the point of applications using the Oculus Rift headset to let you “live” with or sleep alongside Miku. The latter is more in the realm of what this essay terms the Environmental or even Embodied/Wearable gesture of AR. This is a step more advanced than the GPS/Geolocative, placing the augment in space through environmental feature recognition rather than accessing an external GPS database of Points of Interest (POIs) linked to associated media.

New York artist Mark Skwarek created novel uses for the Fiducial marker on the body. The first example is the Occupy Wall Street AR project (Skwarek et al. 2011, see Fig. 6.6), which was a political intervention by the collective Manifest.AR. This intervention took place in front of the Stock Exchange, which is notable in that interventions and protest were only allowed in Zuccotti Park. The intervention was docented, as passersby were invited to don a helmet with a marker, and when wearers viewed themselves with the front-facing camera, they would see the engraved portrait of Washington from the US one-dollar bill instead of their own head. Skwarek would reprise this gesture in creating markers for “Virtual Halloween Masks” (Poladian 2013), where anyone could download a given marker and app, and suddenly appear with a skull or jack-o-lantern head (or in their hand, or wherever the marker would be placed). These are both wonderfully playful applications of the Fiducial gesture. One other artist has used the Fiducial and the Recognition gestures in his performance work and presents segues between these gestural modalities.

Fig. 6.6 Occupy Wall Street AR mask (Courtesy Skwarek 2011)

Jeremy Bailey (the “Famous New Media Artist”) is a Toronto-based artist who uses marker-based AR in strange and unexpected ways. Where Skwarek’s placement of the marker on the body induced a straightforward semiotic swap, Bailey makes peculiar formal translations. His Video Terraform Dance Party (Bailey 2008), performed in Banff, Alberta, shows him bobbing his head around, sculpting a virtual island and populating it with virtual birds and citizens as he narrates their creation. In The Future of Television (Fig. 6.7; Quaintance 2013), Bailey remaps his entire face as a faceted television with three “channels” that he controls through the tracking of facial markers, which he tries to communicate while describing the piece in his stuttering, self-effacing banter. There are awkward moments, as when he calls up a strobing stream and calls it “The Epilepsy Channel”, and then, thinking better of it, he tries to “save” and switches to portraiture of his wife. Bailey then slides into the Planar/Recognition modality with his Important Portraits (Smith 2013), which was a Kickstarter project that became a gallery exhibition at Pari Nadimi Gallery in Toronto. He invited “important” patrons to fund the project for a show in which he would use dramatic portraits of the funders as Planar markers for dynamic geometric augments. Bailey provides a segue, and is important for his manic usage of AR modalities, somewhere between a Japanese mecha epic and baroque portraiture. His practice has moved from Fiducial markers to facial/feature recognition in a way that is hard to categorize.

Fig. 6.7 The Future of Television (Image Courtesy Jeremy Bailey)

5.3 Planar Recognition AR

Although similar to the idea of the Fiducial marker in that it exists on a surface of some sort, the gesture of the Planar/feature recognition augment exists as a superset of the Fiducial modality. The Fiducial was specified for its historical significance, but the Planar/print/poster form of AR exhibits a broader scope than the digital marker, and in popular media, often performs a more straightforward function. In a TED talk presented in 2012 by the makers of the Aurasma (now HP Reveal) AR technology (Mills and Roukaerts 2012), Matt Mills and Tamara Roukaerts demonstrate the recognizing gaze by aiming a mobile device at an image of the Scottish poet Robert Burns, as in Fig. 6.8. When the image is scanned, a perfectly overlaid video of an actor, approximating the trompe l’oeil of the painting, appears and begins to orate. While more sophisticated than the Fiducial gesture, AR feature recognition of media is often an overlay of content onto print media. Other examples include an IKEA AR experience, and even Fingerfunk’s Alien chest burster experience, which tracks from a t-shirt image (Woermer 2012). All of these augments are, in this writer’s opinion, either simpler than or at best equal to a Fiducial, creating a simple semiotic swap, however lurid or graphic.

Fig. 6.8 Matt Mills’ TED talk demonstrating Aurasma technology (Image Courtesy Aurasma 2012)

Esquire Magazine also used this technique in a famous example, its Augmented Reality issue of 2009, “graced” on the cover by Iron Man star Robert Downey, Jr., as illustrated in Fig. 6.9. What was unique about this issue was not only the fact that the Fiducial markers summoned a mass of entertaining media throughout the issue, but that reorienting the markers would elicit different responses. Turning the marker sideways would cause Downey Jr. to lounge on his side, playing the raconteur in another way, cause the fashion models to be represented in another season, or call forth another “Joke Told by a Beautiful Woman”. This publication used the potential of the Fiducial and Planar gestures extremely well by not using the orientation of the marker merely for orientation (tilt, rotation, etc.). The Downey issue was an initial example of what is now a fairly common marketing application of AR.

Fig. 6.9 Activated Esquire AR issue cover (Image Courtesy Esquire Magazine, 2009)

As interactive interfaces emerge in all AR technologies, unique possibilities arise. But as we unpack the representational modes of AR outward from interacting with Planar media, the user encounters AR in spaces. This is where the modalities of Environmental, Geolocative, and Embodied/Wearable AR come into play. The difficulty with studying these forms of mediation and interaction is that they each engage space in different, but equally valid ways. Because Environmental recognition is closer to the Planar/Fiducial than Geolocative and Embodied AR are, this will be our next category.

5.4 Locative/GPS-Based

The last gesture/modality in AR, and the most complex, is that of Locative/GPS. This is due to the dynamic relationship between the user, the media linked to points of interest in the landscape, and the objective background upon which the media is overlaid. Many variables are in play in the relationship between user, media, and landscape, as with the Environmental modality, and dynamic content creates a fluid matrix of representations, a sort of semiotic pinball machine. Fortunately for our analysis, and perhaps disappointingly for the work itself, most Locative AR work consists of overlaid imagery or video on static POIs (Points of Interest). This author understands that, as with all our gestural modalities, there are commercial applications, like the Fiducial application used in the Esquire Magazine issue, that have surpassed many of the artworks in our discussion in leveraging the potential of the medium. In addition, Locative AR art constitutes the majority of the medium, so only a small number of works will be discussed here, with apologies to the mass of work in this gestural realm that is elided. For purposes of interest, I would like to discuss installations that address certain topics: politics and geographical annotation. Each throws content in useful or illegal/unexpected places and creates a double signification of the location through overlay and context.
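
To make the relationship between user, POI, and landscape a little more tangible, here is a minimal sketch in Python of the geometry that a Locative application has to resolve before anything can be overlaid. It is an illustration under stated assumptions, not the method of any platform or artwork discussed here: the function names, the spherical-Earth approximation, the 60-degree field of view, and the 500-meter range are all hypothetical choices. Given the viewer's coordinates and compass heading, it works out how far away each POI is and whether it falls within the device's view.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (meters) and initial bearing (degrees from north)
    from point 1 to point 2, using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, bearing


def visible_pois(viewer_lat, viewer_lon, heading_deg, pois,
                 fov_deg=60.0, max_range_m=500.0):
    """Return (name, distance in m, angular offset in degrees) for each POI
    that lies within range and inside the assumed horizontal field of view.

    `pois` is a list of (name, latitude, longitude) tuples, standing in for
    the external POI database a Locative platform would query.
    """
    visible = []
    for name, lat, lon in pois:
        dist, bearing = distance_and_bearing(viewer_lat, viewer_lon, lat, lon)
        # Signed offset of the POI from the direction the device is facing,
        # normalized to the range [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if dist <= max_range_m and abs(offset) <= fov_deg / 2:
            visible.append((name, round(dist), round(offset)))
    return visible
```

Every step or turn of the viewer changes the distances and offsets returned here, so even a database of static POIs is experienced as a constantly reconfigured field, which is the fluidity described above.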

Political work is one of the smaller genres in AR, although interventions like We AR MoMA (Sterling 2010) have used AR to create salons des refusés inside prestigious museums without actually sneaking into the space and nailing the work to the wall. Occupy Wall Street AR (Fig. 6.10; Holmes 2012), organized by Mark Skwarek for the collective Manifest.AR, inserted technically illegal content over the Stock Exchange. The illegality of the gesture is marked by the fact that during the Occupy Wall Street campaign, intervention was only permitted in Zuccotti Park, as it is private property. So collective members (Mark Skwarek, Alan Sondheim, et al.) “docented” the work to passersby, which included flaming bulls, Space Invaders, the Monopoly game plutocrat, and slot-machine wheels between the columns of the Exchange, playing on Brian Holmes’s assertion of “Market as Casino” (Holmes 2012). What I feel was unique was that the Occupy AR interventions were art interventions in which the “infopower” is not constrained by material or, as I call it, “atomic” power (Lichty 2013a, b, c, 53). In a 2013 panel on AR as Activism at the South by Southwest festival, the question was posed as to whether law enforcement could demand the reorientation of a Locative database if it was representing protest in a restricted space. This question was revisited when this author also penetrated controlled airspace with Love Bombers (Fig. 6.11), which depicts NATO A-10 Warthog ground-support bombers dropping 8-bit video game hearts on the NATO summit in Chicago and the corresponding protesting mobs.

Fig. 6.10 Occupy Wall Street AR (Image Courtesy Skwarek 2012)

Fig. 6.11 Love Bombers (Image Courtesy Lichty and Skwarek 2012)

One AR project that overlays historical content onto geographical environments is Annette Barbier and Drew Browning’s 2012 group collaborative project Expose, Intervene, Occupy (EIO) (Tripp 2012). EIO used locative and recognition technologies to insert critical narratives into the downtown Chicago landscape. Examples of the eight AR collaborations include Barbier’s 2070, seen in Fig. 6.12, which explores the progressive invasion of the Asian Carp into the North American Great Lakes through the Chicago River; an alternate historical street sign narrative; and a Mario Bros. romp by Mat Rappoport that invites the interactor to chase coins through Chicago’s Financial Sector (Fig. 6.13). Two other conversational pieces are PolyCopRiotNode by Adam Trowbridge and Jessica Westbrook, which features an ominous cybercop commenting on the law enforcement culture of Chicago, and WeathervaneAR by John Marshall and Cezanne Charles, which has many instances of a “robotically-driven” chicken head, playing on post-Millennial paranoia. Where the Occupy AR series had more of a unitary format, EIO creates an “anthology” of works describing how AR can be used as a tool of psychogeographic inquiry. Of note is the unfortunate fact that, due to a change in policy at the companies providing the technological infrastructure for the work (similar to the removal of non-profiting movies from blip.tv in December 2013), EIO is now inactive.

Fig. 6.12 EIO: 2070 (Image Courtesy Annette Barbier 2012)

Fig. 6.13 EIO: Coin Chase (Image Courtesy Rappoport 2012)

Lastly, I want to mention another participatory and political work Watch the Sky (Fig. 6.14), by Pappenheimer/Brady. It uses GPS and a Web-based input to suggest a larger Harlem Watchtower in Marcus Garvey Park in Harlem, NYC. As Pappenheimer states, “It projects the need for a future much taller structure, “Harlem Watchtower+”, to survey the global affects of anthropologist Arjun Appadurai’s technoscapes as they intersect with other dimensions of the current multi-valent landscape in flux. Thus the original fire watchtower extends its functions to global vigilance and wide-ranging critical views as influences on the local neighborhoods and events” (Pappenheimer 2016).

Fig. 6.14 Watch the Sky (Image Courtesy Pappenheimer 2016)

It also uses a mobile app to invite participants to skywrite in AR over the Garvey Park site. The project invites public commentary while using public architecture as political commentary, both historical (the fire watchtower) and contemporary (the notion of vigilance and surveillance in an age of political strife in the US). In some ways, this reflects works like Nathan Shafer’s Seward’s Success, based on unrealized megastructure plans in Anchorage, Alaska, which comment on the human and political landscape of a given site.

5.5 Environmental/Spatial Recognition

The next challenge that arises from recognizing an image as a Fiducial marker is that of recognizing a space from a given point of view. This introduces any number of problems, from perspective to time of day, weather, or occluding bodies in the scene, such as vehicles or people. This has largely left the application of environmental AR to indoor settings that have fewer variables. Of course, outdoor applications with regard to machine repair are part of the original Boeing concept and military applications (Caudell and Mizell 1992, 659–669), but these are close-range situations with very specific, regular spatial configurations. Environmental/spatial recognition applications at the Embodied or the architectural scale can present more variables and present challenges with regard to tracking the environment. For the purpose of discussion, I will present examples that expand in size, and explore a couple of examples of intimate environmental experiences that refer to earlier examples in this essay. I will begin with what I feel is still one of the best environmentally based AR games/apps, Hermaton by Darf Design.

Hermaton (Holmes 2013) is an environmental AR game developed by London-based Darf Design, founded by Sahar Fikouhi and Arta Toulami, that uses a cut vinyl mural half the size of a room as its marker when presented at environmental scale. There is also a “tabletop” version that uses its own marker and fits well into an advanced form of the feature recognition category, but for the sake of our conversation, the room-sized version in Fig. 6.15 is more germane. As their project statement describes Hermaton: “The project uses a buzz wire maze (think: the children’s game “Operation”) which people can navigate through in real-time, attempting to interact with the digital objects of the “Hermaton” machine. The design of this environment provides both an interactive and performance space which allows the user to fully immerse in a new augmented physical landscape” (Fikouhi and Toulami 2013).

Fig. 6.15 Hermaton (Image Courtesy Darf Design 2013)

The user controls a small red ball through the maze-like machine, switching on its lights and progressively activating the Hermaton. In addition, the user is placed in what I would call a “performative” media space (Lichty 2000, 352), where the body has to physically stretch, crouch, and twist through the virtual machine. Where I draw the line between performance and performativity in media art, including AR, is the implication of audience in experiencing the piece. In the case of environmental AR, there is a becoming-action in navigating the work and an activation of the space, but whether or not an audience is present in the space is purely incidental.

Another example of environmentally based AR work is Richard Humann’s Ascension project. Ascension, based on the Membit AR platform, is a mix of environmental and Planar AR art (Fig. 6.16). In installations in NYC and during the Venice Biennale, Humann reenvisioned the constellations of the night sky. This was done by placing images that recognize a certain view and perspective as a site of image recognition for Humann’s new mythologies. Instead of merely taking captured images and placing them at the GPS coordinates with the proper orientation, Humann edited them and replaced them with the constellated images. As Membit founder Jay Van Buren says, the technology was originally meant to leave memories, but it can also be used to leave things that never were, which is a provocative element of AR (Membit 2017). One is led to wonder about the veracity of simulations in the landscape in the age of “Fake News”, as one person’s satire has become another’s reality hack.

Fig. 6.16 Ascension (Image Courtesy Richard Humann 2017)

Another larger-scale AR object is the author’s The Kenai Tapestry (Fig. 6.17). Although smaller, the 5-by-21 foot Jacquard-woven textile is a panoramic composite of online and actual photography taken by this author from a 2009 photographic project in Alaska on the Kenai Peninsula and Adak Island. The piece refers to instruments of power such as the Bayeux Tapestry, which depicts the Battle of Hastings, and to the culturally transformative nature of the Jacquard Loom at the turn of the nineteenth century, much in the way globalization and mechanization are transformative today. The 5-by-21 foot size is appropriate for depicting the grandeur of the Alaskan landscape. For augment tracking, it uses QR Codes as Web links or Fiducial markers, and features like bird flocks and sunlit highlights as recognizable features. The content (doubly accessible in the case of the QR Code) refers to the artist’s experience of the Alaskan environmental embarrassment of riches while forces such as the oil and mineral industries and global warming encroach on this remote part of the world. Kenai Tapestry, in its own way, depicts another form of conquest, the Enlightenment-era notion of the human subjugation of nature, currently termed the Anthropocene Age (Crutzen and Stoermer 2000, 18). In this way, this work frames itself in a historical context while still forming a critical stance. But other applications root themselves even deeper in history and reveal exciting potentials for the illustrative power of environmentally based AR.

Fig. 6.17 Kenai Tapestry (Patrick Lichty 2014b)

Nathan Shafer’s Exit Glacier Terminus AR, shown in Fig. 6.18, reveals a history of the retreating terminus of the Exit Glacier on the Alaskan Kenai Peninsula. Exit Glacier, created for interpretive rangers with the Kenai Peninsula National Park, is a unique application that specifically recognizes the terrain from its own database; as there is little data connectivity at the site, it has to use its own tenuous Wi-Fi transceiver. Exit Glacier is also unique in that it is one of only two walk-up glaciers, and the AR application shows five distinct reconstructions of the glacier face from 1978 to 2013. The challenge of connectivity makes the project problematic for most AR frameworks. But conversely, the project’s ironic Alaskan self-sufficiency presents a certain kind of utility that is particularly useful at the edge of the wireless world.

Fig. 6.18 Exit Glacier Terminus AR (Image Courtesy Nathan Shafer 2013)

5.6 Between the Environmental and Embodied: The Return of Hatsune Miku

In this section, the AR applications depicted have ranged from interior architecture to the geologic, but a peculiar subset of environmental applications has emerged in Japan, based yet again on our virtual pop idol, Hatsune Miku. I place them between the Environmental and the Embodied/wearable modalities, as they entail a Kinect-like spatial camera linked to the headset, making them Embodied, but are specifically about orienting the subject in the environment. The subject in question is Miku herself, and the applications are Miku Stay, a series of experiments to have Hatsune Miku as a happy live-in girlfriend, and another that takes the interaction one step further and situates Miku as a sleeping partner.

In Miku Stay (svx 2013), created by a YouTube member named “alsione svx”, Miku exhibits complex interactions like walking up to the viewer in a park as in Fig. 6.19, walking around a kitchen, sitting in a chair (and impressively dealing with occlusion by walking behind it), and holding hands. Most of these are accomplished through a spatial camera and Fiducial markers, but eventually alsione svx mentions in the video that he can no longer stand using these, so he uses environmental cues such as the chair as a marker. She comes over, stands on the bathroom scale, holds hands, and then jumps around laughing merrily. Miku Stay is a feminist’s nightmare, as the app allows the user to live with a hopelessly idealized “waifu”, creating expectations unattainable by flesh and blood. If this were not problematic enough, the Sleep Together app (Miku Miku Soine, Fig. 6.20) by Nico Douga (Tackett 2013) takes this one step further, as Miku becomes the user’s bed partner, calling them “Master” and comforting them if there is restlessness in the middle of the night.

Fig. 6.19 Miku Stay in park (Image Courtesy “alsione svx” 2013)

Fig. 6.20 Miku Miku Soine (Image Courtesy Nico Douga 2013)

Awkward as this may seem, if we return to the gesture of locating the subject in space using environmental AR, we find that there is a second Miku-as-AR-girlfriend game for the PS Vita, entitled Hatsune Miku Project Diva F (Tolentino 2012). The “song-masher” game (as I call the genre of musical coordination games from Dance Dance Revolution to Guitar Hero) includes a markerless AR app that allows Miku to hang out in your apartment, as seen in Fig. 6.21, and sit on your bed. Is this the isolated hikikomori’s dream or, as Josh Tolentino states in Japanator, “Mindless waifu (“waifu” being a fan term for idolizing an anime character as a possible mate) gimmickry”? Hatsune Miku Project Diva F is definitely in the area of Environmental AR, but in all these examples, the question remains whether AR suggests what Bruce Sterling calls a “design fiction” (Sterling 2013) to alleviate technological isolation. As a note, in March 2017, the Gatebox Virtual Robot project (as in virtual “wife”) announced that it would be releasing a Hatsune Miku version of its product, shown in Fig. 6.22 (Crunchyroll 2017). It would allow the user to “live” with the character, trade SMS texts during the day, and have her control the lighting and other features of the apartment. Although the Gatebox does not represent an AR application as such, it does speak to our desires to live telepresently and with virtual companions, and it raises sharp questions about the role AR will play in our interpersonal relationships.

Fig. 6.21 AR Shot from Hatsune Miku Project Diva F (Image Courtesy Crypton, Sega 2013)

Fig. 6.22 Gatebox Hatsune Miku (Image Courtesy Gatebox, 2017)

5.7 Body as Landing Site: Wearable AR

In my 1999 essay, Towards a Culture of Ubiquity (Lichty 2013c), I trace a trajectory of where interaction/delivery of media/mediated reality would be situated. It moves first from the screen into the hand(held) device, then onto the body, and then onto space and architecture. Although wearables and Locative technologies have developed far more in parallel than I envisioned, the general trajectory seems on track. Multiple platforms now overlap, such as the Epson Moverio, ODG, Microsoft HoloLens, and Meta platforms, and these have supplanted the long-dead Google Glass platform.

In An Alpha Revisionist Manifesto (Lichty 2001, 443–445), I theorized, many years prior to this writing, that in the future companies would create pre-prototype narratives, or what Sterling would term “design fictions”, to inspire funders, developers, and consumers into willing their dreams into being. Of course, in the mid-2010s this manifested itself as slick, slightly overpromising promotional videos of the coming platforms. In many ways, they reflected the tropes in current science (or near-future speculative) fiction, as we will see below.

In popular culture, the world of AR has given way from science fiction to design fiction, although there are excellent examples of AR as trope in books like William Gibson’s Spook Country (Gibson 2007, 8), which features a subplot about AR artists depicting the deaths of celebrities at their place of demise. There are plenty of examples in movies as well, such as Minority Report’s dressed-up version of Oblong’s user interface (Underkoffer 2010). However, it seems that science fiction is giving way to “design fiction” as a way to capture the popular near-future imaginary. The leading design fiction in 2013 involving the Embodied AR gesture, and ironically, the ultimate “chick device” (and I use that phrase with a healthy dose of derision), is Sight (Sakoff 2012), a dystopic AR fantasy by filmmakers Eran May-raz and Daniel Lazo. The opening scene finds our protagonist, Patrick, mime-flying in an austere room. In the next shot, we switch to his eyes, which have been equipped with Sight Systems’ lenses and show him playing a flying obstacle course. “Sight” technology has apparently revolutionized life as we know it, from augmenting the contents of the refrigerator to turning such mundane tasks as frying an egg or cutting vegetables into a “Master Chef” game. The story turns darker as, in Fig. 6.23, Patrick goes out on a date, using Sight to choose the ideal wardrobe and social approach with his “Wingman” app. After making a few initial gaffes, Patrick wins his date over, and we find out he is, in fact, an interface engineer for Sight Systems itself. They go back to his apartment for a nightcap, where his date notices that Patrick forgot to turn off his scoreboard on the wall, sees that he has been using the Wingman, and storms off. This is actually not a problem for him, as he reveals that the secret feature of Sight is the ability to hack consciousness itself.

Fig. 6.23 Sight (Image Courtesy Eran May-raz and Daniel Lazo 2013)

This is also similar to a 2016 episode of the serial Black Mirror called Playtest (Fig. 6.24), in which a thrillseeker, accepting a job with game company SeitoGemu, experiences a neural interface AR system that inadvertently accesses the recesses of his psyche and renders him psychotic (Brooker 2016). This is where my axiom that most authors should not write their last chapter applies: although Sight and SeitoGemu offer marvelous insight into the probable future of Embodied AR, the worn trope of mind control sneaks in. It is also a commentary on technoculture’s growing distrust of the five global vertical monopolies that Sterling calls “the Stacks” (Madrigal 2012), as Sight is an obvious commentary on Google Glass taken to its logical extent. The irony is that the advent of Snap Inc.’s Spectacles for Snapchat has provoked little comparable reaction, perhaps due to the company’s more “friendly” corporate profile.

Fig. 6.24 Black Mirror: Playtest (Image Courtesy Channel 4 Television)

In the world of art, conversely, the speculations are at once much wilder and more constrained.

Keiichi Matsuda’s Hyper-Reality (Fig. 6.25, Matsuda 2016) shows a near-future scenario of a contingent worker in Medellin, Colombia, doing menial jobs for “loyalty points”. Her visual field is constantly polluted with game-like challenges, her virtual Shiba Inu puppy, offers, and encouragement from her virtual coach. She struggles through the hypermediated landscape until an identity hacker stabs her in the hand, stealing her points. Juliana, the protagonist, in desperation, finds the nearest shrine and becomes a Level One Catholic.

Fig. 6.25 Hyper-Reality (Image Courtesy Keiichi Matsuda)

But the realities of wearable AR art are far more modest at this time, and apparently involve mind control or cybernetic psychosis. Artsy and Pace Gallery’s Studio Drift, through curator Elena Soboleva, teamed up to create a work called Concrete Storm for the 2017 Armory Show (Fig. 6.26, Burdette 2017). It is a mixed reality installation with concrete constructions that act as registration points for the augmented sculptures. As the user wears the HoloLens, they see the physical components of the installation as well as the augmented concrete pillars, which they can manipulate, break, and build. Although this is a relatively formal piece, it is a good example of early “Holographic AR” art.

Fig. 6.26 Concrete Storm (Image Courtesy the artists)

5.8 Next Steps: Mixing Metaphors/Mixed Realities

Claudia Hart’s Alices body of work (Fig. 6.27, Hart 2014) is one that I hesitate to place in any of the previous areas because of its intermedia nature. Her use of Fiducial markers on ceramic plates, on bodies in motion, and in VR places her close to the genres of Fiducial and environmental recognition. But the use of AR in gallery, environment, and performance situations makes the work unique in that AR is not a focus, but a facet of the work. It ranges from plates and napkins in The Looking Glass Collection, which place a reclining odalisque over the viewer’s meal, to Alices Walking, an Edward Campion-scored performance in which performers wear Planar markers that are activated with the artist’s smart device app. In Alices Walking, the markers reveal the hidden narratives of the performers, such as “I wonder if I have been changed?”. Also, motifs from the ceramic work reappear, creating a pastiche of reflections on “… how queer things are today.”

Fig. 6.27 The Alices Walking (Image Courtesy Claudia Hart)

Hart’s work is unique in terms of its multimodality: AR and its representative function are neither the focus of the work nor its primary mode of delivery, but an aspect. In this way, this work escapes the genre as technofetishistic site and becomes an aspect of a gesamtkunstwerk, which erases the focus on the viewing device. The use of AR in performance, as well as in public action, activates the form into something beyond a technological attraction.

6 Conclusions

By looking at Augmented Reality as a delivery method for artistic content, then investigating it as a frame for mediation, a discussion is opened up that ties deeply into art-historical tradition and novel modes of “becoming.” From Duchamp’s notion of the “retinal” to Mulvey’s masculinization of the gaze and pervasive imaging’s fracturing and possible “queering” of the mediated gaze, AR and my proposed gestures/modalities of representation suggest ways in which artists are using AR in service of cultural production. By beginning with historical technologies like Fiducial tracking, we can trace an epistemic arc as AR unfolds into image recognition, spatial location, and Embodied interaction. As additional layers of interaction are embedded into AR in the handheld and wearable units, more layers of signification are stacked into augments. However, it is also important to note that AR as of 2014 is still a medium in its adolescence, as technologies in an “Alpha Revision” state rely on design fictions and crowdsourced bootstrapping to will them into being. This decade-later extrapolation of my idea of Alpha Revisionism finds culture in a state where science fiction begins to pale in light of propositional videos and developer kits for Star Trek-like devices. In conclusion, it is this author’s hope that he has left points for further discussion, made a discursive framework for the genre, and set up a number of propositional qualia for the study of Augmented Reality. In the first edition of this essay, I had hoped that the essence of its principles would survive the datedness of technological speculation; except for the advancement of the technology and the creation of a larger historical framework, the primary tenets here remain. Again, I hope that the ravages of time remain minimal as the genre of AR moves forward and the conversation continues.