This paper has two aims. The first and primary goal is to discuss the potential role that different types of visualization can play in archaeological discovery and discourse. The second goal is to suggest the need for a new archaeological specialty concerned with the use of archaeological information within the context of information systems. Both of these goals are predicated on the ever-growing use of information systems in archaeology and the potential new venue these provide to exploit, among other topics, the theory-ladenness of archaeological data.

Visualization has been widely recognized in many different fields, including medicine, genetics, physics, mathematics, and economics, as a key component in knowledge discovery and interpretation. Its importance is such that it is currently considered a discipline in its own right, with several dedicated journals (e.g., IEEE Transactions on Visualization and Computer Graphics, Information Visualization, Computing and Visualization in Science). Different forms of visual communication are used to make massive amounts of data readable, to illustrate the results of a model and make them more intelligible, and to render part of an argument visible by converting it into a “reasoning artifact” (Thomas and Cook 2005, p. 36; Gooding 2008, para. 50).

References to visualization in archaeology can be found in several earlier publications. Of these, it is worth pointing out the edited volumes by Reilly and Rahtz (1992) and, more recently, Frischer and Dakouri-Hild (2008). Reilly and Rahtz’s volume provided a series of early examples where visualization is discussed in relation to geographic information systems (GIS) (Lock and Harris 1992), excavation, and survey data. These examples were generated at a time when access to software (e.g., GIS) and computers was still quite limited, and computer output was rather restricted. Frischer and Dakouri-Hild’s (2008) edited volume provides a good survey of visualizations that are currently found in archaeology. Within this work, Gooding (2008) provides an excellent discussion of the importance of visualization in science. His chapter could be used as a preamble to any course on visualization.

Although separated by a decade and a half, these two volumes are similar in their preoccupation with implementation and technical issues. In both, the concept of visualization is often restricted to purely “empirical” and/or realistic visualizations, that is, visualizations that are intended either to emulate reality (virtual reality, discussed below) or to display data in traditionally accepted ways. Visualization itself is seldom discussed; that is, there is little concern with what the best strategies are to display and/or visually explore archaeological information, or with alternative ways in which the data may be represented and the merits and limitations of each. Rather than discussing the nature of visualization, these earlier publications treat visualization as a matter of applying existing software. With few exceptions, the purpose of visualization in archaeology is largely reduced to a question of illustration and/or digital archiving.

Archaeological visualization has not developed much since the introduction of stratigraphic sketches by Boucher de Perthes (1847) and distribution maps by O.G.S. Crawford during the early twentieth century. Even with the incorporation of advanced tools such as GIS, the use and production of forms of visual communication remain painfully narrow. Given how much of archaeology relies on recognizing and comparing patterns, spotting outliers, identifying relationships, and building arguments to forward interpretations, the lack of a focused interest in visualization is both surprising and unfortunate. For the most part, the role of visual output is restricted to legitimizing our output rather than serving as an active element within the archaeological reasoning machinery. Visual communication is reduced to the production of graphical tokens that make up a relatively fixed traditional checklist (e.g., site plans and profiles).

The importance of visual representations in archaeology becomes clear when we acknowledge the inherently analytical nature of any archaeological investigation: that the choice of representational genre presupposes a set of accepted conventions and purposes and that seeking new forms of data representation will have a direct impact on our ability to generate new interpretations. Indeed, the desire to find better ways to visually convey information as a vehicle to sustain particular interpretations appears in the work of various archaeologists (e.g., Cummings et al. 2002; Lucas 2001; Criado and Villoch 2000; Hamilton et al. 2006). Lucas’ (2001, pp. 157–162) discussion of how excavation units are “flattened” into objects after all information about them (except for their relation to other units) has been stripped away clearly illustrates this point. More specifically, Lucas makes reference to the use of the Harris matrix and the way it records exclusively the topological relationship between archaeological contexts, leaving out other information, such as longevity or postdepositional use of the units. One effect of recording information in this way is that when reconstructing a site’s sequence, each unit is recorded as an event, and reference to the processes that occurred after the unit was formed and their associated temporality is totally lost (ibid., p. 159). So if we were interested, as Lucas is, in understanding the longevity of each unit as a way to reconstruct the biography of the site/feature, this would be out of our reach given our choice of representation. Lucas’ discussion of the Harris matrix is a key reminder of the very limited diagrammatic choices we have when it comes to accepted representations of stratigraphy (essentially one). It also helps to highlight the critical role that the nature of such representation plays in any interpretation when used as the sole record of this information.

Is there a way we could produce a visual representation that provides information on a stratigraphic sequence, while still retaining other information such as the rate of deposition and duration of use at the same time? It is questions like this that make the need to explore the appropriateness of available visualization techniques and/or the possibility of developing new ones relevant, even central, to the archaeological endeavor.
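One way to make this question concrete is to consider what the underlying record would need to hold. The following is a minimal sketch, not a proposal for a specific notation: it assumes a hypothetical set of stratigraphic units, keeps the topological relationships that a Harris matrix records, and attaches invented formation and use durations to each unit. All unit names and values are made up for illustration, and the `graphlib` module used to order the units requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical stratigraphic units. Each unit maps to the set of units that
# lie directly beneath it, i.e. that were deposited or cut earlier.
lies_above = {
    "topsoil": {"ditch_fill"},
    "ditch_fill": {"ditch_cut"},
    "ditch_cut": {"floor"},
    "floor": {"natural"},
    "natural": set(),
}

# Hypothetical attributes that a standard matrix does not record (in years).
attributes = {
    "natural": {"formation": 0, "use": 0},
    "floor": {"formation": 1, "use": 60},
    "ditch_cut": {"formation": 1, "use": 25},
    "ditch_fill": {"formation": 10, "use": 0},
    "topsoil": {"formation": 200, "use": 0},
}

# Topological relationships alone: the sequence from earliest to latest.
sequence = list(TopologicalSorter(lies_above).static_order())

# The same sequence with the extra temporal information retained.
# The bar is a crude text glyph: one '#' per five years of use.
for unit in sequence:
    info = attributes[unit]
    bar = "#" * (info["use"] // 5)
    print(f"{unit:<11} formation={info['formation']:>3}y use={info['use']:>2}y {bar}")
```

The point of the sketch is only that sequence and duration can live in the same record; how best to draw them together remains the open visualization question.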

Before going any further, it is important to clarify the specific meaning I attribute to the term “visualization” within the context of this paper. Visualization involves the mapping (transformation) of data or any sort of information into a representation that can be perceived. Theoretically speaking, this coupling does not need to be exclusively visual (could be haptic or auditory), but this is seldom the case and will not be considered here. The ultimate aim of visualization is to render data and information in order to ease communication, insight, and/or understanding. I would like to distinguish this definition from other “more realistic” forms of archaeological representation and illustration. Of course, this distinction is totally arbitrary for even the most “realistic” renderings include at some level a decision about what should be represented and how. This is precisely the main focus of the recent work done by Moser and colleagues (e.g., Moser and Smiles 2005) who have exposed the cultural and social baggage implicit in archaeological imagery. For the most part, here, I will be concerned with visualizations produced as part of the process of analysis and interpretation of archaeological data with a particular emphasis on those generated in connection to an information system. By concentrating on this particular use, I hope to extend and complement some of the topics that are currently addressed by Moser and colleagues.

In addition, I will purposely omit any reference to the application of Virtual Reality (VR) in archaeology (Barceló et al. 2000; Benko et al. 2004) given its limited role as an active tool in the process of archaeological discovery, analysis, or interpretation. Most examples of VR in archaeology capitalize on their ability to render “realistic” scenes and, in turn, to generate holistic experiences of the past. Many are aimed at the heritage industry or at generating some form of display for public consumption (Forte 1997; Niccolucci 2002). At some point, its application to archaeology was hailed as an improvement over tools such as GIS based on its ability to handle more complex spatial representations (Gillings and Goodrick 1996; Goodrick and Gillings 2000; Gillings 2004). However, as powerful as VR is when rendering complex scenes, it was never intended as an analytical tool. This does not mean that VR cannot be useful to archaeologists. The ability to render alternative versions of a model (i.e., a structure or a scene) can be instrumental in providing archaeologists new insights and raising new questions (e.g., Earl and Wheatley 2002). VR can successfully be used to display the results of complex simulation models and analytical tools (e.g., Winterbottom and Long 2006). But VR is ultimately a protocol for rendering 3D graphics. In a sense, the role of VR as an analytical tool is similar to the one that was associated with computer-aided design (CAD) systems when GIS first appeared. CAD may be used in combination with another tool to conduct analyses but, contrary to GIS, it was never intended to be an analytical tool or to aid archaeologists to discover and make intelligible data that otherwise would be impossible to grasp. The same can be said about VR.

Why Visualization?

The general rise in importance of visualization is likely to be a consequence of several interrelated factors. The ubiquitous use of the internet and/or other digital networks as a means for collecting, accessing, sharing, and distributing large amounts of diverse information is a key factor. This increase in quantity and diversity of data has raised serious computational challenges that could not have been met without a similar increase in processing power, particularly from commodity computers and distributed systems. The ever-increasing capacity of computers has, in turn, fostered the development of sophisticated forms of graphic representation. Far from being static or fixed in any medium, these new forms of visualization offer the possibility of being manipulated and queried interactively. Most importantly, the explosion in digital data has been fueled by the rapid improvement, and invention, of data-capturing devices like satellite imagery, digital cameras, medical scanners, genetic sequencers, and so on. Fields that have experienced the biggest surge in visualization are those in which such improvements resulted in a rampant increase of data (e.g., genetics).

The advancement of any specific data-capturing device is less likely to have the same type of impact in archaeology, but, as I discuss below, the amount of information that archaeologists handle is inherently large and complex. Archaeologists are already confronted with large and complex volumes of data, and are likely to be even more so in the future. Access to data-rich sources and the use of devices that can deliver vast amounts of data, such as satellite imagery, geophysics, laser scans, LiDAR, GPS, and portable XRF, to mention a few, are steadily increasing. The future adoption of XML, or a similar mark-up language, to tag archaeological data (Gray and Walford 1999; Crescioli et al. 2002; Isaksen et al. forthcoming), the development of widely accepted ontologies to describe archaeological information (Richards 2006), and the creation of digital archiving initiatives such as Archaeology Data Service in the UK and Digital Antiquity in the US (Kintigh 2006; Snow et al. 2006; McManamon and Kintigh 2010), aimed at enabling access to large databanks of archaeological information, will undoubtedly contribute to further increase the amount and variety of data accessible to archaeologists.

Review of Visualization

The field of visualization is broad and interdisciplinary, currently encompassing many subfields or specializations (for a comprehensive overview, see Card et al. 1999, also http://www.visual-literacy.org/periodic_table/periodic_table.html). While each of these subfields claims to exploit visualization with a particular focus or purpose, they all share a similar basic goal which is to amplify visual perception and cognition in order to ease the understanding of data. What distinguishes the different types of visualization is often the emphasis that is put on certain aspects of the visualization such as the nature of data, the method by which it was generated or collected, whether the purpose is strictly communication or exploration, and so on.

In the following pages, I attempt to provide some insight into the most significant subfields and to show their relevance to archaeology. As will become apparent, archaeologists have barely scratched the surface when it comes to the possibilities offered in this field.

Data Visualization

Data visualization (sometimes known as infographics or information graphics) is probably the oldest type of visualization and the one most people are familiar with, as it is now used in many national newspapers. It is often static in nature and lends itself easily to being reproduced in print or as a hardcopy (see Fig. 1).

Fig. 1 Based on one of Randall Munroe’s movie narrative charts found at XKCD (http://xkcd.com). The chart plots the interactions between key characters on a timeline, grouping the character lines together when they are interacting in the movies

The primary objective of data visualization, as opposed to other types of visualization, is to communicate information more clearly and effectively rather than to facilitate the viewer’s ability to discover new information. The exact topic of this communication can vary widely; sometimes, it is used for public consumption (as shown in Fig. 1); on other occasions, it is much more restricted and specialized. A common trait of data visualization techniques is that they generate succinct representations containing a large amount of information. The nature of visualizations classified under this category is typically wide-ranging: symbolic displays (like traffic signs), collages, combinations of text, images, and colors, drawings and sketches, and so on (see http://www.smashingmagazine.com/2008/01/14/Monday-inspiration-data-visualization-and-infographics/).

Early works by Jacques Bertin, his Semiology of Graphics ([1967] 1983), and Tufte ([1983] 1998, 1990, 1997), in particular his first contribution, Visual Display of Quantitative Information, have been instrumental in providing the inspiration and guidelines for visualizations of this type. In these works, the importance of visual communication (e.g., see Tufte’s now classic example of the explanation of the Challenger disaster) is highlighted.

Examples of this type of visualization have already been used in archaeology. Cummings et al. (2002) integrate the plan view of Neolithic Welsh chambered tombs with a 360° horizon profile showing the main landforms surrounding each monument. They argue that forms of data representation like this one help readers to better follow the heavily descriptive narratives that make up phenomenological approaches and, more importantly, provide a means by which some of the interpretations forwarded in these narratives may be legitimized (see Fig. 2).

Fig. 2 A combination of a plan view of two burial sites, showing the axial line representing their entrance, and a profile of the topography surrounding the sites, after Cummings et al. (2002) with permission

A second, more abstract, example of data visualization comes from a recent study focusing on the genesis and dynamics of past visual landscapes (Llobera 2007). The main goal of this study was to describe and explore patterns of co-visibility through time that existed between clusters of Bronze Age round barrows in the Yorkshire Wolds. Co-visibility occurs at those locations from where two or more sets of monuments are visible. These patterns are very dynamic, insofar as they depend on the number of monuments at any one time in the landscape, and change as one moves across the landscape. This complexity could only be tackled through the use of a computer simulation that generated vast amounts of numerical information.

Table 1 provides information about the visibility of one cluster of monuments in relation to the remaining clusters. Rows are grouped into three categories that correspond to the visual appearance of cluster 1. For instance, rows classified as B refer to the case when cluster 1 appears in the background. Let us imagine that we can move freely throughout the area from which cluster 1 appears in the background. We would like to know how often cluster 2 is visible and whether it shows up in the background (b), middle ground (m), or foreground (f). If we look at the bottom four values in the first column, i.e., the column that corresponds to cluster 2, we read the values 0.52, 0.09, 0.01, and 0.37 (rounding errors aside, these add up to 1.0). The first value (0.52) tells us that within 52% of the area from where cluster 1 appears in the background, we cannot see cluster 2, as indicated by the x in the second leftmost column. From within 9% of this area, cluster 2 would appear in the foreground. Only within 1% of this area would we be able to see cluster 2 in the middle ground. Finally, within 37% of the area, we would see both clusters 1 and 2 in the background.

Table 1 Co-visibility patterns between a cluster of round barrows (cluster 1) and other clusters of barrows in the landscape
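The way the values in Table 1 are read can be illustrated with a minimal sketch. It assumes, hypothetically, that the simulation output has been reduced to two lists of per-cell visibility classes, one per cluster, using the codes x, f, m, and b from the table, and that all cells have equal area. The cell values below are invented for illustration and are not taken from Llobera (2007).

```python
from collections import Counter

# Hypothetical per-cell visibility classes for two clusters over the same cells.
# 'x' = not visible, 'f' = foreground, 'm' = middle ground, 'b' = background.
cluster1 = ["b", "b", "b", "b", "m", "f", "b", "x", "b", "b", "b", "m"]
cluster2 = ["x", "x", "b", "f", "b", "x", "x", "b", "b", "x", "m", "x"]

def conditional_proportions(reference, other, reference_class):
    """Proportion of each class of `other` within the cells where
    `reference` takes the value `reference_class`."""
    selected = [o for r, o in zip(reference, other) if r == reference_class]
    if not selected:
        return {}
    counts = Counter(selected)
    return {cls: counts.get(cls, 0) / len(selected) for cls in "xfmb"}

# How does cluster 2 appear from the cells where cluster 1 is in the background?
proportions = conditional_proportions(cluster1, cluster2, "b")
for cls, value in proportions.items():
    print(cls, round(value, 2))
```

Each group of four values in the table corresponds to one such call: the reference class fixes the row group (F, M, or B), and the returned proportions fill the x, f, m, and b rows for one column.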

Given the complexity of the data generated by the simulation, a graphical format was devised as a way to summarize and compare information among clusters (see Fig. 3). Each ring represents a different visibility range (foreground, middle ground, and background) for the same cluster. In this example, the innermost ring represents the area where cluster 1 appears in the foreground. The next ring represents where it appears in the middle ground and so on. Within each ring, the total area is further subdivided into similar categories representing the visibility ranges of the other clusters. This subdivision (represented by different gray scales) represents the proportion or probability of seeing other clusters at different visibility ranges. So for the case of cluster 1, we can read from the graph that within roughly 85% of the area where cluster 1 appears in the background, no other cluster is visible. For the remaining 15% (0.15), there is a 0.05 chance that some cluster will appear in the background, 0.05 that it will occupy the middle ground, and 0.05 that it will show up in the foreground. When we move to the next ring (middle ground), we see that the overall visibility of other clusters has jumped to 45% and so on.

Fig. 3 A graphical representation of the co-visibility patterns shown in Table 1, after Llobera (2007) with permission

As with Cummings et al. (2002), this type of data visualization aims at aiding the reader in constructing an interpretation from the data. Given perhaps its less iconic nature, at least when compared with Cummings, the form of data representation calls for a process of “discovery”, a process born from the possibility of multiple readings, a common trait among other data-rich representations (Tufte [1983] 1998, pp. 167–68). In this case, data can be inspected in various ways. One way would be by simply comparing the percentage of white versus gray in a ring and/or across rings. Another possible reading would come from comparing the percentage of each visual range within a single ring. Finally, one may focus on how the different visual ranges change across rings: whether a visual range appears or disappears, increases or decreases, and by how much. We may add another level of interpretation by exploring the same questions using a different number of clusters present at any one time (see Llobera 2007). In this sense, this visualization straddles two types of visualization, data visualization and scientific visualization.

Scientific Visualization

The use of scientific visualization is to some degree much more familiar to archaeologists. For the most part, this type of visualization refers to various techniques used to represent, explore, and interpret data derived from both observations and models. The purpose is to gain understanding and insight into the data and the underlying processes that generate them (Brodlie et al. 1992, p. 1). Commonly, it makes reference to an array of techniques used to generate, display, and query data as surfaces, flows, and volumes. Currently, scientific visualization includes not only the production of static displays but also a whole suite of strategies used to interactively manipulate and query data.

With the advent of the internet and increased computer power, scientific visualization is used to process vast quantities of observations with the intention of spotting latent patterns, identifying new relationships between variables, and describing trends or detecting outliers. The process of visualization will often entail a preliminary stage within which multidimensional data are reduced into a smaller dimensional space, that is, into fewer variables capturing most of the variability of the original ones. In this sense, scientific visualization builds in part on earlier work done in statistical visualization (Cleveland 1994; Tukey 1977). Archaeologists who have used statistical data reduction techniques (e.g., dimensional analysis, principal components, or correspondence analysis) will be familiar with this idea.
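The idea of reducing several variables to fewer ones that capture most of the variability can be sketched with a principal components calculation. The example below is a minimal illustration under stated assumptions: the artifact measurements are invented, and power iteration is used only to keep the sketch free of external libraries (in practice a statistical package would normally be used). It extracts the first principal component and reports the share of total variance it captures.

```python
import math

# Hypothetical measurements (length, width, thickness in mm) for six artifacts.
rows = [
    [52.0, 31.0, 9.0],
    [48.0, 29.0, 8.5],
    [61.0, 36.0, 11.0],
    [39.0, 24.0, 7.0],
    [57.0, 33.5, 10.0],
    [44.0, 27.0, 8.0],
]

def center(data):
    """Subtract the column mean from every value."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * sum(c * w for c, w in zip(row, vec))
                     for v, row in zip(vec, cov))
    return vec, eigenvalue

centered = center(rows)
cov = covariance(centered)
axis, variance = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))

# One score per artifact: three variables reduced to a single new variable.
scores = [sum(v * a for v, a in zip(row, axis)) for row in centered]
explained = variance / total_variance
print("share of variance on first component:", round(explained, 3))
```

The `scores` list is what would then be plotted, for instance against a second component or against context information, in place of the original three measurements.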

In addition to rendering observations, scientific visualization is used to display the numerical results of mathematical models and simulations. An example is provided by the display of the effect of drag on the wing tips of an airplane (see Fig. 4). The nature of these visualizations is often highly interactive and may include the use of animations as a way to explain processes or to aid in the discovery of latent features (for a similar discussion in geography, see Harrower and Fabrikant 2008).

Fig. 4 An example of the use of scientific visualization to display simulation results. Simulation of wing drag © Stuart Rogers http://people.nas.nasa.gov/~rogers/images/

As opposed to other types of visualization (see information visualization below), scientific visualizations tend to be viewed as being relatively straightforward (a careful examination of the decisions surrounding their production shows that this is not the case; see van Fraassen 2008; Lynch and Woolgar 1990). Implicitly, there is a sense that whatever data are being represented possess structure or geometry and that, by visualizing it, the process behind it can be uncovered. It also presupposes that this structure will be revealed through regular, and exhaustive, sampling. Indeed, many scientific visualization strategies (e.g., isosurfaces, contours, cut-planes, streamlines, ribbons) used to render and query spatial and volumetric data are predicated on the need for this type of sampling.

Despite the wide range of well-established techniques currently in use in scientific visualization, very few have been adopted by archaeologists (for some pioneering examples, see Dibble and McPherron 1988). There are many possible reasons for this absence:

  • Archaeologists constantly alternate between different conceptualizations of space, while scientific visualizations are mostly based on a field-based concept of space. According to this view, a single property or attribute extends continuously across space. If we consider the case of archaeological excavations, most archaeologists are trained to conceive of excavation units as discrete entities (e.g., Lucas 2001, Ch. 5; Harris 1989, Ch. 6), i.e., to identify depositional units/events and their relationships. Indeed, many archaeologists would consider this practice, the identification and interpretation of separate stratigraphic units, as a requirement of the discipline (Hodder 1999, p. 82). This object-based concept of space (Couclelis 1992) does not lend itself to the application of scientific visualization methods commonly used to analyze volumes. Yet on occasion archaeologists represent the distribution of certain materials (e.g., pottery shards, lithics) using a field-based concept of space where observations (e.g., pottery counts) are assumed to spread continuously across space. McPherron et al. (2005) make a similar point and provide a very illustrative discussion of the effect that different ways of recording (whether field- or object-based) have on interpretation.

  • The highly regular and exhaustive level of sampling needed for scientific forms of visualization, especially volumetric, is rarely achieved in archaeology (Barceló and Vicente 2004). It is only in a limited number of cases, notably through geophysical surveys (e.g., Gaffney 2008; Losier et al. 2007; Leucci and Negri 2006) and some excavations (e.g., McPherron and Dibble 2000) that this type of sampling is ever achieved. While it is possible to use some scientific visualization techniques with excavations that have been recorded using “continuous” 3D coordinates (i.e., not artificial stratigraphy), they often suffer from a lack of mathematical continuity. That is, observations do not extend across space in a smooth fashion but instead present many abrupt breaks and gaps. This produces artifacts that are not easily resolved graphically.

  • For the most part, archaeologists concern themselves with collecting simple information that can be adequately displayed using basic symbols and color. Even after complex simulations (e.g., Kohler and van der Leeuw 2007; Kohler and Gumerman 2000), visualizations have remained very basic. Current scientific simulations, however, go beyond simple quantities and display information (e.g., rates of change) that requires more complex data structures such as vectors and tensors. It is hard to ascertain whether the lack of higher-order data structures, and their representations, in archaeology is a consequence of the simple nature of the data being recorded or simply of a lack of innovation when it comes to recording and displaying data in new ways. It is probably both. It is, however, possible that the use of “higher-order” data structures, particularly in association with simulations, could provide the means to sustain novel data interpretations.

An example of the possible use of higher-order data structures is shown in Fig. 5. This example makes use of tensors, complex mathematical entities that have various interpretations depending on how they are derived. Diffusion tensors, the type shown here, are regularly employed to visualize information obtained through magnetic resonance imaging (MRI) (see Fig. 6). The rate at which a fluid diffuses across a tissue depends on the internal structure of the tissue. The fluid will diffuse more rapidly in the direction aligned with the internal structure and much more slowly when moving perpendicular to it. In order to describe these changes around a location, it is necessary to make reference to more than two dimensions. A second-order tensor, represented here by an ellipsoid, is used to display the rate of diffusion in multiple directions.

Fig. 5 Close-up showing the distribution of tensors (ellipsoids) representing visibility around a set of monuments (red semi-spheres)

Fig. 6 MRI diffusion tensors shown as 3D glyphs representing the diffusion of fluid in a human brain. Brain dataset courtesy of Gordon Kindlmann at the Scientific Computing and Imaging Institute, University of Utah, and Andrew Alexander, W. M. Keck Laboratory for Functional Brain Imaging and Behavior, University of Wisconsin-Madison

In the archaeological example shown above (Fig. 5), tensors are used to describe how visibility associated with a certain distribution of monuments changes around each location in the landscape. Locations closer to a single monument will show up as an elongated ellipsoid. This is due to the visual predominance of the monument at closer range. A similar result is obtained when several monuments appear to be lined up when viewed from a single position. On the other hand, locations where visible monuments are evenly distributed present well-rounded ellipsoids, indicating the absence of any preferred viewing direction (N.B. for the sake of simplicity, all monuments were assigned the same size). The use of tensors (on this occasion represented as ellipsoids) allows us to describe the visual structuring of the landscape due to the presence of monuments in a way that is closer to that experienced by an individual in the landscape.
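The relationship between the arrangement of visible monuments and the shape of the glyph can be sketched numerically. The construction below is an assumption made purely for illustration and is not the derivation behind Fig. 5: it works in two dimensions, builds a symmetric tensor as the sum of outer products of unit vectors pointing from the viewer to each monument, ignores distance weighting, and treats opposite directions as equivalent. The function names are hypothetical. The two eigenvalues play the role of the ellipse axes.

```python
import math

def direction_tensor(viewpoint, monuments):
    """2x2 symmetric tensor (txx, txy, tyy): sum of outer products of unit
    vectors pointing from the viewpoint to each visible monument.
    (An illustrative construction, not the one used for Fig. 5.)"""
    txx = txy = tyy = 0.0
    for mx, my in monuments:
        dx, dy = mx - viewpoint[0], my - viewpoint[1]
        dist = math.hypot(dx, dy)
        if dist == 0:
            continue  # skip a monument located exactly at the viewpoint
        ux, uy = dx / dist, dy / dist
        txx += ux * ux
        txy += ux * uy
        tyy += uy * uy
    return txx, txy, tyy

def eigenvalues(txx, txy, tyy):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, largest first."""
    mean = (txx + tyy) / 2
    spread = math.sqrt(((txx - tyy) / 2) ** 2 + txy ** 2)
    return mean + spread, mean - spread

def anisotropy(viewpoint, monuments):
    """0 for a circular glyph, 1 for a fully elongated one."""
    major, minor = eigenvalues(*direction_tensor(viewpoint, monuments))
    return (major - minor) / (major + minor) if major + minor else 0.0

# Monuments lined up to the east of the viewer: strongly elongated glyph.
aligned = anisotropy((0, 0), [(10, 0), (20, 0), (30, 0)])
# Monuments at the four cardinal directions: round glyph.
surrounding = anisotropy((0, 0), [(10, 0), (0, 10), (-10, 0), (0, -10)])
print(round(aligned, 3), round(surrounding, 3))
```

Under these assumptions, aligned monuments give an anisotropy of 1 and evenly distributed ones give 0, mirroring the elongated and well-rounded ellipsoids described in the text.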

As mentioned above, the use of scientific visualization techniques has been largely absent in archaeology. For example, the majority of 3D renderings of archaeological excavations make little use of well-established volumetric techniques in this realm. Instead, they are primarily constructed as static post facto illustrations rather than as active aids to interpretation. This likely has to do with the nature of archaeological observations and practices. Given these caveats, the application of these tools and strategies is likely to be more successful in conjunction with simulations than as a way to view and explore empirical data.

Information Visualization

Information visualization is a set of techniques aimed at visualizing data that do not have any inherent geometry, such as the contents of a flat database. The relation between variables may not be well understood or even known to exist a priori. It is up to the researcher to decide how to “map” observations to whatever geometry is most appropriate. It is best applied to exploratory tasks involving a large information space; that is, the person using these techniques often may not have a concrete goal or question in mind. Instead, he or she is interested in examining the data to learn more about it and to make new discoveries (Fekete et al. 2008, p. 2).

According to Chen (1999, p. 27), the process of visualizing information has two stages: an initial “structural modeling” stage through which the main underlying relationships are detected, extracted, and simplified and a second “graphical representation” stage wherein these structural components are visualized and interacted with.

To illustrate the important role that information visualization can play in archaeology, I would like to make brief reference to the notion of “context” as archaeologists often use it. There are ample references to this concept in the archaeological literature, but its precise definition remains elusive, often having various meanings and connotations. Reference to what actually constitutes the meaningful context of any archaeological observation is central to any archaeological approach, to the extent that some archaeologists consider its study to be the main focus of archaeological investigation (e.g., Hodder 1987, p. 120; Barrett 1987, 1994, p. 4). Here I shall argue that the essence of what constitutes “context,” however it is defined, is linked to the archaeologist’s ability to identify/define multiple sets of relationships. Tools developed in areas like information visualization are ideally suited to explore the nature and structure of such sets.

Often, the term “context” refers to two sets of meanings in archaeology, one more restricted than the other. The more restricted sense refers to the set of relationships that archaeologists identify as being associated with any archaeological observation. During excavation, for instance, a particular shard might be interpreted as belonging to a certain scatter or as being associated with a certain structure and/or sediment (Schiffer 1972; Yarrow 2008, p. 125). This sense of context is the one captured in context sheets and site plans. A much wider sense makes reference to the historical circumstances (e.g., economic, social) informing and giving meaning to material culture (Hodder 1987, pp. 1–2; Moore 1986, pp. 2–3; Meskell 2004, p. 249; Jones 2007, pp. 79–80). These two ways of conceiving and referring to the notion of context are related to each other. Indeed, archaeologists frequently concern themselves with the former in order to throw some light on the latter. A thorough discussion detailing the different strategies archaeologists use to bridge both realizations would certainly be helpful, but it is beyond the scope of this paper. Suffice it to say that the way in which we construct both realizations is essentially the same, primarily by selecting, identifying, and defining different types of relationships with other observations. Depending on our scale of analysis and interpretation, the result, in the end, is a more or less complex web of relationships.

For instance, for many archaeologists, the meaning of an object at any point in time cannot be defined exclusively as a function of its immanent properties but must also include reference to the sets of past and present relationships in which it participated (e.g., Jones 2007). If we take a minute to imagine what these sets would look like if represented as a network, it is clear that the network would be massive. Most likely, it would contain many different types of relationships, or if we prefer, be made out of multiple concurrent networks each of a different nature. Our task, as archaeologists, is to identify and recover the relationships between the object and other elements along with its life history (e.g., Kopytoff 1986; Gosden and Marshall 1999) and to determine, of all possible networks, which ones may have been relevant for a particular purpose or setting. The nature of these relationships and the criteria by which they are established will vary widely, ranging from the immediately observable, e.g., spatial, topological, chronological, and meronymic, to those requiring more interpretation, e.g., figurative (metaphorical, metonymic), temporal. Currently, many of these relationships are unaccounted for and remain buried in project databases.

With the increasing use of information systems, traditional forms of recording archaeological contexts have given way to databases and other forms of digital media. While their application is not contested by anyone, their use among archaeologists remains less than optimal. The most sophisticated and widely used method of recording context in archaeology is the relational database. However, there has been remarkably little discussion of how easy or efficient it is to store, access, and query data held in this type of database. Commonly, databases are designed for data entry and, moreover, for a particular form of data entry. Emphasis is on recording data and less on how easy it is to retrieve the data or to generate data output in a certain form. The truth is that, even when archaeologists use complex relational databases, their queries remain very simple and limited in scope. Our ability to find new links among attributes of archaeological entities stored inside a database is curtailed by its limited interface (traditionally in the form of a table).

To illustrate, consider the example of a shard for which we have recorded the traditional kind of information: stylistic elements, dimensions, physical properties of the clay, temper and slip, etc. The range of meanings associated with the shard in the past will most likely be a function of its relationship with other entities in our database. The nature of this relationship may be simple and direct, e.g., sharing the same space, or it may be complex and indirect, e.g., related to an activity that took place in a location where some of these other entities were also present (e.g., Moore’s example of the use of ash, 1986, pp. 127). It is clear that our capacity to investigate this second type of relationship hinges on the possibility of retrieving information that is “deeper” in our database. It might require the intersection of several queries. Furthermore, with each degree of separation, the size of what now constitutes “relevant context” will increase exponentially, requiring a way to prioritize relevant information.
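The growth of “relevant context” with each degree of separation can be made concrete with a minimal sketch. It assumes, purely for illustration, that the relationships recorded in a project database have already been extracted as a simple adjacency list; the entity names, the `relations` dictionary, and the function `context_by_degree` are all hypothetical and do not come from any real dataset.

```python
from collections import deque

# Hypothetical relationships between recorded entities (all names invented).
relations = {
    "sherd_12": ["scatter_A", "layer_3"],
    "scatter_A": ["sherd_12", "hearth_1"],
    "layer_3": ["sherd_12", "structure_B", "ash_deposit_2"],
    "hearth_1": ["scatter_A", "ash_deposit_2"],
    "ash_deposit_2": ["layer_3", "hearth_1"],
    "structure_B": ["layer_3"],
}


def context_by_degree(graph, start, max_degree):
    """Group the entities reachable from `start` by their degree of separation."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    layers = {}
    for entity, degree in seen.items():
        if degree > 0:
            layers.setdefault(degree, []).append(entity)
    return {degree: sorted(names) for degree, names in sorted(layers.items())}


# Entities one and two links away from the sherd.
context = context_by_degree(relations, "sherd_12", 2)
```

Even in this toy case the second degree of separation already holds more entities than the first, which is why some way of ranking or filtering the result becomes necessary as the search deepens.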

The inability to explore this information adequately, i.e., to recover context, is further exacerbated when archaeologists try to gain access to information beyond a single project (i.e., other databases, site reports, publications). This is largely due to the absence of commonly recognized conventions in data recording (see Snow et al. 2006 and Kintigh 2006). It is precisely the ability to elicit information from sources of a diverse nature that makes the development of information visualization techniques in archaeology a key topic for the archaeology of the twenty-first century.

To illustrate how techniques borrowed from information visualization may be useful when exploring a real archaeological example, let us consider a typical archaeological dataset containing information about the lithics from a site. These data come from Close’s (2006) analysis of the assemblage of stone tools made from crystalline volcanic rock at the English Camp site on San Juan Island, Washington, USA. Lithic typology, stages of production, and measurements were recorded in a flat database (originally an SPSS table; see Table 2). The visualization technique presented here is called Treemap (Johnson and Shneiderman 1991). This technique allows the user to interactively organize data within a table into different tree-like hierarchies. The presentation of hierarchical information using a traditional tree quickly becomes cumbersome and ineffective once a dataset reaches a certain size. Treemap, however, is able to render very complex nested structures in a compact space, allowing the user to take in a large volume of information in a single viewing. It does this by representing each branch, and each leaf within it, as a set of nested rectangles. The size and color of each rectangle are proportional to the numerical variable or variables set by the user.

Table 2 A typical spreadsheet table, or flat database, of the kind commonly used by archaeologists. Lithic data provided by A. E. Close and used in her publication (2006)

There are many implementations of the treemap algorithm. Most of them are able to read flat database files with minimal alteration. Like other information visualization techniques, treemap is used to help the researcher uncover patterns and better understand data. The outcome is a very data-rich visualization that may be read in many different ways.
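The way a treemap turns a flat table into nested rectangles can be sketched in a few lines. The example below is a minimal illustration of a slice-and-dice layout, in which the direction of subdivision alternates at each level of the hierarchy; it is not the implementation used to produce the figures in this paper. The rows, the column choices, and the function names (`build_tree`, `weight`, `slice_and_dice`) are assumptions made for the sake of the example, and color is not handled.

```python
from collections import defaultdict

# Hypothetical flat-table rows: (debitage type, platform type, length:width ratio).
rows = [
    ("flake", "single", 1.5),
    ("flake", "single", 2.5),
    ("flake", "multiple", 2.0),
    ("blade", "single", 2.0),
]


def build_tree(rows):
    """Group flat rows into {debitage: {platform: summed ratio}}."""
    tree = defaultdict(lambda: defaultdict(float))
    for debitage, platform, ratio in rows:
        tree[debitage][platform] += ratio
    return tree


def weight(node):
    """Total weight of a leaf (a number) or a branch (a dict of children)."""
    if isinstance(node, dict):
        return sum(weight(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, depth=0, path=()):
    """Yield (path, x, y, w, h) for every leaf of the hierarchy."""
    if not isinstance(node, dict):
        yield path, x, y, w, h
        return
    total = weight(node)
    offset = 0.0
    for name, child in node.items():
        share = weight(child) / total
        if depth % 2 == 0:  # even levels: divide the width
            yield from slice_and_dice(
                child, x + offset * w, y, share * w, h, depth + 1, path + (name,)
            )
        else:  # odd levels: divide the height
            yield from slice_and_dice(
                child, x, y + offset * h, w, share * h, depth + 1, path + (name,)
            )
        offset += share


# Lay the hierarchy out in an 8 x 1 canvas (arbitrary units).
layout = list(slice_and_dice(build_tree(rows), 0.0, 0.0, 8.0, 1.0))
```

Because the canvas area here equals the total of the summed ratios, the area of each leaf rectangle equals the sum for that subcategory. Replacing the sum in `build_tree` with an average, or swapping the order of the grouping columns, yields a different hierarchy from the same table.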

Figures 7, 8, 9, 10, and 11 show different treemap renderings of the data contained in Table 2 that can be instantly generated by the user. It is this ability to rearrange information quickly and render data in compact form that facilitates the discovery of relationships latent within the data. Many of the patterns shown here can also be obtained through traditional means, but only after a much longer period of exploration (as opposed to seconds) and countless additional calculations (Close, personal communication).

Fig. 7 Application of treemap visualization to display flat-database information. The size of each square is proportional to the length:width ratio. Color brightness is also proportional to this value

Fig. 8 The length:width ratio is shown by debitage type. The size of the box representing each debitage type is proportional to the sum of the length:width ratio in that debitage type. This snapshot shows how three types of debitage clearly dominate over the others. It also points out how the less common debitage types tend to be more homogeneous and present larger length:width ratios. Overall it appears to reflect the choice of type list used in this study

Fig. 9 The length:width ratio is shown by debitage type. The size of the box representing each debitage type is proportional to the average of the length:width ratio within every debitage type. Except for some categories (bipolar 1st flake and batonnet) the more prevalent debitage types appear to have, on average, a similar length:width ratio

Fig. 10 The length:width ratio is shown by debitage type and further subdivided by platform type. The size of each box is proportional to the sum of the length:width ratio within each category or subcategory. In this image we can readily compare which platform types are more common for each debitage type and whether all debitage types share the same types of platform

Fig. 11 The length:width ratio is shown for a particular branch of the tree created after we subdivide the assemblage first by debitage type and then by platform type (see Fig. 10). In this particular case the branch corresponds to flakes from a single platform core further subdivided by core type. This shows that it is possible to zoom in and out of any branch within the hierarchy in order to further explore data

Treemap is one of the most popular visualization techniques used to render raw table information into graphical form. There are potentially many other ways to elicit structure from a database based on different metrics and/or other digital media as well as to render this structure visually. Many of these are now becoming standards within the information visualization world and can be found as part of software research libraries and frameworks (e.g., Indiana University School of Library and Information Science’s InfoVis Cyberinfrastructure or Sandia National Laboratories’ ParaView).

Visual Analytics

To conclude this overview of the world of visualization, I shall briefly refer to the most recent development within visualization: visual analytics. As with other visualization fields, it is hard to tell where this field begins and where it ends; however, the primary focus of visual analytics is the use of visualization as the active element of the analytical process. According to Wong and Thomas (2004, pp. 20–21):

Visual analytics is the formation of abstract visual metaphors in combination with a human information discourse (interaction) that enables detection of the expected and discovery of the unexpected within massive, dynamically changing information spaces. […]

Visual analytics is an outgrowth of the fields of scientific and information visualization but includes technologies from many other fields, including knowledge management, statistical analysis, cognitive science, decision science, and many more.

Visual analytics focuses on the process of analysis facilitated by the use of highly interactive visual displays. Emphasis is on the role that representations play within analytic discourse and reasoning. Both exploration and development are essential traits of this area. It fosters the design of new visual interfaces aimed at maximizing human capacity to perceive, understand, and reason about complex and dynamic data. Not surprisingly, these goals have been extended to incorporate spatial and geographic information giving way to the field of Geovisualization (Dykes et al. 2005; Andrienko et al. 2007). The suitability of some of the techniques developed within this field to archaeology has already been explored in a recent article by Huisman et al. (2009). The authors of this paper used a modern version of Hägerstrand’s space–time cube to visualize basic archaeological information. It is still early to determine whether the use of this specific way of visualizing space–time information may be an appropriate one for archaeological purposes. The geographic representation of space–time is likely to be useful for some archaeological approximations but not all. However, the article points out the possibility, even the need, for devising new visual interfaces that help archaeologists explore similar questions from an archaeological perspective (see Bailey 2007; Lucas 2005; Holdaway and Wandsnider 2008).

It is hoped that the discussion and examples presented above have illustrated the potential of and the need for adopting visualizations in archaeology. Fortunately for us, the ever-growing world of visualization already provides us with many possibilities from which we can learn and initiate our own explorations in this field. Ultimately, the success of visualization in archaeology will depend on our ability to develop our own visualization techniques and esthetics. To some extent, this endeavor is directly related to the role information systems play in the production of archaeological information, a theme that was implicit in the previous discussion and that is further developed in the following section.

Towards an Archaeological Information Science

We really cannot talk about the possibilities of information visualization without making reference to the importance of data presentation and data manipulation within an information system, particularly as it applies to archaeology. Equally important is reference to the way archaeologists currently make use of information systems like computers.

The task of unraveling how information systems relate to the production of archaeological knowledge may seem, at first, a truly daunting endeavor, particularly in light of the overwhelming diversity of software applications that archaeologists currently employ. However, the recognition that any information system ultimately relies on the use of one or several data representations, or data structures, provides a useful starting point. In computer science, data structures represent different ways of organizing data, particularly with regard to a set of operations or actions. They are designed with a sense of efficiency in light of specific goals (e.g., to facilitate a fast retrieval, to occupy little space, etc.). They are central to the design and development of algorithms. The nature of data structures can be simple or complex. A simple data structure might just provide a format or template to store information; more sophisticated ones will support a set of operations from which data can be retrieved. They might also be associated with an “algebra,” i.e., a set of operations that allows them to be manipulated—the same way one manipulates numbers when adding or subtracting.
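The difference between a data structure that merely stores information and one that comes with its own “algebra” can be shown with a small sketch. The `Context` class below is hypothetical: it assumes that an excavation context can be reduced, for illustration only, to a name and a set of find identifiers, and it defines union and intersection as the operations by which contexts may be combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical record: a named context holding a set of find identifiers."""

    name: str
    finds: frozenset

    def __or__(self, other):
        # Union: every find recorded in either context.
        return Context(f"({self.name} | {other.name})", self.finds | other.finds)

    def __and__(self, other):
        # Intersection: only the finds recorded in both contexts.
        return Context(f"({self.name} & {other.name})", self.finds & other.finds)


# Invented example contexts and finds.
pit = Context("pit_4", frozenset({"sherd_1", "flake_7", "bone_2"}))
floor = Context("floor_1", frozenset({"flake_7", "sherd_9"}))

shared = pit & floor    # finds common to both contexts
combined = pit | floor  # finds present in at least one context
```

The design choice worth noting is that the result of each operation is itself a `Context`, so operations can be chained in the same way that sums and differences of numbers can. Which operations are defined, and what they are taken to mean, is precisely where the goals of the researcher enter the representation.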

Data Structures and Theory-Ladenness

Archaeologists are no strangers to certain forms of data representations and, by extension, to data structures. The application of predetermined context sheets is essentially an example of their use. However, the topic of data representation within archaeology has not received as much attention as it should, especially in light of the pivotal role it has in the production of archaeological knowledge and its potential to precipitate different interpretations. The consequences of this oversight become deeper and more far-reaching the moment information systems are adopted. It is all too easy for the user to forget that he/she is subscribing to a particular form of data representation.

Nowadays, most archaeologists agree that the empirical basis of any archaeological investigation, i.e., recording and quantification, contains a certain level of interpretation and is related to specific goals. However, discussions on this matter have been mostly reduced to an issue regarding the choice of data (Chippindale 2000; Hodder 1999, pp. 62, 67), not its form. Archaeologists decide what data to collect based on the questions they are asking and on their theoretical orientation. Attention to the embedded nature of theory in data seems to be suspended after a choice has been made, or it is simply dismissed. With some exceptions (e.g., Lucas 2001), the connection between theory and data has not ignited any further developments or translated into different forms of data collection, representation, and processing. This absence is significant in light of the flurry of theories currently in use in archaeology. Theoretical referents, and ideas, are readily incorporated into narratives but seldom trickle down to change the way we gather, organize, and represent our data (for an example within a “phenomenological” approach, see Hamilton et al. 2006).

Concern with the integration of data, its representation, and theory is not new in archaeology. Early studies, many of which have given way to well-recognized standards, in archeozoology (e.g., Grayson 1984) and pottery analyses (e.g., Orton 1982, 1993), focused on how empirical data and subsequent transformations supported certain interpretations. While many of these aimed at resolving questions about data reliability (e.g., what can we say from our density of shards or bone fragments?), the bulk has been directed towards supporting interpretations that were mostly economic and/or about subsistence. This type of focus is noticeably absent with later theoretical orientations. It is within this context that reference to data structures, i.e., the way observations are collected and given shape, becomes relevant to archaeology as a whole.

Hence, the treatment of archaeological information proposed here is that outlined by van Fraassen in his discussion on the role representations play in science (2008). In his synthesis, van Fraassen makes a useful distinction between phenomena and appearances. Phenomena are observable entities (e.g., the shards, the eroded land surface). Appearances are “the contents of observation or measurement outcomes” (ibid., pp. 8). They have to do with how we “capture” information (what we record and how). Constructs such as minimum number of individuals (MNI) (Grayson 1984) and estimated vessel equivalent (EVE) (Orton 1993) are appearances that have already been accepted in archaeology. In the context of this paper, they would correspond to data structures. The key point is that they are subject to theoretical orientations and to the intentions and goals of the researcher.
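A brief sketch may help show how the same phenomena can yield different appearances. The example below derives two measures from one set of invented bone records: the number of identified specimens per taxon and a deliberately simplified MNI, taken here as the count of the most frequent element and side for each taxon. This simplification is an assumption of the example; it ignores age, size matching, and fragmentation, and is not offered as the procedure set out by Grayson (1984).

```python
from collections import Counter

# Hypothetical bone records: (taxon, element, side).
records = [
    ("deer", "femur", "left"),
    ("deer", "femur", "left"),
    ("deer", "femur", "right"),
    ("deer", "tibia", "left"),
    ("seal", "humerus", "right"),
]


def specimen_counts(records):
    """Number of identified specimens per taxon."""
    return Counter(taxon for taxon, _element, _side in records)


def minimum_number_of_individuals(records):
    """Simplified MNI: the largest count of any single element and side per taxon."""
    counts = Counter(records)  # specimens per (taxon, element, side)
    mni = {}
    for (taxon, _element, _side), n in counts.items():
        mni[taxon] = max(mni.get(taxon, 0), n)
    return mni


nisp = specimen_counts(records)
mni = minimum_number_of_individuals(records)
```

With these records the deer are represented by four specimens but by a minimum of two individuals. Neither figure is the observation itself; each is a construct shaped by a prior decision about what should be counted and how.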

Archaeologists seldom go directly from observations to producing an interpretation; instead, they generate tangible pieces of information or “reasoning artifacts” (Thomas and Cook 2005, pp. 36; Gooding 2008, para. 50) along the way that contribute towards supporting a final interpretation. Data structures, when used to organize archaeological observations, can be employed in the same capacity and provide a new venue for exploring the connection between data and theory in archaeology. Unlike archaeological observations, where reference to their theoretical content is seldom explored, data structures are by definition arbitrary so there is no question about their nature being interpretative. The fact that they organize observations explicitly and that their manipulation is done via a set of operations defined a priori provides transparency and flexibility. Indeed, it is the marriage between data and purpose that makes them so powerful and appealing.

A simple example may help illustrate how the use of certain data structures can precipitate new forms of archaeological investigation. The example, provided by Helbing et al. (1997), comes in the form of a simulation used to understand how human trails emerge naturally without the need for central planning. It is not an archaeological example per se but could easily form the basis of a model exploring the transformation of a cultural landscape through time. Helbing and his colleagues borrowed the concept of “social force,” first introduced by social psychologist Kurt Lewin (1938, 1951), as a way to describe how the existence of previous markings on a terrain exerts a sort of attraction that influences people’s choice of direction when moving between two locations. The attraction of the markings generates a potential field that may be represented by a mathematical vector field. Their simulation was successful in explaining the emergence of human trail systems and other more dynamic aspects surrounding the transformation of those trails through time. It is easy to imagine how the idea of potential fields, and their representation, could be extended to model the impact of other features. Such a simulation could be used to explore how various features on a landscape “acted on,” or influenced, the movement of individuals and to assess what differences, if any, the emergence of new features, like fences, paths, monuments, or ditches, had over patterns of movement through time. Features, as forms of materialized social rules (DeMarrais 1996, pp. 11–21), would act as agents affecting space around them. This is in line with how Gell (1998, pp. 16–17) understands the concept of material agency:

An agent is one who ‘causes events to happen’ in their vicinity.[…] Whereas chains of physical/material cause-and-effect consist of ‘happenings’ which can be explained by physical laws which ultimately govern the universe as a whole, agents initiate ‘actions’ which are ‘caused’ by themselves, by their intentions, not by the physical laws of the cosmos[…] Whenever an event is believed to happen because of an ‘intention’ lodged in the person or thing which initiates the causal sequence, that is an instance of ‘agency.’

Agents are responsible for initiating causal sequences that are acknowledged as acts of mind, will, or intention rather than the mere concatenation of physical events. Gell recognizes that no object can be ultimately considered to be the origin of a causal chain, but this is not necessarily how people culturally experience objects. Whether a feature was perceived as something an individual wanted to walk towards or away from, it would be behaving as an actor influencing movement across the landscape.

By representing the effect of previous markings with a vector field, Helbing and his colleagues essentially provide an example of how the agency of markings generated by previous walkers may be represented and integrated into a model. This example is significant for two reasons. First, it demonstrates how the notion of agency may be explicitly incorporated into the investigation, via its vector field representation, at a level that is closer to the data, as an appearance (sensu van Fraassen 2008, pp. 8). Theoretical constructs, such as agency, are often only incorporated at the later stages of a study when the interpretation is being narrated (e.g., Dobres and Robb 2000). Second, if this form of representing agency is accepted, how should we interpret the results of this simulation? Regardless of the exact outcome, the results are likely to be quite robust and would exhibit the kind of vertical independence that Wylie (1999, pp. 176) has identified as a form of validation in archaeology. This example illustrates how archaeological theory may directly inform the choice and creation of data structures used within an information system. It also hints at the possibilities that the incorporation of new forms of representation may bring to archaeology (see Miller and Wentz 2003 for a similar discussion regarding GISc).
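The idea of representing the pull of markings as a vector field can be sketched as follows. This is a loose illustration and not the model of Helbing et al. (1997): it assumes that each marking attracts a walker with a strength that decays exponentially with distance, and that the walker's heading is the normalized sum of the direction to the destination and the combined pull of the markings. The function names, the decay parameter `sigma`, and the coordinates are all hypothetical.

```python
import math


def attraction(position, markings, sigma=2.0):
    """Sum of pulls toward each marking, decaying exponentially with distance."""
    px, py = position
    fx = fy = 0.0
    for (mx, my), strength in markings:
        dx, dy = mx - px, my - py
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # standing on the marking: no directional pull
        pull = strength * math.exp(-dist / sigma)
        fx += pull * dx / dist
        fy += pull * dy / dist
    return fx, fy


def heading(position, destination, markings, sigma=2.0):
    """Unit vector blending the direction to the destination with the markings' pull.

    Assumes the walker has not yet reached the destination.
    """
    dx, dy = destination[0] - position[0], destination[1] - position[1]
    dist = math.hypot(dx, dy)
    ax, ay = attraction(position, markings, sigma)
    vx, vy = dx / dist + ax, dy / dist + ay
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm


# Hypothetical setting: one worn patch to the north-east of the walker.
markings = [((1.0, 1.0), 1.0)]
direct = heading((0.0, 0.0), (5.0, 0.0), [])
deflected = heading((0.0, 0.0), (5.0, 0.0), markings)
```

Without markings the walker heads straight for the destination; with the worn patch present the heading is bent towards it. Other features, such as fences or monuments, could in principle be given positive or negative strengths in the same field, which is the sense in which a theoretical construct becomes part of the data representation itself.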

Interest in these matters requires a deliberate focus on questions that lie at the interface between theory and data representation that are currently absent in archaeology. It also calls for a concerted interest in the study of information structures and a discussion about their potentials, limitations, and appropriateness. These goals represent a significant departure from, or at least an important overhaul of, the prevailing way archaeologists have conceived and approached the use of information systems in archaeology.

The Current State of Archaeological Computing

In recent years, archaeology has experienced a substantial increase in the use of information systems. A quick glance at the proceedings of the main conference on this topic (Computer Applications and Quantitative Methods in Archaeology or CAA) and journals (e.g., Archeologia e Calcolatori) shows a staggering panoply of interests and topics. One might infer from the volume and diversity of the applications that these make reference to a well-established subfield in archaeology. This is far from the case (contra Hodder 1999, pp. 117, and Frischer 2008). A critical review of many of the studies found in these venues would conclude that most of them:

  • Do not make direct reference to any coherent body of theory and/or to archaeology at large. Concordance between these studies and broader theoretical concerns is mostly lacking or tenuous

  • Reflect the possibilities and limitations dictated by whatever software is currently being applied but not necessarily the goals, needs, and aspirations of archaeologists

  • Focus on the narrow implementation/technical details surrounding the applications without actively engaging in the bettering of these tools

  • Do not follow clear and well-defined research lines or paradigms

  • Are not part of any well-established curriculum within archaeology

Despite their long presence in archaeology, the impact of computer applications has been surprisingly limited (for a similar view, see Lock 2003, Ch. 8), i.e., they have not been part of any radical departure in how we conduct archaeology. This does not mean that the advent of information systems has not changed how we perform certain tasks. We are certainly able to collect, process, and store data much faster and in larger quantities. But changes seem to have been more quantitative than qualitative. We are able to record information much more quickly in the field, but to what degree is this “new information”? How much has it changed the way we conduct our analysis? We have the capacity to process and visualize information in novel ways, but are we actually doing this? More importantly, are we even thinking about new possibilities? How do these new developments relate, if at all, to theoretical orientations currently found in archaeology? Has the introduction of information systems precipitated new ways of doing archaeology?

There is no denying that the use of information systems is allowing us to obtain new insights, but these remain few and far between, marginal in scope, and seldom championed by archaeologists themselves. This is because the use of information systems is still reduced to a desirable technical skill that some archaeologists manage to “add on” to their bag of tricks. There is little recognition that the intersection of information systems (computers primarily) and archaeology provides new paradigms and/or research venues. Furthermore, there is little recognition that such intersection requires its own set of standards and most likely its own specialized curriculum. Evidence of this handicap is clear when we reflect on how rarely the most basic trait computers have to offer, i.e., handling vast amounts of data and cycling endlessly through similar calculations, is harnessed within archaeology.

A 2007–2008 report on the status of archaeology as a profession in 12 European countries commissioned by the UK’s Institute for Archaeologists concluded that while the need for IT had steadily grown in recent years and was accepted as being a necessity, it was considered to be a “non-archaeological” skill (Jefferey and Aitchison 2008). Furthermore, when confronted with many of the known limitations inherent in current IT applications, archaeologists have often failed to deal with them in a productive way. This is the result of lack of proper training and sometimes ignorance about the proper use of applications. In some cases, archaeologists brush off any concerns about these deficiencies by claiming that they are just archaeologists and not computer scientists. They are simply applying what is on offer and that is that. All that is needed then is to mention the limitation to make it right and to proceed as usual. Actively engaging in understanding the underpinnings of applications, let alone developing them, is generally not considered to be archaeology even among many of those trained in IT within the discipline.

This situation is further perpetuated in the academic world where, with the exception of some European postgraduate courses, there is an absence of a proper curriculum. Even in the European cases, and with a few exceptions, most curricula have aimed at training competent technicians but not researchers. That is, the emphasis has been on training archaeologists as “competent” consumers of IT. “Competency” refers in most cases to becoming familiar with the basic workings of an application. There has been limited effort to train archaeologists in how IT may be integrated within current archaeological discussions (both at a practical and theoretical level), to provide them with the skills needed to actively engage in the development of new IT tools consonant with archaeological interests, to foster a deeper conceptual understanding of how applications work as a necessary step towards the creation of new ones, and finally, to entertain discussions, or initiatives, on how IT may precipitate new ways of doing archaeology beyond what is readily available. Yet thanks to their ubiquity and moderate cost, computers are an invaluable resource for archaeology, perhaps far more central than archaeologists would like to concede. Without proper training, however, this resource can easily and inadvertently be abused.

A call for new ways to integrate information and computer science with archaeology beyond technical prowess has not yet been made. If we consider the adoption of computing and other information systems as a process leading towards further insight and maturity, archaeologists need to move beyond the level of technicians, where they currently are, to that of full-blown researchers. This would represent the next logical step from where we are now positioned. This is already implicit in recent articles by Snow et al. (2006) and Kintigh (2006) when discussing the need for an information cyberinfrastructure for archaeology. Both of these articles focus on the challenges behind the creation of an information infrastructure (i.e., protocols, standards, and tools) that would bring the integration of archaeological data under a common digital “umbrella”, ultimately allowing greater data sharing amongst archaeologists. While neither of them discusses the necessity for specialized training directly, the full benefits of such a cyberinfrastructure will only be realized if archaeologists have the preparation needed to reap them.

In order for the marriage of computer and information science with archaeology to be successful, it is necessary for some archaeologists to attain what some authors refer to as an ‘amphibious state’ (Bunge 1973; Wylie 2002), that is, a state in which the level of competency attained in each of the disciplines is such that it enables them to move from one discipline to another with ease and, more importantly, to generate novel insights. To form this set of specialists there is a need for a specialized curriculum, one that will initially borrow elements from computer and information science but that will ultimately need to develop its own paradigms and methods. It is the responsibility of those of us who may have already glimpsed the possibilities that lie ahead to engage in such an endeavor, and to communicate these insights effectively to the rest of our peers.

To this day the application of information systems by archaeologists has been at the very least narrow, and at the very worst naïve. It needs to be broadened in order to capitalize on the advantages that the manipulation of digital information can bring to archaeology. The intersection between information and computer science and archaeology calls for a new concerted focus in archaeology, which we could label Archaeological Information Science (AISc). While a call for an additional specialism (curriculum, etc.) may contribute towards further fragmenting the discipline of archaeology, the price of neglecting this call is simply too high. Broadly defined, AISc would be concerned with the generation, representation and manipulation of archaeological information within the context of information systems. It would call for archaeologists becoming more skillful and having a more pro-active role in the use and design of these systems. While we could think of AISc as not being restricted to the application of computers in archaeology (i.e., we can think of data representations without reference to computers), both become inseparable once the latter are adopted. With this in mind, AISc does not seek the reduction of all archaeological information into a set of formal representations but rather mindfulness about their use in archaeology. It aims to capitalize on the potential advantages that come from generating, representing, visualizing and manipulating archaeological information through the use of computers and other information systems.

Concluding Remarks

In this article I have discussed the possibilities of archaeological visualization. The current field of visualization is a very heterogeneous area that has been recognized as playing a central role in processes of discovery, analysis, interpretation and communication. The richness of data, and the types of queries, that archaeologists handle make the development of new forms of visualization, new interfaces and exploratory strategies a potential research topic in the future of archaeological investigation.

The relevance of archaeological visualization becomes clear when it is viewed as part of a larger disciplinary focus called Archaeological Information Science (AISc), a specialty predicated on the possibilities that the intersection of Computer and Information Science with Archaeology brings. This specialty would follow from the possibilities that information systems offer to capture, represent, manipulate, analyze and model archaeological information, each of which provides fertile ground for connecting archaeological theory and practice.

To pursue the possibility of an AISc, archaeologists interested in the potential offered by the digital revolution need to reconsider their attitude towards the use of information systems and computing. In addition, the archaeological community as a whole needs to shift the way it conceives of the use of computers within the discipline to be able to capitalize on these possibilities. This can only be achieved by accepting the need for a new breed of specialists within archaeology (see Scott 1991, pp. 177–8, for a similar call in the analysis of lithics), and by recognizing that the treatment of archaeological information goes beyond the mere application of whatever software is currently available. Instead, it represents a new area where archaeologists can focus on discussions about the nature of archaeological data: on its definition, representation and manipulation. There must also be an acknowledgement that the digital representation of archaeological information can precipitate new forms of doing archaeology. This is not limited to novel ways of ‘capturing’ and visualizing data. It includes new ways, and standards, of handling, processing and modeling this information as well. Failure to identify the centrality of these questions, and to develop the relevant skills to tackle them within archaeology, represents a lost opportunity to steer the discipline in new directions. Furthermore, it will have a direct impact on our ability to handle information in the future.

To conclude this paper, it is worth reflecting on the development of Geographic Information Science (GISc), and on the insight it offers into the development of an AISc. It took several years after GIS were first introduced before a community of people emerged who felt that questions surrounding their application and design represented a new research area. Central to this development was the recognition that many of the questions addressed were beyond the purview of geography alone, requiring contributions from other fields (e.g. Computer Science). We can expect a similar requirement for the development of AISc. Before that can happen, we first need to accept and embrace what information science and technologies can provide to archaeology, not as passive bystanders but as active participants.