Archaeological research on ancient cities and urbanism exhibits a wide range of perspectives on the use of theory. Some archaeologists pursue descriptive research with little explicit use of theory, while others embrace high-level social theory. Still others chart a course between these poles by using lower-level concepts. Work in this latter category is more abstract than empirical description but more grounded and less abstract than high-level social theory. I call this body of work “empirical urban theory,” and in this paper I argue that this is an especially productive explanatory approach for understanding ancient cities and urban life. In defining “theory” I follow anthropologist Roy Ellen, who states that theories,

provide us with a framework through which we can explain and interpret data, and they should do so parsimoniously. So, we might define theory as ‘A supposition or body of suppositions designed to explain phenomena or data’ (Ellen 2010: 390).

The book, Ancient Cities: The Archaeology of Urban Life in the Ancient Near East and Egypt, Greece, and Rome, by Charles Gates (2003), exemplifies the descriptive or empirical end of the theory continuum. Gates covers a large number of ancient cities, but does not concern himself much with theory. Discussion of life in these cities (a topic that usually requires some engagement with theory) is kept to a minimum; his focus is on the forms of cities and their chronology. I think many archaeologists and other urban scholars would agree that this kind of non-theoretical research limits our ability to understand ancient cities in social terms.

In a recent essay, Norman Yoffee (2009) criticized the chapters in a major edited collection (Marcus and Sabloff 2008) as “curiously under-theorized” (p. 281). By suggesting that archaeologists should use Bruno Latour’s actor-network theory and Pierre Bourdieu’s concepts of habitus and doxa to understand ancient cities, Yoffee appears to be advocating research at the opposite end of the continuum, that of high-level social theory. In a similar plea, Arthur Joyce (2009) calls on Mesoamericanists to apply post-structuralist theory to ancient Mesoamerican cities.

High-level social theory, termed “grand theory” by sociologist C. Wright Mills (1959: 25–49), is popular among many archaeologists. Although Latour has not been commonly cited, the ideas of Bourdieu, Anthony Giddens, Michel Foucault, Henri Lefebvre, and other prominent social theoreticians are regularly invoked in studies of ancient cities and urbanism (e.g., Ashmore 2002; Blake 2004; Fisher 2009; Joyce 2000; Joyce 2009; Smith 2003a, c). High-level theoretical schemes describe how the social world works on a very abstract, philosophical level, and as a result their utility in the analysis of particular empirical cases is rather limited (Ellen 2010). In the words of Mills, grand theory is “so general that its practitioners cannot logically get down to observation. They never, as grand theorists, get down from the higher generalities to problems in their historical and structural contexts” (Mills 1959: 33).

In their empirical studies, archaeologists who enjoy high-level theory typically cite such authors in their introductions, and perhaps again in their conclusions, but rarely during the course of their analyses of data. In the words of Kevin Fisher,

While Giddens, Goffman and others provide an overall theoretical orientation for examining the relationship between architecture, interaction and social transformation, their work does not offer the tools needed to analyze the material remains on the ground (Fisher 2009: 440).

The empirical theories discussed in this paper provide those conceptual and methodological tools. These theories are for the most part drawn from the disciplines of architecture, planning, geography, environmental psychology, and other fields. Their concepts (e.g., monumentality, access, visibility, planning, and levels of meaning) directly link the urban-built environment to the actions of people within cities.

Mertonian Middle-Range Theory for Ancient Cities

“Empirical urban theory” is a collection of theoretical approaches that operate on a lower epistemological level than grand social theory; they are located somewhere near the center of the epistemological continuum mentioned above. Outside of archaeology, this kind of research is referred to as “middle-range theory,” a term introduced by sociologist Robert K. Merton in the 1950s (Merton 1968). This is a major topic of research in sociology, political science, and other social sciences, with numerous case studies, conceptual papers, and analyses by philosophers of science (see discussion below). Middle-range theory is even discussed in the field of folklore research, where it has been termed “humble theory” (Noyes 2008). [Footnote 1]

In archaeology, the phrase “middle-range theory” was hijacked by Lewis Binford (1977) to refer to an idiosyncratic body of theory on formation processes. For Binford, the phrase “middle-range” refers to empirical processes that mediate between the static and dynamic poles of archaeological interpretation. This is quite different from the domain of theory that lies between the descriptive and high-level theory poles of the epistemological continuum. In spite of a few calls to limit the term middle-range theory to its sociological meaning, however (Raab and Goodyear 1984; Shott 1998), most archaeologists continue to associate the concept with Binford and formation processes of the archaeological record (e.g., Forslund 2004; Johnson 2010; Tschauner 1996; Varien and Ortman 2004). [Footnote 2]

Although later writers on Binfordian middle-range theory in archaeology sometimes mention Merton’s ideas, in most cases they mischaracterize them. In fact, I have found only three archaeologists who accurately describe Merton’s notions in print: Michael Schiffer, Michael Shott, and Robert Bettinger. Shott states,

By whatever name we call it, making general theory susceptible to testing against empirical observation required an intermediate body of theory that was itself directly testable, theory that simultaneously embodied abstraction and groundedness (Shott 1998: 302).

Schiffer (1988: 462) pointed out some time ago “the need for middle- and lower-level principles to mediate between the most abstract theories and empirical reality,” and he provides a good discussion of the Mertonian concept of middle-range theory; see also Bettinger (1987). Archaeologists have been slow to develop theory on this level, particularly for the study of cities and urbanism. Mertonian middle-range theory is not part of “archaeological theory” as normally construed (Bentley et al. 2008; Hegmon 2003; Hodder 2001; Johnson 2010; Schiffer 2000), and it is rarely discussed as such in cultural anthropology (although see Ellen 2010).

In spite of this neglect by archaeologists, I suggest that Mertonian middle-range theory, and by extension the epistemological hierarchy of which it is part, are crucial to the tasks of explaining and understanding the past. [Footnote 3] A number of archaeologists and anthropologists have discussed levels of theory and their significance. Schiffer (1988), for example, has discussed this hierarchy in terms of comprehensiveness and empirical content. Descriptive accounts, at the “low” end of the continuum, are less comprehensive with more empirical content, while high-level social theory is far more comprehensive, but with little empirical content. Lyman (2007: 135) identifies two levels of theory. He describes “a structure consisting of ‘subtheories’ as less comprehensive, particularistic parts that in combination comprise more comprehensive, general theories.” He justifies this framework with citations to biological evolutionary theory. Ellen (2010) identifies three levels of anthropological theory: grand theory, middle-range theory, and thick description. He notes that “A theory at the top of the pyramid is not a better theory, only a simpler one that claims to explain a wider range of data, but with much less to say about any particular case” (Ellen 2010: 399).

Archaeologists working within a postprocessual framework, on the other hand, tend to ignore, downplay, or dismiss the importance of Mertonian middle-range theories and the epistemological hierarchy that furnishes their context. Postprocessualists have devoted considerable effort to attacking Binford’s notion of middle-range theory (that is, concepts about formation processes); Forslund (2004) reviews this literature. While these scholars find numerous things to disagree with in Binford’s work, one objection is the idea that there is such a thing as levels of theory that are autonomous from one another. In a number of works, Ian Hodder, for example, argues strongly against “this separation of levels or types of theory” (Hodder 1982: 5–6), claiming that the phrase middle-range theory “seems redundant” (Hodder 1986: 117); see also Hodder (1999: 60). Similarly, Johnson’s recent textbook (Johnson 2010) does not acknowledge the possible existence of autonomous levels of theory; I return to this issue of levels of theory in the conclusions below. [Footnote 4]

Postprocessualists tend to leap directly between grand theory and the archaeological record without concern for middle-range theory, and this may be one reason why the approach has made so little progress in understanding ancient cities and the built environment (Blanton 1995). I share Stephen Lekson’s puzzlement over the predominant position of high-level theory in archaeology: “Why American archaeologists favor Lefebvre and ignore Amos Rapoport is beyond me” (Lekson 1996: 579). Geographer Jan Nijman made the following comment to me: “How can archaeologists pretend to know so much [about ancient cities] based on so little evidence, while current urbanists have complete working cities to observe yet they cannot agree on anything! The answer is, I think, that archaeologists do not bother with mid-range theory but jump from detailed (sometimes miniscule) empirical evidence to grand theory about city and polity formation” (Jan Nijman, personal communication, 2010).

The empirical urban theory covered in this paper consists of social concepts concerning urbanism that have identifiable expressions in the archaeological record, along with methods for addressing those concepts. Although some archaeologists have made effective use of these bodies of theory, this material remains infrequently cited in studies of ancient cities. Instead of invoking abstract social theory that may or may not apply to past urban settings, as advocated by Yoffee, Joyce, and others, I suggest that research on ancient urbanism will progress more rapidly through an exploration of concepts at the level of empirical urban theory. Toward that end I now provide a more detailed exposition of Merton’s model of middle-range theory.

Robert K. Merton described middle-range theory as follows:

Middle-range theory is principally used in sociology to guide empirical inquiry. It is intermediate to general theories of social systems which are too remote from particular classes of social behavior, organization and change to account for what is observed and to those detailed orderly descriptions of particulars that are not generalized at all. Middle-range theory involves abstractions, of course, but they are close enough to observed data to be incorporated in propositions that permit empirical testing. Middle-range theories deal with delimited aspects of social phenomena (Merton 1968: 39–40).

Merton’s middle-range theory has recently been described as follows: “Not mindless empiricism and not abstract theory or theory about other theorists. Merton developed theory about how the world works” (Sampson 2011: 72); emphasis in the original. A recent work in sociology describes middle-range theory as:

a clear, precise, and simple type of theory which can be used for partially explaining a range of different phenomena, but which makes no pretense of being able to explain all social phenomena, and which is not founded upon any form of extreme reductionism in terms of its explanans [the factors invoked to explain a phenomenon]. It is a vision of sociological theory as a toolbox of semigeneral theories each of which is adequate for explaining a limited range or type of phenomena (Hedström and Udéhn 2009: 31).

Within sociology, middle-range theory is typically contrasted with high-level social theory (Boudon 1991; Sampson 2011; van den Berg 1998). One contemporary trend of middle-range theorizing focuses on the concept of “social mechanisms,” defined as “an intermediary level of analysis in-between pure description and story-telling, on the one hand, and universal social laws, on the other” (Hedström and Swedberg 1996: 281). The study of social mechanisms is one part of a broader approach called “analytical sociology” (Hedström 2005; Hedström and Udéhn 2009). Social mechanisms are lower-order social processes that have a causal component: “The basic idea of a mechanism-based explanation is quite simple: At its core, it implies that proper explanations should detail the cogs and wheels of the causal process through which the outcome to be explained was brought about” (Hedström and Ylikoski 2010).

In the words of Hedström and Swedberg (1998: 24), research on social mechanisms represents “the essence of middle-range sociology and expresses the idea that sociology should not prematurely take on broad-sweeping and vague topics or try to establish universal social laws (which are unlikely to exist in any case). It should instead aim at explanations specifically tailored to a limited range of phenomena.” Mechanism-based explanation has become a significant trend not just in sociology, but also in political science (Gerring 2007) and in the historical social sciences more generally (Bates et al. 1998; Mayntz 2004; Pawson 2000). Grounded in the scientific realism approach in the philosophy of science, the mechanism approach to causal explanation is typically contrasted with the covering-law view of explanation (Bunge 2004; Demetriou 2009; Pawson 2000), which was adopted in the 1960s and 1970s by the new archaeologists (Watson et al. 1971). While it is possible to express the empirical social theories reviewed below in terms of such mechanisms, that is not my intention here. I mention the concept of mechanisms strictly as an illustration of the kind of epistemological approach and theoretical level occupied by empirical social theory.

The “limited range of phenomena” (Hedström and Udéhn 2009: 31) targeted by empirical urban theory concerns the material remains of ancient cities. To avoid potential confusion with Binford’s archaeological concepts, I borrow the phrase empirical theory from recent work in other social science disciplines. Political scientist Margaret Levi (1997: 21), for example, discusses “empirical theory” as middle-range theory associated with rational choice theory, and differentiates it from postmodern theory. Anthropologist Murray J. Leaf writes:

By an empirical theory, I mean something we can verify or falsify on the basis of shared experience. It also explains, and it does so in a very specific way. It does not work in the manner of a just-so story or an ideology... Ideologies are logically circular systems of claims and definitions designed to be held true no matter what, usually by including some claim to the effect that they do not describe mere appearances or mere individual experiences but something we cannot observe directly that lies behind them and produces them. Many social theories are of these sorts (Leaf 2009: 7).

A similar perspective is expressed by sociologist Archibald Haller:

The concept ‘empirical theory’ calls attention to systems of concepts describing the ways the elements of a given limited domain of phenomena work together. Such concepts are intended to have measured variables or other verifiable observations as their mirror images... In the natural sciences empirical theory has become so routine that it is simply taken for granted. Measurement technique and theoretic concept are so close to each other that they are interchangeable... Matters differ, however, in the field of sociology. Here empirical theory exists along side of widely regarded philosophical efforts, so-called ‘theories’ whose empirical referents are obscure, and points of view that are little more than ideologies (Haller 2009: 3).

Whether or not one accepts the claim that grand social theories are “little more than ideologies,” I think many archaeologists will agree with Ellen (2010) and Fisher (2009) that high-level social theory is of rather limited utility in carrying out their nuts-and-bolts analyses of ancient wall foundations, potsherds, and street plans. In this paper, I review several categories of empirical urban theory relevant to the archaeological analysis of ancient urban settlements. My goal is not to conduct a full intellectual analysis of this material, but rather to bring these theories to the attention of a broader range of archaeologists and ancient historians. I focus on theory and concepts that relate most closely to three aspects of urbanism of broad interest in archaeology: the layout or form of cities, urban planning, and the social dynamics of urban life.

I organize my discussion into eight bodies of empirical theory, several of which include a number of distinct approaches or emphases. I do not claim that this is the best way to categorize urban theory conceptually, but it is a convenient working typology that is useful for exposition. The best archaeological analyses of ancient cities and urbanism tend to combine concepts and methods from several of these categories in ways that are more complex and nuanced than my scheme may suggest (e.g., Fisher 2009; Moore 1996a, b; Moore 2005; Sanders 1990; Smith 2003a). Some of these topics have received considerable attention from archaeologists (e.g., monumentality and space syntax), others have been applied only within specific regional traditions (e.g., urban morphology in medieval urban archaeology), and some of this material seems to have largely escaped our notice so far (e.g., generative planning theory).

Environment-Behavior Theory

Environment-behavior theory is concerned with the recursive relationship between the actions of people and their built environment. Work in this area sometimes invokes a famous quote from Winston Churchill, “We shape our buildings; thereafter they shape us.” The most comprehensive body of work in environment-behavior theory is that of Amos Rapoport, who defines his approach with three questions: (1) What characteristics of human beings influence particular characteristics of built environments? (2) What effects do built environments have on people, and under what circumstances? (3) What mechanisms link humans and the built environment? (Rapoport 2006: 59). Rapoport has worked with Susan Kent and other archaeologists, and his chapters in archaeological publications (Rapoport 1990b; Rapoport 2006) are good introductions to the most archaeologically relevant aspects of his extensive corpus of scholarship.

One of Rapoport’s important concepts is the notion of levels of meaning in the built environment (Rapoport 1988a, b, 1990a, b). Although “environment-behavior studies” as articulated by Rapoport covers all three levels of meaning (low, middle, and high), I emphasize his concept of low-level meaning in my discussion of environment-behavior theory. Low-level meanings focus on mnemonic cues for identifying the uses for which settings are intended, enabling users of a building, city, or space to behave and act appropriately and predictably. These are related to social situations, expected behavior, privacy, accessibility, penetration gradients, seating arrangements, and movement. Rapoport’s concept of middle-level meanings is discussed below under architectural communication theory, and high-level meanings are discussed with the topic of normative theory.

The work of Jerry Moore provides a good illustration of the productive archaeological use of environment-behavior theory. His books on Andean built environments (Moore 1996a, b, 2005) contain numerous examples of empirical urban theory, with individual analyses often combining two or more of the categories outlined in this paper. To illustrate environment-behavior theory, I single out chapter 4, “The Architecture of Ritual,” of Architecture and Power in the Ancient Andes (Moore 1996b). Moore draws on a range of social and spatial theories, including Edward Hall’s (1966) proxemics, the ritual field model of Victor Turner (1974), and the phenomenological landscape approach of Tadahiko Higuchi (1983), in order to derive five material-spatial attributes of ritual architecture. These attributes (permanence, centrality, ubiquity, scale, and visibility) can be objectively measured from archaeological remains, and their analysis permits inferences about the nature of ritual activities, the uses of spaces, and broader patterns of prehistoric social dynamics. Moore’s five-attribute scheme is strongly linked to the empirical archaeological record, but it is theoretically derived and permits inferences about the activities of people in the past; in other words, it is an example of empirical urban theory (this line of analysis is extended in Moore 2005: chapter 3). Axel Nielsen (1995) and William Cavanagh (2002) present schemes for the analysis of ritual activities in relation to the built environment that are broadly similar to Moore’s, and this approach holds great promise for further elaboration.

Jerry Moore’s discussion of the uses and significance of Andean urban plazas in the work discussed above is expanded in a separate article (Moore 1996a). Takeshi Inomata (2006) analyses Classic Maya urban plazas as political theaters, another example of productive empirical theory. He draws on a very different range of high-level theory than Moore (e.g., performance theory, practice theory, and works on community, hidden transcripts, and nonverbal communication), illustrating the fact that empirical theories are often independent of specific bodies of high-level theory (see further discussion of this issue below).

Environment-behavior theory can also be applied to residential contexts. The use of space syntax methods to analyze residential compounds can be considered as a kind of environment-behavior theory, although here I separate space syntax as a distinctive body of method and theory. Donald Sanders’s (1990) analysis of residential compounds at Myrtos on Crete is a synthesis of Rapoport’s approach and semiotics, permitting a detailed analysis of personal space, territoriality, privacy, and boundaries in a residential setting. Also on Crete, Letesson and Vansteenhuyse (2006) draw on the work of Hall and Higuchi to expand understanding of Minoan palaces.

Architectural Communication Theory

Architectural communication theory is concerned with the ways in which planners and architects design cities and buildings in order to communicate specific messages, typically of a social and political nature. This body of theory relates to Rapoport’s model of middle-level meaning, in which deliberate statements about identity, status, wealth, power, and other traits are communicated through buildings and cities (Rapoport 1988a, 1990a). The concept of “materialization of ideology” (DeMarrais et al. 1996) is closely related to architectural communication theory. Most studies in this area focus on civic architecture rather than residential contexts.

Perhaps the most common theme in archaeological applications of architectural communication theory is monumentality. The scale of civic architecture is typically used by materialist archaeologists to measure the extent of economic or political power commanded by rulers (e.g., Abrams 1994; Smith 2008). An influential paper by Bruce Trigger (1990) provides a conceptual basis for this approach. Joyce Marcus (2003) cautions that the argument that “scale equals power” may not hold across cultures or traditions, although it does appear to hold within many individual urban systems. Architectural communication is not only about large buildings, however. In an example from historical archaeology, Leone and Hurry (1998) show how the designers of St. Mary’s City in colonial Maryland employed European Baroque principles of urban planning, such as “lines of sight to direct eyes to points of reference in space that represented hierarchy, and monarchy in particular” (Leone and Hurry 1998: 36). Other archaeological studies of monumental architecture and political communication include work by Blanton (1989), Kolb (2005), and Moore (1996b).

Richard Blanton’s (1994) model of canonical and indexical communication is an important contribution to architectural communication theory. Alternative types of identity are communicated through the vernacular architecture of state-level societies. When people build their own houses, “canonical communication” describes their use of architectural features to signal a household’s participation in a broader cultural tradition, whereas “indexical communication” involves claims of advancement in wealth or status. These concepts can be generalized to cover other types of architectural communication. For example, the use of monumental architecture to signal power is a type of indexical communication, and the use of archaic styles to mark urban memory can be seen as a form of canonical communication.

My analysis of civic architecture and planning in Aztec city-state capitals (Smith 2008) focuses on several examples of architectural communication as expressed in the urban built environment. The sizes and monumentality of civic buildings such as temples and palaces not only proclaimed the power of city-state kings, but the practice of constructing such buildings also helped generate commoner identification with, and allegiance to, the king in ways discussed by Pauketat (2000), Cowgill (2003) and others. The high degree of standardization of the forms of civic buildings among cities throughout central Mexico communicated another kind of message: the common participation of local kings and nobles in a regionally extensive noble class with an established canon of public architecture.

A third type of architectural communication was the use of an archaic city plan, copied from the ancient holy city of Tula, at a number of Aztec cities. This city plan materialized an ideology that legitimized power by reference to descent from the kings of Tula. The use of archaizing architecture and city plans for ideological purposes characterized a number of premodern societies, such as Roman Greece, where architecture was used to signal Roman continuity with the classical past of Athens (Alcock 2002). In the Aztec case, architectural communication theory not only helps explain the forms, layouts, and uses of city-state capitals, but these principles also illuminate the transformation of Tenochtitlan from a city-state capital like many others into an imperial capital (Smith 2008).

Space Syntax

Space syntax is a body of concepts and methods associated with Bill Hillier and associates at University College London (Hillier and Hanson 1984; Hillier and Vaughan 2007). The techniques of space syntax, including access graphs, depth measures, and other elements, have been employed by many archaeologists; indeed the studies are too numerous to review here (a comprehensive review of this literature is badly needed). There is a body of social-spatial theory associated with space syntax that posits a rather strict and deterministic relationship among buildings, movement, and social relations (Bafna 2003; Hillier 2008). Most archaeologists follow ethnologist Edmund Leach (1978) in asking the question, “Does space syntax really ‘Constitute the Social’?”, and in practice they either ignore or reject its theoretical foundations as too limited and unrealistic (e.g., Fisher 2009; Moore 1996b: 184–210; Smith 2003a: 242–254). As a result space syntax largely functions as a method, not a body of theory, within archaeological research.

Nevertheless, space syntax does have an implicit theoretical component for archaeologists, although less ambitious than claimed by Hillier and others. That theory focuses on the importance of movement within built environments and the significance of access (restricted vs. open) for social interaction. As pointed out by others, these concepts overlap with the fields of environment-behavior theory (Fisher 2009) and urban morphology (Pinho and Oliveira 2009). While it may be artificial to single out space syntax as an independent body of empirical urban theory, I do so because of the distinctiveness of its methods and its widespread use by archaeologists.

Jerry Moore’s first book contains a good example of archaeological space syntax analysis (Moore 1996b: 184–210). Unlike most archaeological applications, which use these techniques to describe complex architecture and then try to draw some conclusions, Moore uses access analysis to test a specific hypothesis about the layout of royal compounds in the Chimu city of Chan Chan. Previous scholars had suggested that distinctive U-shaped rooms were posts where bureaucrats controlled access to the numerous storage rooms within the compounds. Space syntax access graphs show, however, that these rooms “do not control access to storerooms” (Moore 1996b: 208).
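The logic of this kind of access analysis can be illustrated with a minimal sketch. The graph below is an invented toy compound, not data from Chan Chan, and the function names `depths` and `controls_access` are hypothetical labels chosen for this illustration. Spaces are nodes, doorways are edges, depth is the number of doorways crossed from the entrance, and a room is taken to "control" access to a set of storerooms only if every route to them passes through it.

```python
from collections import deque

# Hypothetical access graph (invented for illustration): each key is a space,
# each value lists the spaces reachable through a single doorway.
ACCESS = {
    "exterior": ["court"],
    "court": ["exterior", "u_room", "corridor"],
    "u_room": ["court"],
    "corridor": ["court", "store_1", "store_2"],
    "store_1": ["corridor"],
    "store_2": ["corridor"],
}


def depths(graph, start, blocked=None):
    """Breadth-first search: doorways crossed from `start` to each reachable
    space, optionally treating one space (`blocked`) as impassable."""
    if start == blocked:
        return {}
    depth = {start: 0}
    queue = deque([start])
    while queue:
        space = queue.popleft()
        for neighbour in graph[space]:
            if neighbour != blocked and neighbour not in depth:
                depth[neighbour] = depth[space] + 1
                queue.append(neighbour)
    return depth


def controls_access(graph, start, candidate, targets):
    """True only if closing off `candidate` makes every target unreachable
    from `start`, i.e. all routes to the targets pass through it."""
    reachable = depths(graph, start, blocked=candidate)
    return all(target not in reachable for target in targets)


stores = ["store_1", "store_2"]
print(depths(ACCESS, "exterior"))
print(controls_access(ACCESS, "exterior", "u_room", stores))
print(controls_access(ACCESS, "exterior", "corridor", stores))
```

In this toy layout the U-shaped room is a dead end off the court, so closing it leaves the storerooms reachable, whereas closing the corridor cuts them off. A real application would start from a mapped architectural plan and would need to consider blocked or undocumented doorways.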

Hillier and others have begun to broaden the space syntax approach to model urban form on a larger scale using street patterns (Hillier 2008; Hillier and Vaughan 2007). This approach may have archaeological applications for cities with clearly defined street patterns, although it is hard to see how it would apply to low-density cities in regions like Mesoamerica where streets were not important urban features.

Urban Morphology

“Urban morphology” is a distinctive body of research that began with highly descriptive studies of historical town plans and then expanded into a broader analytical approach to urban form. As a tool for historical study of urban form, the work of M.R.G. Conzen (1968) gained particular impetus through the development of “town plan analysis”, a technique for studying historic urban plans and their changes through time (Lilley 2000); this morphogenetic approach is sometimes called “the Conzenian tradition.” A central concept in this field is “townscape,” a description of the integrated physical, visual, and functional aspects of the urban-built environment (Conzen 1988). A townscape has three components: (1) the layout or ground plan, (2) the building fabric (construction materials and architectural style), and (3) the uses of buildings and open spaces (Whitehand 2001).

From its beginnings in the study of medieval town plans, urban morphology has grown into a more comprehensive approach to urban form. Whitehand (2001) identifies three areas of current research in British urban morphology: urban micromorphology (detailed studies of urban houselots and townscapes), studies of changes in town plans over time, and research on the relationship between decision-making and urban form. This approach continues to be an important area of research in Britain, particularly in medieval history and archaeology (e.g., Lilley et al. 2007). The field is integrated through the International Seminar on Urban Form (ISUF), a group that publishes the journal Urban Morphology and holds international conferences. In the past decades, the urban morphology approach has literally spread around the world, with planners, geographers, and urban historians pursuing these methods in many diverse areas (Conzen 2001; Pinho and Oliveira 2009; Whitehand and Gu 2006).

In classic urban morphology research on medieval towns, the analyst begins with detailed nineteenth-century plans of towns that have medieval origins. The next step is the identification of “plan units,” contiguous areas whose streets, plots, and houses share a common morphology (size, shape, orientation). The assumption is that each plan unit was laid out and built in a single epoch. Then relevant historical and archaeological data are considered in an attempt to date and contextualize the plan units, and the final step is construction of an integrated historical model of the dynamics of expansion of the town. Lilley (2000) provides the clearest exposition of these methods. In a different kind of study, Lilley et al. (2007) apply these principles of plan analysis to the “new towns” built by Edward the First in Wales between 1272 and 1307. Through careful mapping and morphological analysis these scholars are able to determine which towns were likely laid out by a known individual designer (Master William of Louth), and to explore the ways in which various agents (surveyors, masons, members of the royal household, and townspeople) negotiated the process of design and construction of the new royal towns.
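The plan-unit step can be sketched computationally, although in practice it rests on the analyst's judgment and on historical evidence rather than on an algorithm. The following is a toy illustration only: the plot records, the tolerance values, and the names `similar` and `plan_units` are all assumptions made for this example and do not come from Lilley or Conzen. It groups adjacent plots whose frontage width and orientation fall within a chosen tolerance, which is one simple way to operationalize "contiguous areas that share a common morphology."

```python
# Hypothetical plot records (invented): frontage width in metres, orientation
# in degrees, and the ids of adjacent plots.
PLOTS = {
    "a": {"width": 6.0, "orient": 10, "adj": ["b"]},
    "b": {"width": 6.5, "orient": 12, "adj": ["a", "c"]},
    "c": {"width": 12.0, "orient": 45, "adj": ["b", "d"]},
    "d": {"width": 11.5, "orient": 47, "adj": ["c"]},
}


def similar(p, q, width_tol=1.0, orient_tol=5):
    """Two plots count as morphologically similar if width and orientation
    differ by no more than the (assumed) tolerances. Orientation wrap-around
    at 360 degrees is ignored in this simplified sketch."""
    return (abs(p["width"] - q["width"]) <= width_tol
            and abs(p["orient"] - q["orient"]) <= orient_tol)


def plan_units(plots):
    """Flood-fill over the adjacency graph, joining neighbouring plots that
    are similar; each connected group is a candidate plan unit."""
    seen = set()
    units = []
    for start in plots:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        unit = []
        while stack:
            pid = stack.pop()
            unit.append(pid)
            for nb in plots[pid]["adj"]:
                if nb not in seen and similar(plots[pid], plots[nb]):
                    seen.add(nb)
                    stack.append(nb)
        units.append(sorted(unit))
    return units


print(plan_units(PLOTS))
```

Because plots are chained neighbour to neighbour, a gradual drift in width or orientation could merge plots that an analyst would separate; candidate units produced this way would still need to be checked against the historical and archaeological record, as the method described above requires.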

Urban morphology has great potential—both as a method and an associated body of empirical urban theory—for research on premodern urbanism. Careful attention to town plans and their changes through time is a notable feature of this approach. It would seem a natural partner for some of the other bodies of empirical theory discussed here, and the joint application of concepts and methods could be of great benefit for understanding ancient cities.

Reception Theory

I use the term “reception theory” to describe a range of approaches to ancient cities that share a concern with the ways that residents and visitors experienced the urban built environment through movement, including quotidian activities and special occasions such as public ceremonies. The notion of “reception” is loosely borrowed from literary studies, where it refers to the ways in which readers encounter texts (Eagleton 1996). So far only a few authors have used the term with respect to ancient built environments (Favro 1999; Holtorf 2001). This category is perhaps the most heterogeneous of my eight groups of empirical theory, ranging from highly empirical analyses of specific routes through well-documented cities (Favro 1996) to speculative phenomenological accounts of landscapes and monuments (Isbell and Vranich 2004; Tilley 1994). I will not deal further with the latter approach. Apart from its various empirical and conceptual deficiencies (Barrett and Ko 2009; Bintliff 2009; Fleming 2006), the landscape phenomenological approach cannot be considered empirical theory. Its practitioners apply high-level theory directly (and very subjectively) to the archaeological record, and their avoidance of middle-range methods and concepts may account for some of the difficulties pointed out by its critics (see Footnote 5).

Planner Kevin Lynch’s The Image of the City (Lynch 1960), a highly influential study of how people select routes to move through modern cities, is the point of origin for a strand of reception theory within planning and urban studies. Based on maps and interviews, Lynch concluded that people cognize urban environments using five form elements: paths, nodes, edges, landmarks, and districts. People develop mental models based on these features, and use them to select routes and to understand the layout of the city. Planners use this approach to study the “legibility” and “imageability” of modern cities.

Architectural historian Diane Favro draws on Lynch’s concepts and findings to develop a rich analysis of how people experienced the built environment of Rome under Augustus. The archaeological and historical records permit a detailed analysis of what people saw as they walked along the major streets of Rome. The visual elements of the built environment were well understood by Augustus and the designers and architects of imperial Rome. Favro is able to show how specific buildings and features (such as arches and fountains) were placed within the city so as to create visual impressions on pedestrians, which in turn produced outcomes in terms of ideology and practice.

Reception theory (as empirical theory) can also be pursued without the benefit of texts. In chapter 3 (“The Architecture of Monuments”) of Architecture and Power in the Ancient Andes, Jerry Moore (1996b) uses a reception approach to explore the notion that “there is a direct relationship between a monument’s design and its communicative potential, and thus its ability to serve as a marker of social cohesion” (p. 98). Moore draws on the work of Higuchi (1983), who presents “a clear methodology for transforming ideas about landscape into measurable properties of physical forms” (Moore 1996b: 98). Differing visual characteristics of monuments in the sites analyzed by Moore suggest variation in the size of the audience that was targeted for specific ceremonies and events. The study of isovists and viewsheds, conducted informally by Moore, has been expanded greatly in recent years with the development of GIS technology (Fisher 2009; Forte 2003; Lake and Woodman 2003), although this line of research has yet to make significant contributions to reception theory about ancient cities.

Generative Planning Theory

Most considerations of ancient urban planning have ignored housing to concentrate on civic architecture (e.g., Ashmore and Sabloff 2002; Kostof 1991; Smith 2007). Generative planning theory can redress this imbalance through its focus on houses and residences. Research and writing on urban design and planning were long dominated by a focus on top-down processes. Planners produced plans, which were carried out (or not) by officials of the city or state. Recently many planners (and other scholars) have come to appreciate and study bottom-up processes of urban growth and organization.

Several theoretical traditions converge in arguing that when people design and construct their own housing and neighborhoods, they can achieve outcomes that are more socially beneficial than can be achieved by the heavy hand of central planners. Since housing in many (perhaps most) ancient cities was most likely built by residents, not by central authorities (Smith 2010), this line of theorizing should be of great interest to archaeologists. I call this amorphous body of theory “generative planning theory”; generative is the label often given to the bottom-up processes involved, and the word planning emphasizes that these processes are not chaotic or random. They are planned, but at a household or neighborhood level, not at a central civic or state level.

Christopher Alexander is one of the most influential normative theorists in contemporary architecture (see discussion of normative theory below). His book, A Pattern Language (Alexander et al. 1977), is one of the most frequently cited books in the field of architecture (Mehaffy 2007). Alexander’s ideas provide some of the conceptual foundations for generative planning theory (Alexander 1987; Alexander et al. 1977). A Pattern Language discusses several hundred “patterns” which cover a wide range of phenomena, from house form to city form to behavioral practices:

We begin with that part of the language which defines a town or a community. These patterns can never be ‘designed’ or ‘built’ in one fell swoop—but patient piecemeal growth, designed in such a way that every individual act is always helping to create or generate these larger global patterns, will, slowly and surely, over the years, make a community that has these global patterns in it... We do not believe that these large patterns, which give so much structure to a town or of a neighborhood, can be created by centralized authority, or by laws, or by master plans. (Alexander et al. 1977: 3)

Other influential thinkers in the field of generative planning theory are Amos Rapoport (1988a, b, 1990a, b) and Paul Oliver (1997).

Some of the clearest discussions of this approach are found in the work of Besim Hakim (e.g., 1986, 2007), who contrasts central planning with what he calls “generative programs” in terms of their operation and their outcomes on the neighborhood level. Whereas central planning is carried out by static blueprints, generative programs are carried out by residents. They are “comprised of ethical/legal norms derived from the history and value system of the society” (Hakim 2007: 88); these programs are locally based and have legitimacy for residents. Hakim has analyzed in detail the historical development of generative programs in Islamic cities (Hakim 1986).

A number of scholars have produced similar analyses focused on the informal settlements or shantytowns that have grown up around many cities in the developing world. While not losing sight of the problems of poverty and material deprivation in many of these settlements, these authors point out the pride residents take in building their own homes, often over long periods as resources become available (Hardoy 1982; Turner 1991). Kellett and Napier (1995) broaden this discussion under the label of “vernacular theory” (see also Oliver 1997; Rapoport 1988b). Similar sentiments on the social advantages of generative processes over central planning have been expressed by non-traditional planners using concepts such as “non-plan” (Banham et al. 1969), the “libertarian suburb” (Barnett 1978), and “informal planning” (Briassoulis 1997).

I am not aware of archaeological applications of generative planning theory to ancient cities, although some starts have been made. Planner and architectural historian Jorge Hardoy (1982) argued that residential zones in most Precolumbian cities in the New World were informal settlements comparable to modern informal settlements. I extended his argument and made some suggestions for archaeological analysis (Smith 2010), but without a detailed case study. Alison Kohn (2010) conducted an ambitious ethnoarchaeological study of informal vernacular housing in Bolivia in order to generate insights for the archaeological analysis of ancient urban housing. Given the growing body of empirical theory on informal settlements, vernacular architecture, and generative processes, it is time for archaeologists to apply this approach to the study of residential zones in ancient cities.

Normative Urban Theory

Architects and planners use the term “normative theory” to describe theories that have an evaluative component, as in the phrase “good urban design.” Planners believe some cities are “better” than others in terms of livability, safety, sustainability, and other positive social values. Normative theory focuses on the achievement of such positive benefits through the design and construction of cities. This concept should be distinguished from customary usage in Americanist archaeology, where “normative” was a pejorative term used by Lewis Binford and other new archaeologists to criticize mental or idea-based cultural models (Lyman and O'Brien 2004). In order to promote cross-disciplinary understanding, I use the term here in its architectural sense.

Planners are concerned with improving cities, and much of planning theory is normative in nature (in the field of planning “normative theory” is often contrasted with “descriptive theory”). The vast majority of this literature (e.g., Taylor 1998) is so heavily focused on modern, western urbanism that it seems difficult to apply to ancient cities. Several thinkers in planning and architecture, however, have described approaches to normative theory sufficiently broad in historical and comparative terms as to be applicable to ancient cities. The most relevant of these writers for archaeology are Kevin Lynch and Amos Rapoport.

In A Theory of Good City Form Kevin Lynch (1981) discusses three normative theories. One of these—“the city as an organism”—is a notion that applies primarily to modern cities. A second model—“the city as a practical machine”—is also most relevant to modern cities, but it does have ancient applications in planned utilitarian settlements such as military camps or planned colonial cities. Lynch’s third normative model, “the theory of magical correspondences,” describes notions of good city form in a number of ancient urban traditions. Whereas modern planners are concerned with livability, safety, efficiency, and sustainability, Lynch suggests that ancient planners were more concerned with designing cities whose form and operation were in tune with the cosmos, and he proposes a series of form elements (e.g., axial procession ways, walled enclosures with gates) that ancient planners used to achieve this end (Lynch 1981: 73–81). Amos Rapoport (1993) published a similar treatment of form elements that promotes the notion of city form as a reflection of the cosmos, based partly on the work of Mircea Eliade.

Eliade’s (1959) model of universal cosmological symbolism for ancient cities has been employed by a number of archaeologists and historians of religion (Carrasco 1999; Wheatley 1971). Cosmic symbolism is an example of what Rapoport calls “high-level meaning,” a kind of symbolic representation that only exists within the context of a specific cultural and religious system (Rapoport 1990a). Whereas it may be straightforward for archaeologists to identify the form elements mentioned by Lynch and Rapoport (e.g., orthogonal layouts, formal procession routes), providing a convincing reconstruction of accompanying cosmic symbolism can be difficult or impossible in the absence of written texts (Flannery and Marcus 1993). I have published a case study suggesting that Mayanists have been too quick to posit cosmological symbolism for buildings and cities in the absence of written evidence (Smith 2003b, 2005). Indeed, Bruce Trigger has stated:

The desire to create cosmograms does not appear to have been as obvious or widespread in early civilizations as Eliade and his followers have maintained... His general ideas seem to have been applied too dogmatically and in some cases without sufficient local warrant to the physical layout of structures (Trigger 2003: 470).

Nevertheless, cosmic architectural symbolism was clearly important in a number of ancient urban traditions. The best documented case is ancient China, where Paul Wheatley (1971: 411–427) has shown the clear parallels between textual accounts of the cosmic symbolism of urban layout on the one hand, and archaeological and historical data on the other. The Emperor, known as the “Son of Heaven,” was the representation or embodiment of sacred power on earth. His authority was legitimized by locating his capital at a powerful and propitious place, employing a rectangular plan with key gates and procession avenues, and orienting the city to the cardinal directions. Oracles were consulted and divination rites were performed, following feng shui principles, to select the sites of capitals. The welfare of the kingdom was thought to depend on finding a favorable location and orientation for the capital.

In the Chinese case, the ancient normative principles of city layout are well documented in texts and images, and Wheatley’s act of matching these up with actual city layouts (see also Steinhardt 1990) is a good example of normative urban theory. Similar analyses have been done for ancient cities in India (Bafna 2000) and at Angkor (Dumarçay and Royère 2001). Some medieval European cities were designed in accordance with cosmic models (Lilley 2009), and a particularly well-documented modern example of this phenomenon is the Hindu city of Bhaktapur in Nepal (Gutschow 1993; Levy 1990). For archaeologists working in traditions without texts, however, it can be difficult if not impossible to infer the cosmic symbolism from urban form alone. Specific urban spatial configurations—such as orthogonal grid layouts—have been built over the ages in accordance with widely varying normative and social models (Grant 2001).

City Size Theory

This category includes a variety of perhaps disparate theoretical approaches that focus on explaining the sizes of cities, most commonly in relation to economic or social processes. Most of these approaches have been applied in some form by archaeologists. Economists (Krugman 1996) and economic historians (Bairoch 1989) have explained city size and urbanization rate (the percent of population living in cities and towns) in terms of economic growth, and this approach has been applied to early Mesopotamian cities by Algaze (2008). Central place theory explains city size and location on the basis of retail market exchange (Haggett et al. 1977), an approach that has attracted the attention of archaeologists on and off over the years (e.g., Inomata and Aoyama 1996). Larger scale economic and demographic processes are invoked to understand city size distributions on the level of nations or “city systems,” often studied with rank-size analyses (Berry 1964). This approach has seen numerous archaeological applications, and archaeologists have contributed methodological refinements (e.g., Drennan and Peterson 2004; Savage 1997).
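
Two of the quantities mentioned above, the urbanization rate and the rank-size relationship, are simple enough to sketch in code. The example below is a minimal illustration and not a reproduction of any cited method: the function names are hypothetical, the settlement sizes are invented, and the slope is an ordinary least-squares fit of log size against log rank. Under the classic rank-size rule (the settlement of rank r has 1/r the population of the largest), that slope is -1. The refinements proposed by the archaeologists cited above are not implemented here.

```python
import math


def urbanization_rate(urban_population, total_population):
    """Percent of the total population living in cities and towns."""
    return 100.0 * urban_population / total_population


def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank).

    Settlements are ranked from largest (rank 1) to smallest.
    A slope near -1 matches the classic rank-size rule; needs
    at least two settlements, all with size greater than zero.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Invented settlement populations that follow the rank-size rule exactly.
ideal_sizes = [1000.0 / rank for rank in range(1, 6)]
slope = rank_size_slope(ideal_sizes)

# Invented regional totals: 20,000 urban residents out of 200,000 people.
rate = urbanization_rate(20_000, 200_000)
```

With real survey data, the slope would be only a first descriptive summary; departures from -1 are usually what call for explanation in terms of the economic and demographic processes discussed above.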

The most comprehensive body of theory on city size that relates directly and clearly to ancient cities is Roland Fletcher’s model of settlement growth (Fletcher 1986; Fletcher 1995). Fletcher posits limits to settlement size and density based upon social stress arising from frequent social interactions. His limits are empirically derived from a sample of thousands of human settlements. As settlements grow, they can only cross size thresholds if there are innovations in communications technology and/or in architectural configurations that channel social interaction. Given the comprehensive nature of Fletcher’s data, the clear archaeological implications of his growth model (in terms of both settlement sizes and architectural features), and the importance of the topic to a number of disciplines from archaeology to modern urban studies, it is puzzling that more archaeologists have not employed this model (Manning 1999). Fletcher’s model is one of the premier examples of empirical urban theory, generated by an archaeologist and applicable to archaeological, historical, and modern settlement size data. Other research on size thresholds in human settlements and networks (e.g., Bodley 2003; Hill and Dunbar 2003; Kosse 1990) is also very relevant archaeologically and can be related to one or more empirical urban models.

Several compendia of city size data reaching back to the earliest archaeologically documented cities have been published (Chandler 1987; Modelski 2003). These data have been analyzed by a variety of quantitatively oriented scholars, including anthropological modelers (White et al. 2008) and world-systems analysts (Chase-Dunn 2007; Chase-Dunn and Manning 2002). Archaeologists, however, may want to pay closer attention to this use of our data by other scholars: do Chandler and Modelski have the archaeological data right, or are there problems with their tables? And perhaps those of us who work in the New World should think about assembling city size data comparable to the Old World information contained in these books.

Discussion

The empirical urban theories reviewed here—examples of middle-range theory in Merton’s terms—present a more productive avenue for the analysis of archaeological data from ancient cities than does grand social theory. This suggestion is similar to the view of sociologists Tim May and Beth Perry, who comment on theoretical trends in urban sociology:

What occurs here in the unfolding of urban sociology is a movement away from the difficult but also productive relations between theory and data. The danger is that social theory becomes so far removed from localities that it does not appear to have implications for informing context-sensitive research that connects everyday experiences to public and social issues (May and Perry 2005: 345).

The key feature of most of the empirical urban theories reviewed above is that they link the actions of people in cities to the materiality of the urban built environment. These theories deal with both the impact of humans on the built environment (processes of design, construction, modification, and destruction) and the impact of the urban built environment on the actions, social organization, and mental states of people.

The eight bodies of theory reviewed above do not exhaust the potentially useful empirical urban theory that exists in disciplines outside archaeology, but they do seem to represent some of the more relevant and directly applicable bodies of such theory. Several other relevant domains can be suggested for which the connections to archaeological data and ancient urban dynamics do not appear to be as strong; these approaches will probably require more exploration and modification to be useful archaeologically. One such area is ecological theory as developed by the Chicago School of urban sociology in the mid-twentieth century. Although the explanatory models of this approach have long been superseded (Gottdiener and Hutchison 2006), the focus on spatial mapping of social variables and the attention to neighborhood characteristics are attractive features for archaeology (Sampson 2008). A related domain is segregation theory, which focuses on explaining patterns of racial segregation in modern cities (Briggs 2005; Marcuse and van Kempen 2002; Sampson 2009). An empirical and conceptual expansion of this topic to focus on variation in the spatial clustering of ethnicity and class in cities across time and space has potential for archaeological application (York et al. 2010).

Anthropological political economy is an active area of archaeological research (e.g., Earle 2002; Feinman and Nicholas 2004; Sanders and Santley 1983), and the linking of current empirical craft production theory (Costin 1991; Costin 2001), for example, to urban spatial dynamics would seem to hold promise. Institutional economics (North 1984; North 1991) is another area in which the spatial expressions of economic and political processes in urban settings have yet to be explored in detail. Collective action theory (Levi 1988; Ostrom 2007) is a potentially productive body of empirical theory that is just beginning to receive attention from archaeologists (Blanton and Fargher 2008; Smith 2008), and its application to urban dynamics has potential.

I have not said much in this paper about high-level social theory. Although many archaeologists are enamored of Giddens, Bourdieu, and other social theorists and feel the need to invoke their work frequently, it is not necessary to engage this level of theory in order to develop and use empirical theory. Empirical theory is not necessarily derived from specific high-level theories, nor is it dependent upon a particular theoretical orientation. Merton observed that middle-range theories are in many respects independent of high-level theory:

They are frequently consistent with a variety of so-called systems of sociological theory... comprehensive sociological theories are sufficiently loose-knit, internally diversified, and mutually overlapping that a given theory of the middle range, which has a measure of empirical confirmation, can often be subsumed under comprehensive theories which are themselves discrepant in certain respects (Merton 1968: 43).

Many of the empirical urban theories reviewed above are consistent with a number of bodies of high-level theory. The practice theory of Bourdieu and Giddens is commonly invoked by archaeologists working on urban built environments (Ashmore 2002; Fisher 2009; Yates 1989). Henri Lefebvre’s concept of the “production of space,” another high-level social theory, has been applied to ancient urban contexts by Adam T. Smith (2003a). Smith recognizes that highly philosophical concepts are difficult to apply directly to archaeological data, and he employs space syntax methods and concepts from environment-behavior theory and architectural communication theory in his analysis of political landscapes.

Although few archaeological applications of empirical urban theory make explicit use of rational choice theory, this approach is consistent with many of the empirical theories presented above. Its focus on groups of people with contrasting goals and resources (e.g., kings, nobles, and commoners), acting within political and social constraints (Blanton and Fargher 2008; Boudon 2009; Kiser and Hechter 1998; Levi 1997; MacDonald 2003; Ostrom 2007), can help make sense out of ancient cities and urban dynamics (e.g., Smith 2008: chapter 8). Another potentially relevant branch of theory is the “social-ecological systems” approach, a synthesis of ecological resilience theory and traditional social science theory (Anderies 2006; Janssen et al. 2003; Ostrom 2009; Young et al. 2006). Scholars have only started to explore the application of this approach to modern cities and the built environment (Alessa et al. 2009; Moffatt and Kohler 2008), and this work may hold promise for archaeological research.

Work on the level of empirical theory does not commit one to a particular brand of grand theory, although explicit ties between middle-range theories and high-level theory can enrich our understanding of ancient social dynamics. Urban theory outside of archaeology is notable for the broader spectrum of both high-level and middle-level theories brought to bear on cities and urbanism (e.g., Cuthbert 2006; Low 1999; Parker 2004; Rapoport 1990a, 1990b; Rapoport 2000; Short 2006). This material can be a source of concepts, theories, and methods as archaeologists develop our toolkit for understanding the forms and social dynamics of ancient cities. In this paper, I have discussed eight bodies of empirical urban theory that are particularly salient in this respect. I hope this exposition convinces archaeologists of the usefulness of theory and methods from related disciplines, and of the promise for continuing archaeological elaboration of concepts and methods at the level of empirical urban theory and other (Mertonian) middle-range theories.