1 Introduction

CIDOC CRM and CIDOC CRM based models such as CRMarchaeo have recently been used to model archaeological work. Archaeologists excavate, observe patterns, collect finds, keep notes, and produce records (such as handwritten excavation notebooks, filled-in context sheets, photographs, sketches, and drawings). CIDOC CRM and CRMarchaeo aim to aid the digital documentation of these records. Can CIDOC CRM and CIDOC CRM based models sufficiently represent archaeological records? To what extent are they able to provide a framework to assist archaeological work, documentation and interpretation? We address these issues by working towards an automated CRM-based system to assist archaeologists in modeling excavation works and research. Real-time digital documentation of data from excavations, integrated with other semantically described data, will help archaeologists to evaluate and interpret their work results more effectively.

To this end, we have represented archaeological context sheets (first page of the two-page context sheet, see Fig. 4) from recent archaeological excavation works at Fuwairit in Qatar (2016–2018), part of the Origins of Doha and Qatar Project (ODQ), by successfully employing classes and properties of CIDOC CRM, CRMarchaeo and CRMsci models.

2 Related Work

In the last decade, CIDOC CRM-related research and work has been done to integrate archaeological data, given the need of documenting archaeological science [20]. The ARIADNE project [17] and its continuation, the ARIADNEplus project, have systematically attempted to integrate different European archaeological datasets by using CIDOC CRM and by developing the CRMarchaeo and CRMsci extensions. Other attempts involved the extensions CRMsci and CRMdig to document scientific archaeological experiments and results [20] or just the CIDOC CRM (without any of its extensions) in an effort to describe archaeological objects but without an evaluation of this approach [6].

English Heritage has also developed a CIDOC CRM extension, the so-called CRM-HE, to model archaeological concepts and their properties. To the same end, the STAR project (Semantic Technologies for Archaeology Resources) [2] investigated the use of this extension for archaeological data integration. Additionally, the project proposed a semi-automatic tool for mapping archaeological datasets to CRM-HE [3] as well as an approach for creating archaeological data from semantic search over grey literature [23].

In terms of describing archaeological excavation records, there is an approach similar to the one presented in this paper [12]. This approach focused on CRMarchaeo classes and properties to model data derived from the daily archaeological excavation notebooks. The data in these notebooks concerned the timespan of the works in an archaeological trench, the definition and establishment of elevation points, the depths of archaeological strata, the trench’s stratigraphy, the archaeological findings from the works in the trench, and the publication of the results of the excavation and the archaeological work.

This work lies within the overall theme of integrating various types of cultural metadata and encoding them in different metadata schemas using CIDOC CRM. Approaches relate to mapping the semantics of archival description expressed through the Encoded Archival Description (EAD) metadata schema to CIDOC CRM [4], semantic mappings of cultural heritage metadata expressed through the VRA Core 4.0 schema to CIDOC CRM [9, 10], and mapping of the semantics of Dublin Core (DC) metadata to CIDOC CRM [14]. These mappings consider the CIDOC CRM as the most appropriate conceptual model for interrelations and mappings between different heterogeneous sources [11] in the information science fields.

3 Preliminaries

3.1 Archaeology, Excavations, Strata, Contexts

Archaeology is the study of past material remains, aiming to comprehend past human cultures. From fossils dating millions of years ago to last decade’s fizzy drink cans, archaeologists try to discover evidence of past phenomena, cultures and societies. Archaeology lies within humanities and social sciences, but it can also involve other scientific disciplines, depending on the nature of discoveries [21]. Archaeological work is a process of continuous discovery and recording. Archaeological finds are preserved and stored for interpretation, study, and exhibitions. In terms of methodology, archaeologists work in:

  1. recording visible remains of past human activity (e.g. buildings and ruins),

  2. surveying the surface of an area to spot, report and collect artifacts (i.e. human-made objects, e.g. fragments of pottery, glass and metal objects) and ecofacts (i.e. natural remains deposited as a result of human activity, e.g. animal bones, seeds etc.), and

  3. systematically excavating the ground to discover artifacts and ecofacts.

In archaeological excavations, archaeologists remove layers of soil (strata) within well-defined and oriented trenches. As soil is removed, distinct concentrations of soil and artifacts are revealed. These are called contexts and are reported in the diaries of the archaeologists or by filling in ‘context sheets’. Archaeological diaries and/or context sheets form the basis of documenting the excavation process and comprise the starting point for archaeological analysis and interpretation.

3.2 CIDOC CRM and CRMarchaeo

The CIDOC Conceptual Reference Model (CIDOC CRM) is a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. CIDOC CRM intends to provide a model of the intellectual structure of cultural documentation in logical terms.

Several extensions of CIDOC CRM suitable for documenting various kinds of cultural information and activities have been proposed so far. CRMarchaeo is an extension of CIDOC CRM created to support the archaeological excavation process and all the various entities and activities related to it. CRMsci (Scientific Observation Model) is an extension of CIDOC CRM intended to be used as a global schema for integrating metadata about scientific observations, measurements and processed data in descriptive and empirical sciences such as biodiversity, geology, geography, archaeology, cultural heritage conservation and others in research IT environments and research data libraries.

This work applies CIDOC CRM, CRMarchaeo and CRMsci to document archaeological data and reports, which will offer valuable experience concerning the documentation needs of these data. We test our approach on archaeological data from Qatar. This research will, in turn, influence the process of further developing and refining these models. This work is based on CIDOC CRM version 6.2.7 (October 2019), CRMarchaeo version 1.5.0 (February 2020), and CRMsci version 1.2.8 (February 2020).

4 The Origins of Doha and Qatar Project, and the Archaeological Works at Fuwairit

The Origins of Doha and Qatar Project (ODQ) started in 2012. It aims to investigate the history and archaeology of Doha, the capital of Qatar, and the other historic towns of Qatar, as well as the lives and experiences of their inhabitants. ODQ was run by University College London in Qatar (UCL Qatar) in collaboration with Qatar Museums (QM), funded by the Qatar Foundation through the Qatar National Research Fund (QNRF), under grants NPRP5-421-6-010 and NPRP8-1655-6-064. Given the rapid development of Doha in the last few decades, which transformed the city from a pearl fishing town at the beginning of the 20th century [5] to a vibrant modern capital city thanks to oil revenues since the 1950s [1, 7, 8], ODQ employed a multidisciplinary methodology. This included recording of historical buildings, excavations, recording oral histories of local people, GIS analysis for pre-oil and early oil Doha [16, 18, 19], and archival research and study of historical documents on Doha’s founding and growth. Preliminary results have been publicly presented in Qatar and internationally by the project leaders. The project has also produced educational material for schools in Qatar.

From 2016 until 2018, ODQ expanded its works to Fuwairit, about 90 km north of Doha, with recordings of historical buildings, excavations and surface surveys. The area consists of a historic village with buildings of historical architecture, rock art and inscriptions, and the archaeological site itself (the remains of a pearl-fishing town of the 18th to early 20th century AD). Works included mapping/surveying, excavations, recording of historical buildings, archaeological surface survey in both Fuwairit and the neighboring Zarqa, and pottery analysis [15]. For the purposes of this paper, we used context sheets from the archaeological excavation works in Fuwairit during the first season (2016), specifically from Trench 1. Figure 1 shows the representation in CIDOC CRM of the overall structure of the Origins of Doha Project.

Fig. 1. Describing the Origins of Doha Project in CIDOC CRM.

5 Describing the Structure of a Stratigraphic Matrix

5.1 Stratigraphy in Archaeology and the Stratigraphic Matrix

Stratigraphy is the process by which layers (strata) of soil and debris are laid on top of one another over time. Archaeologists and geologists are particularly interested in the stratigraphy of an area, as strata determine sequences of human-related or geological events. As a rule, when a stratum lies above another, the lower one was deposited first. Let’s think of earth strata as layers in a chocolate cake. To make a cake, first we put the sponge base, then a chocolate cream layer, then another layer of sponge cake, then one more layer of chocolate cream, then the chocolate frosting, and last (but not least) a cherry on top. This is a sequence of cake-making events with the base being the earliest and the cherry being the latest event in the process. Archaeologists prefer to eat their cakes from top to bottom, from the cherry to the base! First, they define the contour of a specific space to excavate, which is usually a square or rectangular space of x metres by x metres. The excavation space is called an archaeological or excavation trench. Then they start to carefully and meticulously remove the top layer (stratum) of the trench, and they keep on excavating within this area stratum by stratum. The content of each stratum in the trench may include evidence of human activity, such as fragments of clay, glass and metal objects, roof tiles, bricks, fossils, remains of a fire, animal bones etc. These objects help towards dating the strata and interpreting the past events that formed them. When the excavation of a trench is finished, the sequence of excavated deposits and features can be arranged in a stratigraphic matrix according to their chronological relationship to each other, i.e. whether the events that created them occurred before or after each other. This matrix is also known as a Harris Matrix, after the book on archaeological stratigraphy by E. C. Harris [13].

Usually, earth strata are not as straightforward as chocolate cake strata. Archaeological strata may contain formations such as post holes, pits, walls, and burrows, which disturb natural layers but often indicate human activity and are the results of human behaviour. Archaeologists number each stratum and each feature (e.g. a built structure or pit cut) on the stratigraphic matrix, and they try to interpret past events by correlating strata and the objects found within them. Each stratum and each feature is called a context. It is important to note that contexts do not always have direct stratigraphic relationships with others even if they are close to each other or likely to be contemporary. For example, two deposits which have built up on either side of a wall have no direct stratigraphic link, though they might both have a relationship with the same context below (e.g. the wall, which is stratigraphically below both deposits). In such cases the matrix branches. For every context, archaeologists fill in a context sheet, described below.
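The ordering idea behind a stratigraphic matrix can be illustrated with a small sketch. It assumes that the matrix is treated as a directed acyclic graph of ‘above’ relations and uses the relations of Trench 1 mentioned in Sect. 6.2 (S.U.2 lies below S.U.1 and above S.U.3 and S.U.6). The function name `excavation_order` is hypothetical, and the code is an illustration rather than part of the proposed model.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (upper, lower) pairs: the first context lies above the second one.
ABOVE = [
    ("S.U.1", "S.U.2"),
    ("S.U.2", "S.U.3"),
    ("S.U.2", "S.U.6"),
]


def excavation_order(above_pairs):
    """Order contexts so that each context precedes all contexts below it."""
    sorter = TopologicalSorter()
    for upper, lower in above_pairs:
        # The upper context is a predecessor of the lower one:
        # it is removed first during excavation.
        sorter.add(lower, upper)
    return list(sorter.static_order())


order = excavation_order(ABOVE)
# Deposition happened in the opposite direction: lowest contexts first.
deposition_order = list(reversed(order))
print(order)
```

S.U.3 and S.U.6 have no relation to each other, so their relative position in the result carries no stratigraphic meaning; this corresponds to a branch in the matrix.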

5.2 Describing a Stratigraphic Matrix in CRMarchaeo

In CRMarchaeo, each context on the stratigraphic matrix is a member of the class A8 Stratigraphic Unit. Stratigraphic units are related via the property AP11 has physical relation, further refined by the property of property AP11.1 has type. The type can be ‘above’, ‘below’, ‘within’, ‘next to’ or other, depending on the relation of a context with another context in the stratigraphic matrix. Figure 2 shows a fragment of the stratigraphic matrix of Trench 1, while Fig. 3 shows the representation of a part of this stratigraphic matrix in CRMarchaeo.
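As a minimal sketch of this representation, each AP11 statement can be held as a record carrying its AP11.1 type. The class `PhysicalRelation` and the helper `related_units` are hypothetical names introduced for illustration; the three relations are those of S.U.2 given in Sect. 6.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalRelation:
    """One 'AP11 has physical relation' statement between two A8 Stratigraphic Units."""
    source: str         # the unit the statement is about
    target: str         # the related unit
    relation_type: str  # value of 'AP11.1 has type', e.g. 'above' or 'below'


RELATIONS = [
    PhysicalRelation("S.U.2", "S.U.1", "below"),
    PhysicalRelation("S.U.2", "S.U.3", "above"),
    PhysicalRelation("S.U.2", "S.U.6", "above"),
]


def related_units(unit, relation_type, relations=RELATIONS):
    """Return the units that `unit` is related to with the given type."""
    return sorted(
        r.target
        for r in relations
        if r.source == unit and r.relation_type == relation_type
    )


print(related_units("S.U.2", "above"))
print(related_units("S.U.2", "below"))
```

Keeping the type on the statement itself, rather than on either unit, mirrors the role of AP11.1 as a property of the property AP11.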

Fig. 2. A fragment of the stratigraphic matrix of Trench 1.

Fig. 3. Representation of the fragment of the stratigraphic matrix appearing in Fig. 2.

6 Describing the Content of a Context Sheet

6.1 The Context Sheet

Archaeologists working for the Origins of Doha Project, and therefore at Fuwairit, have used context sheets to record their excavation work in the archaeological trenches. The context sheet is the report describing each context unearthed and is therefore critical for archaeological research and interpretation. Every context sheet offers:

  • Reference information (site codes, trench and context numbers, relation to other contexts, date, names of archaeologists recording, related photo and drawing numbers).

  • Information on the context’s soil deposit and its characteristics.

  • Information about finds in the context.

  • Space for archaeological interpretation.

  • Space for recording levels and an accompanying sketch (back sheet).

Figure 4 shows the front page of a context sheet from the excavation of Trench 1.

Fig. 4. A context sheet from the excavation of Trench 1.

6.2 Modelling a Context Sheet in CRMarchaeo

A context sheet is an instance of the class E31 Document. A context sheet documents (P70 documents) an instance of the class A1 Excavation Process Unit (in our example this instance is Excavation of Stratigraphic Unit 2). The relation between the context sheet and the excavation of the stratigraphic unit that it documents is expressed through the path (see Fig. 5, in which the CRM representation of most of the fields of the context sheet appearing in Fig. 4 is depicted):

E31 Document \(\rightarrow \) P70 documents \(\rightarrow \) A1 Excavation Process Unit

Reference Information: The field Site Code contains a code (an instance of the class E42 Identifier) which identifies the project (ODQ in our case). This identifier is related to the Origins of Doha Project (an instance of the class E7 Activity, see Fig. 1) through a path of the form:

E7 Activity \(\rightarrow \) P48 has preferred identifier \(\rightarrow \) E42 Identifier

Concerning the values of the fields Trench and Context Number, we observe that the trench appears as an instance of the class A9 Excavation (see Fig. 1) while Context Number appears as an instance (in our example Excavation of Stratigraphic Unit 2) of A1 Excavation Process Unit (see Fig. 5). These two instances should be related with the property P9 consists of through a path:

A9 Excavation \(\rightarrow \) P9 consists of \(\rightarrow \) A1 Excavation Process Unit
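The three paths above can be sketched as plain subject, property, object triples. This is an illustration only: the instance label `Context Sheet 2` is a hypothetical name for the context sheet of the example, and `follow` and `class_path` are helper names introduced here, not part of CIDOC CRM or CRMarchaeo.

```python
# Class membership of each instance used in the example.
CLASS_OF = {
    "Origins of Doha Project": "E7 Activity",
    "ODQ": "E42 Identifier",
    "Trench 1": "A9 Excavation",
    "Excavation of Stratigraphic Unit 2": "A1 Excavation Process Unit",
    "Context Sheet 2": "E31 Document",  # hypothetical label
}

# Instance-level statements as (subject, property, object).
TRIPLES = {
    ("Origins of Doha Project", "P48 has preferred identifier", "ODQ"),
    ("Trench 1", "P9 consists of", "Excavation of Stratigraphic Unit 2"),
    ("Context Sheet 2", "P70 documents", "Excavation of Stratigraphic Unit 2"),
}


def follow(start, *properties):
    """Follow a chain of properties from `start`; return the instances reached."""
    current = {start}
    for prop in properties:
        current = {o for (s, p, o) in TRIPLES if s in current and p == prop}
    return current


def class_path(subject, prop, obj):
    """Render the class-level path that an instance-level triple instantiates."""
    return f"{CLASS_OF[subject]} -> {prop} -> {CLASS_OF[obj]}"


for triple in sorted(TRIPLES):
    print(class_path(*triple))
```

For instance, `follow("Trench 1", "P9 consists of")` reaches the excavation of the stratigraphic unit that the context sheet documents.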

Fig. 5. Representation of the contents of part of the context sheet appearing in Fig. 4.

Information on the Context’s Soil Deposit and its Characteristics: There are several fields of the context sheet which are represented in CRM by directly connecting the instance of A1 Excavation Process Unit with the instance of other CRM classes through appropriate properties.

Concerning the items 1) Colour, 2) Compaction and 3) Composition, we observed that Colour and Compaction can be seen as properties of the material in Composition. These items are represented as follows: the value in Composition can be regarded as an instance (silty sand in our example) of the CRMsci class S11 Amount of Matter, which consists of (P45 consists of) an instance (sand in our example) of the class E57 Material. The values of the properties of this material are instances of the CIDOC CRM class E26 Physical Feature. Each feature is related to the material with the property P56 bears feature. An instance of the class E55 Type is also connected through the property P2 has type to each instance of E26 Physical Feature to denote the type of the feature (compaction or colour in our case).
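The representation of Colour, Compaction and Composition can be sketched as a small triple builder. The function `describe_deposit` is a hypothetical helper, and the colour and compaction values are placeholders, since the recorded values of the example are not reproduced in the text; only ‘silty sand’ and ‘sand’ are taken from the example.

```python
def describe_deposit(composition, material, features):
    """Build triples for one deposit.

    `features` maps a feature type ('colour', 'compaction') to its recorded value.
    """
    triples = [
        (composition, "is instance of", "S11 Amount of Matter"),
        (material, "is instance of", "E57 Material"),
        (composition, "P45 consists of", material),
    ]
    for feature_type, value in features.items():
        triples += [
            (value, "is instance of", "E26 Physical Feature"),
            (material, "P56 bears feature", value),
            (value, "P2 has type", feature_type),
            (feature_type, "is instance of", "E55 Type"),
        ]
    return triples


deposit_triples = describe_deposit(
    composition="silty sand",
    material="sand",
    features={
        "colour": "<colour value>",          # placeholder, not from the source
        "compaction": "<compaction value>",  # placeholder, not from the source
    },
)
for triple in deposit_triples:
    print(triple)
```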

The field Method of Excavation (item 6) is represented through the CRM paths of the form:

A1 Excavation Process Unit \(\rightarrow \) P16 used specific object \(\rightarrow \) E22 Human-Made Object

which relate the instance Excavation of Stratigraphic Unit 2 with the tools used in this excavation (i.e. the trowel and the mattock). These tools are instances of the class E22 Human-Made Object.

The Deposit Type field of the context sheet takes one of the values listed on the right side of the context sheet under the title Deposit Type. In our model this is represented by creating an instance of the class E55 Type with the selected value (collapse in our example) and connecting this instance to the corresponding instance of A2 Stratigraphic Volume Unit through the path:

A2 Stratigraphic Volume Unit \(\rightarrow \) P2 has type \(\rightarrow \) E55 Type

Reference Information (Initials and Date): This specific instance of A1 Excavation Process Unit (in our example Excavation of Stratigraphic Unit 2) was performed at a specific time, represented as an instance of E52 Time-span, and carried out by an instance of E21 Person. This information is modeled in CRM by the following paths:

A1 Excavation Process Unit \(\rightarrow \) P4 has time-span \(\rightarrow \) E52 Time-span

A1 Excavation Process Unit \(\rightarrow \) P14 carried out by \(\rightarrow \) E21 Person

This information is recorded in the Initials and Date field of the context sheet.

Information on the Sequence of Context with Relation to Other Contexts: The information on the sequence of contexts is depicted in the CIDOC CRM representation of the stratigraphic matrix (see Fig. 3). In this figure we see that the context S.U.2 (i.e. the context described by the context sheet of our example) is below the context S.U.1 and above the contexts S.U.3 and S.U.6. Notice that the instance S.U.2 of the class A8 Stratigraphic Unit coincides in both figures (Fig. 3 and Fig. 5).

Information on Finds in the Context: The finds in the context can be represented as instances of the CRMsci class S10 Material Substantial. Each instance of this class is then related to the deposit of the stratigraphic unit in which it is contained through a path of the form:

A2 Stratigraphic Volume Unit \(\rightarrow \) AP15 is or contains remains of \(\rightarrow \) S10 Material Substantial

Reference Information (photographs, drawings, context volume): Each photograph taken or drawing made during the excavation process is an instance of the class E36 Visual Item. The photograph or drawing is related to the corresponding instance of the CRMarchaeo class A8 Stratigraphic Unit through the property P138 represents. To distinguish between photographs and drawings, we relate the corresponding instance of E36 Visual Item to an appropriate instance of E55 Type (i.e. an instance whose value is either photo or drawing).

Space for Archaeological Interpretation: To represent the content of the context sheet field Description, Comments, Preliminary Interpretation as well as the field Post Excavation Interpretation we use a set of paths of the form:

A1 Excavation Process Unit \(\rightarrow \) P140i was attributed by \(\rightarrow \) S5 Inference Making \(\rightarrow \) P2 has type \(\rightarrow \) E55 Type

where a specific interpretation is encoded as an instance of S5 Inference Making, while the corresponding instance of the class E55 Type (which may be one of the values ‘Description’, ‘Comment’, ‘Preliminary Interpretation’, ‘Post-excavation Interpretation:Local Stratigraphic Phase’, ‘Post-excavation Interpretation:Pot Phase’) describes the type of this interpretation.
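A minimal sketch of this path, assuming the five type values listed above form a closed vocabulary, is given below. The function `attribute_interpretation` and the `Inference:` label prefix are hypothetical, and the interpretation text is a placeholder rather than the content of the example sheet.

```python
# The E55 Type values named in the text for the interpretation fields.
INTERPRETATION_TYPES = {
    "Description",
    "Comment",
    "Preliminary Interpretation",
    "Post-excavation Interpretation:Local Stratigraphic Phase",
    "Post-excavation Interpretation:Pot Phase",
}


def attribute_interpretation(process_unit, text, interpretation_type):
    """Build the triples A1 -> P140i -> S5 Inference Making -> P2 -> E55 Type."""
    if interpretation_type not in INTERPRETATION_TYPES:
        raise ValueError(f"unknown interpretation type: {interpretation_type!r}")
    inference = f"Inference: {text}"  # label for the S5 Inference Making instance
    return [
        (process_unit, "P140i was attributed by", inference),
        (inference, "P2 has type", interpretation_type),
    ]


interpretation_triples = attribute_interpretation(
    "Excavation of Stratigraphic Unit 2",
    "<text of the Preliminary Interpretation field>",  # placeholder
    "Preliminary Interpretation",
)
for triple in interpretation_triples:
    print(triple)
```

Rejecting values outside the vocabulary is a design choice of this sketch; the model itself only requires that the type is an instance of E55 Type.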

The field Context Same As relates the current context (an instance of A8 Stratigraphic Unit) to another context (i.e. another instance of A8 Stratigraphic Unit) which has the same features as the current context. This relation is expressed with the following path:

A8 Stratigraphic Unit \(\rightarrow \) P130 shows features of \(\rightarrow \) A8 Stratigraphic Unit

Such paths can be added to Fig. 3.

7 Conclusions and Future Work

This work has used CIDOC CRM and its extensions CRMarchaeo and CRMsci to represent archaeological work and assist archaeologists in documenting and managing archaeological and cultural heritage information. It also adds to the theoretical discussion on common grounds among humanities, computing, and information studies. We put emphasis on representing the contents of the first page of the two-page context sheets used by archaeologists in their systematic excavation works on archaeological trenches. As future work, we aim to extend the proposed model with the CRMba [22] classes and properties, to allow adding representations of architectural remains and their relations. We will also use CRMgeo to describe trench recordings of levels (the second page of the two-page context sheet) to complete the context sheet description. Another next step is to design an automated system for documenting excavation works. This will allow archaeologists to document their work in the field (archaeological contexts, findings, interpretation) in real time, to make the most of the system’s data entry and information searching facilities, and to explore the reasoning capabilities of the relevant ontologies.