1 Introduction

1.1 Some history

Legal computing began in the 1960s, at a time when it shared much in part with humanities computing, in that textual corpora have to be searched with information retrieval tools (Choueka et al. 1971, 1972, 1980; Choueka 1989), and in part with bibliometrics (because of the need to trace relevant literature, hence citations). That was a time when jurimetrics became visible (it is still an active field, with a focus on quantitative technical problems: see, e.g., Nance and Morris 2002; Combrink-Kuiters et al. 2000; Finkelstein and Levin 2003). Not by chance, it was also a time when Scandinavian scholarship in law aimed at developing metrics for law, e.g. the Bolding–Ekelöf degrees of evidential strength, evolving through Bolding (1960), Ekelöf (1964), then (with the involvement of AI) Åqvist (1992), and Shimony and Nissan (2001; Nissan 2012: Sec. 2.6).

It must be said that the actual contribution of computing to legal practice initially was in word processing, something that came later to be seen as rather low-tech. In the 1970s, research into an area which came to be called ‘AI & Law’ began to thrive, but its original focus was on modal logics—an area of non-classical logic: see, e.g., Hamkins and Löwe (2008), on a modal logic of forcing, and Martino (1997) on a modal logic of political action—and in particular on deontic logic: the logic of what is permissible or forbidden (e.g. Åqvist 1984, 1986; Horty 1993; Nute 1996; Goble 1999). Subsequently, research into formalisms of argumentation boomed, with application in the legal domain thriving within the walls of academia. It took much longer (as we are going to see) for the area of legal evidence to emerge within ‘AI & Law’.

1.2 Professional categories, tasks, their needs and some tools

Legal professionals and the police (whose own work often leads downstream to criminal trials) have considerably different professional cultures. Within those categories, judges, lawyers and prosecutors, as well as arbitrators or mediators in alternative dispute resolution, have their respective tasks and requirements, which stand to benefit from tailored computer applications. As for the police, investigation and crime prevention have distinct needs. Some tools are operational; others are just prototypes. Some research is still blue-sky, but with clear potential, such as tools resorting to mathematical formalisms in order to reason on hypotheses in a space of crime scenarios.

Let us illustrate how varied the range of applications is. Admittedly, the following are non-AI smart tools. CACTUS is a piece of software, namely a simulation system based on a software multiagent architecture (i.e. a system delegating (sub)tasks to components, or agents, specialized by domain of expertise), for training police officers in managing public order events, while communicating as they would in a real situation (Hartley and Varley 2001). A multiagent system is such that intelligent behaviour is coordinated among a number of separate intelligent agents, these being autonomous software modules (sometimes embodied in robots); they are called autonomous agents (Nissan 2012: Sec. 6.1.6).

ExpertCop is a software tool (a geosimulator, combining simulation and a geographical information system) for training police officers in allocating police presence in given urban environments, for the purpose of preventing crime (Furtado and Vasconcelos 2007). Geographical tools may also assist members of the public: a smartphone app, Protobadi, aiming at reducing street harassment of women in Bangladesh, sounds an alarm, sends text messages to the woman’s emergency contacts, and ‘incident data from all users are collated to create a heat map showing the areas where harassment is at its worst’ (Marks 2014).

In court, lawyers seek to persuade the adjudicator. By contrast, their attitude towards the other party is conflictual, eristic, and they do not expect to persuade the other party while in court. If, however, a settlement out of court is sought, then a solution for the conflict of interests is sought by negotiation. Tools assisting in legal negotiation, possibly Web-based (Zeleznikow 2002), may be used by lawyers or by members of the public (e.g. spouses seeking divorce, using the expert system Split Up in Australia; Zeleznikow and Stranieri 1995, 1998) to avoid having to use lawyers, let alone going to court. Importantly, this is a tool for Australian law. Arguably, the same architecture could be adapted to the same task in family law in other jurisdictions, but clearly the Australian tool, unmodified, could not usefully be applied to solving the problem at hand in conformity with the law of a different jurisdiction.

Organizational aspects of police intelligence and the handling of suspects involve different kinds of equipment (Nissan 2012: Sec. 4.5). Concerning lineups of suspects (identity parades), a computerized version is ID parade discs, on which video clips from a database appear, along with a video clip showing the suspect (Nissan 2012: Sec. 4.5.2.3). Bloodstain pattern analysts on the scene of a crime have their own needs and tools (Akin 2012), and so do forensic scientists. For those many categories, computer applications have been developed. Age-progression software is a kind of computer-graphic software, useful to the police for the purposes of locating missing people, in that it predicts how a given person (based on an old photograph) would have aged meanwhile; it suggests what some child who has disappeared would look like a few years later (Nissan 2012: Sec. 8.2.3). Another tool aims at helping bring down the occurrences of suicide in US prisons: a device ‘can detect a prisoner’s vital signs from a wall or ceiling metres away’ (Rutkin 2014).

Nissan (2012) is a two-volume comprehensive overview of computer techniques for legal evidence, case preparation, procedural support and police investigation. We could argue that AI was involved in the development of many (not all) of these tools. Now some are simply used; others have been prototypes that did not enter professional practice. Yet other techniques have remained within the walls of academia, unimplemented, but interesting and with potential for application.

Bex’s book (2011) relates criminal evidence to narratives and argumentation. Legal databases and knowledge discovery using them by means of AI techniques are the subject of Stranieri and Zeleznikow’s book (2005). Also see Valente (1995), on an approach to legal knowledge engineering. In the present study, we are providing an overview; we are going to dwell more at length on tools or techniques to which, and to whose use, the present author is committed.

2 Jurisdictional differences

2.1 Bench trials versus trial by jury

It is necessary to consider the difference between jurisdictions and between criminal procedures in different countries. For example, jury research has been thriving in North America, but not in Europe, not even in Britain (where bench trials have gradually become common); there exist mathematical models of how a jury, or fact-finders in general, gradually become oriented to decide this or that way, and this is represented by quantitative parameters. Sometimes, AI techniques have been used in such descriptive models (Nissan 2012: Sec. 2.1). Modelling the work of a jury does not imply resulting changes in procedures.

2.2 From examining magistrate to prosecutor

In Italy, a sostituto procuratore would lead a police inquiry as an examining magistrate and would then turn into a prosecutor: in the 1990s, Judge Carmelo Àsaro developed single-handedly, while he was a sostituto procuratore, a software tool for procedural support, DAEDALUS, which proved popular nationwide among his colleagues, and which validates each and every step taken (Asaro et al. 2012). One reason for this is accountability, avoiding the sequence of steps taken being challenged by the defence later on. Another is the French and Italian legal maxim Quod non est in actis, non est in mundo: what is not in the dossier is treated as though it did not exist in the outside world. Quite clearly, English-speaking countries have a very different tradition: DAEDALUS is in use in Italy, and could be adapted to France or the Netherlands, but consider that, unlike in Italy, in Britain police investigation is not led by an examining magistrate. Also the statute of limitations (in Italy: termini di prescrizione) is very different, in that in Italy even the legal proceedings of the appeal have to be concluded before expiry.

Regardless of the kind of jurisdiction: ‘Procedural-support systems are AI & Law programs that lack domain knowledge and thus cannot solve problems, but that instead help the participants in a dispute to structure their reasoning and discussion, thereby promoting orderly and effective disputes’ (Prakken and Renooij 2001). ‘When procedural-support systems are to be useful in practice, they should provide support for causal reasoning about evidence’ (ibid.). Responsibility remains with the user: the judiciary, or the prosecutor, or lawyers, according to who is using such procedural-support tools.

In the USA, legal professionals can resort to local tools for procedural support. CaseMap is a commercial software tool for organizing the evidence; it is produced by CaseSoft, an American firm (www.casesoft.com).

2.3 Setting in software the nature of offences for the court of cassation

Judge Asaro also developed a diagrammatic representation for supporting criminal law tasks (Asaro 2012). He also designed a tool for the Court of Cassation in Rome, ITACA, then developed under contract by Siemens. I translate from Italian an excerpt from an email of his (14 June 2012):

Itaca concerns not the Cassation’s sentences, but the offence per se, which was subjected to an in-depth analysis. The offence is considered as an object, whose properties were described and, for each of them, its parameters. Next, an inventory and a hierarchy were made of the events affecting the offence. The events were modelled as objects, and configured as accessible upon being invoked, which in turn triggers ‘methods’ [in AI this is called ‘demons’] whose output modifies the parameters of the properties of the offence. Of the offence, as an object kind, quite everything has been modelled (penalty, fitness to proceed, competence, and so forth). Next, the individual offences were inserted into the database, along with the respective specifications. Upon this knowledge base, complex routines have been built which represent, by decomposing them into basic units of knowledge organized like a taxonomy, two fundamental procedures: the calculation of the expiry as per the statute of limitations (il calcolo del termine di prescrizione), and of the expiry of permissible detention before charges (e di quello di scadenza della custodia cautelare). Of both procedures, the pertinent events and ‘methods’ have been represented.
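The object-and-event design described in this excerpt can be illustrated by a minimal sketch. Everything in it is hypothetical: the class names, the base term and the effects of the events are invented for illustration, and reproduce neither ITACA nor Italian law. It only shows the idea of an offence object whose parameters are modified by 'methods' fired when an event is invoked, and of an expiry date computed from those parameters.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable, List


@dataclass
class Offence:
    """An offence modelled as an object whose parameters events may modify."""
    name: str
    committed_on: date
    limitation_days: int       # hypothetical base limitation term, in days
    suspended_days: int = 0    # days during which the term did not run
    log: List[str] = field(default_factory=list)

    def limitation_expiry(self) -> date:
        # Expiry = date of the offence + base term + accumulated suspensions.
        return self.committed_on + timedelta(
            days=self.limitation_days + self.suspended_days
        )


@dataclass
class Event:
    """An event whose 'method' (demon) fires when the event is invoked."""
    label: str
    method: Callable[[Offence], None]

    def invoke(self, offence: Offence) -> None:
        self.method(offence)
        offence.log.append(self.label)


def suspension(days: int) -> Event:
    # Hypothetical event: the limitation term stops running for `days` days.
    def method(offence: Offence) -> None:
        offence.suspended_days += days
    return Event(f"suspension of {days} days", method)


def extension(days: int) -> Event:
    # Hypothetical event: the base term itself is lengthened by `days` days.
    def method(offence: Offence) -> None:
        offence.limitation_days += days
    return Event(f"extension of {days} days", method)


offence = Offence("example offence", date(2010, 1, 1), limitation_days=365)
for event in (suspension(30), extension(10)):
    event.invoke(offence)
print(offence.limitation_expiry())  # 2011-02-10
```

Keeping the events as separate objects means that a new kind of event can be added without touching the routine that computes the expiry, which is one plausible reading of why the events were inventoried and arranged in a hierarchy.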

If such support tools are found to be useful, they can be deployed, with the user (i.e. the judiciary, or lawyers, depending on whom the tool is tailored for) remaining responsible.

2.4 Points of procedure

There exist tools helping with specific points of procedure: Scottish academics developed an expert system, ADVOKATE, for assessing eyewitness suitability and reliability (Bromby and Hall 2002; Nissan 2012: Sec. 4.4). ‘In 1976, following several high profile cases of mistaken identity and poorly conducted line-ups in England and Wales, the Devlin Report provided a thorough analysis of identification evidence in criminal proceedings’ (Bromby et al. 2007: 308). ‘Following the Devlin Report, the first case to give serious consideration to eyewitness identification was R v Turnbull’ (ibid.). As for ADVOKATE: ‘The model includes information such as the distance between the witness and the perpetrator; the duration of the observation; and the visibility or lighting conditions. These elements, along with several other event factors are commonly referred to as the “Turnbull Rules” derived from the case R v Turnbull. The application of these factors has now become a requirement in England and Wales when considering the admissibility of eyewitness testimony in court. The Turnbull rules have been applied by other common law jurisdictions, notably Ireland, Canada and Australia’ (Bromby et al. 2003: 100).

Turning to another subject: procedural differences may pertain, for example, to how expert witnesses are allowed to give evidence. ‘In Australia, […], the Federal Court has encouraged (through its court rules) both “hot-tubbing” and joint conferences of experts. In the joint conference court rules, judges attempt to control expert witness partisanship by directing expert witnesses to confer, or to produce a document identifying the matters on which the experts agree and those on which they disagree. Under the “hot-tubbing” rules, experts testify together in court, responding to questions from attorneys and each other, as well as the judge. Judges may also appoint their own witnesses, although they rarely do so in criminal trials’ (Beecher-Monas 2008).

There is a difference between hot-tubbing and joint conferences of experts, in respect of lawyers’ interventions: ‘“Hot-tubbing” is also known as taking concurrent evidence. In this procedure, the experts for both sides simultaneously take the stand in court and question each other about their opinions on the record. They are also subject to questioning by the court and the lawyers. Hot-tubbing, in contrast to joint conferences of experts, permits the participation of legal counsel in exchanges among the experts’ (ibid.).

Modelling particular decisions is not new. Arguably, using software tools to model particular processes is relatively unproblematic. Nevertheless, as we are going to see, to the extent that probabilistic tools are used in order to prescribe rather than describe adjudication in criminal cases (as opposed to the decision whether to prosecute), this is a nest of hornets in (especially USA) legal scholarship.

Now consider that a Finnish philosopher, Holmström-Hintikka (1995, 2001), has developed a formalism modelling the questioning of an expert witness. It stands to reason that procedural differences would need to be accounted for in more refined models, such as ones (I suggest) to be incorporated in computer tools for helping lawyers or prosecutors to plan the direct examination or cross-examination of experts. There is an extensive literature on such issues of advice.

An expert system dealing with the hearsay rule is the Hearsay Rule Advisor (HRA). It was developed as an LL.M. project in Vancouver, Canada, by Blackman (1988), under the supervision of MacCrimmon (1989). That expert system ‘provides advice on whether a statement comes within the definition of hearsay and if so, whether the statement comes within an exception to the general rule excluding hearsay statements’ (MacCrimmon 1989: 468). The initial questions the user is asked by this expert system—MacCrimmon explains (ibid., pp. 467–468)—

classify exceptions based on the context of the trial (whether the declarant is available to testify and the type of trial, civil or criminal). This part of the program eliminates some exceptions as more facts become known. At this time the exceptions included in the program are: dying declarations, declarations against interest, declarations in the course of duty, and business documents (British Columbia only). Hearsay exceptions in the HRA are classified on the basis of four dimensions: EVENT, PERCEIVE, BELIEVE [and] INTEND. First the system searches for an approximate match between the user’s facts and the events in the system. Once a match is found, the user is asked questions designed to assess whether the three dimensions of PERCEIVE, BELIEVE and INTEND for a particular exception are satisfied by the user’s facts. These questions are tailored to fit the EVENT identified so that the system does not waste time with irrelevant or inapplicable questions. […]. Legal liability often turns on whether a person knows, thinks, believes certain things and not simply on whether they are in a particular emotional state.

We begin with the declarant as the principal actor. The action is the making of the statement. The EVENT is defined as the events which initiate the required belief states which initiate the goal of telling the truth. Thus for dying declaration the initiating events are the declarant is wounded, and the declarant is dying. It is assumed that these events initiate the belief that the declarant is dying that initiates the goal of telling the truth. For the exception, declarations in the course of duty, the initiating events are the declarant is performing a duty and others are relying on his or her actions which initiate the belief state that the declarant expects to be discovered if he or she makes an error which in turn initiates the goal of avoiding censure by his or her employer.

The dimensions of PERCEIVE [and] BELIEVE may be related to states of the world which enable the declarant to make a true statement. Circumstances which facilitate accurate perceptions are often required. […] INTEND focuses on the facts of the specific case being considered in order to establish the requisite belief state.
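The two-stage procedure that MacCrimmon describes (first eliminating exceptions according to the context of the trial, then matching an EVENT and checking the PERCEIVE, BELIEVE and INTEND dimensions) can be sketched as follows. This is an illustration under stated assumptions, not the HRA itself: the entries in the table are simplified and hypothetical, they are not statements of the law, and the 'approximate match' of the original is replaced here by a plain subset test.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class HearsayException:
    name: str
    trial_types: FrozenSet[str]          # "civil" and/or "criminal"
    declarant_must_be_unavailable: bool  # hypothetical context condition
    event_facts: FrozenSet[str]          # facts that make up the EVENT


# Hypothetical, simplified entries for illustration only.
EXCEPTIONS = [
    HearsayException(
        "dying declaration", frozenset({"criminal"}), True,
        frozenset({"declarant wounded", "declarant dying"}),
    ),
    HearsayException(
        "declaration in the course of duty", frozenset({"civil", "criminal"}), True,
        frozenset({"declarant performing duty", "others rely on declarant"}),
    ),
]

DIMENSIONS = ("PERCEIVE", "BELIEVE", "INTEND")


def candidates(trial_type: str, declarant_available: bool) -> List[HearsayException]:
    """Stage 1: eliminate exceptions ruled out by the context of the trial."""
    return [
        exc for exc in EXCEPTIONS
        if trial_type in exc.trial_types
        and not (exc.declarant_must_be_unavailable and declarant_available)
    ]


def advise(
    trial_type: str,
    declarant_available: bool,
    facts: Set[str],
    answers: Dict[Tuple[str, str], bool],
) -> List[str]:
    """Stage 2: match the EVENT, then require all three remaining dimensions."""
    applicable = []
    for exc in candidates(trial_type, declarant_available):
        if not exc.event_facts <= facts:   # EVENT not matched by the user's facts
            continue
        # Only questions for the matched exception are consulted.
        if all(answers.get((exc.name, dim), False) for dim in DIMENSIONS):
            applicable.append(exc.name)
    return applicable


facts = {"declarant wounded", "declarant dying"}
answers = {("dying declaration", dim): True for dim in DIMENSIONS}
print(advise("criminal", False, facts, answers))  # ['dying declaration']
```

Looking up answers by the pair (exception, dimension) mirrors the remark that the questions are tailored to the EVENT identified, so that nothing is asked about exceptions already ruled out.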

A referee remarked: “I am reminded of work in the 1970s by Schank, Colby and Abelson on scripts, and by McCarthy on legal reasoning”. My reply is: absolutely. In particular, what in the 1970s used to be the Yale school of automated narrative processing within computational linguistics, namely the Conceptual Dependency school mainly associated with Roger Schank, Christopher Riesbeck and Michael Dyer (see e.g. Schank 1972; Schank and Riesbeck 1981; Dyer 1983; and see an overview in Nissan 2012, Sec. 5.2), has been influential in the 2000s. This is fairly evident in two mutually independent developments: Victor Raskin and Sergei Nirenburg’s ontological semantics (Nirenburg and Raskin 2004), and my own episodic formulae, about which I have published extensively since 2001 (e.g. Nissan 2001a; Nissan et al. 2004; cf. an overview in Nissan 2012: Sec. 5.3).

3 US law professors’ perception of machine intelligence

Machine learning is a branch of artificial intelligence (AI, also called machine intelligence). Machine learning enables AI systems to improve their performance, by augmenting their knowledge. Machine learning is prominent in data mining, the pool of techniques for sifting through a huge mass of data to come up with information and patterns. Types of data mining include, for example, predictive data mining (whose aim is to learn from sample data in order to make a prediction and whose techniques include neural networks, rule induction, and linear and multiple regression); segmentation (whose aim is to automatically group data into groups/clusters and to discover meaningful groups in sample data and whose techniques include k-means clustering and self-organizing maps); summarization of the data (to automatically present data in a way that makes interpretation easier, helping the user visualize patterns or find associations within the sample data); time series, for forecasting; and text mining (i.e. data mining whose data are textual corpora).

In ‘Machine learning and law’ Surden (2014: 115), somewhat reductively, concluded that

there are a number of tasks within the law for which the statistical assessments within the ambit of current machine learning techniques are likely to be impactful despite the inability to technologically replicate the higher-order cognition traditionally called upon by attorneys. The general insight is that statistical and other heuristic-based automated assessments of data can sometimes produce automated results in complex tasks that, while potentially less accurate than results produced by human cognitive processes, can actually be sufficiently accurate for certain purposes that do not demand extremely high levels of precision and accuracy.

However, responsibility must remain with human professionals, who need a grasp of what counts as an acceptable answer.

My qualms with Surden’s statement are that there exist tasks at which artificial intelligence actually outperforms human abilities, especially when detection of patterns is required in a huge network. This is done by data mining, an area within machine learning, itself a subarea of artificial intelligence, making it useful for crime detection and police investigation. Clearly, the latter are quite distinct from what lawyers or judges do, and Surden was mainly thinking of the use of machine learning for retrieving text, including the text of legal precedents.

With a strong focus on the US market for legal services and the prospects for the legal profession, McGinnis and Pearce (2014: 3046) listed, and then went on to cleverly discuss, ‘five areas that machine intelligence will dramatically change in the near future: (1) discovery; (2) legal search; (3) document generation [e.g. computer-generated forms]; (4) brief and memoranda generation; and (5) prediction of case outcomes [legal analytics]. Developments in predictive analytics, which we will discuss at greatest length in connection with case outcomes, will affect all five of these areas’. I must say that the list is not complete, as it left out advances in software tools supporting the development of argumentation. Moreover, the prospects as listed raise difficult questions regarding legal judgements, which should be explicable.

The forecast of McGinnis and Pearce (2014: 3043) is bleak for that lower tier of legal professionals who carry out rather rote tasks, as ‘over time, these transformations will change legal practice, helping superstars at the expense of journeymen lawyers’. In fact, McGinnis and Pearce reason that whereas delivery in court will usually still require lawyers (but I have come across news of arbitration services in the USA that accept submissions sent online, in the interest of reduced cost and time), ‘lawyers who are in highly specialized areas subject to rapid legal change, like Dodd-Frank regulation, will be relatively unaffected, because machines will work best in more routinized and settled areas’ (ibid.: 3042).

My own (and a referee’s) response to this is that there needs to be wider discussion on threats to the employment of lower-tier professionals. This is not just a matter of AI. In fact, I submit, just think of the situation in England and Wales in the 2000s and 2010s, with the gradual cutting of funding to legal aid resulting in dire difficulties for some categories of legal firms, let alone forcing wide sectors of the population to renounce obtaining justice in the courts. That is to say, both market forces and policy are crucial factors in what will happen to lower-tier legal professionals.

Moreover, McGinnis and Pearce remarked, ‘counselors who must persuade unwilling clients to do what is in their self-interest will also continue to have a role, since machines will be unable to create the necessary emotional bonds with clients. But journeymen lawyers—such as those who write routine wills, vet house closings, write standard contracts, and review documents—face a much bleaker future, because machines will do many such routine legal tasks. Thus, while the arrival of the machines will be generally good for consumers, the picture is much more mixed for lawyers’ (McGinnis and Pearce 2014: 3042). I suspect that the potential is a mixed picture also for the public, and not only because large segments would still be priced out of access to justice, while being potentially faced with opponents with enhanced capabilities. One should not take it for granted that potential is identical whatever the jurisdiction. There are questions for professionals.

Fairness to the parties is a concern which, on record, has led in Italy to a given technology being disallowed in a given case, because one party resorted to it but it was not available to the other party. During an international meeting with Italian and South American legal scholars in Pisa in the late 1990s, Antonio A. Martino pointed out a precedent from Italy: in a criminal case, counsel for the defendant claimed that the defendant was disadvantaged because the prosecution had presented its case by resorting to advanced technology (in fact it was merely a trivial hypermedia tool, with links like those you see on the Web); the court accepted that objection, and the prosecution’s use of legal computing tools was disallowed. Those, however, were rather early days. Martino himself had much earlier introduced the Italian term giurimatica (‘jurimatics’) for rather trivial tooling helping the legal profession, as simple as document processors, but eventually took care not to apply the term to artificial intelligence tools. That is to say, his neologism had been depreciated because of its early history of application. A referee remarked: ‘We could never fully know what tools are used by lawyers’.

Liability issues arising from the use of expert systems in the field of law were discussed by Alheit (1989), as a particular case of liability in relation to the use of expert systems, for which, see Zeide and Liebowitz (1987). Alheit pointed out, in general concerning knowledge-processing software, that ‘[t]here exists a tremendous litig[ation] potential over their use, misuse, and even non-use’ (Alheit, ibid.: 43, referring to Zeide and Liebowitz 1987). A referee remarked: “My assumption is that the professionals remain responsible. Philip Leith and others explored these questions”.

4 Information retrieval and text mining for discovery

There exists a variety of approaches to the lawyer’s task of discovery, in AI & Law. For one approach, see Daniels and Rissland’s (1997) ‘Finding legally relevant passages in case opinions’. In discovery, information retrieval is the rather obvious area of computing subserving the lawyer’s task of document review. McGinnis and Pearce state (2014: 3047):

In predictive coding, lawyers look at a sample of the larger set of documents. Computer technicians help construct algorithms that predict whether a document is relevant. Of course, predictive coding is imperfect, because it can miss some documents. But, imperfection is the norm even when lawyers perform document review, where fatigue, boredom, and other frailties—which do not affect machines—can substantially reduce the accuracy of document review. As a result, some [U.S.] courts have approved predictive coding as a tool of discovery that essentially will make the final decisions of relevance, because they believe the price and performance of e-discovery is at least equal to that of the traditional kind.
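The workflow in this passage (lawyers label a sample, and a model learned from that sample predicts relevance for the remaining documents) can be made concrete with a minimal sketch. It assumes a multinomial naive Bayes classifier with add-one smoothing, which is only one of many techniques a real e-discovery tool might use; the documents and labels are invented, and the sample is assumed to contain at least one relevant and one irrelevant document.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Model = Tuple[Dict[bool, Counter], Counter, Set[str]]


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def train(sample: List[Tuple[str, bool]]) -> Model:
    """Learn per-class word counts from the lawyer-labelled sample."""
    counts: Dict[bool, Counter] = {True: Counter(), False: Counter()}
    docs: Counter = Counter()
    for text, relevant in sample:
        counts[relevant].update(tokenize(text))
        docs[relevant] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def is_relevant(model: Model, text: str) -> bool:
    """Compare the log-probabilities of the two classes for an unseen document."""
    counts, docs, vocab = model
    total_docs = docs[True] + docs[False]
    scores = {}
    for label in (True, False):
        total_words = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)      # class prior
        for word in tokenize(text):
            if word in vocab:                           # unseen words are ignored
                score += math.log(
                    (counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return scores[True] > scores[False]


# Hypothetical labelled sample: True means "relevant to the dispute".
sample = [
    ("merger pricing agreement with competitor", True),
    ("pricing discussion before the merger", True),
    ("lunch menu for friday", False),
    ("office party on friday", False),
]
model = train(sample)
print(is_relevant(model, "draft pricing agreement"))  # True
print(is_relevant(model, "friday lunch"))             # False
```

As the quoted passage notes, such a predictor is imperfect; in practice its output would be checked against a further human-reviewed sample before being relied upon.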

Judgements regarding relevance remain the responsibility of the professional.

There exist tools for text mining (Nissan 2012: Sec. 6.1.9), a form of data mining as applied to huge textual corpora (one of the techniques employed is support vector machines), and there exist applications to legal documents (ibid.: Sec. 6.1.11). Forms of inconsistency are problematic, such as due to judicial error or to new legislation or precedents (ibid.: Sec. 6.1.12). Sometimes text clustering is done by resorting to neural computation (ibid.: Sec. 6.1.14.15), or to fuzzy logic (ibid.: Sec. 6.1.15), or to genetic algorithms (ibid.: Sec. 6.1.16).

There are applications of text mining for criminal investigation: this is the case of email mining (ibid.: Sec. 6.2.1.1). Researchers have used the Enron email database after it was put in the public domain, following the Enron scandal (ibid.: Sec. 6.2.1.2). The SIGHTS text mining system was applied to the discovery of social coalitions (ibid.: Sec. 6.2.1.3). A referee remarked: “The rise of Google and ‘big data’ mean that the genie is out of the bottle”. Indeed.

Besides, there have been projects in automated text-summarization, and some of these are for use by legal professionals (ibid.: Sec. 6.1.8). For news story categorization systems, already in the 1980s nice results could be shown, combining information retrieval and natural-language processing. Mani (2001) provides a systematic introduction to automated summarization techniques (cf. Hahn and Mani 2000). Another book on the subject is Moens (2000), whereas Moens (2001) is on legal text retrieval. In the SALOMON project, Belgian criminal cases were summarized automatically and presented as a case profile. Case category, case structure and irrelevant text units are identified based on a knowledge base represented as a text grammar. Thematically important text units and key terms are selected by shallow techniques. These paragraphs are represented as weighted vectors and clustered. The system points the user effectively towards relevant texts. SALOMON was described in Uyttendaele et al. (1998), Moens et al. (1997, 1999). Another tool is FLEXICON (Fast Legal Expert Information CONsultant): it extracts relevant text units based on location heuristics, occurrence frequencies of index terms and the use of indicator phrases, and automatically generates case summaries (Gelbart and Smith 1993). Experiments with the SUM project of Grover et al. (2003) were conducted on sentences from Britain’s House of Lords (the latter’s judicial role has meanwhile been transferred to the new Supreme Court). Yet another automated summarization tool for law is LetSum (Farzindar and Lapalme 2004). At any rate, reporting and summarizing tools are not new.
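The three kinds of cue mentioned for extraction (location, term frequency and indicator phrases) can be combined in a very small extractive summarizer. The sketch below is not FLEXICON or any of the systems cited: the weights, the stopword list and the indicator phrases are hypothetical, and the naive sentence splitter would be misled by abbreviations such as 'v.' in real case reports.

```python
import re
from collections import Counter
from typing import List

# Hypothetical cue phrases and stopwords, chosen only for this illustration.
INDICATOR_PHRASES = ("we hold", "the court finds", "in conclusion")
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "that", "is", "was", "for", "on"}


def split_sentences(text: str) -> List[str]:
    # Naive split after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text: str) -> List[str]:
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text: str, n: int = 2) -> List[str]:
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(index: int) -> float:
        sentence = sentences[index]
        tokens = content_words(sentence)
        # Average document-wide frequency of the sentence's content words.
        frequency = sum(freq[w] for w in tokens) / max(len(tokens), 1)
        # Location heuristic: favour the first and the last sentence.
        location = 1.0 if index in (0, len(sentences) - 1) else 0.0
        # Indicator-phrase heuristic.
        indicator = 2.0 if any(p in sentence.lower() for p in INDICATOR_PHRASES) else 0.0
        return frequency + location + indicator

    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]


text = (
    "The appellant was convicted of theft. "
    "The weather on the day was mild. "
    "Counsel argued that the identification was unreliable. "
    "We hold that the conviction for theft is unsafe."
)
print(summarize(text))
```

On this invented four-sentence text, the first and last sentences are selected, the last one mainly because of the indicator phrase.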

5 Analysing the risk of going to court

Lawyers need different tools from the police, and the partial overlap is only in what a prosecutor could potentially use, for example a tool helping to evaluate costs or benefits of obtaining a piece of evidence, or of whether to litigate or prosecute. In the latter case, a plea bargain is a possibility in some jurisdictions; it used to be abhorred in Italy, but reform made it an option: the 1990s saw procedural reform in various European states indeed, with the adoption of some features from the Anglo-American adversarial system.

The prediction of case outcomes is known as legal analytics: the application of predictive analytics in the legal domain. An example made by McGinnis and Pearce (2014: 3052) is that

one form of legal analytics would use fact patterns and precedent to predict a case’s outcome, thereby better equipping lawyers to assess the likely result of litigation. Of course, lawyers make implicit judgments about litigation prospects when advising clients whether to bring a lawsuit, settle, or go to trial. But their advice is based on their intuitions and limited to their direct or indirect experience of law. The advantage of predictive analytics is that it provides a mechanism both to access a vast amount of information and systematically mine that information to understand the likely outcome of the case at hand.

At the end of a legal process, there will be judgements. Many tools and methods may have been used.

Litigation Risk Analysis is a proprietary method of Marc B. Victor for quantifying legal and factual uncertainties by assuming probabilities, constructing a decision tree, and using it in order to evaluate the risks of litigation (Nissan 2012: Sec. 4.3.2.3). TreeAge Pro is decision-tree software that can be used for performing a Litigation Risk Analysis (ibid.: Sec. 4.3.1).
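The general idea of evaluating a decision tree with assumed probabilities can be shown with a generic expected-value computation. This is a textbook sketch, not Victor's proprietary method nor the behaviour of TreeAge Pro; the probabilities, costs and payoffs are invented.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Outcome:
    payoff: float  # net monetary result for the client


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]  # (assumed probability, subtree)


@dataclass
class Decision:
    options: List[Tuple[str, "Node"]]     # (label, subtree)


Node = Union[Outcome, Chance, Decision]


def expected_value(node: Node) -> float:
    """Roll the tree back from the leaves to the root."""
    if isinstance(node, Outcome):
        return node.payoff
    if isinstance(node, Chance):
        total = sum(p for p, _ in node.branches)
        if abs(total - 1.0) > 1e-9:
            raise ValueError("probabilities at a chance node must sum to 1")
        return sum(p * expected_value(child) for p, child in node.branches)
    # Decision node: the client is assumed to pick the best option.
    return max(expected_value(child) for _, child in node.options)


def best_option(decision: Decision) -> str:
    return max(decision.options, key=lambda option: expected_value(option[1]))[0]


# Hypothetical figures: litigating costs 30,000 whatever the result.
litigate = Chance([
    (0.6, Chance([                 # liability is found
        (0.5, Outcome(150_000 - 30_000)),   # high damages
        (0.5, Outcome(50_000 - 30_000)),    # low damages
    ])),
    (0.4, Outcome(-30_000)),       # the claim fails
])
root = Decision([("settle", Outcome(40_000)), ("litigate", litigate)])

print(round(expected_value(litigate)))  # 30000
print(best_option(root))                # settle
```

With these invented numbers the expected value of litigating (30,000) is below the settlement offer (40,000), so the tree favours settling; a client's attitude to risk, which a plain expected value ignores, could of course change that conclusion.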

A firm, Lex Machina (a spin-out of Stanford University), predicts outcomes in patent litigation in the USA based on a mass of historical precedents: ‘customers can look at stats on courts, judges, attorneys and patent rulings to put together a better legal offense or defense in the high-stakes game of patent litigation—instead of relying on their experience and instincts’ (Harbert 2012); ‘technology companies […] have been waging a legal battle through patents. The ownership of innovations in mobile communications is a major front in the conflict’ (ibid.). Lex Machina ‘grew out of Stanford’s IP Litigation Clearinghouse (IPLC). The original plan for IPLC was to create a database similar to what Stanford had done with securities law’ (ibid.). ‘Patent cases are more numerous than securities cases’ (ibid.). ‘Patent cases also are more complex, coming in many shades of gray’ (ibid.). ‘As a Stanford spin-out, Lex Machina makes the database available for free to certain public-interest entities, including the courts, academics, government policymakers and media outlets’ (ibid.). There will be a market for such advisory systems.

Automated quantitative legal prediction is one of the subjects in Katz (2013). The model described in Ruger et al. (2004) is based on precedent and is applied to predicting the decision-making of the US Supreme Court. The model outperformed a set of Supreme Court human experts at predicting future outcomes. Ruger et al. (2004: 1150) explained:

For every argued case during the 2002 Term, we obtained predictions of the outcome prior to oral argument using two methods—one a statistical model that relies on general case characteristics, and the other a set of independent predictions by legal specialists. The basic result is that the statistical model did better than the legal experts in forecasting the outcomes of the Term’s cases: The model predicted 75 % of the Court’s affirm/reverse results correctly, while the experts collectively got 59.1 % right. These results are notable, given that the statistical model disregards information about the specific law or facts of the cases. The model’s relative success was due in large part to its ability to predict more accurately the important votes of the moderate Justices (Kennedy and O’Connor) at the center of the current Court. The legal experts, by contrast, did best at predicting the votes of the more ideologically extreme Justices, but had difficulty predicting the centrist Justices.

A referee remarked: ‘Systems can make predictions about structures’. Ruger et al. also pointed out (2004: 1157–8):

In quantitative studies run retrospectively, the attitudinal model has been very successful in accounting for—technically “postdicting”—the outcomes of Supreme Court cases. For all of its postdictive success, however, there are a few problems both technical and conceptual—with using the standard attitudinal model to predict cases. The technical problems are twofold. The first is that the attitudinal model is quite good at predicting the Justices’ array along a particular linear dimension. But in its basic form it is not particularly good at situating specific cases ex ante along that linear array so as to predict where the key decision point will be—that is, how many Justices will vote one way and how many the other. As long as the Justices’ votes align according to the predicted spatial array, the outcome is regarded as consistent with the attitudinal model, irrespective of the decisional dividing line. So for instance, on the current Court a unanimous decision either way is consistent with the attitudinal “prediction”, but so too is a 5–4 decision where justice O’Connor joins Rehnquist/Thomas/Scalia/Kennedy, and so too is a 5–4 decision where she joins the Stevens/Ginsburg/Breyer/Souter quartet. The only type of decision that flunks the spatial model is one where, say, Justices Scalia and Thomas vote with Stevens, Ginsburg and Souter to vacate a defendant’s sentence and Justice Breyer is with Rehnquist, O’Connor and Kennedy in dissent. Clearly, a model that would claim predictive accuracy in a case like Grutter v. Bollinger, irrespective of whether Justice O’Connor voted to uphold or strike down the affirmative action plan at issue, leaves much to be desired.
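The point made in this passage, that almost any split of votes is 'consistent' with a linear array, can be made concrete with a short sketch. The ordering of the Justices within each bloc below is an assumption made only for illustration (the quotation fixes the two blocs and O'Connor's central position, not the exact order), and the function name is hypothetical.

```python
def consistent_with_spatial_array(order, majority):
    """A decision fits the linear array if the majority occupies one
    contiguous end of it, i.e. the array is cut at no more than one point."""
    sides = [justice in majority for justice in order]
    cuts = sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    return cuts <= 1


# Assumed ordering along one dimension; only the bloc membership and
# O'Connor's central position come from the quoted passage.
ORDER = ["Thomas", "Scalia", "Rehnquist", "Kennedy", "O'Connor",
         "Breyer", "Souter", "Ginsburg", "Stevens"]

unanimous = set(ORDER)
oconnor_with_first_bloc = {"Thomas", "Scalia", "Rehnquist", "Kennedy", "O'Connor"}
oconnor_with_second_bloc = {"O'Connor", "Breyer", "Souter", "Ginsburg", "Stevens"}
mixed = {"Scalia", "Thomas", "Stevens", "Ginsburg", "Souter"}

for name, majority in [("unanimous", unanimous),
                       ("O'Connor joins first bloc", oconnor_with_first_bloc),
                       ("O'Connor joins second bloc", oconnor_with_second_bloc),
                       ("mixed coalition", mixed)]:
    print(name, consistent_with_spatial_array(ORDER, majority))
```

Only the mixed coalition fails the check. Both 5-4 outcomes pass, which is exactly why consistency with the array says little about which way a case will be decided.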

6 Argumentation

6.1 Argumentation and its representations

The remit of argumentation is how to put forth propositions in support of or against something. An established field in rhetoric, within AI & Law it became a major field during the 1990s. Prakken and Sartor (2002: Sec. 1.2) usefully

propose that models of legal argument can be described in terms of four layers. The first, logical layer defines what arguments are, i.e. how pieces of information can be combined to provide basic support for a claim. The second, dialectical layer focuses on conflicting arguments: it introduces such notions as ‘counterargument’, ‘attack’, ‘rebuttal’ and ‘defeat’, and it defines, given a set of arguments and evaluation criteria, which arguments prevail. The third, procedural layer regulates how an actual dispute can be conducted, i.e., how parties can introduce or challenge new information and state new arguments. In other words, this level defines the possible speech acts, and the discourse rules governing them. Thus the procedural layer differs from the first two in one crucial respect. While those layers assume a fixed set of premises, at the procedural layer the set of premises is constructed dynamically, during a debate. This also holds for the final layer, the strategic or heuristic one, which provides rational ways of conducting a dispute within the procedural bounds of the third layer.

Here work with a background in AI can help with problem description.

Wigmore Charts are a graphic method of structuring legal arguments, currently conspicuous in some more formal approaches within legal evidence scholarship; they were first introduced by the American jurist John Henry Wigmore in the Illinois Law Review, 8 (1913), 77. See Nissan (2012: Sec. 3.2), Anderson et al. (2005). Computer scientists are more used to another data structure, the Toulmin structure of argument, defined by Toulmin (1958), and for which see Hitchcock and Verheij (2005). Ball (1994) applied it to the analysis of public policy arguments. At a conference in AI & Law, Bench-Capon (1998) described a computer tool that implements a Toulmin dialogue game. Bench-Capon is a leading scholar in research into argumentation within AI & Law; that area of research is thriving and has resulted in a large body of publications. See, e.g., Prakken’s (2002) ‘Incomplete arguments in legal discourse’ and Prakken’s (2006) ‘Formal systems for persuasion dialogue’. Cf. his ‘A formal model of adjudication dialogues’ (Prakken 2008).
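For readers unfamiliar with it, the Toulmin structure can be sketched as a simple record with six components: data, claim, warrant, backing, qualifier and rebuttal. The class and method names below are hypothetical, and the instance paraphrases Toulmin's well-known Bermuda example; this is not the representation used by any of the tools cited here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToulminArgument:
    data: str                        # the facts appealed to
    claim: str                       # the conclusion argued for
    warrant: str                     # the rule licensing the step from data to claim
    backing: Optional[str] = None    # what supports the warrant itself
    qualifier: Optional[str] = None  # the strength with which the claim is made
    rebuttal: Optional[str] = None   # exceptional conditions defeating the claim

    def render(self) -> str:
        """Lay the argument out in Toulmin's 'so / since / unless' pattern."""
        hedge = f"{self.qualifier}, " if self.qualifier else ""
        parts = [f"{self.data}.", f"So, {hedge}{self.claim}.", f"Since {self.warrant}."]
        if self.backing:
            parts.append(f"On account of {self.backing}.")
        if self.rebuttal:
            parts.append(f"Unless {self.rebuttal}.")
        return " ".join(parts)


harry = ToulminArgument(
    data="Harry was born in Bermuda",
    claim="Harry is a British subject",
    warrant="a man born in Bermuda will generally be a British subject",
    backing="the relevant statutes and other legal provisions",
    qualifier="presumably",
    rebuttal="both his parents were aliens or he has become a naturalised American",
)
print(harry.render())
```

Keeping the rebuttal and the qualifier as explicit fields is what distinguishes this layout from a plain premises-and-conclusion pair, and it is what a dialogue game over such structures can challenge move by move.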

6.2 Computer tools helping with argumentation

Araucaria is a relatively widespread tool for visualizing arguments (Reed and Rowe 2001, 2004). It was developed at the University of Dundee in Scotland. The software is freely available. It was also discussed in Chapters 11 and 12 in Walton et al. (2008). Cf. Nissan (2012: Sec. 3.7). Carneades is a computer tool, implemented using a functional programming language and Semantic Web technology, based on a particular formal model of argumentation (Gordon and Walton 2006). DART is a tool for supporting argumentation (Freeman and Farley 1996), which was applied to legal situations also by Gulotta and Zappalà (2001).

QuestMap (Carr 2003) is a computer tool for supporting argumentation. It is based on IBIS, mediates discussions, supports collaborative argumentation and creates information maps, in the context of legal education. Collaborative problem identification and solving is the purpose of IBIS, an Issue-Based Information System; problems are decomposed into issues. Reason!Able [sic] is a computer tool for supporting argumentation (van Gelder 2002). Some tools envisage collaboration among users, yet Reason!Able only has one user per session. It guides the user step-by-step through the process of constructing an argument tree, containing claims, reasons and objections, the latter two kinds being complex objects which can be unfolded to see the premises.

ArguMed is another computer tool for visualizing arguments (Verheij 2003). One of its peculiarities is the concept of entanglement. In the words of Walton et al. (2008: 398): ‘In ArguMed, undercutting moves, like asking a critical question, are modelled by a concept called entanglement. The question, or other rebuttal, attacks the inferential link between the premises and conclusion of the original argument, and thereby requires the retraction of the original conclusion. On a diagram, entanglement is represented as a line that meets another line at a junction marked by an X’.

Convince Me is a computer tool for supporting argumentation (Schank and Ranney 1995). It is one of the tools reviewed in van den Braak et al. (2006). It is based on Paul Thagard’s Theory of Explanatory Coherence (e.g. Thagard 2000). The arguments consist of causal networks of nodes (which can display either evidence or hypotheses), and the conclusion which users draw from them. Convince Me predicts the user’s evaluations of the hypotheses based on the arguments produced, and gives feedback about the plausibility of the inferences which the users draw.

6.3 Argumentation-based computer tools for criminal investigation or tribunals

Stevie is an argumentation-based computer tool intended for supporting criminal investigation. Stevie enables analysts to view evidence and inferences. The program is described as distilling out of that information coherent stories which are ‘hypothetical reconstructions of what might have happened’, and which are defined as ‘a conflict-free and self-defending collection of claims’ which moreover is temporally consistent (van den Braak and Vreeswijk 2006). Cf. Nissan (2012: Sec. 3.10.2). In contrast, legal narratives as intended in legal scholarship, as well as computer methods for processing narratives, are the subject of Nissan (2012: Ch. 5).
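The two quoted conditions can be illustrated on an abstract attack relation between claims. The sketch below is not Stevie's implementation: it assumes that 'conflict-free' means no claim in the collection attacks another, and that 'self-defending' means every attacker of a member is itself attacked by some member. The claims and attacks are hypothetical, and the temporal-consistency requirement is left out.

```python
def is_conflict_free(claims, attacks):
    """No claim in the collection attacks another claim in it."""
    return not any((a, b) in attacks for a in claims for b in claims)


def is_self_defending(claims, attacks):
    """Every attack on a member is answered by an attack from a member."""
    for attacker, target in attacks:
        if target in claims:
            if not any((defender, attacker) in attacks for defender in claims):
                return False
    return True


def is_story(claims, attacks):
    return is_conflict_free(claims, attacks) and is_self_defending(claims, attacks)


# Hypothetical claims in an investigation; (x, y) means "x attacks y".
attacks = {
    ("alibi_from_witness", "suspect_at_scene"),
    ("witness_unreliable", "alibi_from_witness"),
}

print(is_story({"suspect_at_scene", "witness_unreliable"}, attacks))  # defended
print(is_story({"suspect_at_scene"}, attacks))                        # undefended
print(is_story({"suspect_at_scene", "alibi_from_witness"}, attacks))  # internal conflict
```

Only the first collection qualifies: the claim that the suspect was at the scene survives because the collection also contains the claim that undermines the alibi.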

EMBRACE is a decision-support system for Australia’s Refugee Review Tribunal (Stranieri et al. 2012a, b: Sec. 3.11.6.2). It is an application of the Generic Actual Argument Model (ibid.: Sec. 3.11.5), which is a generic non-dialectical model of argumentation. That argumentation models may be non-dialectical may be counterintuitive, but EMBRACE is an example of how useful this can be.

7 Examples of kinds of reasoning

7.1 Case-based reasoning

Legal search combing for precedents is not identical with, yet can be subserved by, a particular AI technique: case-based reasoning (CBR). ‘Case-based reasoning is the process of using previous experience to analyse or solve a new problem, explain why previous experiences are or are not similar to the present problem and adapting past solutions to meet the requirements of the present problem’ (Stranieri and Zeleznikow 2005). Within case-based reasoning, ‘[t]he model based approach assumes that there is a strong causal model of the domain task. It generally involves selecting among partially matched cases, in which symbolic reasoning is used to determine the difference between the given problem and the retrieved cases’ (ibid.). Redmond and Blackburn (2003) described an application of case-based reasoning and other methods for predicting repeat criminal victimization. Of course, case-based reasoning is not just a matter of AI.

The contrast between rule-based and case-based intelligent systems from artificial intelligence should not be mistaken for the contrast between those legal jurisdictions that mainly judge based on precedent (which is the case of Anglo-Saxon countries) and those jurisdictions (such as France) where adjudication is mainly based on rules as stated in law as made by legislators. Moreover, the two opposite pairs do not overlap even when either rule-based or case-based reasoning is adopted in intelligent software systems applied to the legal domain. Bain’s JUDGE system (Bain 1986, 1989a, b) is, among other things, a tool whose AI mechanism is case-based reasoning. It adopts a hybrid approach involving both rule-based and case-based systems. JUDGE is a cognitive model of judges’ decision-making when sentencing (and indeed it was based on interviews with judges). Both judges and lawyers may find sentencing information systems useful (Tata et al. 1996):

Formally, a Sentencing Information System is descriptive rather than prescriptive. That is, it contains no guidance as to how a sentencer might use this information to help in making the sentencing decision in a particular case. A Sentencing Information System (SIS) can display the range of sentences for the particular combination of offence and offender characteristics selected. The sentencer will have no guidance as to what extent and in what direction the appropriate sentence for the case at hand should vary from the average. This decision is a matter for the discretionary judgement of the sentencer. However, the frequency distribution indicates the highest and lowest sentences previously passed for the type of case at hand. In a well trodden area it might be assumed that a sentencer would have to have good reasons for straying outside the upper and lower limits, although there are no formal reasons why a sentencer should not choose to do so nor does the SIS restrict the sentencer’s choice in any way.

JUDGE ‘begins with an empty case-library and a handful of heuristics for deciding sentences when no cases can be applied to a new situation. After only a few cases, however, it begins to retrieve ‘remindings’ of its own cases from memory and to modify the strategies associated with those cases to form new sentences’ (Tata et al. 1996). ‘The idea of a case-based reasoning system unsupported by heuristics has also been used as a basis for modelling the sentencing process. Murbach and Nonn (1991) report progress on a project to develop a sentencing support system for fraud cases in Canada. Their system provides information about penalties but also includes information on case factors not included in the categories of offence used in the penal code but agreed by judges to be relevant to sentencing. There is thus an attempt to include information which reflects judicial perceptions of seriousness in order to make the system more sensitive and thus more useful to sentencers’ (ibid.).

‘[S]o that criminal justice personnel could easily compute, record, archive and examine the implications of the US Sentencing Commission Guidelines’ (ibid.), ASSYST (Applied Sentencing systems) was developed by the US Federal Court system (Simon et al. 1991). Another tool aiming at uniform sentencing (more consistent sentencing) is The Judge’s Apprentice (HaCohen-Kerner and Schild 1999). It is an expert system concerned with given categories of offences, and its aim is to assist in enhancing uniform sentencing at Israeli courts; these are akin to courts in the Anglo-American system, but there are no juries, only bench trials (the decision-making is by professional judges). It seeks to preserve uniform sentencing by reference to a ‘base-sentence’, not just to retrieved precedents (which is how JUDGE works). In Bain’s JUDGE, precedents are retrieved and processed based on indexes capturing situations; in addition, HaCohen-Kerner and Schild’s tool is prescriptive, and also resorts to empirical conceptual indexes which resulted from interviews with judges, and (unlike JUDGE) provides comparisons to precedents and justifications for the sentence it suggests. JUDGE, from Liverpool, was descriptive: not for suggesting a sentence in a real judicial context, but for modelling the behaviour of judges who sentence criminals.

Ashley’s (1991) HYPO argumentation system (which modelled adversarial reasoning with legal precedents) was continued in the CABARET project (Rissland and Skalak 1991) and the CATO project (Aleven and Ashley 1997). HYPO ‘is a case based [i.e. precedent-driven] reasoner developed by Ashley and Rissland at University of Massachusetts at Amherst. It analyses problem situations dealing with trade secrets disputes, retrieves relevant legal cases from its database, and fashions them into reasonable legal arguments. It has turned out to be the benchmark on which other legal case based reasoners have been constructed’ (Stranieri and Zeleznikow 2005: Glossary).
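The retrieval step of case-based reasoning, including the explanation of why a precedent is or is not similar, can be sketched as follows. This is a deliberately crude stand-in, not HYPO's mechanism: similarity is the plain overlap (Jaccard index) of factor sets, and the case names, factor names and outcomes are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Precedent:
    name: str
    factors: FrozenSet[str]
    outcome: str


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index: shared factors divided by all factors present."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(problem: FrozenSet[str], case_base: List[Precedent]) -> Precedent:
    """Return the precedent whose factors overlap most with the problem."""
    return max(case_base, key=lambda case: similarity(problem, case.factors))


def explain(problem: FrozenSet[str], case: Precedent) -> str:
    """State why the precedent is similar and where it differs."""
    shared = sorted(problem & case.factors)
    only_case = sorted(case.factors - problem)
    only_problem = sorted(problem - case.factors)
    return (f"{case.name} (decided for {case.outcome}) shares {shared}; "
            f"it additionally has {only_case}; the problem additionally has {only_problem}")


# Hypothetical case base for a trade-secrets style dispute.
case_base = [
    Precedent("Case A", frozenset({"disclosed_to_outsiders", "security_measures"}), "defendant"),
    Precedent("Case B", frozenset({"security_measures", "nondisclosure_agreement",
                                   "competitive_advantage"}), "plaintiff"),
    Precedent("Case C", frozenset({"reverse_engineerable"}), "defendant"),
]
problem = frozenset({"security_measures", "nondisclosure_agreement"})

best = retrieve(problem, case_base)
print(explain(problem, best))
```

The differences reported by `explain` are where an opponent would try to distinguish the precedent, which is the adversarial use of precedents that systems of the HYPO family model in far greater depth.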

7.2 Abductive reasoning

Abductive inference (as opposed to deductive inference) was first theorized by Charles Peirce (Nissan 2012: Sec. 2.2.1.6). ‘Abduction, or inference to the best explanation, is a form of inference that goes from data describing something to a hypothesis that best explains or accounts for the data. Thus abduction is a kind of theory-forming or interpretive inference’ (Josephson and Josephson 1994: 5), with this pattern (ibid.):

D is a collection of data (facts, observations, givens).

H explains D (would, if true, explain D).

No other hypothesis can explain D as well as H does.

Therefore, H is probably true.
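The pattern can be sketched in a few lines, under a strong simplifying assumption: 'explains D as well as' is reduced to the fraction of the data a hypothesis accounts for, whereas real abductive engines also weigh plausibility, simplicity and consistency. The function name, the data and the hypotheses are hypothetical.

```python
def best_explanation(data, explains):
    """Rank hypotheses by the share of the data they account for.

    Returns the best hypothesis, its margin over the runner-up, and all scores.
    A small margin is a warning that the conclusion is weakly supported.
    """
    scores = {h: len(data & covered) / len(data) for h, covered in explains.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
    return best, best_score - runner_up_score, scores


# D: a collection of data (hypothetical observations).
D = {"broken window", "missing jewels", "footprints in garden"}

# Which observations each hypothesis H would, if true, explain.
explains = {
    "burglary": {"broken window", "missing jewels", "footprints in garden"},
    "insurance fraud": {"broken window", "missing jewels"},
    "storm damage": {"broken window"},
}

hypothesis, margin, scores = best_explanation(D, explains)
print(f"{hypothesis} is probably true (margin over runner-up: {margin:.2f})")
```

Returning the margin alongside the winner reflects the third line of the pattern: the conclusion is warranted only to the extent that no rival explains the data as well.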

Important applications to criminal evidence are ECHO and PEIRCE-IGTT. ALIAS is a particular multiagent architecture, with abductive logic-based agents. It was applied to the modelling of reasoning on the evidence in a criminal case, in Ciampolini and Torroni (2004), using LAILA, a language for abductive logic agents (Nissan 2012: Sec. 2.2.1.5).

8 Legal evidence

8.1 History of evidence research in AI & Law

Notwithstanding a few seminal precursors from the late 1980s, namely ECHO (Thagard 1989) and my own ALIBI (Kuflik et al. 1989), it is only with the new century that the modelling of reasoning on legal evidence has emerged as a significant area within the well-established field of AI & Law (active since the 1970s). What previously existed were models by statisticians of criminal evidence. Nissan (2012) is a two-volume book about AI for legal evidence.

It took a systematic, organic effort in order to promote the new subdomain of modelling the reasoning on evidence within AI & Law. This was mainly done through several editorial initiatives, as well as workshops, of the present writer and of others (Martino and Nissan 2001; Nissan and Martino 2001, 2003, 2004; and MacCrimmon and Tillers 2002), and this in turn involved spurring scholars from disparate disciplinary quarters to develop some piece of research to specification, and then to have referees from different specialties evaluate the resulting papers again and again. Another journal special issue on AI & Law, but one in which only some of the papers are on evidence, is Peterson et al. (2001). From 2001, Henry Prakken also began to publish about models of evidence (Prakken 2001, 2004; Prakken and Renooij 2001; Prakken et al. 2003). Kaptein et al. (2009) have published the paper-collection Legal Evidence and Proof: Statistics, Stories, Logic. Martino and Nissan (1998) is a journal special issue about modelling legal time.

8.2 ECHO

ECHO is a computer tool, based on artificial neural networks, for abductive reasoning, developed by Paul Thagard and first applied to the modelling of reasoning on the evidence in a criminal case in Thagard (1989), who then applied it to other trials as well. See Nissan (2012: Sec. 2.2.1). PEIRCE-IGTT is a piece of software: an abductive inference engine from artificial intelligence, developed by a team led by John Josephson (Josephson and Josephson 1994). One of its applications was to the modelling of reasoning on the evidence in a criminal case, re-engineering ECHO’s original application (Nissan 2012: Sec. 2.2.1.5). Whereas ALIBI’s and ECHO’s respective immediate purposes were theoretical, nevertheless a tool such as ECHO could be conceivably useful for lawyers or the prosecution using it while preparing a trial, in order to carry out a simulation of jury behaviour, based on the provisional state of the evidence and of the arguments intended or expected to be used.

The input for an ECHO simulation of a trial consists of sets of simple propositions, which include items of evidence, prosecution hypotheses and defence hypotheses. Some other possible statements are identified as contradictions. Some other statements are instances of an explains function, followed by its parameter instances. Such statements include prosecution explanations, defence explanations and motives. Moreover, data are declared within the input code, these data being testimonies uttered by the witnesses, thus being observed by everyone inside the courtroom. ECHO was first applied in order to model the reasoning of a jury on a murder case against California Highway Patrol officer Craig Peyer, who was tried in San Diego for the murder of Cara Knott on 27 December 1986. The trial ended on 27 February 1988, in a hung jury. Another trial modelled with ECHO was that of Claus von Bülow, for the episode in December 1980, when his wealthy wife, Martha von Bülow (nicknamed Sunny), lapsed into a coma (Thagard 2004; Nissan 2012: Sects. 2.2.1.7, 2.2.1.8).

Thagard’s theory of explanatory coherence consists of these principles: symmetry (explanatory coherence is a symmetric relation, unlike, say, conditional probability); explanation (a hypothesis coheres with what it explains, i.e. evidence or another hypothesis; hypotheses that together explain some other proposition cohere with each other; and the more hypotheses it takes to explain something, the lower the degree of coherence); analogy (similar hypotheses that explain similar pieces of evidence cohere); data priority (propositions that describe the results of observations have a degree of acceptability on their own); contradiction (contradictory propositions are incoherent with each other); competition (if P and Q both explain a proposition, and if P and Q are not explanatorily connected, then P and Q are incoherent with each other; P and Q are explanatorily connected if one explains the other or if together they explain something); and acceptance (the acceptability of a proposition in a system of propositions depends on its coherence with them). There are various algorithmic solutions available (alternative to each other) that, along with those seven principles, can compute acceptance and rejection of propositions, on the basis of coherence relations.
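One of the algorithmic solutions alluded to is a connectionist one: propositions become units, coherence becomes a symmetric excitatory link, incoherence an inhibitory one, and activations are updated until the network settles. The sketch below is a minimal illustration in that spirit, not ECHO itself. The class name, the parameter values and the propositions are assumptions, and only the principles of symmetry, explanation (including the penalty for needing many cohypotheses), data priority, contradiction and acceptance are covered; analogy and competition are omitted.

```python
from collections import defaultdict
from itertools import combinations

# Assumed parameter values, chosen only for illustration.
EXCITATION = 0.04       # weight of a link between cohering propositions
INHIBITION = -0.06      # weight of a link between contradictory propositions
DATA_EXCITATION = 0.05  # constant input to evidence units (data priority)
DECAY = 0.05


class CoherenceNetwork:
    def __init__(self):
        self.links = defaultdict(lambda: defaultdict(float))
        self.evidence = set()

    def _link(self, a, b, weight):
        # Symmetry: the same weight holds in both directions.
        self.links[a][b] += weight
        self.links[b][a] += weight

    def add_evidence(self, proposition):
        self.evidence.add(proposition)
        self.links[proposition]  # registers the unit even if still unlinked

    def explains(self, hypotheses, explained):
        # The more hypotheses it takes, the weaker each coherence link.
        weight = EXCITATION / len(hypotheses)
        for h in hypotheses:
            self._link(h, explained, weight)
        for h1, h2 in combinations(hypotheses, 2):
            self._link(h1, h2, weight)

    def contradicts(self, a, b):
        self._link(a, b, INHIBITION)

    def settle(self, steps=200):
        """Synchronously update activations in [-1, 1] for a fixed number of steps."""
        act = {unit: 0.01 for unit in self.links}
        for _ in range(steps):
            new = {}
            for unit, neighbours in self.links.items():
                net = sum(w * act[other] for other, w in neighbours.items())
                if unit in self.evidence:
                    net += DATA_EXCITATION
                old = act[unit]
                change = net * (1.0 - old) if net > 0 else net * (old + 1.0)
                new[unit] = max(-1.0, min(1.0, old * (1.0 - DECAY) + change))
            act = new
        return act


net = CoherenceNetwork()
for item in ("E1", "E2", "E3"):
    net.add_evidence(item)
    net.explains(["H_prosecution"], item)
net.explains(["H_defence"], "E1")
net.contradicts("H_prosecution", "H_defence")

activation = net.settle()
accepted = max(("H_prosecution", "H_defence"), key=activation.get)
print(accepted, {k: round(v, 2) for k, v in activation.items()})
```

In this toy input the prosecution hypothesis explains all three items of evidence and the defence hypothesis only one, so the former settles at the higher activation. Acceptance here is simply the comparison of final activations.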

8.3 ALIBI

ALIBI is a computer system developed by myself (Nissan) and my students in various prototypes as early as Kuflik et al. (1989) and Fakher-Eldeen et al. (1993). In an AI blue-sky research perspective, it is a ‘planner’ which impersonates somebody accused on the basis of observed actions, who denies mens rea and produces alternative explanations, and seeks exoneration or a lesser liability (ranking those pretexts accordingly). Nissan (2012: Sec. 2.2.2) explains ALIBI in detail.

ALIBI decomposes the actions involved hierarchically and separates the actions themselves from deontic (i.e. moral or legal) connotations: this way, e.g. stealing is interpreted as taking in given circumstances (somebody else’s property, while the agent is unseen): stealing is deontically bad, whereas taking is not necessarily reproachable. Reasoning on effects is also carried out. An alternative plan is composed hierarchically, which claims exoneration or, in a version called ALIBI 3, pleads to a lesser computed liability.

Processing in ALIBI has the program recursively decompose the actions in the input hierarchically, i.e. into a tree of actions, down to elementary, atomic actions. (A tree is such a graph, that any two nodes are connected by exactly one path.) Moreover, actions are stripped of their deontic (i.e. moral or legal) connotation. For example, ‘stealing’ is interpreted as ‘taking’ in given circumstances (somebody else’s property, while the agent is unseen): stealing is deontically bad, whereas taking is not necessarily reproachable. ALIBI strips ‘stealing’ down to ‘taking’, and it is up to ALIBI to concoct such a plan where that act of taking fits in a way that is legitimate for the accused. Generating the justification corresponds to a reconstitution of actions into a different tree. Then, the terminal actions in the decomposition tree are differently reconstituted into alternative explanations (i.e. technically in AI, different ‘plans’) that are more innocent, in that they are such that liability is eliminated or minimized.
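The first two steps described here, recursive decomposition into atomic actions and the stripping of deontic connotations, can be sketched as follows. This is only an illustration of the idea and not ALIBI's actual representation: the two tables, their entries and the function names are hypothetical, and the reconstitution of the atomic actions into an exonerating plan is not attempted.

```python
# Hypothetical decomposition table: composite action -> ordered sub-actions.
DECOMPOSITION = {
    "rob_jeweller": ["break_in", "steal_valuables", "run_away"],
    "break_in": ["break_glass", "get_inside"],
}

# Hypothetical table of deontically loaded actions:
# loaded action -> (neutral action, circumstances that make it reproachable).
NEUTRAL = {
    "steal_valuables": ("take_valuables",
                        ["property of somebody else", "agent is unseen"]),
}


def atomic_actions(action):
    """Recursively flatten an action into the leaves of its decomposition tree."""
    children = DECOMPOSITION.get(action)
    if children is None:
        return [action]
    leaves = []
    for child in children:
        leaves.extend(atomic_actions(child))
    return leaves


def strip_deontic(action):
    """Replace a loaded action by its neutral counterpart, if it has one."""
    neutral, _circumstances = NEUTRAL.get(action, (action, []))
    return neutral


accusation = "rob_jeweller"
neutral_actions = [strip_deontic(a) for a in atomic_actions(accusation)]
print(neutral_actions)
```

The neutral list is the raw material from which an alternative tree would then be built, with each act of taking, entering or leaving given a purpose that is legitimate for the accused.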

In some instances, the explanation or excuse the suspect caught red-handed would make is hilariously meagre in evidentiary terms. That, however, happened because the given situation was desperate, and ALIBI, while impersonating the suspect, was trying so hard (too hard) to explain away the ascertained narrative elements by denying mens rea, that the emplotment, however sensibly contrived, was unconvincing. For example, in one session the input accusation states that the accused broke the glass of a jeweller’s display window, got inside, shot and wounded the jeweller, and then ran away carrying valuables with him. ALIBI states that ‘he’ broke the glass accidentally, and that ‘he’ got inside in order to leave a note with his coordinates. (If the accusation states that the accused opened, e.g. a drawer where money was held, or even the cash-register, then a possible pretext could be that he was looking for a pencil and paper, so he could leave his name and address.) He was carrying a weapon in legitimate circumstances. He shot the jeweller accidentally. Or then, he may play the hero, and claim that he heard voices from which he reckoned that the jeweller was in danger, and then he got inside in order to help. Once the jeweller was wounded, the accused ran away in order to seek medical help, as he could not provide it himself. He took away the valuables in order to return them, because the wounded jeweller could not guard his own property (in fact, incapacitation is an effect of being wounded); the accused, while going out to seek medical help, would guard the goods in the meanwhile, on the owner’s behalf.

8.4 Formal inferential techniques in models of reasoning on criminal evidence

Independent Choice Logic (ICL) is a formalism which Poole (2002) applied to legal argumentation about evidence. The formalism can be viewed as a ‘first-grade representation of Bayesian belief networks with conditional probability tables represented as first-order rules, or as a [sic] abductive/argument-based logic with probabilities over assumables’ (ibid.: 385). See Nissan (2012: Sec. 2.5.1).

Dynamic uncertain inference was used by Snow and Belis (2002), who analysed ‘a celebrated French murder investigation’ (ibid.: 397), namely the case in which Omar Raddad was convicted in Nice, in 1994, and then pardoned, the conviction being very controversial (a sentence accusing Raddad was found written on the floor near the victim’s body, scrawled in the victim’s blood). Snow and Belis (2002) ‘apply ideas about credibility judgments structured by graphs to the problem of dynamic uncertain inference. By dynamic, we mean that assessments of credibility change over time without foreknowledge as to the types of evidence that might be seen or the arguments that the [crime] analyst might entertain over time’ (ibid.: 397), in contrast with such ‘kind of belief change that occurs’ when the possible outcomes of experiments ‘are typically known before one learns the actual outcomes’ (ibid.: 397–398).

In psychology, attribution is how people (and computational cognitive models) reason about their own beliefs and the ones they ascribe to others. In AI, this area is called agents’ beliefs. ATT-Meta is a system for simulative reasoning by cognitive agents about each other, which handles agents’ beliefs within a formal approach to uncertain reasoning about them. Barnden (2001) applied ATT-Meta to reasoning about legal evidence.

AVERs is the visualization component of the architecture of a sense-making software tool for crime investigation, as envisaged by Bex et al. (2007). AVERs was ‘implemented as a web front-end to an SQL database. A case can be represented visually through multiple views; in this paper we will focus on the two graphical views, that is, the evidence view and the story view’ (ibid.: Sec. 6). Ideally, they wanted to design a more sophisticated tool than such investigative analysis software. Their approach to the story of the prosecution and the defence is qualitative, and does not resort to probabilistic quantification. It fits within logical and computer science research into argumentation, but this is combined with reasoning about stories and evidence (Nissan 2012: Sec. 5.4). In his book, Bex (2011) combines in one model arguments, stories and evidence. What I especially appreciate in the paper ‘Did he jump or was he pushed? Abductive practical reasoning’ (Bex et al. 2009) is the neat mathematical model for ranking the plausibility of competing scenarios without resorting to probabilities.

8.5 The Bayesian controversy among legal scholars

The Bayesian debate is a controversy among (especially USA) legal scholars, concerning legal evidence and the use of statistics, and in particular of Bayes’ theorem. See a discussion in a journal special issue (Allen and Redmayne 1997, discussed in Nissan 2001b), and in Nissan (2012: Sects. 2.3, 5.1). Statistics in DNA evidence are also subject to controversy (ibid.: Sec. 8.7.2.2). The Bayesian debate in legal scholarship is, I have argued, central to how acceptable AI evidential reasoning can be expected to be in judicial matters.

Relative plausibility of narratives has been argued for by a leading Bayesio-skeptic, Ron Allen. See Allen (1991, 2008a, b); Allen and Pardo (2008). I quote from Allen (1994: 606):

The distinction between the structure of proof and a theory of evidence is simple. The structure of proof determines what must be proven. In the conventional [probabilistic] theory [which Allen attacks] this is elements to a predetermined probability, and in the relative plausibility theory [which Ron Allen approves of] that one story or set of stories is more plausible than its competitors (and in criminal cases that there is no plausible competitor). A theory of evidence indicates how this is done, what counts as evidence and perhaps how it is processed.

Allen and Pardo (2007: 109) found that scholarship which applies probability theory to juridical proof

suffers from a deep conceptual problem that makes ambiguous the lessons that can be drawn from it—the problem of reference classes. The implications of this problem are considerable. To illustrate the problem, consider the famous blue bus hypothetical. Suppose a witness saw a bus strike a car but cannot recall the color of the bus; assume further that the Blue Company owns 75 percent of the buses in the town and the Red Company owns the remaining 25 percent. The most prevalent view in the legal literature of the probative value of the witness’s report is that it would be determined by the ratio of the Blue Company buses to Red Company buses, whether this is thought of as or plays the role of a likelihood ratio or determines information gain (including an assessment of a prior probability) […] But suppose the Red Company owns 75 percent (and Blue the other 25 percent) of the buses in the county. Now the ratio reverses. And it would do so again if Blue owned 75 percent in the state. Or in the opposite direction: it would reverse if Red owned 75 percent running in the street where the accident occurred (or on that side of the street) and so on. Or maybe the proper reference class has to do with safety standards and protocols for reporting accidents. Each of the reference classes leads to a different inference about which company is more likely liable, and nothing determines the correct class, save one: the very event under discussion, which has a likelihood of one and which we are trying to discover.

‘The blue bus hypothetical […] exemplifies the general implications of reference classes, and those implications would hold for practically any attempt to quantify a priori the probative value of evidence’ (ibid.: 113).
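The arithmetic behind the hypothetical is short enough to spell out. The sketch below simply takes the ownership shares stated in the quotation for the town and the county and computes the Blue-to-Red ratio for each; the function name is hypothetical, and exact fractions are used so that no rounding obscures the reversal.

```python
from fractions import Fraction


def blue_to_red_ratio(blue_share: Fraction) -> Fraction:
    """Ratio of Blue Company buses to Red Company buses in a reference class."""
    return blue_share / (1 - blue_share)


# Blue Company's share of the buses, per reference class, as in the quotation.
reference_classes = {
    "town": Fraction(3, 4),    # Blue owns 75 percent in the town
    "county": Fraction(1, 4),  # Red owns 75 percent in the county
}

for name, share in reference_classes.items():
    ratio = blue_to_red_ratio(share)
    favoured = "Blue" if ratio > 1 else "Red"
    print(f"{name}: ratio {ratio} -> points to {favoured} Company")
```

The same accident and the same witness report yield a ratio of 3 in one class and 1/3 in the other. Nothing in the computation says which class to use, which is precisely the objection.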

8.6 Probabilistic tools and using them

Using a sophisticated tool would not by itself make your arguments unassailable. HUGIN is a piece of Belief Net software, using which Fenton and Neil (2000) carried out calculations using Bayesian networks in order to present probabilistic legal arguments, in support of the so-called Jury Observation Fallacy and against the use of knowledge of prior convictions of a criminal suspect. Adrian Bowyer summarized this stance in a letter published in the latest issue of June 2001 of the London Review of Books (LRB), a letter immediately signalled as follows in an e-list posting by Mike Redmayne:

According to [the Jury Observation Fallacy], if a jury finds someone not guilty on the evidence presented in court—in other words, without taking previous convictions into account—the fact that this defendant has previous convictions for similar crimes usually makes it more, not less, probable that he or she is indeed innocent of this particular crime. This is because, when a crime is committed, the police quite reasonably go out and feel the collars of those with previous convictions for similar crimes. They therefore tend to fish in a highly non-representative pool, rather than picking suspects from the general population. This tips the probabilities in the defendant’s favour to an extent that is not outweighed by the likelihood of a certain fraction of past offenders becoming recidivists. If the defendant is considered innocent on the facts of the case, then his past convictions should be seen as evidence not so much of guilt as of the failures of police procedure.

Mike Redmayne (a legal scholar of the London School of Economics, quite at home with probabilistic modelling) was unconvinced by some of the assumptions made. In a posting discussing Fenton and Neil (2000) at an e-list (then at bayesian-evidence@vuw.ac.nz), he claimed:

Your conclusion is sensitive to the probability that a defendant will be charged given a previous conviction and no hard evidence. If the probability is less than 1 in 200, the fallacy disappears. One point about this is that there are further screening stages between charge and trial, and even between trial and acquittal (the judge can be asked to certify that there is ‘a case to answer’). It would be very difficult for a case to get to the jury when (more or less) the only evidence against a defendant is that he has previous convictions for crimes similar to the one with which he’s now charged. If there is other evidence against the defendant, surely that affects your conclusion, because it increases the probability of guilt? […]

PATER is a software system for probabilistic computations for testing paternity claims (Egeland et al. 1997).

8.7 Augmented reality and a crime scene

Chris Baber (2010) discussed a conceptualization of crime scene examination, in terms of distributed cognition. Prakken et al. (2003) discussed appropriate argument structures for reasoning about evidence in relation to hypothesizing crime scenarios. Aron (2012) reported on an application to crime scene investigation of augmented reality (a variant of virtual reality), developed at Delft University of Technology in the Netherlands by Oytun Akman and colleagues. Virtual tagging of objects in the crime scene is what generates the augmented reality: objects in a crime scene are tagged virtually by the local police, and a clean record of the scene is built in three dimensions (3D) before evidence is removed for processing. The crime scene in augmented reality, that is to say, showing also the tags, can be viewed by crime scene investigators by wearing a pair of augmented reality glasses. It was expected that such evidence could be admissible in court. Testing was in a mock crime scene. The crime scene is filmed by a person carrying a laptop inside a backpack and wearing a head-mounted display that receives 3D video from a pair of attached cameras controlled by the laptop. Augmented reality is generated because the wearers of the head-mounted display see their surroundings at the crime scene as normal, but they can use hand gestures in order to overlay virtual objects. Gestures of the left hand enable the wearer to access a menu and select from a variety of tools. In contrast, the wearer uses his or her right hand in order to tag objects in the crime scene. For example, the wearer may wish to tag bullet holes, or blood spatter. Moreover, using a headset, the wearer at the crime scene may contact somebody at the laboratory, and that other person in turn can view the scene as well as tag it. Testing in a real-world crime scene was planned in collaboration with the Dutch police, and an application to the training of new crime scene investigators was considered.

8.8 Crime scenario modelling: the dead bodies project

The Dead Bodies Project (Keppens and Zeleznikow 2002, 2003; Keppens and Schafer 2003a, b, 2004, cf. 2005, 2006) was intended to help at inquests aiming at ascertaining the causes of death, when prima facie a crime cannot be ruled out (Nissan 2012: Sec. 8.1). Application is in post-mortem inquests, with the goal of determining whether death occurred through natural causes, homicide or suicide. A so-called assumption-based truth maintenance system, or ATMS (a well-known AI approach to consistency), is resorted to, in order to maintain a space of ‘possible worlds’ which correspond to hypothetical scenarios.

An ATMS is a mechanism that enables a problem solver to make inferences under different hypothetical conditions, by maintaining the assumptions on which each piece of information and each inference depends (de Kleer 1986, 1988). The goal of computation with an ATMS is to find minimal sets of premises sufficient for the support of each node. One has to find all minimally inconsistent subsets (NOGOODs) and to find all maximally consistent subsets (GOODs).
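The two kinds of subsets can be illustrated with a brute-force enumeration in Python. This is only a sketch of the definitions, not of how an ATMS computes them (a real ATMS maintains labels incrementally rather than enumerating every subset); the assumption names and the list of conflicting pairs are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

def nogoods_and_goods(assumptions, is_consistent):
    """Return (minimally inconsistent subsets, maximally consistent subsets)."""
    subsets = [frozenset(c)
               for r in range(len(assumptions) + 1)
               for c in combinations(assumptions, r)]
    consistent = [s for s in subsets if is_consistent(s)]
    inconsistent = [s for s in subsets if not is_consistent(s)]
    # A nogood is inconsistent, and none of its proper subsets is inconsistent.
    nogoods = [s for s in inconsistent if not any(t < s for t in inconsistent)]
    # A good is consistent, and none of its proper supersets is consistent.
    goods = [s for s in consistent if not any(s < t for t in consistent)]
    return nogoods, goods

# Hypothetical assumptions for a post-mortem inquest (illustrative only).
assumptions = ["natural_causes", "homicide", "suicide", "victim_alone"]

# Hypothetical pairs of assumptions that cannot hold in the same possible world.
conflicts = [
    {"natural_causes", "homicide"},
    {"natural_causes", "suicide"},
    {"homicide", "suicide"},
    {"homicide", "victim_alone"},
]

def is_consistent(subset):
    """A subset is consistent if it contains no conflicting pair."""
    return not any(pair <= subset for pair in conflicts)

nogoods, goods = nogoods_and_goods(assumptions, is_consistent)
print("nogoods:", [sorted(s) for s in nogoods])
print("goods:", [sorted(s) for s in goods])
```

With these made-up conflicts, the nogoods are exactly the four conflicting pairs, and each good corresponds to one maximal hypothetical scenario that can be entertained without contradiction.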

The Dead Bodies project resorts to neither conventional expert systems nor case-based reasoning. Any case is potentially unique. Crime investigation is very difficult to proceduralize. The design solution adopted for this project was to develop a model-based reasoning system, i.e. such a system that given a problem instance, a model of the problem is constructed, and a problem-independent technique is applied. In the same project, dynamic preference orderings are assigned to uncertain events. Default orderings may be overruled by inferred orderings.

Keppens and Schafer (2006) described the prototype of a decision-support system for crime scenario construction. It is component events, rather than entire scenarios, that are stored. (By scenario, a description of a combination of events and situations is meant.) The component events are composed into useful scenarios by an algorithm. The input is a description of the available evidence. A network of plausible scenarios is then generated. Those scenarios in turn can be analysed, with the goal of devising effective evidence collection strategies. The algorithm was allegedly highly adaptable to unanticipated cases, by allowing a major crime being investigated to be matched by component events in several different ways. One advantage hoped for was the avoidance of such pitfalls of human reasoning as premature case theories, or rather premature convergence, such that police investigators tend to focus on the more likely suspects they had identified early on.

That project belongs to a category of software tools known as compositional modellers and introduced by Falkenhainer and Forbus (1991). In compositional modellers, small, generic and reusable rules called model fragments capture a domain’s first principles. Another category to which the system described by Keppens and Schafer (2006) belongs is abductive diagnosers. In abductive diagnosis (Console and Torasso 1991), the condition of a physical system under investigation is determined by comparing the observations predicted by models with the observations extracted from the real world. The generation of models, in an abductive diagnoser, is done by resorting to a knowledge base of first principles about the given domain. First principles are general rules, independent from the decision procedure, and in this they differ from the heuristic rules (i.e. rules of thumb) found in rule-based expert systems.

Keppens and Schafer (2006) stated that their

abductive, first-principles based approach recognises that while the individual scenarios encountered in a major crime investigation may be virtually unique and vary widely, the underlying domain knowledge on evidence and the types of events that create it are not. It also encourages a principled hypothetico-deductive investigative methodology because it hypothes[izes] all (known) possible causes of the available evidence, composes these causes into plausible scenarios and deduces additional evidence from the plausible scenario. This promotes consideration of many scenarios, instead of individual ones, in deciding on future investigative actions. Finally, the approach also allows making expert domain knowledge available to less experienced investigators.

Keppens and Schafer (2006: Sec. 4.2) classify information by distinguishing facts (‘pieces of inexplicable, certain information’) from evidence (‘information that is certain and explicable’), by distinguishing three kinds of ‘uncertain and explicable’ information (uncertain states, uncertain events and hypotheses) and by distinguishing three types of assumptions, i.e. of ‘uncertain and inexplicable information’: default assumptions (describing information that is normally presumed to be true); conjectures (the unknown causes of certain feasible scenarios); and uncommitted investigative actions (possible but not yet performed activities aimed at collecting additional evidence). Keppens et al. (2005) described an extension of the scenario space generation, resorting to Bayesian networks, entropy and returned evidence collection strategies (Nissan 2012: Sec. 8.1.3).

The approach to crime scenario generation associated with Jeroen Keppens and Qiang Shen (as described in Keppens’ publications, or in Shen et al. 2006) is known as compositional modelling (CM) or intelligent CM-based approach. A team from Wales pointed out limitations of some existing knowledge-based approaches to the generation of plausible crime scenarios and indicated challenges arising for problems such that the degree of precision of available intelligence data can vary greatly (Fu et al. 2010). They tried to tackle the problem of there being unknown referents (whether a person or an object). They proposed a fuzzy compositional modeller that represents, reasons about and propagates inexact information for supporting the automated generation of crime scenarios. ‘Further, the article offers a link-based approach to identifying potential duplicated referents within the generated scenarios’ (ibid., p. 253). Their paper provided an illustration by means of an example of discovering unforeseen crime scenarios.

8.9 Virtual autopsies

There exist computer tools that resort to 3D graphic models for a ‘virtual autopsy’ and for presenting the pathologists’ findings in court. This may be more useful than exhibiting body parts in court. Virtopsy is a computational technique developed by a team in Bern, Switzerland, for carrying out ‘a virtual autopsy’: information acquired through post-mortem imaging prior to autopsy is often used to plan the autopsy, confirm autopsy findings and allow for a second look if further questions arise during the forensic investigation (Ebert et al. 2012).

9 Supporting tools for fighting crime

9.1 Data mining

Network link analysis arose in human factors research, originally in order to determine the layout of machine shops in American industry during the First World War (Gilbreth and Gilbreth 1917). Link analysis is currently supported by computer tools. One of its applications is to crime investigation, and it is conducted by intelligence analysts. Its aim is to discover crime networks, to identify the associates of a suspect, to track financial transactions (possibly by data mining), to detect geographical patterns (possibly by kind of crime) and so forth. ‘Link analysis is the graphic portrayal of investigative data, done in a manner to facilitate the understanding of large amounts of data and particularly to allow investigators to develop possible relationships between individuals that otherwise would be hidden by the mass of data obtained’ (Coady 1985).

Relationships between items in very large databases are represented by association rules, discovered by means of data mining (e.g. finding that 80 % of customers who bought the books X and Y also bought a book on subject Z). Something similar may be interesting when tracking, say, unlawful transactions. ‘The problem of mining association rules can be stated simply as follows: Given predefined values for minimum support and minimum confidence, find all association rules which hold with more than minimum support and minimum confidence’ (Chan et al. 2001: 278, citing Agrawal and Srikant 1994 for the definition of confidence and support).
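Support and confidence can be computed directly from their usual definitions, as in the following Python sketch. The ten transactions are invented so that the rule "X and Y imply Z" comes out with a confidence of 80 %, echoing the book-buying example; the function names and the thresholds are illustrative assumptions, and "more than minimum" is treated here as "at least".

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def rule_holds(transactions, antecedent, consequent, min_support, min_confidence):
    """Check one candidate rule against the two user-given thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)

# Invented purchase records: each set lists the items bought by one customer.
transactions = [
    {"X", "Y", "Z"}, {"X", "Y", "Z"}, {"X", "Y", "Z"}, {"X", "Y", "Z"},
    {"X", "Y"}, {"X"}, {"Y", "Z"}, {"Z"}, {"W"}, {"W", "X"},
]

print(support(transactions, {"X", "Y", "Z"}))        # share of customers buying all three
print(confidence(transactions, {"X", "Y"}, {"Z"}))   # share of X-and-Y buyers who also buy Z
print(rule_holds(transactions, {"X", "Y"}, {"Z"}, 0.3, 0.7))
```

A mining algorithm such as the one cited would search the space of candidate rules efficiently; the sketch only evaluates a rule that is already given.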

9.2 Link analysis and geography

Brandes et al. (2001: Sec. 1) stressed the importance of visualization when researching social networks (Aggarwal 2011), advocated the visualization not only of connections within a social network, but also of quantitative data, and pointed out that only a minority of social networks can be mapped onto a geographical representation.

At the Air Force Institute of Technology in Ohio, Hamill’s (2006) Analysis of Layered Social Networks is concerned with prevention of near-term terrorist attacks.

Whereas link analysis is not to be confused with social network analysis (the two followed, in the respective research literature, different historical trajectories), they have come together: link analysis can benefit from social network analysis, borrowing from the latter, and applying, this or that formal device. Users watching on the screen the results returned by link analysis tools will see those results, not the mathematics of the underlying concepts from social network analysis.

Link analysis is an interactive technique, visualizing (in charts or maps or diagrams) networks of entity-to-event associations (e.g. tying a victim to a crime), as well as entity-to-entity (e.g. blood relative, or spouse, or place of birth, or owner of a firm), and event-to-event (e.g. tying emails to each other). ‘Link analysis is the process of building up networks of interconnected objects or items over time and the use of special techniques and software tools for forming, examining, modifying, analysing, searching, and displaying these patterns of behaviour, especially for the investigative data miner’ (Mena 2003: 80). ‘Knowing and working at the right level of granularity is a very important part of preparing the data for link analysis’ (ibid.: p. 78). Link analysis is specialized per application. For example, call detail records (who is calling whom, and when) are analysed (this is called dialed-digit analysis) in order to fight wireless fraud, an application for which link analysis is also used in different manners (ibid.: pp. 78–79). Another application is for counter-drug analysts (ibid.: pp. 80–82).

Link analysis is not suited for handling a mass of data based on aggregate statistical characteristics, something for which neural networks and machine learning are suited instead. Besides, the latter two approaches construct predictive models: networks, for them, are a model representation, whereas for link analysis instead, networks are a data representation (ibid.: p. 88). It is possible, however, for a tool to combine link analysis with neural networks. Neural networks are resorted to, in order to detect trends hidden in data, by the Trend Hunter utility of the ATAC criminal analysis tool (http://www.bairsoftware.com/atac.htm). Mena (2003: Sec. 3.11, pp. 88–104) enumerated and described various link analysis tools and discussed at length the popular Analyst’s Notebook.

Crime Workbench is an intelligence management software product for criminal and fraud investigation (http://www.memex.com/cwbover.html); there is a scaled down version, Crime Workbench Web, accessible from everywhere, and ‘aimed at the intelligence analyst and law enforcement investigator on the move’ (Mena 2003: p. 100). Daisy is a link analysis tool supporting a circular layout of nodes: these are connected by lines inside the circle and are possibly surmounted by histograms outside the circle (http://www.daisy.co.uk/daisy.html). By contrast, the main layout of displays generated by NETMAP (a link analysis tool used by several government agencies in the USA) is a wagonwheel format, while also supporting other layouts (http://www.altaanalytics/com/). Crime Link generates a triangular table, an association matrix enabling one to see who knows whom (http://www.crimelink.com/). The ORIONInvestigations criminal data organizer can be integrated with the ORIONLink link analysis tool (http://www.oriosci.com/productinfo/Magic.html). ‘A special feature of ORIONLink is its what-if mode, which allows objects and their connections to be hidden or restored on the fly, allowing for the viewing of their impact on the total organization, such as a terrorist cell or criminal gang’ (Mena 2003: 103).
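The idea of a triangular association matrix can be illustrated with a short Python sketch. It is not meant to reproduce any of the products named above: the persons, the links and the function name are hypothetical. Because "knows" is treated as symmetric, only the lower triangle needs to be stored.

```python
def association_matrix(people, links):
    """Lower-triangular table: row i has one cell for each person listed before person i."""
    known = {frozenset(link) for link in links}
    rows = []
    for i, person in enumerate(people):
        # 1 if the two persons are linked, 0 otherwise.
        rows.append([1 if frozenset((person, other)) in known else 0
                     for other in people[:i]])
    return rows

# Hypothetical persons of interest and known associations between them.
people = ["A", "B", "C", "D"]
links = [("A", "B"), ("B", "C"), ("A", "D")]

matrix = association_matrix(people, links)
for person, row in zip(people, matrix):
    print(person, row)
```

Each printed row answers, for one person, which of the previously listed persons he or she is known to be associated with.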

There exist geographical information systems for mapping crimes (Nissan 2012: Sects. 6.1.4, 6.1.6.2). Boba (2005) is a book on crime mapping for crime analysis. But when talking about tools used by the police, we are typically talking about link analysis tools (ibid.: Sec. 6.1.4 and Ch. 7). FLINTS is a software tool for criminal intelligence analysis. It performs network link analysis; its developer, Richard Leary (2012), originally applied it in the West Midlands Police. COPLINK is a tool for criminal intelligence analysis, developed for the Tucson police at the University of Arizona, and performing network link analysis (Nissan 2012: Sec. 6.2.5). For visualization, COPLINK resorts to a hyperbolic tree view and a hierarchical list view (Xiang et al. 2005; Schroeder et al. 2007; Xu and Chen 2004). Chen et al. (2004) used a concept-space approach, in order to extract criminal relations from the incident summaries and create a likely network of suspects.

9.3 Tools for fighting fraud

Effective computer tools for fighting fraud resort to data mining and link analysis. For example, there are applications to fraud at auctions. FADE (Fraud and Abuse Detection Engine) is a data mining system developed by the online auction site eBay in order to detect fraud perpetrators at its site (Mena 2003: 254). IFS (Internet Fraud Screen) is a data mining tool giving credit-card fraud alerts, developed by CyberSource for Visa USA for matching fraud transactions. IFS ‘uses a combination of rule-based modeling and neural-network modeling techniques’ (ibid.: 271). IFS’s ‘profile scores look at more than a dozen different information items, including the customer’s local time and the risk associated with the customer’s e-mail host. CyberSource also provides e-retailers with an IFS report that includes risk profile codes, address verification systems (AVS) codes, and other relevant information to help e-merchants calibrate their risk thresholds and score settings. This helps the e-business subscribers to control the level of risk they want to operate under’ (ibid.).

Goldberg and Wong (1998) reported on an AI project for FinCEN (Financial Crimes Enforcement Network), an agency of the US Treasury, whose goal is to identify money laundering networks by carrying out network link analysis. Links are created in databases of financial transactions.

Fuel fraud is detected using link analysis in the Poznan ontology model. A team in Poznan, Poland, developed an adequate analytic tool, FuelFlowVis, to help with investigations and prosecutions, concerning scams involving chains of transactions made by a multitude of straw companies, and whose goal is fuel fraud (Jędrzejek et al. 2009). What is involved is a kind of crime known as a fuel laundering scam. ‘This crime mechanism is to buy rebated oil (in Poland heating oil) from a licensed distributor and then mix it (i.e. add components) and sell to the retail market as duty paid diesel’ (ibid.: p. 83). The team mainly studied three large fuel laundering cases from the 2001–2003 period that went to court in Poland in 2008.

NetProbe is a tool for detecting fraud at online auction sites (Pandit et al. 2007; Chau et al. 2006). Users and transactions were modelled as a Markov random field (MRF), tuned for the detection of suspicious patterns generated by fraudsters. A belief propagation mechanism was resorted to, in order to infer the maximum likelihood state probabilities of nodes in the MRF, given a propagation matrix and possibly a prior state assignment for some of the nodes. Each node can be in one out of three states, namely fraud, accomplice or honest, or then its state may be undetermined, in the sense that NetProbe does not assign a state to that node. NetProbe uses the propagation matrix in order to detect bipartite cores in the graph. Incremental NetProbe, a version which allows approximation, enables users to get answers to their queries in real time. This avoids wasteful recomputation from scratch of node beliefs. Incremental NetProbe incrementally updates node beliefs as small changes occur in the graph.
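How belief propagation spreads suspicion across such a graph can be sketched in Python as follows. This is a generic sum-product sketch over a pairwise MRF, not NetProbe's actual code: the numerical values of the propagation matrix, the three-node graph and all function names are assumptions made for illustration only.

```python
STATES = ("fraud", "accomplice", "honest")

# Hypothetical propagation matrix: PSI[a][b] is the compatibility of a node in
# state a with a neighbour in state b. The values are illustrative, not NetProbe's.
PSI = {
    "fraud":      {"fraud": 0.05, "accomplice": 0.90,  "honest": 0.05},
    "accomplice": {"fraud": 0.50, "accomplice": 0.10,  "honest": 0.40},
    "honest":     {"fraud": 0.05, "accomplice": 0.475, "honest": 0.475},
}

def normalise(dist):
    total = sum(dist.values())
    return {s: v / total for s, v in dist.items()}

def belief_propagation(edges, priors, iterations=20):
    """Sum-product message passing; returns a belief distribution per node."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    uniform = {s: 1.0 / len(STATES) for s in STATES}
    prior = {n: priors.get(n, uniform) for n in neighbours}
    # messages[(i, j)][s]: what node i tells node j about j being in state s.
    messages = {(i, j): dict(uniform) for i in neighbours for j in neighbours[i]}
    for _ in range(iterations):
        updated = {}
        for (i, j) in messages:
            msg = {}
            for sj in STATES:
                total = 0.0
                for si in STATES:
                    # Combine i's prior with messages from i's other neighbours.
                    incoming = 1.0
                    for k in neighbours[i]:
                        if k != j:
                            incoming *= messages[(k, i)][si]
                    total += prior[i][si] * PSI[si][sj] * incoming
                msg[sj] = total
            updated[(i, j)] = normalise(msg)
        messages = updated
    beliefs = {}
    for n in neighbours:
        belief = dict(prior[n])
        for k in neighbours[n]:
            for s in STATES:
                belief[s] *= messages[(k, n)][s]
        beliefs[n] = normalise(belief)
    return beliefs

# Hypothetical auction graph: user "f" traded with "a", and "a" traded with "h".
edges = [("f", "a"), ("a", "h")]
# Prior state assignment for one node only: "f" is a suspected fraudster.
priors = {"f": {"fraud": 0.98, "accomplice": 0.01, "honest": 0.01}}

beliefs = belief_propagation(edges, priors)
for node in sorted(beliefs):
    print(node, max(beliefs[node], key=beliefs[node].get))
```

Under these made-up numbers, the node trading directly with the suspected fraudster ends up with "accomplice" as its most probable state, which is the kind of pattern the propagation matrix is tuned to bring out.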

A team in Pisa (Basta et al. 2009) developed SNIPER, an auditing methodology, applied to an area in fiscal fraud detection, namely the detection of Value Added Tax (VAT) fraud. Such fraud can take various forms, such as underdeclaring sales, or overdeclaring purchases. Moreover, fraudulent claims are possible for credits and refunds, because ‘tax charged by a seller is available to the buyer as a credit against his liability on his own sales and, if in excess of the output tax due, refunded to him’ (ibid.: 27). The Pisa-based team aims at having a rule-based computer tool that by means of data mining, would ‘identify the taxpayers with the highest probability of being VAT defrauders, in order to support the activity of planning and performing effective fiscal audits’ (ibid.). A major constraint is the limited auditing capability of the competent revenue agency: ‘In Italy for example, audits are performed on only 0.4 % of the overall population of taxpayers who file a VAT refund request’ (ibid.).

9.4 More tools for the police

Oatley and Ewart (2003, 2011) discussed crime analysis software. Oatley et al. (2006) discussed decision-support systems for the police. In CATCH, Kohonen neural networks (i.e. self-organizing maps) ‘learn to cluster similar cases from approximately 5000 murders and 3000 sexual assaults residing in the databases’ (Kangas et al. 2003: 365), using data from the HITS (Homicide Investigation Tracking System) database system, containing data about violent crimes primarily from the US Pacific Northwest. CATCH comes in two versions, one for murders and one for sexual assaults, and is a collection of tools that also include query tools and geographical maps.

AREST (sic) is an expert system for the profiling of suspects of armed robbery (Badiru et al. 1988). Valcour (1997) described InvestigAide B&E, an expert system for the Canadian Police, whose purpose was to support the processing and investigation of breaking and entering cases, by assisting in gathering and recording case data, and providing such information as suspect characteristics. Big Floyd is a link analysis tool of the FBI (Bayse and Morris 1987), with inferential capabilities, and applying the notion of template matching for detecting the likelihood that particular types of crimes were committed.

9.5 The forensic sciences and their tools

Forensic science domains use software tools, such as the European Pollen Database, EPD (http://www.ncdc.noaa.gov/paleo/epd/epd_main.html). Forensic pollen is investigated by forensic palynologists to ascertain provenience (Nissan 2012: Sec. 8.5.5) by resorting to that database, which generally caters to botanists.

Or then consider computing in environmental forensics (ibid.: Sec. 8.5.6; Murphy and Morrison 2002), which may involve general tools for photogrammetry, photointerpretation and digital imaging (aerial photographs are involved), or mapping, or chemistry, or for access to documents of the land registry in order to investigate site history (e.g. of a landfill linked to the release of contaminants). Chapter 9 in Murphy and Morrison (2002), ‘Forensic Air Dispersion Modeling and Analysis’, features techniques such as the Gaussian plume model analysis. Numerical computing tools are of course involved, but are not specifically for forensic science. Modelling small particles suspended in the air goes by the name particulate pattern recognition (ibid.: Ch. 11), and techniques include applying neural networks (ibid.: Sec. 11.2.6, pp. 433–34).

There are tools for bloodstain pattern analysts on the scene of a crime (Akin 2012). Electronic sensing (or e-sensing) technologies aim at endowing devices with the capability of reproducing human senses using sensor arrays and pattern recognition systems. In e-sensing, machine olfaction is the capability, and an electronic nose the device, intended to detect odours or flavours. Electronic-nose technology is the subject of a fifty-page survey by Wilson and Baietto (2009). See Yinon (2003) and Pamula (2003), concerning its use for detecting explosives. For finding a corpse, police dogs are effective.

There is a vast literature in computer security and its branch, computer forensics. Polonium (Chau et al. 2010; Nissan 2012: Sec. 6.2.4) applies graph mining (a form of data mining) to malware detection.

10 Tools for identification

‘Biometrics, which refers to identifying an individual based on his or her physiological or behavioural characteristics, has the capability to reliably distinguish between an authorized person and an imposter. A biometric system can be operated in two modes: (1) verification mode and (2) identification mode (Jain et al. 2000). The former is called person verification, or person authentication. A biometric system operating in the verification mode either accepts or rejects a user’s claimed identity, while a biometric system operating in the identification mode establishes the identity of the user without any claimed identity information’ (Khuwaja 2006: 23–24). Jain et al. (1999) is a book on the subject. Bromby (2010) discussed how biometrics can aid certification of digital signatures.
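The difference between the two modes can be made concrete with a small Python sketch. The feature vectors, the user names, the distance measure and the threshold are all hypothetical; an actual biometric system would rely on modality-specific feature extraction and matching.

```python
import math

def verify(templates, claimed_id, sample, threshold):
    """Verification mode: accept or reject a claimed identity (one-to-one match)."""
    return math.dist(templates[claimed_id], sample) <= threshold

def identify(templates, sample, threshold):
    """Identification mode: no identity is claimed (one-to-many search).

    Returns the closest enrolled identity, or None if nobody is close enough.
    """
    best = min(templates, key=lambda uid: math.dist(templates[uid], sample))
    return best if math.dist(templates[best], sample) <= threshold else None

# Hypothetical enrolled templates: one stored feature vector per user.
templates = {
    "alice": (0.10, 0.90, 0.30),
    "bob": (0.80, 0.20, 0.50),
}

sample = (0.12, 0.88, 0.33)  # freshly captured features, close to alice's template

print(verify(templates, "alice", sample, threshold=0.1))
print(verify(templates, "bob", sample, threshold=0.1))
print(identify(templates, sample, threshold=0.1))
```

Verification compares the sample with a single stored template, whereas identification must search all enrolled templates, which is why the two modes differ in cost and in error behaviour.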

Nissan (2012: Sec. 8.7.3) surveyed computer techniques for fingerprint recognition, and in Sec. 8.7.2 discussed emerging problems with fingerprint or DNA evidence. Apart from fingerprint-based identification, there are techniques for person verification based on ‘face, hand geometry, iris, retina, signature, voice print, facial thermogram, hand vein, gait, ear, odor, keystroke dynamics, etc.’ (Khuwaja 2006: 24). Biometric fusion (or information fusion in biometrics: Ross and Jain 2003) is ‘[t]he general method of improving performance via collection of multiple samples’ (Rattani et al. 2008: 485). Multi-biometrics is ‘[t]he ability to utilize multiple biometrics modalities (multimodal), instances within a modality (multi-instance), and/or algorithms (multi-algorithmic) prior to making a specific verification/identification or enrollment decision’ (ibid.), where enrollment is ‘[t]he initial process of collecting biometric data from a user and then storing it in a template for later use’ (ibid.: 484).

There are projects on suspect identification by means of composites (composite images of human faces) as well as tools for reconstructing the facial likeness of victims from skeletal remains. Facial portraits, or mugs (of persons once detained), typically are not composites, but rather a photograph (mugshots), or then portraits of suspects are drawn by a sketch artist manually, based on a verbal description by a victim or eyewitness (Identi-kit procedures). An alternative to mugs and to artist’s sketches is a composite, by which initially a photographic photofit was intended. The term photofit is still in use in the USA, whereas in the UK the more general term composite is preferred. Old computerized systems for composites include E-FIT, PROfit (CD-FIT), and Mac-A-Mug Pro. EvoFIT is an advanced tool. See Nissan (2012: Sec. 8.2.2). The CRIME-VUs project under the lead of Charlie Frowd in the UK produced EvoFIT and validated it with techniques from experimental psychology. The approach combines facial composites, sketches and morphing between facial composites. MIMIC (short for Multiple Image-Maker and Identification Compositor) is a system for generating composites. MIMIC uses film strip projections. Compusketch is a system for assisting witnesses in approximating their description of the facial features of a criminal suspect. Both Compusketch and Mac-A-Mug Pro are computerized versions of the Photofit process.

FacePrints is a project and tool of Johnston and Caldwell (1997) at New Mexico State University, for assisting a witness to build a facial composite of a criminal suspect. The tool tracks the suspect through a ‘face space’, a population of faces, by using a genetic algorithm: genetic computation is an optimization technique inspired by natural selection from biology. The use of genetic algorithms in data mining has been described, in Stranieri et al. (2012a, b), inside my book about AI for legal evidence (Nissan 2012).

Facial reconstruction is the forensic reproduction of an individual human’s face from skeletal remains. Computer-graphic tools exist which support this task (Nissan 2012: Sec. 8.2.6). Age-progression software is a kind of computer-graphic software, which based on an old photograph, suggests what a child who has disappeared would look like a few years later (ibid.: Sec. 8.2.3).

Image forensics is a branch of forensic science whose goal is the detection of image tampering. The tampering is typically done by computer (digital forgeries), and the computational methods for detection (digital image forensics) belong to image processing within signal processing. Forgery in photographs in which, for example, a person is inserted is exposed by Johnson and Farid (2007) by analysing the lighting.

11 Concluding remarks

In this overview, we considered in turn the state of the art of artificial intelligence application to law and police investigations. We did not seek to be exhaustive. Rather, our aim has been to provide a correct view of the potential and implications for various professional profiles involved: lawyers, judges, prosecutors, police, forensic scientists, as well as members of the public or corporate bodies who seek an analysis of the risks of going to court, or seek alternative resolution.

How far has AI really affected the practice of law? Clearly, not as much as the hype, but there does exist a potential for adopting tools. How has the legal profession changed? It is clear that there is a market for tools of various kinds, for which claims are made. In this review of recent tools, no new AI breakthroughs were reported. Tools were largely for decision support. Legal systems remain unchanged by the technology available (changes some Continental jurisdictions underwent were partly under the impact of the Anglo-American adversarial system, but such changes were not driven by changes in technology). Responsibility remains with the lawyers. There exist tools for negotiation intended to relieve users of having to incur litigation costs (this is the case of the Split Up expert system for Australian family law), and even some of the costs involved with having to resort to a lawyer (think of cartaceous or software self-help kits for writing a valid will). Up to a point, then, the risk is the customer’s (but hasn’t this been the case anyway?): tools for use by members of the public have them incur risks similar to those of self-representation in court when no lawyer to assist is available. As for legal professionals, as well as for the police, tools may help them to understand the context in which they work. Decisions are still made on a basis of unequal power and unequal access to information. It was shown that there is a market for IT tools. Money can buy legal argumentation power; the poor are at a disadvantage. Because of the general trend in policies on a global as well as national level, that disadvantage is unfortunately very likely to keep increasing. The fault is not with the technology per se, but rather with societal and political context in which technology comes to be developed for likely buyers, and with economic disincentives for access to members of the public.

Concerning tools for law in particular: this is an area for Weak AI, where tools are developed to help to solve problems confronting lawyers. It is then a matter of legal professionals deciding what to buy. The ordinary citizen remains dependent on human lawyers. The author is associated with many of the systems which were discussed.