Abstract
Legislation and case law are widely published on the Web as documents for humans to read. In contrast, this paper argues for publishing legal documents as Linked Open Data (LOD) on top of which intelligent legal services for end users can be created in addition to just providing the documents for close reading. To test and demonstrate this idea, we present work on creating the Linked Open Data service Semantic Finlex for Finnish legislation and case law and the semantic portal prototype LawSampo for serving end users with legal data. Semantic Finlex is a harmonized knowledge graph that is created automatically from legal textual documents and published in a SPARQL endpoint on top of which the various applications of LawSampo are implemented. First applications include faceted semantic search and browsing for 1) statutes and 2) court decisions, as well as 3) a service for finding court decisions similar to a given one or free text. A novelty of LawSampo is the provision of ready-to-use tooling for exploring and analyzing legal documents, based on the “Sampo” model.
1 Semantic Finlex Linked Open Data Service
Finnish legislation and case law have been published as web documents since 1997 in the Finlex Data Bank (Footnote 1). Although the Finlex service is widely used by the public, it does not provide machine-readable legal information as open data on top of which the ministry or third parties could build services and analyses. The first version of Semantic Finlex, based on Linked Data, was published in 2014 [4]. The data included 2413 consolidated statutes, 11904 judgments of the Supreme Court, and 1490 judgments of the Supreme Administrative Court. In addition, some 30000 terms used in 26 different thesauri were harvested for a draft of a consolidated vocabulary. During the work, shortcomings of the initial RDF data model became evident, as did the need to adopt the then-emerging standards for EU-level interoperability: the European Legislation Identifier (ELI) [3] and the European Case Law Identifier (ECLI) [2]. The dataset also contained only one version (2012) of the statutory law and was not updated as new legislation and case law were published in Finlex. These issues were resolved in the new version of Semantic Finlex [10], which currently hosts a dataset of approximately 28 million triples. The data was enriched by automatically annotating it with links to named entities (judges mentioned in the court decisions), references to legal texts (such as EU law transposed by the statutes and statutory citations appearing in court cases), vocabularies, and data sources such as DBpedia, using several named entity linking tools [10, 13].
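To make the role of ECLI identifiers concrete, the following minimal Python sketch splits an ECLI string into its five colon-separated parts (the literal "ECLI", a country code, a court code, a year, and an ordinal). The function name `parse_ecli`, the `Ecli` type, and the example identifier are illustrative assumptions and are not taken from Semantic Finlex; the length limits in the pattern reflect our understanding of the ECLI format and should be checked against the specification [2].

```python
import re
from typing import NamedTuple


class Ecli(NamedTuple):
    """Components of a European Case Law Identifier."""
    country: str
    court: str
    year: int
    ordinal: str


# "ECLI" : country code : court code : year : ordinal number.
# Assumed limits: 2-letter country code, court code of 1-7 characters
# starting with a letter, ordinal of up to 25 alphanumerics or dots.
_ECLI_RE = re.compile(
    r"^ECLI:([A-Z]{2}):([A-Z][A-Z0-9]{0,6}):(\d{4}):([A-Z0-9.]{1,25})$",
    re.IGNORECASE,
)


def parse_ecli(text: str) -> Ecli:
    """Split an ECLI string into its components or raise ValueError."""
    match = _ECLI_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed ECLI: {text!r}")
    country, court, year, ordinal = match.groups()
    return Ecli(country.upper(), court.upper(), int(year), ordinal.upper())
```

For instance, calling `parse_ecli("ECLI:FI:KKO:2016:1")` on this made-up identifier yields the country `FI`, the court code `KKO`, the year 2016, and the ordinal `1`.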
The Semantic Finlex service adopts the 5-star Linked Data model (Footnote 2), extended with two more stars as suggested in the Linked Data Finland model and platform [7]. The 6th star is obtained by providing the dataset schemas and documenting them. Semantic Finlex schemas can be downloaded from the service, and the data models are documented under the data.finlex.fi domain. The 7th star is achieved by validating the data against the documented schemas to prevent errors in the published data. Semantic Finlex aims for the 7th star by applying several means of catching errors during the data conversion process. The service is powered by the Linked Data Finland (Footnote 3) publishing platform, which hosts a variety of datasets and provides tools and services that facilitate publishing and re-using Linked Data. All URIs are dereferenceable and support content negotiation by using HTTP 303 redirects. In accordance with the ELI specification, RDF is embedded in the HTML presentations of the legislative documents as RDFa (Footnote 4) markup. In addition to the converted RDF data, the original XML files are also provided. For programmers without knowledge of SPARQL or RDF, a simplified REST API is provided as well. The underlying triplestore, Apache Jena Fuseki (Footnote 5), runs in a Docker container, which allows efficient provisioning of resources (CPU, memory) and scaling.
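Content negotiation with 303 redirects can be illustrated from the client side. The sketch below uses only the Python standard library: the client names the representation it wants in the `Accept` header, and `urllib` follows the redirect from the resource URI to the matching document. The media-type table and the helper names `build_request` and `fetch` are assumptions made for illustration; the formats actually served by the service may differ.

```python
import urllib.request
from typing import Tuple

# Assumed media types; the representations actually offered may differ.
MEDIA_TYPES = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "html": "text/html",
}


def build_request(uri: str, fmt: str = "turtle") -> urllib.request.Request:
    """Create a GET request asking for one representation of a resource URI."""
    return urllib.request.Request(uri, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(uri: str, fmt: str = "turtle") -> Tuple[str, bytes]:
    """Dereference the URI and return (final document URL, response body).

    urllib follows the HTTP 303 redirect automatically, so the returned URL
    is that of the document describing the resource, not the resource URI.
    """
    with urllib.request.urlopen(build_request(uri, fmt), timeout=30) as resp:
        return resp.geturl(), resp.read()
```

Calling `fetch` with the same URI and `fmt="html"` would request the human-readable page instead of the RDF serialization; the two calls differ only in the `Accept` header.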
2 LawSampo Semantic Portal
To demonstrate the use of Semantic Finlex in applications, the semantic portal LawSampo is being developed. LawSampo is a new member of the Sampo (Footnote 6) series of semantic portals, based on the “Sampo model” [6], in which the data is enriched through a shared ontology and Linked Data infrastructure, multiple application perspectives are provided on top of a single SPARQL endpoint, and faceted search and browsing are integrated with data-analytic tooling. The faceted search and tooling are implemented using the Sampo-UI framework (Footnote 7) [8]. The Sampo portals (Footnote 8) have had millions of end users on the Web, which suggests that the model is a promising basis for creating useful semantic portals.
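The idea of serving faceted search from a SPARQL endpoint can be sketched as a function that turns facet selections into a query. This is a simplified illustration and not how Sampo-UI generates its queries: the function name `build_facet_query` and all IRIs used with it are hypothetical, and the IRIs are inserted without any escaping, which a real implementation would need to handle.

```python
from typing import Dict, List


def build_facet_query(
    rdf_type: str, selections: Dict[str, List[str]], limit: int = 20
) -> str:
    """Turn facet selections into a SPARQL SELECT query string.

    `selections` maps a property IRI (one facet) to the value IRIs the user
    has picked. Values within a facet are alternatives (VALUES clause);
    different facets must all match (conjunction of triple patterns).
    """
    lines = ["SELECT DISTINCT ?doc WHERE {", f"  ?doc a <{rdf_type}> ."]
    for i, (prop, values) in enumerate(sorted(selections.items())):
        if not values:
            continue  # an untouched facet does not constrain the result set
        var = f"?f{i}"
        iris = " ".join(f"<{v}>" for v in values)
        lines.append(f"  VALUES {var} {{ {iris} }}")
        lines.append(f"  ?doc <{prop}> {var} .")
    lines.append("}")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)
```

Each additional selection narrows the result set, which is what lets the user filter the documents step by step before opening one of them or analyzing the filtered set.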
The landing page of the LawSampo portal offers different application perspectives:
1. Statutes. Clicking on Statutes opens a faceted search interface [14] for searching and browsing statutes. The facets on the left include document type (with seven subtypes), statute type, year, and related EU regulation. After filtering out a set of documents (or a particular document) of interest, the user has two options. First, the user can select a document from the result list to open its “homepage”, which shows not only the document but also linked contextual information, such as the EU regulations it refers to, linked to EU CELLAR (Footnote 9), or other documents in Semantic Finlex that refer to it. For example, court decisions in which the statute has been applied can be shown. Second, the user can analyze the filtered documents. For example, a histogram of the dates of the filtered documents can be created.
2. Case Law. In the Case Law perspective, a similar faceted search interface opens for searching and browsing court decisions. Here the facets include court, judge, and keywords characterizing the subject matter of the judgement.
3. Case Law Search. The third perspective is an application in which a case law judgement, or more generally any document or text, can be used as a query for finding similar judgements. For example, a person who receives a judgement from a court can use this application to find out what kinds of similar judgements have been made before. Several methods for finding similar cases were tested when implementing this application, including TF-IDF, Latent Dirichlet Allocation (LDA), Word2Vec, and Doc2Vec [11, 12].
4. Life Events. In addition, a fourth perspective is being implemented in which legal materials can be searched based on the end user’s life situation or problem at hand (e.g., divorce).
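Of the similarity methods mentioned for Case Law Search, TF-IDF is the simplest to illustrate. The following pure-Python sketch ranks a small corpus against a free-text query by the cosine similarity of TF-IDF vectors. It is a toy illustration under stated assumptions, not the LawSampo implementation: the whitespace-style tokenizer, the smoothed IDF formula, and all function names are choices made here, and the query is included when computing document frequencies for simplicity.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF keeps weights positive even for terms in every document.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(query: str, corpus: List[str]) -> List[Tuple[int, float]]:
    """Return (corpus index, score) pairs, most similar first."""
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(vectors[:-1])]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Because the query can be any text, the same function covers both use cases described above: passing the full text of a judgement, or passing a short free-text description of a case.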
3 Related Work and Contributions
Our work on legal Linked Data services was influenced by the MetaLex Document Server (Footnote 10) [5], which publishes Dutch legislation using the CEN MetaLex XML and ontology standards. Other MetaLex ontology-based implementations include legislation.gov.uk (Footnote 11) and Nomothesia (Footnote 12), which also implements ELI-compliant identifiers. Various ELI implementations and prototypes have also been deployed in existing national legal information portals, e.g., in Luxembourg (Footnote 13), France (Footnote 14), and Norway (Footnote 15). Many countries already produce ECLI-compliant case law documents to be indexed by the ECLI search engine (Footnote 16). A prominent example of publishing EU law and publications as linked data is the CELLAR system. Previous related work in the U.S. includes, e.g., the Legal Linked Data project, which aims at enhanced access to product regulatory information [1].
LawSampo aims to widen the focus of these related works by providing both legislation and case law to end users through intelligent user interfaces, such as semantic faceted search and document similarity-based search. The documents are automatically enriched with contextual linked data, and the end user is also provided with ready-to-use data-analytic tooling for analyzing the documents and their relations. In the future, we plan to expand the related enriching datasets to include, e.g., related parliamentary documents and discussions (Footnote 17), in the spirit of [15]. To be able to publish more legal documents in a cost-efficient way, we also work on semi-automatic pseudonymization of court judgements [9] and automatic annotation of legal documents [13].
Notes
- 6. In Finnish mythology and the epic Kalevala, “Sampo” is a mythical artefact of indeterminate type that gives its owner richness and good fortune, an ancient metaphor of technology.
- 7. Cf. homepage for more info: https://seco.cs.aalto.fi/tools/sampo-ui/.
- 8. Including, e.g., CultureSampo (2008) for cultural heritage, TravelSampo (2011) for tourism, BookSampo (2011) for fiction literature, WarSampo (2015) for military history, BiographySampo (2018) for prosopography, and NameSampo (2019) for toponomastic research. Cf. homepage: https://seco.cs.aalto.fi/applications/sampo/.
- 17. The ParliamentSampo system: https://seco.cs.aalto.fi/projects/semparl/en/.
References
1. Casellas, N., et al.: Linked legal data: improving access to regulations. In: Proceedings of the 13th Annual International Conference on Digital Government Research (dg.o 2012), pp. 280–281. Association for Computing Machinery (2012)
2. Council of the European Union: Council conclusions inviting the introduction of the European Case Law Identifier (ECLI) and a minimum set of uniform metadata for case law. In: Official Journal of the European Union, C 127, 29.4.2011, pp. 1–7. Publications Office of the European Union (2011)
3. Council of the European Union: Council conclusions inviting the introduction of the European Legislation Identifier (ELI). In: Official Journal of the European Union, C 325, 26.10.2012, pp. 3–11. Publications Office of the European Union (2012)
4. Frosterus, M., Tuominen, J., Hyvönen, E.: Facilitating re-use of legal data in applications-Finnish law as a linked open data service. In: Proceedings of JURIX 2014, Kraków, Poland, pp. 115–124. IOS Press (2014)
5. Hoekstra, R.: The MetaLex document server. In: Aroyo, L., et al. (eds.) ISWC 2011. LNCS, vol. 7032, pp. 128–143. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25093-4_9
6. Hyvönen, E.: Using the semantic web in digital humanities: shift from data publishing to data-analysis and serendipitous knowledge discovery. Semant. Web 11(1), 187–193 (2020)
7. Hyvönen, E., Tuominen, J., Alonen, M., Mäkelä, E.: Linked data Finland: a 7-star model and platform for publishing and re-using linked datasets. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds.) ESWC 2014. LNCS, vol. 8798, pp. 226–230. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11955-7_24
8. Ikkala, E., Hyvönen, E., Rantala, H., Koho, M.: Sampo-UI: A Full Stack JavaScript Framework for Developing Semantic Portal User Interfaces (2020). https://seco.cs.aalto.fi/publications/2020/ikkala-et-al-sampo-ui-2020.pdf. Submitted
9. Oksanen, A., Tamper, M., Tuominen, J., Hietanen, A., Hyvönen, E.: ANOPPI: a pseudonymization service for Finnish court documents. In: Proceedings of JURIX 2019, Madrid, Spain, pp. 251–254. IOS Press (2019)
10. Oksanen, A., Tuominen, J., Mäkelä, E., Tamper, M., Hietanen, A., Hyvönen, E.: Semantic Finlex: transforming, publishing, and using Finnish legislation and case law as linked open data on the web. In: Peruginelli, G., Faro, S. (eds.) Knowledge of the Law in the Big Data Age, Frontiers in Artificial Intelligence and Applications, vol. 317, pp. 212–228. IOS Press (2019)
11. Sarsa, S.: Information retrieval with Finnish case law embeddings. Master’s thesis, University of Helsinki, Department of Computer Science (2019)
12. Sarsa, S., Hyvönen, E.: Searching case law judgements by using other judgements as a query. In: Proceedings of the 9th Conference Artificial Intelligence and Natural Language. AINL 2020, Helsinki, Finland, 7–9 October 2020. Springer-Verlag (2020)
13. Tamper, M., Oksanen, A., Tuominen, J., Hietanen, A., Hyvönen, E.: Automatic annotation service APPI: named entity linking in legal domain. In: Proceedings of ESWC 2020, Posters and Demos. LNCS, vol. 12124, pp. 208–213. Springer, Heidelberg (2020)
14. Tunkelang, D.: Faceted Search. Morgan & Claypool Publishers, San Rafael (2009)
15. Van Aggelen, A., Hollink, L., Kemman, M., Kleppe, M., Beunders, H.: The debates of the European parliament as linked open data. Semant. Web 8(2), 271–281 (2017)
Acknowledgments
Our work was funded by the Ministry of Justice; CSC – IT Center for Science, Finland, provided computational resources for the work.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hyvönen, E. et al. (2020). Publishing and Using Legislation and Case Law as Linked Open Data on the Semantic Web. In: Harth, A., et al. The Semantic Web: ESWC 2020 Satellite Events. ESWC 2020. Lecture Notes in Computer Science, vol 12124. Springer, Cham. https://doi.org/10.1007/978-3-030-62327-2_19
DOI: https://doi.org/10.1007/978-3-030-62327-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62326-5
Online ISBN: 978-3-030-62327-2
eBook Packages: Computer Science, Computer Science (R0)