Aims and Scope

The rapid growth of scientific, technical, and legal data available online, such as patents, technical reports, and articles, has made the large-scale analysis and processing of such documents a crucial task. Scientists, patent experts, inventors, and other information professionals add to this data every day by publishing articles or writing patent applications.

To benefit from the scientific and technical knowledge present in such documents, it has become critical that the communities working on Semantic Technologies, NLP, and Deep Learning join forces to provide more effective and efficient solutions. This workshop aims to provide a venue for researchers and practitioners to foster interdisciplinary research in the areas of Semantic Technologies, NLP, and Deep Learning.

Papers deadline

March 9th, 2023

March 16th, 2023


April 13th, 2023


April 20th, 2023

Workshop Topics

We are interested in novel contributions about the following topics:

  • Data Set Collection
    • New tools and systems for capturing scientific, technical, and legal data including patents.
    • Proposals of procedures and tools to store, share, and preserve such data.
    • Collecting and sharing data sets such as benchmarks, etc.
  • Novel Semantic Technologies for scientific, technical and legal data
    • Ontologies and annotation schema to model such data.
    • Annotation, linking, and disambiguation.
    • Knowledge graph construction.
  • Applications of semantic technologies to patents and to scientific, technical, and legal data
    • Exploiting knowledge graphs to drive document similarity, question answering, search etc.
    • Recommender systems.
    • Semantic content-based retrieval.
    • Natural language processing techniques for classification, summarization, etc.


Submissions must be in English and adhere to the CEUR-WS one-column template (see Session 2: The New CEURART Style). Papers should be submitted as PDF files to EasyChair. The review process will be single-blind. Please be aware that at least one author per paper must register for and attend the workshop to present the work.

We will consider three different submission types:

  • Full research papers (10-12 pages) should be clearly placed with respect to the state of the art and state the contribution of the proposal in the domain of application, even if presenting preliminary results. In particular, research papers should describe the methodology in detail, experiments should be repeatable, and a comparison with the existing approaches in the literature is encouraged.
  • Short Papers (6-8 pages) should describe significant novel work in progress. Compared to full papers, their contribution may be narrower in scope, be applied to a narrower set of application domains, or have weaker empirical support than that expected for a full paper. Submissions likely to generate discussions in new and emerging areas of legal data are encouraged.
  • Position or Industry Papers (2-4 pages) should introduce new points of view on the workshop topics or summarize the experience of a group in the field.

Submissions should not exceed the indicated number of pages, including any diagrams and references.

Each submission will be reviewed by three independent reviewers on the basis of relevance for the workshop, novelty/originality, significance, technical quality and correctness, quality and clarity of presentation, quality of references and reproducibility.

The accepted papers will be available on the Workshop website. The proceedings will be published in a CEUR-WS volume and consequently indexed on Google Scholar, DBLP, and Scopus.


The SemTech4STLD workshop will take place on May 28th, 2023.

Timing Content

14:00-14:10

14:10-15:00 Keynote and Q&A on Making legal knowledge accessible to machines: challenges and opportunities

Speaker: Dr. Sabrina Kirrane, Institute for Information Systems and New Media, Vienna University of Economics and Business.

Abstract: This talk explores automated techniques for making legal knowledge pertaining to both legislation and court cases accessible to machines. First, we introduce a legal knowledge graph creation methodology that can be used to transform structured and unstructured legal data into legal knowledge graphs that can be easily linked across different EU member states. Our knowledge graph, which is strongly rooted in the ELI and ECLI standardisation initiatives, is populated with data and metadata from the Austrian legal information system and concepts automatically extracted from Austrian legal documents. Second, we examine the particularities of temporal annotation in the legal domain, including the different court case decision structures adopted by the European Court of Human Rights (ECHR), the European Court of Justice (ECJ) and the United States Supreme Court (USC). We are especially interested in misleading or mistaken temporal expressions in references, and in deficiencies in the TimeML temporal annotation standard when it comes to specific legal temporal expressions. Third, we compare different approaches to automatically extracting events and their components (i.e., who did what, and when). The comparative analysis is performed over a set of 30 decisions from the ECHR that were manually annotated by two legal experts. We subsequently used this gold standard to compare the performance of various event extraction tools: rule-based, probability-based, and deep-learning-based.

Short Bio: Dr. Sabrina Kirrane is an assistant professor at the Institute for Information Systems and New Media, Vienna University of Economics and Business. In addition, she is the Vice President of the Semantic Technology Institute International and the Founding Director of the Sustainable Computing Lab. She is also a member of the Vienna University of Economics and Business Research Institute for Cryptoeconomics. Dr. Kirrane was the Scientific/Technical Co-ordinator of the SPECIAL H2020 project from 2017 to 2019. She co-founded the Society, Privacy and the Semantic Web - Policy and Technology (PrivOn) workshop, which was co-located with the International Semantic Web Conference between 2013 and 2017. She is on the editorial board of the Journal of Web Semantics and the Semantic Web Journal. She has also served on the organising committee of the International Semantic Web Conference (ISWC), the Extended Semantic Web Conference (ESWC), and the International Conference on Semantic Systems (Semantics), and has participated in various Dagstuhl seminars focusing on Federated Semantic Data Management; Normative Multi-Agent Systems; Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web; and Autonomous Agents on the Web. Her research primarily focuses on data and algorithmic governance (e.g., access constraints, usage policies, regulatory obligations, societal norms, trust and transparency mechanisms), legal informatics, decentralised applications, and intelligent software web agents.

15:00-15:30 Towards Semantic Exploration of Tables in Scientific Documents. Varish Mulwad, Vijay S Kumar, Jenny Weisenberg Williams, Tim Finin, Sharad Dixit and Anupam Joshi

15:30-16:00 Coffee Break

16:00-16:30 Invited Talk and Q&A on A Coordinated Ecosystem of Systems and Models for Maintenance and Publication of Data, Metadata and Legal Documents in the European Commission

Speaker: Dr. Armando Stellato

16:30-17:00 Environmental impact assessment reports in Wikidata and a Wikibase. Finn Nielsen, Ivar Lyhne, Dario Garigliotti, Annika Butzbach, Emilia Ravn Boess, Katja Hose and Lone Kørnøv

17:00-17:20 Efficient Use of DALICC in Data Processing Pipelines with Fuzzy License Information. Kurt Junghanns, Michael Martin, Norman Radtge and Sabine Gründer-Fahrer

17:20-17:40 Diving into Knowledge Graphs for Patents: Open Challenges and Benefits. Danilo Dessi and Rima Dessi
Workshop Chairs

Program Committee


For general enquiries on the workshop, please send an email to: rima.tuerker@fiz-karlsruhe.de