21Apr2020

Data as DNA

CORE Admin

LAB1100 is exploring new approaches to the discovery of patterns in historical texts.

LAB1100 is setting up a new project that aims to apply pattern recognition techniques developed in the field of bioinformatics to transcribed handwritten documents. This project will develop a tool that can produce a data-driven index of terms from any type of textual data. As the tool will rely on recurring patterns, and not on semantics, the application will be language-agnostic and will be able to deal with spelling variations and transcription errors.

LAB1100 aims to rely on algorithms that have been developed in the field of bioinformatics for the discovery of DNA sequences.
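The post does not name the algorithms involved. As a minimal sketch of the general idea, the example below treats each transcribed page as one continuous character sequence and records which short substrings (k-mers, a common unit in sequence analysis) recur across pages. The function names, the value of `k`, and the sample page texts are assumptions made for illustration, not part of the project.

```python
from collections import defaultdict

def kmer_index(pages, k=5):
    """Map every character k-mer to the set of page ids on which it occurs."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Fold case and drop whitespace so the page reads as one sequence.
        sequence = "".join(text.lower().split())
        for i in range(len(sequence) - k + 1):
            index[sequence[i:i + k]].add(page_id)
    return index

def recurring_patterns(index, min_pages=2):
    """Keep only the k-mers that occur on at least `min_pages` pages."""
    return {kmer: ids for kmer, ids in index.items() if len(ids) >= min_pages}

# Hypothetical transcriptions with three spellings of the same place name.
pages = {
    "p1": "Amsterdam den 3 Mey",
    "p2": "tot Amsterdamme gekomen",
    "p3": "uyt Amstelredam vertrocken",
}
patterns = recurring_patterns(kmer_index(pages, k=5))
```

Because the comparison works on character sequences rather than on dictionary words, the shared fragment `amste` links all three pages even though the spellings differ, which is the property that makes such an approach language-agnostic.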

The purpose of this project is not to provide scholars with a tool that shows 'the most relevant' terms, but to create a heuristic tool that helps to identify terms and to locate the texts and pages in which these terms have been used. The tool can also be used to find texts or pages with co-occurring patterns. The tool's API and user interactions will be facilitated by a nodegoat research environment that will ingest the indices, weights, and references to the pages and texts.
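Finding pages with co-occurring patterns can be sketched as follows, assuming an index that maps each pattern to the set of pages on which it was found. The index contents, the pattern strings, and the function names are hypothetical; the format in which nodegoat ingests indices and weights is not described in this post.

```python
from collections import Counter
from itertools import combinations

def pages_with_all(pattern_pages, patterns):
    """Return the pages on which every one of the given patterns occurs."""
    page_sets = [pattern_pages.get(p, set()) for p in patterns]
    return set.intersection(*page_sets) if page_sets else set()

def cooccurrence_counts(pattern_pages):
    """Count, for each pair of patterns, the number of pages they share."""
    patterns_per_page = {}
    for pattern, page_ids in pattern_pages.items():
        for page_id in page_ids:
            patterns_per_page.setdefault(page_id, set()).add(pattern)
    counts = Counter()
    for patterns in patterns_per_page.values():
        for pair in combinations(sorted(patterns), 2):
            counts[pair] += 1
    return counts

# Hypothetical index: pattern -> pages on which it occurs.
pattern_pages = {
    "amste": {"p1", "p2", "p3"},
    "schip": {"p2", "p3"},
    "brief": {"p1"},
}
shared = pages_with_all(pattern_pages, ["amste", "schip"])
counts = cooccurrence_counts(pattern_pages)
```

The pair counts could serve as simple weights, but any weighting scheme shown here is an illustration only.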

data analysis

Latest Blog Posts

03Feb2026

Data and Dialogue: Retrieval-Augmented Generation in nodegoat

CORE Admin

We have extended nodegoat in order to be able to communicate with large language models (LLMs). Conceptually this allows users of nodegoat to prompt their structured data. Technically this means nodegoat users are able to create vector embeddings for their objects and use these embeddings to perform retrieval-augmented generation (RAG) processes in nodegoat.

This development connects three of nodegoat’s main functionalities into a dynamic workflow: Linked Data Resources, the new vector store (nodegoat documentation: Object Descriptions, see ‘vector’), and Filtering. The steps to take are as follows:

Vector Embedding

The first step is to use one or multiple Reversed Collection templates to determine the textual content for each Object. This step transforms any dataset stored as structured data into a textual representation that can be used as the input value for generating a vector embedding. Because the templates determine what is included, the user can select only those elements that are relevant for the process.

Next, the textual representation of each Object is sent to an LLM in order to create an embedding for each Object. The communication between nodegoat and an LLM is achieved by making use of Linked Data Resources and Ingestion Processes.[....]
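Once each Object has an embedding, the retrieval part of RAG generally comes down to ranking stored vectors by their similarity to the embedding of a prompt. The sketch below illustrates that ranking with cosine similarity on toy three-dimensional vectors. The object identifiers, the vectors, and the function names are invented for illustration, and the way nodegoat itself performs retrieval is not described in this excerpt.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda oid: cosine(query, store[oid]), reverse=True)
    return ranked[:k]

# Toy vector store: object id -> embedding (real embeddings are much longer).
store = {
    "object_1": [1.0, 0.0, 0.0],
    "object_2": [0.8, 0.6, 0.0],
    "object_3": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]  # stand-in for the embedding of a user prompt
nearest = top_k(query, store, k=2)
```

In a RAG workflow, the textual representations of the retrieved Objects would then be passed to the LLM as context alongside the prompt.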

AI data analysis nodegoat

26Jan2026

Upcoming nodegoat workshops

CORE Admin

In the next couple of months we will be running the following events at various locations throughout Europe. Find the latest information about them here: https://nodegoat.net/workshop

05-02-2026: nodegoat Workshop at the University of Basel organised by the Research and Infrastructure Support team and the Swiss National Data and Service Center for the Humanities.
19-02-2026: nodegoat Workshop at the University of Jena.
25-03-2026: Workshop: Einführung in nodegoat at the University of Bonn.
16-04-2026: nodegoat Workshop at the Research Centre of the Slovenian Academy of Sciences and Arts in Ljubljana.
24-04-2026: nodegoat Workshop at KU Leuven, organised by CLARIAH-VL.
10-07-2026: nodegoat Curious: Building a Custom Relational Database for Your Research at the Digital Medieval Studies Institute, IMC Leeds.

training

13Nov2025

nodegoat APIs available in the OpenAPI Description

CORE Admin

Every nodegoat API is now described by an OpenAPI Description (OAD). A .yaml file is automatically generated based on the current configuration of any project-specific API. This machine-readable document describes all available endpoints for the selected project.
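Because an OAD is machine-readable, a client can enumerate the endpoints programmatically. The sketch below assumes the .yaml file has already been parsed into a Python dictionary (for example with a YAML library) and relies only on the standard OpenAPI layout, in which `paths` maps each path to its HTTP operations. The example paths are placeholders, not actual nodegoat endpoints.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def list_operations(oad):
    """Return (METHOD, path) pairs for every operation in a parsed OAD."""
    operations = []
    for path, item in sorted(oad.get("paths", {}).items()):
        # A path item can also hold non-operation keys such as "parameters".
        for method in sorted(item):
            if method in HTTP_METHODS:
                operations.append((method.upper(), path))
    return operations

# Placeholder document; a real OAD would be loaded from the generated file.
oad = {
    "openapi": "3.1.0",
    "paths": {
        "/example/objects": {"get": {}, "post": {}, "parameters": []},
        "/example/objects/{id}": {"get": {}},
    },
}
operations = list_operations(oad)
```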

See the blog post on nodegoat.net for a full description of this new feature.

data publication nodegoat