Our lobid-gnd service provides access to the Integrated Authority File (GND). The service integrates with OpenRefine, a powerful tool for working with messy data. This tutorial provides an overview of GND reconciliation for OpenRefine. The features used here require OpenRefine 2.8 or later.
Reconciliation is the process of matching name strings to identifiers of entities in a database such as an authority file or Wikidata. This is useful whenever you want to merge differing name strings for the same person in your data, or when you want to fetch additional data from the database you are reconciling against.
The first step in the reconciliation process is to create a project. OpenRefine can import data from various sources. For this tutorial, we’ll simply import data from the clipboard:
Copy these lines and paste them in OpenRefine:
name;beruf;ort
J. Weizenbaum;Informatiker;Berlin
Twain, Mark;Schriftsteller;
Kumar, Lalit;;
Jemand;;
In the following preview screen you can accept the automatically detected settings and create the project:
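For readers who want to check the sample outside OpenRefine, here is a minimal Python sketch of what the detected settings amount to, assuming they are a semicolon separator and a header row. The function name `parse_sample` is only illustrative and is not part of OpenRefine or lobid-gnd.

```python
import csv
import io

# The sample data from this tutorial, one record per line.
SAMPLE = """name;beruf;ort
J. Weizenbaum;Informatiker;Berlin
Twain, Mark;Schriftsteller;
Kumar, Lalit;;
Jemand;;
"""


def parse_sample(text):
    """Parse semicolon-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [dict(row) for row in reader]


rows = parse_sample(SAMPLE)
for row in rows:
    print(row)
```

Empty cells come back as empty strings, which matters later when deciding which columns to pass along with a reconciliation query.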
We now want to reconcile the text strings in the name column with GND entries:
We’ll have to add the GND reconciliation service, entering https://lobid.org/gnd/reconcile as the service URL:
Collapse the drawer on the left-hand side by clicking the newly added service. As our list for reconciliation consists solely of personal names, we now select DifferentiatedPerson to reconcile only against GND entries of that type:
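To make the type restriction more concrete, the following Python sketch builds the kind of query batch a reconciliation client sends. It assumes the service follows the generic Reconciliation Service API (a `queries` form parameter holding a JSON object keyed `q0`, `q1`, ...) and accepts form-encoded POST requests. The helper names and the default `limit` of 3 are illustrative choices, not taken from OpenRefine.

```python
import json
import urllib.parse
import urllib.request

SERVICE_URL = "https://lobid.org/gnd/reconcile"


def build_queries(names, type_id="DifferentiatedPerson", limit=3):
    """Build one query per name, restricted to the given GND type."""
    return {
        f"q{i}": {"query": name, "type": type_id, "limit": limit}
        for i, name in enumerate(names)
    }


def encode_form(queries):
    """Wrap the query batch in the form field the API is assumed to expect."""
    return {"queries": json.dumps(queries)}


def reconcile(names):
    """Send the batch to the service (network call, not executed here)."""
    data = urllib.parse.urlencode(encode_form(build_queries(names))).encode("utf-8")
    with urllib.request.urlopen(SERVICE_URL, data=data) as response:
        return json.load(response)


queries = build_queries(["J. Weizenbaum", "Twain, Mark"])
print(json.dumps(queries, indent=2, ensure_ascii=False))
```

Passing a different `type_id` corresponds to choosing another type in the dialog.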
Optionally, we could reconcile against a non-default type by typing into the “Reconcile against type” field and selecting one of the suggested types, e.g.
It can make sense to pass additional data from other columns to improve the reconciliation results. Type in the text fields for each column, and select one of the suggested properties. For example, use the data from the beruf column to search in the professionOrOccupationAsLiteral field in the GND:
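As a rough sketch of how the extra column could travel with each query, the Python snippet below attaches non-empty cell values as `properties` entries. The `pid`/`v` structure is assumed from the generic Reconciliation Service API. `build_query` and the column-to-property mapping are hypothetical helpers for illustration only.

```python
def build_query(row, name_column, property_columns, type_id="DifferentiatedPerson"):
    """Build one reconciliation query for a table row.

    property_columns maps a column name to the GND field it should be
    searched in; empty cells are skipped so they do not constrain the match.
    """
    query = {"query": row[name_column], "type": type_id}
    properties = [
        {"pid": pid, "v": row[column]}
        for column, pid in property_columns.items()
        if row.get(column)
    ]
    if properties:
        query["properties"] = properties
    return query


mapping = {"beruf": "professionOrOccupationAsLiteral"}
with_job = build_query(
    {"name": "J. Weizenbaum", "beruf": "Informatiker", "ort": "Berlin"}, "name", mapping
)
without_job = build_query({"name": "Jemand", "beruf": "", "ort": ""}, "name", mapping)
print(with_job)
print(without_job)
```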
After reconciliation, we can inspect candidates that have not been automatically matched by clicking or hovering over (depending on your OpenRefine version) their name:
This brings up a preview, with the option to match them:
Alternatively, we can search for a match by clicking “Search for match”. This brings up a dialog with a text field prefilled with the cell value. Select one of the suggestions to match the cell:
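The distinction between automatically matched cells and cells that still need a decision can be sketched in Python as follows. This assumes the response shape of the generic Reconciliation Service API, where each candidate carries `id`, `name`, `score` and a boolean `match`. The sample response uses made-up placeholder IDs and names purely for illustration. It is not real GND data.

```python
def split_by_match(response):
    """Separate automatically matched queries from those needing review.

    Returns (matched, unmatched): matched maps a query key to the chosen ID,
    unmatched maps a query key to its candidates, best score first.
    """
    matched, unmatched = {}, {}
    for key, entry in response.items():
        candidates = sorted(
            entry.get("result", []),
            key=lambda candidate: candidate.get("score", 0),
            reverse=True,
        )
        auto = next((c for c in candidates if c.get("match")), None)
        if auto is not None:
            matched[key] = auto["id"]
        else:
            unmatched[key] = candidates
    return matched, unmatched


# Placeholder response with invented IDs, for illustration only.
SAMPLE_RESPONSE = {
    "q0": {"result": [{"id": "example-1", "name": "Example, A", "score": 90.0, "match": True}]},
    "q1": {"result": [
        {"id": "example-2", "name": "Example, B", "score": 40.0, "match": False},
        {"id": "example-3", "name": "Example, C", "score": 55.0, "match": False},
    ]},
    "q2": {"result": []},
}

matched, unmatched = split_by_match(SAMPLE_RESPONSE)
print(matched)
print(unmatched)
```

Queries with candidates but no automatic match are the ones to review in the preview. Queries with no candidates at all are the ones where "Search for match" is the remaining option.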
After matching, we can enrich our data with the reconciled data. We want to add columns based on the reconciled values:
We can now select the properties we want to add (using the search field and picking one of the suggestions for what we typed, or from the prefilled list below the search field) and preview them. Here, we choose
Beruf oder Beschäftigung,
The first three properties are GND entries themselves, so they are recognized as reconciled items (they are links in the preview).
For non-reconciled items that have a label and an ID in lobid-gnd (such as Ländercode), we can configure the content we want (label or ID) using the configure link for that property:
Note also the limit setting, which works for all properties and limits the number of values added for each entry (0 is the default, meaning no limit).
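The effect of the limit setting can be sketched in Python as below. The request shape (`ids` plus `properties`, each with a `settings` object) is assumed from OpenRefine's generic data extension protocol. The property ID `professionOrOccupation`, the record IDs, and the helper names are hypothetical placeholders, not confirmed lobid-gnd identifiers.

```python
import json


def build_extend_request(ids, property_ids, limit=0):
    """Build a data extension request; limit=0 means no limit."""
    return {
        "ids": list(ids),
        "properties": [
            {"id": property_id, "settings": {"limit": limit}}
            for property_id in property_ids
        ],
    }


def apply_limit(values, limit=0):
    """Mimic the limit semantics locally: 0 keeps all values."""
    return list(values) if limit == 0 else list(values)[:limit]


# Placeholder IDs and property name, for illustration only.
request = build_extend_request(["example-1", "example-2"], ["professionOrOccupation"], limit=1)
print(json.dumps(request, indent=2))
print(apply_limit(["a", "b", "c"], limit=0))
print(apply_limit(["a", "b", "c"], limit=2))
```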
After confirming the preview (removing the old columns ort, cutting off the non-reconciled item using the facet on the left-hand side), we have the enriched table with new data:
We can now use the new reconciled items (like Berlin in the Sterbeort column here) to add more columns based on their properties (i.e. properties of
As an example, we add a link to a depiction of the
Finally, we can export our data in various supported formats: