Basque Cultural Heritage Data

Challenge

The landscape of Basque cultural heritage datasets is fragmented across the Basque homelands in Spain and France and a widespread diaspora. Collections use incompatible formats and lack standard identifiers; data from before the 1980s also lacks a common orthography.

Approach

Wikidata entries for Basque persons, organisations, places, events and Cultural Heritage Objects (CHO) will provide the initial set of unambiguous identifiers used in the project. These identifiers, along with their associated text and metadata, will be prepared for integration into the ECHOLOT system. The initial dataset will then be expanded and enriched using further datasets from third parties and other sources. ECHOLOT will analyse text strings from source catalogues, propose matching entities, and rank the best candidates. Experts from the case study participants and from the case coordinator, UPV/EHU, will then validate the top suggestions through the ECHOLOT interface, so that the matching algorithm can be refined across two improvement cycles.
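The propose-and-rank step can be illustrated with a minimal sketch. It is not the ECHOLOT matching algorithm, which is not described here; it assumes plain string similarity from Python's standard library, and the function names, the entity identifiers (`EXAMPLE-1` and so on) and the label lists are all hypothetical placeholders rather than real Wikidata identifiers.

```python
from difflib import SequenceMatcher
import unicodedata


def normalise(text):
    """Casefold and strip diacritics so spelling variants compare more closely."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).strip()


def rank_candidates(source_string, entities, top_n=3):
    """Return up to top_n (entity_id, score) pairs, best match first.

    `entities` maps a hypothetical identifier to its known labels,
    including older or alternative spellings.
    """
    query = normalise(source_string)
    scored = []
    for entity_id, labels in entities.items():
        # Score an entity by its best-matching label.
        best = max(
            SequenceMatcher(None, query, normalise(label)).ratio()
            for label in labels
        )
        scored.append((entity_id, best))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Placeholder authority entries (identifiers are illustrative only).
entities = {
    "EXAMPLE-1": ["Gernika", "Guernica"],
    "EXAMPLE-2": ["Getaria"],
    "EXAMPLE-3": ["Donostia", "San Sebastián"],
}

ranked = rank_candidates("GUERNICA", entities)
for entity_id, score in ranked:
    print(f"{entity_id}\t{score:.2f}")
```

Storing several labels per entity is one simple way to cope with catalogue strings that predate a common orthography; in this sketch the ranked list would then go to a human expert for validation rather than being accepted automatically.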

Expected results

  • Entity alignments will be available in a dedicated ECHOLOT instance with full provenance metadata, including matching methods and human validation.
  • The validated alignments will also be published on Wikidata.
  • The first comprehensive, cross-border, cross-genre authority collection for Basque CH entities will be created.
  • Both the source datasets and Wikidata will be enriched with new metadata generated through the alignment process.
  • The ECHOLOT pipeline will remain trained and ready to process additional datasets in the future.
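As a rough illustration of what an alignment with provenance metadata might carry, the sketch below models one record as a Python dataclass. The field names and values are assumptions made for this example only and do not reflect the actual ECHOLOT data model.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AlignmentRecord:
    """One proposed link between a catalogue string and an authority entity.

    All field names are hypothetical, chosen for illustration.
    """
    source_dataset: str              # catalogue the string came from
    source_string: str               # text as it appears in the catalogue
    target_entity: str               # identifier of the proposed entity
    matching_method: str             # how the candidate was produced
    score: float                     # confidence reported by the matcher
    validated: Optional[bool] = None  # None means not yet reviewed
    validated_by: Optional[str] = None

    def is_publishable(self):
        """Only alignments explicitly accepted by a reviewer qualify."""
        return self.validated is True


record = AlignmentRecord(
    source_dataset="example-catalogue",
    source_string="Guernica",
    target_entity="EXAMPLE-1",
    matching_method="string-similarity",
    score=1.0,
)
print(record.is_publishable())  # False until a reviewer accepts it

record.validated = True
record.validated_by="example-reviewer"
print(asdict(record))
```

Keeping the matching method and the validation outcome on the record itself means each published alignment can be traced back to how it was produced and who confirmed it.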

Case study participants

UEU Inguma

Udako Euskal Unibertsitatea (UEU, Basque Summer University) is an educational institution created in 1973 to promote the Basque language at university level. It offers web services such as Inguma (a database with >48K scientific documents and >14K authors), Tesiker (>1000 PhD theses in Basque), and a free digital library with >500 academic books.

Medialab Tabakalera

Tabakalera is an international centre for contemporary culture and a medialab – an open space for citizen creation and experimentation fostering knowledge production and collaborative projects. Medialab manages a library catalogue, with >46K records, and a digital archive of Tabakalera’s artistic and cultural activity, with >15K archival records.

Sancho el Sabio Fundazioa

Fundación Sancho el Sabio Vital Fundazioa (Sancho el Sabio Foundation) is a non-profit concerned with the preservation, cataloguing and exhibition of Cultural Heritage objects. Its Documentation Centre is one of the world’s most important heritage libraries with regard to Basque culture. Its document collections, built up since 1964, comprise monographs, periodical publications, photographs, manuscripts, maps and family documentation.

EWKE (Basque Wikimedians User Group)

The Basque Wikimedians User Group (EWKE) is a non-profit dedicated to improving the presence of the Basque language within the Wikimedia ecosystem and to closing the knowledge gap on topics related to the Basque Country. The group is a Wikimedia Foundation affiliate and a cultural institution with international experience in developing projects centred on education, especially Open Educational Resources and GLAM.


ECHOLOT is a project funded by the European Union under Grant Agreement No. 101233096. The views and opinions expressed on this website are the sole responsibility of the author and do not necessarily reflect the views of the European Union.

ECHOLOT is part of the ECCCH Cultural Heritage Cloud initiative, managed by the ECHOES project.


© ECHOLOT. All rights reserved.