Developer (ETL, AWS, RDF, SPARQL, Elasticsearch)
Building alpha for managing catalogue data using RDF
After a successful discovery, our client needs an expert RDF developer to help implement an alpha of a new pan-archival catalogue, developing a solution that meets users' needs using AWS Neptune and Elasticsearch.
Work so far
Our client has run a discovery phase leading to plans for an alpha using AWS Neptune and Elasticsearch.
The discovery produced a proposal for a new Catalogue Data Model using RDF, a new identifier scheme, and transformation routines for converting the existing data to the new model. We have held workshops identifying the key ways in which staff who manage the catalogue work with the data, and what they would like in future. Archivists need to search, analyse, add to, correct, edit, enrich, and enhance record descriptions so that the catalogue is properly maintained. They need to work with catalogue entries individually or as large sets, making (or reversing) bulk changes, so they can work efficiently. They also need to understand the version history of the catalogue so they can be confident about where the information originated.
We have investigated all the current databases that hold catalogue data and how they inter-relate. We have investigated a wide range of existing data standards and ontologies. We have documented all the findings in a detailed published report.
Our client is developing a pan-archival catalogue, bringing together record descriptions from multiple catalogues into a single new system. We are looking for a developer to work on the alpha development.
The specialist will develop a new catalogue management system. This will involve developing API functions to search, select, add, export, edit, import and delete catalogue data; developing search for use by expert users (using SPARQL in combination with Elasticsearch); and developing an Extract, Transform, Load (ETL) process to migrate The National Archives catalogue data from multiple relational (SQL Server) and RDF databases to a cloud-based native RDF database (AWS Neptune).
Key Skills / Experience
- Have experience with using standards-based ontologies/vocabularies, such as W3C PROV data model, Dublin Core and W3C ODRL
- Have experience of validating RDF data, for example using SHACL (Shapes Constraint Language)
- Have experience of working with RDF databases and SPARQL, for example AWS Neptune
- Have experience, knowledge and understanding of Extract, Transform, Load (ETL) processes
- Have experience, knowledge and understanding of working with mixed content in the context of large, semi-structured datasets
- Have experience, knowledge and understanding of creating resilient and secure systems using IAM in a cloud context
- Have experience developing a user interface/front end to support non-expert, editorial engagement with RDF
- Have experience, knowledge and understanding of EAD3 and EAC-CPF