OAI-PMH interface
A starter OAI-PMH interface for ArchivesSpace allowing other systems to harvest your records is included in version 2.1.0. Additional features and functionality will be added in later releases.
By default, the OAI-PMH interface runs on port 8082. A sample request page is available at http://localhost:8082/sample. (To access it, make sure that you have set the AppConfig[:oai_proxy_url] appropriately.)
The system provides responses to a number of standard OAI-PMH requests, including GetRecord, Identify, ListIdentifiers, ListMetadataFormats, ListRecords, and ListSets. Unpublished and suppressed records and elements are not included in any of the OAI-PMH responses.
Some responses require the URL parameter metadataPrefix. There are five different metadata responses available:
- EAD — oai_ead (resources in EAD)
- Dublin Core — oai_dc (archival objects and resources in Dublin Core)
- extended DCMI Terms — oai_dcterms (archival objects and resources in DCMI Metadata Terms format)
- MARC — oai_marc (archival objects and resources in MARC)
- MODS — oai_mods (archival objects and resources in MODS)
The EAD response for resources and MARC response for resources and archival objects use the mappings from the built-in exporter for resources. The DC, DCMI terms, and MODS responses for resources and archival objects use mappings suggested by the community.
Here are some example URLs and other information for these requests:
GetRecord – needs a record identifier and metadataPrefix
Up to ArchivesSpace v3.5.1, OAI identifiers are in this format:
http://localhost:8082/oai?verb=GetRecord&identifier=oai:archivesspace//repositories/2/resources/138&metadataPrefix=oai_ead
Starting with ArchivesSpace v4.0.0, OAI identifiers use a new format (notice the colon after the oai:archivesspace namespace part of the identifier):
http://localhost:8082/oai?verb=GetRecord&identifier=oai:archivesspace:/repositories/2/resources/138&metadataPrefix=oai_ead
See also: https://github.com/code4lib/ruby-oai/releases/tag/v1.0.0
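The difference between the two identifier formats can be sketched in a few lines of Python. This is only an illustration for harvester authors: the function name `oai_identifier` and the `colon_separator` flag are hypothetical and are not part of ArchivesSpace.

```python
def oai_identifier(record_uri, colon_separator=True):
    """Build an OAI identifier from an ArchivesSpace record URI.

    colon_separator=True gives the form used from v4.0.0 onward;
    False gives the form used up to v3.5.1. Both the function name
    and the flag are hypothetical, chosen for this sketch.
    """
    if not record_uri.startswith("/"):
        raise ValueError("record URI should start with '/'")
    separator = ":" if colon_separator else "/"
    return "oai:archivesspace" + separator + record_uri


uri = "/repositories/2/resources/138"
print(oai_identifier(uri, colon_separator=False))  # format up to v3.5.1
print(oai_identifier(uri))                         # format from v4.0.0
```

A harvester that stored identifiers in the older format may need to map them to the newer one after an upgrade.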
Identify
http://localhost:8082/oai?verb=Identify
ListIdentifiers – needs a metadataPrefix
http://localhost:8082/oai?verb=ListIdentifiers&metadataPrefix=oai_dc
ListMetadataFormats
http://localhost:8082/oai?verb=ListMetadataFormats
ListRecords – needs a metadataPrefix
http://localhost:8082/oai?verb=ListRecords&metadataPrefix=oai_dcterms
ListSets
http://localhost:8082/oai?verb=ListSets
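The request URLs above follow a regular pattern, which could be captured in a small helper like the one below. It is a minimal sketch using only the Python standard library. The helper name `oai_url` and the constants are assumptions made for illustration, and the base URL is the default from this page.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8082/oai"  # default port from this page

# Verbs listed on this page; the first group needs a metadataPrefix.
NEEDS_PREFIX = {"GetRecord", "ListIdentifiers", "ListRecords"}
ALL_VERBS = NEEDS_PREFIX | {"Identify", "ListMetadataFormats", "ListSets"}


def oai_url(verb, base_url=BASE_URL, **params):
    """Return a request URL for the given verb (hypothetical helper)."""
    if verb not in ALL_VERBS:
        raise ValueError("unsupported verb: " + verb)
    resuming = "resumptionToken" in params
    if verb in NEEDS_PREFIX and not resuming and "metadataPrefix" not in params:
        raise ValueError(verb + " needs a metadataPrefix")
    if verb == "GetRecord" and "identifier" not in params:
        raise ValueError("GetRecord needs a record identifier")
    # Keep ':' and '/' readable so identifiers look like the examples above.
    return base_url + "?" + urlencode({"verb": verb, **params}, safe=":/")


print(oai_url("Identify"))
print(oai_url("ListRecords", metadataPrefix="oai_dcterms"))
```

Checking the required parameters before sending a request avoids a round trip that would only return an OAI-PMH error response.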
Harvesting the ArchivesSpace OAI-PMH server without specifying a set will yield all published records across all repositories. Predefined sets can be accessed using the set parameter. To retrieve records from a set, include the set parameter in the URL along with the DC metadataPrefix, such as “&set=collection&metadataPrefix=oai_dc”. Sets can be either configured sets (defined in config/config.rb) or one of the following levels of description:
- Class — class
- Collection — collection
- File — file
- Fonds — fonds
- Item — item
- Other_Level — otherlevel
- Record_Group — recordgrp
- Series — series
- Sub-Fonds — subfonds
- Sub-Group — subgrp
- Sub-Series — subseries
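As a rough illustration of the set parameter, the sketch below builds a ListRecords URL for one of the level-of-description sets listed above. The function name `list_records_url_for_set` and the `extra_sets` argument (standing in for any sets configured locally) are hypothetical.

```python
from urllib.parse import urlencode

# Set names for the levels of description listed above.
LEVEL_SETS = {
    "class", "collection", "file", "fonds", "item", "otherlevel",
    "recordgrp", "series", "subfonds", "subgrp", "subseries",
}


def list_records_url_for_set(set_spec, base_url="http://localhost:8082/oai",
                             extra_sets=()):
    """Build a ListRecords URL restricted to one set (hypothetical helper).

    extra_sets is an assumption of this sketch: pass the names of any
    locally configured sets so they are accepted as well.
    """
    if set_spec not in LEVEL_SETS | set(extra_sets):
        raise ValueError("unknown set: " + set_spec)
    query = urlencode({
        "verb": "ListRecords",
        "set": set_spec,
        "metadataPrefix": "oai_dc",
    })
    return base_url + "?" + query


print(list_records_url_for_set("collection"))
```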
In addition to the sets based on level of description, you can define sets based on repository codes and/or sponsors in the config/config.rb file:
The interface implements resumption tokens for pagination of results. For example, an initial ListRecords request takes this form:
http://localhost:8082/oai?verb=ListRecords&metadataPrefix=oai_ead
Each subsequent page is requested using the resumption token returned in the previous response:
http://localhost:8082/oai?verb=ListRecords&resumptionToken=eyJtZXRhZGF0YV9wcmVmaXgiOiJvYWlfZWFkIiwiZnJvbSI6IjE5NzAtMDEtMDEgMDA6MDA6MDAgVVRDIiwidW50aWwiOiIyMDE3LTA3LTA2IDE3OjEwOjQxIFVUQyIsInN0YXRlIjoicHJvZHVjaW5nX3JlY29yZHMiLCJsYXN0X2RlbGV0ZV9pZCI6MCwicmVtYWluaW5nX3R5cGVzIjp7IlJlc291cmNlIjoxfSwiaXNzdWVfdGltZSI6MTQ5OTM2MTA0Mjc0OX0=
Note: do not include the metadataPrefix parameter in a request that uses a resumptionToken.
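A harvester typically follows resumption tokens in a loop until the server stops returning one. The sketch below shows one way to do that with the Python standard library. The names `harvest_records` and `fetch` are hypothetical, and the loop assumes the standard OAI-PMH 2.0 response namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def fetch(url):
    """Download one response body as bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest_records(base_url, metadata_prefix, fetch_page=fetch):
    """Yield every <record> element, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch_page(base_url + "?" + urlencode(params)))
        for record in root.iterfind(".//oai:record", OAI_NS):
            yield record
        token = root.findtext(".//oai:resumptionToken", default="",
                              namespaces=OAI_NS).strip()
        if not token:
            return  # a missing or empty token marks the last page
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Calling `list(harvest_records("http://localhost:8082/oai", "oai_ead"))` against a running instance would collect all pages. The `fetch_page` argument exists so the loop can be exercised without a network connection.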
The ArchivesSpace OAI-PMH server supports persistent deletes, so harvesters will be notified of any records that were deleted since they last harvested.
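In OAI-PMH 2.0, a deleted record is reported with a header carrying status="deleted". A harvester might separate live and deleted identifiers along these lines. The function name `split_deleted` is hypothetical, and the sketch only covers parsing a response that has already been downloaded.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def split_deleted(response_xml):
    """Return (live, deleted) identifier lists from one OAI-PMH response."""
    root = ET.fromstring(response_xml)
    live, deleted = [], []
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", default="",
                                     namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted
```

The deleted identifiers can then be used to remove the corresponding records from the harvester's local copy.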
Mixed content is removed from Dublin Core, dcterms, MARC, and MODS field outputs in the OAI-PMH response (e.g., a scope note mapped to a DC description field would not include <p>, <abbr>, <address>, <archref>, <bibref>, <blockquote>, <chronlist>, <corpname>, <date>, <emph>, <expan>, <extptr>, <extref>, <famname>, <function>, <genreform>, <geogname>, <lb>, <linkgrp>, <list>, <name>, <note>, <num>, <occupation>, <origination>, <persname>, <ptr>, <ref>, <repository>, <subject>, <table>, <title>, <unitdate>, <unittitle>).
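To give a sense of what removing mixed content means for a harvested field, the sketch below strips inline tags from a note and keeps only its text. It illustrates the effect only and is not how ArchivesSpace implements it. The function name `strip_mixed_content` and the sample note are made up for this example.

```python
import xml.etree.ElementTree as ET


def strip_mixed_content(note_xml):
    """Return the text of a note with all inline tags removed."""
    # Wrap the fragment so it parses even with several top-level elements.
    root = ET.fromstring("<wrapper>" + note_xml + "</wrapper>")
    text = "".join(root.itertext())
    return " ".join(text.split())  # collapse runs of whitespace


sample = "<p>Letters from <persname>Jane Doe</persname>, <date>1901</date>.</p>"
print(strip_mixed_content(sample))
```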
Component-level records include data inherited from higher levels of the finding aid hierarchy. Which elements are inherited is determined by the institution's system configuration (editable in the config/config.rb file), as implemented for the Public User Interface.
ARKs have not yet been implemented, pending more discussion of how they should be formulated.