1. Set of documents
This document is one of the specifications produced by the Mellon and ErfgoedPod project:
- Data Pod (this document)
2. Introduction
In a Solid decentralized network, data is stored in a distributed network of data pods. Data stored on these data pods is made available over the Web with unique identifiers, enabling other actors and applications on the network to interact with the available resources without the need for a centralized service. To keep track of the lifecycle of, and the interactions with, the resources published on a data pod, requirements have to be specified that any data pod implementation must meet to support this functionality. In this document, we define the required functionality for a data pod implementation that can incorporate Event information for all published resources on the data pod.
3. Definitions
This document uses the following defined terms from [spec-overview]:
4. High-level overview
A Solid data pod is a personal data space on the Web. An actor can use this data space to store and share resources over the Web, and receive notifications from other actors in an advertised inbox resource directory. To track the lifecycle of resources published on a data pod in decentralized networks, and the interactions with those resources, a data pod implementation must provide functionality for the storage and discovery of these events for their respective resources. In § 6 Resource storage, we define the base requirements for the storage of resources on the network. In § 7 Resource Versioning, we define how resource versioning can be handled on the data pod. In § 8 Resource Event Information, the requirements are defined for the storage and discovery of an Event Log for a given resource on the data pod. In § 9 Notifications, the requirement is defined for the data pod to be a Linked Data Notification Receiver.
5. Creating a data pod
A Solid data pod MUST be deployable as a local background process or as a remote web service. If a data pod is deployed as a local process, the data pod instance should be connected to the Web at all times if it is to remain permanently discoverable in the network. If a data pod is deployed as a remote web service, an actor in the network MUST be able to create their own data pod using this web service.
6. Resource storage
6.1. Indexing
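One way to picture index-based discovery is a lookup in a Solid Type Index, which maps RDF classes to the locations where instances of that class are stored. The sketch below is illustrative only. It assumes the type index document has already been fetched and parsed into plain (subject, predicate, object) tuples, and the function name `find_registered_locations` and all `pod.example` URLs are hypothetical. The `solid:` terms are the ones used by type registrations.

```python
SOLID = "http://www.w3.org/ns/solid/terms#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def find_registered_locations(triples, wanted_class):
    """Return the resources or containers registered for wanted_class.

    `triples` is an iterable of (subject, predicate, object) string tuples,
    assumed to be the already-parsed content of a public type index.
    """
    triples = list(triples)
    # Step 1: collect every subject typed as a solid:TypeRegistration.
    registrations = {
        s for s, p, o in triples
        if p == RDF_TYPE and o == SOLID + "TypeRegistration"
    }
    # Step 2: keep only the registrations made for the requested class.
    matching = {
        s for s, p, o in triples
        if s in registrations and p == SOLID + "forClass" and o == wanted_class
    }
    # Step 3: a registration points to a single resource (solid:instance)
    # or to a container holding such resources (solid:instanceContainer).
    location_predicates = (SOLID + "instance", SOLID + "instanceContainer")
    return [
        o for s, p, o in triples
        if s in matching and p in location_predicates
    ]


# Hypothetical type index content for illustration.
REGISTRATION = "https://pod.example/settings/publicTypeIndex#works"
sample_index = [
    (REGISTRATION, RDF_TYPE, SOLID + "TypeRegistration"),
    (REGISTRATION, SOLID + "forClass", "http://schema.org/CreativeWork"),
    (REGISTRATION, SOLID + "instanceContainer", "https://pod.example/works/"),
]

locations = find_registered_locations(
    sample_index, "http://schema.org/CreativeWork"
)
```

A real client would obtain the triples with an RDF parser rather than hand-written tuples; the lookup logic stays the same.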
Any resource stored on the data pod should be included in an index on the data pod to enable discovery of the resources by applications and actors in the network. A first indexing method is to make use of a Type Index. By indexing the resources in a public type index, actors and applications in the network can discover resources of a specific type. A second method is to index the resources using a Shape Tree. The shape tree specification defines where resources matching a specific shape are stored on the data pod. By parsing the available shape tree, any actor or application on the network can discover resources for a specific shape.
6.2. Metadata
The storage of resource metadata for non-RDF resources is desirable on a Solid pod implementation. Resource metadata can be added to non-RDF resources by creating a resource with the .meta extension, according to the Solid specification.
7. Resource Versioning
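A minimal sketch of URI-based versioning with DCAT version links, the approach this section describes for cases where specific versions must be referenceable. It is an illustration under stated assumptions, not a prescribed implementation: the `?version=` URI scheme, the function name `link_new_version`, and the `pod.example` URLs are hypothetical, and triples are modelled as plain string tuples rather than through an RDF library.

```python
DCAT = "http://www.w3.org/ns/dcat#"


def link_new_version(resource_url, previous_version_url, version_label):
    """Mint a URI for a new version and return it with its linking triples.

    The returned triples would be written to the resource itself when it is
    an RDF resource, or to its metadata resource otherwise.
    """
    # Hypothetical URI scheme; any scheme that yields a distinct,
    # stable URI per version would serve the same purpose.
    new_version_url = f"{resource_url}?version={version_label}"

    triples = [
        # The resource lists this version and marks it as the current one.
        (resource_url, DCAT + "hasVersion", new_version_url),
        (resource_url, DCAT + "hasCurrentVersion", new_version_url),
        # The version label is a literal, kept here as a plain string.
        (new_version_url, DCAT + "version", version_label),
    ]
    if previous_version_url is not None:
        # Chain versions so a client can walk the history backwards.
        triples.append(
            (new_version_url, DCAT + "previousVersion", previous_version_url)
        )
    return new_version_url, triples


first_url, first_triples = link_new_version(
    "https://pod.example/docs/report", None, "1"
)
second_url, second_triples = link_new_version(
    "https://pod.example/docs/report", first_url, "2"
)
```

When a later version is added, the earlier `dcat:hasCurrentVersion` statement would need to be removed from the stored data; the sketch only produces the statements for the new version.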
The data pod may provide functionality to support versioning of stored resources. Support for such versioning is not built into the Solid specification. Resource versioning in a Solid pod environment can be implemented as defined by the Fedora API specification, which is based on the Memento protocol. In cases where it is important to be able to reference specific versions of a resource, versioning may be handled by generating a new URI for each resource version. Version linking can be handled using the DCAT vocabulary, by adding the versioning information directly to the resource in the case of an RDF resource, or to the metadata resource in the case of non-RDF resources.
8. Resource Event Information
The data pod may provide functionality to store Event data related to resources stored on the data pod. To provide this functionality, the data pod must implement the Event Log specification. This specification dictates how event-related data must be stored on the data pod, and how it can be discovered by external actors in the network.
9. Notifications
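A minimal sketch of the sender side of the Linked Data Notifications exchange this section relies on: the sender discovers the inbox advertised for a resource through the `http://www.w3.org/ns/ldp#inbox` link relation and then POSTs a JSON-LD payload to it. The code is illustrative only. The function names, the `pod.example` URLs, and the choice of an ActivityStreams `Announce` payload are assumptions, the Link header parsing is deliberately simplified, and the request is built but not sent.

```python
import json
import re
from urllib.parse import urljoin
from urllib.request import Request

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox(link_header, resource_url):
    """Return the inbox URL advertised in an HTTP Link header, or None."""
    # Each link-value looks like: <target>; rel="relation types"
    for target, params in re.findall(r"<([^>]*)>([^,<]*)", link_header):
        match = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if match and LDP_INBOX in match.group(1).split():
            # The target may be relative to the resource it was found on.
            return urljoin(resource_url, target)
    return None


def build_notification(inbox_url, actor, obj):
    """Build (but do not send) the POST request that delivers a notification."""
    body = {
        # Assumption: ActivityStreams 2.0 is used as the payload vocabulary.
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": obj,
    }
    return Request(
        inbox_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


# Hypothetical Link header, as it might be returned for a resource on a pod.
link_header = (
    '<https://pod.example/inbox/>; rel="http://www.w3.org/ns/ldp#inbox", '
    '<.meta>; rel="describedby"'
)
inbox = find_inbox(link_header, "https://pod.example/docs/report")
request = build_notification(
    inbox,
    actor="https://alice.example/profile/card#me",
    obj="https://pod.example/docs/report",
)
```

Passing `request` to `urllib.request.urlopen` would deliver the notification. An inbox can also be advertised in the RDF body of the resource, which this sketch does not cover.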
In the Solid ecosystem, notifications serve as the main communication mechanism in a network of Solid pods. These notifications follow the Linked Data Notifications (LDN) specification. Any Solid pod in the network serves as an LDN Receiver, and consequently has the ability to receive notifications from any actor in the network. By defining an inbox on a resource, notifications for different resources can be directed towards different inboxes defined on the data pod. The actor managing the Solid pod may choose to process the incoming notifications manually, or can automate this process through an external Orchestrator service. A Linked Data Notification carries no guarantee that a response is given or that any action is undertaken. Any action required on receiving specific notifications has to be defined and enforced separately.
10. Spec roadmap
10.1. September
- Work out requirements for resource versioning in Solid.
  - Is time-based versioning useful for our use cases?
  - What is the best approach to version tagging for specific resources and discovery?
  - How can we make sure events can be tagged for specific resource versions even on remote actors?
- Work out data discovery more.
  - Work out the requirements for shape trees according to the interop spec.
11. Acknowledgement
We thank Herbert Van de Sompel (DANS + Ghent University, hvdsomp@gmail.com) for his valuable input during this project.