Use Cases and Requirements for the Mellon Project

Living Standard,

This version:
https://mellonscholarlycommunication.github.io/mellon-specification/
Issue Tracking:
GitHub
Editor:
Ruben Dedecker (UGent - Imec)

Abstract

Use Cases and Requirements for the Mellon Project.

1. Introduction

The goal of the Mellon project is to design a framework that enables scholarly communication over a decentralized network, where researchers can retain ownership and control over their published research. In order to design this framework, different approaches to distributed networks have to be considered, and technical limitations have to be overcome.

In § 2 Scholarly communication, the requirements and vision for scholarly communication are explained. In § 3 Architecture of a Decentralized Network for Scholarly Communication, an architecture is proposed for scholarly communication over decentralized networks based on pod-based data storage. The artefacts required for scholarly communication are listed in § 4 Artefacts of scholarly communication. For the proposed architecture, concrete use-cases are worked out in § 8 Use Cases. From these use-cases, requirements are derived in § 9 Requirements.

2. Scholarly communication

2.1. Core functions of scholarly communication

For scholarly communication to be successful, four core functions have to be fulfilled.
Function       Explanation
Registration   Allowing claims of precedence for a scholarly finding
Certification  Establishing the validity of a registered scholarly claim
Awareness      Enabling actors in the scholarly system to remain aware of new claims and findings
Archiving      Preserving the scholarly record over time

These four functions are required in the network to be able to do scholarly communication. In the context of a decentralized network, registration can also be viewed as a way to link researchers to their institutions, allowing for trust in the validity of the researcher and their research.

2.2. Researcher-centric scholarly communication model

The goal is that researchers’ pods are hosted by their respective institutions, as part of the infrastructure provided in support of research and education. When a researcher moves on to another institution, the pod moves along and becomes hosted by the new institution. Researchers without institutional affiliation or with multiple affiliations can opt for commercial pod hosting platforms, national/regional academic hosting provisions that can be assumed to emerge, or host the pod themselves. Over time, a researcher’s pod accumulates an overview of scholarly contributions made throughout a career path.

2.3. The role of institutions in the decentralized network

The role of institutions in the decentralized network is to enforce policies in the network that make sure that all functions of scholarly communication can be executed for created artefacts in the network. Affiliated researchers are required to adopt the institution policies.

2.4. Trusting actors in the decentralized network

Actors in the network require an approach that allows them to trust other actors present in the network. In the case of scholarly communication, we expect the researchers to trust the research institution they are affiliated with. As the institution provides policies used by the orchestrator that define the called services for the created scholarly artefacts, the researcher can trust the choices of the institution. When submitting to external institutions such as conferences, it is up to the researcher to trust the additional policies set by these conferences. The evaluated policies are stored both in the event log of the orchestrator and in the called services, and can always be checked by all actors in the network.

3. Architecture of a Decentralized Network for Scholarly Communication

A decentralized, decoupled scholarly communication ecosystem with researcher pod, scholarly dashboard, orchestrator, Service Hubs, scholarly browser.

3.1. Researcher pod

The researcher pod is the main hub for a researcher. It stores scholarly artefacts and interactions, and is used to exchange information with external service providers and institutions.

A researcher pod stores the following information:

Information type                      Explanation
Personal contributions                Scholarly contributions (artifacts and interaction artifacts) made by the researcher and descriptive metadata for them. This information is recorded using the researcher’s preferred authoring applications.
Functions of scholarly communication  Lifecycle event metadata pertaining to the fulfillment of the functions of scholarly communication for the researcher’s personal contributions. Through the intermediation of the Orchestrator, this information is obtained via Service Hubs of platforms that fulfil the respective functions.
Peer contributions                    Pertinent metadata about selected interaction artifacts created by peers and pertaining to the researcher’s personal contributions. This information is obtained via an Awareness Service Hub (the one used for the interaction artifact) through the intermediation of the peers’ Orchestrator.
Social interactions                   A record of the informal interactions the researcher has with peers via available social network features that are enabled by Solid pods.

3.2. Orchestrator

The orchestrator is a component in the network that is responsible for orchestrating the control flow of the network. The orchestrator itself is a relatively small component. It is responsible for the retrieval and the execution of policies in the network, and may have some built-in functionality such as creating an event log.

These policies are rules that are imposed by actors in the network, such as the research institution and even the researcher. This is discussed in § 5.3 Policy sources. The policies that are used by an orchestrator define the control flow in the network for artefacts submitted to the orchestrator.

The orchestrator component exposes an inbox to the network. This enables actors in the network to interact with the orchestrator using Linked Data Notifications. On receiving a Linked Data Notification in the inbox, the orchestrator executes the relevant policies for the received input.
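
As a rough illustration of the inbox behaviour described above, the following Python sketch models an orchestrator that accepts a notification and runs every policy whose condition matches. The `Orchestrator` and `Policy` classes, the predicate-style condition and the string results are assumptions made for this example; the specification does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Notification = Dict[str, object]


@dataclass
class Policy:
    """Hypothetical policy: a condition plus the action to run when it matches."""
    name: str
    condition: Callable[[Notification], bool]
    action: Callable[[Notification], str]


@dataclass
class Orchestrator:
    """Hypothetical orchestrator with an inbox handler and a built-in event log."""
    policies: List[Policy]
    event_log: List[str] = field(default_factory=list)

    def receive(self, notification: Notification) -> List[str]:
        """Handle one notification posted to the inbox."""
        results = []
        for policy in self.policies:
            if policy.condition(notification):
                results.append(policy.action(notification))
                # Record which policy was executed for this input.
                self.event_log.append(f"{policy.name} executed")
        return results


register = Policy(
    name="register-new-artefacts",
    condition=lambda n: n.get("type") == "Create",
    action=lambda n: f"registration requested for {n['object']}",
)
orchestrator = Orchestrator(policies=[register])
outcome = orchestrator.receive(
    {"type": "Create", "object": "http://example.org/artefacts/artefact1"}
)
```

In this sketch a notification that matches no policy simply yields an empty result list; whether a real orchestrator should reject or log such input is left open here.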

3.2.1. Orchestrator design

3.3. Service hubs

Service hubs provide services to the network. These services are necessary to enable the execution of the functions of scholarly communication by the network. Service hubs are a way to decentralize functionality that is currently mostly confined to centralized services. Orchestrators can choose among all the service hubs available in the network, and can require affiliated researcher pods to use one or more of them to fulfil a function of scholarly communication.

3.4. Collector

A Collector is a service that collects and indexes information on the artefacts in the network. It collects this information from the available service hubs in the network. (It might make sense to also allow collectors to retrieve information directly from researcher pods. Otherwise, interaction artefacts have to pass through a service hub from which a collector in the network can retrieve them. Direct retrieval would also let a collector bypass service hubs that provide the required information only partly, or not at all.) The collector stores information about the available artefacts, and the linked lifecycle events and interaction artefacts. It provides an outward API that enables querying of the artefact information in the network. It can also provide indexing information to enable clients to filter over the exposed data. (This might be an interesting use case for event streams!) In the case of multiple distributed networks, collectors can collect data from, or refer to, collectors in other networks to enable evaluating queries over all linked networks.
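
To make the collector's role more concrete, here is a minimal in-memory sketch in Python. The `Collector` class, its method names and the flat event dictionaries (with `type` and `target` keys) are hypothetical simplifications; a real collector would retrieve events from service hubs and expose a query API over the network.

```python
from collections import defaultdict
from typing import Dict, List


class Collector:
    """Hypothetical in-memory collector indexing lifecycle events per artefact."""

    def __init__(self) -> None:
        self._events: Dict[str, List[dict]] = defaultdict(list)

    def collect(self, events: List[dict]) -> None:
        """Index lifecycle events retrieved from a service hub."""
        for event in events:
            self._events[event["target"]].append(event)

    def events_for(self, artefact_id: str) -> List[dict]:
        """Return all known lifecycle events for one artefact."""
        return list(self._events.get(artefact_id, []))

    def artefacts_with(self, event_type: str) -> List[str]:
        """Return, sorted, the artefacts that have an event of the given type."""
        return sorted(
            artefact
            for artefact, events in self._events.items()
            if any(e["type"] == event_type for e in events)
        )


collector = Collector()
collector.collect([
    {"type": "Creation", "target": "http://example.org/artefacts/artefact1"},
    {"type": "Review", "target": "http://example.org/artefacts/artefact1"},
    {"type": "Creation", "target": "http://example.org/artefacts/artefact2"},
])
reviewed = collector.artefacts_with("Review")
```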

3.5. Scholarly Dashboard

The scholarly dashboard is the interface that can be used by researchers to execute the lifecycle events of an artefact (publish an artefact, review an artefact, interact with an artefact, ...). (This dashboard may also include functionality for direct interactions between researchers; however, this is out of scope for a basic scholarly dashboard.) A scholarly dashboard can be managed by an institution, requiring users to provide login credentials of said institution. (Or be from external parties providing other possibilities / interactions?)

3.6. Scholarly Browser

The scholarly browser is the interface that can be used by anyone to query the distributed network for information on scholarly artefacts. It provides information about the artefact, as well as lifecycle events and the available interactions with the artefact.
An example control flow graph for creating and registering an artefact.
An example control flow graph for the orchestrator triggering the peer review certification process after registration of the artefact has completed.

4. Artefacts of scholarly communication

The functions of scholarly communication produce artefacts. These artefacts need to be stored on the relevant locations, and should be retrievable by the appropriate actors to fulfil the functions of scholarly communication.

4.1. Artefacts

This section lists and explains the artefacts of scholarly communication that are stored and generated by the decentralized network.

4.1.1. Research Artefacts

These artefacts are the result of research done by a researcher.

(This list is non-exhaustive)

Research artefacts
Paper (preprint)  A research paper (preprint) and the corresponding metadata.
Supporting data   Images, datasets, software, ..., supporting the research, and their respective metadata.
Research objects  Research objects (Research Object Crates) are an approach to packaging research data and metadata. This may be useful to create a unified semantic layer over published components of research.

4.1.2. Interaction Artefacts

These artefacts are the result of interactions on artefacts by actors in the network.

(This list is non-exhaustive)

Interaction artefacts
Comment        A comment made on an artefact in the network and its metadata (creator, timestamp, ...).
Proposed edit  A proposed edit on an artefact in the network and its metadata.
Review         A review by an actor in the network of a research artefact in the network.

4.1.3. Artefact Lifecycle Events

Artefact lifecycle events provide metadata about the events that happen in the lifecycle of an artefact. This metadata enables actors in the network to follow the events in the artefact lifecycle step by step, and to retrieve all information relevant to the artefact.
Artefact lifecycle events
Creation      The creation of a research project (with initial data?) by an actor in the network.
Publication   The publication of research (in the form of a paper publication / research object / ...) by an actor (individual researcher / institution / ...) in the network.
Review        An artefact in the network has been reviewed by an actor in the network.
Update        An artefact in the network has been updated by an actor in the network.
Reference     An artefact in the network has been referenced by another artefact in the network.
Subscription  An actor in the network has subscribed to an artefact in the network.
Interactions  An actor in the network has interacted with an artefact in the network.
artifacts, interaction artifacts, descriptive metadata, event metadata

4.2. Subscribing to artefacts in the network

Actors inside (and outside) the network may be interested in following specific research topics, authors, and so on. This is a core function of scholarly communication: Awareness. For scholarly communication, it should be possible to subscribe to all lifecycle events of artefacts in the network (for which an actor has the correct permissions). This can be handled by the Service Hubs for awareness that are available in the network. Collectors present in the network can also act as awareness service hubs, as they index lifecycle information of all artefacts present in the network. These Service Hubs for awareness should be equipped to enable actors in the network to subscribe to specific lifecycle information for research artefacts that match given filter criteria. E.g. an actor should be able to subscribe to creation lifecycle events of artefacts tagged with "Scholarly Communication". The service hub should be able to advertise the available options for filtering research, and the lifecycle information which can be subscribed to.
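
The filter-based subscription described above could be sketched as follows. The `Subscription` fields and the event layout (`type` plus a list of `tags`) are assumptions made for illustration only; the specification does not define a subscription format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subscription:
    """Hypothetical subscription: one lifecycle event type filtered by one tag."""
    subscriber: str
    event_type: str
    tag: str


def subscribers_for(event: dict, subscriptions: List[Subscription]) -> List[str]:
    """Return the subscribers whose filter criteria match the lifecycle event."""
    return [
        s.subscriber
        for s in subscriptions
        if s.event_type == event["type"] and s.tag in event["tags"]
    ]


subscriptions = [
    Subscription("https://alice.example/inbox", "Creation", "Scholarly Communication"),
    Subscription("https://bob.example/inbox", "Review", "Scholarly Communication"),
]
creation_event = {
    "type": "Creation",
    "target": "http://example.org/artefacts/artefact1",
    "tags": ["Scholarly Communication", "Decentralization"],
}
to_notify = subscribers_for(creation_event, subscriptions)
```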

5. Policies

In a decentralized network setting, policies can be used to dictate the control flow of the network in a dynamic way. A policy dictates a function that should be executed for an input matching a certain condition. In the case of Scholarly Communication, a policy defines a service call that should be done for an input scholarly artefact matching a given form. By adapting the policies, the control flow in the network can be changed dynamically. Enforcing a set of policies MUST result in an unambiguous control flow.

5.1. Policy Format

As multiple actors in the network must create and process policies, there is a requirement for a unified format for these policies. The choice of format MUST enable orchestrating entities in the network to enforce the policy, and provide proof of enforcement. The format must also help resolve problems if conflicting policies are

5.1.1. Triggers

A policy can be seen as a trigger, that executes a function when the trigger condition is fulfilled. A trigger contains:

On receiving an input, the orchestrating entity will try to match the input with the data shapes of the retrieved policies (triggers). In case of multiple matching triggers, all triggers are executed according to the available policy priority information. The trigger execution consists of a function call to the indicated service.
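
The matching step above can be sketched in Python as follows. Everything in this block is an assumption for illustration: the `Trigger` fields, the use of required key/value pairs as a stand-in for a data shape, and the convention that a lower priority number is executed first.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trigger:
    """Hypothetical trigger: a shape to match, a service to call, a priority."""
    shape: Dict[str, str]  # required key/value pairs (stand-in for a data shape)
    service: str           # URL of the service to call when the shape matches
    priority: int          # assumption: lower numbers are executed first


def matching_services(artefact: Dict[str, str], triggers: List[Trigger]) -> List[str]:
    """Return the services to call for an input, ordered by trigger priority."""
    matches = [
        t for t in triggers
        if all(artefact.get(key) == value for key, value in t.shape.items())
    ]
    return [t.service for t in sorted(matches, key=lambda t: t.priority)]


triggers = [
    Trigger({"type": "Paper"}, "https://certification.example/review", 2),
    Trigger({"type": "Paper"}, "https://registration.example/register", 1),
    Trigger({"type": "Dataset"}, "https://archive.example/store", 1),
]
ordered = matching_services({"type": "Paper", "title": "Example"}, triggers)
```

Here both "Paper" triggers match and registration is called before certification; the "Dataset" trigger is ignored because its shape does not match the input.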

5.2. Policy enforcement

Policies in the network MUST be enforced by the orchestrating entities in the network. In this network architecture, the Orchestrator is responsible for enforcing policies in the network.

5.2.1. Proof of policy enforcement

In some cases, proof of enforcement may be required. E.g. A research institution requires a certain set of policies to be enforced for affiliated researchers, and should be able to verify if the required policies have been enforced on artefact submission. For this, the orchestrating instance is required to store an event-log of the processed requests and the enforced policies. Through this event log, it MUST be possible to prove that all required policies were enforced for every step. This event log MUST be stored redundantly, to prevent and discover tampering.
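
One possible way to make tampering with such an event log discoverable is to chain the entries by hash, so that changing an earlier entry invalidates it and everything recorded after it. The sketch below assumes this technique; the specification itself only requires redundant storage, and the entry fields (`request`, `policies`, `previous`, `hash`) are hypothetical.

```python
import hashlib
import json
from typing import List

FIELDS = ("request", "policies", "previous")


def _digest(body: dict) -> str:
    """Hash the canonical JSON form of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: List[dict], request_id: str, policies: List[str]) -> None:
    """Append an entry that records the enforced policies and links to its predecessor."""
    previous = log[-1]["hash"] if log else ""
    body = {"request": request_id, "policies": policies, "previous": previous}
    log.append({**body, "hash": _digest(body)})


def verify(log: List[dict]) -> bool:
    """Check that no entry was altered and that the chain is unbroken."""
    previous = ""
    for entry in log:
        body = {key: entry[key] for key in FIELDS}
        if entry["previous"] != previous or entry["hash"] != _digest(body):
            return False
        previous = entry["hash"]
    return True


log: List[dict] = []
append_entry(log, "request-1", ["register"])
append_entry(log, "request-2", ["register", "peer-review"])
```

An institution holding a redundant copy of the log could run `verify` on it and compare the final hash with the orchestrator's copy.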

5.3. Policy sources

In the decentralized network, multiple actors may require policies to be enforced. In the case of scholarly communication, the institution, the faculty, the research group and the researcher can all impose policies to be enforced for scholarly artefacts.

5.3.1. Policy inheritance

Policy inheritance enables the extension of a set of policies with additional policies. This is required, for example, in the case of a research institution with multiple faculties. The institution can provide a core set of policies for the institution, which is extended for each faculty with specific policies relevant to the artefacts created in that faculty. This is achieved by referencing the inherited policies with a predefined policy-inheritance predicate. This way, the orchestrating instance can retrieve the chain of policies.
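
Retrieving the chain of policies could look roughly like the sketch below. The policy documents are modelled as an in-memory dictionary keyed by URL, and the `inherits` key stands in for the policy-inheritance predicate; both are assumptions, as the specification does not name the predicate or the document format.

```python
from typing import Dict, List

# Hypothetical policy documents; "inherits" stands in for the
# predefined policy-inheritance predicate.
POLICY_DOCUMENTS: Dict[str, dict] = {
    "https://institution.example/policies": {
        "inherits": None,
        "policies": ["register"],
    },
    "https://institution.example/faculty/policies": {
        "inherits": "https://institution.example/policies",
        "policies": ["peer-review"],
    },
}


def resolve_chain(url: str, documents: Dict[str, dict]) -> List[str]:
    """Follow the inheritance references and return all policies, most general first."""
    chain = []
    seen = set()
    current = url
    while current is not None:
        if current in seen:
            raise ValueError(f"cyclic policy inheritance at {current}")
        seen.add(current)
        document = documents[current]
        chain.append(document["policies"])
        current = document["inherits"]
    # The walk starts at the most specific set, so reverse it.
    return [policy for policies in reversed(chain) for policy in policies]


faculty_policies = resolve_chain(
    "https://institution.example/faculty/policies", POLICY_DOCUMENTS
)
```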

5.4. Policy Priority

When multiple sources provide sets of policies to the orchestrating instance, multiple policies may be retrieved that match for a received artefact. In the cases where the order of execution of policies is important, priority information can be required for policies in the network. In case of multiple matching policy rules for a received artefact, the priority information can be used to decide the order of policy evaluations.

6. Events

To communicate events in the decentralized network, systems must be in place for the creation, distribution and processing of these events.

6.1. Lifecycle events

In the context of scholarly communication, events are used to communicate lifecycle events of artefacts in the network. These events define the lifecycle of an artefact in the network. External actors in the network can use these events to recreate the lifecycle of the artefact at the appropriate services in the network.

Using the ActivityStreams 2.0 specification, we can represent a lifecycle event as follows:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "summary": "Creation of artefact registration lifecycle event",
  "type": "Announce",
  "published": "2015-02-10T15:04:55Z",
  "actor": {
   "type": "Service",
   "id": "http://www.servicehub.example/registrationservice",
   "name": "Example Service Hub Registration Service",
   "url": "http://www.servicehub.example/",
   "image": {
     "type": "Link",
     "href": "http://www.servicehub.example/logo.jpg",
     "mediaType": "image/jpeg"
   }
  },
  "object" : {
   "id": "http://www.servicehub.example/registrationservice/registrations/tokens/artefact1",
   "type": "https://example.com/vocabulary#Registration",
   "name": "Registration of artefact 1"
  },
  "target" : {
   "id": "http://example.org/artefacts/artefact1",
   "type": "Document"
  }
}

A lifecycle event consists of the following terms:

Lifecycle event
Term    Description
actor   The actor responsible for the lifecycle event.
object  The lifecycle event.
target  The artefact that is the subject of the lifecycle event.

Quick note: The term subject makes more sense than target to describe the artefact of the lifecycle event. However, the AS vocabulary defines the subject term only in the context of relationships.
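
For illustration, a lifecycle event with the actor, object and target terms described above could be assembled programmatically as in the following Python sketch. The `lifecycle_event` helper and its parameters are hypothetical; only the ActivityStreams 2.0 property names are taken from the example above.

```python
import json
from datetime import datetime, timezone


def lifecycle_event(actor_id: str, object_id: str, object_type: str,
                    target_id: str, published: datetime) -> dict:
    """Build a lifecycle event shaped like the ActivityStreams 2.0 example above."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        # Timestamps are serialized in UTC with a trailing "Z".
        "published": published.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "actor": {"type": "Service", "id": actor_id},
        "object": {"id": object_id, "type": object_type},
        "target": {"id": target_id, "type": "Document"},
    }


event = lifecycle_event(
    actor_id="http://www.servicehub.example/registrationservice",
    object_id="http://www.servicehub.example/registrationservice/registrations/tokens/artefact1",
    object_type="https://example.com/vocabulary#Registration",
    target_id="http://example.org/artefacts/artefact1",
    published=datetime(2015, 2, 10, 15, 4, 55, tzinfo=timezone.utc),
)
serialized = json.dumps(event, indent=2)
```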

6.2. Lifecycle event distribution

To distribute lifecycle events in a decentralized network, the Linked Data Notifications specification is used.
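
As a sketch of what a sender might do, the Python code below extracts an inbox URL from an HTTP `Link` header using the `http://www.w3.org/ns/ldp#inbox` relation and prepares a POST request carrying the event as JSON-LD. The helper names are hypothetical, the header parsing is deliberately simplified (it assumes `rel` is the first parameter and that URLs contain no commas), and the request is only constructed here, not sent.

```python
import json
import re
from typing import Optional
from urllib.request import Request

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def inbox_from_link_header(link_header: str) -> Optional[str]:
    """Return the inbox URL advertised in a Link header, if any (simplified parser)."""
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]*)>\s*;\s*rel="([^"]*)"', part)
        if match and LDP_INBOX in match.group(2).split():
            return match.group(1)
    return None


def notification_request(inbox_url: str, event: dict) -> Request:
    """Prepare (but do not send) the POST request that delivers the notification."""
    return Request(
        inbox_url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


header = '<http://example.org/inbox/>; rel="http://www.w3.org/ns/ldp#inbox"'
inbox = inbox_from_link_header(header)
request = notification_request(inbox, {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Announce",
})
```

Passing the prepared request to `urllib.request.urlopen` would deliver it; error handling, authentication and inbox discovery from the resource body are left out of this sketch.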

6.3. Lifecycle event processing

6.4. Service Interface

As Linked Data Notifications are used as the medium to interact with services in the network, it might be useful to have services advertise the required notification shape for interactions with a service. Currently no ideas are offered to include this functionality, but it might be useful for future reference.

7. Scholarly Communication context

7.1. Actors in the network

Actors in the network
Abbr.  Term              Description
U      User              A user querying information over the scholarly communication network.
R      Researcher        A user adding information to, or interacting with information in, the scholarly communication network.
I      Institution       Organisation that enforces policies on managed actors.
S      Service Provider  Provider of services in the network.

7.2. Objects in the network

Objects in the network
Abbr.  Term                  Description
O      Object                Generic representation of an entity.
A      Artefact              Object that is the result of scholarly communication and can be recorded.
M      Metadata              Metadata about an artefact.
IA     Interaction Artefact  Object that is the result of an interaction with an Artefact.
LE     Lifecycle Event       Object that is an event in the lifecycle of an Artefact.
R      Record                Record of the scholarly communication in the network.
P      Policy                Policy providing a set of regulations that must be enforced.

7.3. Roles in the network

Roles in the network
Abbr.  Term                  Description
C      Consumer              Entity that queries the network for scholarly communication information.
P      Publisher             Entity that publishes new artefacts to the network.
O      Orchestrator          Entity that retrieves and enforces policies to orchestrate the control flow of the network.
SHP    Service Hub Provider  Entity that provides a Service hub to the network.
B      Collector (Browser)   Entity that collects the scholarly communication data present in the network.
REC    Scholarly Record      Record of scholarly communication happening in the network. Can be queried for information on the scholarly record of the network.

8. Use Cases

The following scenarios aim at capturing what an architecture based on Decentralized Web should enable for users. In this context, the users are the different stakeholders that are active in the network:

We use the following template:

Actors/Roles    Actors can be people (roles), other systems, or time directly interacting with the process.
Goal            The final successful outcome that completes the process.
Stakeholders    Anybody with an interest or investment in how the system performs.
Preconditions   The elements that must be true before a use case can occur.
Triggers        The events that cause the use case to begin.
Postconditions  What the system should have completed by the end of the steps.
Procedure       The process and steps taken to reach the end goal, including the necessary functional requirements and their anticipated behaviors.
Requirements    Requirements for the use-case to be achievable.

8.1. Joining a scholarly communication network

In order to create or interact with artefacts in a scholarly communication network, a user must have a valid data pod environment to store these (interaction) artefacts. In order for these artefacts to be taken into the scholarly record, they have to be passed to the respective service hubs that fulfil these functions. For this to be possible, the data pod must have access to an orchestrator in the network, that will ensure that the created artefacts can follow the correct steps to be included in the scholarly record.

8.1.1. Creating a data pod

Bob wants to interact with published artefacts in the network, but does not have a data pod available to store these interactions. Any scholarly dashboard used to interact with artefacts in the network SHOULD provide an option to log in to an existing pod, register an existing data pod or create a new data pod. Bob chooses the option to create a new pod, and is redirected to an environment where he can create a new data pod. (The available environments can be selected by the scholarly dashboard, as the use of specific platforms can be advertised / enforced.)

8.1.2. Participating in the network with an existing data pod

Bob wants to interact with published artefacts in the network, and has a data pod available. Bob chooses the scholarly dashboard's option to log in with an existing data pod. The institution managing the scholarly dashboard MAY require Bob to authenticate himself to the institution before allowing use of the dashboard. On authentication, Bob is given access to the functionality available in the scholarly dashboard, and interactions with artefacts may be handled by an orchestrator already linked in Bob’s pod, or by an orchestrator advertised / enforced by the scholarly dashboard.
Actors/Roles Actors:
  • Researcher

  • Institution

Components:

  • Dashboard I

  • Orchestrator I

  • Researcher pod R

Goal
  • The researcher pod has access to an orchestration service in the network.

  • The managing institution of the orchestrator is aware of the researcher pod, and can link an affiliated person to the pod.

  • The researcher can now interact with artefacts on the network.

  • The researcher can now create new artefacts in the network and use the scholarly communication functions for the created artefact.

Stakeholders
  • Institution

  • Researcher (pod-owner)

Preconditions
  • The researcher has a pod available.

  • The researcher has access to a scholarly dashboard (optional).

  • The researcher can authenticate himself to the dashboard if required (is affiliated to the managing institution, ...).

  • Or the user can authenticate himself in another way and gain access to the orchestrator that way (without using the dashboard).

Triggers
  • A Researcher wants to join the network.

Postconditions
  • The researcher can make use of the orchestrator, and has access to all required functions for scholarly communication.

  • The orchestrator in question can authenticate the user as an affiliated person. (It can assure to the network to trust this user?)

Procedure
  1. The researcher logs in to the scholarly dashboard OR the researcher connects to an interface that enables them to retrieve information about available orchestrators in the network.

  2. The user retrieves the orchestrator information (via the dashboard).

Requirements
  • Pod environment

  • Scholarly dashboard

  • Authentication mechanism for pod owner wishing to join the network (= make use of orchestrator)

8.2. The researcher profile

On initial participation in a scholarly communication network, actors in the network MAY enforce that connecting data pods require a researcher profile to be complete, in order to validate affiliations to certain institutions. Any scholarly dashboard used to interact with the scholarly communication network SHOULD provide functionality / redirect to services for researchers to create their profile. Institutions SHOULD provide functionality for researchers to receive a token of sorts proving their affiliation to the institution, that can be used by actors in the network to verify the researcher’s affiliations.

8.2.1. Creating a researcher profile

Bob logs on to a scholarly dashboard with a newly created data pod by authenticating himself to the dashboard. The dashboard analyses Bob’s data pod, and cannot discover a researcher profile. The dashboard SHOULD provide a way for Bob to create a researcher profile / the dashboard can create a researcher profile automatically if it has the available information using Bob’s authentication to the dashboard. Bob’s researcher profile is now added to Bob’s data pod, and is passed to one or more orchestrators in the network (connected to Bob’s data pod or to the scholarly browser). The orchestrators make sure that the information is passed to the relevant services, so that Bob’s updated information is now available in the network.
Actors/Roles Actors:
  • Researcher

Components:

  • Researcher pod R

  • Dashboard I

Goal
  • The researcher pod has a researcher profile set.

Stakeholders
  • Researcher (pod-owner)

  • OPTIONAL: Institution

Preconditions
  • The researcher has a pod available.

  • The researcher can authenticate himself and his pod to the scholarly dashboard.

Triggers
  • The researcher navigates to the "Edit Profile" tab of the scholarly dashboard.

  • The researcher tries an action that requires the researcher profile to be completed, and is redirected to the "Edit Profile" tab in the dashboard.

Postconditions
  • The researcher profile is set in the researcher pod.

  • OPTIONAL: The researcher profile matches the institution profile of the researcher.

Procedure
  1. The researcher navigates / is redirected to the "Edit Profile" tab of the scholarly dashboard.

  2. OPTIONAL: If the dashboard can retrieve information on the researcher through the authentication process, this information can be automatically filled in.

  3. The researcher completes all required fields of the form, and submits.

  4. The completed profile is posted to the researcher pod.

  5. The browser shows a success / failure message for the update, and redirects to the "View Profile" tab in the dashboard, or the tab the researcher was redirected from.

Requirements
  • Pod environment

  • scholarly dashboard

8.2.2. Updating a researcher profile

Bob has changed jobs and now works at a new research institution. The Scholarly Dashboard of the new institution SHOULD provide functionality for Bob to update his researcher profile. On submission of the new information, it is passed to one or more orchestrators in the network (connected to Bob’s data pod or to the scholarly browser). The orchestrators make sure that the information is passed to the relevant services, so that Bob’s updated information is now available in the network.

8.3. Uploading artefacts

Here we present some use-cases for adding artefacts to the network.

8.3.1. Adding artefacts to the network scholarly record

Bob has created a new research paper. Bob wishes to add the artefact to the network. Bob opens the scholarly dashboard, and navigates to the upload section. Bob selects the artefact he wishes to add, and completes all required information that cannot be automatically extracted from the uploaded artefact. On submission, the scholarly dashboard stores the artefact in Bob’s data pod at the designated location (designated by Bob, or designated by the shape tree present in the pod). The scholarly dashboard receives the information for all relevant service hubs from the orchestrator. The artefact data is sent to all relevant service hubs (at least the registration service hub for claims of precedence). The created lifecycle events are stored on Bob’s data pod, and are distributed to all subscribed actors in the network via the awareness service hub.
Actors/Roles Actors:
  • Researcher

Components:

  • Researcher pod R

  • Dashboard I

  • Orchestrator I

  • Service Hubs S

Goal
  • The researcher artefact goes through all steps of the scholarly communication cycle appropriate for the artefact in question, and is added to the scholarly record.

    • The artefact is registered and timestamped by one or more Registration Services if applicable.

    • The artefact review process is started by one or more Certification services if applicable.

    • The network is made aware of the new artefact by one or multiple awareness services if applicable.

    • The artefact is archived by one or more archiving services if applicable.

  • The researcher pod receives the appropriate artefact metadata for the called services.

  • The artefact is available for actors in the network to query / interact with, given the correct permissions.

Stakeholders
  • Researcher (pod-owner)

  • Institution

  • Service providers

  • Publisher (other researchers in the network, ...)

  • Consumers (users browsing the scholarly record, ...)

Preconditions
  • The researcher has a pod available.

  • The researcher is authenticated to the scholarly dashboard.

Triggers
  • The researcher indicates that a created artefact should be added to the scholarly record.

  • The scholarly dashboard can add specific artefacts or interactions to the scholarly record.

Postconditions
  • The artefact is added to the scholarly record of the network.

  • The necessary functions of scholarly communication have been called for the artefacts.

  • The artefact is available for other actors in the network.

Procedure
  1. The artefact publisher obtains a reference to a network orchestrator

  2. The publisher retrieves all policies in the orchestrator

  3. These policies are evaluated over the artefact, in the case of relevant policies (By client / orchestrator / institution service).

  4. The artefact receives a token of approval from the orchestrator / institution that all policies have been evaluated.

  5. The publisher sends the artefact to the required services according to the set policies / according to the advertised services?

  6. The called services notify the publisher of the result of the service call.

  7. The called services post the resulting metadata to the researcher pod / The researcher retrieves the resulting metadata from the service.

Requirements
  • Researcher pod

    • artefact storage

    • notification processing

    • metadata storage

    • metadata posting

  • Scholarly dashboard

  • Orchestrator

    • discovery

    • interface

    • pass policies (personal / institutional / ...)

    • advertise available / trusted services in the network (through policies?)

  • Policies

    • syntax

    • interoperability

    • enforcement

  • Service Hubs

    • discovery

    • interface

Bob wishes to add relevant datasets for a paper he previously published in the network. In the case the published paper is not his, an approach is needed that allows the original creator of the linked artefact to accept the new link. Bob opens the scholarly dashboard, and navigates to the upload view. Bob uploads the new artefact. In the "Link" section of the view, Bob selects the artefact that the new artefact is linked to. On submission, the scholarly dashboard stores the artefact in Bob’s data pod at the designated location (designated by Bob, or designated by the shape tree present in the pod). The scholarly dashboard receives the information for all relevant service hubs from the orchestrator. The artefact data is sent to all relevant service hubs.
Actors/Roles Actors:
  • Researcher

  • Institution

  • Service Hub Providers

Components:

  • Researcher pod R

  • Dashboard I

  • Orchestrator I

  • Service Hubs S

Goal
  • Create, store and distribute the interaction artefact.

  • Link the artefact to a target artefact.

  • Notify the relevant actors of the interaction.

Stakeholders
  • Researcher

  • Publishers (other researchers in the network)

  • Consumers (people browsing the network for scholarly information)

  • Service Hub Providers (provide services that require information about the data linking)

Preconditions
  • The target artefact exists on the network and was created by the same researcher (if this is not the case, a mechanism is required to request that the creator of the target artefact accepts the link).

Triggers
  • The researcher navigates to the view of an artefact, and selects the "Link" option.

Postconditions
  • The artefact has been linked to the target artefact.

  • The network is made aware of the created link.

Procedure
  1. TODO

Requirements
  • TODO

8.4. Interacting with artefacts in the network

Here we present some use-cases for interacting with artefacts in the network.
Actors/Roles Actors:
  • Researcher

  • Institution

  • Service Hub Providers

Components:

  • Researcher pod R

  • Dashboard I

  • Orchestrator I

  • Service Hubs S

Goal
  • Create, store and distribute the interaction artefact.

  • Link the interaction artefact to the interacted with artefact.

  • Make the interaction artefact discoverable from the interacted with artefact.

  • Notify the relevant actors of the interaction.

Stakeholders
  • Researchers

  • Institution

  • Service Hub Providers

Preconditions
  • The interacted with artefact is present in the network.

  • The interacting user is a researcher (is connected to the network).

Triggers
  • The interacting user creates an interaction on an artefact in the network using the scholarly dashboard.

Postconditions
  • The interaction artefact is stored in the interacting user’s data pod.

  • The interaction artefact is added to the scholarly record.

  • The interacted with artefact (can / must) link back to the interaction artefact.

Procedure
  1. An interaction is created by a user using the scholarly dashboard.

  2. The interaction is stored on the user’s data pod.

  3. The interaction is submitted to the orchestrator.

  4. The orchestrator calls the necessary services to register the interaction into the scholarly record (maybe with a timeout if reactions are to be deleted?)

  5. The necessary services are executed.

    5.1. The interaction artefact is archived.

    5.2. The network is made aware of the new interaction.

Requirements
  • TODO

8.4.1. Commenting on an existing artefact in the network

Bob has some questions about an artefact created in the network. Bob opens the scholarly dashboard, and navigates to the artefact view. Bob selects the option to add a new comment. In the new comment view, Bob writes his comment and submits it. The dashboard stores the comment and the relevant data on Bob’s data pod. The scholarly dashboard retrieves the awareness and archiving service hub information (registering interactions is not necessary?). The archiving service stores the comment and indexes it to be retrievable with the data for the original (multiple?) referenced artefact(s). The awareness hub extracts the artefact(s) that are referenced by (in) the comment, and notifies all actors subscribed to these artefacts of a new interaction lifecycle event.

8.4.2. Subscribing to entities in the network

Actors in the network SHOULD be able to subscribe to artefact lifecycle events for artefacts in the network. Additionally, actors MAY be able to subscribe to the events of new actors being added to the network. Actors SHOULD be able to subscribe to only specific event types linked to specific artefacts, which the actor can filter on the available dimensions of artefacts in the network. These filters on the dimensions of the artefacts can be achieved using shape matching. For example, an actor in the network should be able to subscribe to notifications of specific lifecycle events of artefacts that have a tag of "Scholarly Communication". (This should be indexed in a way that not all filters have to be evaluated with shape matching for each lifecycle event, as this is not scalable.)
Actors/Roles Actors:
  • Researcher

Components:

  • Scholarly dashboard (Optional?)

  • orchestrator

  • awareness service hub

Goal
  • The researcher is subscribed to the target type of artefacts.

  • The researcher receives notifications of the (chosen) life cycle events of the target artefacts.

Stakeholders
  • Researchers

  • Service hubs

Preconditions
  • An awareness service hub is available in the network.

Triggers
  • The researcher navigates to the "Subscriptions" view of the scholarly dashboard.

Postconditions
  • The researcher is subscribed to certain lifecycle events of artefacts in the network matching selected filters.

  • The researcher receives notifications for these lifecycle events if they occur in the network (and are passed to the awareness services the researcher is subscribed to).

Procedure
  1. A filter is created in the "Subscriptions" view of the scholarly dashboard.

  2. The filter is submitted to the orchestrator as a "Subscription" event.

  3. The orchestrator executes the available triggers for a "Subscription" event.

  4. The awareness service hub is called with the subscription event.

  5. The awareness service stores the subscription, and adds the filter to its list of subscription filters.

  6. On a new artefact lifecycle event in the network, the filter is evaluated, and the researcher is notified for matching lifecycle events.

Requirements
  • Awareness service hub

  • Notification processing

  • (Indexed) Shape matching of artefact (lifecycle events).
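
The indexing concern raised for subscriptions can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store in which subscriptions are indexed by lifecycle event type, so that only the filters registered for the incoming event type are evaluated. The names `Subscription` and `SubscriptionIndex`, the event layout, and the simple equality-based filter check (a stand-in for real shape matching) are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Subscription:
    subscriber: str        # WebID of the subscribing actor (illustrative)
    event_types: frozenset  # lifecycle event types of interest
    filters: dict = field(default_factory=dict)  # dimension -> required value


class SubscriptionIndex:
    """Hypothetical in-memory store that indexes subscriptions by event type."""

    def __init__(self):
        self._by_event_type = defaultdict(list)

    def add(self, sub):
        for event_type in sub.event_types:
            self._by_event_type[event_type].append(sub)

    def remove(self, sub):
        for event_type in sub.event_types:
            self._by_event_type[event_type].remove(sub)

    def match(self, event):
        """Return the subscribers whose filters match a lifecycle event."""
        # Only subscriptions registered for this event type are evaluated.
        candidates = self._by_event_type.get(event["type"], [])
        return [s.subscriber for s in candidates
                if _matches(s.filters, event["artefact"])]


def _matches(filters, artefact):
    # Stand-in for shape matching: every filtered dimension must equal the
    # artefact's value, or be contained in it for list-valued dimensions.
    for dimension, wanted in filters.items():
        actual = artefact.get(dimension)
        if isinstance(actual, (list, set, tuple)):
            if wanted not in actual:
                return False
        elif actual != wanted:
            return False
    return True
```

In this sketch, a subscription on the "publish" event with the filter `{"creator": <Alice's WebID>}` is never evaluated for "update" events, which keeps the per-event cost proportional to the number of subscriptions for that event type rather than to all subscriptions.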

8.4.3. Subscribing to artefacts of a specific author in the network

Bob wishes to subscribe to all artefacts created by Alice in the network, to receive updates of all new lifecycle information of her artefacts. Bob uses the scholarly dashboard and navigates to the subscription page. The scholarly dashboard retrieves the awareness service hub information from the orchestrator in use. The awareness service hub advertises the available dimensions on which artefacts in the network can be filtered, and the available lifecycle information that can be subscribed to. The scholarly dashboard presents these options to Bob. Bob decides to filter the artefacts based on their author, and enters the WebID of Alice into the filter. Bob decides to subscribe to the "create" and "publish" lifecycle events. On submission, the awareness service hub saves the subscription, and on subsequent relevant lifecycle events of artefacts with Alice as the author, Bob receives a notification of these events. The subscription is also stored with the necessary metadata on Bob’s pod, allowing Bob to easily list and undo created subscriptions.

8.4.4. Undo a specific subscription

Bob wishes to undo a specific subscription because it generates too many events. Bob navigates to the subscription page of the scholarly dashboard. All previous subscriptions are listed on the page. Bob selects a subscription, and chooses the remove option. The awareness service hub is notified that the specific subscription should be removed. On success from the service hub, the scholarly dashboard removes the subscription from Bob’s pod.

9. Requirements

In this section, the requirements are listed based on the proposed architecture (§ 3 Architecture of a Decentralized Network for Scholarly Communication) and the use cases (§ 8 Use Cases).

9.1. Pod based data storage

To provide researchers with the possibility to store their own research artefacts and interactions, while keeping control of the data in the process, pod environments such as Solid can be used. These can be hosted by the research institutions themselves, or by recommended third party services.

9.1.1. Research Artefact Storage

Research artefacts (papers, images, datasets, ...) can be stored as (non-)RDF documents on the data pod of the researcher, or can be linked to the data pod from an external source using the mechanisms described in § 9.1.4 Data Discovery.

9.1.2. Artefact Linking

Research artefacts (papers, images, datasets, ...) can be interacted with in the network. These interactions MUST link to the artefacts they interact with. Artefacts may also contain links to artefacts they cite, to their used datasets, ... In case two-way coupling is required for these interactions, a mechanism is required to notify the pod storing the other artefact to return a link / provide the option to link back. This can be handled automatically, or require manual interaction to verify.

9.1.3. Metadata Storage

Artefacts in the network have metadata, and will require this metadata to be stored and linked to the original artefacts. There are multiple solutions to store metadata for (non-)RDF data on a Linked Data Platform. However, it may benefit applications to support multiple approaches to metadata storage, and fall back on less ideal approaches in case the required metadata cannot be discovered.
9.1.3.1. .meta file
A straightforward way to store file metadata on a Linked Data Platform is making use of a .meta file. Because .meta is a naming convention, the metadata file for any given file can be located automatically. However, this approach seems to have been losing traction in the community lately.
9.1.3.2. describedby / seeAlso metadata
Metadata can be provided in a more semantic way using predicates such as rdfs:seeAlso. This metadata reference can be stored in the location where the relevant resource is stored. A downside, however, is that if a resource is retrieved directly, this information will not be retrieved, and the metadata reference is lost to the application retrieving the resource. A third solution is to provide link relations: the link to the metadata file is returned in the response header of the HTTP request. Combining this solution with the previous one (providing the link to the metadata file in the RDF data where the resource is stored) provides the most complete approach to storing metadata for a resource.
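
The link relation approach can be sketched as follows. This is a minimal sketch that assumes the server returns a `Link` response header with a `describedby` relation pointing at the metadata file. The parser is deliberately simplified and does not cover all of the Web Linking syntax (for example, commas inside quoted parameter values), and the function names and example URLs are illustrative assumptions.

```python
import re
from urllib.parse import urljoin

# One link-value: <target> followed by zero or more ";param=value" pairs.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def find_link_targets(link_header, rel):
    """Return the targets in a Link header whose rel parameter contains `rel`."""
    targets = []
    for target, params in _LINK_VALUE.findall(link_header):
        match = _REL_PARAM.search(params)
        # A rel parameter may hold several space-separated relation types.
        if match and rel in match.group(1).split():
            targets.append(target)
    return targets


def metadata_url(resource_url, link_header):
    """Resolve the first describedby target against the resource URL, or None."""
    targets = find_link_targets(link_header, "describedby")
    return urljoin(resource_url, targets[0]) if targets else None
```

For a resource at `https://bob.example/publications/publication1.pdf` with the header `<publication1.meta>; rel="describedby"`, `metadata_url` resolves the relative target against the resource URL. An application could fall back on the .meta naming convention when no such link is present.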

9.1.4. Data Discovery

Data discovery is required for external services to retrieve artefacts from a researcher pod, on being notified of new data. There should be a semantic way for external applications and services to discover the locations of certain types of artefacts and their metadata, and retrieve this data, given the correct permissions. Currently, shape trees are considered as the go-to approach to enable data discovery on linked data platforms.

9.1.5. Location aware data storage

External services that need to post data to the researcher pod (e.g. registration service hub returning a certificate of registration) need to be aware of the location where they can post this data. The shape tree approach used for data discovery can also be used to make the actor posting data aware of the location where data matching a given shape should be posted.

9.1.6. Artefact versioning

Versioning is highly relevant for scholarly communication, as certain artefacts (datasets, papers, ...) can be iterated upon (incorporating reviews, ...). In order to support this behavior, a versioning system must be in place that allows actors in the network to retrieve different versions of the same research publication. This versioning system should also take into account the available lifecycle event information for the artefacts in question: using the available lifecycle information, an actor in the network SHOULD be able to construct the whole timeline of an artefact (creation, updates, publication, interactions, ...). This concept could also be applied to interaction artefacts (e.g. see comments on proposed edits, see updates of made comments, ...).
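
As a minimal sketch of timeline construction, assume lifecycle events are available as records with an artefact identifier, an event type, an ISO 8601 timestamp and, optionally, a version label. This record layout and the function names are assumptions for illustration, not a proposed data model.

```python
from datetime import datetime, timezone


def _parse(timestamp):
    parsed = datetime.fromisoformat(timestamp)
    # Assumption: timestamps without an offset are treated as UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def build_timeline(events, artefact_id):
    """Return the lifecycle events of one artefact, oldest first."""
    own = [e for e in events if e["artefact"] == artefact_id]
    return sorted(own, key=lambda e: _parse(e["timestamp"]))


def version_at(events, artefact_id, moment):
    """Return the latest version recorded at or before `moment`, or None.

    `moment` must be a timezone-aware datetime.
    """
    version = None
    for event in build_timeline(events, artefact_id):
        if _parse(event["timestamp"]) > moment:
            break
        # Events without a version label (e.g. comments) keep the current version.
        version = event.get("version", version)
    return version
```

With such a timeline, a client could show which version of a paper a given comment or review refers to, by looking up the version that was current when the interaction event occurred.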

9.1.7. Permissions

Permissions are an important tool for data pods to protect private information. Only actors with the correct permissions have access to resources in a pod environment. In Solid pod environments, permissions are typically handled using access control list (ACL) resources. Permissions are required in case certain parts of the research are sensitive (e.g. datasets with sensitive information), or should not yet be public (e.g. in-progress work to share with colleagues).
9.1.7.1. Setting permissions
Functionality must be in place for all authorized actors to set permissions in a data pod. This requires that researchers are able to allow e.g. a scholarly dashboard to edit the permissions of their data pod (for specific locations).
9.1.7.2. Group permissions
Setting permissions for individual actors can be insufficient at times. Applications built on the Mellon framework SHOULD enable actors to create, edit, and delete permissions for groups of actors. Creating or updating a permission group MUST have the consequence that the permissions of this group are applied to all actors added to the permission group, and are removed from actors that are removed from the permission group. This is useful for e.g. creating a permission group for a research group, so that permissions for created research are automatically granted to the whole research group, instead of having to be set for each member individually.
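
The effect of updating a permission group can be sketched as a difference between the old and the new member list. This is a minimal sketch: the function name and the mode labels ("Read", "Write") are illustrative assumptions, and it ignores the case where a removed actor still holds the same access through another group or an individual permission.

```python
def sync_group_permissions(old_members, new_members, group_modes):
    """Compute the access changes caused by updating a permission group.

    Returns (grants, revocations), each mapping an actor to a set of modes.
    """
    old_members, new_members = set(old_members), set(new_members)
    # Actors added to the group receive the group's permissions.
    grants = {actor: set(group_modes) for actor in new_members - old_members}
    # Actors removed from the group lose the group's permissions.
    revocations = {actor: set(group_modes) for actor in old_members - new_members}
    return grants, revocations
```

Actors that remain in the group appear in neither result, so an update only touches the permissions of actors whose membership actually changed.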

9.2. Events

9.2.1. Notifications

Notifications are required for the scholarly dashboard to notify the user of new information (received from e.g. the awareness service).
9.2.1.1. Notification filtering
A scholarly dashboard application SHOULD be able to filter notifications based on preset shapes. This enables the scholarly dashboard to filter the notifications for relevant lifecycle information of searched artefacts.
9.2.1.2. Notification based actions
In case notifications are used to communicate important artefact lifecycle information (in case this is not posted directly to the researcher pod for design reasons?), functionality is required that can evaluate the received notifications and execute automatic actions. This can be avoided by making notifications non-essential: important lifecycle events are posted directly to the data pod, and the notifications are secondary to this and only point out the locations of the newly added data.

9.3. Orchestrator

The orchestrator is an actor in the network that is responsible for executing policies on received artefacts in the network. For all received policies, the orchestrator should filter relevant policies for the current action, and provide a proof of enforcement for each policy. Policy filtering can be achieved using shape matching solutions.

9.3.1. Orchestration interface

The interface for an orchestration service is an inbox advertised by the orchestrator. This inbox serves as an event-queue, from which the orchestrator pulls notifications if available. As the inbox is open, it can be posted to by all actors in the network.
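
The event-queue behaviour of the inbox can be sketched with a hypothetical in-memory stand-in. A real inbox would be an HTTP resource that actors post notifications to. The class and method names below are assumptions for illustration.

```python
from collections import deque


class Inbox:
    """Hypothetical in-memory stand-in for an orchestrator inbox."""

    def __init__(self):
        self._queue = deque()

    def post(self, sender, notification):
        # The inbox is open: any actor may post, so the sender is
        # recorded but not checked here.
        self._queue.append({"sender": sender, "notification": notification})

    def pull(self):
        """Return the oldest pending notification, or None if the inbox is empty."""
        return self._queue.popleft() if self._queue else None
```

Because the inbox accepts posts from any actor, an orchestrator would still need to validate what it pulls before executing policies on it.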

9.3.2. Orchestrator policy retrieval

The orchestrator matches and enforces policies on received lifecycle events of artefacts in the network. These policies first have to be retrieved by, or passed to, the orchestrator. For this, both the client and the institution managing the orchestrator can provide an interface for the orchestrator to retrieve relevant policies.

9.3.3. Orchestrator policy enforcement

The enforcing of policies is handled by first matching the relevant policies to a received artefact lifecycle event, and subsequently executing the attached triggers. Proof of this execution should be possible, to verify the artefact path through the network, and to verify for the institution that the rules have been enforced. For this, a proof of execution could also be required by the service hubs, to know that a received submission is approved by (the automated rules of) a certain institution in the network.
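
The match-then-execute flow can be sketched as follows. This is a minimal sketch under several assumptions: policies are plain records with an identifier, a shape and a list of triggers, shape matching is reduced to a naive key-by-key comparison, events are JSON-serializable, and the proof is an unsigned record containing a digest of the event. A verifiable proof of execution would additionally need to be signed by the orchestrator, which is not shown here.

```python
import hashlib
import json
from datetime import datetime, timezone


def matches(shape, event):
    """Naive stand-in for shape matching: every key in `shape` must equal the event's value."""
    return all(event.get(key) == value for key, value in shape.items())


def enforce(policies, event, now=None):
    """Run the triggers of all matching policies and return one proof record per policy."""
    now = now or datetime.now(timezone.utc)
    # Digest over a canonical JSON form, so the proof is bound to this exact event.
    canonical = json.dumps(event, sort_keys=True).encode("utf-8")
    event_digest = hashlib.sha256(canonical).hexdigest()
    proofs = []
    for policy in policies:
        if not matches(policy["shape"], event):
            continue
        results = [trigger(event) for trigger in policy["triggers"]]
        proofs.append({
            "policy": policy["id"],
            "event_sha256": event_digest,
            "results": results,
            "enforced_at": now.isoformat(),
        })
    return proofs
```

A service hub receiving a submission together with such records could compare the digest against the received event to check that the proof belongs to it.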

9.4. Policies

Functionality is required for actors to enforce policies on data being generated / submitted / ... in the network.

9.4.1. Shape matching

Shape matching functionality can be a vital part of enforcing certain policies, allowing actors to match the shape of artefacts and their metadata in case of RDF data. Shape matching makes use of a shape language, of which SHACL and ShEx are currently the most relevant. In case of non-RDF data, other solutions may need to be explored.

9.5. Service hubs

In this section, the requirements for the different service hubs are listed.

9.5.1. Service hub discovery

Service hubs in the network MUST be discoverable via advertisement of a service hub by orchestrators in the network.

9.5.2. Service hub interactions

Service hubs MUST advertise the specific services they provide. Multiple service hubs may be used simultaneously to fulfil the same service (e.g. a researcher may register his created artefacts at multiple service hubs).

9.5.3. Service hub functions

Here the different available services in the network are listed. The proposed registration tokens are not binding, and could be replaced by other verification mechanisms (see § 9.6 Verification mechanics).
9.5.3.1. Registration service
A registration service MUST provide a token of registration to the pod where the submitted artefact is stored. This token MUST be timestamped to allow actors in the network to verify claims of precedence. A copy of this token (and the submitted artefact) MUST be stored locally, to allow for verification by actors in the network, and should be provided to an archiving service.
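
A timestamped registration token can be sketched as a record that binds a digest of the artefact to a registration time. This is a minimal sketch: the token fields, function names and example URLs are assumptions for illustration, and the token is unsigned, so trust in it would still depend on retrieving or verifying it at the issuing registration service.

```python
import hashlib
from datetime import datetime, timezone


def issue_registration_token(artefact_bytes, artefact_url, hub_url, now=None):
    """Create a timestamped token binding the artefact's content to its registration."""
    now = now or datetime.now(timezone.utc)
    return {
        "artefact": artefact_url,
        "sha256": hashlib.sha256(artefact_bytes).hexdigest(),
        "registered_at": now.isoformat(),
        "issuer": hub_url,
    }


def token_matches(token, artefact_bytes):
    """Check that the token was issued for exactly this artefact content."""
    return token["sha256"] == hashlib.sha256(artefact_bytes).hexdigest()


def has_precedence(token_a, token_b):
    """True if token_a was registered strictly before token_b."""
    registered_a = datetime.fromisoformat(token_a["registered_at"])
    registered_b = datetime.fromisoformat(token_b["registered_at"])
    return registered_a < registered_b
```

An actor verifying a claim of precedence could fetch the copy of the token stored at the registration service, check it against the artefact with `token_matches`, and compare timestamps with `has_precedence`.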
9.5.3.2. Certification service
A certification service MUST compare a submitted artefact with the current state-of-the-art in the respective field. If the submitted artefact is novel, the service MUST provide a certification token to the pod where the submitted artefact is stored. A copy of this token (and the submitted artefact) MUST be stored locally, to allow for verification by actors in the network, and should be provided to an archiving service.
9.5.3.2.1. Peer review service
In the case of peer review, a service may be set up that forwards a submitted artefact to a set of peers in the network for review. The reviews of these peers MUST be returned to the service (before a set deadline), after which the service returns a token with the attached reviews.
9.5.3.3. Awareness service
The service MUST advertise the available dimensions of the artefacts on the network for which an actor can subscribe (e.g. title, tags, creator, ...). Next to artefacts, an awareness service MAY also advertise new actors (researchers / service hubs / ...) connected to the network. The service MUST provide an interface for actors in the network to subscribe to specific lifecycle events of artefacts that can be filtered on all advertised dimensions. The awareness service MUST update all relevant subscribers of new information in the network. The service MUST provide an interface to undo specific subscriptions.
9.5.3.4. Archiving service
An archiving service MUST store all submitted artefact data it receives. The service MUST provide an interface to retrieve specific data from the service.

9.6. Verification mechanics

A scholarly communication network requires a way for actors in the network to verify data and entities in the network.

9.6.1. Researcher profile verification

A method MUST be in place for actors in the network to check the profile information of researchers in the network. This verification requires the institution the researcher is affiliated with to provide an interface, or to issue a digital token to the researcher, with which the verification can be done.

9.6.2. Artefact data verification

An actor in the network MUST be able to verify an artefact in the network at the service hubs specified in the artefact metadata. For example, an actor can verify the registration of an artefact by verifying its registration token at the registration service hub specified in the artefact metadata or lifecycle events.

9.6.3. Service hub verification

An actor in the network MUST be able to verify a called service in the network. This can prevent malicious actors from re-routing actors to malicious services. This can be resolved using simple public / private key-based solutions.

9.7. Collector

The Collector services in a network collect the generated knowledge, and make it available for clients querying the network. In the case of scholarly communication, there are multiple approaches.

9.7.1. Archiving service providing querying capabilities

As an archiving service collects and archives the scholarly record of the network, this information could be queried by clients. This, however, brings the risk of the archiving service having a monopoly on both the archiving and the publication of the scholarly record in the network, where these services can provide their own browser for their own archived data.

9.7.2. Separate collector instance

Separate collector services may be added to the network. These services can subscribe to the available awareness service hubs on the network to be notified of new information in the network. The initial state of the collector service will have to be synchronized by requesting the scholarly record from the available archiving services that should be indexed.

9.8. Examples

Researcher Pod

/
  inbox/
  profile/
    card
  publications/
    publication1.pdf
    publication1.meta 
    publication2.pdf
    publication2.meta
  graphs/
    graph1.svg
    graph2.svg
  datasets/
    dataset1.ttl 
    dataset2.json

publication1.meta


   

9.9. Ontologies

Thought should be given to which ontologies are to be used to semantically describe the artefacts and events. The choice of ontologies could mean that more or fewer compatibility layers are required for the framework to interact with different scholarly communication frameworks outside of its network. Relevant resource

10. Definitions

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119