Provenance concerns information about how an entity came to be and about its contributions towards the existence of others. Dydra exposes metadata about repositories sufficient to answer questions about

  • data lineage
  • retrospective repository state
  • responsibility for changes

Lineage is first-order only. That is, the information about a transaction enumerates the data set constituents at the level of graphs. It does not describe dependencies at the level of individual inserted or deleted statements [cui2001], and no attempt is made to identify graph subcomponents [ding2005].

Each repository can specify a respective provenance repository. This is reflected in the base repository’s service description as a prov:hasProvenanceService association, along with additional prov:hasProvenance associations to aid discovery of revisions. In addition, references appear in the HTTP response headers of SPARQL protocol responses, as proposed by PROV-AQ [prov-aq](3.1):

  • the query response header rel=’prov:has_query_service’ specifies as the anchor the SPARQL query service for the respective provenance repository
  • the query response header rel=’prov:has_provenance’ specifies as the anchor the graph identifier for the repository revision effective for the query
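As a rough illustration of how a client might pick these links out of a response, the following Python sketch parses a header value and collects the two relations named above. It assumes the links arrive in a standard HTTP Link header, and the IRIs in the usage note are made up. The text does not settle whether the value of interest is carried as the link target or in the anchor parameter, so the sketch returns both.

```python
import re

# Relation names as quoted in the text; the exact header layout a server
# emits is an assumption of this sketch.
QUERY_SERVICE_REL = "prov:has_query_service"
PROVENANCE_REL = "prov:has_provenance"

# One parameter: name=value, where the value is quoted or a bare token.
_PARAM = r'[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+)'
# One link: <target> followed by zero or more ";"-separated parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*' + _PARAM + r')*)')
_PARAM_RE = re.compile(r'([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (target, params) pairs from one Link header value."""
    links = []
    for match in _LINK_RE.finditer(value):
        params = {}
        for name, quoted, bare in _PARAM_RE.findall(match.group(2)):
            params[name.lower()] = quoted or bare
        links.append((match.group(1), params))
    return links


def provenance_links(value):
    """Map each provenance relation in the header to its target and anchor."""
    found = {}
    for target, params in parse_link_header(value):
        for rel in params.get("rel", "").split():
            if rel in (QUERY_SERVICE_REL, PROVENANCE_REL):
                found[rel] = {"target": target, "anchor": params.get("anchor")}
    return found
```

Calling provenance_links on a value such as `<http://example.org/prov/sparql>; rel="prov:has_query_service"` yields a dictionary keyed by relation name, from which the client can choose the target or the anchor as appropriate for the server at hand.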

Provenance Schema

The schema derives from the W3C proposed provenance ontology.

@prefix : <urn:dydra:> .
@prefix prov: <> .
:Transaction rdfs:subClassOf prov:Activity .
:Revision rdfs:subClassOf prov:Entity .
:Graph rdfs:subClassOf prov:Entity .
:Account rdfs:subClassOf prov:Agent .
:Operation rdfs:subClassOf prov:Entity .
:Query rdfs:subClassOf :Operation .
:Repository rdfs:subClassOf prov:Collection .

Provenance Information

The provenance information is compiled after each update request when a provenance repository has been specified. It states the identities of the transaction entities: account, repository, query, transaction, and generated revision. For each of those, it records the following subjects and associations:



  • prov:wasAssociatedWith : the account
  • prov:hadMember : the revision
  • as graph : the context for the query, transaction, and revision statements



  • prov:used : the parent revision (unless initial)
  • prov:generated : the revision
  • prov:hadPlan : the query
  • prov:startedAtTime, prov:endedAtTime
  • prov:wasRevisionOf : the parent revision (unless initial)


  • prov:wasDerivedFrom : read graphs
  • prov:wasGeneratedBy : the transaction
  • prov:wasInvalidatedBy : the succeeding transaction (if applicable)
  • prov:wasRevisionOf : the parent revision
  • prov:wasUsedBy : the succeeding transaction (if applicable)
  • prov:startedAtTime

Parent Revision

  • prov:endedAtTime


  • prov:wasGeneratedBy, prov:wasInvalidatedBy, prov:wasInfluencedBy : the transaction, depending on creation, deletion, or modification.
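To make the use of these associations concrete, the following Python sketch walks a revision's history over a set of triples already retrieved from a provenance repository. It is a minimal illustration under stated assumptions: the triples are plain (subject, predicate, object) tuples of IRI strings, the prov: prefix is taken to be the standard W3C PROV namespace, and prov:wasRevisionOf and prov:wasGeneratedBy are taken to be attached to the revision. The function names are hypothetical and not part of any Dydra API.

```python
# Standard W3C PROV namespace; assumed here because the schema's prefix
# declaration does not show it.
PROV = "http://www.w3.org/ns/prov#"
WAS_REVISION_OF = PROV + "wasRevisionOf"
WAS_GENERATED_BY = PROV + "wasGeneratedBy"


def objects(triples, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def revision_history(triples, revision):
    """Follow prov:wasRevisionOf from a revision back to the initial one.

    Returns the revisions newest first. The loop stops at a revision with
    no parent, or if a revision repeats (a guard against cyclic data).
    """
    history, seen = [], set()
    current = revision
    while current is not None and current not in seen:
        seen.add(current)
        history.append(current)
        parents = objects(triples, current, WAS_REVISION_OF)
        current = parents[0] if parents else None
    return history


def generating_transactions(triples, revision):
    """Return the transaction(s) recorded as having generated a revision."""
    return objects(triples, revision, WAS_GENERATED_BY)
```

Given triples in which a hypothetical urn:example:rev3 is a revision of urn:example:rev2, which in turn is a revision of urn:example:rev1, revision_history returns the three identifiers in that order. Combining it with generating_transactions gives the chain of transactions that led to the repository state in question.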

A Simple Example

The information collected in a provenance repository after a simple sequence of three updates was performed on its base repository would appear as follows:


Query Responses

Update query responses include links to provenance information in the headers and/or the encoded body, depending on the particular encoding. (NYI)


For HTML responses, e.g. the query editor page, provenance and provenance-service links should be present in the head, along with an anchor link to the abstract repository. (NYI)


For RDF responses, the prov:hasProvenance, prov:hasAnchor, and prov:hasProvenanceService properties should be incorporated into the encoded result. [prov-aq](3.2.1) (NYI)


In order to enable provenance processing:

  • Configure the base repository to record provenance data

    <> <urn:dydra:provenanceRepositoryId> <http://localhost/account/provenance-repo-id> .

  • Specify the repository as a request pragma

PREFIX provenanceRepositoryId: <http://localhost/account/provenance-repo-id>
INSERT DATA { 'object~:0001' . rdf:type .
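For clients that assemble update requests programmatically, the pragma can be prepended to the update text before it is sent. The Python sketch below shows one way this might look. The helper name is hypothetical, and the sketch only builds the request text, leaving transport to whatever HTTP client is in use.

```python
def with_provenance_pragma(update, provenance_repository_iri):
    """Prepend the provenanceRepositoryId pragma PREFIX line to an update.

    `update` is the SPARQL update text; `provenance_repository_iri` is the
    IRI of the provenance repository, written without angle brackets.
    """
    if "<" in provenance_repository_iri or ">" in provenance_repository_iri:
        raise ValueError("IRI must not contain angle brackets")
    pragma = "PREFIX provenanceRepositoryId: <%s>" % provenance_repository_iri
    return pragma + "\n" + update
```

Passing the IRI http://localhost/account/provenance-repo-id from the example above produces the same PREFIX line shown there, followed by the unchanged update text.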

[prov-aq](1, 2)