Federation¶
Federation concerns ways for a query to draw upon diverse data sources and processing capacity in addition to the immediate dataset and query processor. The SPARQL 1.1 recommendation includes provisions for expressing federated queries in SPARQL.
In Dydra, every form of federation involves two parts: a base query, which a first host processes against an initial repository, and a sub-query, which a second host processes against some other repository. The first host incorporates the sub-query solutions into the evaluation algebra of the base query. The distinct forms arise from variations in which hosts play which of the two roles, which query forms initiate federated processing, and which data sources are involved.
Federation Request Modes¶
Dydra supports internal and external federation. Internal federation applies when the same query processor hosts both the repository to which a sub-query is applied and the repository for the base query. In this case, the sub-query is executed as a subtask within the initial host query processor, subject to intra-host access and authorization. External federation applies when a sub-query is performed upon request by a remote host processor. In this case, for each such location, the first processor determines a SPARQL endpoint, makes a query request of that endpoint host, receives the results, and integrates them into its algebra data-flow. The federation mode none suppresses federation processing.
The default mode for a query is specified in the initial system configuration. Each individual query may specify its own mode, subject to constraints:
mode | effect | permitted variations |
---|---|---|
none | federation is disabled | |
internal | federation is enabled for references local to the host | |
external | federation is enabled for both local and remote references | |
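The effect of the three modes can be read as a simple gate on each SERVICE reference. The following is a minimal sketch of that reading, not Dydra's implementation: the mode names none, internal and external come from the text and example queries on this page, while `FederationMode` and `service_permitted` are hypothetical names introduced for illustration.

```python
from enum import Enum


class FederationMode(Enum):
    # Hypothetical enum; the three values mirror the modes named on this page.
    NONE = "none"
    INTERNAL = "internal"
    EXTERNAL = "external"


def service_permitted(mode: FederationMode, location_is_local: bool) -> bool:
    """Return True if a SERVICE reference may be federated under `mode`."""
    if mode is FederationMode.NONE:
        # Federation is disabled entirely.
        return False
    if mode is FederationMode.INTERNAL:
        # Only references local to the host are federated.
        return location_is_local
    # EXTERNAL: both local and remote references are federated.
    return True
```

Under this reading, external mode is a superset of internal mode: a local reference is accepted in both, and a remote reference only in external mode.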
Federation Query Forms¶
The SPARQL 1.1 Federated Query specification includes the SERVICE form, with which a query indicates that a component is to be executed by an alternative processor. This is explicit federation. SPARQL 1.0, on the other hand, introduced the possibility to indicate diverse datasets through GRAPH forms, but the specification indicates that, if an IRI is specified in a dataset description, “attempts are made to obtain an RDF graph associated with the IRI.” In other words, it allows just that the first processor obtain the designated graph and incorporate it into the local dataset to which it then applies the query.
The Dydra SPARQL processor performs explicit federation only. Any graphs named in a query must already be present in the repository when the processor initiates the request.
Authorization¶
When a query specifies either an internal reference to a local repository or a remote repository location, the access must be authorized. Authorization has two aspects: the client perspective and the service perspective. Each repository may freely specify authorized locations in the form of ACL entries, each of which names either an originating repository or a request agent. A reference from a query is permitted when the query’s initial repository matches an entry among the service location’s authorized clients or the request agent satisfies an analogous constraint. The respective ``system`` repository contains these permissions as ACL entries.
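The permission rule above can be sketched as a membership test against the ACL entries of the service repository. This is an illustrative model only: `ServiceAcl`, its field names, and `reference_permitted` are hypothetical, and the actual structure of the ACL entries in the ``system`` repository is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceAcl:
    """Hypothetical stand-in for the ACL entries of one service repository."""
    authorized_repositories: set = field(default_factory=set)
    authorized_agents: set = field(default_factory=set)


def reference_permitted(acl: ServiceAcl, origin_repository: str, request_agent: str) -> bool:
    """Permit the reference if either the originating repository or the
    request agent matches an ACL entry of the service repository."""
    return (origin_repository in acl.authorized_repositories
            or request_agent in acl.authorized_agents)
```

For instance, an ACL that lists the repository `jhacker/foaf` would admit any query whose initial repository is `jhacker/foaf`, regardless of the request agent.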
Examples¶
Internal Federation¶
A SERVICE form which specifies either a constant IRI term or a variable location in turn bound to an IRI term, where the IRI is local to the query processor host, is executed as a SubSelect within the same query processor.
References to repositories within the same account are always permitted, but references to any other repository require authorization.
PREFIX federation_mode: <urn:dydra:internal> # or external
select *
where { ?s ?p ?o .
service <http://localhost/jhacker/foaf> {
?s <http://xmlns.com/foaf/0.1/mbox> ?mbox .
}
}
A local host is indicated by the following authorities:
- local
- 127.0.0.1
- dydra.com
- the exact hostname returned by the hostname function.
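The test for a local location can be sketched as a comparison of the IRI authority against that list. The snippet below is a minimal sketch, not the processor's code: `local_authorities` and `is_local_location` are hypothetical names, and `localhost` is added as an assumption because the example query above uses it although the list does not name it.

```python
import socket
from urllib.parse import urlsplit


def local_authorities() -> set:
    """Authorities treated as local, per the list above.

    "localhost" is an assumption based on the example query; it is not in the list.
    """
    return {"local", "localhost", "127.0.0.1", "dydra.com",
            socket.gethostname().lower()}


def is_local_location(service_iri: str) -> bool:
    """Return True if the host part of the SERVICE IRI is a local authority."""
    host = urlsplit(service_iri).hostname  # lower-cased by urlsplit, None if absent
    return host is not None and host in local_authorities()
```

With these definitions, `is_local_location("http://localhost/jhacker/foaf")` classifies the internal example as local, while an IRI on another host would fall through to external federation.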
External Federation¶
A SERVICE form which specifies a location IRI that is external to the query processor host is executed as a SPARQL request to the remote processor.
All external requests require authorization to access the respective service.
PREFIX federation_mode: <urn:dydra:external>
select *
where { ?s ?p ?o .
service <http://w3.org/tbl/foaf.nt> {
?s <http://xmlns.com/foaf/0.1/mbox> ?mbox .
}
}
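A request to a remote processor can be pictured as an ordinary SPARQL 1.1 Protocol query operation. The sketch below only constructs such a request using the protocol's "query via GET" form; it does not send it, and it omits the credentials that an authorized request would carry. `build_service_request` is a hypothetical helper, and the choice of GET and of the JSON results media type are assumptions rather than a description of what Dydra sends.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_service_request(endpoint: str, sub_query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request for a sub-query."""
    # The protocol passes the query text in the `query` URL parameter.
    separator = "&" if "?" in endpoint else "?"
    url = endpoint + separator + urlencode({"query": sub_query})
    # Ask for SELECT results in the SPARQL JSON results format (an assumption).
    return Request(url,
                   headers={"Accept": "application/sparql-results+json"},
                   method="GET")


request = build_service_request(
    "http://w3.org/tbl/foaf.nt",
    "select * where { ?s <http://xmlns.com/foaf/0.1/mbox> ?mbox }",
)
```

The resulting object could be passed to `urllib.request.urlopen`; the solutions in the response would then be joined with those of the base query.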
Internal Federation with Virtual Sources¶
A local SERVICE federation location may designate data sources in addition to concrete repositories.
Stored views may be identified by their external resource identifier.
Alternative backends are identified according to their declared repository alias.
View Federation¶
Where the location IRI includes a view suffix, that view is executed and the results are incorporated according to its dimensionality. A SELECT query expression yields a result field with the dimensions of the select form projection. A CONSTRUCT or DESCRIBE expression always yields a result field with the dimensions ?s, ?p, ?o.
A view location is any IRI which follows the pattern for internal federation, above.
Relational Federation¶
PSQL views are declared in the store configuration (/srv/dydra/config/server.conf) to define an arbitrary mapping from Postgres view to RDF field, where the columns of each view become dimensions of the solution field.
For example, given the declaration
pgsql {
tr_crop_plots_seed_start {
storage pgsql
pgsql-table public.tr_crop_plots_seed_start
}
tr_crop_plots_seed_end {
storage pgsql
pgsql-table public.tr_crop_plots_seed_end
}
tr_crop_plots_seed {
storage pgsql
pgsql-table public.tr_crop_plots_seed
}
}
and the Postgres views defined as
View "marti.tr_crop_plots_seed"
Column | Type | Modifiers | Storage | Description
--------+------+-----------+----------+-------------
s | text | | extended |
p | text | | extended |
o | text | | extended |
View definition:
SELECT concat('http://example.org/plot/', tr_crop_plots.crop_plot) AS s,
'http://example.org/seed'::text AS p,
tr_crop_plots.seed AS o
FROM tr_crop_plots;
View "marti.tr_crop_plots_seed_end"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------+----------+-------------
s | text | | extended |
p | unknown | | plain |
o | date | | plain |
View definition:
SELECT concat('http://example.org/plot/', tr_crop_plots.crop_plot) AS s,
'http://example.org/end_date' AS p,
tr_crop_plots.end_date AS o
FROM tr_crop_plots;
View "marti.tr_crop_plots_seed_start"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------+----------+-------------
s | text | | extended |
p | unknown | | plain |
o | date | | plain |
View definition:
SELECT concat('http://example.org/plot/', tr_crop_plots.crop_plot) AS s,
'http://example.org/start_date' AS p,
tr_crop_plots.start_date AS o
FROM tr_crop_plots;
a federation operation, here evaluated against the repository my/field-harvest, could take the form
select ?field ?start ?end ?amount
where {
?field <http://example.org/harvest> ?amount .
{ service <http://localhost/pgsql/tr_crop_plots_seed> { ?field <http://example.org/seed> ?seed } }
{ service <http://localhost/pgsql/tr_crop_plots_seed_start> { ?field <http://example.org/start_date> ?start } }
{ service <http://localhost/pgsql/tr_crop_plots_seed_end> { ?field <http://example.org/end_date> ?end } }
}
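How the solutions of each SERVICE clause combine with the base pattern can be illustrated with the standard SPARQL join over compatible solutions. This is a sketch of the algebra only, under the assumption that each service yields bindings for its variables; `compatible` and `join` are illustrative names, and the plot identifiers, amounts and dates are invented sample values, not data from any repository.

```python
def compatible(left: dict, right: dict) -> bool:
    """Two solutions are compatible if they agree on every shared variable."""
    return all(right[name] == value
               for name, value in left.items() if name in right)


def join(left_solutions: list, right_solutions: list) -> list:
    """Merge every compatible pair of solutions (the SPARQL algebra Join)."""
    return [{**left, **right}
            for left in left_solutions
            for right in right_solutions
            if compatible(left, right)]


# Invented sample solutions: the base pattern binds ?field and ?amount,
# one service clause binds ?field and ?start.
base = [{"field": "http://example.org/plot/1", "amount": "10"},
        {"field": "http://example.org/plot/2", "amount": "7"}]
starts = [{"field": "http://example.org/plot/1", "start": "2020-03-01"}]

joined = join(base, starts)
```

Because the shared variable ?field acts as the join key, only plots that appear in both the harvest data and the Postgres view survive the join.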