added support for named graphs in the processor classes #48

acoburn · 2015-01-28T22:49:26Z

https://jira.duraspace.org/browse/FCREPO-1310

This makes it easier for implementors to use named graphs with external triplestores.

added unit and integration tests

awoods · 2015-01-30T17:16:36Z

If the FCREPO_NAMED_GRAPH header is set, I gather that means that all Fedora-related updates will be inserted/deleted relative to that named graph, which would amount to a single Fedora named graph in the external triplestore, correct?
If so, the utility of that is then being able to potentially clean out all Fedora-related triples in isolation from other triples in the store?

ajs6f · 2015-01-30T17:17:32Z

Doesn't that depend on whether the header's value varies from request to request, or is a constant for the repository?

awoods · 2015-01-30T17:32:28Z

I see where the header is being read, but not where it is being set... and therefore assume it is a configuration element.

ajs6f · 2015-01-30T17:34:38Z

Isn't the setting of the header exactly what we'd want to leave up to each site? Each site might want to partition its triples very differently, for all kinds of unpredictable-to-us reasons.

awoods · 2015-01-30T17:36:30Z

Where would/does that logic exist for setting the graph-name header?

ajs6f · 2015-01-30T17:38:42Z

It would be, as I understand it, in the integration at a given site. Perhaps in elaborated Camel, or in some proxying element between the repo and Camel. I'd do it in Camel, myself, because you've already got it in play. We might want to include some defaulting behavior and document some recipes for partitioning into one graph per repo or one graph per resource.

awoods · 2015-01-30T17:45:04Z

I agree that some documented defaulting behavior and recipes for assigning the graph-name header at different scopes would be helpful.
Otherwise, this particular PR is probably ready to go... unless there is anything that you can think of to put in here, @acoburn, that would make the picture more clear.

acoburn · 2015-01-30T18:16:53Z

The header is not set anywhere in the camel component -- that is up to implementors. For instance, one may want to partition the fedora nodes into separate (possibly overlapping) named graphs. The default behavior is to use no named graph (i.e. everything goes into the default graph).

For instance, to partition into named graphs, based on a dynamically assigned property placeholder value:

from("activemq:topic:fedora")
  .filter(some-type-of-filter)
    .to("fcrepo:localhost:8080/rest")
    .setHeader(FcrepoHeaders.FCREPO_NAMED_GRAPH).simple("{{named.graph}}")
    .process(new SparqlUpdateProcessor())
    .to("http4:localhost:3030/ds/update");

Or, to partition based on some existing RDF property:

from("activemq:topic:fedora")
  .filter(some-type-of-filter)
    .to("fcrepo:localhost:8080/rest")
    .setHeader(FcrepoHeaders.FCREPO_NAMED_GRAPH)
      .xpath("/rdf:RDF/rdf:Description/ex:namedGraph/text()", String.class, ns)
    .process(new SparqlUpdateProcessor())
    .to("http4:localhost:3030/ds/update");

Or, you may want to have a "public" and a "private" graph in the triplestore:

from("activemq:topic:fedora")
  .to("fcrepo:localhost:8080/rest")
  .multicast().to("direct:public", "direct:private");

from("direct:public")
  .filter(some-predicate)
    .setHeader(FcrepoHeaders.FCREPO_NAMED_GRAPH)
      .constant("public")
    .process(new SparqlUpdateProcessor())
    .to("http4:localhost:3030/ds/update");

from("direct:private")
  .filter(some-other-predicate)
    .setHeader(FcrepoHeaders.FCREPO_NAMED_GRAPH)
      .constant("private")
    .process(new SparqlUpdateProcessor())
    .to("http4:localhost:3030/ds/update");

But the setting of the header is really up to the specific implementation -- it may be hard-coded; it may come from an RDF property; it may come from some dynamically assigned property.
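
As a rough illustration of what the header changes downstream, here is a minimal Java sketch (standard library only) that wraps a set of triples in a GRAPH clause when a graph name is supplied and targets the default graph otherwise. The class and method names are hypothetical, and this is not the code of SparqlUpdateProcessor; it only shows the general shape of a SPARQL update with and without a named graph.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of fcrepo-camel.
final class NamedGraphSketch {

    // Builds a SPARQL INSERT DATA statement. When a non-blank graph name is
    // present, the triples are wrapped in a GRAPH clause; otherwise they are
    // written to the default graph.
    static String insertData(Optional<String> graphName, String triples) {
        return graphName
            .filter(name -> !name.trim().isEmpty())
            .map(name -> "INSERT DATA { GRAPH <" + name + "> { " + triples + " } }")
            .orElse("INSERT DATA { " + triples + " }");
    }
}
```

Passing Optional.empty() corresponds to leaving the header unset, so everything goes into the default graph; passing Optional.of("info:myuri") corresponds to setting the header to a constant.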

awoods · 2015-01-30T18:34:16Z

Thanks, @acoburn, those examples are helpful.

acoburn · 2015-01-30T18:37:00Z

I'll add these to the documentation

Conal-Tuohy · 2017-02-21T03:44:58Z

@acoburn thanks for the examples above. Does this, and more explanatory material, appear in the documentation? I haven't been able to find better documentation than what appears on this GitHub issue. I would like to be able to use each node's identifier as the name of the RDF graph.

acoburn · 2017-02-21T14:03:22Z

@Conal-Tuohy I don't believe there are very good examples of this, but I am planning to implement something soon that will index each resource's triples into separate named graphs in a triplestore. Keep an eye on https://gitlab.amherst.edu/acdc/repository-extension-services for a new triplestore indexer in the next week or two.

Conal-Tuohy · 2017-02-21T23:54:21Z

Thanks @acoburn - I will keep an eye on that repo.

It would be most helpful though if any of this could be documented in the official documentation.

acoburn · 2017-02-24T02:07:51Z

@Conal-Tuohy the named graph support is documented in the official documentation, though that documentation doesn't explain the patterns one might employ for using named graphs.

The key thing is to have this in your camel code:

.setHeader(FCREPO_NAMED_GRAPH).constant("info:myuri")

Or, to index each resource into its own named graph:

.setHeader(FCREPO_NAMED_GRAPH).header(FCREPO_URI)

A full example of indexing each resource into its own named graph is available here: https://gitlab.amherst.edu/acdc/repository-extension-services/blob/master/acrepo-connector-triplestore/src/main/java/edu/amherst/acdc/connector/triplestore/TriplestoreRouter.java

The destination triplestore can then be configured to make the default graph the union of all the named graphs, which makes it possible to query across graphs. Alternatively, one can query a particular named graph using the SELECT * FROM <graph-uri> WHERE { ... } syntax.
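
To make the two query styles concrete, here is a small Java sketch (standard library only) that builds the query strings. The class and method names are hypothetical. The first method scopes a query to one named graph with FROM, as described above; the second uses the standard SPARQL GRAPH ?g pattern to read across all named graphs, which is one option if the store is not configured with a union default graph.

```java
// Hypothetical helper for illustration; not part of fcrepo-camel.
final class GraphQuerySketch {

    // Query restricted to a single named graph, e.g. one Fedora resource URI.
    static String selectFromGraph(String graphUri) {
        requireIri(graphUri);
        return "SELECT * FROM <" + graphUri + "> WHERE { ?s ?p ?o }";
    }

    // Query across every named graph, also returning the graph name.
    static String selectAcrossGraphs() {
        return "SELECT ?g ?s ?p ?o WHERE { GRAPH ?g { ?s ?p ?o } }";
    }

    // Rejects values that would break out of the <...> IRI delimiters.
    // This is a minimal guard, not a full IRI validator.
    private static void requireIri(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("graph URI must not be empty");
        }
        for (char c : value.toCharArray()) {
            if (c == '<' || c == '>' || Character.isWhitespace(c)) {
                throw new IllegalArgumentException("invalid character in graph URI");
            }
        }
    }
}
```

With the per-resource pattern, selectFromGraph would be called with the same URI that was placed in the FCREPO_NAMED_GRAPH header for that resource.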

Conal-Tuohy · 2017-02-24T06:30:36Z

Thanks again @acoburn - your second example is I think the pattern I'm after (where the URI of the Fedora item is used as the name of the graph in the graph store). I will give that a try.

Regarding the documentation on the Wiki, I originally found some related documentation at https://wiki.duraspace.org/display/FEDORA471/Setup+Camel+Message+Integrations (and subordinate pages), but it had no mention of named graphs. There was a reference to fcrepo-camel-toolbox which is how I found my way to this github issue and discovered the named graph feature.

But searching again just now, using Google, I found that the corresponding page for Fedora 4.2 does include material on using named graphs in the external graph store. I don't know why the version of that page for Fedora 4.7.1 doesn't also include that documentation, but it doesn't. Is the feature not supported in the latest Fedora? Or is it just missing from the docs?

acoburn added 2 commits January 28, 2015 17:47

added support for named graphs in the processor classes

64ff2c7

added unit and integration tests

updated documentation to include FCREPO_NAMED_GRAPH header

acd21fb

awoods pushed a commit that referenced this pull request Jan 30, 2015

Merge pull request #48 from acoburn/named-graph-support

40e2b42

added support for named graphs in the processor classes

awoods merged commit 40e2b42 into fcrepo-exts:master Jan 30, 2015

acoburn deleted the named-graph-support branch January 30, 2015 22:54
