New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rdf iteration #117
Rdf iteration #117
Conversation
…e properties, except for root node.
…nto its own class
public Model asModel() { | ||
final Model model = createDefaultModel(); | ||
model.setNsPrefixes(namespaces()); | ||
for (final Triple t : this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever this call is made, the iterator benefits are lost, no? i.e. The entire contents of the iterator is loaded.
Should we allow a "limit" argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of that method is to convert to a serializable form with ease. Instead of inserting params here, the idea is to filter the iterator to select the triples wanted and then call asModel(). When I engage with the second half of the task (client code) I will add some helpers to easily select out triples by the various characteristics that might be interesting, including by "windowing".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
* @author ajs6f | ||
* @date Oct 18, 2013 | ||
*/ | ||
public class RootRdfContext extends NodeRdfContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add Workspace triples here? Or is that another RdfContext?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Too many of these contexts simply came directly out of the old impl. My way of asking the question is: in what context (larger sense of the word) would we want to supply those triples? My first thought is-- you might very well want them when you are not in the context of examining a given node, so they're not from a NodeRdfContext at all and therefore certainly don't belong here. (There are many other triples being generated here that don't belong here, by the same logic, but I was trying to get something out the door.)
There are other contexts (e.g. NamespacesRdfContext) that aren't associated to a give Node. That's the right pattern, for my money. So we could have a WorkspacesRdfContext and that seems to me to be best. If we can find a larger abstraction that really makes sense, we can factor NamespacesRdfContext and a hypothetical WorkspacesRdfContext through it. Maybe RepositoryRdfContext?
Looks good. I will wait for tomorrow's small update (lazy iteration on versions), then I will push this into master. |
Resolved with: b225254 |
https://www.pivotaltracker.com/story/show/57304426
This PR is for the issue named above. It provides all the machinery for an interator-based approach to RDF generation. It stops short of actually changing anything recognizable as a public contract from the kernel. This is based on discussion between @awoods and @ajs6f in which we concluded that a two-step procedure would be safer and more manageable. Currently, this PR is meant mostly for code review and criticism. It can be merged directly without breaking anything, but the performance/memory usage improvements of an iterator-based approach will not become visible until they are exposed outside the kernel (e.g. in fcrepo-http-api).