Cannot persist IR #3252
Comments
I guess we need to think about this. A closure/proc cannot necessarily be persisted by itself because it likely has references to things in enclosing scopes (like IRMethod). As such, I never thought about this scenario directly. Even if things were changed so that you persisted the first non-closure parent scope and all nested closures, I am not sure that would work, since persistence was designed around persisting IRScript or IREvalScript. Not quite sure on this one. |
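To make the point about enclosing scopes concrete, here is a minimal plain-Ruby sketch (not JRuby IR code; `make_counter` is a made-up name for illustration). It shows a lambda holding on to a local from its defining method, and the standard `Marshal` refusing to dump it.

```ruby
# Build a closure that captures a local from its enclosing method scope.
def make_counter
  count = 0
  -> { count += 1 }
end

counter = make_counter
counter.call
counter.call

# The captured variable lives outside the lambda body and is reachable
# through the proc's binding.
captured = counter.binding.local_variable_get(:count)

# Marshal refuses to dump a Proc and raises TypeError.
marshal_error =
  begin
    Marshal.dump(counter)
    nil
  rescue TypeError => e
    e
  end
```

Persisting only the lambda body would lose `count`, which is the kind of reference to an enclosing scope described above.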
Not sure what the internals looked like, but it seems that @headius had this working once upon a time. |
@kylekyle yeah, I remember this experiment. I think this example was pretty scary in that if anything lexically containing it was different, the code would explode. I guess if we examined all constants and all variables with containing depth > 0, we could perhaps safeguard reloading the persisted closure and give an error message if it was being reloaded into an invalid containing scope. I am interested in the use case you are looking to solve. The mechanism used in that experiment is one way; another way would be to encapsulate and persist any contained state at the same time and restore it on reload. The problem with encapsulating that extra state is that it becomes a slippery slope: we may be persisting the state of an object which would not exist in a new runtime, so you would then need to persist type info (or restrict to a common subset of well-defined types, e.g. String/Array). I think this is why that experiment assumed you were reloading into a similar environment. I think we can do something to make closures persist, but it would be cool to get more clarification on how you envision it working. |
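The safeguard idea above can be sketched in plain Ruby. This is only an illustration under assumptions: `PersistedClosure`, `reload`, and `fresh_binding` are hypothetical names, the closure travels as source text rather than persisted IR, and the lists of required names are supplied by hand instead of being derived from the IR.

```ruby
# Hypothetical record: closure source plus the outer names it depends on.
PersistedClosure = Struct.new(:source, :required_locals, :required_constants)

# Rebuild the closure only if the target scope defines everything it needs.
def reload(persisted, target_binding)
  missing_locals = persisted.required_locals.reject do |name|
    target_binding.local_variable_defined?(name)
  end
  missing_constants = persisted.required_constants.reject do |name|
    Object.const_defined?(name)
  end
  missing = missing_locals + missing_constants
  unless missing.empty?
    raise ArgumentError, "invalid containing scope, missing: #{missing.join(', ')}"
  end
  target_binding.eval(persisted.source)
end

# A binding with no locals, standing in for a mismatched target scope.
def fresh_binding
  binding
end

factor = 3
persisted = PersistedClosure.new("proc { |x| x * factor }", [:factor], [:Integer])

# Reloading into a scope that defines `factor` works.
triple = reload(persisted, binding)
result = triple.call(2)

# Reloading into a scope without `factor` fails with a clear error.
reload_error =
  begin
    reload(persisted, fresh_binding)
    nil
  rescue ArgumentError => e
    e
  end
```

Checking names alone does not prove the scope is equivalent (a variable could exist with a different type), so this only catches the most obvious mismatches.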
I do a lot of work in Spark. In my opinion, the only thing that would make Spark better is Ruby support. Unfortunately, Spark gets its flexibility from the ability to serialize a closure (especially those defined interactively in a shell session), send it to executors, and get back immediate results. The Ruby Spark project has made some headway toward Ruby support in Spark, but there are some pretty sharp edges, since it uses Sourcify to marshal procs to Strings where it can. A properly serializable proc would make Ruby integration with Spark's Java API nearly seamless. The closures we want to serialize are typically trivial:
# a word count example
file = context.text_file 'hdfs://.../'
file.flat_map(&:split).map{|word| [word,1]}.reduce_by_key(&:+)
None of those closures require any extra state.
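For closures this simple, a rough sketch of the "ship it as a String" approach is possible with plain Ruby. This is not Sourcify's or Ruby Spark's actual API; `lines`, `wire_split`, and `wire_pair` are made-up names, and a local Array stands in for the Spark dataset. Symbol-based closures travel as marshalled symbols, and the stateless block travels as source text.

```ruby
lines = ["a b a", "b c"]

# A Symbol#to_proc closure carries only a method name, so the symbol can be shipped.
wire_split = Marshal.dump(:split)
split = Marshal.load(wire_split).to_proc

# A block with no captured state can travel as source text and be rebuilt with eval.
# Only reasonable when the source string is trusted.
wire_pair = "proc { |word| [word, 1] }"
pair = eval(wire_pair)

# Local stand-in for flat_map / map / reduce_by_key on a Spark dataset.
counts = lines.flat_map(&split)
              .map(&pair)
              .group_by(&:first)
              .transform_values { |pairs| pairs.map(&:last).reduce(:+) }
```

This breaks down as soon as the block refers to a local variable or constant that does not exist on the receiving side, which is the sharp edge mentioned above.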
That would be perfect for what I'm trying to do. |
The code that supported headius' gist wouldn't happen to be lying around somewhere, would it? If I had a starting place, I could start working on a fork of the |
Our runtime in 9k is completely different so hacking the entry point you found would be the place to start |
@kylekyle Any progress on this? - I think this will make Spark so much better and easier to use |
@tomz: I got a first pass finished, but it's super buggy and I'm starting to think this approach won't be viable. I think the easiest thing might just be to use a Rubinius driver to serialize procs, since it natively supports that. Everything else would remain JRuby. Another alternative would be to use Truffle, which I believe (correct me if I'm wrong) might also have native support for serializing a proc. The only problem is that the standard library hasn't been completely implemented in Truffle, so a lot of things just don't work yet (like pry). Kind of a setback, but I haven't given up yet. I'll update this ticket if there's any news. |
I believe JRuby 9.2.11.0 included a large chunk of work to make IR serialization work better. It's not officially a supported feature, but this issue should no longer be valid. Marking as fixed in 9.2.11.0. |
It appears that JRuby supports serializing a scope through the IRWriter class, but the persist method looks broken at the moment. I bumped into the issue while writing an extension for serializing a Proc:
You can test the service with a trivial proc:
kylekyle$ jruby -X-C -r proc_to_bytes_service.jar -r proc_to_bytes -e "proc{puts 'hello'}.to_bytes"
But you get the following stack trace:
Is the persist method expecting a different type of scope? Is there a different StreamWriter I should be using? Has anyone had success persisting a scope using IRWriter? Is there an easier way to serialize a JRuby proc?