Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reificator overriding methods breaks RubyKernel dispatch #4643

Open
ivoanjo opened this issue Jun 2, 2017 · 20 comments
Open

Reificator overriding methods breaks RubyKernel dispatch #4643

ivoanjo opened this issue Jun 2, 2017 · 20 comments

Comments

@ivoanjo
Copy link
Contributor

ivoanjo commented Jun 2, 2017

Environment

  • JRuby: jruby 9.1.10.0 (2.3.3) 2017-05-25 b09c48a Java HotSpot(TM) 64-Bit Server VM 25.131-b11 on 1.8.0_131-b11 [linux-x86_64]
  • Kernel: Linux maruchan 4.11.0-041100-generic #201705041534 SMP Thu May 4 19:36:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Distro: Ubuntu 17.04

Expected Behavior

When investigating failures while using the -Xreify.classes option and after fixing the issue in #4642, it turns out that then I got bit by #3793.

I spent a few minutes looking at why

class BrokenReify
  def to_s
    super
  end
end

puts BrokenReify.new.to_s

causes an infinite recursion:

java.lang.StackOverflowError
[...]
        at rubyobj.BrokenReify.to_s(Unknown Source)
        at org.jruby.RubyKernel.to_s(RubyKernel.java:1976)
        at org.jruby.RubyKernel$INVOKER$s$0$0$to_s.call(RubyKernel$INVOKER$s$0$0$to_s.gen)
        at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:737)
        at org.jruby.ir.runtime.IRRuntimeHelpers.instanceSuper(IRRuntimeHelpers.java:983)
        at org.jruby.ir.instructions.InstanceSuperInstr.interpret(InstanceSuperInstr.java:69)
        at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:355)
        at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:73)
        at org.jruby.ir.interpreter.InterpreterEngine.interpret(InterpreterEngine.java:77)
        at org.jruby.internal.runtime.methods.InterpretedIRMethod.INTERPRET_METHOD(InterpretedIRMethod.java:150)
        at org.jruby.internal.runtime.methods.InterpretedIRMethod.call(InterpretedIRMethod.java:137)
        at org.jruby.RubyClass.finvoke(RubyClass.java:557)
        at org.jruby.runtime.Helpers.invoke(Helpers.java:397)
        at org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:380)
        at rubyobj.BrokenReify.to_s(Unknown Source)

In a nutshell, this seems to happen because when calling super, we eventually reach the RubyKernel, which then calls back into the object itself, assuming that the only to_s defined on a ruby object instance is the method in RubyBasicObject.

This assumption is true... except when we are using -Xreify.classes. This is because the Reificator actually adds ruby methods to the class it generates, with a "trampoline" that allows Java code to directly call methods on ruby classes.

Of course, when the class itself overrides methods from BasicObject and calls super, then RubyKernel will instead call the overridden version of the method, rather than the one it intended to call. (E.g. conceptually, we'd want RubyKernel calling those methods with INVOKESPECIAL).

The problem above can thus be trivially fixed by just deleting https://github.com/jruby/jruby/blob/9.1.10.0/core/src/main/java/org/jruby/RubyClass.java#L1547-L1726 .

I'm opening this issue to mostly sum up what I've discovered and to pose a question: Is this behavior of having the generated "trampoline methods" still a thing JRuby wants to have?

I ask because I see -Xreify.classes being useful for two purposes:

  1. To have ruby instances that correspond to their class, rather than RubyObject and friends. This is invaluable when debugging memory issues as otherwise it's very hard to see what's going on in VisualVM and friends

  2. To have a Java class that contains the methods of the ruby class, allowing them to be called by Java code

I believe that supporting just 1) without 2) would be an extremely valuable tool for debugging memory problems with JRuby applications, as otherwise it's very complex to navigate heap dumps.

Fixing -Xreify.classes for this use case can just be deleting the offending code, or alternatively adding a -Xreify.classStorage that enables reification but does not generate the "trampoline methods" would be fantastic. Would any of these be solutions you'd consider accepting?

Otherwise, the alternative as I see it would be to probably duplicate or convert the java methods on RubyBasicObject into static methods, so they could be called directly, rather with dynamic dispatch, but that would probably be a more complex endeavor.

@kares
Copy link
Member

kares commented Jun 2, 2017

great summary, as noted on the other thread I am also interested in reification being improved/reliable.
probably in a way you mentioned (don't care for having it on by default - backward compat first), there's one catch and that is the reusal of reification for become_java! so it needs to be done carefully while at the same time sharing as much as possible (there's a third separate mechanism for generating Java proxies - ideally it could also become somehow part of the show and very optimistically work on edge cases such as when a Java sub-class in Ruby.become_java!).

based on these requirements I see anything else as an optional bonus (-Xreify.classStorage redundant?)

@headius
Copy link
Member

headius commented Jun 6, 2017

Ok so I think this is one of two active class reification issues outstanding, and I'll muse on some solutions here.

I suspect the problem in your case is actually a bug in how we handle super within these trampoline methods. As you discovered, the super call is starting its search at the bottom of the class hierarchy again (in your custom class) rather than at the Object level or higher. As a result it hits the super again and blows the stack.

We can fix that issue by ensuring the trampolines are passing in the correct class, I think.

The more complicated issue here is that there are many such methods that when reified may overwrite core class behavior and act oddly. I'm thinking perhaps we should get class reification working again, but only for methods you explicitly say to export.

This may or may not become a default feature, but I think it should (re)built based on the variable reification logic that we use all the time now. Ultimately all these classes should be generated, and if you opt into it there should be a bottom class for each Ruby type so you can see it in profiles, etc.

In the short term...I'm considering changing the reify.classes flag to at least start generating a real subclass (below the specific-width reify.variables classes) so people can get the benefits of real classes until we fix the method mapping.

headius added a commit that referenced this issue Jun 6, 2017
The goal here is to start splitting off the method reification,
which many (most) people don't need, so we can still get the
benefit of generating a real class for every custom Ruby class
without the hassle of binding a bunch of methods.

See #4643.
@headius
Copy link
Member

headius commented Jun 6, 2017

Oh and to answer the question about trampolining: yes, this is a large part of the reason reification exists: to provide a way for Java interfaces (and plain instance methods) to be implemented such that they'll dynamically dispatch into Ruby and pick up any lazily-defined methods (or method_missing).

headius added a commit that referenced this issue Jun 7, 2017
This is a first step toward unifying all forms of reification.

In this commit, all the hand-implemented RubyObjectVar* and
ObjectAllocator* are instead generated as needed and then reused
statically. Since this will be a bounded number of classes (based
on the max size of inspected instance vars in user classes) it
should not use too much memory.

This now allows us to support reified instance variable fields up
to an arbitrary size.

Note that although the allocators are generated on a per-class
basis (necessary without using MethodHandle for allocation) the
variable accessors are still hand-written and only call the
appropriate getVariable# or setVariable# up to 9 as before. It is
an open TODO to generate these accessors on a per-class basis so
they inline all the way through. The alternative would be moving
toward more MethodHandle-based factories, but that limits
cross-JVM support and may result in startup impact.

For #4643 among others.
@headius
Copy link
Member

headius commented Jun 7, 2017

Ok, moving right along. I just committed logic that generates all the RubyObject0 etc classes along with their ObjectAllocator0. This cleans up a lot of the reification code around instance variables.

The next step would be generating a specific subclass of those width-specific specialized classes that has the name of the Ruby class in question.

For simple cases where the object descends from Object, this will be no problem. We make sure the width-specific class is available, and then generate a new subclass (and allocator, likely) specific to the Ruby class name.

For the more complicated cases where there are other reified classes between ours and Object, things get a little tricky.

For example, the following case:

class Foo
  def initialize
    @foo = 1
  end
end

class Bar
  def initialize
    super
    @bar = 0
  end
end

Bar.new

The Foo class is easy: its Java hierarchy will be `Foo < RubyObject1 < ReifiedRubyObject < RubyObject.

What should Bar's Java hierarchy be?

  1. Ideally it would be Bar < Foo < ... but then we can't reify the @bar variable and it would have to be in an array table, since we've already done our specialization in this hierarchy.
  2. A second alternative would be that Bar and Foo have disjoint class hierarchies, as in Bar < RubyObject2 < ReifiedRubyObject < RubyObject. Obviously this would break some usages from Java, since Ruby would think that a Bar is a Foo, but Java would not. This would also make it harder to e.g. profile for all instances of Foo and its subclasses.
  3. We could generate the field logic specific to Bar by adding the additional getVariable1 style methods for @bar. The down side here is that for all second-level or higher user class (i.e. at least one intervening class before reaching Object) we'd basically duplicate those specific-index methods every time.

The current reification logic for instance variables did not have this problem, because it did not enforce any Java-level hierarchy to match the Ruby hierarchy. Both Foo and Bar would simply use the base RubyObject1 or RubyObject2.

I'll ponder this a bit tonight and see if I can think of anything.

@headius headius added this to the JRuby 9.2.0.0 milestone Jun 7, 2017
@headius
Copy link
Member

headius commented Jun 7, 2017

There's a fourth option: don't reify instance variables if we're reifying classes. That's basically what we had before, except that reifying variables was always on and interfered with reifying classes.

On the pro side this would simplify "structural" reification of the class name and methods. On the con side this would mean losing the memory and performance characteristics of field-based instance variables in exchange for structural reification.

At this point I'm leaning toward just duplicating the get/set logic in each generated class when there's a hierarchy to be maintained, so each class knows how to get its own variables and all variables get reified properly.

In order to have my example above for Bar < Foo produce only those two reified classes, Foo must reify first with var0, and Bar must copy its instance variable accessors plus add another for var1. From then on the two classes would evolve additional ivars independently, using an array table to hold them. But the inspectable vars would be reified immediately.

Note also this work may make it possible to re-specialize a given class at a later time if it turns out there's many dynamically-created instance variables. Ideally all classes that aren't changing a lot at runtime would eventually produce tightly-packed Java objects.

@ivoanjo
Copy link
Contributor Author

ivoanjo commented Jun 7, 2017

Thanks for tackling this and for the great write-up @headius ! 🎉 🚀

@headius
Copy link
Member

headius commented Sep 5, 2017

I would like to merge #4648. Can you confirm it works properly for you?

I have not come up with a good way for us to reify both fields and class identity, given the challenges of supporting parent class fields in children, so the patch in #4648 (which disables one type of reification when you enable the other) may be the best option going forward.

The unfortunate down side is that field reification would be useful to have with class identity reification, since it makes it simpler to drill down into object trees in a heap dump. However it's only marginally better than the var table, since none of the original instance variable names are used (instead it's value0, value1, etc). The best option would be to use the actual names from Ruby code, reify the fields, and still maintain the parent/child class structure in the reified classes. I'm not sure how to do that right now.

@ivoanjo
Copy link
Contributor Author

ivoanjo commented Sep 9, 2017

I'm still alive :)
I'm moving so the last few days have been a bit hectic but I'll test and report back ASAP.

@ivoanjo
Copy link
Contributor Author

ivoanjo commented Sep 9, 2017

Just replied on #4648 (comment)

@headius
Copy link
Member

headius commented May 17, 2018

I've merged in the "new" reify variables logic, which will now generate proper-shaped RubyObject subclasses for any number of statically-inspectable instance variables.

$ jruby -e 'eval "class Foo; def initialize; #{ 100.times.map {|i| "@foo#{i} = 1"}.join("\n")}; end; end"; p JRuby.ref(Foo.new).getClass'
class org.jruby.gen.RubyObject100

This does nothing to fix the combination of reify.classes and reify.variables, nor does it fix reification in the presence of become_java!. The complexities are many:

  • Sure, we can easily generate a properly-named Java class for any Ruby type, but...
  • How do we handle parent/child relationships when both have reified variables?
  • Can we ignore the Ruby hierarchy for this purpose, since as far as JRuby is concerned they're both just RubyObject?
  • Should it be SpecificName extends RubyObject100 or should we regenerate RubyObject100 with a specific name? The latter would be easier for maintaining parent/child relationships, but still would waste memory on parent's fields that don't get used.
  • More work could be done to reify by intelligently sharing parent and child fields, but it gets tricky when there might be dynamic ivars too.

So there's a lot of questions to answer about reification of variables, class names, hierarchies. But it's moving in the right direction.

Bumping to post 9.2.

@headius headius modified the milestones: JRuby 9.2.0.0, JRuby 9.2.1.0 May 17, 2018
@ivoanjo
Copy link
Contributor Author

ivoanjo commented May 17, 2018

Thanks for picking this up! I'm still very interested in the use-Java-performance-tooling-to-peek-at-JRuby angle, so I'm anxiously following new developments :)

@headius
Copy link
Member

headius commented May 17, 2018

Well that is an interesting point. If this feature is focused on presenting named types to the jvm, a lot of the tricky bits above could be ignored. I had hoped to unify all of the different forms of reification, including become_java, but that may be a bit too ambitious given the oddities of the class structure. So if all we really wanted or needed was proper names for the classes, it would not be at all difficult to just generate one more subclass of the specialized type. Basically just support the simplest class reification for heap profiling and other jvm tools.

@ivoanjo
Copy link
Contributor Author

ivoanjo commented May 18, 2018

Yeap, I think being able to use JVM tools easily (or even out of the box), especially in production would be a major value-add.

@enebo enebo modified the milestones: JRuby 9.2.5.0, JRuby 9.2.6.0 Dec 6, 2018
@enebo enebo modified the milestones: JRuby 9.2.6.0, JRuby 9.2.7.0 Dec 19, 2018
@enebo enebo modified the milestones: JRuby 9.2.7.0, JRuby 9.2.8.0 Jan 9, 2019
@headius
Copy link
Member

headius commented Aug 2, 2019

I have a local patch I'm testing that adds reify.variables.name to generate the specialized classes with their "real" names:

[] ~/projects/jruby $ jruby -e "module Foo; class Bar; end; end; puts JRuby.ref(Foo::Bar.new).getClass"
org.jruby.gen.RubyObject0

[] ~/projects/jruby $ jruby -Xreify.variables.name -e "module Foo; class Bar; end; end; puts JRuby.ref(Foo::Bar.new).getClass"
org.jruby.gen.Foo.Bar

headius added a commit to headius/jruby that referenced this issue Aug 2, 2019
This partially addresses jruby#4643.

One of the main features of the old "reify.classes" feature was
that all Object subclasses would reify as their actual names. This
is needed for JVM tooling to separate different types of Ruby
object since normally they'll all appear to be RubyObject or one
of the specialized subclasses like RubyObject1.

However the other features of reify.classes are difficult to
reconcile with reified variables from reify.variables, due to the
complexity of mapping a Ruby class hierarchy into a Java class
hierarchy at the same time we are reifying variables into fields.

This patch goes partway to restore the actual names by forcing
reify.variables to use the actual Ruby class name without
attempting to match the class hierarchy or generate the other
Java methods from reify.classes.

Pass -Xreify.variables.name to get this new behavior (off by
default).
@headius
Copy link
Member

headius commented Aug 2, 2019

Woot, it's working!

$ jruby -J-Xrunhprof:depth=0 -Xreify.variables.name -S gem list > /dev/null
Dumping Java heap ... allocation sites ... done.

$ grep ' org.jruby.gen' java.hprof.txt
   56  0.24% 83.26%    325440 5803    328520  5858 300000 org.jruby.gen.Gem.Dependency
   59  0.22% 83.93%    299248 9336    333840 10417 300000 org.jruby.gen.Gem.Requirement
   60  0.22% 84.15%    294472 1268    294472  1268 300000 org.jruby.gen.Gem.Specification
   99  0.08% 89.15%    111320 1260    111408  1261 300000 org.jruby.gen.Gem.StubSpecification
  142  0.05% 92.03%     71032 1260     71088  1261 300000 org.jruby.gen.Gem.StubSpecification.StubLine
  197  0.03% 94.14%     33960  598     33960   598 300000 org.jruby.gen.Gem.Version
  226  0.02% 94.77%     25688  630     25688   630 300000 org.jruby.gen.Gem.NameTuple

@headius
Copy link
Member

headius commented Aug 2, 2019

I've merged in that new property. It doesn't address the problems with become_java! in the presence of reify.variables so this is still an open issue, probably to address in 9.3.

@headius headius modified the milestones: JRuby 9.2.8.0, JRuby 9.3.0.0 Aug 2, 2019
Adithya-copart pushed a commit to Adithya-copart/jruby that referenced this issue Aug 25, 2019
This partially addresses jruby#4643.

One of the main features of the old "reify.classes" feature was
that all Object subclasses would reify as their actual names. This
is needed for JVM tooling to separate different types of Ruby
object since normally they'll all appear to be RubyObject or one
of the specialized subclasses like RubyObject1.

However the other features of reify.classes are difficult to
reconcile with reified variables from reify.variables, due to the
complexity of mapping a Ruby class hierarchy into a Java class
hierarchy at the same time we are reifying variables into fields.

This patch goes partway to restore the actual names by forcing
reify.variables to use the actual Ruby class name without
attempting to match the class hierarchy or generate the other
Java methods from reify.classes.

Pass -Xreify.variables.name to get this new behavior (off by
default).
@headius
Copy link
Member

headius commented May 26, 2021

The intermediate mode using reify.variables.name solves the main problem here, in that it provides a way to use JVM tooling to see real classes without breaking on core class overrides.

Running the following script produces the subsequent output in a VisualVM heap dump:

script:

class Foo; end
ary = 10000.times.map { Foo.new }
gets

Screen Shot 2021-05-26 at 10 44 15

The remaining problems overriding existing methods from fully reified classes (with methods) remains and will not be fixed for 9.3.

@byteit101 This is an interesting problem you might have some thoughts on, since you worked closely with the reification code for #5270.

@headius headius modified the milestones: JRuby 9.3.0.0, JRuby 9.4.0.0 May 26, 2021
@ivoanjo
Copy link
Contributor Author

ivoanjo commented May 26, 2021

Really awesome, thanks for the update! :)

@headius
Copy link
Member

headius commented Nov 16, 2022

Lots of work has happened in reification since this issue, so I think it's worth punting remaining issues to 9.5.

@byteit101 This might be something you can help with, since you have had your fingers in the other reification logic for real Java subclasses.

@headius headius modified the milestones: JRuby 9.4.0.0, JRuby 9.5.0.0 Nov 16, 2022
@jsvd
Copy link
Contributor

jsvd commented Mar 5, 2024

Here's another interesting trigger, not sure if the issue is exactly the same:

class A;
  attr_reader :id
end
puts A.new.object_id.inspect
❯ JRUBY_OPTS="-Xreify.classes=true" ./vendor/jruby/bin/jruby -v           
jruby 9.4.6.0 (3.1.4) 2024-02-20 576fab2c51 OpenJDK 64-Bit Server VM 17.0.4.1+1 on 17.0.4.1+1 +jit [arm64-darwin]

❯ JRUBY_OPTS="-Xreify.classes=false" ./vendor/jruby/bin/jruby /tmp/test2.rb
4000

❯ JRUBY_OPTS="-Xreify.classes=true" ./vendor/jruby/bin/jruby /tmp/test2.rb 
nil

❯ JRUBY_OPTS="-Xreify.variables.name" ./vendor/jruby/bin/jruby /tmp/test2.rb
4000

We've stumbled upon this in the wild after a user enabled reify.classes by mistake in production.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants