-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-raising the cause of a Java exception can result in leaked memory and IllegalStateExceptions #4699
Comments
interesting, so the leak was only relevant on 9.1.7.0 (am confused since the reports starts with 9.1.12) ? |
We @swiftype have 9.1.7.0 in production and that's where we've observed the leak. We've rolled back, updated the code and deployed the changed version to fix the leak. 9.1.12.0 is mentioned because it was one of the versions tested locally while trying to reproduce the issue locally in development. |
@kovyrin so one theory would be we are storing the result of cause in a temporary variable in that rescue and then raising out the value of that temp. Since we are not niling out temporary variable it would leak. HOWEVER, this is unwinding the stack because we are raising so that temporary variable should no longer be in scope and freed. If I was to keep with this theory this thread basically dies here and does not unwind which would then leak...but then I think you would have lots of threads laying around. So probably not that? hmm |
Ok, so there seems to be two issues here.
The first problem is caused by two conditions in Throwable.initCause:
I will fix the double-init issue. @marshalium would you be able to test a master or 9.1.13 HEAD build? |
I've pushed a fix to master and jruby-9.1.13.0 branches. |
@headius Thanks for looking into this! Unfortunately, I ran the script from from my original comment on the I dug into this a bit, and I think I discovered more of what's actually going on, but I don't know how to fix it. I added code to RubyKernel.maybeRaiseJavaException(), right before the ex.initCause call, to log thread and object ids:
And what it logged showed that multiple threads are trying to set the cause on the exact same exception object:
The check that got added didn't fix anything since this is race condition where multiple threads can see the cause as The reason I was seeing this with Guava CacheLoader is that when multiple threads try to load the same key in parallel, threads after the first one block on a future. Once the loading thread sets an error on the future the waiting threads see it and raise an UncheckedExecutionException whose cause is set to the error instance from the future. When the JRuby code re-raises the cause of the UncheckedExecutionException multiple JRuby threads end up raising the exact same exception instance and so they get IllegalStateExceptions. Here's a gist with a simpler method of reproducing the issue that does not depend on Guava: https://gist.github.com/marshalium/64e9ebb4bb4ac107eee1f5a629feee18 |
Ok, so this is a weird behavior of initCause that interacts badly with threads. Thanks for the extra investigation. I think the options we have are:
I am leaning toward (1) since Java typically wants you to init the cause yourself, rather than doing it automatically. I will implement (2) or (3) for now. |
Another option would be not setting cause, but adding the Ruby exception to "suppressed". |
New repro gist: confirmed broken on master. Still lots of debate in my mind about whether to sync, set suppressed, or not set anything. |
sync patch pushed to master in 1f2e040 |
@kares I think you added the logic to set cause. What would you think about just dropping it? Java does not automatically set a cause for you unless you pass one into the constructor, so us setting it later based on a Ruby raise -- which might happen many times for the same throwable -- seems like the wrong behavior. |
I vote for removal. |
right, this is logic added in 9.1 to go along with with Ruby's (Ruby) exception.cause handling b8933ee not sure how much its used/spec-ed (probably very little if any). seemed like a good fit to match JI the way Ruby's cause works - a passed or found in error-info gets propagated. JRuby already has some places where the RaiseException's cause is init-ed (pbly all local - single threaded), since its often hard to tell a failure not seeing the native Java cause - that was the motivation to further help reduce the pain around JI although it makes much less sense now that I think about it. obviously did not think too far :) ... feel free to remove for 9.2 we can always re-add later if someone complains he/she is not getting its Java cause set on a throwable |
I will remove it and we can discuss further whether we want to restore this or something like it for 9.2. My remaining thoughts on it: Java only allows setting cause at construction time, or exactly once after that. Ruby sets it to In Java, cause is never automatically set for you. If someone is re-raising a Java exception in Ruby, they probably don't want to attach a cause (which might be the source of leaks, if those exceptions carry too much data and stick around). If someone is raising a new Java exception, they might want to provide a cause. We can't easily do it automatically since we would have to modify the constructor call, and we probably wouldn't want to add something that magic. Unfortunately we also can't pass the exception, because the object in $! is a RubyException, not a subclass of Throwable. I think we should open a new feature/bug to make it possible for a user to provide a cause to Java, but more and more I think doing it automatically is a mismatch. |
I've confirmed that my tests pass on 9.1.13.0. I can't test for the memory leak until we get 9.1.13.0 into production, but I'll file another ticket (ideally with some kind of dev reproducible case) if we encounter that again. Thanks for the fix! |
Environment
I only reproduced this using Guava, but I believe it is not a Guava specific bug. If you'd like me to reproduce without an external dependency I'm willing to try, but it was easier to just use Gauva.
Expected Behavior
Re-raising the cause of a Java exception should work without error or leaked memory.
This script reproduces part of the issue. It runs successfully on 1.7.27 but fails on 9.1.12.0.
To run, download the Guava jar https://repo1.maven.org/maven2/com/google/guava/guava/22.0/guava-22.0.jar, and place it this script:
Actual Behavior
IllegalStateExceptions
Running the above script on 9.1.12.0 results in all threads except for one dying from IllegalStateExceptions. It does not fail if
NUM_THREADS = 1
.Memory leaks
Regarding the memory leak, I have not yet been able to reproduce outside of production but running code similar to what is above resulted in several threads that each had a single 1.1GB
RubyArrayOne
objectthat contained many
ConcreteJavaProxy
s that wrapped the exception being thrown:I changed our prod code from
to
and the memory leak went away. So I believe that re-raising a cause from inside the rescue in cases like this can also result in circular dependencies that leak memory.
The memory leak occurred on Linux with JRuby 9.1.7.0:
Let me know if there is any more useful detail that I should add!
The text was updated successfully, but these errors were encountered: