IBM JRE crashes with jruby 9.1.2.0 when JRuby JIT is enabled #3964
Comments
Gotta be a version mismatch somewhere in your env, since 9.1.2.0 can boot OpenSSL just fine on its own (or we wouldn't even be able to install gems). How did you do that upgrade? Is it possible two different versions of JRuby are getting spliced together? |
I'm able to run rails fine locally, this error only happens when deploying the war file to websphere. |
@atambo If you can turn on JRuby debug mode (-d or equivalent) on that server, we'd be able to see the full Java trace of that error. It would be helpful. I must reiterate, though...assuming OpenSSL is loading properly for you locally, this error should not be possible unless there's another stale JRuby floating around somewhere (or a stale jruby-openssl). |
Is it possible to turn on JRuby debug mode through a java property or environment variable? I package the rails application using https://github.com/jruby/warbler so I'm not sure how to pass the -d flag to jruby. |
@atambo I see in your gist that you are loading jopenssl.jar twice. Is the app configured to run two JRuby runtimes? If yes, does it work when using only one runtime? |
Here is a more full backtrace: |
@mkristian I have the |
@mkristian as for my app configuration I have it configured to use one runtime using these settings:
|
@atambo So the loadService log clearly shows that a lot of files are loaded exactly twice: jars as well as regular Ruby files. It almost looks like all files are loaded twice, and it almost looks as if a new Ruby gets spawned, the |
I do see it loading everything twice. I'm not sure what would be causing that as I am not setting multiple runtimes. Is it possible my rails application is doing some weird threading or something that could be causing this issue? I tried turning https://gist.github.com/atambo/06e597b7765649e711b189b4e3893a4c I'm not sure how to read that javacore to see the reason. |
@atambo segfault :(
just to see if those MethodNotFoundErrors go away. looking at your original gist of the loadService:
I expect java.rb to be found and loaded only once. Just saying something is odd here. |
So when I manually put
So those jars being in the WEB-INF/lib are related to the segfault somehow. |
You also need to remove those jars from the gem location and replace them with empty jars. We just need to make sure that those jars are only ONCE on the classloader hierarchy. |
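The "only once on the classloader hierarchy" condition can be roughly checked on the filesystem before deploying. The following is a minimal sketch, assuming an exploded WAR layout. The directory names and the `artifact_name` and `duplicate_jars` helpers are hypothetical, and the version-stripping regex is only a heuristic. It compares file names only, so it cannot see jars supplied by the container's own classloaders.

```ruby
# Hypothetical helper: strip a trailing "-<version>.jar" (or just ".jar") so
# that two versions of the same artifact compare as equal.
def artifact_name(jar_path)
  File.basename(jar_path).sub(/(-\d+(\.\d+)*)?\.jar\z/, "")
end

# Returns a hash of artifact name => paths for every artifact that shows up
# more than once under the given root directories.
def duplicate_jars(roots)
  jars = roots.flat_map { |root| Dir.glob(File.join(root, "**", "*.jar")) }
  jars.group_by { |path| artifact_name(path) }
      .select { |_name, paths| paths.size > 1 }
end

# Assumed locations inside an exploded WAR; adjust to the real layout.
roots = ["WEB-INF/lib", "WEB-INF/gems"]
duplicate_jars(roots).each do |name, paths|
  puts "#{name} found #{paths.size} times:"
  paths.each { |path| puts "  #{path}" }
end
```

An empty result only means no duplicate file names were found under those roots. A jar that the application server itself provides would still have to be ruled out separately.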
So if I delete the jars from the jruby-openssl gem and the stdlib and just have them be in the WEB-INF/lib I get the following error:
I saw there is an environment variable Which gets me past the above error and gets me a new error:
It doesn't seem to be looking in WEB-INF/lib when loading jar files. |
So it looks like the |
@atambo hmm. when removing them you need to replace them with empty jars. not sure about the last error. not moving the jars would be the better option but it fails with the segfault. I remember seeing this before. |
@atambo I think reverse the class loader lookup with |
This might turn out to be yet another specific case caused by the IBM JVM + WebSphere combo ... if Kristian's shots turn out to be a dead end, it would be interesting to know whether 1.7.25 + JRuby-OpenSSL works fine (there's the forced jar-dependency, which might not be there in 1.7.25's JOSSL). |
@kares FYI: I did see the segfault when the OSGi container provided the BC jars 1.52 for all bundles. |
@mkristian So when I try However, when I add
So this is definitely tied to this code block: whenever it runs, the JVM ends up segfaulting: https://github.com/jruby/jruby-openssl/blob/master/lib/jopenssl/load.rb#L7 |
@mkristian do you know of some way to isolate my application from the container's bouncy castle jars? |
@atambo The only thing I know of to separate the app from the underlying container is this classloader.delegate. Are you sure there is another BC on WebSphere? Which version? Did you try jruby-openssl-0.9.14, which comes with an older BC version? |
@mkristian I'm not sure if there is another BC on websphere, I just thought that your comment above to @kares was stating that maybe BC is bundled inside of websphere. I tried using So I guess where I'm at now is that whenever bouncy castle gets loaded the jvm segfaults. |
@atambo two last shots in the dark:
|
The little bit of Java call stack that can be gleaned from IBM Java's javacore*.txt is this:
The crash is bad enough that this stack might be wrong, although we've seen the exact same stack trace twice. (I'm in the same office with @atambo.) |
@mkristian so when I try putting the BC jars into WEB-INF/lib and When I also add So it is something to do with the JIT + the BC jars. @headius any ideas on how I can try to track down the ibm jdk/jruby JIT issues? |
@atambo maybe the default warbler config does work as well with |
@mkristian so using the default warbler config (not doing So it looks like the real issue is that the JRuby JIT + jruby-openssl crashes the IBM JVM. I'm going to collect all the javacore dump files and make them available and also try upgrading the IBM JDK and see if that fixes anything. |
So I tried with the newest IBM JDK and the crash still happens. I also tried with Oracle JDK 8 and Open JDK 8 and they both run fine without crashing. I've submitted the below debugging information to the IBM JDK folks but here it is as well if anybody knows how to read this stuff: websphere logs: /var/log/messages: java dump stuff: |
This turns out to be easy to reproduce. With JRuby 9.1.2.0 on your PATH for
Note: the
💥 WLP will crash with a final message something like:
|
The thread that crashes appears to be in the process of re-raising an exception using Maybe @mstoodle or @charliegracie can route us to someone that can read these J9 crash reports and tell us if we're doing something J9 can't deal with? |
For the IBM people the PMR number is 72179,001,866. |
Thanks for adding the PMR #. I will talk with folks in the morning to check the status. I will also try to reproduce locally. |
I was able to reproduce it easily. It is nice to have a reproducible test case. The IBM JVM is crashing while walking the thread's stack to build the exception. I believe it is crashing while getting the details for a JIT'd frame. I will dig a bit deeper to see if I can figure out what is wrong with this frame. I am also attempting to write a much simpler test case. |
@charliegracie Thanks very much! Let us know if there's something we need to change. We do use some peculiar class/method names to encode Ruby identifiers, but according to my understanding of the JVM spec they should all be kosher. |
I was not able to create a simpler test case. I have taken the investigation as far as I can on my own, and I am now discussing it further with the folks who own the PMR and are responsible for the code that is crashing. |
@charliegracie @mstoodle @DanHeidinga Thanks again for your help. Any luck? If there's something simple I can change to make J9 happy, that's an option too. We are hoping to put out JRuby 9.1.3.0 in the next week or so. |
Punting to 9.1.4.0 in hopes that our IBM friends can help us then :-) |
We had a bit of a breakthrough on this today and believe that we know the root cause of the issue: interpreter -> jit state transition data is being overwritten when dropping to the stack frame for the method that handles the exception being thrown. For anyone running into this issue, we do have a workaround that applies to this test: run with A real fix will require changes to J9. I'll update this again when we have the fix in hand. |
@DanHeidinga So I'm right in thinking there's nothing we need to change, right? This is just a lucky shape of code that triggers your issue? |
I'm going to remove release milestone from this since it's sounding like there's nothing on our end to change. We are happy to have a resolution in sight! |
@headius Agreed, this is purely a JVM bug. No JRuby changes required. |
Looks like this is now fixed: http://www-01.ibm.com/support/docview.wss?uid=swg1IV88798 Fix will be released in IBM JRE 8.0.3.20. |
@atambo Thank you for following up! Woohoo! |
@atambo Sorry to dig up an ancient thread, but have you tried upgrading to JRuby 9.2.5.0 or 9.2.6.0? I previously ran into the same issue as you (which was resolved with a JRE update), and I am now running into a similar issue that can also be worked around by disabling JIT. Previously, 9.2.0.0 still worked OK. |
Sorry @karlhe, I no longer work at IBM on the product that was using websphere and jruby so I haven't done any upgrading. If you work for IBM there is an internal website where you can report the segfault. |
@atambo I unfortunately don't work for IBM anymore either, but thanks anyway :) |
So I upgraded from jruby 1.7.25 to jruby 9.1.2.0 and now the IBM JRE crashes whenever I try to run a rails application deployed to websphere using warbler.
I can prevent the crash by running with the JRuby JIT disabled like so: -Djruby.compile.mode=OFF
Here are the logs/java crash dumps:
websphere logs:
http://alextambellini.com/debug/console.log
http://alextambellini.com/debug/messages.log
/var/log/messages:
http://alextambellini.com/debug/system_messages.txt
java dump stuff:
http://alextambellini.com/debug/javacore.20160615.160251.9932.0002.txt
http://alextambellini.com/debug/core.20160615.160251.9932.0001.dmp
http://alextambellini.com/debug/core.20160615.160251.9932.0001.dmp.zip
http://alextambellini.com/debug/jitdump.20160615.160251.9932.0004.dmp
http://alextambellini.com/debug/Snap.20160615.160251.9932.0003.trc
https://gist.github.com/atambo/66e7f0093b7741737518e5ce8b306fdd
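Because the crash only goes away when `-Djruby.compile.mode=OFF` actually reaches the JVM that hosts the application, it can help to confirm the setting from inside the app. This is a minimal sketch, assuming the option is passed as a JVM system property. `jruby_compile_mode` is a hypothetical helper name, not part of JRuby or Warbler, and it simply returns nil when run outside JRuby or when the property is unset.

```ruby
# Hypothetical helper (not part of JRuby or Warbler): report the value of the
# jruby.compile.mode system property as seen by the running JVM.
def jruby_compile_mode
  # Outside JRuby there is no JVM to query.
  return nil unless RUBY_PLATFORM == "java"

  require "java"
  # Returns nil when the property was never set on this JVM.
  java.lang.System.getProperty("jruby.compile.mode")
end

mode = jruby_compile_mode
puts "jruby.compile.mode = #{mode.inspect}"
```

Logging this value once at application startup would show whether the container picked up the option. A nil here suggests the property never made it onto the server's JVM command line.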
Environment
@mkristian, I know you've looked into similar things before (#1549). I'd be happy to let you look at whatever or get you stuff you need to help debug the issue.