
Possible memory issue with jruby / java / linux. #4367

Closed
TheWudu opened this issue Dec 7, 2016 · 14 comments

Comments

@TheWudu

TheWudu commented Dec 7, 2016

Hi!

We are experiencing some problems with the memory management of JRuby, the JVM, or Linux. As we are not sure whether the issue belongs to JRuby or not, I just want to know your opinion on it. We do not know how to proceed with this issue any more.

Our environment:

  • JRuby 9.1.5.0
  • Linux 3.13.0-95-generic #142~precise1-Ubuntu SMP Fri Aug 12 18:20:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Options:
    -Djruby.shell=/bin/sh
    -Djffi.boot.library.path=
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
    -Djava.rmi.server.hostname=localhost
    -Djava.security.egd=file:/dev/./urandom
    -Djruby.home=//.rvm/rubies/jruby-9.1.5.0
    -Djruby.lib=//.rvm/rubies/jruby-9.1.5.0/lib
    -Djruby.script=jruby
    -Djruby.daemon.module.name=Trinidad
    -Xmx3193m
    -Xms3193m
    -XX:PermSize=512m
    -XX:+UseG1GC
    -XX:MaxMetaspaceSize=768m
    -XX:MetaspaceSize=512m
    -Xss2048k
    -Xbootclasspath/a://.rvm/rubies/jruby-9.1.5.0/lib/jruby.jar
    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.port=1098
    -Djava.rmi.server.hostname=
    -Dfile.encoding=UTF-8
    -Dcommons.daemon.process.id=11164
    -Dcommons.daemon.process.parent=11163
    -Dcommons.daemon.version=1.0.8 abort
  • Framework: Trinidad + Sinatra

When we look at the htop output, or at our graphs, the server needs more and more memory over time. E.g., it currently shows
VIRT: 11.2G
RES: 6738M
and has been running since last Friday. The server itself has 7984 MB RAM.

When I connect to the process using VisualVM, I see that it uses ~1.5 GB of heap memory, which is below the configured maximum (3193 MB), and about 175 MB of metaspace, which is far below the configured 768 MB. It has between 58 and 62 threads (daemon and live), quite constant with a few spikes up to 72, which are opened and closed immediately.

The problem is that the system believes the process needs that much memory, but the Java monitor shows a completely different status. When I restart the service (restart the process), the memory is freed again.

Can you give any hint on how to debug that further? Or maybe you know something else which could cause such an issue?

Any help would be appreciated,
Martin
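A quick way to quantify the gap described above (RES far above the Java heap) is to compare the OS-level resident set size with what VisualVM reports. A minimal sketch, assuming a Linux /proc filesystem; the `rss_kb` helper and the `PID` variable are purely illustrative, not part of the original report:

```shell
# Compare the OS view of the process (resident set size) with the
# JVM-reported heap; a large and growing gap between the two points at
# native (off-heap) memory rather than the Java heap.

# Hypothetical helper: print the VmRSS value (in kB) from
# /proc/<pid>/status-style input.
rss_kb() {
  awk '/^VmRSS:/ {print $2}'
}

# Usage against a live process (the PID is a placeholder):
#   rss_kb < "/proc/$PID/status"
```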

@kares
Member

kares commented Dec 7, 2016

there have been a couple of similar reports like these lately, but somehow they always turned out to be app-specific.
my first advice is to get a heap dump and start poking around, or set up a JVM monitoring tool and watch when memory usage increases. you should not need that much metaspace unless you're doing hot redeploys.

we can certainly look at a heap dump but it's time-consuming ;( and sometimes needs source-code access.
... are you seeing the "leak-like" behaviour just lately? did you change something, does it get more load?
(all relevant to consider/think about if you're running into this just lately)
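The heap dump and monitoring steps suggested here can be sketched with the standard JDK tools; the PID, file name, and the `jstat_used_kb` helper below are placeholders/illustrations, not part of the original advice:

```shell
# Heap dump for MAT or VisualVM:
#   jmap -dump:live,format=b,file=/tmp/app.hprof "$PID"
# Sample heap and metaspace usage every 5 seconds:
#   jstat -gc "$PID" 5000

# Hypothetical helper: from a `jstat -gc` header line plus one sample line,
# print the old-gen used (OU) plus metaspace used (MU) total in KB.
jstat_used_kb() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
       { printf "%.0f\n", $col["OU"] + $col["MU"] }'
}
```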

@TheWudu
Author

TheWudu commented Dec 7, 2016

We did not have these issues with jruby-1.7.x before, and the load did not really increase. We have about 30-40k rpm on average over the last 7 days, with peaks up to ~150k rpm. The heap dumps do not show anything specific. If I take a dump now it is about 1 GB, even though htop reports about 11.3 GB VIRT memory, and MAT tells me that there are about 300 MB of data on the heap, which matches the VisualVM output.

@TheWudu
Author

TheWudu commented Dec 7, 2016

Additionally, the Leak Suspects report from MAT tells me:
20,790 instances of "org.jruby.ir.IRMethod", loaded by "" occupy 71,470.79 KB (21.69%) bytes.

8,804 instances of "org.jruby.MetaClass", loaded by "" occupy 64,254.32 KB (19.50%) bytes.

Biggest instances:

org.jruby.MetaClass @ 0x725658710 - 9,759.79 KB (2.96%) bytes.
org.jruby.MetaClass @ 0x72d22c4e8 - 6,511.81 KB (1.98%) bytes.

4,938 instances of "org.jruby.RubyClass", loaded by "" occupy 38,600.81 KB (11.72%) bytes.

Biggest instances:

org.jruby.RubyClass @ 0x700a7ebb0 - 4,300.45 KB (1.31%) bytes.

But as these "leak suspects" are part of the 300 MB on the heap, I do not expect them to be the issue.

@kares
Member

kares commented Dec 7, 2016

thus if you're sure there's nothing crazy going on during requests, can you let it (does it) fail with an OOME?
... that would confirm whether there's a heap problem or one elsewhere.

> The problem is, the system believes the process needs that much memory, but the java monitor shows a completely different status. When i restart the service (restart the process) the memory is freed again.

so it must be native memory then? just guessing - there's only so much one can do without looking into it.
naively, did you try going to 9.1.6.0 ?
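If native memory is the suspect, the JVM's own Native Memory Tracking can break non-heap usage down by category. A sketch, assuming a HotSpot JVM with `jcmd` available; the PID and the `nmt_committed_kb` helper are purely illustrative:

```shell
# Restart the JVM with -XX:NativeMemoryTracking=summary, then:
#   jcmd "$PID" VM.native_memory baseline
#   ... wait until memory has grown ...
#   jcmd "$PID" VM.native_memory summary.diff

# Hypothetical helper: pull the committed KB for one NMT category
# (e.g. Thread, Internal, Class) out of `VM.native_memory summary` output.
nmt_committed_kb() {
  awk -v cat="$1" \
    '$0 ~ "- *" cat { if (match($0, /committed=[0-9]+KB/))
                        print substr($0, RSTART + 10, RLENGTH - 12) }'
}
```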

@headius
Member

headius commented Dec 11, 2016

It does sound like a native memory leak based on the numbers so far, especially if Java tools report a significantly smaller heap.

@TheWudu
Author

TheWudu commented Dec 13, 2016

Yes, I suspect the same. Any clue how to find the leak? Any cool tools maybe?

@kares we did not try to use jruby 9.1.6.0 yet.

@headius
Member

headius commented Dec 14, 2016

First thing would be to confirm 9.1.6.0 has the problem. There were many fixes in that release.

After that, there's a link on the wiki to flags for profiling memory, tools for getting and analyzing heap dumps, etc.

@headius
Member

headius commented Dec 14, 2016

Oh, I realize now that some of those tools may not be super helpful for native memory leaks, and you already mentioned you've used MAT to analyze a heap dump. Ok... so let's get you on 9.1.6.0 and see if there's still a problem.
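For native leaks specifically, watching the process from the OS side can still help. A sketch assuming Linux's `pmap`; the snapshot file naming and the `pmap_rss_kb` helper are hypothetical, not a tool from the thread:

```shell
# Sample the per-mapping RSS over time and diff the snapshots to see which
# region (thread stacks, malloc arenas, mapped files, ...) is growing:
#   pmap -x "$PID" > "pmap.$(date +%s).txt"   # repeat, then diff the files

# Hypothetical helper: sum the RSS column (KB) of `pmap -x` output,
# counting only lines that start with a hex mapping address.
pmap_rss_kb() {
  awk '$1 ~ /^[0-9a-f]+$/ { sum += $3 } END { print sum + 0 }'
}
```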

@andreaseger

Hi
first, sorry for the long silence.
Over the last weeks we worked around the leak with regular restarts of the app, while also monitoring the leak through several JRuby upgrades. Both 9.1.6.0 and 9.1.7.0 still showed the leak.

Yesterday we performed the upgrade to 9.1.8.0 and as of now it looks like we're no longer leaking memory.

This shows the used system memory before and after the JRuby upgrade:

[screenshot from 2017-03-22 09-28-44: system memory graph]

@kares kares added this to the Invalid or Duplicate milestone Mar 22, 2017
@headius
Member

headius commented Mar 22, 2017

@andreaseger Great news! Thanks for updating us.

@the-michael-toy

Any idea which fix in 9.1.8.0 would have resolved this? We are seeing a similar problem on some, but not all, instances, where since moving to JRuby 9k we have unbounded native memory growth.

@enebo enebo closed this as completed May 16, 2017
@TheWudu
Author

TheWudu commented May 17, 2017

@the-michael-toy No, we have no idea which fix resolved it.

@adpande

adpande commented Mar 3, 2018

I think the issue has resurfaced in 9.1.12.0

74,318 instances of "org.jruby.RubyClass", loaded by "sun.misc.Launcher$AppClassLoader @ 0x70b600ab8" occupy 402,588,016 (23.82%) bytes.

Keywords
sun.misc.Launcher$AppClassLoader @ 0x70b600ab8
org.jruby.RubyClass

@adpande

adpande commented Mar 3, 2018

Here are the issues from MAT:

[screenshots: ruby-leak1, ruby-leak2, ruby-leak3]


7 participants