
Leaking Classes since JRuby-9.1.x.x #4142

Closed
agerauer opened this issue Sep 7, 2016 · 19 comments

agerauer commented Sep 7, 2016

Since we updated from JRuby-9.0.5.0 to JRuby-9.1.4.0, our applications seem to leak classes (yes, classes, not objects). The class count is steadily increasing over time, and the increase seems to be load-dependent. We have two different applications displaying the same behaviour.

Example:
In JRuby-9.0.5.0: after some time we had a stable class count of about 25.5k.
In JRuby-9.1.4.0: shortly after start the class count is at 30k; after a few hours we are at 32k and growing.

This graph shows the issue:

https://www.dropbox.com/s/2f7yio5v9ppwzsm/Screenshot%202016-09-07%2010.36.04.png?dl=0

To further investigate I created heap dumps with a few hours time in between. It seems there are more and more classes created for the same Ruby files:

https://www.dropbox.com/s/c6xl14yfsqtuyts/Screenshot%202016-09-07%2011.15.05.png?dl=0

Environment

kares (Member) commented Sep 8, 2016

Interesting - have you tried triggering a full GC? Maybe those classes would end up being collected.
Anyway, this seems most likely to be a JIT regression ... // cc @headius
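
For reference, a full GC can be forced from outside the process with stock JDK tooling; <pid> below is a placeholder for the JRuby process id:

    jcmd <pid> GC.run            # ask the JVM to run a full collection
    jmap -histo:live <pid>       # the :live option also forces a full GC before printing the class histogram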

kares added the jit label Sep 8, 2016
kares added this to the JRuby 9.1.6.0 milestone Sep 8, 2016

agerauer (Author) commented Sep 8, 2016

Just tried running GC on two different servers; it didn't remove any classes.

kares (Member) commented Sep 8, 2016

The tmp prefix (and the fact that, based on the screenshot, the classes are empty) seems to indicate that these aren't JRuby's JIT-ed classes, nor any of the (generated) internal ones - those mostly have a "rubyjit" or "rubyobj" package prefix.

agerauer (Author) commented Sep 8, 2016

kares (Member) commented Sep 8, 2016

Ah, thanks - that confirms it's coming from the JITCompiler ... it seems it's not using the "rubyjit" prefix now.
They might be "valid" (used), and it makes sense that there are more of them over time (probably more stuff is JIT compiling in 9.1 than in 9.0). Each has its own little class-loader but contains different data.

It would make sense to give them better names so they do not look so similar ...

They might be nothing to worry about much - how long did you let it roll? Did you run into any issues?

agerauer (Author) commented Sep 8, 2016

With all other settings unchanged, our JRuby processes now grow to consume a few hundred MB more memory, which led to crashing servers. That's when I started to investigate and look for things that changed since the upgrade.

On our main project I have seen up to 3x as many classes compared to 9.0 before the server ran out of RAM. I don't know how far it would go. The used Metaspace is increasing linearly with the class count. Is there a limit to how many classes the JITCompiler will create? What would happen if I limit the Metaspace with the -XX:MaxMetaspaceSize option?

kares (Member) commented Sep 8, 2016

Thanks - knowing there's a crash eventually is important ... we've seen a report about increased usage (#4127), but that turned out to be app-specific (no clear JRuby leakage found looking at the heap-dump). So maybe there's something after all. If possible, you could try 9.1.2.0 to confirm whether it's acting the same.

Thread dumps would be useful at some point - but if there's no clear leak candidate(s) it might be tricky.

The used Metaspace is increasing linearly with the class count. Is there a limit to how many classes the JITCompiler will create?

There's -Xjit.max, but it might not be taken into consideration on 9K - if you do see linear increases then there probably is an issue. We'll eventually need to look at heap dumps (it would be best to compare with 9.1.2 if you're going to try it and it works).
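
As a rough sketch, assuming the option is still honored on 9k (the value 4096 and the rails server invocation are just illustrative):

    jruby -Xjit.max=4096 -S rails server            # cap how many methods the JIT will compile
    jruby -J-Djruby.jit.max=4096 -S rails server    # same thing expressed as a JVM system property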

What would happen if I limit the Metaspace with the -XX:MaxMetaspaceSize option?

you should get an OutOfMemoryError eventually
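
A minimal sketch of that experiment (512m is an arbitrary cap, and rails server stands in for however the app is launched):

    jruby -J-XX:MaxMetaspaceSize=512m -S rails server    # cap Metaspace growth

Once the cap is reached the JVM raises an OutOfMemoryError for Metaspace, which mainly turns a slow leak into a faster, more visible failure.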

headius (Member) commented Sep 8, 2016

This looks really weird...there shouldn't be more than a couple classes for any given jitted method, and usually just one. I'll have a look at your heap dump and see if I can figure out why we're duplicating so much code.

headius (Member) commented Sep 8, 2016

@Helle Would it be possible to get a copy of one of the dumps? If you're concerned about sensitive information, perhaps we could at least be online at the same time and walk through the heap together? I need to gather more information about these classes to understand why there are so many of them.

Is this running with a single JRuby instance or multiple instances?

It is certainly possible we're jitting more in 9.1 than in 9.0...many JIT bugs were fixed. But I still wouldn't expect to see so many duplicates.

headius (Member) commented Sep 8, 2016

Ahh, another flag/property you could set is jruby.jit.logging=true (-Xjit.logging=true). Ideally, over time, that should show us whether we actually are jitting methods multiple times.
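
For example (rails server is just a stand-in for however the app is started):

    jruby -Xjit.logging=true -S rails server
    # or, when options are passed through the environment:
    export JRUBY_OPTS="-Xjit.logging=true"

Each method the JIT compiles is then logged, so repeated compilations of the same method should stand out in the log.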

lephyrius commented

@Helle How do you get the class count in New Relic?

kares (Member) commented Sep 9, 2016

@headius from some of the images it's apparent that those aren't the same thing (methods).
Currently, all methods/bodies from the same file seem to use the (mangled) __FILE__ as the class name.

agerauer (Author) commented Sep 9, 2016

@headius There are two different projects showing this behaviour, both running on two servers with a single JRuby instance each. I will send you an email with links to the heap dumps. I generated the heap dumps from the smaller of the two projects, with a night of very light traffic in between.

I am going to activate jit.logging=true on one of the servers, and I'm going to change my Xmx settings so the Metaspace has more room to grow. Let's see what happens over the weekend. As I'm on vacation next week, my colleague @cyrez will post the results.

@kares 9.1.2.0 has the same behaviour. It may be normal behaviour in 9.1 to JIT a lot more; I only noticed the difference because it made my servers run out of RAM...

@lephyrius I'm using the New Relic Java Agent for that.
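Without an agent, the JDK itself can report the same loaded-class metric, e.g. via jstat (<pid> is the JRuby process id; the 10-second interval is arbitrary):

    jstat -class <pid> 10000     # prints loaded/unloaded class counts and their sizes every 10 seconds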

kares (Member) commented Sep 9, 2016

... also for comparison it might be valuable to have heap dumps from the same app using 9.0.5 as well

headius (Member) commented Sep 13, 2016

Looking forward to the results! And yes, a heap dump would be useful. You can pass -XX:+HeapDumpOnOutOfMemoryError to the JVM and it will dump right before it croaks. Otherwise, just pull a heap dump using the jmap command at some point when you're pretty sure the leaking is obvious.
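
A minimal example of both approaches (the dump path, <pid>, and the rails server invocation are placeholders):

    # dump automatically when the JVM hits an OutOfMemoryError
    jruby -J-XX:+HeapDumpOnOutOfMemoryError -J-XX:HeapDumpPath=/tmp -S rails server

    # or take a dump on demand once the class count is clearly climbing
    jmap -dump:format=b,file=/tmp/jruby-heap.hprof <pid>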

headius (Member) commented Sep 13, 2016

Oops, got my bugs crossed ... there's already a heap dump for this one. Yes, having a comparison dump for 9.0.5.0 would be a great help.

mgroebner commented

Had some problems collecting useful data. I hope I can send you the dumps tomorrow; I just deployed the application with 9.0.5.0 again. I will include a log file with Xjit.logging enabled from the weekend.

mgroebner commented

In my tests from yesterday it looks like our applications are leaking in 9.0.5.0 as well, so maybe it's a problem in our application. I will investigate further and come back next week with more information.

https://www.dropbox.com/s/izznts4dqyk7grn/Screen%20Shot%202016-09-15%20at%2016.26.53.png?dl=0

headius (Member) commented Sep 15, 2016

Ok, keep us posted. We'll close this for now.
