DataFlowProblem and subclasses retain too much memory after running #2270
Comments
Yes, we never quite got to this part, i.e. deciding what state we want to retain for later reuse. For now (for the 9k initial release), it is okay to clear out all df state once you hit the JIT. Later on, as we integrate profiling, inlining, etc., we can figure out what we need to keep around.
All df state is retained in IRScope, and clearing this should be as simple as setting dfProbs = null. CFG can be cleared out as well.
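A minimal sketch of that idea, for illustration only. `ScopeSketch` is a hypothetical stand-in for IRScope; apart from the `dfProbs` name mentioned above, every field and method here is an assumption and not the actual JRuby code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for IRScope. Only the dfProbs name comes from the
// discussion; the types and the other members are assumptions.
class ScopeSketch {
    // Data-flow problems that have been run on this scope, keyed by name.
    private Map<String, Object> dfProbs = new HashMap<>();
    // Placeholder for the control-flow graph built during analysis.
    private Object cfg = new Object();

    void recordProblem(String name, Object problem) {
        if (dfProbs == null) dfProbs = new HashMap<>();
        dfProbs.put(name, problem);
    }

    // Called once the scope has been handed to the JIT: dropping the
    // references lets the GC reclaim the maps and sets behind them.
    void clearAnalysisState() {
        dfProbs = null;
        cfg = null;
    }

    boolean hasAnalysisState() {
        return dfProbs != null || cfg != null;
    }
}
```

Nulling the fields only helps if nothing else still points at the same objects, so any other holder of the CFG or the problems would need the same treatment.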
And, I am @subbuss .. I imagine some other subbu is getting your jruby pings ;-)
Oops, sorry @subbu and @subbuss :-) I tried what you suggest, and while it does reduce the memory of jitted methods, it seems that not enough methods JIT to make much difference. I suspect good tests are unlikely to hit most methods often enough to trigger JIT at our current threshold, and we've seen this when people report slower-than-MRI test run times. If I set jit.threshold=0, memory improves a bit, but then the next candidates pop up:
Oh right, linearizedBBList is what I use in the JIT. Killing it.
"If I had to speculate we are retaining AST and IR data in 9k. So AST probably is getting captured somewhere and ending up pinned." From another thread. This might not be the bulk of new memory but we should not need it and it is not a small amount of memory.
I did not see a lot of AST nodes being retained...at least not in the top heap items. I've pushed jruby/jruby@memory_opto branch with some of the improvements we have discussed here. Memory is starting to shrink a bit, but we need to do more to flush out unused structures after prepareForInterpretation. Here's a heap histogram of test:mri after my changes...still a TON of HashSet/HashMap stuff.
I have been working on this so I will give an update. I separated out the DirectedGraph support into its own artifact 'org.jruby:dirgra' and changed master to use this. In this new package I changed the DG edge list and all vertex in/out sets into primitive arrays + a length field. Informally, an empty Rails project in console heap after this change:
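To illustrate the "array + length field" idea in general terms, here is a rough sketch of a small set backed by a plain array. `ArrayEdgeSet` and its methods are hypothetical names invented for this example; this is not the dirgra code, just one way such a structure could look.

```java
import java.util.Arrays;

// Hypothetical small set backed by a plain array plus a length field,
// in place of a HashSet. Lookups are linear, which is acceptable when
// each vertex has only a handful of edges.
class ArrayEdgeSet<T> {
    private Object[] items = new Object[4];
    private int length = 0;

    boolean contains(T item) {
        for (int i = 0; i < length; i++) {
            if (items[i].equals(item)) return true;
        }
        return false;
    }

    // Adds the item unless it is already present; grows the array on demand.
    boolean add(T item) {
        if (contains(item)) return false;
        if (length == items.length) items = Arrays.copyOf(items, length * 2);
        items[length++] = item;
        return true;
    }

    // Removes by moving the last element into the vacated slot.
    boolean remove(T item) {
        for (int i = 0; i < length; i++) {
            if (items[i].equals(item)) {
                items[i] = items[--length];
                items[length] = null; // avoid pinning the removed object
                return true;
            }
        }
        return false;
    }

    int size() {
        return length;
    }
}
```

The trade-off is O(n) membership checks in exchange for skipping the per-entry node objects and hash table that a HashSet allocates, which is where the savings would come from when most sets are tiny.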
I could probably cut this down more, but at this point I think the next bigger fish to fry is probably the variable allocation maps (and a few other maps) in IRScope. I will be landing this later in the week after I am done with my current eval cleanup work.
I was trusting jvisualvm for those values and I realized afterwards those values are not trustworthy. Looking at sampled heap in jvisualvm always seems to be accurate (I have done many many runs since then). So at this point in time we are (rails console on empty app):
So still a ways to go but up to this point I have not tried eliminating ANY analysis data. I know also that cloning all instrs for interpretation is ~10% of the memory. Removing this clone and pushing it to JIT will get rid of that. Last comment is that JIT for rails console is only using 2-3M total. So JIT overhead at least early on is smaller than I expected to see.
This is an ongoing process which is not finished and is not blocking pre1 so I am moving it forward.
I am investigating #2266 and have found that there are large numbers of HashMaps and other data structures kept alive by IRScope.dfProbs. On my system, running travis-core tests, JRuby 9k retains around 250MB more memory than JRuby 1.7.16 throughout the suite, causing the OOM (because the suite was already dangerously close to max).
Do we need to retain all data within these problems after they have run? I think now might be a good time to do a pass over all of IR...clear out state that's no longer needed at execution/JIT time and reduce some of these larger data structures down to basics.
FWIW, there may be other areas retaining lots of memory, but HashMap and related objects were the top three items in a heap dump, and everywhere I looked I saw IR holding references to data I don't think it needs for execution.
cc @subbu, @enebo