DataFlowProblem and subclasses retain too much memory after running #2270

Open
headius opened this issue Dec 3, 2014 · 10 comments
@headius
Member

headius commented Dec 3, 2014

I am investigating #2266 and have found that there are large numbers of HashMaps and other data structures kept alive by IRScope.dfProbs. On my system, running travis-core tests, JRuby 9k retains around 250MB more memory than JRuby 1.7.16 throughout the suite, causing the OOM (because the suite was already dangerously close to max).

Do we need to retain all data within these problems after they have run? I think now might be a good time to do a pass over all of IR...clear out state that's no longer needed at execution/JIT time and reduce some of these larger data structures down to basics.

FWIW, there may be other areas retaining lots of memory, but HashMap and related objects were the top three items in a heap dump, and everywhere I looked I saw IR holding references to data I don't think it needs for execution.

cc @subbu, @enebo

@subbuss
Contributor

subbuss commented Dec 3, 2014

Yes, we never got to this part quite yet, i.e. what state we want to retain for reuse later, etc. For now (for the 9k initial release), it is okay to clear out all df state once you hit the JIT. Later on, as we integrate profiling, inlining etc, we can figure out what we need to keep around.

@subbuss
Contributor

subbuss commented Dec 3, 2014

All df state is retained in IRScope, and clearing this should be as simple as setting dfProbs = null. CFG can be cleared out as well.
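A minimal sketch of the idea above, assuming a scope object that caches its analysis results in fields. `ScopeSketch`, `recordProblem`, and `releaseAnalysisState` are hypothetical names for illustration; only the `dfProbs` field name comes from this thread, and the real `IRScope` will differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a scope that caches analysis results; not JRuby's actual IRScope.
class ScopeSketch {
    // Results of dataflow passes, keyed by problem name (illustrative only).
    private Map<String, Object> dfProbs = new HashMap<>();
    // Placeholder for a control-flow graph.
    private Object cfg = new Object();

    void recordProblem(String name, Object result) {
        if (dfProbs == null) dfProbs = new HashMap<>();
        dfProbs.put(name, result);
    }

    boolean hasAnalysisState() {
        return dfProbs != null || cfg != null;
    }

    // Drop the references so the garbage collector can reclaim the maps and the graph.
    void releaseAnalysisState() {
        dfProbs = null;
        cfg = null;
    }
}
```

Calling something like `releaseAnalysisState()` once a scope has been handed to the JIT would match the "clear out all df state" suggestion, provided nothing else still holds a reference to the same structures.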

@subbuss
Contributor

subbuss commented Dec 3, 2014

And, I am @subbuss .. I imagine some other subbu is getting your jruby pings ;-)

@headius
Member Author

headius commented Dec 3, 2014

Oops, sorry @subbu and @subbuss :-)

I tried what you suggested, and while it does reduce memory of jitted methods, it seems not enough methods jit to make much difference. I suspect good tests are unlikely to hit most methods enough to trigger JIT at our current threshold, and we've seen this when people report slower-than-MRI test run times.

If I set jit.threshold=0 memory improves a bit, but then the next candidates pop up:

  • BasicBlock holds a reference to its CFG and bbs are held by IRMethod.linearizedBBList. I guess we clear this too?
  • The data structures that make up the directed graph now start to show up...all those Sets. But I suppose this will go when the CFG roots are cleared.

@headius
Member Author

headius commented Dec 3, 2014

Oh right, linearizedBBList is what I use in the JIT. Killing it.

@enebo
Member

enebo commented Dec 3, 2014

"If I had to speculate we are retaining AST and IR data in 9k. So AST probably is getting captured somewhere and ending up pinned."

From another thread. This might not be the bulk of new memory but we should not need it and it is not a small amount of memory.

@headius
Member Author

headius commented Dec 3, 2014

I did not see a lot of AST nodes being retained...at least not in the top heap items.

I've pushed jruby/jruby@memory_opto branch with some of the improvements we have discussed here. Memory is starting to shrink a bit, but we need to do more to flush out unused structures after prepareForInterpretation.

Here's a heap histogram of test:mri after my changes...still a TON of HashSet/HashMap stuff.

@enebo enebo self-assigned this Dec 16, 2014
@enebo
Member

enebo commented Dec 16, 2014

I have been working on this so I will give an update. I separated out the DirectedGraph support into its own artifact 'org.jruby:dirgra' and changed master to use this. In this new package I changed the DG edge list and all vertex in/out sets into primitive arrays + a length field. Informally, an empty Rails project in console heap after this change:

Before: 172M
After: 55M
1.7: 52M

I could probably cut this down more, but at this point I think the next bigger fish to fry is probably variable allocation maps (and a few other maps) in IRScope. I will be landing this later in the week after I am done with my current eval cleanup work.
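For readers unfamiliar with the "array + a length field" change described in this comment, here is a rough sketch of the technique. It is not the dirgra code: `VertexSketch` and its methods are made-up names, and it assumes edge sets are small enough that a linear scan for duplicates is acceptable.

```java
import java.util.Arrays;

// Hypothetical vertex that stores outgoing edges in a plain array instead of a HashSet.
class VertexSketch<T> {
    // Shared empty array so vertices with no edges allocate nothing extra.
    private static final Object[] EMPTY = new Object[0];

    final T data;
    private Object[] out = EMPTY;
    private int outLength = 0;

    VertexSketch(T data) {
        this.data = data;
    }

    // Adds an edge unless it already exists; returns true if it was added.
    boolean addEdgeTo(VertexSketch<T> target) {
        for (int i = 0; i < outLength; i++) {
            if (out[i] == target) return false;
        }
        if (outLength == out.length) {
            out = Arrays.copyOf(out, Math.max(2, out.length * 2));
        }
        out[outLength++] = target;
        return true;
    }

    // Removes an edge by swapping the last element into its slot; order is not preserved.
    boolean removeEdgeTo(VertexSketch<T> target) {
        for (int i = 0; i < outLength; i++) {
            if (out[i] == target) {
                out[i] = out[outLength - 1];
                out[--outLength] = null;
                return true;
            }
        }
        return false;
    }

    int outDegree() {
        return outLength;
    }
}
```

The saving comes from avoiding a per-vertex `HashSet` (which wraps a `HashMap` with one entry object per element) when most vertices have only a handful of edges.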

@enebo
Member

enebo commented Jan 14, 2015

I was trusting jvisualvm for those values and I realized afterwards those values are not trustworthy. Looking at sampled heap in jvisualvm always seems to be accurate (I have done many many runs since then). So at this point in time we are (rails console on empty app):

1.7: 57M
9k HEAD: 114M

So still a ways to go but up to this point I have not tried eliminating ANY analysis data. I know also that cloning all instrs for interpretation is ~10% of the memory. Removing this clone and pushing it to JIT will get rid of that. Last comment is that JIT for rails console is only using 2-3M total. So JIT overhead at least early on is smaller than I expected to see.
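One way to picture "removing this clone and pushing it to JIT" is the sketch below. It is only an illustration under assumptions: `MethodSketch` is a hypothetical class, strings stand in for IR instructions, and it assumes the interpreter can safely read the original instruction list without mutating it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical method holder: the interpreter shares the original list, the JIT copies on demand.
class MethodSketch {
    // Strings stand in for IR instructions in this sketch.
    private final List<String> instrs;
    // Counts how many copies were made, for demonstration only.
    int cloneCount = 0;

    MethodSketch(List<String> instrs) {
        this.instrs = instrs;
    }

    // The interpreter reads the shared list directly; nothing is copied.
    List<String> instrsForInterpreter() {
        return instrs;
    }

    // The JIT gets a private copy, made only when compilation is actually requested.
    List<String> instrsForJit() {
        cloneCount++;
        return new ArrayList<>(instrs);
    }
}
```

Under this arrangement, methods that never reach the JIT threshold never pay for a second copy of their instructions.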

@enebo
Member

enebo commented Jan 18, 2015

This is an ongoing process which is not finished and is not blocking pre1 so I am moving it forward.

@enebo enebo modified the milestone: JRuby 9.0.0.0 Jul 14, 2015