Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ivar creation with instance_eval triggers lookup hell in other instances #3859

Closed
jsvd opened this issue May 6, 2016 · 6 comments
Closed

Comments

@jsvd
Copy link
Contributor

jsvd commented May 6, 2016

Hi!
Recently a huge slowdown in logstash has been traced to different instances of our LogStash::Pipeline having instance variables with different names between each other. And while these instances are being garbage collected, something is left behind, causing dramatic CPU overhead.

The following script illustrates the scenario:

$i = 1

class A
  def m
    instance_eval "@v_#{$i} = 1"
    $i = $i + 1
  end
end

def new_instance
  A.new.m
  nil
end

s = Time.now
1_000_000.times do |i|
  if i % 10000 == 0
    puts "(#{i}) last 10000 iteration took #{Time.now - s}s" 
    s = Time.now
    new_instance()
  else
    new_instance()
  end
end

Obviously there's an argument as to "WHY WOULD YOU DO THAT" but, that aside, in my perception of the ruby object model, each instance_eval creates the ivar on a single instance, so it's perfectly isolated from the next (instance).
To corroborate with this, I cannot programmatically find a class or object in ruby that shows me a build up/list of "@v_*" instance variables, each instance of A only shows a single ivar.
However, we can see that there's a big impact of their creation:

$ cat ivar_hell.rb 
$ ruby -v
ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
$ ruby ivar_hell.rb 
(0) last 10000 iteration took 6.136e-06s
(10000) last 10000 iteration took 0.468596995s
(20000) last 10000 iteration took 1.013960991s
(30000) last 10000 iteration took 1.340206003s
(40000) last 10000 iteration took 1.484439639s
(50000) last 10000 iteration took 1.781446295s
(60000) last 10000 iteration took 2.132031787s
(70000) last 10000 iteration took 2.505281806s
(80000) last 10000 iteration took 2.686239691s
(90000) last 10000 iteration took 3.013230595s
(100000) last 10000 iteration took 3.305966179s
(110000) last 10000 iteration took 3.574423993s
(120000) last 10000 iteration took 3.879910485s
(130000) last 10000 iteration took 4.333488317s
(140000) last 10000 iteration took 4.584951165s
(150000) last 10000 iteration took 4.932804317s
^C
$ rvm use jruby
Using /home/jsvd/.rvm/gems/jruby-9.0.5.0
$ ruby ivar_hell.rb 
(0) last 10000 iteration took 0.022s
(10000) last 10000 iteration took 6.936s
(20000) last 10000 iteration took 14.802s
(30000) last 10000 iteration took 27.517s
^C

I should note that we can see a speed degradation in MRI as well, but it's an order of magnitude slower for jruby.
Any thoughts on why this happens?

@jsvd jsvd changed the title ivar creating with instance eval creates lookup hell in other instances ivar creation with instance eval triggers lookup hell in other instances May 6, 2016
@jsvd jsvd changed the title ivar creation with instance eval triggers lookup hell in other instances ivar creation with instance_eval triggers lookup hell in other instances May 6, 2016
@headius
Copy link
Member

headius commented May 6, 2016

The main problem here is that instance_eval always creates a singleton class for that object. So every loop through this code, you're creating an object, then creating a totally new class to be its singleton class, then assigning a variable, and finally throwing it away. On top of that, it's a string eval every time, which has to be parsed, compiled into our IR, and finally interpreted.

The fact that it degrades could mean there's a leak, or it could just be reflecting the fact that a very large amount of garbage is being created and forcing the JVM GC to work very hard. That's worth investigating.

FWIW, instance_variable_set does not have this behavior.

@headius
Copy link
Member

headius commented May 6, 2016

Ok, so I suspect the degradation is due to having a new instance variable name every time.

JRuby's instance variable logic (and I believe MRI's is similar) maintains a mapping from name to offset in the object's nearest non-singleton class. In this case, that's A, and every new variable adds another entry adds another entry with "last_offset += 1". We also use the size of that table to proactively allocate a large enough varTable to hold all variables we've seen for that type...so it grows and grows and grows and it takes longer and longer to allocate another A instance.

So, there's a small, slow leak in the variable table that would be hard to eliminate, but the main degradation issue is that the actual array of variables get larger and larger.

If your code is doing this, with a completely new ivar name every time, that's probably not a great pattern for any implementation. We are somewhat limited in how we can map those variables, and like I say I believe MRI does it mostly the same way.

@colinsurprenant
Copy link
Contributor

@headius I confirm that using isntance_variable_set has the same behaviour which is in line with your last observation.

class A
  def initialize(i)
    instance_variable_set("@a#{i}", i)
  end
end

i = 1
start = Time.now

while true
  i += 1
  A.new(i)

  if (i % 1000) == 0
    puts "#{i} last iteration took #{Time.now - start}"
    start = Time.now
  end
end
1000 last iteration took 0.087
2000 last iteration took 0.198
3000 last iteration took 0.147
4000 last iteration took 0.161
5000 last iteration took 0.18
6000 last iteration took 0.165
7000 last iteration took 0.19
8000 last iteration took 0.228
9000 last iteration took 0.235
10000 last iteration took 0.244
11000 last iteration took 0.247
12000 last iteration took 0.265
13000 last iteration took 0.302
14000 last iteration took 0.351
15000 last iteration took 0.352
16000 last iteration took 0.409
17000 last iteration took 0.416
18000 last iteration took 0.445

MRI behaves similarly.

Agreed that the pattern we use here is not good, we should refactor that.
Thanks for the explanation!

@jsvd
Copy link
Contributor Author

jsvd commented May 9, 2016

Thanks for the explanation @headius, it's definitely a bad pattern and we're changing it.
Just out of curiosity, I couldn't find a way to query the singleton class and see this growing list of ivars, is there a way to get that list?

I think we can close this issue since it's a characteristic of Ruby's object model?
Thanks again for the response and congrats on the ruby hero award ;)

@headius
Copy link
Member

headius commented May 9, 2016

@jsvd There's no standard Ruby way, but in JRuby, you can do this:

class A
  def initialize(i)
    instance_eval "@var#{i} = 1"
  end
end

10.times {|i| A.new(i)}

puts JRuby.reference(A).variable_table_manager.variable_names.to_a

This outputs @var0 to @var9.

The justification for not managing an instance variable table on the singleton class is that we'd have to always duplicate the non-singleton class's list into the singleton anyway, since any one of the variables the regular class has seen might get assigned in the singleton at some time in the future.

An alternative mechanism might be to manage instance variable names like we manage methods, via the class hierarchy. However, this would mean having to search that hierarchy frequently and I don't currently know how we'd coordinate monotonically-increasing offsets across a chain of ancestors.

Interestingly, Rubinius does not appear to have this issue for dynamically-assigned variables, but it may be dumping those into a (much slower) hash table on the object rather than assigning an offset to every new variable. I've pinged @evanphx to learn more.

@headius
Copy link
Member

headius commented May 9, 2016

@jsvd Oh, and thanks :-) I will close this issue for now since I don't see us changing how we manage ivars right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants