Various literal to_proc'ed Symbol optimizations. #3571

Merged
merged 1 commit into from
Dec 30, 2015

headius
Member

@headius headius commented Dec 28, 2015

Does this seem like a good optimization? Are there any gotchas I've missed?

* Use a dummy binding rather than creating new every time.
* Cache the resulting proc at create site.

The second optimization means a given `&:foo` in code creates only a
single Proc, ever, and caches it at that point in the code. This is
based on the observation that symbol procs are typically used to
iterate over homogeneous collections of objects, so caching the proc
lets its cache stay populated and local to the related code. It also
eliminates the allocation of a Block, BlockBody, and RubyProc on each
encounter, which improves perf even for heterogeneous collections
with poor cacheability.
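
To illustrate the caching idea in plain Ruby (a rough sketch only: the actual change lives in JRuby's Java internals, and `SymbolProcSite` is a hypothetical name invented for this example), each `&:foo` site can be thought of as holding its own lazily created proc:

```ruby
# Hypothetical stand-in for one `&:foo` occurrence in source code.
# This only models the idea; it is not JRuby's implementation.
class SymbolProcSite
  def initialize(name)
    @name = name
    @cached = nil
  end

  # Create the proc on first use, then return the same object every time.
  def fetch
    @cached ||= @name.to_proc
  end
end

site = SymbolProcSite.new(:to_s)

first  = site.fetch   # allocates the proc
second = site.fetch   # reuses the cached proc

strings = [1, 2, 3].map(&first)
```

Because `fetch` always returns the same object, any per-proc state stays warm across repeated passes over similar collections.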

Benchmark:

```ruby
require 'benchmark'

# Runs forever, printing one timing line per million-iteration pass.
loop {
  puts Benchmark.measure {
    ary = [1,2,3,4]
    1_000_000.times {
      ary.each(&:object_id)
    }
  }
}
```

Before:

```
  1.270000   0.070000   1.340000 (  0.710043)
  0.640000   0.020000   0.660000 (  0.511692)
  0.470000   0.000000   0.470000 (  0.460667)
  0.490000   0.010000   0.500000 (  0.480732)
  0.470000   0.000000   0.470000 (  0.462888)
```

Just the dummy binding optimization:

```
  1.210000   0.070000   1.280000 (  0.660924)
  0.540000   0.020000   0.560000 (  0.432614)
  0.430000   0.000000   0.430000 (  0.422502)
  0.430000   0.000000   0.430000 (  0.416549)
  0.410000   0.010000   0.420000 (  0.412461)
```

And with proc caching:

```
  0.890000   0.060000   0.950000 (  0.456065)
  0.410000   0.020000   0.430000 (  0.279023)
  0.290000   0.000000   0.290000 (  0.282117)
  0.300000   0.010000   0.310000 (  0.288516)
  0.270000   0.000000   0.270000 (  0.270100)
```
@headius
Member Author

headius commented Dec 28, 2015

Note that the dummy binding improves the overall cost of calling Symbol#to_proc as well, even if the proc is not cached.

@enebo
Member

enebo commented Dec 29, 2015

My only reservation would be any complexity it would add to Block/BlockBody, but it appears to add none (one custom type replaced by another). At worst, there is something weird with exotic parameter binding?

I am wondering if we can make a to_proc in Ruby (or IR assembly) and then potentially inline it? I guess since inlining is still in its infant stages this might not be worth discussing now...

@headius
Member Author

headius commented Dec 29, 2015

The new inner class just pulled out an existing anon inner class, so there are no new classes. That part of the change was mostly lateral, but I hate large anon inner classes.

My main concerns were around the dummy binding, but that seems fine in testing since the block body is not in Ruby. The proc should perhaps be frozen...but can you do anything to a proc?

Inlining...yes, that is definitely a possibility. It may be easier without the moving target of a new proc every time anyway, though. If we can inline to_proc, we could see it is a send of a constant name, which is just a call site. We'd need to inline through #each as well but in theory it could go all the way through to the method that the send actually calls.

In any case, knowing it is a constant block rather than a new one should help that.
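
For reference, a small sketch of the equivalence that inlining would rely on (the variable names are illustrative only): a literal `&:foo` behaves like a block that sends the constant name to each element, which is what would let it collapse into an ordinary call site.

```ruby
ary = [1, 2, 3, 4]

# Symbol proc, as written in user code.
via_symbol = ary.map(&:to_s)

# What the symbol proc does conceptually: a send of a constant name.
via_send = ary.map { |x| x.send(:to_s) }

# What that send reduces to once the name is known: a plain call site.
via_call = ary.map { |x| x.to_s }
```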

headius added a commit that referenced this pull request Dec 30, 2015
Various literal to_proc'ed Symbol optimizations.
@headius headius merged commit cca4a33 into master Dec 30, 2015
@headius headius deleted the symbol_to_proc_cache branch December 30, 2015 20:42
@headius headius restored the symbol_to_proc_cache branch January 19, 2016 16:02
@headius headius deleted the symbol_to_proc_cache branch January 19, 2016 16:04
@enebo enebo modified the milestone: Non-Release May 25, 2016