Truffle symbol refactor #3029

chrisseaton · 2015-06-09T20:19:20Z

Big refactoring of our symbol implementation, removing the RubySymbol class, removing some potential problems with shared mutable ByteList, moving some stuff into Ruby, optimising some methods, adding symbol GC and completing specs. This leaves us with the whole of Symbol implemented in just 400 lines of Java, including license blocks.

I added caching for Symbol#to_proc. This is an extremely common pattern, so it's surprising nobody else has noticed this is hot and done the same optimisation as us. I hope it's correct - it passes all specs and makes sense to me but please have a think about it as well.

numbers = [3.14] * 100_000

loop do
  start = Time.now
  1000.times do
    numbers.each(&:abs)
  end
  puts Time.now - start
end

$ ruby sym-to-proc.rb 
4.840322
4.735055
4.625257
4.727999
4.758009

$ bin/jruby sym-to-proc.rb 
5.45
5.398
5.376
5.306
5.401
5.335

$ ~/.rbenv/versions/rbx-2.5.5/bin/ruby sym-to-proc.rb 
29.604462
29.749868
29.63042
30.040485
29.486567

$ ~/.rbenv/versions/topaz-dev/bin/ruby ../sym-to-proc.rb 
4.66224217415
4.56306409836
4.71132302284
4.65929985046
4.5716149807
4.54749894142

$ jt run --graal ../sym-to-proc.rb 
0.7869999999999999
0.29500000000000004
0.11099999999999999
0.112
0.10200000000000001
0.095
0.09200000000000001
0.094
0.08700000000000001
0.08600000000000001
0.08600000000000001
0.08600000000000001

There's clearly something badly wrong with Rubinius there - but even though we're using their code for a lot of Symbol it doesn't seem to matter for us.

Also note that our warmup is much better these days. We're warmed up within a second or so, and our first-iteration time is already better than anyone else's warmed-up time.

@nirvdrum @eregon @pitr-ch please review. @thomaswue for information.

This reverts commit bbdd0ab.

eregon · 2015-06-09T20:59:44Z

MRI has a global Symbol#to_proc cache (for 67 symbols).
People used to write Symbol#to_proc themselves before it was added to MRI (1.8 era I think), and MRI's native implementation was already performing much better than the hand-written one.
It's sort of common knowledge that the explicit block form ( .each { |e| e.sym } ) is faster, so it would be pretty interesting to have a benchmark to compare.

Are you sure abs is called at all in this case? It is side-effect free and the result is not kept anywhere, so it could be optimized away (map! could be an alternative that keeps the result and is less likely to be optimized away).
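A minimal sketch of the comparison suggested here, not something that was run for this PR. It uses map rather than each so the results are kept, and times the Symbol#to_proc form against the explicit block form. The `elapsed` helper and the array and iteration sizes are assumptions chosen to keep the run short.

```ruby
# Hypothetical helper: returns elapsed wall-clock seconds for the given block.
def elapsed
  start = Time.now
  yield
  Time.now - start
end

# Negative values so that abs visibly changes the data.
numbers = [-3.14] * 10_000

via_symbol = nil
via_block = nil

# Symbol#to_proc form; map keeps the results so the calls are observable.
symbol_time = elapsed { 100.times { via_symbol = numbers.map(&:abs) } }

# Explicit block form doing the same work.
block_time = elapsed { 100.times { via_block = numbers.map { |e| e.abs } } }

puts "map(&:abs)        #{symbol_time}"
puts "map { |e| e.abs } #{block_time}"
```

Both forms produce the same array, so any difference in the timings comes from how the block is created and called rather than from the work done.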

I am thinking caching the generated Proc per Symbol might make sense too if it's used in multiple locations, so only one Proc would need to be created for a given Symbol (but that would need some care in multi-threaded code).
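As a rough illustration of that idea in plain Ruby (not the Truffle implementation), a per-Symbol cache could look like the sketch below. `SymbolProcCache` is a hypothetical name, and the Mutex is one simple way to handle the multi-threaded case.

```ruby
# Hypothetical global cache: one Proc per Symbol, guarded by a Mutex.
class SymbolProcCache
  def initialize
    @procs = {}
    @lock = Mutex.new
  end

  # Returns the same Proc object for a given Symbol on every call.
  def fetch(symbol)
    @lock.synchronize do
      @procs[symbol] ||= symbol.to_proc
    end
  end
end

cache = SymbolProcCache.new
abs_proc = cache.fetch(:abs)

puts abs_proc.equal?(cache.fetch(:abs))  # prints true
puts [-1, 2, -3].map(&abs_proc).inspect  # prints [1, 2, 3]
```

Taking the lock on every lookup is the simplest correct option; a real implementation would probably want a cheaper read path.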
I'll review the diff tomorrow.

chrisseaton · 2015-06-09T21:43:44Z

@eregon you are always useful for explaining the historical context of all of this!

The abs is entirely removed - I can't see it anywhere in the graph. The loop is not removed, though, because the array escapes and so its length could change. The graph is very understandable if you try IGV.

I deliberately used something that could be removed to make the difference as extreme as possible - like a PE test. If you use a user-written empty method instead of abs it's actually a little slower - not sure why. Something in the prelude maybe?
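For reference, the empty-method variant could be written roughly like this. It is a sketch, assuming a hypothetical method name `noop` added to Float and smaller sizes than the original script so that it finishes quickly.

```ruby
# Hypothetical stand-in for abs: a user-written method with an empty body.
class Float
  def noop
  end
end

numbers = [3.14] * 10_000

# Same shape as the original benchmark, with noop in place of abs.
3.times do
  start = Time.now
  100.times do
    numbers.each(&:noop)
  end
  puts Time.now - start
end
```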

I didn't put the proc inside the symbol object, as I'm not sure to_proc is used on many symbols, and either we take up a lot of space by having a field in all symbols, or we have two kinds of symbols which might make this bimorphic. A global cache would be escaped.

nirvdrum · 2015-06-09T22:20:16Z

Re: Symbol#to_proc . . . it's part of https://github.com/JuanitoFatas/fast-ruby as a benchmark. And it links to an interesting discussion on Rails going back and forth on its use.

eregon · 2015-06-10T11:32:06Z

@chrisseaton Nice IGV graph indeed!
Although it does seem a bit unfair if we want to compare mostly Symbol#to_proc.
Keeping the Proc in the node seems like the best solution for now.

eregon · 2015-06-10T11:50:28Z

Looks good, does it have any impact on the benchmarks?

chrisseaton added 17 commits June 5, 2015 22:40

[Truffle] Remove use of Java 8 method.

8aafce4

[Truffle] For hot paths, use RubySymbol.getString rather than toString

4c8ee62

[Truffle] Move most of Symbol to Ruby.

72e4bc8

[Truffle] Simplify Symbol#to_proc

73543a3

[Truffle] Cache the hash code for RubySymbol.

48ea137

[Truffle] More general tidy up of Symbol.

d4a9b11

[Truffle] Cache Symbol#to_proc

a645073

[Truffle] Switch RubySymbol fields to OM.

c2f0785

[Truffle] bytes to byteList in Symbol for consistency.

b58bbf9

[Truffle] Remove most usages of RubySymbol.

2eae47d

[Truffle] Remove RubySymbol.

1b5ee39

Loosen the specs for Symbol#all_symbols increasing in size.

5b0d8f8

[Truffle] Simplify Symbol.all_symbols

8af4e38

[Truffle] Sometimes we need to handle Rubinius undefined as our missing.

2cb7716

[Truffle] The proc from Symbol#to_proc needs an arity check.

3528613

Revert "Revert "Merge branch 'truffle-symbol-refactor'""

eb4f675

This reverts commit bbdd0ab.

Merge branch 'master' into truffle-symbol-refactor

294bf27

chrisseaton added the truffle label Jun 9, 2015

chrisseaton added this to the truffle-dev milestone Jun 9, 2015

chrisseaton self-assigned this Jun 9, 2015

chrisseaton added a commit that referenced this pull request Jun 10, 2015

Merge pull request #3029 from jruby/truffle-symbol-refactor

06e1efa

Truffle symbol refactor

chrisseaton merged commit 06e1efa into master Jun 10, 2015

chrisseaton deleted the truffle-symbol-refactor branch June 12, 2015 16:26

enebo added this to the Non-Release milestone Dec 7, 2017
