
Make Indexable#to_a preallocate the array #6079

Merged
merged 1 commit into crystal-lang:master on May 8, 2018

Conversation

carlhoerberg
Contributor

Benchmark:

require "benchmark"

struct Tuple
  def to_aa
    Array(Union(*T)).new(size) do |i|
      self[i]
    end
  end
end

t = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 }
Benchmark.ips do |x|
  x.report("original") { t.to_a }
  x.report("optimized") { t.to_aa }
end

Output:

original    2.86M (349.71ns) (± 2.36%)  1.72× slower
optimized   4.93M (202.81ns) (± 3.79%)       fastest

src/tuple.cr Outdated
@@ -392,6 +392,18 @@ struct Tuple
pp.list("{", self, "}")
end

# Returns a `Array` with all the tuple's elements
Contributor

Returns an Array (...)

src/tuple.cr Outdated
# tuple = { 1, 2 }
# tuple.to_a # => [1, 2]
# ```
def to_a
Contributor

This can actually be put on Indexable to improve all indexable collections instead of just tuple.

RX14 (Contributor) commented May 8, 2018

Doing this means you can remove Deque#to_a too, and I'd rather you copied Deque#to_a's implementation, using each and preallocating the size, than this code, which does a bounds check on every self[i].

Of course, benchmarking after this change would be required (and maybe also investigate this code with self.unsafe_at, to see if there's better performance).
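A sketch of that unchecked-accessor variant (the name to_a3 is invented here; the accessor was called unsafe_at at the time of this PR and unsafe_fetch in later Crystal versions):

```crystal
struct Tuple
  # Sketch: same preallocation as to_aa above, but skipping the
  # bounds check on each element read. Safe because the block
  # index is always within 0...size.
  def to_a3
    Array(Union(*T)).new(size) do |i|
      unsafe_fetch(i)
    end
  end
end

{1, 2, "a"}.to_a3 # => [1, 2, "a"]
```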

Contributor Author

require "benchmark"

module Enumerable
  def to_aa
    ary = Array(T).new(size)
    each { |e| ary << e }
    ary
  end
end

struct Tuple
  def to_a1
    Array(Union(*T)).new(size) do |i|
      self[i]
    end
  end

  def to_a2
    arr = Array(Union(*T)).new(size)
    {% for i in 0...T.size %}
      arr << self[{{i}}]
    {% end %}
    arr
  end
end

t = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, "a" }
Benchmark.ips do |x|
  x.report("to_a") { t.to_a }
  x.report("to_aa") { t.to_aa }
  x.report("to_a1") { t.to_a1 }
  x.report("to_a2") { t.to_a2 }
end

Output:

 to_a   1.37M (732.25ns) (± 4.60%)  1.51× slower
to_aa   2.04M (489.27ns) (± 9.49%)  1.01× slower
to_a1   1.65M (606.33ns) (± 3.47%)  1.25× slower
to_a2   2.06M ( 485.6ns) (± 8.03%)       fastest

Contributor Author

Conclusion: I'll skip optimizing Tuple#to_a specifically and just preallocate the array to size in Enumerable#to_a.

Member

Please don't use size in Enumerable#to_a, because size traverses the enumerable, so you'll end up traversing it twice.

Let's preallocate in Indexable and Tuple, but leave Enumerable#to_a as is.

Thanks!
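Concretely, the Indexable change suggested here might look like this sketch (SafeIndexable is a made-up stub; unsafe_fetch was named unsafe_at at the time of this PR):

```crystal
class SafeIndexable
  include Indexable(Int32)

  getter size : Int32

  def initialize(@size : Int32)
  end

  def unsafe_fetch(i)
    i
  end

  # Preallocated to_a: size is O(1) for Indexable, so the backing
  # buffer can be reserved up front and filled via each, mirroring
  # Deque#to_a's approach.
  def to_a
    ary = Array(Int32).new(size)
    each { |e| ary << e }
    ary
  end
end

SafeIndexable.new(3).to_a # => [0, 1, 2]
```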

Contributor Author

ok

Contributor Author

but @asterite, don't you think that counting (which only happens if the class doesn't implement its own size method) might still be faster than dynamically growing an array?

Member

No idea.

Member

@carlhoerberg I guess that depends on the implementation of the Enumerable. We should assume it's cost-intensive by default and use a faster algorithm where size is trivial (like in Indexable).
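To illustrate why the default is assumed cost-intensive: Enumerable's fallback size can only count by iterating, so a size-based to_a would traverse the collection twice (Countdown below is a made-up example):

```crystal
# Made-up enumerable with no stored size: the fallback
# Enumerable#size has to run each to completion just to count.
class Countdown
  include Enumerable(Int32)

  def initialize(@from : Int32)
  end

  def each
    @from.downto(1) { |i| yield i }
  end
end

c = Countdown.new(3)
c.size # first full traversal, just to count: 3
c.to_a # second traversal to collect the elements: [3, 2, 1]
```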

src/indexable.cr Outdated
# Returns an `Array` with all the elements in the collection.
#
# ```
# { 1, 2, 3 }.to_a # => [1, 2, 3]
Contributor

Please run the formatter (crystal tool format) on the example code too:

{1, 2, 3}.to_a # => [1, 2, 3]

Thank you.

@carlhoerberg
Contributor Author

Fixed the doc formatting and rebased

@RX14
Contributor

RX14 commented May 8, 2018

I think this can still be spec'd; we do have some Indexable specs and a stub Indexable to test with.

@asterite
Member

asterite commented May 8, 2018

What happened to implementing Tuple#to_a with macro for?

@carlhoerberg
Contributor Author

carlhoerberg commented May 8, 2018 via email

@asterite
Member

asterite commented May 8, 2018

Bah, nevermind. Let's not worry about that for now.

@RX14
Contributor

RX14 commented May 8, 2018

@asterite Tuple#each uses macro for already, and since yield is inlined it generates exactly the same code.

@asterite
Member

asterite commented May 8, 2018

@RX14 Ah, right, I forgot about that.

# ```
def to_a
  ary = Array(T).new(size)
  each { |e| ary << e }
Contributor

The optimizer should turn this into a single memcpy for indexables that store their elements contiguously, right?

Member

I don't think so. The optimizer is pretty dumb.

Contributor

If this ends up as a bottleneck in any specific class, there's plenty of scope for overriding it to use memcpy in classes where that's possible.
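Such an override could look like this sketch (ContiguousList and its fields are invented for illustration; the bulk copy is only valid because the elements sit in one contiguous buffer):

```crystal
class ContiguousList(T)
  include Indexable(T)

  getter size : Int32

  def initialize(@size : Int32)
    @buffer = Pointer(T).malloc(@size)
  end

  def unsafe_fetch(i)
    @buffer[i]
  end

  def []=(i : Int32, value : T)
    @buffer[i] = value
  end

  # Override the element-by-element to_a with one bulk copy:
  # Array.build yields the destination buffer and takes the final
  # element count as the block's return value.
  def to_a
    Array(T).build(@size) do |dst|
      dst.copy_from(@buffer, @size)
      @size
    end
  end
end
```

A quick usage example: fill a ContiguousList(Int32) of size 3 with 0, 10, 20 via []= and to_a returns [0, 10, 20] in one copy.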

@RX14 RX14 added this to the Next milestone May 8, 2018
@RX14 RX14 merged commit 5d3a458 into crystal-lang:master May 8, 2018
@luislavena
Contributor

@RX14 can you change the title of this issue to match the commits of the PR? That way someone linking back will be able to spot the connection.

Thank you.

@RX14 RX14 changed the title Optimize Tuple#to_a by preallocate Array Make Indexable#to_a preallocate the array May 8, 2018
chris-huxtable pushed a commit to chris-huxtable/crystal that referenced this pull request Jun 6, 2018