Implement Array#count instead of falling back to Enumerable#count #3133

jemc · 2014-09-18T03:15:37Z

This avoids iteration with #each if no arguments are given to #count.

This is an easy win for performance for large arrays, and MRI and JRuby already have this optimization.

In addition to performance, the implementing module reported by introspection now matches MRI, so behavior will be consistent if some silly user decides to monkey-patch Enumerable#count.

Array.instance_method(:count).owner #=> Array (not Enumerable)
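A minimal sketch of how both claims could be checked on an implementation that has the optimization. NoEachArray is a hypothetical class introduced only for this illustration; it is not part of the patch or its specs. Its #each raises, so a #count that still iterated through #each would fail instead of returning a size.

```ruby
# Hypothetical subclass, for illustration only: #each raises, so any
# #count that falls back to each-based iteration would blow up here.
class NoEachArray < Array
  def each
    raise "each was called"
  end
end

# Which module implements #count for Array?
owner = Array.instance_method(:count).owner

# With no argument and no block, an optimized Array#count can answer
# from the stored size without touching #each.
list = NoEachArray.new([1, 2, 3])
size_based = list.count

puts owner
puts size_based
```

On an implementation without a dedicated Array#count, the owner would be reported as Enumerable and the call to list.count would raise.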

jemc · 2014-09-18T03:39:14Z

(Travis failure looks unrelated)

jc00ke · 2014-09-18T06:46:05Z

Are there specs for this? If not, can you add some? If so, are there tags to delete?

jc00ke · 2014-09-18T07:00:20Z

Never mind, I spoke before I looked.

yorickpeterse · 2014-09-18T09:46:10Z

kernel/common/array.rb

@@ -415,6 +415,18 @@ def concat(other)
    concat other
  end

+  def count(item = undefined)
+    seq = 0
+    if !undefined.equal?(item)


Is there a specific reason for writing this backwards, instead of using if item != undefined or if !item.equal?(undefined) (or perhaps just if item)?

@yorickpeterse
The implementation was directly copied from that of Enumerable#count for consistency's sake, then modified to return the optimized result in the third case.

Regarding the options you mentioned, I can speculate as to why the original author of Enumerable#count chose not to use them:

1. if item != undefined: Having item be the first term here could be problematic if the object at item overrode the != operator.

2. if !item.equal?(undefined): This has the same potential problem as 1.

3. if item: This one is definitely not acceptable, because [true,true,false].count(false) should return 1, not 3. I imagine this is why undefined was used here instead of just testing falsiness or .nil?.

if undefined != item would probably be okay, though that form is a little more 'relaxed' and might be more likely to tempt someone into the kind of refactorings mentioned above that could break things.

The implementation of Enumerable#count I referred to:
https://github.com/rubinius/rubinius/blob/master/kernel/common/enumerable.rb#L59-L69
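The reasoning above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the Rubinius source: UNDEFINED is a hypothetical stand-in for Rubinius' internal undefined object, and sketch_count and Liar are names invented for the example.

```ruby
# UNDEFINED is a hypothetical stand-in for Rubinius' internal `undefined`.
UNDEFINED = Object.new.freeze

# Hypothetical helper mirroring the three cases discussed above.
def sketch_count(list, item = UNDEFINED)
  if !UNDEFINED.equal?(item)
    # An item was given: count the elements equal to it.
    seq = 0
    list.each { |o| seq += 1 if item == o }
    seq
  elsif block_given?
    # A block was given: count the elements it accepts.
    seq = 0
    list.each { |o| seq += 1 if yield(o) }
    seq
  else
    # Neither was given: answer from the size, without iterating.
    list.size
  end
end

# An object whose != always answers false. A check written as
# `item != UNDEFINED` would treat it as "no argument given".
class Liar
  def !=(other)
    false
  end
end

flags = [true, true, false]
liar = Liar.new

puts sketch_count(flags)            # no argument: size-based answer
puts sketch_count(flags, false)     # a falsy argument is still an argument
puts sketch_count(flags) { |o| o }  # block form
puts sketch_count([liar, 1], liar)  # sentinel as receiver ignores Liar#!=
```

Keeping the sentinel as the receiver means the check never dispatches to a method defined on the caller's object, which is the property options 1 and 2 give up.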

Instead of relying on String#count for counting newlines in text nodes, Oga now does this in C/Java. String#count isn't exactly the fastest way of counting characters.

Performance was measured using benchmark/xml/lexer/string_average_bench.rb. Before this patch the results were as following:

    MRI:   0.529s
    Rbx:   4.965s
    JRuby: 0.622s

After this patch:

    MRI:   0.424s
    Rbx:   1.942s
    JRuby: 0.665s => numbers vary a bit, seem roughly the same as before

The commands used for benchmarking:

    $ rake clean # to make sure that C exts aren't shared between MRI/Rbx
    $ rake generate
    $ rake fixtures
    $ ruby benchmark/xml/lexer/string_average_bench.rb

The big difference for Rbx is probably due to the implementation of String#count not being super fast. Some changes were made (rubinius/rubinius#3133) to the method, but this hasn't been released yet.

JRuby seems to perform in a similar way, so either it was already optimizing things for me or I suck at writing well performing Java code.

This fixes #51.

Implement Array#count instead of falling back to Enumerable#count

afae918

This avoids iteration with #each if no arguments are given.

jc00ke added a commit that referenced this pull request Sep 18, 2014

Merge pull request #3133 from jemc/array_count

47f4647

Implement Array#count instead of falling back to Enumerable#count

jc00ke merged commit 47f4647 into rubinius:master Sep 18, 2014

yorickpeterse reviewed Sep 18, 2014
jemc deleted the array_count branch September 18, 2014 15:20
