-
Notifications
You must be signed in to change notification settings - Fork 605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Array#count instead of falling back to Enumerable#count #3133
Conversation
This avoids iteration with #each if no arguments are given.
(Travis failure looks unrelated) |
Are there specs for this? If not, can you add some. If so, are there tags to delete? |
Never mind, I spoke before I looked. |
Implement Array#count instead of falling back to Enumerable#count
@@ -415,6 +415,18 @@ def concat(other) | |||
concat other | |||
end | |||
|
|||
def count(item = undefined) | |||
seq = 0 | |||
if !undefined.equal?(item) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason for writing this backwards, instead of using if item != undefined
or if !item.equal?(undefined)
(or perhaps just if item
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yorickpeterse
The implementation was directly copied from that of Enumerable#count
for consistency's sake, then modified to return the optimized result in the third case.
Regarding the options you mentioned, I can speculate as to why the original author of Enumerable#count
chose not to use them:
if item != undefined
Havingitem
be the first term here could be problematic, if the object atitem
overrode the!=
operator.if !item.equal?(undefined)
This has the same potential problem as 1.if item
This one is definitely not acceptable, because[true,true,false].count(false)
should return1
, not3
. I imagine this is whyundefined
was used here instead of just testing falsiness or.nil?
.if undefined != item
would probably be okay, though that form is a little more 'relaxed' might be more likely to tempt someone to make the kind of refactorings later mentioned above that could break things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of Enumerable#count
I referred to:
https://github.com/rubinius/rubinius/blob/master/kernel/common/enumerable.rb#L59-L69
Instead of relying on String#count for counting newlines in text nodes, Oga now does this in C/Java. String#count isn't exactly the fastest way of counting characters. Performance was measured using benchmark/xml/lexer/string_average_bench.rb. Before this patch the results were as following: MRI: 0.529s Rbx: 4.965s JRuby: 0.622s After this patch: MRI: 0.424s Rbx: 1.942s JRuby: 0.665s => numbers vary a bit, seem roughly the same as before The commands used for benchmarking: $ rake clean # to make sure that C exts aren't shared between MRI/Rbx $ rake generate $ rake fixtures $ ruby benchmark/xml/lexer/string_average_bench.rb The big difference for Rbx is probably due to the implementation of String#count not being super fast. Some changes were made (rubinius/rubinius#3133) to the method, but this hasn't been released yet. JRuby seems to perform in a similar way, so either it was already optimizing things for me or I suck at writing well performing Java code. This fixes #51.
This avoids iteration with
#each
if no arguments are given to#count
.This is an easy win for performance for large arrays, and MRI and JRuby already have this optimization.
In addition to performance, the implementing Module as given by introspection now matches MRI, meaning behavior will be consistent if some silly user deems to monkey-patch
Enumerable#count
.