Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicated iteration issue of Enumerable#in_groups_of #3128

Merged
merged 8 commits into from
Sep 1, 2016
Merged

Fix duplicated iteration issue of Enumerable#in_groups_of #3128

merged 8 commits into from
Sep 1, 2016

Conversation

kachick
Copy link
Contributor

@kachick kachick commented Aug 9, 2016

No description provided.

ary = Array(T | U).new(size, filled_up_with)
end
each_slice(size) do |slice|
group = slice.size < size ? slice + [filled_up_with] : slice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhh, this needs to fill it up with size - slice.size elements, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, does it mean assert { [1, 2, 3, 4].in_groups_of(3).should eq([[1, 2, 3], [4, nil, nil]]) }?
It sounds so reasonable to me, but current behavior does not work for it. Would I fix the behavior in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, current code works for it ... 🙇

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Rails/ActiveSupport's version does so anyway:

irb(main):001:0> [1, 2, 3, 4].in_groups_of(3)
=> [[1, 2, 3], [4, nil, nil]]

yield group
else
yield slice
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just

(size - slice.size).times { slice << filled_up_with }
yield slice

Or perhaps

slice.concat Array(T|U).new(size - slice.size, filled_up_with)
yield slice

? It would be interesting to see what's more expensive (slower), potentially repeated resizings of pushing n-m times or allocating a whole new array. Actually does Array#fill resize? I can't remember.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! The code looks simple to me.
But I think the slice is declared to Array(T) at

slice = Array(T).new(count)
.
Can I append the nil or given value to the slice at here?

I have tried some code as below they show errors.

group = slice.as(Array(T | U)) # => can't cast Array(Int32) to Array(Int32 | Nil)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right, I forgot about that. How about

unless slice.size == size
  group = Array(T|U).new(size, filled_up_with)
  slice.each_with_index {|item, index| group[index] = item }
  slice = group
end
yield slice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds perfect to me!
I copied the logic in 1c1334f.
Thank you!

(Style adjusted to standardize naming for the returned variable 🙇 )

@benoist
Copy link
Contributor

benoist commented Aug 10, 2016

Can you provide an example when it fails with the current crystal version?
I tried running your specs, without the change and it seems to work fine.

https://play.crystal-lang.org/#/r/16rs

@asterite
Copy link
Member

@benoist It's not about failure, it's about the Enumerable being traversed twice. It can probably be tested with a custom enumerable.

@benoist
Copy link
Contributor

benoist commented Aug 10, 2016

Ah that makes more sense :)

@kachick
Copy link
Contributor Author

kachick commented Aug 11, 2016

not about failure

Yes, I found the TODO comment in the source 👀


📝

bin/crystal eval '1000.times { (0...100000).in_groups_of(3, "padding") }'
  • Before(31562ca): 6.59s user 0.79s system 118% cpu 6.214 total
  • After(d2a3cbd): 3.15s user 0.60s system 125% cpu 2.982 total

@asterite
Copy link
Member

Thank you!

But the implementation is not correct, and I think once we improve generics + covariance and contravariance this will give an error.

The problem is that in_groups_of can't use each_silce. each_slice returns Array(T), and then we are mixing an Array(T | U) at some point, so a union Array(T) | Array(T | U) is created. It somehow compiles, when we assign any of those to Array(Array(T | U)), but it shouldn't.

I think the proper way to implement this is to make a third private method that both each_slice and in_groups_of use, in which you pass the type of array you want to create.

@asterite
Copy link
Member

Also, benchmarking via crystal eval is not very good, one should always pass --release (but crystal eval doesn't allow this, so you'll have to write a file)

@kachick
Copy link
Contributor Author

kachick commented Aug 18, 2016

😨

Thank you for sharing with me!

I have updated some code, is it correct for your pointed out? Then I could choose #3128 (comment) way.

And benchmarked as below. Is it enough for the fairness?

echo '1000.times { (0...400000).in_groups_of(3, "padding") }' > bench.cr
git checkout 31562ca54be262f60fee941ec859f86a699ea461
time bin/crystal run --release bench.cr
#=> bin/crystal run --release bench.cr  8.71s user 0.99s system 129% cpu 7.480 total
git checkout 29667d4ad394ec66fa0b88bdb9bc427a81675ed0
time bin/crystal run --release bench.cr
#=> bin/crystal run --release bench.cr  6.73s user 0.81s system 111% cpu 6.774 total

@asterite
Copy link
Member

asterite commented Sep 1, 2016

@kachick Looks good, thank you!

@asterite asterite merged commit 77dc00a into crystal-lang:master Sep 1, 2016
@kachick
Copy link
Contributor Author

kachick commented Sep 1, 2016

Thank you! 😂

@kachick kachick deleted the fix-todo-of-enumerable-in_groups_of branch September 1, 2016 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants