Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a macro to collect outputs of a yielding method to an array #4813

Closed
wants to merge 3 commits into from

Conversation

oprypin
Copy link
Member

@oprypin oprypin commented Aug 9, 2017

Just having some fun reducing code duplication.

Obviously this can be improved and documented, I'm just curious about initial feedback.

@Papierkorb
Copy link
Contributor

I like the idea, but I don't think this should be a global macro, also as #collect is a somewhat common name, which could end up confusing users. Maybe Array would be a good place, considering it "returns" an array?

Array.collect each_byte looks clearer to me than a mere collect each_byte

@oprypin
Copy link
Member Author

oprypin commented Aug 9, 2017

Yeah I had a thought like that but assumed it wouldn't work (that macros can be only used unqualified or something)

@jhass
Copy link
Member

jhass commented Aug 9, 2017

Ideally this would be something like Iterator#to_a I think, that is we would have an iterator for everything and can just call to_a on it. Meanwhile I'm not sure I'm a big fan of the collect name, maybe just Array.from or Array.of or so.

@oprypin
Copy link
Member Author

oprypin commented Aug 9, 2017

Well sure, having an iterator for every yielding method would be amazing, but that topic has been neglected, and manually implementing those iterators is no fun.

Disagree on your naming ideas.

src/macros.cr Outdated

class Array(T)
macro collect(expr)
arr = [] of typeof(begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no macro variables for the array? We don't want to leak the arr name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right...
The rest of this is such an eye sore that I didn't realize.

src/macros.cr Outdated
@@ -83,6 +83,25 @@ macro record(name, *properties)
end
end

private record TypeSentinel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private record does not work, better use struct directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right

Sucks that it doesn't work and doesn't error

src/macros.cr Show resolved Hide resolved
@konovod
Copy link
Contributor

konovod commented Aug 9, 2017

Method is great, and Array.collect looks like a right name for it, but it will surely confuse people coming from Ruby (where collect is a synonym to map).

@straight-shoota
Copy link
Member

straight-shoota commented Aug 9, 2017

I'd suggest to add an option to define the generic type of the array directly instead of inferring it as a union of the yielded types:

macro collect(type, expr)
    %arr = [] of {{type}}
    {{expr}} do |x|
      %arr << x
    end
    %arr
  end

This makes it more clear what array type is returned and allows fine grained control over it. Sometimes you don't want an array of unioned subtypes (or "plus type") but rather explicitly an array of a parent type.
I'd even argue that this might be preferred as default variant (or even replace the current one altogether). Inferring is nice but stating the type directly is more expressive. This ensures the array type is independent of the yielded block.

@oprypin
Copy link
Member Author

oprypin commented Aug 9, 2017

My previous implementation of this idea had the name "gather" but that's not what it means. It also was more opinionated and wrapped the function instead, so less flexibility, especially for docs. But I can't think of a way to specify the type explicitly like I did there, in a way that doesn't look goofy.

@oprypin
Copy link
Member Author

oprypin commented Aug 9, 2017

Wonder if it would be possible to get the type from Array(T).collect(expr)

EDIT: Nope

@Papierkorb
Copy link
Contributor

Papierkorb commented Aug 9, 2017

@konovod I think the name collect is perfectly fine and descriptive. And even in Ruby, one never writes Array.collect, so it shouldn't be confusing either.

@straight-shoota No idea honestly how common it would be that people would want to use a different type. Still, that explicit-type version macro is much nicer to read, and seeing what kind of Array will be returned is also nice. So 👍 from me.

@ysbaddaden
Copy link
Contributor

I'm not sure I can like this. Creating an empty array, iterating and pushing values to it is boring and repetitive, but it's descriptive as to what's happening. This solution may look interesting but it's complex to understand and confusing.

I prefer alternative methods that return an interator that implements #to_a, just like Array does:

[1, 2, 3].cycle(2).to_a
[1, 2, 3, 4, 5].each_cons(3).to_a

@oprypin
Copy link
Member Author

oprypin commented Aug 9, 2017

The dream solution is automatically implementing Iterators based on yielding methods.
Without that, to make this a reality, one needs to write an Iterator (a ton of boilerplate) for every yielding method.

@straight-shoota
Copy link
Member

Maybe it'd be easier to understand if you'd have to create the array manually but have a shortcut for adding yielded elements: Array(Int32).new.collect { each } where #collect would call the method in the block, adding all yielded values to self, and return self.
But I don't know if this could be implemented somehow.

@Fryguy
Copy link
Contributor

Fryguy commented Aug 13, 2017

I agree with @ysbaddaden. We already have Iterator#to_a , not sure why we need another style...unless this is providing something different?

Personally, I'm not a fan of top-level (or nearly top-level) functions over methods on objects... reminds me of Python's built-in functions, which I don't really like (len(ary) as opposed to ary.len)

@@ -1752,8 +1752,7 @@ describe "String" do
end

it "works with strings with block" do
res = [] of String
"bla bla ablf".scan("bl") { |s| res << s }
res = Array.collect "bla bla ablf".scan("bl")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular test's purpose is to test scan with a block, so I don't think this one should change (yeah, it's using a block via the collect macro, but that obscures the test itself)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case it is better to be more expressive to understand the purpose.

@RX14
Copy link
Contributor

RX14 commented Aug 13, 2017

The reason is that not every method with a block has a corresponding iterator. And even if they do, not all those iterators are the same speed as their block-based counterparts.

@straight-shoota
Copy link
Member

Collecting with blocks should always be faster than Iterator#to_a because there is no need to initialize an Iterator instance.

@Fryguy
Copy link
Contributor

Fryguy commented Aug 14, 2017

The reason is that not every method with a block has a corresponding iterator.

Ah ok, I think what threw me is that nearly every change in this PR is on methods that already return an iterator (cycle, glob, each_slice, etc). The only one in these changes that doesn't return an iterator is the downcase method.

@akzhan
Copy link
Contributor

akzhan commented Aug 14, 2017

it's failed on Travis CI and has merge conflicts.

@bew
Copy link
Contributor

bew commented Sep 26, 2017

Quoting @oprypin:

Wonder if it would be possible to get the type from Array(T).collect(expr)

EDIT: Nope

Maybe related to #5023

@asterite
Copy link
Member

Even though I like this, and I'm glad it's possible to do, I'm not sure this should go into the std. I'd like to have general iterators related to yield but it's super hard. Maybe collecting with x = [] of T; x << elem is not that bad for now.

@asterite
Copy link
Member

(closing because of the above comment)

@asterite asterite closed this Sep 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet