Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add String#each_split #3754

Closed
wants to merge 5 commits into from
Closed

Conversation

bmulvihill
Copy link
Contributor

Base on some of the conversation in #3702

This currently implements an each_split(Char) method which returns an iterator for a split. I implemented most of the tests that are currently used to test split(Char) . I would like to get some feedback before implementing the other Iterators.

Thanks!

@asterite
Copy link
Member

@bmulvihill This is excellent, thank you!

I was thinking about this and we'll probably need different iterator implementation for each kind of argument that you can pass: no arguments (split by whitespace), char (the one you already implemented), string and regex. What do you think?

@bmulvihill bmulvihill force-pushed the each-split branch 4 times, most recently from f0851d2 to c00fe79 Compare December 29, 2016 13:54
@bmulvihill
Copy link
Contributor Author

@asterite I added an iterator for each type of split, and also one for an Empty separator split, which is shared logic for the String and Regex splits when no separator is provided.

Thanks!

end

it "splits with mixed spaces" do
"foo ".split(' ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this line for ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does nothing :) I'll remove it

@bmulvihill bmulvihill changed the title Add String#each_split(Char) Add String#each_split Jan 4, 2017
@bmulvihill
Copy link
Contributor Author

@asterite Any feedback on this? I tried to keep the iterator implementations similar to their split counterparts. Would you like me to split this out into different PRs for easier review?

Thanks!

@asterite
Copy link
Member

asterite commented Jan 6, 2017

@bmulvihill Thank you for this!

I did have a chance to review it, but now I'm not 100% convinced this is a good idea. Now that there's code for this, I took the time to benchmark it, both in terms of speed and memory. It only starts to make a difference for really big strings.

Since the whole idea of iterators are to reduce memory allocation, one wouldn't have a huge string in memory in the first place. For example, I can imagine one would iterate a File line by line, or something like that. And then, lines aren't usually very long, and the difference between split and each_split is not big, or even noticeable, because one still needs to allocate memory for the iterator itself.

In short, I'm not sure it's worth adding so much code for this when the current split is probably enough.

Note that I couldn't have come to this conclusion without having code to check this. I'm not convinced this PR should be rejected, I should probably consult this with others and see what they think. For example Rust has iterators for split.

@bmulvihill bmulvihill closed this Jan 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants