Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String: implement Rabin-Karp algorithm for #byte_index #4609

Merged

Conversation

makenowjust
Copy link
Contributor

@asterite My tomorrow comes after five months! (#3876 (comment)) Sorry.

Benchmark is here: https://gist.github.com/MakeNowJust/363ae118b266e77e576a828acaad59a0

And my result:

$ crystal run --release byte_index_bench.cr
For long string:
     naive 641.16  (  1.56ms) (± 7.60%)  2.06× slower
rabin_karp   1.32k (757.07µs) (± 7.69%)       fastest
For short string:
     naive   3.33M (299.87ns) (±12.33%)  1.78× slower
rabin_karp   5.95M (168.18ns) (±14.56%)       fastest

String#index: #3873
String#rindex: #3876

pointer += 1
end

while true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use loop do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loop do; end is expanded to: i = 0; while true; i += 1; end. I'm not sure variable i is eliminated by LLVM anytime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a single unused integer local variable, i think it'd be very very unlikely for LLVM to keep it around. You should check for yourself though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git grep 'while true' | wc -l yields 149, so I think all while true should be replaced to loop if this was fixed. I would like to keep while true for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think while true is fine but I was just pointing out that in a release build there's no difference in performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I want to keep it for merging this quickly because I believe this fix is good enogh. We can discuss about loop do optimization after it.

@makenowjust
Copy link
Contributor Author

ping. I believe this is ready to be merged.

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@RX14 RX14 merged commit d23db08 into crystal-lang:master Jul 27, 2017
@RX14 RX14 added this to the Next milestone Jul 27, 2017
@makenowjust makenowjust deleted the fix/string/byte-index-rabin-karp branch July 27, 2017 10:09
@makenowjust
Copy link
Contributor Author

Thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants