Fix: String#gsub replacing ascii chars to work in non-ascii string #5350

straight-shoota · 2017-12-05T15:19:31Z

straight-shoota · 2017-12-05T15:21:29Z

~~This is a fix for the immediate issue.~~
It would probably be better if gsub_ascii_char worked with non-ASCII strings; it is still just replacing one ASCII char with another.

UPDATED: I've implemented the better solution.

straight-shoota · 2017-12-05T15:54:22Z

This would do as a proper solution in gsub_ascii_char, but I'm not fully confident about that pointer copying:

if my_char == char
  buffer[i] = replacement.ord.to_u8
else
  (buffer + i).copy_from(to_unsafe + i, my_char.bytesize)
end

larubujo · 2017-12-05T16:54:25Z

solution is to use each_byte_with_index, not go by char when char is ascii. faster and better. no need for string to be full ascii.

straight-shoota · 2017-12-05T18:22:17Z

There is no each_byte_with_index, but to_slice.each_with_index will do.
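
The byte-wise approach discussed above relies on a property of UTF-8: every byte of a multi-byte sequence has its high bit set, so an ASCII byte can never appear inside the encoding of a non-ASCII character. Below is a minimal sketch of that idea, not the actual stdlib code. The helper name `replace_ascii_byte` and its argument check are assumptions made for illustration only.

```crystal
# Hypothetical standalone helper (not String#gsub_ascii_char itself).
# Replaces every occurrence of the ASCII char `char` with the ASCII
# char `replacement`, walking the string byte by byte.
def replace_ascii_byte(source : String, char : Char, replacement : Char) : String
  # Assumption for this sketch: both characters must be ASCII,
  # otherwise a single-byte comparison would be meaningless.
  unless char.ascii? && replacement.ascii?
    raise ArgumentError.new("both chars must be ASCII")
  end

  String.new(source.bytesize) do |buffer|
    source.to_slice.each_with_index do |byte, i|
      # Bytes of multi-byte characters are all >= 0x80, so they never
      # match an ASCII code point and are copied through unchanged.
      buffer[i] = char.ord == byte ? replacement.ord.to_u8 : byte
    end
    # One byte is swapped for one byte, so neither the byte size
    # nor the character count changes.
    {source.bytesize, source.size}
  end
end

puts replace_ascii_byte("ä_b_ç", '_', '-') # prints "ä-b-ç"
```

Because the replacement is one byte for one byte, the source string does not need to be ASCII-only for this to be safe.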

straight-shoota · 2017-12-13T11:33:48Z

Can I get a review? It should be a good solution now, not just a workaround as initially announced.

RX14 · 2017-12-13T14:35:11Z

src/string.cr

-      each_char_with_index do |my_char, i|
-        buffer[i] = if my_char == char
+      to_slice.each_with_index do |byte, i|
+        buffer[i] = if char === byte


Please don't use ===; use char.ord == byte instead.

Also, I'd much prefer

if char.ord == byte
  buffer[i] = replacement.ord.to_u8
else
  buffer[i] = byte
end

to avoid the ugly indent.

Sija · 2017-12-13

@RX14 I vote to change the formatter rule for this indent to a uniform 2 spaces.

RX14 · 2017-12-13

That's another issue, and I disagree with you.

Sija · 2017-12-13

Out of curiosity, how is it that you don't want to change the rule, while at the same time you advise against using it? Is there something I don't get?

RX14 · 2017-12-13

Because changing the rule would mean it looks like

buffer[i] = if char.ord == byte
  replacement.ord.to_u8
else
  byte
end

which I think is even uglier than indenting it all.

My preference goes:

1. what I suggested (assign in the if)
2. what's implemented in this PR (indent the if)
3. what you suggested (don't indent the if)

Sija · 2017-12-13

Got it! For my taste it'd be 3, 1, 2, in that order.

straight-shoota · 2017-12-13

is what it currently looks like. Until now this PR doesn't change anything about that.

RX14 · 2017-12-13

Well, it won't get changed if it's not in this PR.

straight-shoota · 2017-12-13

@RX14 now it is =)

chastell · 2018-01-01T15:20:56Z

make spec passes for me on this PR, so the TCPSocket failures on Travis are probably unrelated; could someone with access restart the Travis jobs?

RX14 · 2018-01-01T16:06:52Z

This PR needs to be rebased onto master to pick up the latest travis.yml changes to pass CI.

Fixes crystal-lang#5348

straight-shoota · 2018-01-01T16:48:51Z

Rebased.

RX14 · 2018-01-01T18:36:28Z

@bcardiff this should also probably be cherry-picked into 0.24.2 but i've milestoned it Next until that happens.


straight-shoota changed the title ~~Fix: String#gsub should use shortcut only for ascii-only string~~ Fix: String#gsub replacing ascii chars to work in non-ascii string Dec 5, 2017

RX14 requested changes Dec 13, 2017


RX14 approved these changes Dec 14, 2017


asterite mentioned this pull request Jan 1, 2018

Fix String#gsub and #tr ASCII-only optimisations #5498

Closed

straight-shoota added 3 commits January 1, 2018 17:41

Fix: String#gsub should use shortcut only for ascii-only string

6b18ab0

Fixes crystal-lang#5348

Fix String#gsub_ascii_char for non-ascii characters in string

4657b3a

Replace === operator and large indent

89bf14c

straight-shoota force-pushed the jm-issue-5348 branch from 02ab638 to 89bf14c Compare January 1, 2018 16:48

asterite approved these changes Jan 1, 2018


RX14 added kind:bug topic:stdlib labels Jan 1, 2018

RX14 modified the milestones: 0.24.2, Next Jan 1, 2018

RX14 merged commit d7273c7 into crystal-lang:master Jan 1, 2018

straight-shoota deleted the jm-issue-5348 branch January 1, 2018 18:37

lukeasrodgers pushed a commit to lukeasrodgers/crystal that referenced this pull request Jan 7, 2018

Fix: String#gsub replacing ascii chars to work in non-ascii string (c…

0b3473c

…rystal-lang#5350) Fixes crystal-lang#5348

bcardiff mentioned this pull request Jun 15, 2018

Crystal bug while compiling #6183

Closed
