JRuby 1.7.26: String#encoding returns wrong encoding? #4452

Closed
rovf opened this issue Jan 24, 2017 · 7 comments

@rovf

rovf commented Jan 24, 2017

jruby 1.7.26 (1.9.3p551) 2016-08-26 69763b8 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_79-b15 +jit [Windows 7-amd64]

In irb on Windows, I tried

'abc'.encoding

and got as answer

#<Encoding:Windows-1252>

This already came as a surprise, because I thought that strings would be UTF-8 by default. Next, I tried a string containing a single Japanese character:

'う'.encoding

Here too, I got #<Encoding:Windows-1252> as the response, which surely can't be right, because う can't be represented in the Windows-1252 character set.

I think the wrong encoding object is being returned for the string.
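To tell apart "wrong label" from "wrong bytes", it can help to look at the raw bytes next to what String#encoding reports. The following is only a minimal sketch: the `describe` helper is a hypothetical name, and `force_encoding` is used here just to imitate a string whose label does not match its bytes. It does not claim to reproduce what irb does internally.

```ruby
# Hypothetical helper: reports the encoding label, the raw bytes,
# and the character count that Ruby derives from that label.
def describe(str)
  {
    label: str.encoding.name,
    bytes: str.bytes.to_a,
    chars: str.length
  }
end

utf8 = "\u3046" # the character う, written as an escape so the source file encoding does not matter

# Same bytes, different label: force_encoding only changes the tag, not the data.
relabeled = utf8.dup.force_encoding("Windows-1252")

p describe(utf8)      # label UTF-8, three bytes, one character
p describe(relabeled) # label Windows-1252, the same three bytes, three characters
```

If the bytes of 'う' in irb turn out to be 227, 129, 134 while the label says Windows-1252, the data is UTF-8 and only the label is off.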

@headius
Member

headius commented Jan 24, 2017

You would want to compare this to CRuby 1.9.3. I believe the change to make all strings be UTF-8 by default did not arrive until at least Ruby 2.0, and possibly later than that.

If this works the same as CRuby 1.9.3, there's nothing to fix. Hopefully it works properly in JRuby 9.1.7.0, since that is the maintained line. Can you confirm both of those?

@rovf
Author

rovf commented Jan 25, 2017

Actually, the feature that lets each String carry its own encoding was introduced in CRuby 1.9. See, for instance, http://nuclearsquid.com/writings/ruby-1-9-encodings/ and http://ruby-doc.org/core-1.9.3/String.html#method-i-encoding . I don't think the bug is that the wrong encoding is used internally (otherwise, using e.g. Japanese characters in my string wouldn't work correctly), but rather that String#encoding does not return the correct information. It would be helpful to compare this with CRuby 1.9.3, but I don't have one here.

BTW, I tried the same with jruby-9.0.4.0 (also on Windows) and got the same behaviour, but since we, unfortunately, have to work with JRuby 1.7, I'm also interested in getting it fixed for this version.

@enebo
Member

enebo commented Jan 26, 2017

@rovf I think you misunderstood what headius said. He was not saying that m17n was added in Ruby 2.0, but that UTF-8 became the default encoding for strings in Ruby 2.0 (and for some reason I thought it was 2.1). So MRI 1.9.3, which 1.7 is emulating, may end up defaulting to Windows-1252 for ordinary ASCII values. I guess we need someone to find a Windows install of 1.9.3 and verify what it should be.

For the Japanese character, I do find this really odd. I know there is magic on Windows for translating console output to CP-1252, but I find it weird that it is reporting that encoding back.

@rovf I have a second request: can you put that code into a file and run the file instead of doing this via irb? I am just curious whether the behavior is different.
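Something along these lines might do as the file; this is just a sketch, and the file name `encoding_check.rb` and the `encoding_report` helper are made up for illustration. Besides the literal's encoding, it prints the source encoding and the process-wide defaults, which may help show where a Windows-1252 value comes from.

```ruby
# encoding: utf-8
# encoding_check.rb (hypothetical file name)

# Hypothetical helper: collects the literal's encoding together with the
# source encoding of this file and the process-wide default encodings.
def encoding_report(literal)
  internal = Encoding.default_internal
  {
    "literal" => literal.encoding.name,
    "source (__ENCODING__)" => __ENCODING__.name,
    "default_external" => Encoding.default_external.name,
    "default_internal" => internal ? internal.name : "nil"
  }
end

['abc', 'う'].each do |literal|
  puts "--- #{literal.inspect}"
  encoding_report(literal).each { |key, value| puts "#{key}: #{value}" }
end
```

Run it with `jruby encoding_check.rb` and compare the output with what the same expressions return in irb.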

@headius
Member

headius commented Jan 26, 2017

@enebo The UTF-8 default change was in 2.x; I don't remember exactly which release, but I just wanted to point out that it definitely wasn't 1.9 (and therefore we may have correct behavior here).

@rovf Can you try JRuby 9.1.7.0? JRuby 9.0.4.0 is over a year old at this point.

@rovf
Author

rovf commented Jan 27, 2017

@enebo: You were right! When running from a file, both 1.7 and 9.x correctly show UTF-8 for both cases, so it really is only an irb issue!

Should this be filed as a bug against irb? In that case, I would install JRuby 9.1.7.0 as @headius suggested; but if you think the behaviour within irb is OK, we can close the issue.

@enebo
Member

enebo commented Jan 27, 2017

@rovf I don't know whether this is an irb-related issue or an issue with our implementation when writing output to a console window in 1.7.x. However, development on 1.7.x is going to be shut down pretty soon, so I recommend trying things with 9.x and seeing if you still have issues.

@kares kares added this to the Won't Fix milestone Jun 23, 2017
@headius
Member

headius commented Jul 18, 2020

No updates against 9.x so I'm closing this as invalid.

If there are still issues here, please open a new bug.
