Non-ASCII Symbol gives ArgumentError when calling inspect on the symbol #4070
Comments
This appears to be specific to irb itself. If I make a symbol in a file or via -e, then :Renè parses fine.
It's an error in
|
Actually, at some point preciseLength() in utils/StringSupport.java is returning -3, so codePoint() throws the ArgumentError, which bubbles up via isPrintable() in RubySymbol.java. The root cause seems to be in the length() method of the org.jcodings.Encoding subclass, but I don't have time right now to keep digging.
Following up on @phluid61's comment: I see that the bytelist for Renè contains only a single byte for è, and the -3 represents -1 minus the number of missing bytes, i.e. we are missing 2 bytes. So we are definitely storing the wrong byte data for symbols. Thanks for figuring out that this is just inspect doing the wrong thing. In retrospect, it makes sense that this is why irb was unhappy.
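To make the "single byte, 2 bytes missing" observation concrete, here is a minimal sketch in plain Ruby. It does not touch JRuby's internal ByteList or preciseLength(); it assumes (my assumption, not something confirmed in this thread) that the stored symbol bytes look like the ISO-8859-1 form of the name, and shows why such bytes are invalid when labeled as UTF-8.

```ruby
# Illustration only: ordinary Ruby strings, not JRuby internals.
utf8 = "Ren\u00E8"                     # "Renè" as a UTF-8 string
utf8_bytes = utf8.bytes                # è is the two bytes 0xC3 0xA8

# Assumption: the symbol's stored bytes resemble the ISO-8859-1 form,
# where è is the single byte 0xE8.
single_byte = utf8.encode("ISO-8859-1")
stored_bytes = single_byte.bytes

# Relabel those bytes as UTF-8 without transcoding them.
mislabeled = single_byte.dup.force_encoding("UTF-8")

# 0xE8 is 0b11101000, the lead byte of a 3-byte UTF-8 sequence,
# so two continuation bytes are missing after it.
lead = stored_bytes.last
missing = (lead & 0xF0) == 0xE0 ? 2 : 0

puts utf8_bytes.inspect          # [82, 101, 110, 195, 168]
puts stored_bytes.inspect        # [82, 101, 110, 232]
puts mislabeled.valid_encoding?  # false
puts missing                     # 2
```

With 2 bytes missing, the "-1 minus missing" convention described above gives -3, which matches the value reported from preciseLength().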
@enebo @phluid61 Maybe I misunderstood what you are saying, but this definitely happens outside of IRB as well. We have this pop up in our applications. Our workaround is to quote the symbols. Quoted symbols work as expected:
|
@enebo I see the problem, the raw symbol When the symbol is quoted ( Again, I don't have time to dig further into it right now. |
Okay, in In org.jruby.RubySymbol:
In that final line, ByteList.plain is essentially just After that call stack, newSymbol() calls Not sure what the appropriate fix is. |
@phluid61 I think this is running into what we need to do but haven't. We have made some stuff work, but it is inconsistent, and the broken way is the way which allows symbols to work in some cases. See: #3880 (comment)
I should add to that comment by saying properly encoded values are not strictly for display purposes; they are also needed in cases which cross a resource gap, such as to a native extension or to a Java type via our Java Integration.
Too risky before 9.1.3.0, but this along with #3880 should be one of the first things we do for 9.1.4.0 so it can bake.
Fixed, likely by changes for #4564.
Confirmed fixed in my application!
Environment
$ ruby -v
jruby 9.1.3.0-SNAPSHOT (2.3.0) 2016-08-12 93bd82f Java HotSpot(TM) 64-Bit Server VM 25.92-b14 on 1.8.0_92-b14 +jit [darwin-x86_64]
$ uname -a
Darwin macbeth-3.local 14.5.0 Darwin Kernel Version 14.5.0: Thu Jun 16 19:58:21 PDT 2016; root:xnu-2782.50.4~1/RELEASE_X86_64 x86_64
Expected Behavior
$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-darwin14]
$ irb
irb(main):001:0> :Renè
=> :Renè
Actual Behavior
$ ruby -v
jruby 9.1.3.0-SNAPSHOT (2.3.0) 2016-08-12 93bd82f Java HotSpot(TM) 64-Bit Server VM 25.92-b14 on 1.8.0_92-b14 +jit [darwin-x86_64]
$ irb
irb(main):002:0> :René
ArgumentError: invalid byte sequence in UTF-8
    from org/jruby/RubySymbol.java:274:in `inspect'
    from org/jruby/RubySymbol.java:259:in `inspect'
    from org/jruby/RubyKernel.java:1295:in `loop'
    from org/jruby/RubyKernel.java:1114:in `catch'
    from org/jruby/RubyKernel.java:1114:in `catch'
    from /Users/uwe/.rubies/jruby-9.1.3.0-snapshot/bin/irb:13:in