Hash has bug to set encoding into key string wrongly when the key string is used once with different encoding #3405
Comments
JRuby 1.7.x and MRI work as expected; only JRuby 9.0.1.0 behaves incorrectly. Full output of the script above on JRuby 9.0.1.0, JRuby 1.7.22, and MRI 2.2.2:
Huh, wacky.
Ok, I think this is due to our Hash logic using the new "string".freeze optimization for String keys. The keys are getting deduplicated, but the dedup store improperly ignores encoding. Options:
I will fix this today.
I have chosen the second option for now. The proper way for us to dedup hash keys is to only do it for literal hashes with literal string keys, but that will require a bit more work in IR compilation. The short term fix of only using the cached string if the encoding matches will work for now (and may be necessary anyway).
I'm not sure the best way to spec this out, since it depends on literal strings in hashes having specific encodings; this is not easy to spec in a single file. I've filed ruby/spec#162 to discuss that, and we'll proceed for now without a spec or test to go with my fix.
I've committed a fix for the dedup cache to consider encoding. It's a trivial change, preventing new strings with the same contents but different encodings from updating the cache or picking up a wrong-encoding string. I'll file a separate issue to fix dedup behavior wrt Hash keys, and we'll work on specs separately in ruby/spec#162.
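The idea behind the fix can be modeled in a few lines of plain Ruby. This is only a sketch under stated assumptions: `DedupCache`, its `check_encoding:` flag, and the `dedup` method are hypothetical names invented for illustration, and the code does not reproduce JRuby's actual Java implementation. It keys the cache on raw bytes, so two strings with the same contents but different encodings collide unless the encoding is checked.

```ruby
# Hypothetical model of a string dedup cache; NOT JRuby's real implementation.
class DedupCache
  def initialize(check_encoding:)
    @check_encoding = check_encoding
    @store = {} # raw bytes => first frozen string seen with those bytes
  end

  def dedup(str)
    cached = @store[str.b]
    if cached.nil?
      # First string with these bytes: cache a frozen copy and return it.
      @store[str.b] = str.dup.freeze
    elsif @check_encoding && cached.encoding != str.encoding
      # Same bytes, different encoding: leave the cache untouched and
      # return a string that keeps the caller's encoding.
      str.dup.freeze
    else
      cached
    end
  end
end

utf8   = "hello".encode(Encoding::UTF_8)
binary = "hello".dup.force_encoding(Encoding::ASCII_8BIT)

# Without the encoding check, the second lookup picks up the binary string.
buggy = DedupCache.new(check_encoding: false)
buggy.dedup(binary)
buggy_result = buggy.dedup(utf8)

# With the check, the caller's encoding survives.
fixed = DedupCache.new(check_encoding: true)
fixed.dedup(binary)
fixed_result = fixed.dedup(utf8)

puts buggy_result.encoding == Encoding::ASCII_8BIT
puts fixed_result.encoding == Encoding::UTF_8
```

In this model the mismatching string simply bypasses the cache, which mirrors the description above: it neither updates the cached entry nor picks up the wrong-encoding string.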
Specs for #162. I confirmed it does test behavior from jruby/jruby#3405:

    jruby 9.0.4.0 (2.2.2) 2015-11-12 b9fb7aa Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [darwin-x86_64]
    ..................F

    1)
    Hash literal does not change encoding of literal string keys during creation FAILED
    Expected #<Encoding:ASCII-8BIT>
     to equal #<Encoding:UTF-8>

    /Users/headius/projects/rubyspec/language/hash_spec.rb:145:in `block in (root)'
    org/jruby/RubyBasicObject.java:1633:in `instance_eval'
    org/jruby/RubyEnumerable.java:1552:in `all?'
    org/jruby/RubyFixnum.java:301:in `times'
    org/jruby/RubyArray.java:1560:in `each'
    /Users/headius/projects/rubyspec/language/hash_spec.rb:6:in `<top>'
    org/jruby/RubyKernel.java:957:in `load'
    org/jruby/RubyBasicObject.java:1633:in `instance_eval'
    org/jruby/RubyArray.java:1560:in `each'

    Finished in 0.091000 seconds

    1 file, 19 examples, 44 expectations, 1 failure, 0 errors

    jruby 9.0.5.0-SNAPSHOT (2.3.0) 2015-11-20 a8eab95 Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [darwin-x86_64]
    ...................

    Finished in 0.101000 seconds

    1 file, 19 examples, 46 expectations, 0 failures, 0 errors
On JRuby 9.0.1.0, Hash sets the wrong encoding on a key string, but only when a string with the same contents has already been used as a Hash key with a different encoding.
Script to reproduce the bug:
Expected result is:
Actual result in JRuby 9.0.1.0 is:
The hello hash keys are encoded in ASCII-8BIT.
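The expected behavior can be illustrated with a short check. This is a hedged sketch, not the reporter's original script (which is not reproduced here): the variable names are made up, and because it builds its keys at runtime rather than from string literals in separately encoded files, it may not trigger the bug on an affected JRuby. It only shows what a correct implementation is expected to preserve: each Hash keeps the encoding of the key it was given, even when another Hash already holds a key with the same contents in a different encoding.

```ruby
# Illustrative check only; names are hypothetical and this is not the original script.
binary_key = "hello".dup.force_encoding(Encoding::ASCII_8BIT)
utf8_key   = "hello".encode(Encoding::UTF_8)

# Use the same contents as a key twice, with different encodings.
first_hash  = { binary_key => 1 }
second_hash = { utf8_key => 2 }

first_encoding  = first_hash.keys.first.encoding
second_encoding = second_hash.keys.first.encoding

# Each key is expected to keep the encoding it was created with.
puts first_encoding == Encoding::ASCII_8BIT
puts second_encoding == Encoding::UTF_8
```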