define_method using symbols string syntax works incorrectly #3880
Comments
Trivial repro:
define_method(:"中文") do
  puts "中文test"
end
中文 |
Another example that shows we're encoding identifiers and symbols differently:
This isn't really a problem with |
Comparative output from MRI:
|
I corrected the title to be more descriptive of what is broken. Note that this works in cases with no quote chars in the symbol:
# coding: utf-8
define_method(:中文) do
  puts "中文test"
end
中文
This is specific to encoding with the newer-ish quoted symbol syntax (2.1?). I also wonder whether the reduced case and the interpolated version originally reported are the same problem; either way, both are in the parser. As such, I think the other linked issues are real problems, but fixing this issue will fix none of them (although #3719 might need the same kind of fix). |
@enebo Very strange! The parser is definitely handling the quoted case differently:
Oddly enough, both cases do have an encoding of UTF-8, so there is simply a bug here: string-like symbols are not being turned into real symbols the same way non-string-like symbols are. |
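A minimal sketch for checking this on a given interpreter, assuming nothing beyond core Ruby. It is not taken from the thread: the anonymous class and the variable names are made up for illustration, and the name is written with \u escapes so the script does not depend on the source file's encoding. On an implementation without the bug, the quoted symbol and the symbol built from a String should be the same object, and the method defined through one should be callable through the other.

```ruby
# Quoted symbol literal versus a symbol built from an equivalent String.
quoted      = :"\u4E2D\u6587"
from_string = "\u4E2D\u6587".to_sym

puts quoted.encoding               # encoding the interpreter assigned to the symbol
puts quoted.equal?(from_string)    # are both routes the same symbol object?

# Define the method through the quoted symbol inside a throwaway class
# (hypothetical, only here so the check does not touch Object).
klass = Class.new do
  define_method(:"\u4E2D\u6587") { "called" }
end

# Look the method up through the String-derived symbol.
obj = klass.new
puts obj.respond_to?(from_string)
puts obj.public_send(from_string) if obj.respond_to?(from_string)
```

If the second or third line prints false, the two creation paths disagree about the symbol, which is the behavior being discussed here.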
Fun:
And @enebo discovered this:
|
So here is where we stand. We store symbols in our symbol table inconsistently. Specifically, we look up the symbol entry based on the hash of a Java String. The Strings we use are inconsistent and are either:
1 is what MRI also does. It is interesting to note that this is mildly broken by design, in that two symbols with the same byte sequence but different encodings will trip over each other. It is what Ruby does, so this is just an interesting aside.
We started partially doing 2 because all methods in JRuby are keyed off properly encoded Java Strings. So if we use a reflective API or try to call a method using something like send, we need to go from a symbol to the properly encoded Java String. By using 2, our symbol table lines up with these method identifiers.
After some discussion we think we need to fully embrace 1 with some changes:
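The mismatch between the two key styles can be modeled in plain Ruby. This is only a sketch under assumptions: a Ruby Hash stands in for the symbol table, Ruby Strings stand in for Java Strings, and the names iso_key, decoded_key, and table are invented for illustration. It takes style 1 to mean the raw bytes viewed as ISO-8859-1 and style 2 to mean the properly decoded string, as the discussion above suggests.

```ruby
# Hypothetical model only; none of these names come from JRuby.
name = "\u4E2D\u6587" # the two-character method name from the repro, as UTF-8

# Style 1 (assumed): the raw bytes viewed as ISO-8859-1, one character per byte.
iso_key = name.b.force_encoding(Encoding::ISO_8859_1)

# Style 2 (assumed): the properly decoded string.
decoded_key = name

puts iso_key.length      # 6 (one character per UTF-8 byte)
puts decoded_key.length  # 2

# Same bytes underneath, but the two strings do not compare equal.
puts iso_key.bytes == decoded_key.bytes  # true
puts iso_key == decoded_key              # false

# Storing under one style and looking up under the other misses.
table = { iso_key => :method_body }
p table[decoded_key]  # nil
p table[iso_key]      # :method_body
```

A miss of this kind is one way a method defined under one key style could later be reported as undefined when looked up under the other.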
I think the main risk will be making all our code uniform. When making anything which can be a symbol or identifier, we need to audit all creation paths and make sure they are 8859_1 and know their encoding. Anything which can possibly print out anything involving symbols or identifiers must end up calling a helper method symbolDisplay(iso) -> encoded String.
After having said all this ... THIS IS WAY TOO RISKY to try and change for 9.1.1.0, since it is a quick regression fix release. So we need to punt. This work also needs to happen at the front end of a new dev cycle so that we can address any fallout from the change.
Addendum: The iso8859_1 string as the common lookup mechanism (it is also intern'd) is almost a complete analogue to MRI's ID. Under the covers it should be as cheap as a number, and we cannot really use this string for anything we display. I guess its only advantage will be that, when debugging, we will more often know what the ID represents at a glance. |
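The lookup-key and display-helper idea can be sketched as follows. This is an illustration in Ruby, not JRuby's implementation: SymbolEntry, make_entry, and symbol_display are hypothetical names, and the sketch assumes each entry remembers the encoding it was created with so that a display string can be rebuilt from the ISO-8859-1 key.

```ruby
# Hypothetical entry: the ISO-8859-1 lookup key plus the original encoding.
SymbolEntry = Struct.new(:iso_key, :encoding)

# Every creation path goes through one place, so every key is uniformly
# the raw bytes tagged as ISO-8859-1.
def make_entry(str)
  SymbolEntry.new(str.b.force_encoding(Encoding::ISO_8859_1), str.encoding)
end

# Ruby analogue of the symbolDisplay(iso) helper mentioned above:
# reinterpret the key's bytes in the remembered encoding before printing.
def symbol_display(entry)
  entry.iso_key.dup.force_encoding(entry.encoding)
end

name  = "\u4E2D\u6587" # the method name from the repro, as UTF-8
entry = make_entry(name)

puts entry.iso_key.length            # 6, the key is byte-per-character
puts symbol_display(entry) == name   # true, display recovers the original
puts symbol_display(entry).encoding  # UTF-8
```

The key is only ever used for lookup; anything shown to a user goes through symbol_display, which matches the split between an ID-like key and a display string described in the addendum.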
What is the "Post-9000" milestone? Will the issue be fixed? |
@catfishuang It is just not targeted for our next point release atm. We need to prioritize when we fix this as it involves some largish changes. |
Still not working in JRuby master (9.1.9.0 in process) despite some symbol fixes. @enebo There's a good chance this might get fixed if we carry the bytelist symbol stuff through all the way. |
This is fixed by @enebo's work on identifiers and symbols for 9.2. |
* JRuby doesn't seem to be able to handle method names containing multi-byte strings correctly.
* see jruby/jruby#1285
* see jruby/jruby#3880
Environment
jruby 9.1.0.0 (2.2.3) 2016-01-26 7bee00d Java HotSpot(TM) 64-Bit Server VM 24.79-b02 on 1.7.0_79-b15 +jit [Windows 7-amd64]
Expected Behavior
Actual Behavior
NameError: undefined local variable or method `??' for main:Object
It seems that the issue has not been resolved. I checked it just now.