Expected Behavior
When a class or instance variable name has multibyte characters, Marshal needs to dump that name as though it were an encoded symbol. Where we currently dump a "plain" bytes version of the identifier, we need to dump it with the "encoding instance variable" logic used for proper symbols. That way we can reconstitute the original name and match it up properly on the load side.
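For reference, here is a small sketch of what the "encoding instance variable" logic looks like for ordinary symbols. It assumes CRuby's Marshal format 4.8, where a non-ASCII UTF-8 symbol is wrapped in an `I` (instance variables) record carrying a single `:E => true` entry, while an ASCII-only symbol is dumped as plain bytes. The symbol names are only illustrative.

```ruby
# Compare how an ASCII-only symbol and a multibyte symbol are dumped.
ascii_sym     = :Foo
multibyte_sym = :"X\u23F0\u23F3"

ascii_dump     = Marshal.dump(ascii_sym)
multibyte_dump = Marshal.dump(multibyte_sym)

# ASCII-only: version header, ':' type byte, length, raw bytes.
p ascii_dump.bytes

# Multibyte: version header, 'I' wrapper, ':' symbol, then an ivar
# section whose only entry is :E => true (meaning UTF-8).
p multibyte_dump.bytes

# Because the encoding travels with the name, the load side can
# rebuild the same symbol.
restored = Marshal.load(multibyte_dump)
p restored == multibyte_sym
p restored.encoding
```

The class and instance variable names in the dump would presumably need the same wrapper so that the load side can restore their encoding before looking them up.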
Actual Behavior
In actuality, we are stripping the multibyte nature of those names, and they come out mangled on the other side.
Given this script (based on MRI TestMarshal#test_unloadable_userdef):

```ruby
class Foo
  def bar
    c = eval "
      class X\u{23F0 23F3} < Time
        class << self
          undef _load
        end
        self
      end"
    o = c.new
    p Marshal.load(Marshal.dump(o))
  end
end
Foo.new.bar
```
We currently get the following marshaled output (note the mangled name of the class):
Circling back to this because we're fixing up symbol encoding stuff.
If I run this as-is today on 9.1.9.0 (master) I get this:
```
$ jruby blah.rb
ArgumentError: undefined class/module Foo::X??
    load at org/jruby/RubyMarshal.java:145
     bar at blah.rb:11
  <main> at blah.rb:14
```
The multibyte chars are at least rendering as question marks now, but the error remains.
If I run this without the eval (and use Xø as the name of the class), the error changes to a marshaling error:
```
$ jruby blah.rb
TypeError: class Foo::Xø needs to have method `_load'
         load at org/jruby/RubyMarshal.java:145
  <class:Foo> at blah.rb:11
       <main> at blah.rb:1
```
...which matches MRI. So it seems that the problem relates to the eval of the multibyte characters.
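A rough sketch of the non-eval variant follows. The exact script used for the run above was not posted, so the layout and the `round_trip_error` helper here are assumptions; the helper only captures the exception so that it can be inspected instead of aborting the script.

```ruby
class Foo
  # Multibyte class name written directly in the source, no eval.
  class Xø < Time
    class << self
      undef _load
    end
  end
end

# Hypothetical helper: returns the exception raised by the round trip,
# or nil if the round trip unexpectedly succeeds.
def round_trip_error
  Marshal.load(Marshal.dump(Foo::Xø.new))
  nil
rescue TypeError, ArgumentError => e
  e
end

err = round_trip_error
puts "#{err.class}: #{err.message}"
```

If the name survives the round trip, the load side finds the class and complains about the missing `_load` method (a TypeError). An ArgumentError about an undefined class/module would instead indicate that the name was mangled.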
This does not appear to be fixed in 9.2, but it may simply be a lack of symbol/identifier logic fixes for Marshal. @enebo maybe you can take a look at this?
Environment
JRuby 9.0.5.0 (and likely 9.1.0.0)
I attempted to fix this by actually creating symbols for these elements, but there's something missing on the loading side. Here's my patch: https://gist.github.com/headius/99779f55add2d007ca42
The marshaled output looks better:
But I still get an error from the full script above:
We are not reconstituting the string properly, it seems.
Since this is not a regression or new behavior, I'm going to exclude the related CRuby 2.3 tests for now.