[Truffle] Decode the path parameter to open #3202
This allows the path to contain characters not represented by ISO-8859-1, which is the default encoding used for ByteLists.
How did you find this problem? Do file paths have an encoding? Aren't they just bytes? Your solution isn't ideal - we don't want to call We can fix that easily, but I'm still not sure what the problem is here or why this solves it. Can you elaborate?
I found this problem while debugging another failure. The failure was in chdir_spec, and it was failing because of the way the runtime of the AOT compilation system handled the conversion of a C string to a Java string. In DirSpecs.create_mock_dirs it creates a file with this name. To the operating system, filenames are just bytes, but if you have a Ruby string representing that filename, you need to make sure to always send the same bytes to the operating system. The issue is that, as the code is, if you create a file using I didn't think the solution I have is ideal, but I wanted to make you aware of the problem, since it was only by chance that I stumbled upon it.
Maybe ByteList.toString() is to blame here since it has the encoding?
JRuby's RubyString.toString() ends in Helpers.decodeByteList() which does the right thing.
Ok, I still don't quite get it. But can you add an I thought JNR would be doing the right thing with a
@chrisseaton I think it is as simple as
I fixed it in ddcadaf. |
This allows the path to contain characters not represented by
ISO-8859-1, which is the default encoding used for ByteLists.
Here is a test case to expose this problem:
This results in an error: "No such file or directory". Delete (unlink) does decode the path, but always assumes UTF-8. Calling toString uses the encoding specified by the backing ByteList. It seems that some functions always decode as UTF-8, while others call toString.
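To make the mismatch concrete, here is a minimal standalone sketch in plain Java. It is not the test case from this PR and does not use JRuby's ByteList or JNR; the class name `PathDecodingSketch`, the helper `roundTrip`, and the sample path are hypothetical, and the assumption that the native layer re-encodes the Java string as UTF-8 is mine. It only shows that decoding UTF-8 path bytes as ISO-8859-1 and encoding them again produces different bytes, which would plausibly name a file that does not exist.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PathDecodingSketch {
    // Decode raw path bytes with the given charset, then re-encode the
    // resulting Java String as UTF-8. The UTF-8 re-encoding step stands in
    // for whatever the native call layer does (an assumption, not JNR's API).
    static byte[] roundTrip(byte[] pathBytes, Charset decodeWith) {
        String decoded = new String(pathBytes, decodeWith);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical path containing a character outside ISO-8859-1 (U+3042).
        byte[] original = "dir/\u3042.txt".getBytes(StandardCharsets.UTF_8);

        byte[] viaUtf8 = roundTrip(original, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTrip(original, StandardCharsets.ISO_8859_1);

        // Decoding with the matching charset preserves the bytes.
        System.out.println("UTF-8 round trip preserves bytes: "
                + Arrays.equals(original, viaUtf8));
        // Decoding as ISO-8859-1 turns each byte into its own character,
        // so the re-encoded path is longer and no longer matches.
        System.out.println("ISO-8859-1 round trip preserves bytes: "
                + Arrays.equals(original, viaLatin1));
        System.out.println("lengths: " + original.length + " vs " + viaLatin1.length);
    }
}
```

If this sketch reflects what happens here, a file created with one decoding path and opened with the other would be looked up under different bytes, which matches the "No such file or directory" symptom described above.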