Jruby 9.1.8.0: unknown encoding name: UTF8 (argument error) #4546
Comments
@LYNx785 as a workaround for now if you can change the env to UTF-8 things should start working for you. Looks like Ruby itself has no concept of 'UTF8' as a valid encoding name and we are passing that in from the Java side via that environment variable. |
The environment variable gets picked up by the jvm properly, but has no bearing on the error that jruby is throwing it seems. Here is JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Here I've deleted the environment variable completely:
Any ideas? |
@LYNx785 not now :) This is confusing. I guess we should figure out where this is leaking in to figure out how best to fix it. Worst case we can break from C Ruby and add UTF8 as a valid name for UTF-8 encodings. I don't actually see any harm in that but I am more curious right now about how it enters into our side of things. Can you run with env JRUBY_OPTS="-d -Xbacktrace.style=raw" I am hoping we can see more than this not being able to load a file. |
ok so I know this seems to be from some file which has something like: # encoding: UTF8
ruby source So somewhere there is a file you are loading which has this encoding in it. If you know how to compile java you could add the name of the file in RubyLexer.java:setEncoding which prints out getFile(). I should probably add UTF8 to our list of aliases for UTF-8. I would still like to know what that source file is. If it was generated by cucumber with that env set then that might be the reason why changing the env had no effect. |
I'll try to find the source file this morning, though it's definitely unknown to me. |
well, it turns out a comment I had nestled atop a file where I make java based service calls contained the offending 'UTF8'.
Changing it immediately fixed the issue! :) That said, is pulling an assumed encoding from a comment expected behavior? |
AHA! C Ruby (and JRuby) just look for 'coding' followed by = or :. The exact expression is slightly more complicated, but nonetheless MRI also errors out:
The first comment line of a Ruby file is always scanned for an encoding and for pragmas like frozen-string-literal. Had your env comment been on the second line, you would not have run into this issue. |
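To make the scan described above concrete, here is a minimal sketch. The pattern `LAX_CODING` and the helper `scanned_encoding_name` are hypothetical names invented for illustration, and the regexp is a deliberate simplification, not the exact expression MRI or JRuby use. The comment text is the one quoted later in this thread.

```ruby
# Simplified stand-in for the first-line scan: find "coding" followed by
# '=' or ':' anywhere in the comment and take the token after it.
# This is NOT the real MRI/JRuby expression, only an approximation.
LAX_CODING = /coding[=:]\s*([\w.-]+)/

# Hypothetical helper: returns the encoding name a lax scan would pick up
# from a first-line comment, or nil if the line is not a comment or has
# no coding-like token.
def scanned_encoding_name(first_line)
  return nil unless first_line.start_with?("#")
  match = LAX_CODING.match(first_line)
  match && match[1]
end

comment = "# on windows add the environment variable JAVA_TOOL_OPTIONS -Dfile.encoding=UTF8"
name = scanned_encoding_name(comment)
puts "scanned name: #{name.inspect}"

# Looking the name up shows whether the running Ruby knows it. The result
# depends on the implementation and version, so both outcomes are handled.
begin
  puts "resolved to: #{Encoding.find(name)}"
rescue ArgumentError => e
  puts "#{e.class}: #{e.message}"
end
```

The point is only that "file.encoding=UTF8" happens to contain "coding=", so an ordinary explanatory comment on line one can be read as an encoding declaration.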
Sounds like this is a "won't fix" or "invalid" then unless MRI agrees to fix it? |
Interesting. This isn't my first rodeo with (j)ruby, but I've never encountered this until now (nor did I realize comments on the first line are scanned in such a way). Thanks for the help, folks! |
@headius we could just add non-standard alias UTF8 in jcodings for this since Java io package uses this name as a default (nio seemingly fixed it to be UTF-8). We have never seen this but I doubt it would do any harm? I am a little ambivalent... |
@enebo It would do harm if you're working on code primarily using JRuby and encounter this error on MRI subsequently while testing. |
Yeah it is possible but m17n has existed like a decade and this string only happened once for us as an issue. I think the likelihood of it causing a compat issue is very small. Otoh, it is not standard so I am ok not adding it too. Maybe a better solution would be to open an issue with mri to consider not being as lax on the format of coding? |
Yeah but for the given example, what should make it fail to parse the format of the coding? # on windows add the environment variable JAVA_TOOL_OPTIONS -Dfile.encoding=UTF8 Should the trailing "en" be allowed but the presence of other trailing characters other than whitespace be disallowed? Do you mean something along the lines of this regex
@preetpalS we essentially have a decomposed version of that regexp as simple string parsing code in our lexer. 'encoding' and 'coding' are both specified as valid and have been since the beginning, so I don't think those by themselves are what I would consider changing. My take is that a pragma line (coding is not the only valid pragma now: frozen-string-literal, warn-indent...) should only contain pragmas or at least only begin with pragmas. The notion that I will be doing something like: # this is a nice script frozen-string-literal = true seems like an unusual thing to do and I even question why someone would want to. So I would probably force use of pragmas to only be leading text after the comment symbol (#). Text afterwards I would probably allow since someone might want to explain why they added it. An optional rule could be that |
I think that this probably makes more sense as something that's caught using a linting tool like Rubocop (I could be wrong though) rather than enforced by the language. I say this for the following reasons:
;;; some-file.el --- Some commentary that explains things -*- lexical-binding: t; -*-
All these points could be argued either way. That said I still think that it would be better if that comment on the first line that caused this issue would not be parsed as a pragma. This particular issue probably could be closed though. |
@preetpalS yeah your emacs example may be the reason why extra text is allowed on either side. I am not invested enough to push for any changes in any case. :) |
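For comparison, the stricter rule floated earlier in the thread (a pragma only counts when it is the leading text after `#`, with trailing explanation allowed) could look roughly like the sketch below. `STRICT_CODING` and `strict_encoding_name` are hypothetical names, and allowing an optional leading `-*-` is an added assumption so that the common Emacs-style form still matches. Neither MRI nor JRuby is known to behave this way.

```ruby
# Hypothetical stricter rule: after '#', optional whitespace and an optional
# Emacs-style "-*-", the very next token must be "coding" or "encoding".
# Text after the encoding name is ignored, so an explanation may follow.
STRICT_CODING = /\A#\s*(?:-\*-\s*)?(?:en)?coding\s*[:=]\s*([\w.-]+)/

# Returns the declared encoding name, or nil if the line does not start
# with a coding pragma under this stricter rule.
def strict_encoding_name(first_line)
  match = STRICT_CODING.match(first_line)
  match && match[1]
end

[
  "# encoding: UTF-8 because the service replies in UTF-8",
  "# -*- coding: utf-8 -*-",
  "# on windows add the environment variable JAVA_TOOL_OPTIONS -Dfile.encoding=UTF8"
].each do |line|
  puts "#{strict_encoding_name(line).inspect} <= #{line}"
end
```

The trade-off raised with the Emacs example still applies: a rule like this rejects commentary placed before the pragma, which some editors' conventions permit.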
Hi, we are stumbling on a related issue while running Asciidoctorj.
I am the user @abelsromero talked about. To clarify:
I changed JVMs via JAVA_HOME. |
@enikao @abelsromero on my system if I print out file.encoding by default it is Cp1252.
@enikao can you try that on J9 and see what it prints and possibly also set file.encoding to that explicitly if that is not what it is? I am increasingly feeling like adding an unofficial alias for UTF8 since this issue seems like it will never really go away (as in other environments will leak this name into Ruby somehow). |
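One way to collect the values asked about here is a small script run under the JVM in question. This is a sketch only: `encoding_report` is a hypothetical helper name, and the Java property is read only when the script runs on JRuby (where `RUBY_PLATFORM` is "java"), so the same file also runs on MRI for comparison.

```ruby
# Hypothetical helper: gathers the encoding-related settings discussed in
# this thread into a hash so they can be printed or compared across JVMs.
def encoding_report
  report = {
    "Encoding.default_external" => Encoding.default_external.name,
    "JAVA_TOOL_OPTIONS"         => ENV["JAVA_TOOL_OPTIONS"]
  }
  if RUBY_PLATFORM == "java"
    # Only reachable on JRuby: ask the JVM what it thinks file.encoding is.
    require "java"
    report["file.encoding"] = java.lang.System.getProperty("file.encoding")
  end
  report
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Running it once per JVM (for example after switching JAVA_HOME) and once with `-Dfile.encoding` set explicitly would show whether the JVM default is where the UTF8 name comes from.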
Ok so JRuby 9.2.8.0 has already added the UTF8 alias so our next release should not be bothered when it sees that specific encoding. I still would like to figure out j9 here. It is possible that alias will not actually solve it there when no explicit user-set file.encoding is set. I did peruse some encoding logic in JRuby and we have special logic for Ruby's default external encoding which is basically:
So I am assuming J9 must be set to something or it would just work. Seemingly this works with hotspot so Cp1252 is not a problem either (would be my guess at least). So what is J9 setting file.encoding to? |
I run jRuby |
@abelsromero ah great! So I am guessing J9 may actually pass in UTF8 somehow for file.encoding when you do not specify it explicitly. |
Don't sing victory so soon. As I tried to explain, I am not experiencing the same behavior as @enikao. Given the fix is in JRuby it should work, but let's wait 🤞 |
@abelsromero @lopex and I are discussing the fact we have not actually added the alias yet so that is interesting :P In May we did add some Windows-specific logic for filesystem encoding which, if unset or not valid, just uses default external, which would be UTF-8. We are still planning on pushing a new jcodings (where the new alias gets added) so an explicit UTF8 declaration should work anywhere encodings can be specified. |
Just to make sure I pick the right one: Which JRuby should I use for the test? (I currently don't have any JRuby installed, asciidoctor-maven-plugin brings its own). |
The version is not out, we need to wait until it is officially released. But you can build from source at least to confirm the issue is fixed for you too 😄 Just clone jruby repo, install the complete jar in your local maven repo with
<dependency>
  <groupId>org.jruby</groupId>
  <artifactId>jruby-complete</artifactId>
  <version>9.2.8.0-SNAPSHOT</version>
</dependency> |
Result of asking JRuby directly:
asciidoctor-maven-plugin also works on J9 when using |
jruby/jruby#4546 causes `unknown encoding error: UTF8` during the `downloadAndInstallJruby` gradle task when run on windows build nodes. This became an issue when elastic/infra@d855603 was merged onto the build workers
Environment
jruby 9.1.8.0 (2.3.1) 2017-03-06 90fc7ab Java HotSpot(TM) 64-Bit Server VM 25.121-b13 on 1.8.0_121-b13 +jit [mswin32-x86_64]
OS: Windows 7 (x64)
Installed gems:
Environment variables:
Expected Behavior
Actual Behavior
Note