Nokogiri breaks jRuby 9000 java.nio.charset.UnsupportedCharsetException: ASCII-8BIT #2601
Comments
Interesting. This is likely a bug in Nokogiri, but perhaps exposed by fixes in 9k. The logic in encodeStringToHtmlEntity blindly tries to load a charset named "ASCII-8BIT", but there is no such charset on the JVM. The proper charset to use for full 8-bit data would be ISO-8859-1 (or an equivalent, but that's the typical one). The problem is probably exposed by us handing Nokogiri a string whose Ruby encoding is ASCII-8BIT. That could be a bug on our side, or it could be a corrected issue. @tenderlove @flavorjones @yokolet I'm not sure who to ping about this. If absolutely necessary, I can look into fixing it myself, but we really need to get Nokogiri updated for JRuby 9k.
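To illustrate the mapping suggested above, here is a minimal sketch. It is not Nokogiri's actual code: the class `CharsetFallback` and its `resolve` method are hypothetical names, and treating "BINARY" as an alias of "ASCII-8BIT" is an assumption based on Ruby's own alias for that encoding. It only shows that the JVM rejects the name "ASCII-8BIT" and how a caller might substitute ISO-8859-1 before asking for a charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Nokogiri or JRuby.
class CharsetFallback {

    // Maps a Ruby encoding name to a JVM Charset.
    // Ruby's binary encoding has no JVM counterpart, so it is mapped to
    // ISO-8859-1, which preserves every byte value 0x00-0xFF one-to-one.
    static Charset resolve(String rubyEncodingName) {
        if ("ASCII-8BIT".equalsIgnoreCase(rubyEncodingName)
                || "BINARY".equalsIgnoreCase(rubyEncodingName)) {
            return StandardCharsets.ISO_8859_1;
        }
        // Any other name goes straight to the JVM lookup, which still
        // throws UnsupportedCharsetException for names it does not know.
        return Charset.forName(rubyEncodingName);
    }

    public static void main(String[] args) {
        // The JVM does not know this name, which is what triggers the exception
        // reported in this issue when it is passed to Charset.forName directly.
        System.out.println("JVM supports ASCII-8BIT? " + Charset.isSupported("ASCII-8BIT"));
        System.out.println("Resolved charset: " + resolve("ASCII-8BIT"));
    }
}
```

Special-casing only the binary encoding keeps the behavior for every other name unchanged, so a genuinely unknown encoding would still fail loudly rather than being silently reinterpreted.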
@headius - we recently had another issue filed concerning ASCII-8BIT encoding on JRuby. Honestly, it's a bit of a mystery where this encoding is coming from at the moment, but I'll try to look into it this week.
@flavorjones Were you ever able to look into this? I don't think there's a JRuby bug here, so I'm going to close this, but we can help fix it in Nokogiri with your assistance.
@lephyrius You should probably re-file this at Nokogiri, so it doesn't get lost if we or @flavorjones don't get to a fix right away.
The funny thing is that I haven't seen this since rc1.
Hmmm. My only thought on @lephyrius's last comment is that we have changed some plain-named methods (e.g. foo rather than foo19) to redirect to the 1.9+ versions. Possibly Nokogiri is still calling what was the 1.8 method (the plain-named version), but that call is now routed through an encoding path? Although if that were the case, 1.7.x in --1.9 mode would be broken and 9k would be passing.
@headius I looked, but was unable to reproduce. If someone can provide a reproducible script, that would help diagnose the issue.
I get this backtrace when I use loofah:
java.nio.charset.UnsupportedCharsetException: ASCII-8BIT
I use this version of JRuby:
jruby 9.0.0.0-SNAPSHOT (2.2.0p0) 2015-02-16 05aad64 Java HotSpot(TM) 64-Bit Server VM 25.0-b70 on 1.8.0-b132 +jit [darwin-x86_64]