Not all sequences of bytes are valid UTF-8. A UTF-8 decoder should be prepared for:
A 4-byte sequence (starting with 0xF4) that decodes to a value greater than U+10FFFF
In Ruby 1.9.3,
>> 0x110000.chr(Encoding::UTF_8)
RangeError: invalid codepoint 0x110000 in UTF-8
from (irb):3:in `chr'
from (irb):3
from /Users/yous/.rvm/rubies/ruby-1.9.3-p547/bin/irb:12:in `<main>'
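For reference, the same check can be scripted instead of typed into irb. This is only a sketch: `encodable_utf8?` is a hypothetical helper name, not part of Ruby, and the results described in the comments are what I would expect from MRI. Other implementations may behave differently, which is the point of this report.

```ruby
# Hypothetical helper: true when Integer#chr accepts the code point as UTF-8.
def encodable_utf8?(codepoint)
  codepoint.chr(Encoding::UTF_8)
  true
rescue RangeError
  false
end

# Bytes F4 90 80 80 would decode to U+110000, one past the Unicode maximum.
too_large = [0xF4, 0x90, 0x80, 0x80].pack("C*").force_encoding(Encoding::UTF_8)

max_ok          = encodable_utf8?(0x10FFFF) # highest legal code point
over_max_ok     = encodable_utf8?(0x110000) # expected to be rejected on MRI
too_large_valid = too_large.valid_encoding? # expected to be false on MRI

puts "U+10FFFF encodable: #{max_ok}"
puts "U+110000 encodable: #{over_max_ok}"
puts "F4 90 80 80 valid:  #{too_large_valid}"
```

Running the same script under both interpreters should make any difference easy to spot.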
According to the UTF-8 definition (RFC 3629) the high and low surrogate halves used by UTF-16 (U+D800 through U+DFFF) are not legal Unicode values, and their UTF-8 encoding should be treated as an invalid byte sequence.
In Ruby 1.9.3,
>> 0xD800.chr(Encoding::UTF_8)
RangeError: invalid codepoint 0xD800 in UTF-8
from (irb):1:in `chr'
from (irb):1
from /Users/yous/.rvm/rubies/ruby-1.9.3-p547/bin/irb:12:in `<main>'
>> 0xDFFF.chr(Encoding::UTF_8)
RangeError: invalid codepoint 0xDFFF in UTF-8
from (irb):2:in `chr'
from (irb):2
from /Users/yous/.rvm/rubies/ruby-1.9.3-p547/bin/irb:12:in `<main>'
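The surrogate case can be checked over the whole range rather than only at its two ends. Again a sketch under the assumption that MRI is the reference behavior; the variable names are illustrative only.

```ruby
# Bytes ED A0 80 are what a naive encoder would emit for U+D800.
surrogate_bytes = [0xED, 0xA0, 0x80].pack("C*").force_encoding(Encoding::UTF_8)
surrogate_valid = surrogate_bytes.valid_encoding?

# Count how many code points in U+D800..U+DFFF Integer#chr rejects.
rejected = (0xD800..0xDFFF).count do |cp|
  begin
    cp.chr(Encoding::UTF_8)
    false
  rescue RangeError
    true
  end
end

# Neighbours just outside the surrogate range should encode normally.
below = 0xD7FF.chr(Encoding::UTF_8)
above = 0xE000.chr(Encoding::UTF_8)

puts "ED A0 80 valid: #{surrogate_valid}"
puts "rejected surrogates: #{rejected} of #{0xDFFF - 0xD800 + 1}"
puts "U+D7FF bytes: #{below.bytes.map { |b| format('%02X', b) }.join(' ')}"
puts "U+E000 bytes: #{above.bytes.map { |b| format('%02X', b) }.join(' ')}"
```

On an implementation that follows RFC 3629, every code point in the range should be rejected and the byte sequence should be reported as invalid.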
I'm using JRuby 1.7.13.
http://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences
In Ruby 1.9.3,
In JRuby,
http://en.wikipedia.org/wiki/UTF-8#Invalid_code_points
In Ruby 1.9.3,
In JRuby,