RegexpError "invalid pattern in look-behind" for certain Regexps since 9.1.16.0 #5086
Comments
MRI has a special case for US-ASCII regexps and strings with a 7-bit coderange: it uses the regexp's encoding for the match, whereas we're creating a new regexp with the actual string encoding. So ultimately this could be a big performance penalty for two reasons:
Fixed in 36b44df |
Thanks a lot @lopex, going to test it tomorrow with our code base! |
I think the root cause also deserves some explanation. Character sequences like ss and ff are special because Unicode case folding treats them as equivalent to single characters; ss, for example, is equivalent to ß (https://en.wikipedia.org/wiki/%C3%9F). So right now there are some caveats regarding MRI's "us-ascii regexps by default":
If either the string or the regexp happens to end up with a Unicode encoding, the look-behind will blow up. |
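A minimal Ruby sketch of the two facts this explanation relies on, assuming Ruby 2.4 or later (for `String#downcase(:fold)`) and a UTF-8 source file. The variable names are illustrative only and do not come from the JRuby or Onigmo sources.

```ruby
# Full Unicode case folding maps some single characters to two-character
# sequences, which is why "ss" and "ff" are special in case-insensitive mode.
sharp_s     = "\u00DF" # LATIN SMALL LETTER SHARP S
ff_ligature = "\uFB00" # LATIN SMALL LIGATURE FF

folded_sharp_s = sharp_s.downcase(:fold)     # "ss"
folded_ff      = ff_ligature.downcase(:fold) # "ff"
puts "U+00DF folds to #{folded_sharp_s.inspect}"
puts "U+FB00 folds to #{folded_ff.inspect}"

# A regexp literal containing only ASCII characters is US-ASCII and does not
# have a fixed encoding, so it can be matched against compatible strings.
ascii_regexp = /css/i
puts "ASCII-only literal: #{ascii_regexp.encoding}, fixed: #{ascii_regexp.fixed_encoding?}"

# A regexp literal containing a \u escape is UTF-8.
unicode_regexp = /\u00DF/i
puts "Literal with \\u escape: #{unicode_regexp.encoding}"

# A string can be UTF-8 encoded and still contain only 7-bit characters.
filename = "style.css"
puts "#{filename.inspect}: #{filename.encoding}, ascii_only: #{filename.ascii_only?}"
```

In the case-insensitive look-behind `(?<!css)`, a Unicode-aware engine has to consider both the three-character spelling and a shorter one ending in ß, which is the kind of length mismatch a look-behind can reject.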
Also, the great majority of regexps will end up being several times faster by default, with cases like e ("_" * 1000) + "" =~ /[a-z]+/i being 35 times faster. |
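A rough way to time a case like this on a given runtime, as a sketch only. The subject string is taken literally as printed in the comment above, `time_matches` is a hypothetical helper, and the iteration count is arbitrary. The 35x figure is the commenter's own measurement and is not reproduced here.

```ruby
# Hypothetical helper: runs the match repeatedly and returns elapsed seconds.
def time_matches(pattern, subject, iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { pattern =~ subject }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

subject    = ("_" * 1000) + "" # 7-bit content only, no letters
pattern    = /[a-z]+/i         # case-insensitive, ASCII-only literal
iterations = 1_000             # arbitrary choice for this sketch

elapsed = time_matches(pattern, subject, iterations)
puts format("%d matches took %.4f s", iterations, elapsed)
```

The pattern never matches this subject, so each iteration scans the whole string. Comparing the elapsed time between a JRuby build with the change and one without would show the difference.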
Thank you for the detailed explanation! Unfortunately I was unable to build JRuby locally, but I can try again when 9.1.17.0 is released :-) |
@naag I just corrected an issue with our nightlies link: http://jruby.org/nightly (click stable). |
Onigmo issue: k-takata/Onigmo#92 |
* jruby-9.1:
  [fix] cast nsec nanos to long to avoid "overflow" with double value
  Handle this deprecation differently.
  Default to Java 9 bytecode for any java.specification.version>1.8.
  WrapperMethod is still needed for visibility.
  Revert "Finally eliminate use of WrapperMethod."
  Eliminate deprecation warnings in test suite.
  Finally eliminate use of WrapperMethod.
  Fix most deprecated calls.
  Handle error when attempting to connect to IP6 with default INET4.
  Add test_coverage to jruby.index.
  Add test for null filename in coverage. jruby#5111
  Do not attempt to add coverage for null filename. Fixes jruby#5111.
  Add basic specs for Exception#backtrace_locations.
  Exception.backtrace_locations should persist and be mutable.
  Return nil if no backtrace has been captured. Fixes jruby#5099.
  fix for jruby#5086, RegexpError invalid pattern in look-behind for certain Regexps since 9.1.16.0
We've been running a regular expression matching certain file extensions since JRuby 9.1.12.0, but after upgrading to 9.1.16.0, it breaks. 9.1.15.0 is the last release where it works.
We've stripped down the regular expression a lot and it still fails with
RegexpError: invalid pattern in look-behind: /^.*(?<!css)$/i
for the most basic case. Example:
Another expression that fails is
/^.*(?<!tiff)$/i
Both work if we change the last letter from s to something else or t to something else. They also work in case-sensitive mode. MRI 2.4.3 works fine with all of these variants.
Is there any other input you need?
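A minimal reproduction sketch for the two patterns quoted in this report. The helper name `check_extension` and the sample filenames are assumptions made for illustration and are not taken from the reporter's code. On MRI, and on a JRuby build containing the fix, the calls should return plain booleans. On an affected JRuby release, the failure would be expected to surface as a RegexpError message instead.

```ruby
# Hypothetical helper: returns true or false for a match, or the error
# message as a String if the regexp engine rejects the pattern.
def check_extension(pattern, filename)
  !pattern.match(filename).nil?
rescue RegexpError => e
  "RegexpError: #{e.message}"
end

# The two patterns from the report, plus one variant with the last letter
# changed, which the report describes as working.
patterns = [
  /^.*(?<!css)$/i,
  /^.*(?<!tiff)$/i,
  /^.*(?<!csx)$/i
]

filenames = ["logo.png", "style.css", "STYLE.CSS", "scan.tiff"]

patterns.each do |pattern|
  filenames.each do |filename|
    puts "#{pattern.inspect} against #{filename.inspect}: #{check_extension(pattern, filename)}"
  end
end
```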
Environment