CRITICAL! Regular expression hang with string containing umlaut characters #2998
Comments
Do you see this using the same regexp from Ruby code? I could not reproduce in a Ruby session on 9k, but I'm not sure if the expression is compiling right.
It was working with JRuby 1.6.x, and started to fail with 1.7.16 (at least that's what we moved to). So, that appears to be a regression in JRuby, and because of this bug, our tooling was unusable.
Your case works correctly using the JRuby 9k jar for me...
...but I confirmed it fails with JRuby 1.7. I believe the problem is 1.7 using an older version of our regexp library JOni. I'll update it to latest and confirm that fixes the issue for 1.7.21.
Hmm...updating JOni did not fix the problem. It must be doing something wrong with encodings for this test case.
Uh, thanks for checking. Would you suggest any other workaround to get it working? I can't move back to JRuby 1.6, as we already have dependencies on JRuby 1.7.x. I'm fine with updating any underlying libraries/JOni to get this up and running.
I'm still looking into the problem. It's probably something we need to fix in 1.7.21.
Ok, I think the problem here may be the way you're using our internal APIs. RubyRegexp#match_m is a 1.8-mode method, not generally intended to be used with encodings from 1.9+ mode. However, you're creating a Ruby instance using newInstance, which defaults to 1.9 mode and encoding-related logic is activated. I suspect it's a combination of the two causing this, but I have not found the root issue. If I modify your code to use What tool is this? Why are you going at the internal JRuby APIs directly? Perhaps we can find a better solution for you.
It is used by an Eclipse plugin to format/refactor code as the user types it in the IDE. Users can contribute auto-indentation formats through Ruby bundles; the IDE loads those bundles, generates the regular expression from the user's specification, and then applies the formatting rules if the user's input in the IDE matches. Here's the code I was talking about:
Ahh, I see, this is code from Aptana's Ruby tooling. I expect that code has not been updated in some time. I do have another workaround, if there's no way to avoid using these RubyRegexp API calls:
RubyInstanceConfig config = new RubyInstanceConfig();
config.setCompatVersion(CompatVersion.RUBY1_8);
Ruby runtime = Ruby.newInstance(config);
This creates the new runtime explicitly in Ruby 1.8.7 mode, avoiding the behavioral conflicts with Ruby 1.9.3 mode's encoding support. Your example runs to completion with this change, while still using match_m. I must caution, however, that these are very much internal APIs not intended to be used outside of JRuby. They're public only because we need to use them across packages. And in JRuby 9k, match_m and match_m19 are equivalent and only use the encoding-aware logic, which may cause problems for you in the future. A solution that uses our regex engine (https://github.com/jruby/joni) directly, or one that uses our embedding APIs to run this logic as Ruby code, would be more robust than this.
FWIW, this probably started failing when you moved to 1.7 because 1.7 defaults to 1.9.3 mode (and so Aptana's use of non-1.9-mode APIs started to blow up).
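For readers wondering why umlauts in particular might trip a code path that is not encoding-aware: the sketch below is plain Java, not JRuby or JOni code, and it only illustrates that a string such as "über" has a different length in characters than in UTF-8 bytes. The class and method names are made up for illustration. The root cause was not identified in this thread, so treat this as background on the byte/character mismatch rather than an explanation of the hang itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of JRuby, JOni, or the Aptana tooling.
class UmlautEncodingDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes once the same text is encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "uber";
        String umlaut = "\u00fcber"; // "über", written as an escape to avoid source-encoding issues

        // ASCII text: character count and byte count agree.
        System.out.println(charCount(ascii) + " chars, " + utf8ByteCount(ascii) + " bytes");
        // Text with an umlaut: U+00FC takes two bytes in UTF-8, so the counts differ.
        System.out.println(charCount(umlaut) + " chars, " + utf8ByteCount(umlaut) + " bytes");
    }
}
```

With ASCII-only input the two counts match, which is consistent with the report that only strings containing umlauts were affected. Logic that mixes byte-oriented and encoding-aware positions can disagree as soon as a multi-byte character appears.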
@headius Thanks for your suggestions. We will definitely incorporate the 1.9 mode encoding since we moved to JRuby 1.7. However, I'm considering moving over to the regex engine (joni) eventually. Thanks again!
Closing as a third-party issue.
If the string contains umlaut characters, JRuby hangs while trying to match the regular expression pattern. The code snippet below always hangs with JRuby 1.7.20.