-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance regression with Time.parse
from 1.7 to 9k
#3922
Comments
The interpreter in 9k is known to be slower (for now) so this and your other bug are not terribly surprising. However, these give us some good examples to work with. We would like to get get the interpreter running as fast as possible. I am curious about your setup. Is there a reason you need to throw the ScriptingContainer away every time? Using a single container and letting things JIT should bring better performance. |
Here's a JVM sampled profile of execution of your benchmark script, loaded repeatedly in a loop with
There's a very large amount of samples triggering inside our regexp engine Joni, and then a number of interpreter methods after that. This also shows me that we are getting to the fully-processed IR and not staying in the simple, unoptimized IR ( |
Interesting to note: JRuby 1.7 with JIT is about 2x faster on this benchmark than JRuby 9k with JIT...so there may be something other than Ruby execution slowing this down. |
The profile for JIT is also interesting...it's almost looking like the important difference is in Regexp.
|
I gathered another profile using We are definitely hitting Regexp hard but the code in question has not changed significantly between 1.7 and 9k. Here's the Ruby code getting hit for the top item: if str.sub!(
/(
(?:
\d+\s*:\s*\d+
(?:
\s*:\s*\d+(?:[,.]\d*)?
)?
|
\d+\s*h(?:\s*\d+m?(?:\s*\d+s?)?)?
)
(?:
\s*
[ap](?:m\b|\.m\.)
)?
|
\d+\s*[ap](?:m\b|\.m\.)
)
(?:
\s*
(
(?:gmt|utc?)?[-+]\d+(?:[,.:]\d+(?::\d+)?)?
|
[[:alpha:].\s]+(?:standard|daylight)\stime\b
|
[[:alpha:]]+(?:\sdst)?\b
)
)?
/ix,
' ') Multi-line regex always mess with my head, but this does appear to be a pretty complex regular expression. I also did an allocation profile. The top item was |
A quick patch to cache the ByteCodeMachine for a given Regexp did not appear to improve performance any, so at least it doesn't seem like the construction of the machine is part of this cost. |
@lopex For what it's worth, here's the patch I came up with in Joni for caching most-recently-used Matcher: diff --git a/src/org/joni/ByteCodeMachine.java b/src/org/joni/ByteCodeMachine.java
index 5cf643e..dabf439 100644
--- a/src/org/joni/ByteCodeMachine.java
+++ b/src/org/joni/ByteCodeMachine.java
@@ -36,6 +36,8 @@ import org.joni.constants.OPSize;
import org.joni.exception.ErrorMessages;
import org.joni.exception.InternalException;
+import java.util.Arrays;
+
class ByteCodeMachine extends StackMachine {
private static final int INTERRUPT_CHECK_EVERY = 30000;
int interruptCheckCounter = 0; // we modulos this to occasionally check for interrupts
diff --git a/src/org/joni/Matcher.java b/src/org/joni/Matcher.java
index 5b623fe..7aad74a 100644
--- a/src/org/joni/Matcher.java
+++ b/src/org/joni/Matcher.java
@@ -33,9 +33,9 @@ public abstract class Matcher extends IntHolder {
protected final Regex regex;
protected final Encoding enc;
- protected final byte[]bytes;
- protected final int str;
- protected final int end;
+ protected byte[]bytes;
+ protected int str;
+ protected int end;
protected int msaStart;
protected int msaOptions;
@@ -83,6 +83,32 @@ public abstract class Matcher extends IntHolder {
return msaEnd;
}
+ public final Regex getRegex() {
+ return regex;
+ }
+
+ public final byte[] getBytes() {
+ return bytes;
+ }
+
+ public void init(byte[] bytes, int begin, int end) {
+ this.bytes = bytes;
+ this.str = begin;
+ this.end = end;
+ }
+
+ public void clear() {
+ bytes = null;
+ if (msaRegion != null) msaRegion.clear();
+ msaBegin = 0;
+ msaBestLen = 0;
+ msaBestS = 0;
+ msaEnd = 0;
+ msaOptions = 0;
+ this.str = 0;
+ this.end = 0;
+ }
+
protected final void msaInit(int option, int start) {
msaOptions = option;
msaStart = start;
diff --git a/src/org/joni/Regex.java b/src/org/joni/Regex.java
index 3b77a3a..be6886e 100644
--- a/src/org/joni/Regex.java
+++ b/src/org/joni/Regex.java
@@ -25,6 +25,7 @@ import static org.joni.Option.isDontCaptureGroup;
import java.util.IllegalFormatConversionException;
import java.util.Iterator;
+import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;
import org.jcodings.Encoding;
import org.jcodings.specific.ASCIIEncoding;
@@ -60,6 +61,9 @@ public final class Regex {
public WarnCallback warnings;
public MatcherFactory factory;
+ volatile Matcher lastMatcher;
+ static final AtomicReferenceFieldUpdater lastMatcherUpdater = AtomicReferenceFieldUpdater.newUpdater(Regex.class, Matcher.class, "lastMatcher");
+
final Encoding enc;
int options;
int userOptions;
@@ -166,9 +170,19 @@ public final class Regex {
}
public Matcher matcher(byte[]bytes, int p, int end) {
+ Matcher last = lastMatcher;
+ if (last != null && lastMatcherUpdater.compareAndSet(this, last, null)) {
+ last.init(bytes, p, end);
+ return last;
+ }
return factory.create(this, bytes, p, end);
}
+ public void returnMatcher(Matcher matcher) {
+ matcher.clear(matcher);
+ lastMatcher = matcher;
+ }
+
public int numberOfCaptures() {
return numMem;
} |
@headius thanks for looking at these - greatly appreciated.
Unfortunately, there is a reason we need to terminate them. We don't do it on every single execution, but we have a setting that basically causes them to be destroyed and replaced after a certain period of time. The reason is that we allow our users to extend our applications via Ruby, and many of our users don't have a deep background in software development. Therefore, it's not entirely unlikely that the Ruby code we end up loading will contain memory leaks, and throwing away the ScriptingContainer instances allows us to free up the leaked memory rather than eventually OOM'ing the application server. Many of our users come from a world where the Ruby code was run as a Rack application under an Apache/Passenger setup. In that world, Passenger spawns a specific number of MRI Ruby processes to handle requests, and it will kill off and replace those MRI processes periodically, for the same reasons I described above. Because of that, memory leaks in user extension code don't tend to cause major problems, and that's a trait of our application that it's important for us to retain when running under JRuby. |
@headius Is the one-off matcher still a suspect here ? I remember our irc discussion wrt earlier String implementations that were reusing the matcher. |
Link with #4112. |
@lopex I never managed to get my matcher-caching to improve perf more than noise effects. I don't suspect it as a primary bottleneck at this point. |
think we finally caught up in upcoming 9.2.1.0:
|
Environment
JRuby 1.7.20.1, JRuby 9.1.0.0, JRuby 9.1.1.0; run from Java via ScriptingContainer, with various values tested for the setCompileMode method. Standalone script execution with no frameworks.
Expected Behavior
Expected performance for
Time.parse
to be on par with performance for 1.7.x.Actual Behavior
Time.parse
seems to be roughly 60% slower in JRuby 9k than in 1.7.x.I have set up a github repo with a simple reproducer here:
https://github.com/cprice404/jruby9k-benchmarks/tree/19a2719e4edddc17e3f1fabbf311b9d80c6eba80/time-parse-comparison
Here are some relevant numbers from the benchmark output (the rest of it is available in the linked reproducer repro if you are interested):
Also tested with
compileMode
set toJIT
; that doesn't seem to make much difference in 1.7, and makes the numbers look worse in 9k (though the benchmark may not have had a long enough execution time to give a fair characterization of JIT).The text was updated successfully, but these errors were encountered: