-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Java invoker cache logic to allow encapsulation, sync. #3395
Conversation
The move to IntHashMap in RubyToJavaInvoker improved throughput for multiple-arity calls, but since IntHashMap is not synchronized this could cause corrupted data under concurrent load as seen in CallableSelector to encapsulate the cache and synchronize access to it. I believe this is a minimum effort to make the cache access safe and fix #3394. I have not measured the impact of this synchronization in a high concurrency setting. It may indeed turn out to be more overhead and a worse bottleneck than using ConcurrentHashMap with Integer objects. Unfortunately there's no middle ground here unless we build or adopt some concurrency-friendly int-based map. Another heavy option would be to maintain two references to IntHashMap in RubyToJavaInvoker, copying from a synchronized writable copy to a read-only copy after every write. This would double the size of the cache structures in memory but would provide a fast, unsynchronized cache for read-mostly signature lookups. This pattern is also easier to support with the refactoring in this commit.
FWIW here's the diff that would implement a copy-on-write strategy here to avoid synchronization on reads. I suspect most of these caches are reasonably small, but I have done no measurement of either the performance impact of synchronization nor the memory impact (and cold cache startup impact) of copy-on-write. diff --git a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
index 222636f..29ac1a4 100644
--- a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
+++ b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
@@ -35,7 +35,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
// in case multiple callables (overloaded Java method - same name different args)
// for the invoker exists CallableSelector caches resolution based on args here
- final IntHashMap<T> cache;
+ final IntHashMap<T> writeCache;
+ volatile IntHashMap<T> readCache;
private final Ruby runtime;
@@ -54,7 +55,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
minVarArgsArity = getMemberArity(member) - 1;
}
- cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ readCache = writeCache;
this.javaCallable = callable;
this.javaCallables = null;
@@ -84,7 +86,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
callables = null;
- cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ readCache = writeCache;
}
else {
callable = null; maxArity = -1; minArity = Integer.MAX_VALUE;
@@ -139,7 +142,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
varargsCallables = varArgs.toArray( createCallableArray(varArgs.size()) );
}
- cache = newCallableCache();
+ writeCache = newCallableCache();
+ readCache = (IntHashMap<T>)writeCache.clone();
}
this.javaCallable = callable;
@@ -192,14 +196,13 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
public T getSignature(int signatureCode) {
- synchronized (cache) {
- return cache.get(signatureCode);
- }
+ return readCache.get(signatureCode);
}
public void putSignature(int signatureCode, T callable) {
- synchronized (cache) {
- cache.put(signatureCode, callable);
+ synchronized (writeCache) {
+ writeCache.put(signatureCode, callable);
+ readCache = (IntHashMap<T>)writeCache.clone();
}
}
diff --git a/core/src/main/java/org/jruby/util/collections/IntHashMap.java b/core/src/main/java/org/jruby/util/collections/IntHashMap.java
index fb9a16c..1a09297 100644
--- a/core/src/main/java/org/jruby/util/collections/IntHashMap.java
+++ b/core/src/main/java/org/jruby/util/collections/IntHashMap.java
@@ -397,6 +397,19 @@ public class IntHashMap<V> {
}
}
+ @Override
+ public Object clone() {
+ IntHashMap<V> newMap = new IntHashMap<>(table.length, loadFactor);
+ for (int i = 0; i < table.length; i++) {
+ Entry<V> entry = table[i];
+ while (entry != null) {
+ newMap.put(entry.getKey(), entry.getValue());
+ entry = entry.next;
+ }
+ }
+ return newMap;
+ }
+
@SuppressWarnings("unchecked")
public static <U> IntHashMap<U> nullMap() { return NullMap.INSTANCE; }
|
👍 looks good. unless the user constantly comes up with a different parameters signature (very unlikely) the write cache won't be used, so it should be good. |
For the record, I tried using a read/write lock instead of copy-on-write, and the results indicate that COW is still probably a better option. Based on COW of IntHashMap:
Single IntHashMap with read/write lock:
So the read/write lock is slightly slower in the uncontended case but considerably slower in the contended case. I think we'll go with the COW version for now. The patch, for future reference: diff --git a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
index 29ac1a4..b80af7a 100644
--- a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
+++ b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
@@ -6,6 +6,9 @@ import java.lang.reflect.Member;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReadWriteLock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.jruby.Ruby;
import org.jruby.RubyModule;
@@ -35,8 +38,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
// in case multiple callables (overloaded Java method - same name different args)
// for the invoker exists CallableSelector caches resolution based on args here
- final IntHashMap<T> writeCache;
- volatile IntHashMap<T> readCache;
+ final IntHashMap<T> cache;
+ final ReadWriteLock lock = new ReentrantReadWriteLock();
private final Ruby runtime;
@@ -55,8 +58,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
minVarArgsArity = getMemberArity(member) - 1;
}
- writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
- readCache = writeCache;
+ cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
this.javaCallable = callable;
this.javaCallables = null;
@@ -86,8 +88,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
callables = null;
- writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
- readCache = writeCache;
+ cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
}
else {
callable = null; maxArity = -1; minArity = Integer.MAX_VALUE;
@@ -142,8 +143,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
varargsCallables = varArgs.toArray( createCallableArray(varArgs.size()) );
}
- writeCache = newCallableCache();
- readCache = (IntHashMap<T>)writeCache.clone();
+ cache = newCallableCache();
}
this.javaCallable = callable;
@@ -196,13 +196,20 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
public T getSignature(int signatureCode) {
- return readCache.get(signatureCode);
+ lock.readLock().lock();
+ try {
+ return cache.get(signatureCode);
+ } finally {
+ lock.readLock().unlock();
+ }
}
public void putSignature(int signatureCode, T callable) {
- synchronized (writeCache) {
- writeCache.put(signatureCode, callable);
- readCache = (IntHashMap<T>)writeCache.clone();
+ lock.writeLock().lock();
+ try {
+ cache.put(signatureCode, callable);
+ } finally {
+ lock.writeLock().unlock();
}
}
|
Also FYI, the numbers for COW IntHashMap versus unsynchronized IntHashMap are roughly identical, as expected. I did not do a comparison with ConcurrentHashMap. Another note: CHM is big. Having two copies of the comparatively tiny IntHashMap probably isn't anywhere near as large in memory. |
Refactor Java invoker cache logic to allow encapsulation, sync.
very nice - thank you for attempting and sharing this, |
The move to IntHashMap in RubyToJavaInvoker improved throughput
for multiple-arity calls, but since IntHashMap is not synchronized
this could cause corrupted data under concurrent load as seen in
CallableSelector to encapsulate the cache and synchronize access
to it. I believe this is a minimum effort to make the cache access
safe and fix #3394.
I have not measured the impact of this synchronization in a high
concurrency setting. It may indeed turn out to be more overhead
and a worse bottleneck than using ConcurrentHashMap with Integer
objects. Unfortunately there's no middle ground here unless we
build or adopt some concurrency-friendly int-based map.
Another heavy option would be to maintain two references to
IntHashMap in RubyToJavaInvoker, copying from a synchronized
writable copy to a read-only copy after every write. This would
double the size of the cache structures in memory but would provide
a fast, unsynchronized cache for read-mostly signature lookups.
This pattern is also easier to support with the refactoring in
this commit.