-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid hash value for nested arrays #3884
Comments
That is very strange. I will point out that just using a hash of the AST has potential to collide. In the past I have used SHA1 or similar to reduce that risk, but they're obviously much more expensive than a simple hash calculation. |
Thanks! I will do the math but I guess a memory leak we introduce by caching is more likely than having an issue with hash collisions :) Anyway, I'll check the probability tomorrow, just need to recover some science from my mind places 🔬 |
Relates to #3884, where JRuby was always returning the same hash for certain combinations of nested array with an innermost empty array.
It turns out that the main issue was that we always hashed an empty array as 0, which had cascading effects for the rest of the |
@headius you're awesome, thank you! Will it be backported and which versions are affected? In other words could you tell me what versions of jruby we should test against (starting from 9k)? |
btw I tested the code and it works now! |
It will be in 9.1.1.0. If 1.7 exhibits the same problem we could backport there, but we only maintain one 9.x release at a time (i.e. there will never be a 9.0.6.0 now that 9.1 is out). |
@headius OK. We don't support jruby 1.7, only 9k, so it's not an issue for us, feel free to decide about backporting. |
My ported hash logic appears to have broken some recursive detection, so I'm backing off that change and trying to just do the |
@headius yeah, sure. A few minutes ago I found that something is not quite right too :) |
I have a patch to master that gets specs passing again (silly bit width mistake on my part): diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644
--- a/core/src/main/java/org/jruby/RubyArray.java
+++ b/core/src/main/java/org/jruby/RubyArray.java
@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
public RubyFixnum hash(ThreadContext context) {
final Ruby runtime = context.runtime;
int begin = RubyArray.this.begin;
- long h = (realLength << 32) & System.identityHashCode(RubyArray.class);
+ long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
for (int i = begin; i < begin + realLength; i++) {
h = (h << 1) | (h < 0 ? 1 : 0);
final IRubyObject value = safeArrayRef(runtime, values, i); Doing further testing on it locally before pushing. |
OK I'll be able to run the specs again In a couple of hours
|
Sorry...I was wrong, it doesn't pass specs. I'll report back here when I have everything passing. |
The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.
The fix here is as follows: * Always provide a non-zero hash for any array, including empty. * Better hash combination function with less collision. An improved recursion guard will go into the next release.
Ok, I've merged in a simpler fix for 9.1.1.0, which you could verify. You might also be interested in trying out the test-new-recursion branch, which has the improved logic. |
The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.
Thanks! I tested both solutions and here is what I got:
irb(main):017:0> [{a: 1}].hash
=> 5019126971421186946
irb(main):018:0> [{b: 1}].hash
=> 5019126971421186946 |
@flash-gordon I do not see that on the current branch, but I believe I had fixes go in after you tested.
I do have one MRI test failure to investigate. |
@headius you should test it in the same process gordon@mac ~/dev/jruby [test-new-recursion] $ ./bin/jruby -e 'p [{a: 1}].hash; p [{b: 1}].hash;'
6962820890589519712
6962820890589519712 |
Yeah I realized that afterward, and it turns out the MRI failure was the same issue. It's now fixed on the branch and will be merged into 9.1.2.0. Feel free to verify again if you like :-) |
Sure. I ran specs (again, 2.2K specs) ten times with random order and had no issues, so it seems to work fine and it's very awesome, thank you! |
Btw, when do you plan to release 9.1.1.0. I mean rough estimation of course. |
Within the next couple days. Blocked on #3867 and related issues right now (problems activating gems inside |
* See jruby/jruby#3884 for retails * I'll replace it with add 9.1.1.0 release when it is available
The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.
The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.
Relates to jruby/jruby#3884, where JRuby was always returning the same hash for certain combinations of nested array with an innermost empty array.
Environment
It seems that every version of jruby starting from at least 9k-pre1 has the following behavior.
Expected Behavior
Actual Behavior
We expect things to work in dry-validation because we use caching based on a hash of AST which in order is represented by a deeply nested array. The issue leads to returning the same result based on different outputs which surprises a lot.
Original values that you can use for testing a fix:
I don't know what exactly is going wrong but I tried to provide as minimal failing case as I could and it would supercool to have this fixed because you guys are doing a great job and having a support for jruby really matters for us, thank you!
The text was updated successfully, but these errors were encountered: