Invalid hash value for nested arrays #3884

flash-gordon · 2016-05-12T18:46:16Z

Environment

It seems that every version of jruby starting from at least 9k-pre1 has the following behavior.

Expected Behavior

2.3.1 :001 > [1, [1, []]].hash
 => 2397343324588010773
2.3.1 :002 > [2, [2, []]].hash
 => 4238421788679554352

Actual Behavior

jruby-9.1.0.0 :086 > [1, [1, []]].hash
 => 0
jruby-9.1.0.0 :087 > [2, [2, []]].hash
 => 0

We expect things to work in dry-validation because we use caching based on a hash of AST which in order is represented by a deeply nested array. The issue leads to returning the same result based on different outputs which surprises a lot.

Original values that you can use for testing a fix:

jruby-9.1.0.0 :001 > [[:and, [[:val, [:predicate, [:key?, [:age]]]], [:key, [:age, [:predicate, [:lt?, [23]]]]]]]].hash
 => 8481884717
jruby-9.1.0.0 :002 > [[:and, [[:val, [:predicate, [:key?, [:foo]]]], [:key, [:foo, [:predicate, [:lt?, [23]]]]]]]].hash
 => 8481884717

I don't know what exactly is going wrong but I tried to provide as minimal failing case as I could and it would supercool to have this fixed because you guys are doing a great job and having a support for jruby really matters for us, thank you!

headius · 2016-05-12T20:31:21Z

That is very strange.

I will point out that just using a hash of the AST has potential to collide. In the past I have used SHA1 or similar to reduce that risk, but they're obviously much more expensive than a simple hash calculation.

flash-gordon · 2016-05-12T21:04:43Z

Thanks! I will do the math but I guess a memory leak we introduce by caching is more likely than having an issue with hash collisions :) Anyway, I'll check the probability tomorrow, just need to recover some science from my mind places 🔬

Relates to #3884, where JRuby was always returning the same hash for certain combinations of nested array with an innermost empty array.

headius · 2016-05-12T21:54:36Z

It turns out that the main issue was that we always hashed an empty array as 0, which had cascading effects for the rest of the Array#hash calculation. I went ahead and re-ported MRI's logic with some JRuby-specific tweaks.

flash-gordon · 2016-05-12T22:24:10Z

@headius you're awesome, thank you! Will it be backported and which versions are affected? In other words could you tell me what versions of jruby we should test against (starting from 9k)?

flash-gordon · 2016-05-12T22:24:27Z

btw I tested the code and it works now!

headius · 2016-05-12T22:58:32Z

It will be in 9.1.1.0. If 1.7 exhibits the same problem we could backport there, but we only maintain one 9.x release at a time (i.e. there will never be a 9.0.6.0 now that 9.1 is out).

flash-gordon · 2016-05-13T07:31:28Z

@headius OK. We don't support jruby 1.7, only 9k, so it's not an issue for us, feel free to decide about backporting.

headius · 2016-05-13T17:43:53Z

My ported hash logic appears to have broken some recursive detection, so I'm backing off that change and trying to just do the [] hash fix. I think that will be enough, but we might need you to reverify.

flash-gordon · 2016-05-13T17:45:49Z

@headius yeah, sure. A few minutes ago I found that something is not quite right too :)

headius · 2016-05-13T17:54:21Z

I have a patch to master that gets specs passing again (silly bit width mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644
--- a/core/src/main/java/org/jruby/RubyArray.java
+++ b/core/src/main/java/org/jruby/RubyArray.java
@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
     public RubyFixnum hash(ThreadContext context) {
         final Ruby runtime = context.runtime;
         int begin = RubyArray.this.begin;
-        long h = (realLength << 32) & System.identityHashCode(RubyArray.class);
+        long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
         for (int i = begin; i < begin + realLength; i++) {
             h = (h << 1) | (h < 0 ? 1 : 0);
             final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.

flash-gordon · 2016-05-13T17:58:58Z

OK I'll be able to run the specs again In a couple of hours
On Пт, 13 мая 2016 г. at 20:54, Charles Oliver Nutter <
notifications@github.com> wrote:

I have a patch to master that gets specs passing again (silly bit width
mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644--- a/core/src/main/java/org/jruby/RubyArray.java+++ b/core/src/main/java/org/jruby/RubyArray.java@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
public RubyFixnum hash(ThreadContext context) {
final Ruby runtime = context.runtime;
int begin = RubyArray.this.begin;- long h = (realLength << 32) & System.identityHashCode(RubyArray.class);+ long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
for (int i = begin; i < begin + realLength; i++) {
h = (h << 1) | (h < 0 ? 1 : 0);
final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.

—
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#3884 (comment)

headius · 2016-05-13T17:59:28Z

Sorry...I was wrong, it doesn't pass specs. I'll report back here when I have everything passing.

headius · 2016-05-13T18:49:29Z

Yeah for various reasons I think this is going to require a completely new recursion detection. I have a prototype that appears to be passing specs.

The new code should address #3887 too. May be a little too risky for 9.1.1.0...@enebo?

The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.

The fix here is as follows: * Always provide a non-zero hash for any array, including empty. * Better hash combination function with less collision. An improved recursion guard will go into the next release.

headius · 2016-05-14T00:16:26Z

Ok, I've merged in a simpler fix for 9.1.1.0, which you could verify. You might also be interested in trying out the test-new-recursion branch, which has the improved logic.

This reverts commit f1e056b.

…"" This reverts commit 89d8af7.

The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.

flash-gordon · 2016-05-14T10:16:00Z

Thanks! I tested both solutions and here is what I got:

Smaller fix f1e056b works fine, all specs (2K+) are green with several runs & random order.
test-new-recursion fails the following example:

irb(main):017:0> [{a: 1}].hash
=> 5019126971421186946
irb(main):018:0> [{b: 1}].hash
=> 5019126971421186946

headius · 2016-05-16T17:27:34Z

@flash-gordon I do not see that on the current branch, but I believe I had fixes go in after you tested.

[] ~/projects/jruby $ jruby -e 'p [{a: 1}].hash'
1994458792184350789

[] ~/projects/jruby $ jruby -e 'p [{b: 1}].hash'
4863057760227230173

I do have one MRI test failure to investigate.

flash-gordon · 2016-05-16T17:33:09Z

@headius you should test it in the same process

gordon@mac ~/dev/jruby [test-new-recursion] $ ./bin/jruby -e 'p [{a: 1}].hash; p [{b: 1}].hash;'
6962820890589519712
6962820890589519712

headius · 2016-05-16T17:34:55Z

Yeah I realized that afterward, and it turns out the MRI failure was the same issue. It's now fixed on the branch and will be merged into 9.1.2.0. Feel free to verify again if you like :-)

flash-gordon · 2016-05-16T17:47:42Z

Sure. I ran specs (again, 2.2K specs) ten times with random order and had no issues, so it seems to work fine and it's very awesome, thank you!

flash-gordon · 2016-05-16T17:51:09Z

Btw, when do you plan to release 9.1.1.0. I mean rough estimation of course.

headius · 2016-05-16T17:52:28Z

Within the next couple days. Blocked on #3867 and related issues right now (problems activating gems inside bundle exec.

* See jruby/jruby#3884 for retails * I'll replace it with add 9.1.1.0 release when it is available

The old recursion guard (ported from MRI years ago) had a number of flaws: * It forced an object ID for every object encountered. * It created transient objects for every recursive stack. * It was not using identity hashing (#3887). * It used a RubyHash internally, which has much more overhead than a typical JDK Map. The new implementation largely follows the pattern of the original but fixes all the above items. It passes all untagged specs. See also #3884, which started this whole thing.

Relates to jruby/jruby#3884, where JRuby was always returning the same hash for certain combinations of nested array with an innermost empty array.

flash-gordon mentioned this issue May 12, 2016

Spec failures on jruby dry-rb/dry-validation#159

Closed

headius added core JRuby 9000 regression labels May 12, 2016

headius closed this as completed in 1b92064 May 12, 2016

headius added a commit that referenced this issue May 12, 2016

Add a spec for hashing a double-nested array with inner zarray.

ac53c54

Relates to #3884, where JRuby was always returning the same hash for certain combinations of nested array with an innermost empty array.

headius added this to the JRuby 9.1.1.0 milestone May 12, 2016

headius reopened this May 13, 2016

headius closed this as completed in 89d8af7 May 14, 2016

headius added a commit that referenced this issue May 14, 2016

Revert "Smaller fix for #3884."

269d815

This reverts commit f1e056b.

headius added a commit that referenced this issue May 14, 2016

Revert "Revert "Re-port Array#hash and some related logic to fix #3884.…

b06dd52

…"" This reverts commit 89d8af7.

flash-gordon added a commit to dry-rb/dry-validation that referenced this issue May 16, 2016

Do not run specs on jruby-9000 (closes #159)

d232677

* See jruby/jruby#3884 for retails * I'll replace it with add 9.1.1.0 release when it is available

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

Invalid hash value for nested arrays #3884

Invalid hash value for nested arrays #3884

flash-gordon commented May 12, 2016

headius commented May 12, 2016

flash-gordon commented May 12, 2016

headius commented May 12, 2016

flash-gordon commented May 12, 2016

flash-gordon commented May 12, 2016

headius commented May 12, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

headius commented May 13, 2016

headius commented May 14, 2016

flash-gordon commented May 14, 2016

headius commented May 16, 2016

flash-gordon commented May 16, 2016

headius commented May 16, 2016

flash-gordon commented May 16, 2016

flash-gordon commented May 16, 2016

headius commented May 16, 2016

Invalid hash value for nested arrays #3884

Invalid hash value for nested arrays #3884

Comments

flash-gordon commented May 12, 2016

Environment

Expected Behavior

Actual Behavior

headius commented May 12, 2016

flash-gordon commented May 12, 2016

headius commented May 12, 2016

flash-gordon commented May 12, 2016

flash-gordon commented May 12, 2016

headius commented May 12, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

flash-gordon commented May 13, 2016

headius commented May 13, 2016

headius commented May 13, 2016

headius commented May 14, 2016

flash-gordon commented May 14, 2016

headius commented May 16, 2016

flash-gordon commented May 16, 2016

headius commented May 16, 2016

flash-gordon commented May 16, 2016

flash-gordon commented May 16, 2016

headius commented May 16, 2016