Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array hashing is in Ruby 1.8 mode #2437

Closed
kml opened this issue Jan 7, 2015 · 10 comments · Fixed by #2443
Closed

Array hashing is in Ruby 1.8 mode #2437

kml opened this issue Jan 7, 2015 · 10 comments · Fixed by #2443

Comments

@kml
Copy link

kml commented Jan 7, 2015

Different arrays, are hashing very similarly

>> [].hash
=> 0
>> [2].hash
=> 0

MRI's behaviour is different:

>> [].hash
=> 3252255281033687896
>> [2].hash
=> 3208161167426848866
@headius
Copy link
Member

headius commented Jan 7, 2015

We calculate based on array length and hash of the elements, with some bitshifting etc.

Based on MRI's implementation it appears we're missing a few things:

  • Murmur calculation of the hash
  • Pseudo-random salt (MRI uses rb_ary_hash pointer address)

@Who828
Copy link
Contributor

Who828 commented Jan 8, 2015

I can take this up tonight, looks like a nice bug get started.

@Who828
Copy link
Contributor

Who828 commented Jan 8, 2015

Here is the change, https://gist.github.com/Who828/056848941c6efee4044e
Let me know if that's what you expected before I send the PR.

@headius
Copy link
Member

headius commented Jan 9, 2015

The PR looks good, but I wonder if we should also add some kind of pseudo-random salt like MRI. Could be a one-shot random number from almost anywhere. Or we could skip it.

@enebo, @tduehr: Yay or nay? Do we need it?

@tduehr
Copy link
Contributor

tduehr commented Jan 9, 2015

I think it's a good idea to have one per runtime. I'm not entirely sure the full impact, there may be some issues relating to shared objects between runtimes.

@headius
Copy link
Member

headius commented Jan 12, 2015

#2443 gets us using proper hashing algorithm but does not include salting. I'll file a separate issue for that, since I think we need a bit more discussion about what to use for salt.

@headius
Copy link
Member

headius commented Jan 12, 2015

Had to revert the patch because it has other issues.

@headius headius reopened this Jan 12, 2015
@enebo
Copy link
Member

enebo commented Jan 12, 2015

The index and realLength passed into the hash was not for the bytelist but for the array. That was one thing I noticed. The other issue I saw too was [nil, false].hash nil.toString().getBytes() seems to return null?

@MSNexploder
Copy link
Contributor

Looks like this is fixed in 9.1.10.0:

[].hash
=> 5632391355580947474
[2].hash
=> 4399786657182928470

@headius
Copy link
Member

headius commented Jun 1, 2017

Woot, thanks!

@headius headius closed this as completed Jun 1, 2017
@headius headius added this to the JRuby 9.1.9.0 milestone Jun 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment