Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve ThreadGroup#list order with WeakIdentityLinkedHashSet #2221

Closed
wants to merge 4 commits into from

Conversation

godfat
Copy link
Contributor

@godfat godfat commented Nov 21, 2014

Because MRI uses a linked list which preserves the order of the threads.
https://github.com/ruby/ruby/blob/2a754a733045da9965e88d1f31e650ea6b3f2b6c/vm_core.h#L973-L978
This is useful to assume the first thread in the list is the root of a
group, and saving values in the thread could be shared amongst the
thread group. If we ever have thread group local variable, this is not
really needed.

@godfat
Copy link
Contributor Author

godfat commented Nov 21, 2014

Sorry that I failed to build JRuby on my machine,
and therefore I hope Travis could do this for me.
After everything is ok, I guess we could add a test
in RubySpec for this. Maybe something like:

def check_group_list root, level=10
  Thread.current.group.list.first.should == root
  if level > 0
    Thread.new{ check_group_list(root, level - 1) }.join
  end
end

Thread.new do
  root = Thread.current
  ThreadGroup.new.add(root)
  check_group_list(root)
end.join

@godfat godfat force-pushed the preserve-thread-group-list-order branch 3 times, most recently from a4e109a to 2d312e0 Compare November 21, 2014 07:22
Because MRI uses a linked list which preserves the order of the threads.
https://github.com/ruby/ruby/blob/2a754a733045da9965e88d1f31e650ea6b3f2b6c/vm_core.h#L973-L978
This is useful to assume the first thread in the list is the root of a
group, and saving values in the thread could be shared amongst the
thread group. If we ever have thread group local variable, this is not
really needed.
@godfat godfat force-pushed the preserve-thread-group-list-order branch from 2d312e0 to 475746f Compare November 21, 2014 09:20
@headius
Copy link
Member

headius commented Nov 24, 2014

Your patch is close, but as written it will leave dead WeakReference objects accumulating in the list when the threads associated with them are not removed before being collected.

Have a look at how ObjectSpace tracks objects weakly using a doubly-linked list and a ReferenceQueue; on every insertion or removal the queue is polled and dead references are removed from the list.

@godfat
Copy link
Contributor Author

godfat commented Nov 24, 2014

I just read the source, and it looks like it's a WeakValuedIdentityMap instead of doubly-linked list? For an ArrayList used in this patch, I am not sure if this is a good idea as every remove is O(n), if there's a ton of threads created and destroyed this might take a while?

Should I then change ArrayList to something else, like a SortedMap or something? This might be a bit tricky because we need to keep the same order here. Now I start to believe order preserving map does have something...

@godfat
Copy link
Contributor Author

godfat commented Nov 28, 2014

So is that ok even if remove is O(n)?
Please review this again, and I hope this could get fixed before the next release :D

@godfat
Copy link
Contributor Author

godfat commented Dec 8, 2014

Anyone would like to review this again?
Currently pork is depending on this, and all my tests were based on it. That means all my tests are failing on JRuby due to this issue. It's not really a big deal therefore I left all of them failed. But it would be good that I could be sure that this would get fixed before JRuby 9000 is released.

Many thanks.

@headius
Copy link
Member

headius commented Dec 17, 2014

My only remaining concern is that we're losing Set semantics by going to the ArrayList-based version. This is a problem since you can add threads to a thread group at any time; people are bound to end up with duplicates unless we either scan on insertion or use a Set again.

Unfortunately the best option might be for ThreadGroup to have two weak collections, one with Set semantics and one with ordering. I'll confer with @enebo today.

@godfat
Copy link
Contributor Author

godfat commented Dec 17, 2014

Indeed this could lead to duplicated thread and which is definitely not desired. What about using something like this then? I am not familiar with Java data structure, but this looks like what we want here:
http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashSet.html

@headius
Copy link
Member

headius commented Dec 17, 2014

This just gets more and more complicated :-(

LinkedHashSet does get us the ordering we want, but it isn't weak. If we just use WeakReference, it doesn't pass through hashcode and it wouldn't == another WeakReference containing the same object.

We need a weak, ordered, identity hash set and there's none in JDK or JRuby at the moment.

We may be able to modify WeakIdentityHashMap to support insertion-ordering by adding doubly-linked list logic to its Entry class. Iteration in that case would simply walk the linked list of Entry objects, and cleaning up an Entry (when its reference goes dead) would remove itself from that linked list

@scrhartley
Copy link

Can you forget about an identity map and use a key wrapper class to give you the behaviour you want?

@godfat
Copy link
Contributor Author

godfat commented Dec 18, 2014

@headius You know what, I just read the source of LinkedHashMap, after understanding how the double linked list works (this is smart!), I went back to JRuby's WeakIdentityHashMap, realizing that there's already a single linked list and an array preserving the order. The iterator is also using the order. That being said, I guess we just need to switch to WeakIdentityHashMap? Could simply use the technique as openjdk that use an new Object() as the unit of a value to turn the map into a set.

Would that work!? I guess I could do this then. Doesn't sound hard.

@godfat
Copy link
Contributor Author

godfat commented Dec 18, 2014

Apparently I am wrong. It's simply the linked list for collision :(
Still need to modify the Entry class. Will try it out.

On the other hand, we should probably make WeakIdentityHashMap generic... using Object for everything is not very safe.

We extend WeakIdentityHashMap for WeakIdentityLinkedHashMap, and use it
for WeakIdentityLinkedHashSet. Sorry that this was a naive
implementation and I didn't write any tests for this. Originally I want
to do something like java.util.LinkedHashMap, but it seems
WeakIdentityHashMap didn't implement every thing we need to do the same.
Therefore I decided to just write what we need for RubyThreadGroup.
We could complete this in the future.

Note that this should also fix a bug where WeakIdentityHashMap#valueRemoved
is not called properly.
@godfat
Copy link
Contributor Author

godfat commented Dec 18, 2014

Please checkout fe56e69 :D

@godfat godfat changed the title Preserve ThreadGroup#list order with ArrayList Preserve ThreadGroup#list order with WeakIdentityLinkedHashSet Dec 18, 2014
@headius
Copy link
Member

headius commented Dec 18, 2014

@godfat Wow, you're my hero of the week. Will try to review this and get it into 1.7.19 (.18 is supposed to go out tomorrow or Monday).

I started trying to do this by modifying WeakIdentityHashMap, and just ran into more problems...plus a lot of code uses that class and I worry about changes to it. I'd still like to share more between the two classes, but this is fine for now.

And I agree about generics. If you'd like to try after we merge this, I'd be happy to merge that.

@headius headius added this to the JRuby 1.7.19 milestone Dec 18, 2014
@headius headius self-assigned this Dec 18, 2014
@godfat
Copy link
Contributor Author

godfat commented Dec 18, 2014

Haha, great! It's even better to have this before 9k.

I feel WeakIdentityHashMap is not as good as how HashMap was implemented. Maybe we should instead update HashMap from OpenJDK to make it easier to be integrated with WeakReference, so that we don't have to implement all those HashMap logic in WeakIdentityHashMap...

And great I would try to make WeakIdentityHashMap generic after this was merged. I thought I might do that in this pull request, but I feel I already touched too much and added too much code I am not very sure if they are all correct...

@headius
Copy link
Member

headius commented Dec 22, 2014

We can't incorporate code directly from OpenJDK but we can implement similar data structures. For now we will pull in your new and improved WeakIdentityHashMap.

headius added a commit that referenced this pull request Dec 22, 2014
This is a squashed commit of the following changes for #2221.

Thanks to @godfat for the work on this one.

commit 4bc9e8e
Author: Lin Jen-Shin <godfat@godfat.org>
Date:   Fri Dec 19 04:13:20 2014 +0800

    WeakIdentityLinkedHashMap: Properly set head.

commit fe56e69
Author: Lin Jen-Shin <godfat@godfat.org>
Date:   Fri Dec 19 02:10:43 2014 +0800

    Use WeakIdentityLinkedHashSet for RubyThreadGroup:

    We extend WeakIdentityHashMap for WeakIdentityLinkedHashMap, and use it
    for WeakIdentityLinkedHashSet. Sorry that this was a naive
    implementation and I didn't write any tests for this. Originally I want
    to do something like java.util.LinkedHashMap, but it seems
    WeakIdentityHashMap didn't implement every thing we need to do the same.
    Therefore I decided to just write what we need for RubyThreadGroup.
    We could complete this in the future.

    Note that this should also fix a bug where WeakIdentityHashMap#valueRemoved
    is not called properly.

commit e5bcad9
Author: Lin Jen-Shin <godfat@godfat.org>
Date:   Tue Nov 25 00:43:53 2014 +0800

    Use a ReferenceQueue to sweep dead references

commit 475746f
Author: Lin Jen-Shin <godfat@godfat.org>
Date:   Fri Nov 21 08:06:39 2014 +0800

    Preserve ThreadGroup#list order with ArrayList

    Because MRI uses a linked list which preserves the order of the threads.
    https://github.com/ruby/ruby/blob/2a754a733045da9965e88d1f31e650ea6b3f2b6c/vm_core.h#L973-L978
    This is useful to assume the first thread in the list is the root of a
    group, and saving values in the thread could be shared amongst the
    thread group. If we ever have thread group local variable, this is not
    really needed.
@headius
Copy link
Member

headius commented Dec 22, 2014

@godfat I squashed your commits into one (to avoid experimentation noise in history) and pushed to test-fix-2221. If it looks ok we'll merge it in for 1.7.19.

@godfat
Copy link
Contributor Author

godfat commented Dec 23, 2014

If you're talking about a9fa229, I didn't see WeakIdentityLinkedHashSet there?

On the other hand, it would be great if the squashed commit could keep the original author (that is me :P). I just checked git, it seems there's no single one command to squash merge and set the author. I could do that in this pull request via rebase if it's troublesome.

godfat added a commit to godfat/pork that referenced this pull request Dec 25, 2014
then why on earth did i spend so much time doing this?
jruby/jruby#2221

fine, let's just find the thread...
@headius
Copy link
Member

headius commented Jan 12, 2015

It turns out this behavior was not specified in Ruby, and MRI stopped keeping them in order with the 2.2 release. Until they decide to specify this behavior, we will wait to make this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants