-
-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
autoload & require deadlock #3341
Comments
I can reproduce it with jruby-1.7.x but so far did not see it in jruby-9k |
also on jruby-9k but needed quite a few trials :) |
There's more, I was able to create a similar problem without autoloads. The require lock itself allows for deadlocks #file: dltest.rb
require 'thread'
ta = Thread.new {
require './class_a'
require './class_b'
}
tb = Thread.new {
require './class_b'
require './class_a'
}
ta.join
tb.join
puts('Try again')
#file: class_a.rb
require './class_b'
require './class_c'
class ClassA
def initialize
@d = ClassC.new
end
end
#file: class_b.rb
require './class_a'
require './class_c'
class ClassB
def initialize
@d = ClassC.new
end
end
#file: class_c.rb
class ClassC
def initialize
end
end
# this sleep makes it easier to repro the problem
sleep(1) run it in the same stubborn way while [ true ] ; do jruby dltest.rb ; done this is the jstack output for this test
|
@bignacio very much appreciated the new reproduction case. I do have some fix for the first deadlock BUT I do see some IllegalMonitorStateException from time to time. |
this eventually fixes #3341 in the sense that both testcases do not produce deadlocks anymore. Sponsored by Lookout Inc.
The general case of two threads requiring two files in opposite order can't be fixed. This is a flaw in how Ruby locks around files, and it's possible to deadlock CRuby in the same way. The only fix for that would be to have a single global lock for all requires, serializing them across all threads. I suspect that the autoload deadlock may just be another flavor of the same problem. I'm looking over the fixes @mkristian made in #3370 and I am confused about a few things:
If the spinning eventually just gives up, that's not proper semantics because the file in question has not been loaded. I suspect it "fixes" the deadlock because the failed thread will now back all the way out of its locks, allowing the other thread to proceed. |
running those two "test" with MRI there are no deadlocks but with warning enabled you see
i.e. warning: loading in progress, circular require considered harmful for both examples. with #3370 jruby produces:
I learned that this deadlock situation can not be solved without having a global require lock - but MRI needs to be first doing it. @bignacio probably a "wont fix" but for me getting closer to MRI would help. |
@mkristian Thanks for looking into this. I'd risk proposing a scenario where a deadlock could happen even with a global require lock - given the fact you can require anytime, anywhere in your code.
now, this example is pure bad coding but perhaps in a more complex sequence of requires, using third-party code, the unadvised developer could run into cases like this. It's a tough problem to solve properly and elegantly and still support the same functionalities. |
@bignacio Assuming those were not mutexes but instead monitors (reentrant locking) there's no deadlock here. The second synchronize in "file3" would have been acquired already. It also would not deadlock with multiple threads, since only the first thread to acquire the lock would continue to finish its loading. The only place where a single require lock would cause the system to deadlock is when a require or load never returns. For example, a server that you start by requiring a file. I don't know how common this is in Ruby-land, but I'd hope it's pretty rare. |
I'm sorry, we need to revert this change and reexamine the problem. This change causes require to allow all contending threads to proceed without waiting for a require to complete (#4091). And my initial re-examination of this bug points out a big problem to me: if two threads require two files in different orders, and those files depend on each other, there's no way to avoid a deadlock under the current model of Ruby require locking. So this bug may be a correct behavior of Ruby as it stands today. |
Specifically, in this example it is not possible to be thread-safe: #3341 (comment)
So we have inconsistent lock ordering. This is the simplest example of a deadlock. The only fix for this is to have a single lock for requiring all files, and only one thread gets to do requires at a time. JRuby used to have such a mode; I will check that it still works, so it can be a workaround for systems with badly-structured require trees. |
Until convinced otherwise, I'm calling this one Won't Fix. It is not possible to avoid deadlocks in Ruby requires with their current specification. Don't require interdependent files in different orders from different threads. I will look into deadlock detection, to see if we can find a way to error out rather than blocking. |
I have written a naive deadlock detector that will error out if it sees two threads with conflicting require locks. I do not believe it will work for anything more than two threads right now. https://gist.github.com/headius/6a86bae2556d03b5fa235d6df695278c The logic here basically spins attempting to acquire the lock. Initially, and for each spin that the lock fails to be acquired, it checks for deadlocks. Deadlock checking involves looking at the thread that owns the require lock and determining if it is waiting on a lock itself. If so, and that lock appears in the list of locks the current thread holds, we have a two-way deadlock. There are algorithms for doing N-way deadlock detection but they get very complicated. I am not sure if we want that in the require path...but then again, it will only be needed when two threads contend a single require lock. It may be unlikely to affect anyone if the deadlock detection is heavier. |
This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See #3341.
This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See jruby#3341.
This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See #3341.
I run into an interesting problem when using require and autoload for the same classes.
I only tested it against JRuby 1.7.21 and quickly looking at the code I'd attributed the problem to combination of a global lock for autoload consts and the require lock.
Since autoload at some point will call require and try to acquire a lock for a file, that same file can contain autoloaded classes that in turn will need to acquire the autoload lock.
I wrote a little, perhaps convoluted, sample code to reproduce the problem. It is gisted here https://gist.github.com/bignacio/7a947960113775477c72 and I copy it below
Save all the files into the same directory and run
$ jruby main.rb
as any good deadlock, it won't happen all the time so you can do something like this
$ while [ true ] ; do jruby main.rb ; done
and wait untill it hangs.
You can confirm the dealock with jstack
Once the process is hung, get its pid and run
jstack -l < pid >
jstack output
The text was updated successfully, but these errors were encountered: