autoload & require deadlock #3341

bignacio · 2015-09-21T16:36:20Z

I run into an interesting problem when using require and autoload for the same classes.

I only tested it against JRuby 1.7.21 and quickly looking at the code I'd attributed the problem to combination of a global lock for autoload consts and the require lock.

Since autoload at some point will call require and try to acquire a lock for a file, that same file can contain autoloaded classes that in turn will need to acquire the autoload lock.

I wrote a little, perhaps convoluted, sample code to reproduce the problem. It is gisted here https://gist.github.com/bignacio/7a947960113775477c72 and I copy it below

Save all the files into the same directory and run

$ jruby main.rb

as any good deadlock, it won't happen all the time so you can do something like this

$ while [ true ] ; do jruby main.rb ; done

and wait untill it hangs.

You can confirm the dealock with jstack
Once the process is hung, get its pid and run

jstack -l < pid >

# another_class_base.rb
module DeadlockTest
  class AnotherClassBase
  end
end

# file:  class_a.rb
module DeadlockTest
  autoload :AnotherClassBase, 'another_class_base'
  class ClassA
    class AnotherClass < AnotherClassBase
    end
  end
end

# file:  deadlock_test.rb
module DeadlockTest
  autoload :ClassA, 'class_a'
end

# file: main.rb
require 'thread'
require 'deadlock_test'

puts("Starting, JRuby version: #{JRUBY_VERSION}")
ta = Thread.new{
  puts("Thread A #{JRuby.reference(Thread.current).native_thread}\n")

  # This is not a usual thing for people to do but it can happen if an external component is loading
  # your ruby code
  # I first run into this problem in a storm topology (via RedStorm), which does
  # a similar thing. I'm sure there are other ways to reproduce the same problem
  # without a require but this
  require 'class_a'
}

tb = Thread.new{
  puts("Thread B #{JRuby.reference(Thread.current).native_thread}\n")

  o = DeadlockTest::ClassA.new
  puts('if you see this, try again :) ')
}

ta.join
tb.join

jstack output

Found one Java-level deadlock:
=============================
"Ruby-0-Thread-3: main.rb:1":
  waiting for ownable synchronizer 0x00000007b65c1c40, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "Ruby-0-Thread-2: main.rb:1"
"Ruby-0-Thread-2: main.rb:1":
  waiting to lock monitor 0x00007f84e9014b38 (object 0x00000007b6408c78, a java.lang.Object),
  which is held by "Ruby-0-Thread-3: main.rb:1"

Java stack information for the threads listed above:
===================================================
"Ruby-0-Thread-3: main.rb:1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007b65c1c40> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jruby.runtime.load.LoadService$RequireLocks.lock(LoadService.java:509)
    at org.jruby.runtime.load.LoadService$RequireLocks.access$200(LoadService.java:478)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:444)
    at org.jruby.runtime.load.LoadService.autoloadRequire(LoadService.java:418)
    at org.jruby.RubyKernel$1.load(RubyKernel.java:202)
    at org.jruby.RubyModule$Autoload.getConstant(RubyModule.java:3832)
    - locked <0x00000007b6408c78> (a java.lang.Object)
    at org.jruby.RubyModule.getAutoloadConstant(RubyModule.java:3621)
    at org.jruby.RubyModule.resolveUndefConstant(RubyModule.java:3108)
    at org.jruby.RubyModule.getConstantFromNoConstMissing(RubyModule.java:3076)
    at org.jruby.ast.executable.RuntimeCache.reCacheFrom(RuntimeCache.java:442)
    at org.jruby.ast.executable.RuntimeCache.getValueFrom(RuntimeCache.java:436)
    at org.jruby.ast.executable.RuntimeCache.getConstantFrom(RuntimeCache.java:429)
    at org.jruby.ast.executable.AbstractScript.getConstantFrom8(AbstractScript.java:336)
    at main.block_1$RUBY$__file__(main.rb:19)
    at main$block_1$RUBY$__file__.call(main$block_1$RUBY$__file__)
    at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)
    at org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)
    at org.jruby.runtime.Block.call(Block.java:101)
    at org.jruby.RubyProc.call(RubyProc.java:290)
    at org.jruby.RubyProc.call(RubyProc.java:228)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
    at java.lang.Thread.run(Thread.java:745)
"Ruby-0-Thread-2: main.rb:1":
    at org.jruby.RubyModule$Autoload.getConstant(RubyModule.java:3825)
    - waiting to lock <0x00000007b6408c78> (a java.lang.Object)
    at org.jruby.RubyModule.getAutoloadConstant(RubyModule.java:3621)
    at org.jruby.RubyModule.resolveUndefConstant(RubyModule.java:3108)
    at org.jruby.RubyModule.getConstantAtSpecial(RubyModule.java:2968)
    at org.jruby.RubyModule.defineOrGetClassUnder(RubyModule.java:1281)
    at org.jruby.ast.ClassNode.interpret(ClassNode.java:134)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_CLASS(ASTInterpreter.java:103)
    at org.jruby.evaluator.ASTInterpreter.evalClassDefinitionBody(ASTInterpreter.java:280)
    at org.jruby.ast.ModuleNode.interpret(ModuleNode.java:120)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.RootNode.interpret(RootNode.java:129)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_ROOT(ASTInterpreter.java:121)
    at org.jruby.Ruby.runInterpreter(Ruby.java:894)
    at org.jruby.Ruby.loadFile(Ruby.java:2860)
    at org.jruby.runtime.load.ExternalScript.load(ExternalScript.java:66)
    at org.jruby.runtime.load.LoadService.tryLoadingLibraryOrScript(LoadService.java:892)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:465)
    at org.jruby.runtime.load.LoadService.require(LoadService.java:414)
    at org.jruby.RubyKernel.requireCommon(RubyKernel.java:1047)
    at org.jruby.RubyKernel.require19(RubyKernel.java:1040)
    at org.jruby.RubyKernel$INVOKER$s$1$0$require19.call(RubyKernel$INVOKER$s$1$0$require19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:352)
    at org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.ReturnNode.interpret(ReturnNode.java:92)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.IfNode.interpret(IfNode.java:116)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
    at org.jruby.ast.RescueNode.interpret(RescueNode.java:116)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
    at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:182)
    at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:203)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)
    at main.block_0$RUBY$__file__(main.rb:13)
    at main$block_0$RUBY$__file__.call(main$block_0$RUBY$__file__)
    at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)
    at org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)
    at org.jruby.runtime.Block.call(Block.java:101)
    at org.jruby.RubyProc.call(RubyProc.java:290)
    at org.jruby.RubyProc.call(RubyProc.java:228)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
    at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.

The text was updated successfully, but these errors were encountered:

mkristian · 2015-09-22T14:15:56Z

I can reproduce it with jruby-1.7.x but so far did not see it in jruby-9k

mkristian · 2015-09-22T14:46:38Z

also on jruby-9k but needed quite a few trials :)

bignacio · 2015-09-23T17:49:04Z

There's more, I was able to create a similar problem without autoloads. The require lock itself allows for deadlocks

#file: dltest.rb
require 'thread'

ta = Thread.new { 
  require './class_a'
  require './class_b'
}

tb = Thread.new {
  require './class_b'
  require './class_a'
}

ta.join
tb.join

puts('Try again')

#file: class_a.rb
require './class_b'
require './class_c'

class ClassA
  def initialize
    @d = ClassC.new
  end
end

#file: class_b.rb
require './class_a'
require './class_c'

class ClassB
  def initialize
    @d = ClassC.new
  end
end

#file: class_c.rb
class ClassC
  def initialize

  end
end
# this sleep makes it easier to repro the problem
sleep(1)

run it in the same stubborn way

while [ true ] ; do jruby dltest.rb ; done

this is the jstack output for this test

Found one Java-level deadlock:
=============================
"Ruby-0-Thread-3: dltest.rb:1":
  waiting for ownable synchronizer 0x00000007b6475aa0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "Ruby-0-Thread-2: dltest.rb:1"
"Ruby-0-Thread-2: dltest.rb:1":
  waiting for ownable synchronizer 0x00000007b6519890, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "Ruby-0-Thread-3: dltest.rb:1"

Java stack information for the threads listed above:
===================================================
"Ruby-0-Thread-3: dltest.rb:1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007b6475aa0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jruby.runtime.load.LoadService$RequireLocks.lock(LoadService.java:509)
    at org.jruby.runtime.load.LoadService$RequireLocks.access$200(LoadService.java:478)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:444)
    at org.jruby.runtime.load.LoadService.require(LoadService.java:414)
    at org.jruby.RubyKernel.requireCommon(RubyKernel.java:1047)
    at org.jruby.RubyKernel.require19(RubyKernel.java:1040)
    at org.jruby.RubyKernel$INVOKER$s$1$0$require19.call(RubyKernel$INVOKER$s$1$0$require19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:352)
    at org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.ReturnNode.interpret(ReturnNode.java:92)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.IfNode.interpret(IfNode.java:116)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
    at org.jruby.ast.RescueNode.interpret(RescueNode.java:116)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
    at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:182)
    at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:203)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RootNode.interpret(RootNode.java:129)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_ROOT(ASTInterpreter.java:121)
    at org.jruby.Ruby.runInterpreter(Ruby.java:894)
    at org.jruby.Ruby.loadFile(Ruby.java:2860)
    at org.jruby.runtime.load.LibrarySearcher$ResourceLibrary.loadScript(LibrarySearcher.java:264)
    at org.jruby.runtime.load.LibrarySearcher$ResourceLibrary.load(LibrarySearcher.java:252)
    at org.jruby.runtime.load.LibrarySearcher$FoundLibrary.load(LibrarySearcher.java:51)
    at org.jruby.runtime.load.LoadService.tryLoadingLibraryOrScript(LoadService.java:892)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:465)
    at org.jruby.runtime.load.LoadService.require(LoadService.java:414)
    at org.jruby.RubyKernel.requireCommon(RubyKernel.java:1047)
    at org.jruby.RubyKernel.require19(RubyKernel.java:1040)
    at org.jruby.RubyKernel$INVOKER$s$1$0$require19.call(RubyKernel$INVOKER$s$1$0$require19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:352)
    at org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.ReturnNode.interpret(ReturnNode.java:92)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.IfNode.interpret(IfNode.java:116)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
    at org.jruby.ast.RescueNode.interpret(RescueNode.java:116)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
    at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:182)
    at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:203)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)
    at dltest.block_1$RUBY$__file__(dltest.rb:9)
    at dltest$block_1$RUBY$__file__.call(dltest$block_1$RUBY$__file__)
    at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)
    at org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)
    at org.jruby.runtime.Block.call(Block.java:101)
    at org.jruby.RubyProc.call(RubyProc.java:290)
    at org.jruby.RubyProc.call(RubyProc.java:228)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
    at java.lang.Thread.run(Thread.java:745)
"Ruby-0-Thread-2: dltest.rb:1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007b6519890> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jruby.runtime.load.LoadService$RequireLocks.lock(LoadService.java:509)
    at org.jruby.runtime.load.LoadService$RequireLocks.access$200(LoadService.java:478)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:444)
    at org.jruby.runtime.load.LoadService.require(LoadService.java:414)
    at org.jruby.RubyKernel.requireCommon(RubyKernel.java:1047)
    at org.jruby.RubyKernel.require19(RubyKernel.java:1040)
    at org.jruby.RubyKernel$INVOKER$s$1$0$require19.call(RubyKernel$INVOKER$s$1$0$require19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:352)
    at org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.ReturnNode.interpret(ReturnNode.java:92)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.IfNode.interpret(IfNode.java:116)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
    at org.jruby.ast.RescueNode.interpret(RescueNode.java:116)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
    at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:182)
    at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:203)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RootNode.interpret(RootNode.java:129)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_ROOT(ASTInterpreter.java:121)
    at org.jruby.Ruby.runInterpreter(Ruby.java:894)
    at org.jruby.Ruby.loadFile(Ruby.java:2860)
    at org.jruby.runtime.load.LibrarySearcher$ResourceLibrary.loadScript(LibrarySearcher.java:264)
    at org.jruby.runtime.load.LibrarySearcher$ResourceLibrary.load(LibrarySearcher.java:252)
    at org.jruby.runtime.load.LibrarySearcher$FoundLibrary.load(LibrarySearcher.java:51)
    at org.jruby.runtime.load.LoadService.tryLoadingLibraryOrScript(LoadService.java:892)
    at org.jruby.runtime.load.LoadService.requireCommon(LoadService.java:465)
    at org.jruby.runtime.load.LoadService.require(LoadService.java:414)
    at org.jruby.RubyKernel.requireCommon(RubyKernel.java:1047)
    at org.jruby.RubyKernel.require19(RubyKernel.java:1040)
    at org.jruby.RubyKernel$INVOKER$s$1$0$require19.call(RubyKernel$INVOKER$s$1$0$require19.gen)
    at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(JavaMethod.java:352)
    at org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
    at org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)
    at org.jruby.ast.ReturnNode.interpret(ReturnNode.java:92)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.IfNode.interpret(IfNode.java:116)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.ast.RescueNode.executeBody(RescueNode.java:221)
    at org.jruby.ast.RescueNode.interpret(RescueNode.java:116)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
    at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:182)
    at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:203)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)
    at dltest.block_0$RUBY$__file__(dltest.rb:4)
    at dltest$block_0$RUBY$__file__.call(dltest$block_0$RUBY$__file__)
    at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)
    at org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)
    at org.jruby.runtime.Block.call(Block.java:101)
    at org.jruby.RubyProc.call(RubyProc.java:290)
    at org.jruby.RubyProc.call(RubyProc.java:228)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
    at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.

mkristian · 2015-09-23T19:39:07Z

@bignacio very much appreciated the new reproduction case. I do have some fix for the first deadlock BUT I do see some IllegalMonitorStateException from time to time.

this eventually fixes #3341 in the sense that both testcases do not produce deadlocks anymore. Sponsored by Lookout Inc.

headius · 2015-10-05T19:51:13Z

The general case of two threads requiring two files in opposite order can't be fixed. This is a flaw in how Ruby locks around files, and it's possible to deadlock CRuby in the same way. The only fix for that would be to have a single global lock for all requires, serializing them across all threads.

I suspect that the autoload deadlock may just be another flavor of the same problem.

I'm looking over the fixes @mkristian made in #3370 and I am confused about a few things:

Switching to a reentrant lock should not make any difference; Java's object monitor synchronization is already reentrant.
The logic appears to spin for a few attempts to lock the ctxLock, but I don't see what it does if it is unable to acquire the lock during that time.

If the spinning eventually just gives up, that's not proper semantics because the file in question has not been loaded. I suspect it "fixes" the deadlock because the failed thread will now back all the way out of its locks, allowing the other thread to proceed.

mkristian · 2015-10-08T20:33:54Z

running those two "test" with MRI there are no deadlocks but with warning enabled you see

/usr/local/Cellar/ruby/2.2.3/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54: warning: loading in progress, circular require considered harmful - /Users/cmeier/projects/active/jruby/deadlock/test2/class_a.rb
    from dltest.rb:10:in  `block in <main>'
    from /usr/local/Cellar/ruby/2.2.3/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in  `require'
    from /usr/local/Cellar/ruby/2.2.3/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in  `require'
    from /Users/cmeier/projects/active/jruby/deadlock/test2/class_b.rb:2:in  `<top (required)>'
    from /usr/local/Cellar/ruby/2.2.3/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in  `require'
    from /usr/local/Cellar/ruby/2.2.3/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in  `require'

i.e. warning: loading in progress, circular require considered harmful for both examples.

with #3370 jruby produces:

/Users/cmeier/projects/active/jruby/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54: warning: loading in progress, circular require considered harmful - /Users/cmeier/projects/active/jruby/deadlock/test2/class_c.rb
             require at org/jruby/RubyKernel.java:939
             require at /Users/cmeier/projects/active/jruby/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
               <top> at /Users/cmeier/projects/active/jruby/deadlock/test2/class_b.rb:3
             require at org/jruby/RubyKernel.java:939
              (root) at /Users/cmeier/projects/active/jruby/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
  block in dltest.rb at dltest.rb:10

I learned that this deadlock situation can not be solved without having a global require lock - but MRI needs to be first doing it.

@bignacio probably a "wont fix" but for me getting closer to MRI would help.

bignacio · 2015-10-09T14:43:48Z

@mkristian Thanks for looking into this.

I'd risk proposing a scenario where a deadlock could happen even with a global require lock - given the fact you can require anytime, anywhere in your code.

# file1.rb
def do_something
  some_global_mutex.synchronize {
    require 'file2'
  }
end

# file2.rb
require 'file3'

# file3.rb
some_global_mutex.synchronize {
  require 'file1'
}

now, this example is pure bad coding but perhaps in a more complex sequence of requires, using third-party code, the unadvised developer could run into cases like this.

It's a tough problem to solve properly and elegantly and still support the same functionalities.

headius · 2015-10-22T22:03:51Z

@bignacio Assuming those were not mutexes but instead monitors (reentrant locking) there's no deadlock here. The second synchronize in "file3" would have been acquired already.

It also would not deadlock with multiple threads, since only the first thread to acquire the lock would continue to finish its loading.

The only place where a single require lock would cause the system to deadlock is when a require or load never returns. For example, a server that you start by requiring a file.

I don't know how common this is in Ruby-land, but I'd hope it's pretty rare.

headius · 2016-08-19T22:47:14Z

I'm sorry, we need to revert this change and reexamine the problem. This change causes require to allow all contending threads to proceed without waiting for a require to complete (#4091).

And my initial re-examination of this bug points out a big problem to me: if two threads require two files in different orders, and those files depend on each other, there's no way to avoid a deadlock under the current model of Ruby require locking. So this bug may be a correct behavior of Ruby as it stands today.

headius · 2016-08-19T22:55:11Z

Specifically, in this example it is not possible to be thread-safe: #3341 (comment)

a requires b. That requires acquiring the lock for a and then the lock for b.
b requires a. That requires acquiring the lock for b and then the lock for a.

So we have inconsistent lock ordering. This is the simplest example of a deadlock.

The only fix for this is to have a single lock for requiring all files, and only one thread gets to do requires at a time. JRuby used to have such a mode; I will check that it still works, so it can be a workaround for systems with badly-structured require trees.

headius · 2016-08-20T00:50:05Z

Until convinced otherwise, I'm calling this one Won't Fix. It is not possible to avoid deadlocks in Ruby requires with their current specification. Don't require interdependent files in different orders from different threads.

I will look into deadlock detection, to see if we can find a way to error out rather than blocking.

headius · 2016-08-20T02:26:58Z

I have written a naive deadlock detector that will error out if it sees two threads with conflicting require locks. I do not believe it will work for anything more than two threads right now.

https://gist.github.com/headius/6a86bae2556d03b5fa235d6df695278c

The logic here basically spins attempting to acquire the lock. Initially, and for each spin that the lock fails to be acquired, it checks for deadlocks. Deadlock checking involves looking at the thread that owns the require lock and determining if it is waiting on a lock itself. If so, and that lock appears in the list of locks the current thread holds, we have a two-way deadlock.

There are algorithms for doing N-way deadlock detection but they get very complicated. I am not sure if we want that in the require path...but then again, it will only be needed when two threads contend a single require lock. It may be unlikely to affect anyone if the deadlock detection is heavier.

This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See #3341.

This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See jruby#3341.

This detector works only with two threads, using the following heuristic: 1. If the lock is already acquired, obtain its owner thread. 2. Check if the owner thread is waiting on a lock. 3. If the lock the owner is waiting on is held by the current thread, raise an error. 4. If there's no owner and no deadlock, try to lock the thread for 500ms + rand(100)ms. 5. If the file has been locked, return success. 6. Otherwise, start over at 1. We may want a more advanced detector that can handle arbitrarily many threads, but it would be much heavier and require scanning all threads in the system, including those unrelated to the current program and those not actually trying to require files. See #3341.

bignacio changed the title ~~autoload & require dealock~~ autoload & require deadlock Sep 21, 2015

rtyler added core JRuby 1.7.x labels Sep 21, 2015

rtyler assigned mkristian Sep 21, 2015

mkristian added a commit that referenced this issue Oct 5, 2015

improve RequireLock to better match its semantic

fc06782

this eventually fixes #3341 in the sense that both testcases do not produce deadlocks anymore. Sponsored by Lookout Inc.

mkristian mentioned this issue Oct 5, 2015

treat deadlocks as circular require - as MRI does it #3370

Merged

kares added this to the JRuby 9.1.0.0 milestone Feb 18, 2016

headius closed this as completed in 455bcf0 Mar 3, 2016

headius mentioned this issue Aug 19, 2016

concurrent loading does not resolve loaded constant #4091

Closed

headius reopened this Aug 19, 2016

headius closed this as completed Aug 20, 2016

headius modified the milestones: Won't Fix, JRuby 9.1.0.0 Aug 20, 2016

headius mentioned this issue Aug 20, 2016

Add a simple deadlock detector for require locks. #4095

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

autoload & require deadlock #3341

autoload & require deadlock #3341

bignacio commented Sep 21, 2015

mkristian commented Sep 22, 2015

mkristian commented Sep 22, 2015

bignacio commented Sep 23, 2015

mkristian commented Sep 23, 2015

headius commented Oct 5, 2015

mkristian commented Oct 8, 2015

bignacio commented Oct 9, 2015

headius commented Oct 22, 2015

headius commented Aug 19, 2016

headius commented Aug 19, 2016

headius commented Aug 20, 2016

headius commented Aug 20, 2016

autoload & require deadlock #3341

autoload & require deadlock #3341

Comments

bignacio commented Sep 21, 2015

mkristian commented Sep 22, 2015

mkristian commented Sep 22, 2015

bignacio commented Sep 23, 2015

mkristian commented Sep 23, 2015

headius commented Oct 5, 2015

mkristian commented Oct 8, 2015

bignacio commented Oct 9, 2015

headius commented Oct 22, 2015

headius commented Aug 19, 2016

headius commented Aug 19, 2016

headius commented Aug 20, 2016

headius commented Aug 20, 2016