Environment
Tested on OS X 10.10.5. The same problem also occurred on CentOS 7, so this looks like an OS-independent bug.
See [1] [2] for user reports of the same situation; the reports describe each user's specific environment.
Some environments appear to have configuration management rewriting the /etc/resolv file regularly; I am checking this with the users.
Expected Behavior
The host resolver should run without hitting concurrency issues.
Actual Behavior
At Logstash, a few users [1] [2] reported that under heavy load they got this exception:
The actual Logstash (LS) environment runs several threads with instances of this code, so the resolver is shared.
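For context, the shared-resolver setup described above can be sketched roughly as follows. This is only an illustration, not the Logstash code: the temporary hosts file, the `example.test` name, and the thread count are assumptions made so the sketch does not depend on the system hosts file.

```ruby
require "resolv"
require "tempfile"

# Hypothetical hosts file, so the sketch does not read the system's own file.
hosts_file = Tempfile.new("hosts")
hosts_file.write("127.0.0.1 example.test\n")
hosts_file.flush

# One resolver instance shared by every worker thread. Its first lookup
# triggers lazy_initialize, which parses the file under a mutex.
shared_resolver = Resolv::Hosts.new(hosts_file.path)

# Several threads resolve the same name concurrently (thread count is arbitrary).
results = Array.new(8) do
  Thread.new { shared_resolver.getaddress("example.test") }
end.map(&:value)

hosts_file.close!
puts results.uniq.inspect
```

Under normal conditions every thread gets the same address; the reported failure only shows up when a waiting thread is interrupted.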
It has been tricky to reproduce this with Logstash itself. However, I have been able to isolate the problem independently of Logstash, in JRuby's Resolv::Hosts class. A script to reproduce it is at https://gist.github.com/purbon/106e80a5d85b3a6fbf1e5f10b5d0d643
The problem is that the mutex in the Resolv::Hosts lazy_initialize method is locked, and threads get terminated while waiting for it because of the timeout usage. After analysing the users' environments, it is possible that another process regularly modifying the hosts file contributes to this, but I am not sure.
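The general pattern, a thread being interrupted by a timeout while it waits for a held mutex, can be sketched as below. This is a simplified stand-in and not the gist linked above: `lock` is a hypothetical substitute for the mutex inside lazy_initialize, and the 0.2 second timeout is arbitrary. The exception class seen by the waiting thread may differ between Ruby implementations, so the sketch only records it.

```ruby
require "timeout"

# Hypothetical stand-in for the mutex guarding lazy_initialize.
lock = Mutex.new
holder_ready = Queue.new
release = Queue.new

# One thread takes the lock and keeps it, simulating a slow initialization.
holder = Thread.new do
  lock.synchronize do
    holder_ready << true
    release.pop # hold the lock until the main thread says otherwise
  end
end
holder_ready.pop # wait until the lock is definitely held

# A second thread waits for the same lock under a timeout and is
# interrupted before it can acquire it.
waiter_result = Thread.new do
  begin
    Timeout.timeout(0.2) { lock.synchronize { :acquired } }
  rescue StandardError => e
    e.class # record which exception interrupted the wait
  end
end.value

# Let the holder finish, then check the mutex is still usable afterwards.
release << true
holder.join
reusable = lock.synchronize { true }

puts "waiter saw: #{waiter_result}, mutex reusable: #{reusable}"
```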
I hope the issue is clear and that you have all the information you need. Don't hesitate to ask for more context, debug information, or anything else.
Thanks a lot,
purbon
Unfortunately the given script doesn't really tell me anything :-(
You have it set up to do a native interrupt while it's waiting on the mutex. That does indeed produce the same (somewhat cryptic) error, but it doesn't tell me why the mutex is being interrupted in your real-world case. We need more information on how threads are getting interrupted at a native level and why it is happening during this particular mutex.