Resolv fails under load with SocketError: bind: name or service not known #3659
for completeness sake:
After a bit more investigation, I see the problem: in resolv, when a bind is attempted on a random port, the rescue block expects an exception from the Errno family:
def self.bind_random_port(udpsock, bind_host="0.0.0.0") # :nodoc:
  begin
    port = rangerand(1024..65535)
    udpsock.bind(bind_host, port)
  rescue Errno::EADDRINUSE, # POSIX
         Errno::EACCES, # SunOS: See PRIV_SYS_NFS in privileges(5)
         Errno::EPERM # FreeBSD: security.mac.portacl.port_high is configurable. See mac_portacl(4).
    retry
  end
end
So I created a small script to test which exception is thrown if the port is blocked:
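The script itself is not preserved in this thread. A minimal sketch of such a check, assuming a loopback address and a helper name (`exception_for_port_in_use`) invented here for illustration, might look like this:

```ruby
require "socket"

# Hypothetical helper, not the script from the original comment.
# Binds one UDP socket to a free port, then tries to bind a second
# socket to the same port and returns the exception class raised.
def exception_for_port_in_use(host = "127.0.0.1")
  first = UDPSocket.new
  first.bind(host, 0)                   # port 0 lets the OS pick a free port
  port = first.local_address.ip_port    # the port that was actually assigned
  second = UDPSocket.new
  begin
    second.bind(host, port)             # expected to fail: port already taken
    nil                                 # nil means the second bind succeeded
  rescue SystemCallError, SocketError => e
    e.class                             # Errno::* classes descend from SystemCallError
  ensure
    second.close
    first.close
  end
end

puts exception_for_port_in_use.inspect
```

Running it under both MRI and JRuby should show whether the two runtimes report the clash with the same exception class.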
So Ruby MRI 2.2.1 behaves as expected:
But JRuby throws a SocketError instead:
@jsvd I do not have time to look at this today but I will add some extra info. I can see that this behavior of expecting EADDRINUSE applies all the way back to 1.8.7, so this has been broken for a long time. FWIW, we get back pretty generic error messages from Java's net layer. It looks like we should be examining the string (sad but likely true) to figure out if we should be raising EADDRINUSE.
Thanks for the feedback @enebo
Are you able to reproduce this on 9k? I'm having trouble getting your script to fail the same way. I get ResolvError generally if there's a DNS server at the target address. I'm on OS X.
Even though I can't reproduce it, if you can give me the SocketError trace while passing -Xbacktrace.style=full I should be able to track down where it is coming from and fix it.
The problem comes from what @enebo said, a binding on a used port raises a SocketError instead of EADDRINUSE. Now, the reason why resolv fails in the first place is that a Since it's raising "the wrong exception", it bubbles up instead of being retried (normal behaviour).
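To illustrate the retry behaviour described above, here is a rough sketch of a bind helper that also retries on a `SocketError`. This is not the fix that went into JRuby. The method name `bind_random_port_with_retry`, the `max_attempts` limit, and the check on the message prefix `"bind"` are all assumptions made for illustration, the last one based only on the error text in this issue's title.

```ruby
require "socket"

# Errors that resolv's own rescue clause already treats as "try another port".
RETRYABLE_BIND_ERRORS = [Errno::EADDRINUSE, Errno::EACCES, Errno::EPERM].freeze

# Hypothetical helper: bind to a random port, retrying on the usual Errno
# errors and also on a SocketError whose message starts with "bind".
# Returns the port that was bound.
def bind_random_port_with_retry(udpsock, bind_host = "0.0.0.0", max_attempts: 10)
  attempts = 0
  begin
    attempts += 1
    port = rand(1024..65535)
    udpsock.bind(bind_host, port)
    port
  rescue *RETRYABLE_BIND_ERRORS
    retry if attempts < max_attempts
    raise
  rescue SocketError => e
    # Assumption: the runtime reports a port clash as a SocketError
    # with a message like "bind: name or service not known".
    retry if e.message.start_with?("bind") && attempts < max_attempts
    raise
  end
end
```

Capping the number of attempts is a choice made for this sketch so that a persistent failure surfaces instead of looping forever; the resolv code quoted earlier retries without a limit.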
@jsvd Confirmed! I believe this will be fixed by the ruby-2.3+socket branch (still in progress), which aligns our socket implementations much more closely with CRuby. That should get merged in any day now and be in 9.1.
The socket branch will not make it into 9.1. Bumping.
It's likely that we're hitting this issue too on JRuby 1.7.20.1, which is the JRuby version embedded in puppet-server 1.1.3 :( Is there a way to work around it? Quite a few things rely on Resolv over here :/
@nbarrientos It's unlikely we'll be putting a lot of effort into the Socket subsystem on JRuby 1.7, so your best bet would be to try JRuby 9.1 when it comes out. We may not have it fixed, but we'll be closer, and we'll work to get it fixed for a 9.1 update.
I have a fix for this we could include in 9.1: https://gist.github.com/5b101a92d8c2a3d4cb414140f882ebc5 It's up to @enebo if it's too risky the day of the release :-)
I've incorporated a localized fix for this issue into 9.1 and we can call this fixed. There's another bug outstanding for the
@nbarrientos I have made the same fix for 1.7, so it will be in 1.7.26 whenever we release that. In the short term your only workaround would be to modify
Thanks.
@headius - any chance we could get a 1.7.26 to get this fix incorporated? It would be Very Nice indeed. 😇
@headius - just to make things extremely clear, was this included in 1.7.26? I think I failed to find it when I skimmed through the release notes.
I set up dnsmasq on my Mac and ran the following script: