Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry in all error cases but a few #1580

Merged
merged 1 commit into from Oct 12, 2017

Conversation

copumpkin
Copy link
Member

@copumpkin copumpkin commented Sep 25, 2017

I could make this optional but it doesn't seem like too weird of a thing to enable in general.

As we discussed on IRC

This is what I'm currently running into in #1573, even though it no longer causes deadlocks. I'm honestly not sure if it was what was causing deadlocks before, but since updating Nix I'm now getting timeout errors with the same frequency as I was getting deadlocks before.

I'm unsure what to do about timeouts. The behavior in this PR will retry on curl connect timeouts (which happens to be the behavior I want, since I'm getting occasional timeouts from dud S3 backend servers), but that doesn't seem like something everyone would want.

cc @edolstra

@edolstra
Copy link
Member

Maybe it would be better to set the default value for connect-timeout to something much lower (say 10 seconds) and then retry timeouts unconditionally.

@copumpkin
Copy link
Member Author

@edolstra I considered that, but I don't think the retry count is fixed or known ahead of time, right? Different download invocations can pass in different values for it I think.

It was getting too much like whac-a-mole listing all the retriable error
conditions, so we now retry by default and list the cases where retrying
is almost certainly hopeless.
@copumpkin copumpkin changed the title Add timeout as a retry-worthy condition Retry in all error cases but a few Oct 3, 2017
@copumpkin
Copy link
Member Author

@edolstra I updated it as discussed on IRC. Let me know what you think about the timeout concern.

@copumpkin
Copy link
Member Author

@edolstra any thoughts on this? I'm happy to make further changes if you see fit, but ultimately just want to stop trying to catch every little error that flaky servers throw my way.

@edolstra edolstra merged commit 177aee0 into NixOS:master Oct 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants