Skip to content

Fix deadlock in nix-store when max-connections=1 #4262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 16, 2020

Conversation

jbaum98
Copy link
Contributor

@jbaum98 jbaum98 commented Nov 16, 2020

This fixes a bug I encountered where nix-store -qR will deadlock when
the --include-outputs flag is passed and max-connections=1.

The deadlock occurs because RemoteStore::queryDerivationOutputs takes
the only connection from the connection pool and uses it to check the
daemon version. If the version is new enough, it calls
Store::queryDerivationOutputs, which eventually calls
RemoteStore::queryPartialDerivationOutputMap, where we take another
connection from the connection pool to check the version again. Because
we still haven't released the connection from the caller, this waits for
a connection to be available, causing a deadlock.

This diff solves the issue by using getProtocol to check the protocol
version in the caller RemoteStore::queryDerivationOutputs, which
immediately frees the connection back to the pool before returning the
protocol version. That way we've already freed the connection by the
time we call RemoteStore::queryPartialDerivationOutputMap.

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
This fixes a bug I encountered where `nix-store -qR` will deadlock when
the `--include-outputs` flag is passed and `max-connections=1`.

The deadlock occurs because `RemoteStore::queryDerivationOutputs` takes
the only connection from the connection pool and uses it to check the
daemon version. If the version is new enough, it calls
`Store::queryDerivationOutputs`, which eventually calls
`RemoteStore::queryPartialDerivationOutputMap`, where we take another
connection from the connection pool to check the version again. Because
we still haven't released the connection from the caller, this waits for
a connection to be available, causing a deadlock.

This diff solves the issue by using `getProtocol` to check the protocol
version in the caller `RemoteStore::queryDerivationOutputs`, which
immediately frees the connection back to the pool before returning the
protocol version. That way we've already freed the connection by the
time we call `RemoteStore::queryPartialDerivationOutputMap`.
@edolstra edolstra merged commit 0d6419a into NixOS:master Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants