New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python2: backport fix for pyc race condition, part 2 #64900
Conversation
I should also mention #64067 where the first part was introduced, but contrary to it this one isn't backported by any upstream distributions, reason unknown. |
I can't give a comprehensive review due to a lack of time and I'm far from a C-developer myself anyway. But out of interest, why didn't you use my patch at https://github.com/NixOS/nixpkgs/pull/63616/files#diff-7af205fc79201d40af30ad4983bcc777? |
Huh, I guess I... missed it? I'll compare them, thanks! |
Could we apply this patch only when building tensorflow? |
@FRidh We could do that but it's a bug nevertheless, I feel it could show up in other massively parallel builds too. If there are any we'll have until the EOL that is ;) |
I minimized the patch, also bringing it as close to the spirit of original patch as possible. Hopefully this helps with the review. |
@FRidh Thinking more about it, could be problematic because Tensorflow will then link to its custom Python 2 library and then we'll have a conflicting Python library loaded into our usual interpreter. EDIT: Fixable via patchelf though. |
Reading more into the original patch: unless I'm mistaken there is actually a race condition in it too! Namely, in C (but not in Python) file is opened exclusively to prevent concurrent writes but then Did I overlook something there? The fix is quite easy (move |
Could very well be a bug, probably wouldn't happen much in practice and the exclusive open was added as an afterthought. |
Hehe, actually the bug that I described is present in Python right now but in a very light form - because |
Turns out fixing this only in importlib is not sufficient and we need to backport CPython part of the fix too. This patch is based on https://hg.python.org/cpython/rev/c16063765d3a but because the code around is different there are some changes (C-strings instead of Python objects etc.) With this patch Tensorflow builds successfully on many-core machine.
Patch updated, tested this one through the night running Tensorflow compilations. |
Yes, did the same with |
Checked out |
Motivation for this change
Turns out fixing this only in importlib is not sufficient and we
need to backport CPython part of the fix too.
This patch is based on https://hg.python.org/cpython/rev/c16063765d3a
but because the code around is different there are some changes (C-strings
instead of Python objects etc.)
With this patch Tensorflow builds successfully on many-core machine.
Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)I'd appreciate review of the patch given that I basically rewrote it.