maintainers/scripts/update.nix: Add support for auto-committing changes #59372
Could you then go through all the |
I think I removed most of them when we parallelized the updater.
In GNOME, we use this for the menial parts of the update process, expecting that the package might not build. The failures (most often dependency changes) need to be fixed manually, but we still want to keep the commit.
I have managed to get it to pretty much the state I wanted: it will create a new git worktree for each thread in the pool and run the update script there. Then it will commit the change in the worktree and cherry-pick it into the main repo, releasing the worktree for the next change. I have two gripes with this:
One thing to consider: do we want updaters to have to add |
maintainers/scripts/update.py
thread_name = package['thread']
worktree, lock = temp_dirs[thread_name]
changes = json.loads(p.stdout)
for change in changes:
too much nesting. Consider a small separate function for the following lines
I would say a much more pressing issue is how tightly coupled everything is with console input/output, and that interruption does not work very well.
Something like this looks better, but interruption is still not handled gracefully. Perhaps using queue.Queue would make it nicer.
--- a/maintainers/scripts/update.py
+++ b/maintainers/scripts/update.py
@@ -9,6 +9,7 @@
 import threading
 updates = {}
+temp_dirs = {}
 thread_name_prefix='UpdateScriptThread'
@@ -28,6 +29,41 @@
     return subprocess.run(package['updateScript'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True, cwd=worktree)
+def update_packages(packages, max_workers, commit):
+    with contextlib.ExitStack() as stack, concurrent.futures.ThreadPoolExecutor(max_workers=max_workers, thread_name_prefix=thread_name_prefix) as executor:
+
+        if commit:
+            for i in range(max_workers):
+                temp_dirs[f'{thread_name_prefix}_{str(i)}'] = stack.enter_context(tempfile.TemporaryDirectory()), threading.Lock()
+
+            for wt, _lock in temp_dirs.values():
+                subprocess.run(['git', 'worktree', 'add', wt], check=True)
+
+        for package in packages:
+            updates[executor.submit(run_update_script, package, commit)] = package
+
+        for future in concurrent.futures.as_completed(updates):
+            package = updates[future]
+
+            try:
+                p = future.result()
+                if commit and 'commit' in package['supportedFeatures']:
+                    thread_name = package['thread']
+                    worktree, lock = temp_dirs[thread_name]
+                    changes = json.loads(p.stdout)
+                    for change in changes:
+                        subprocess.run(['git', 'add'] + change['files'], check=True, cwd=worktree)
+                        commit_message = '{attrName}: {oldVersion} → {newVersion}'.format(**change)
+                        subprocess.run(['git', 'commit', '-m', commit_message], check=True, cwd=worktree)
+                        subprocess.run(['git', 'cherry-pick', os.path.basename(worktree)], check=True)
+                yield package, True, None
+            except subprocess.CalledProcessError as e:
+                yield package, False, e.stdout
+            finally:
+                if commit and 'commit' in package['supportedFeatures']:
+                    lock.release()
+
+
 def main(max_workers, keep_going, commit, packages):
     with open(sys.argv[1]) as f:
         packages = json.load(f)
@@ -43,49 +79,22 @@
         eprint()
         eprint('Running update for:')
-        with contextlib.ExitStack() as stack, concurrent.futures.ThreadPoolExecutor(max_workers=max_workers, thread_name_prefix=thread_name_prefix) as executor:
-            global temp_dirs
-
-            if commit:
-                temp_dirs = {f'{thread_name_prefix}_{str(i)}': (stack.enter_context(tempfile.TemporaryDirectory()), threading.Lock()) for i in range(max_workers)}
-
-                for wt, _lock in temp_dirs.values():
-                    subprocess.run(['git', 'worktree', 'add', wt], check=True)
-
-            for package in packages:
-                updates[executor.submit(run_update_script, package, commit)] = package
-
-            for future in concurrent.futures.as_completed(updates):
-                package = updates[future]
-
-                try:
-                    p = future.result()
-                    if commit and 'commit' in package['supportedFeatures']:
-                        thread_name = package['thread']
-                        worktree, lock = temp_dirs[thread_name]
-                        changes = json.loads(p.stdout)
-                        for change in changes:
-                            subprocess.run(['git', 'add'] + change['files'], check=True, cwd=worktree)
-                            commit_message = '{attrName}: {oldVersion} → {newVersion}'.format(**change)
-                            subprocess.run(['git', 'commit', '-m', commit_message], check=True, cwd=worktree)
-                            subprocess.run(['git', 'cherry-pick', os.path.basename(worktree)], check=True)
-                    eprint(f" - {package['name']}: DONE.")
-                except subprocess.CalledProcessError as e:
-                    eprint(f" - {package['name']}: ERROR")
-                    eprint()
-                    eprint(f"--- SHOWING ERROR LOG FOR {package['name']} ----------------------")
-                    eprint()
-                    eprint(e.stdout.decode('utf-8'))
-                    with open(f"{package['pname']}.log", 'wb') as f:
-                        f.write(e.stdout)
-                    eprint()
-                    eprint(f"--- SHOWING ERROR LOG FOR {package['name']} ----------------------")
-
-                    if not keep_going:
-                        sys.exit(1)
-                finally:
-                    if commit and 'commit' in package['supportedFeatures']:
-                        lock.release()
+        for package, status, output in update_packages(packages, max_workers, commit):
+            if status:
+                eprint(f" - {package['name']}: DONE.")
+            else:
+                eprint(f" - {package['name']}: ERROR")
+                eprint()
+                eprint(f"--- SHOWING ERROR LOG FOR {package['name']} ----------------------")
+                eprint()
+                eprint(output.decode('utf-8'))
+                with open(f"{package['pname']}.log", 'wb') as f:
+                    f.write(output)
+                eprint()
+                eprint(f"--- SHOWING ERROR LOG FOR {package['name']} ----------------------")
+
+                if not keep_going:
+                    sys.exit(1)
         eprint()
         eprint('Packages updated!')
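For illustration, the queue.Queue idea from the comment above could look roughly like the sketch below. It is not code from this PR: `WorktreePool`, `borrow` and `run_all` are hypothetical names, and the `update` callable stands in for whatever runs the update script. The point is that a queue of free worktrees replaces the dictionary keyed by thread name and the explicit lock handling.

```python
import contextlib
import queue
from concurrent.futures import ThreadPoolExecutor


class WorktreePool:
    """Hypothetical pool that hands out worktree paths via a thread-safe queue."""

    def __init__(self, paths):
        self._free = queue.Queue()
        for path in paths:
            self._free.put(path)

    @contextlib.contextmanager
    def borrow(self):
        path = self._free.get()  # blocks until some worktree is free
        try:
            yield path
        finally:
            self._free.put(path)  # handed back even if the update fails


def run_all(packages, pool, update, max_workers=4):
    """Call update(package, worktree) for every package, in input order."""
    def job(package):
        with pool.borrow() as worktree:
            return update(package, worktree)

    # Leaving the with-block waits for all submitted jobs to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(job, packages))
```

With this shape a worker never needs to know its thread name, and a worktree cannot stay locked after an exception because the context manager always returns it to the queue.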
I moved it to a separate function. I am also thinking about switching from ThreadPoolExecutor to asyncio and https://pypi.org/project/asyncio-pool/
Tested this out and it seems to work; it didn't seem to take a huge amount of time compared to the previous version. It did get stuck on one update for a while, though I think that could happen before too. One issue, I had to run |
Do you mean that only a single update script was running at the time? I guess we could have the threads waiting for their worktree lock to be released, but that should not take a long time. The context manager cleans up the worktrees once it finishes and should do the same even when an exception occurs. But maybe exceptions triggered by signals are somewhat different, especially with regard to threads. Maybe we need to handle the signals separately: https://stackoverflow.com/questions/18499497/how-to-process-sigterm-signal-gracefully Or we could even use asyncio instead of threads. The main benefit is that, unlike threads, the subprocesses can be stopped easily. Perhaps we could even have a pool of worktrees, simplifying the code even more. https://stackoverflow.com/questions/41677434/how-to-make-a-asyncio-pool-cancelable This series offers a nice description of how Python does these things: https://pymotw.com/3/concurrency.html
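A rough sketch of the asyncio variant mentioned above, under stated assumptions: `run_update` and `run_all` are hypothetical names, the worktree pool is a plain asyncio.Queue of directory paths, and each update is an argv list. It only illustrates the claim that a subprocess can be stopped when its task is cancelled; it is not the implementation from this PR.

```python
import asyncio
import sys


async def run_update(argv, worktrees):
    """Run one command in a worktree borrowed from the queue."""
    worktree = await worktrees.get()  # wait for a free worktree
    try:
        proc = await asyncio.create_subprocess_exec(
            *argv, cwd=worktree,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            stdout, _stderr = await proc.communicate()
        except asyncio.CancelledError:
            # Unlike a thread, the child process can simply be stopped.
            if proc.returncode is None:
                proc.terminate()
            await proc.wait()
            raise
        return proc.returncode, stdout
    finally:
        worktrees.put_nowait(worktree)  # hand the worktree back in every case


async def run_all(commands, worktree_paths):
    """Run all commands, at most len(worktree_paths) at a time."""
    worktrees = asyncio.Queue()
    for path in worktree_paths:
        worktrees.put_nowait(path)
    return await asyncio.gather(*(run_update(c, worktrees) for c in commands))


if __name__ == '__main__':
    # Stand-in commands; a real run would pass the update scripts instead.
    commands = [[sys.executable, '-c', f'print({i})'] for i in range(3)]
    print(asyncio.run(run_all(commands, ['.', '.'])))
```

The number of worktrees in the queue doubles as the concurrency limit, so no separate executor size is needed.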
Some more questions: We probably need to use the worktrees for non-commit use case as well, right? Otherwise, the How would we merge the changes then, though? Or should we still update the files in-tree and try to solve this by locking? How would we know what files to lock? Another attribute listing the locked files in the And since I am already considering always using the worktree, we could get the commit feature for free (the merge procedure would only differ by the presence of Finally, should we reset the worktree after each merge? It would make conflicts always manifest, not only when they occur in different worker threads, which might be preferable. |
Also should the |
I think all the other threads were done; it was just one update that was stuck for a while, probably unrelated.
Yeah, the directories in tmp were gone, but git still had references to them, it looked like the update script tried to reuse them and failed since they were no longer there. I might've pressed ctrl-c twice, killing the cleanup before it was entirely done though.
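One possible way to avoid the stale references described here is to prune git's worktree records whenever a temporary worktree goes away. The sketch below assumes a hypothetical `temporary_worktree` context manager with an injectable `run` callable; `git worktree prune` is the existing git command that drops records of worktrees whose directories no longer exist. A second Ctrl-C during cleanup could still interrupt it, in which case running `git worktree prune` by hand has the same effect.

```python
import contextlib
import subprocess
import tempfile


@contextlib.contextmanager
def temporary_worktree(run=subprocess.run):
    """Hypothetical helper: worktree in a temp dir, deregistered on exit."""
    try:
        with tempfile.TemporaryDirectory() as path:
            run(['git', 'worktree', 'add', path], check=True)
            yield path
        # The directory is deleted when the inner with-block exits.
    finally:
        # Drop git's record of the now-missing worktree, even on errors
        # or KeyboardInterrupt.
        run(['git', 'worktree', 'prune'], check=False)
```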
I wouldn't mind, it's just as easy (if not easier) to deal with the commits as un-committed changes in the worktree.
It doesn't have to be that complicated with the major and minor versions being split. We have library functions that handle it fine. passthru.updateScript can be enabled once NixOS#59372 is merged.
Yeah, hopefully the Pool implementation will default to creating instances on demand.
Hmm, it looks like it creates four instances on my machine even if there's only one package:
|
Yeah, I have not pushed the new Pool implementation yet. Hence the WIP status.
Hello, I'm a bot and I thank you in the name of the community for your contributions. Nixpkgs is a busy repository, and unfortunately sometimes PRs get left behind for too long. Nevertheless, we'd like to help committers reach the PRs that are still important. This PR has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human. If this is still important to you and you'd like to remove the stale label, we ask that you leave a comment. Your comment can be as simple as "still important to me". But there's a bit more you can do: If you received an approval by an unprivileged maintainer and you are just waiting for a merge, you can @ mention someone with merge permissions and ask them to help. You might be able to find someone relevant by using Git blame on the relevant files, or via GitHub's web interface. You can see if someone's a member of the nixpkgs-committers team, by hovering with the mouse over their username on the web interface, or by searching them directly on the list. If your PR wasn't reviewed at all, it might help to find someone who's perhaps a user of the package or module you are changing, or alternatively, ask once more for a review by the maintainer of the package/module this is about. If you don't know any, you can use Git blame on the relevant files, or GitHub's web interface to find someone who touched the relevant files in the past. If your PR has had reviews and nevertheless got stale, make sure you've responded to all of the reviewer's requests / questions. Usually when PR authors show responsibility and dedication, reviewers (privileged or not) show dedication as well. If you've pushed a change, it's possible the reviewer wasn't notified about your push via email, so you can always officially request them for a review, or just @ mention them and say you've addressed their comments. 
Lastly, you can always ask for help at our Discourse Forum, or more specifically, at this thread or at #nixos' IRC channel.
Not sure why I chose ProcessPoolExecutor in the first place.
Printing the changed file and new version can be used to commit the changes to git.
Update scripts can now declare features using

    passthru.updateScript = {
      command = [ ../../update.sh pname ];
      supportedFeatures = [ "commit" ];
    };

A `commit` feature means that when the update script finishes successfully, it will print a JSON list like the following:

    [
      {
        "attrName": "volume_key",
        "oldVersion": "0.3.11",
        "newVersion": "0.3.12",
        "files": [
          "/path/to/nixpkgs/pkgs/development/libraries/volume-key/default.nix"
        ]
      }
    ]

and data from that will be used when update.nix is run with --argstr commit true to create commits.

We will create a new git worktree for each thread in the pool and run the update script there. Then we will commit the change and cherry pick it in the main repo, releasing the worktree for a next change.
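As a minimal sketch of how the JSON output described in this commit message might be consumed: the helper below parses the list, builds the commit message in the `attrName: oldVersion → newVersion` form used in the diff earlier in this thread, commits inside the worktree, and cherry-picks into the main checkout. `commit_changes` and its `run` parameter are hypothetical names introduced for illustration. The sketch also assumes the worktree was created with `git worktree add <path>`, which puts it on a new branch named after the directory's basename.

```python
import json
import os
import subprocess
from typing import Callable, List


def commit_changes(stdout: bytes, worktree: str,
                   run: Callable = subprocess.run) -> List[str]:
    """Commit every change reported by an update script and cherry-pick it.

    Hypothetical helper; returns the commit messages it created.
    """
    branch = os.path.basename(worktree)
    messages = []
    for change in json.loads(stdout):
        message = '{attrName}: {oldVersion} → {newVersion}'.format(**change)
        # Stage and commit inside the worktree ...
        run(['git', 'add'] + change['files'], check=True, cwd=worktree)
        run(['git', 'commit', '-m', message], check=True, cwd=worktree)
        # ... then pick the branch tip into the main checkout
        # (the current working directory).
        run(['git', 'cherry-pick', branch], check=True)
        messages.append(message)
    return messages
```

Passing a recording function as `run` makes it possible to inspect the git commands without touching a repository.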
Get rid of some globals, split main into smaller functions, rename some variables, add typehints.
Replaced by #98304
NixOS#59372 was replaced with NixOS#98304, which was merged as 74c5472, so I'm following the instructions in the comment and enabling the updateScript. Seems to work.