-
-
Notifications
You must be signed in to change notification settings - Fork 15.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
setup.sh: rewrite stripHash #66161
setup.sh: rewrite stripHash #66161
Conversation
Rewrite the `stripHash` helper function with 2 differences: * Paths starting with `--` will no longer produce an error. * Use Bash string manipulation instead of shelling out to `grep` and `cut`. This should be faster.
@@ -802,14 +802,17 @@ dumpVars() { | |||
# Utility function: echo the base name of the given path, with the | |||
# prefix `HASH-' removed, if present. | |||
stripHash() { | |||
local strippedName | |||
local strippedName casematchOpt=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use oldOpts=$(shopt -p nocasematch)
to match how runHook works above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then eval $oldOpts
at the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that, but this way is faster. I realize that the performance difference hardly matters here, but it just bugged me to fork the shell unnecessarily.
If I'm going to accept the cost of a fork, then it would actually be simpler overall to skip this and do
if ( shopt -u nocasematch; [[ "$strippedName" =~ ^[a-z0-9]{32}- ]] ); then
which just unsets it in the subshell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I measured it and shopt -s nocasematch; shopt -u nocasematch
is somewhere around 80x faster (on my machine) than ( shopt -s nocasematch )
. More generally (again on my machine), shopt -s nocasematch; shopt -u nocasematch
takes <10µs whereas spawning a subshell takes at least 650µs.
This seems like a tiny difference, but imagine if I'm running stripHash
on 1000 files. That 650µs turns into 650ms. And we're already spawning a separate process for basename
as it is.
For that matter, I'm tempted to replace the call to basename
with more string manipulation. It seems relatively straightforward except for one thing, which is that the behavior of basename //
is documented as being platform-specific, depending on whether the platform in question treats //
as semantically distinct from /
. I don't know what platforms do this (does Cygwin do this?) but the inability to replicate this edge case makes me leery about replacing the call.
EDIT:
I dug into the basename
source and it is indeed Cygwin (also something called Apollo DomainOS that's so old that basename doesn't even support it, and something else called z/OS). It actually tests for support normally by checking the inodes of /
and //
, but hardcodes those OSes when cross-compiling.
I didn't see this in the manpage but basename
also has support for drive letters, which it enables for Cygwin, Win32, OS/2, and DOS.
This is all to say that I'm not going to try and get rid of the explicit call to basename
.
echo "$strippedName" | cut -c34- | ||
strippedName="$(basename -- "$1")" | ||
shopt -q nocasematch && casematchOpt=1 | ||
shopt -u nocasematch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often would this be set? By default we should have it unset. But, I guess it's always good to cover every possible case in utility functions like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unset by default, but a package that sets it as part of one of their build scripts shouldn't cause the function to change behavior.
We could just elect to skip this entirely. The results of doing so are that if you do shopt -s nocasematch
then stripHash
will start stripping hashes containing uppercase characters too, whereas the default behavior is to only strip hashes containing lowercase characters. Ultimately not a big deal, but it was easy enough to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides a few nits, this looks like a good improvement on stripHash. We don't have tests for this currently, so we need to be a little extra careful with it.
Is there anything I have to do to move this forward? |
Oops I did not mean to delete this. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/prs-ready-for-review-may-2019/3032/69 |
I've read through this and I still don't get what real problem does this solve? |
@veprbl The core fix here is just - strippedName="$(basename "$1")"
+ strippedName="$(basename -- "$1")" The rest of the changes seek to optimize |
I guess that means the performance optimization is by far the most significant part of this PR, and the fact that it fixes an edge case is just more of a freebie. |
Motivation for this change
It really bugged me when I discovered that
stripHash --foo
failed with an error so I decided to fix it. While I was at it, I figured I could get rid of the external command invocation as Bash is perfectly capable of doing the necessary string manipulation itself.Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)I tested the new
stripHash
function in isolation usingnixpkgs.bash
and in all cases I tested it matches behavior with the old one (except for the case where the argument starts with--
, which simply failed in the old one but works in the new).I have not actually run any builds with this as it requires rebuilding the entire world (due to changing stdenv) and I don't have the patience to do that locally. I'm not sure if there's any packages that can serve as a simple testbed for stdenv changes.
I realize the changes outside of
basename --
may be a little arcane, but it just bugged me to run two external commands when I could get rid of all additional processes outside ofbasename
altogether. On my machine direct manipulation is ~70x faster than invoking an external command.