Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

travis-ci: don’t merge prs #29506

Closed
wants to merge 7 commits into from

Conversation

matthewbauer
Copy link
Member

@matthewbauer matthewbauer commented Sep 17, 2017

Travis has a frustrating habit of wanting to merge PRs into master. This is
fine when Nixpkgs is working 100% but that means that when master is failing
the PR is also failing (even though the PR author has literally no control over
master). This is a solution that I think will work out better for everybody.

First, nox-review pr has a new option called --no-merge that will mean that
nox-review won’t try to merge PRs into master. Note that Nox was updated in
94400c0. To prevent Travis from breaking, please verify that this is in
nixpkgs-unstable. You can verify with,

rev=$(git ls-remote https://github.com/nixos/nixpkgs-channels refs/heads/nixpkgs-unstable | awk '{print $1}')
if git merge-base --is-ancestor 94400c0 $rev; then
  echo ok to merge
else
  echo do not merge
fi

Second, the Travis CI script will now check out the HEAD commit of the PR (which
the author can actually change) instead of the MERGE commit.

Together, these changes should mean that PR authors are only responsible for
their own HEADs. This may solve some potential problems but also means PR
authors should start carrying about what is actually in HEAD. Basically, you
should branching off of a working version of Nixpkgs. This should probably be a
commit of something that has gone through Hydra testing like nixpkgs-17.09-darwin or
nixos-17.09. At the very least you should branch off of nixpkgs-unstable unless
you absolutely need something that has been touched since then! (in that case
just branch off of the latest working commit). Basic idea, do NOT blindly branch
off of nixpkgs/master!

This is a preliminary note I’ve made that should probably be included somewhere.
I’ll probably make a mailing list request. Note to PR authors:

For Travis CI to succeed, you need to pay attention to what Nixpkgs commit you
branched off of. Because Nixpkgs is huge, Travis CI needs the binary cache to be
available, so you need to make sure your base commit has entered the binary
cache. It is recommended that you use nixpkgs-unstable or nixos-unstable as a
base.

The following commands will set this up for you,

git remote add https://github.com/nixos/nixpkgs-channels nixpkgs-channels
git fetch nixpkgs-channels nixpkgs-unstable

and to create a new branch,

git checkout -b YOUR_BRANCH_NAME nixpkgs-channels/nixpkgs-unstable

or if you have already create a branch,

git rebase nixpkgs-channels/nixpkgs-unstable

Make sure you still merge into nixpkgs master! If there are merge conflicts with
master, you will still need to resolve them.

Testing for Travis is always difficult so I really can't know how well this works until this PR is opened.

@mention-bot
Copy link

@matthewbauer, thanks for your PR! By analyzing the history of the files in this pull request, we identified @dezgeg, @deepfire and @mpscholten to be potential reviewers.

@matthewbauer matthewbauer force-pushed the travis-no-merge branch 2 times, most recently from a82766a to fc98778 Compare September 17, 2017 20:11
@ThomasMader
Copy link
Contributor

It might make sense to add a delay to the start of the travis-ci PR run because it seems many people including me commit something and a few minutes later realize that there still needs to be done something and force push again. Even this PR created three build runs. (https://travis-ci.org/NixOS/nixpkgs/builds/276596580, https://travis-ci.org/NixOS/nixpkgs/builds/276601311 and https://travis-ci.org/NixOS/nixpkgs/builds/276602831)
It seems that the CI system gets stressed out too much because of this for no good reason.
If one would add a delay of say 30 to 60 minutes, many of the builds would vanish and the overall time should go down drastically.
Even better would be if an old build run could be canceled before a new one is created.

@matthewbauer
Copy link
Member Author

matthewbauer commented Sep 17, 2017

It might make sense to add a delay to the start of the travis-ci PR run because it seems many people including me commit something and a few minutes later realize that there still needs to be done something and force push again

We're pretty much confined to how Travis works. As soon as a commit is pushed to a either a branch or an open PR, it starts a 50 minute timer that we get to build something. Travis basically sets these rules and we have to follow them. One thing you can do, though, is set up Travis for your Nixpkgs fork and get better response times because Travis sets quotas per-repo.

Update:

I don't know who has admin permissions on Travis but turning this on might be a little helpful:

Auto Cancellation allows you to only run builds for the latest commits in the queue. This setting can be applied to builds for Branch builds and Pull Request builds separately. Builds will only be canceled if they are waiting to run, allowing for any running jobs to finish.

Update 2:

You can have a look at Travis CI status:

https://www.traviscistatus.com/

It looks like they have a system-wide backlog recently. Ideally, we could switch back to 'container-based' builds because those have almost 0 backlog, but see discussion on #24200 on why that has its own downsides.

Travis has a frustrating habit of wanting to merge PRs into master. This is
fine when Nixpkgs is working 100% but that means that when master is failing
the PR is also failing (even though the PR author has literally no control over
master). This is a solution that I think will work out better for everybody.

First, ‘nox-review pr’ has a new option called --no-merge that will mean that
nox-review won’t try to merge PRs into master. Note that Nox was updated in
94400c0. To prevent Travis from breaking, please verify that this is in
nixpkgs-unstable. You can verify with,

$ if git merge-base --is-ancestor 94400c0 $(git ls-remote \
    https://github.com/nixos/nixpkgs-channels refs/heads/nixpkgs-unstable \
    | awk '{print $1}'); then echo ok to merge; else echo do not merge; fi

Second, the Travis CI script will now check out the HEAD commit of the PR (which
the author can actually change) instead of the MERGE commit.

Together, these changes should mean that PR authors are only responsible for
their own HEADs. This may solve some potential problems but also means PR
authors should start carrying about what is actually in HEAD. Basically, you
should branching off of a working version of Nixpkgs. This should probably be a
commit of something that has gone through Hydra testing like nixpkgs-17.09-darwin or
nixos-17.09. At the very least you should branch off of nixpkgs-unstable unless
you absolutely need something that has been touched since then! (in that case
just branch off of the latest working commit). Basic idea, do NOT blindly branch
off of nixpkgs/master!

This is a preliminary note I’ve made that should probably be included somewhere.
I’ll probably make a mailing list request. Note to PR authors:

‘For Travis CI to succeed, you need to pay attention to what Nixpkgs commit you
branched off of. Because Nixpkgs is huge, Travis CI needs the binary cache to be
available, so you need to make sure your base commit has entered the binary
cache. It is recommended that you use nixpkgs-unstable or nixos-unstable as a
base. The following commands will set this up for you,

$ git remote add https://github.com/nixos/nixpkgs-channels nixpkgs-channels
$ git fetch nixpkgs-channels nixpkgs-unstable

and to create a new branch,
$ git checkout -b YOUR_BRANCH_NAME nixpkgs-channels/nixpkgs-unstable

or if you have already create a branch,
$ git rebase nixpkgs-channels/nixpkgs-unstable

Make sure you still merge into nixpkgs master! If there are merge conflicts with
master, you probably will not be able to use Travis CI until one of the channels
has included the most recent changes. Unfortunately, this means that big
changes to Nixpkgs will almost always fail Travis CI.’
This will now rebuild branches one commit at a time using nox-review.
@fpletz
Copy link
Member

fpletz commented Sep 18, 2017

I've enabled Auto Cancellation for Pull Requests on Travis since I seem to have the permission somehow. 🐼

As far as this PR is concerned I'm undecided. It would certainly be better to test with the merged changes because this will ultimately happen. The committer could have have based his PR on an old commit from master.

@matthewbauer
Copy link
Member Author

matthewbauer commented Sep 18, 2017

It would certainly be better to test with the merged changes because this will ultimately happen. The committer could have have based his PR on an old commit from master.

I guess my hope is that Hydra would catch this case (or the PR merger). We get ~50 minutes of free CPU time to evaluate a PR with Travis. We're never going to build all of Nixpkgs so let's just use it as a kind of smoke test on PRs. As a rule, aim for 0 false negatives on Travis and 0 false positives on Hydra.

@peterhoeg
Copy link
Member

I typically work against unstable to have as much already compiled as possible. Then I rebase against master and push that or alternatively cherry-pick the commits if rebase fails. I was under the impression that that was the "Best Practice (tm)".

@matthewbauer
Copy link
Member Author

matthewbauer commented Sep 18, 2017

I typically work against unstable to have as much already compiled as possible. Then I rebase against master and push that or alternatively cherry-pick the commits if rebase fails.

I don't think you really need to rebase on master unless there's a merge conflict (which GitHub should tell you). But, this will give you probably the best experience with Travis CI right now because the different between the PR's HEAD commit and the merged commit will not be that much. A problem arises when we get backlogs in Travis and master has changed so much since you originally rebased. Travis CI will always use the latest merge commit and it's extremely unlikely for that to be in binary cache (even on the best days we have ~24 hour delay in channel releases, see https://howoldis.herokuapp.com).

I was under the impression that that was the "Best Practice (tm)".

I'm really not sure if there is a best practice yet, but we probably should have one. I'd suspect almost everyone just bases off of master which I would say is a bad practice.

@ThomasMader
Copy link
Contributor

ThomasMader commented Sep 18, 2017

We're pretty much confined to how Travis works. As soon as a commit is pushed to a either a branch or an open PR, it starts a 50 minute timer that we get to build something. Travis basically sets these rules and we have to follow them. One thing you can do, though, is set up Travis for your Nixpkgs fork and get better response times because Travis sets quotas per-repo.

https://blog.travis-ci.com/2013-08-21-abort-mission-cancel-running-builds/ says it's possible via their console client.
Something like the following should be possible I think but I just skipped over their API and don't have any experience with Travis. (https://docs.travis-ci.com/api/#builds)

  • A new commit is pushed as PR and it is checked if there are already waiting PR builds
  • If so the job isn't gonna be created but stored in unknownPlaceQueue (The question is where)
    If there is another build in unknownPlaceQueue with the same PR number the old one is replaced by the new one.
  • The next time a push comes in the check releases/starts the oldest PRs from the unknownPlaceQueue till there is one waiting PR build on Travis

This is very rough but should be sufficient to get the idea.

Update:

Ok that's exactly what Auto Cancellation is doing. Thanks for activating it. 👍

@primeos
Copy link
Member

primeos commented Sep 18, 2017

From the ML:

Often it won't be able to finish in time, thus failing, causing the contributor to think they might have done something wrong.

We could prevent this by running something like the following:

timeout 45m ./maintainers/scripts/travis-nox-review-pr.sh $TARGETS \
  || if [[ $? -eq 124 ]]; then (>&2 echo "Timeout"); fi

But a drawback would be that this could produce successful (green) CI builds that might fail after the timeout (except there's a special exit code that we could use).

Regarding the PR:

Imho this might make it more complicated for new contributors to submit PRs and is prone to human error.

Another solution could be to extract the changes of a PR (i.e. the changes/commits show by GitHub) and cherry-pick them onto channels/nixpkgs-unstable. We could e.g. get these changes by specifying the following range: ^origin/master HEAD (commits reachable from HEAD without the ones reachable from origin/master, where HEAD is the most recent commit of that PR and origin points to this repository). That should improve the build-time and reduce timeouts since most/all dependencies should be available from the binary cache but compared to the approach from @matthewbauer this has the disadvantage that the author of an PR doesn't have a direct control over the HEAD of the CI build (i.e. the CI build will fail when channels/nixpkgs-unstable is currently failing). Additionally this will obviously fail when the commits can't be applied onto channels/nixpkgs-unstable which will cause problems when channels/nixpkgs-unstable lags behind (and I just noticed that it's currently 11 days behind which is too much imho...). We could try to improve this by using something like origin/master@{yesterday} instead.

Could this improve things or is it a bad idea?

without merging there is no reason to fetch more commits than we will need.
Nox is unpredictable enough that I think it makes sense to not make it affect
the build status.
nix-shell --command true doesn’t seem to work as expected. Instead just redirect
to /dev/null.
I’ve cleaned up some of the commands and added the ‘-Q’ option. I hadn’t seen
this option before, but it seems to quiet nix-build’s fetching. Maybe it will
make our logs a little shorter.
nix-build outputs most stuff to stderr, so just redirect that back to stdout.
@c0bw3b
Copy link
Contributor

c0bw3b commented Nov 15, 2017

@matthewbauer @Mic92 I think this has become irrelevant after #31605

@Mic92 Mic92 closed this Nov 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants