[18.09] fix buildbot #56447

Merged
merged 2 commits into from Mar 3, 2019

Member

@veprbl veprbl commented Feb 27, 2019

Motivation for this change

#56053

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

buildbot was always broken on release-18.09 due to failing tests

One of the failures is:

[ERROR]
Traceback (most recent call last):
  File "/build/buildbot-1.2.0/buildbot/process/properties.py", line 459, in getRenderingFor
    rv = yield build.render(value[index])
  File "/nix/store/sqr3s9cva7r3z12hqb6rxw3w8kiqzmhd-python2.7-Twisted-18.7.0/lib/python2.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/build/buildbot-1.2.0/buildbot/process/properties.py", line 495, in getRenderingFor
    raise KeyError(error_message)
exceptions.KeyError: "secrets service not started, need to configure SecretManager in c['services'] to use 'secrets'in Interpolate"

buildbot.test.unit.test_master.StartupAndReconfig.test_reconfigService_db_url_changed

There is a mention of this at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=907687
Because buildbot was never working before, we can just bump the version
to the one which passes the test.

(cherry picked from commit 71c4246)
@veprbl
Member Author

veprbl commented Feb 27, 2019

cc @nand0p @ryansydnor @lopsided98

@dotlambda
Member

Assuming that we also merge #56200 into release-19.03, this will be fixed anyway.

@lopsided98
Contributor

@dotlambda How? This is for 18.09 and I don't see anything relevant in #56200 anyway.

I think we should just upgrade sqlalchemy_migrate, rather than creating a new package just for Buildbot. The root cause of the Buildbot test failures is actually a SQLite update (which was backported to 18.09), which broke sqlalchemy_migrate. sqlalchemy_migrate 0.12.0 contains fixes to make it compatible with the latest version of SQLite. Buildbot's comprehensive tests happened to catch the issue at build time, but any package that uses sqlalchemy_migrate 0.11.0 with SQLite is likely to break at runtime.
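To check which SQLite library a given Python environment is actually linked against, and to smoke-test the kind of rename-and-copy table rewrite that schema migration tools commonly perform on SQLite, something like the following could be used. This is only a sketch built on the standard-library sqlite3 module. It is not taken from sqlalchemy_migrate or Buildbot, the function and table names are made up for illustration, and it does not reproduce the specific failure discussed here.

```python
import sqlite3


def sqlite_runtime_info():
    """Return the version of the SQLite library linked into this Python."""
    return {
        "sqlite_library": sqlite3.sqlite_version,
        "sqlite_library_tuple": sqlite3.sqlite_version_info,
    }


def smoke_test_table_rewrite():
    """Drop a column from a table the way migration tools often emulate it
    on SQLite: rename the old table, create the new one, copy the rows.

    Hypothetical schema; runs entirely against an in-memory database.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
        conn.execute("INSERT INTO t (a, b) VALUES ('x', 'y')")
        # Rewrite the table without column "b".
        conn.execute("ALTER TABLE t RENAME TO t_old")
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT)")
        conn.execute("INSERT INTO t (id, a) SELECT id, a FROM t_old")
        conn.execute("DROP TABLE t_old")
        conn.commit()
        return conn.execute("SELECT id, a FROM t").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(sqlite_runtime_info())
    print(smoke_test_table_rewrite())
```

Running this inside the environment of a package that depends on sqlalchemy_migrate shows the SQLite version it will see at runtime, which is the variable that changed with the backported update.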

@veprbl
Member Author

veprbl commented Feb 27, 2019

@lopsided98 I think you are right. Looking at the changes between 0.11.0 and 0.12.0, they look like just bugfixes: https://gist.github.com/veprbl/b8c50d01343157507990a30d183927e2

@dotlambda
Member

Sorry, I mixed up 18.09 and 19.03.

@veprbl veprbl merged commit 770d3ca into NixOS:release-18.09 Mar 3, 2019
@veprbl
Member Author

veprbl commented Mar 5, 2019

@veprbl
Member Author

veprbl commented Mar 7, 2019

Those tests failed randomly on Hydra, so I disabled them for release-18.09 in 31e932c.
It would be nice if the maintainers fixed that properly on master and the upcoming release-19.03. An example of this problem on trunk is:
https://hydra.nixos.org/build/83000720
cc @makefu

@makefu
Contributor

makefu commented Mar 8, 2019

@veprbl Is sqlalchemy_migrate 0.12.0 also broken on 19.03? It seems that some version update broke the tests (again).
I can have a look at it.

Last successful build 2018-10-13 17:57:56 This build 2018-10-18 14:29:23
It has been broken for almost 5 months, so it is not really a "random fail", right?

@veprbl
Member Author

veprbl commented Mar 8, 2019

@makefu I don't know for sure. There were only two evaluations for 19.03 so far: https://hydra.nixos.org/job/nixos/release-19.03/nixpkgs.python27Packages.sqlalchemy_migrate.x86_64-linux
There were more evaluations for master ("trunk" on Hydra) so that's why I checked there.

Last successful build 2018-10-13 17:57:56 This build 2018-10-18 14:29:23
It has been broken for almost 5 months, so it is not really a "random fail", right?

I think this really means "Previous successful build". You can find this build in the list of trunk evaluations:
https://hydra.nixos.org/job/nixpkgs/trunk/python27Packages.sqlalchemy_migrate.x86_64-linux/all

@dotlambda
Member

If not on release-19.03, it might be broken on staging-19.03.

@veprbl
Member Author

veprbl commented Mar 8, 2019

I'm also not sure that this is caused by a version update. Compare how many times the following output is repeated in the build logs:

${PYTHON:-python} -m subunit.run discover -t ./ .  --load-list /build/tmpy7vZIL
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-60} \

I get far fewer of these on my machines and in successful builds on Hydra.
I think this must be some kind of timeout problem; maybe we could increase the timeouts.
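The comparison described above can be automated roughly as follows. This is a minimal sketch, assuming each repetition shows up in the build log as a line containing "-m subunit.run discover", and that the timeout is read from the OS_TEST_TIMEOUT environment variable with the default of 60 shown in the excerpt. The helper names are hypothetical and not part of any test runner.

```python
import os
import re

# Matches the runner command line quoted above. Assumption: it appears once
# per repetition in the build log.
RUNNER_LINE = re.compile(r"-m subunit\.run discover\b")


def count_runner_invocations(log_text):
    """Count how many times the test runner command appears in a build log."""
    return sum(1 for line in log_text.splitlines() if RUNNER_LINE.search(line))


def resolve_timeout(env=None, default=60):
    """Mimic the shell expansion ${OS_TEST_TIMEOUT:-60}.

    The default is used when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("OS_TEST_TIMEOUT", "")
    return int(raw) if raw else default


if __name__ == "__main__":
    import sys

    # Usage: python count_runs.py failing.log passing.log
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_runner_invocations(handle.read()))
    print("effective OS_TEST_TIMEOUT:", resolve_timeout())
```

Because the excerpt uses the `${VAR:-default}` form, exporting a larger OS_TEST_TIMEOUT before the tests run would override the default of 60. Whether that is enough to fix the Hydra failures has not been verified here.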

@makefu
Contributor

makefu commented Mar 8, 2019

So this means that sqlalchemy_migrate in version 0.12.0 looks fine up until now. The 0.11.0 tests only broke once (hence "random fail", I guess), in October 2018. To be honest, I really do not know what happened with the shutil step in this build.

@veprbl
Member Author

veprbl commented Mar 8, 2019

so this means that sqlalchemy_migrate in version 0.12.0 looks fine up until now

I already mentioned that 0.12.0 broke on release-18.09 in the same way: https://hydra.nixos.org/build/89918316

@makefu
Contributor

makefu commented Mar 9, 2019

Well, that sucks. The failing test case is different, but both times a shutil copy step fails. That step seems to be used to prepare a test environment in /tmp/.
I am unsure where this error could come from.
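For readers unfamiliar with this kind of step, a test environment is often prepared by copying a template tree into a scratch directory. The sketch below only illustrates that general pattern with the standard library. It is not the sqlalchemy_migrate test code, the function names are hypothetical, and it makes no claim about the cause of the failures seen on Hydra.

```python
import os
import shutil
import tempfile


def prepare_test_env(template_dir):
    """Copy a template tree into a fresh, uniquely named scratch directory.

    Returns (workdir, target), where target is the copied tree.
    """
    # mkdtemp creates a new directory with a unique name under the system
    # temporary directory (usually /tmp on Linux).
    workdir = tempfile.mkdtemp(prefix="testenv-")
    target = os.path.join(workdir, "env")
    # copytree requires that the destination does not exist yet, which is
    # why the copy goes into a subdirectory of the new scratch directory.
    shutil.copytree(template_dir, target)
    return workdir, target


def cleanup_test_env(workdir):
    """Remove the scratch directory created by prepare_test_env."""
    shutil.rmtree(workdir, ignore_errors=True)
```

When debugging a failing copy like the one in the Hydra logs, printing the source and destination paths just before the copy is a cheap first step to see which side is missing or already present.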

@veprbl veprbl deleted the pr/buldbot_fix_18.09 branch December 1, 2020 16:59