nixos/tests/atd : remove non-deterministic test of batch command #37507

Merged
merged 1 commit into from Mar 28, 2018


xeji
Contributor

@xeji xeji commented Mar 21, 2018

Motivation for this change

The atd test usually failed on Hydra and usually succeeded on a local machine. The failing part was the test of the batch command, which executes jobs only when system load is low, and that rarely happens on Hydra.

Removed this non-deterministic test case.

/cc ZHF #36453
/cc maintainer @bjornfor

Things done

Manually tested.

"batch" executes jobs based on system load.
test was not deterministic.
@bjornfor
Contributor

Thanks! I noticed there was some Hydra failure but didn't understand why. Actually, I still don't understand why. The batch command is run in the VM / guest, which has a separate "load" metric from the host.

@xeji
Contributor Author

xeji commented Mar 22, 2018 via email

@bjornfor
Contributor

The test also runs "at", waits 1.5 minutes and expects "atd" to have performed the job by then. That's not deterministic either. How about polling every 30s or so, for batch and atd to perform their job, with timeout after 15 minutes? (If that's not enough, I think Hydra is run under way too heavy load.)
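The poll-with-timeout idea suggested above can be sketched roughly as follows. This is not the code of the NixOS test; `poll_until`, its parameters, and the injected `sleep` and `clock` hooks are hypothetical names chosen for illustration, with the 30 s interval and 15 minute timeout taken from the suggestion.

```python
import time
from typing import Callable


def poll_until(
    condition: Callable[[], bool],
    interval: float = 30.0,        # seconds between checks (from the suggestion above)
    timeout: float = 15 * 60.0,    # give up after 15 minutes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True as soon as condition() holds, False once the timeout has elapsed."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

In a test, `condition` would be something like a check that the file written by the scheduled job exists. The check passes as early as possible and only fails after the full timeout, which trades test runtime for tolerance of a slow or loaded machine.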

@xeji
Contributor Author

xeji commented Mar 23, 2018

I've never seen the at test fail. Relative timing looks reliable: at now + 1min is done before sleep 1.5min, no matter the system load. This test makes sense to me, no reason to change it.
As for batch, waiting longer as you suggested may make the test pass in most cases - but then what is its significance? You cannot really test the desired "execute when system load is low" behavior unless you can reliably simulate high or low system load. No way to do that on Hydra.
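To make the point about load dependence concrete, here is a deliberately simplified model of the decision, not atd's actual implementation. The function name `batch_would_run` and the threshold values used below are assumptions for illustration; the real limit is whatever atd was built or started with.

```python
import os
from typing import Optional


def batch_would_run(threshold: float, load_1min: Optional[float] = None) -> bool:
    """Simplified model: a batch job is started only while the load average is below a threshold."""
    if load_1min is None:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
        load_1min = os.getloadavg()[0]
    return load_1min < threshold


# The same job, with the same assumed threshold, runs or waits purely
# depending on what else the machine happens to be doing.
print(batch_would_run(1.5, load_1min=0.2))  # idle machine
print(batch_would_run(1.5, load_1min=4.0))  # busy machine
```

Because the outcome depends on an input the test does not control, a pass says little about whether the "run when load is low" behavior is correct.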

@bjornfor
Contributor

> Relative timing looks reliable: at now + 1min is done before sleep 1.5min, no matter the system load.

I bet if the VM spawned a few CPU hogging processes with high enough priority, we could make "at" fail too. So I think both tests are somewhat non-deterministic, and that batch could be made reliable "enough" with some more sleep + poll.

But to be pragmatic, I'm ok with removing the test.

@fpletz fpletz added this to the 18.03 milestone Mar 28, 2018
@fpletz fpletz merged commit 9f3718f into NixOS:master Mar 28, 2018
@xeji xeji deleted the p/test-atd branch March 28, 2018 22:52