Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing{-python}.nix: Remove log pretty-printing cruft #87191

Merged
merged 1 commit into from May 9, 2020

Conversation

edolstra
Copy link
Member

@edolstra edolstra commented May 7, 2020

Motivation for this change

This completes the removal of the nested log feature, which previously got removed from Nix, Hydra, stdenv and GNU Make. In particular, this means that the output of VM builds no longer contains a copy of jQuery.

This also makes VM builds more reproducible because the log output contained a lot of timing-sensitive data.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

This completes the removal of the nested log feature, which previously
got removed from Nix, Hydra, stdenv and GNU Make. In particular, this
means that the output of VM builds no longer contains a copy of
jQuery.
@tfc
Copy link
Contributor

tfc commented May 9, 2020

The time window between opening the PR and merging it (2 days) was way too quick for me this week, i did not have a chance to study these changes before they were merged.

This completes the removal of the nested log feature, which previously got removed from Nix, Hydra, stdenv and GNU Make

Do you have links or any other references to discussions about this that were ongoing?

This also makes VM builds more reproducible because the log output contained a lot of timing-sensitive data.

I don't understand how the change makes the tests more reproducible, as in my experience they were reproducible already, as test authors usually write their tests in a way that they are not timing-sensitive.

What test was damaged so hard by the logging that the feature has to be removed completely? The output folder now contains no logging whatsoever.

In the beginning i carefully ported the perl driver's logging code to python because there was a lot of pushback against removing it. In the long term the idea was to use native python logging frameworks, but people were against having no logging in the meantime.
With these changes, we have no logging but left a lot of python code in that is now dysfunctional. One could re-enable log output to $out/log.xml but with a missing template file there is no nice way to generate the HTML log from it (be that with or without jquery dependency).

If it was just about jQuery, it would surely have been sufficient to just remove jquery from the template file and make it a static no-js html page.

I am confused about the hurry and the method of removing this.

@flokli flokli mentioned this pull request May 9, 2020
10 tasks
@flokli
Copy link
Contributor

flokli commented May 14, 2020

I was also baffled by this, and would appreciate some reaction from @edolstra.

For now, we kept "feature compatibility" with the perl driver, and originally planned to do all the refactoring and yakshaving after all the remaining perl test driver occurences have been removed.

As written in #86486 (comment), this PR didn't really fix the issue it was supposed to fix, and my experiments around using the python logging framework revealed the test driver got a lot less powerful (no more "log nesting" for subtests, as there's no mutable log object, screenshots without any context in test output).

I'd personally suggest to make the writes to stderr threadsafe, and revert the nested log output removal (and the "no more log in $out) until we have some time to properly refactor this - during initial discussions about the refactor, a lot of people actually wanted to keep these log reports - we could make them a flag for hydra, if storing the jquery javascript for every test run is really a problem.

@edolstra
Copy link
Member Author

Yes, putting logs in an output makes the build non-reproducible. Let's pick a random line from a VM test run:

 machine# [    1.667557] input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb1/1-1/1-1:1.0/0003:0627:0001.0001/input/input2 

The timestamp 1.667557 is unlikely to be the same across runs. So running with --repeat N will flag this build as non-reproducible.

More importantly, it's bad UX to have two different kinds of log. Also, viewing "pretty-printed" logs puts strain on Hydra since it needs to fetch paths from the binary cache and unpack them to serve them to the client (whereas regular logs are fetched from the binary cache by the client directly).

BTW, I believe I did mention during the Python migration that there was no need to port the logging stuff. We removed log pretty-printing everywhere else so why should we keep it here?

@tfc
Copy link
Contributor

tfc commented May 14, 2020

BTW, I believe I did mention during the Python migration that there was no need to port the logging stuff. We removed log pretty-printing everywhere else so why should we keep it here?

You did, but this started a sub-discussion that exposed that there are people who want to have such logging facilities in general (i certainly do), that we did not resolve.

A flag like @flokli mentioned, that is either off by default or in jobsets that do reproducibility checks, but still allows for logging (without depending on jquery etc) would be great.

@samueldr
Copy link
Member

samueldr commented May 14, 2020

I don't understand how this makes the tests reproducible, if there is no output to verify most of the time.

~ $ nix-store --realise /nix/store/30w0kfrafcgk03hkak2xcqid7l9y2d58-vm-test-run-installer-lvm
...

.../store/30w0kfrafcgk03hkak2xcqid7l9y2d58-vm-test-run-installer-lvm 130 $ find
.
./coverage-data
./coverage-data/vm-state-machine

Two runs could have wildly differing executions, possibly with actually differing behaviours, but be understood as "reproducible" according to what I understand.

What could happen, to help make the logs more reproducible, which may be a fool's errand, is to have the dmesg output massaged in outputting monotonically incrementing numbers rather than timings in the first column.

Now, that won't fix ordering of the messages, but that's because the kernel execution is not reproducible in that context.


EDIT: until we have truly reproducible execution of a system, we shouldn't describe a VM test as reproducible, and shouldn't pretend it is. They are not. They are notoriously fickle enough to sometimes work and sometimes fail.

@flokli
Copy link
Contributor

flokli commented May 17, 2020

I don't think anyone expects we have a truly reproducible system execution, especially not across multiple VMs.

NixOS VM tests are used to "integration test" one or multiple NixOS systems.

Many people expressed the wish to have the logs available in the output. I also don't see a reason on why we should drop logs, but still keep (also non-reproducible) screenshots and coverage-data.

We could not do this for hydra builds if their outputs are really an issue, but TBH, it's mostly text, which can be well compressed, and we probably fill the cache with much bigger things on every world rebuild.

So IMHO, let's put the log output back in the output.

Whether it should be behind a flag, whether HTML output (nested/non-nested) should be preserved, whether we can use the python logging framework - that's all details, but just removing it probably isn't what we should do here.

@edolstra
Copy link
Member Author

I disagree. Putting logs in $out is wildly inconsistent with every other kind of derivation. It's very surprising (in a bad way) from a UX perspective.

The nested logging once made sense because the rest of the Nix ecosystem implemented nested logs as well. But since we got rid of it everywhere else, it no longer makes sense to keep it around just for VM builds.

@flokli
Copy link
Contributor

flokli commented May 18, 2020

Following that logic, screenshots and coverage data also don't belong into the output as well, and the output should be entirely empty.
Like with build logs, their main reason of existence is reasoning about how the build went.

IMHO, VM tests are something different with other "package derivations", as they don't build packages, but do integration tests, using Nix derivations as a tool to compose them.

People make use of the VM testing infrastructure in scopes outside of nixpkgs, and even in nixpkgs, being able to view the screenshots interleaved with the logs helps to debug them.

We shouldn't just remove them without any alternative.

@samueldr
Copy link
Member

Isn't the output of a test, some kind of result, probably data proving the test? As @flokli says, tests don't build packages, they run tests.

Right now, we're losing the output of the tests. There isn't even a "dirty" VM image to peruse in the output. If it isn't logs, the full VM image would be something I would expect to see as the output of a test, just as I expect the output of packages to be the package.

I'm sure we'll hear "nix log has the log output", but this is not where I would expect to see the output of the test. I wouldn't necessarily think the console output would be there, but logs printed while running the test like the test harness output, should be in nix log. Conversely, I would expect to see no harness-specific output inside the produced output.

It may be that "log" is an overloaded term here, there is the log for running the tests, i.e. harness output, and there is the test result log, i.e. console output from the system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants