Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Mesa cache fix #93946

Closed
wants to merge 2 commits into from
Closed

WIP: Mesa cache fix #93946

wants to merge 2 commits into from

Conversation

corngood
Copy link
Contributor

@corngood corngood commented Jul 27, 2020

Motivation for this change

Fixes #92807

Things done
  • enable build-id for llvm and radv
  • remove old disk-cache-key patch from mesa
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@corngood corngood changed the base branch from master to staging July 27, 2020 02:54
@corngood
Copy link
Contributor Author

The mesa patch should probably go upstream. I'll look into that when I get a chance.

@corngood
Copy link
Contributor Author

Upstream MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6081

Copy link
Member

@primeos primeos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really related to #92807? Your case seems to differ since that issue was fixed by the update to Mesa 20.1.3 which didn't fix it for you. So you seem to be hitting another issue.

pkgs/development/compilers/llvm/10/llvm.nix Show resolved Hide resolved
@@ -57,7 +57,7 @@ stdenv.mkDerivation {
patches = [
./missing-includes.patch # dev_t needs sys/stat.h, time_t needs time.h, etc.-- fixes build w/musl
./opencl-install-dir.patch
./disk_cache-include-dri-driver-path-in-cache-key.patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check carefully if this patch is indeed not required anymore. From a quick look at the source-code it looks ok, but I haven't checked all drivers yet. Some still refer to the parameter as *timestamp (inconsistently even src/util/disk_cache.h) while using the build-id.

I guess I'm mainly concerned since #91145 apparently didn't affect all drivers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a somewhat thorough check. All the dri and vulkan drivers have build-id.

lib/dri/radeonsi_drv_video.so does not have build-id, and does contain si_disk_cache_create. I'm not sure if that's a problem. None of the _drv_video.so libs or vdpau libs have build-id.

The problem with the cache key patch is that it only affects the actual disk cache keys, not other uses of cache_uuid, such as the one in radv.

I'd be happy just to merge a23c6888d399b291ba691e43028e1b295ad10efc while this gets investigated more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the only driver that still uses timestamps directly:

https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/freedreno/vulkan/tu_device.c#L61

@corngood
Copy link
Contributor Author

Is this really related to #92807? Your case seems to differ since that issue was fixed by the update to Mesa 20.1.3 which didn't fix it for you. So you seem to be hitting another issue.

I'm not sure if it's the same problem everyone in that thread was having, but this comment mentions ac_rtld error: !part->elf, which I believe is caused by the cache collision:

#92807 (comment)

@jtojnar jtojnar removed the request for review from 7c6f434c July 27, 2020 12:48
Now that mesa uses build-id for cache keys, this shouldn't be necessary.
This is used by mesa drivers as a cache key.  llvmPackages_9 is
currently used by mesa in nixpkgs, so we just enable it there and in
llvmPackages_10 so it doesn't get lost in future versions.
@primeos
Copy link
Member

primeos commented Jul 28, 2020

I'm not sure if it's the same problem everyone in that thread was having, but this comment mentions ac_rtld error: !part->elf, which I believe is caused by the cache collision:

Ok, yeah it seems like multiple things got mixed up in that issue.

I've gone ahead and pushed the important fix into staging 535a3e8 and rebased your branch.

Regarding the rest of the commits:

@vcunat
Copy link
Member

vcunat commented Jul 28, 2020

How is this related to patchelf 0.11? ATM we're still using 0.9 for mesa drivers because of that bug (and I can't see this PR changing that).

@corngood
Copy link
Contributor Author

Well I guess the concern was patchelf 0.11 being using on libLLVM? I did build llvm 9 and 10, and the notes are intact. I also stepped through the uuid code to verify it gets used.

$ readelf -n result-lib/lib/libLLVM-10.so                                                                                                                                                                                                                                                ~/src/nixpkgs

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size       Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: bf824074735a0a4a13681e577d65adc101281359

@corngood corngood changed the title Mesa cache fix WIP: Mesa cache fix Jul 28, 2020
@primeos
Copy link
Member

primeos commented Jul 28, 2020

Well I guess the concern was patchelf 0.11 being using on libLLVM?

Yes, exactly.

I did build llvm 9 and 10, and the notes are intact. I also stepped through the uuid code to verify it gets used.

Awesome, thanks :)

@vcunat do you have an opinion / any comments regarding ff4481b? I think it's the way to go but I cannot really estimate the potential regressions. The main drivers should all be fine but as @corngood noted here not everything is using a build-id yet. If we encounter regressions we could then fix them properly (mesa shouldn't rely on any timestamps for reproducible builds and they seem to agree) but it might be difficult for users to track these regressions down (which is my main concern as I know how annoying graphics bugs can unfortunately be). But I don't have any objections if anyone would like to merge this right away, I just don't want to make that decision alone with my limited knowledge. (And I guess Mesa updates regularly come with some regressions/bugs anyway, so I might be just overthinking this.)

@vcunat
Copy link
Member

vcunat commented Jul 28, 2020

I've never really looked into details around that patch.

@corngood
Copy link
Contributor Author

@primeos I marked this WIP because I'm also concerned about regressions in ff4481b.

$ find `nix-build -A mesa_drivers` -name \*.so\* -type f | xargs readelf -n
File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/i915_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2368960098c4ebc32b23d97d516d5d4abefb4e2f

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/i965_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2368960098c4ebc32b23d97d516d5d4abefb4e2f

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/radeon_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2368960098c4ebc32b23d97d516d5d4abefb4e2f

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/r200_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2368960098c4ebc32b23d97d516d5d4abefb4e2f

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/nouveau_vieux_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2368960098c4ebc32b23d97d516d5d4abefb4e2f

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/radeonsi_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/nouveau_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/swrast_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/kms_swrast_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/iris_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/r300_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/r600_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/vmwgfx_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/virtio_gpu_dri.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 75fba4914990966831d7cf85e525770d868c1ab9

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/r600_drv_video.so

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/radeonsi_drv_video.so

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/dri/nouveau_drv_video.so

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/vdpau/libvdpau_r300.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/vdpau/libvdpau_r600.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/vdpau/libvdpau_radeonsi.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/vdpau/libvdpau_nouveau.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/d3d/d3dadapter9.so.1.0.0

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 5630f668630388d1f46b11c54117d16b0c75b69e

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libXvMCr600.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libXvMCnouveau.so.1.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libxatracker.so.2.5.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libvulkan_intel.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 9d2f81e6a959ecf55acc1f5e1c34e14d980221c4

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libvulkan_radeon.so

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 955bd5fbe91cb8c722cf750b483c2376ba0f3faa

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libEGL_mesa.so.0.0.0

File: /nix/store/c7sg1y7z9s13bnc117blfxqsjf6qmnv5-mesa-custom-20.1.4-drivers/lib/libGLX_mesa.so.0.0.0

si_disk_cache_create is included in a bunch of these libraries, including the non-radeon ones, which is odd.

primeos added a commit that referenced this pull request Aug 26, 2020
This is used by mesa.drivers (still on LLVM 9) as a cache key. I've
ported that change to LLVM 11 to test it and so that it doesn't get lost
in future versions. Credit for the change goes to David McFarland.
See #93946 for details.

Co-Authored-By: David McFarland <corngood@gmail.com>
@jansol
Copy link
Contributor

jansol commented Oct 21, 2020

It's been a while, this PR can probably be closed as outdated since upstream has updated after merging the fix from here already.

@corngood unless I missed something in the discussion? Admittedly I didn't follow it too closely.

@corngood
Copy link
Contributor Author

@jansol yeah, we don't need this PR. I'll close it. Thanks.

@corngood corngood closed this Oct 21, 2020
rvem pushed a commit to serokell/nixpkgs that referenced this pull request Nov 8, 2022
This is used by mesa.drivers (still on LLVM 9) as a cache key. I've
ported that change to LLVM 11 to test it and so that it doesn't get lost
in future versions. Credit for the change goes to David McFarland.
See NixOS#93946 for details.

Co-Authored-By: David McFarland <corngood@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

steam crashes amdgpu with mesa 20.0.8
5 participants