[WIP] AMDGPU-PRO 17.40 -> 18.03 for use with 4.18 kernel #46765
Force-pushed from 9a660ff to 6f2f25e.
Some things I'll try to test:
|
Looks like the module failed to load due to duplicate [...]. Unfortunately the kernel version has an extra [...]. I can see a few different ways to fix it...
edit: I'd probably go with (2) |
Force-pushed from 6f2f25e to 1c4f658.
Looks like the kernel version and the amdgpu-pro version take the same args, actually. I've fixed that and hit some other missing symbols that I'm not sure what to do about. |
Oh yeah, I must have been looking at an old version or something. Weird. So what errors do you get exactly? |
  *
  */
-int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
+int _kcl_pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
You shouldn't need to rename this. It should still have the same name with older kernels.
Yeah, oops, didn't actually mean to push that. :)
Force-pushed from 1c4f658 to e1d2fdd.
I didn't see any references to these in the source. |
Looks like some new modules were added, so the list in the nix expression needs updating:
diff --git a/pkgs/os-specific/linux/amdgpu-pro/default.nix b/pkgs/os-specific/linux/amdgpu-pro/default.nix
index c530cdafbe0..0976dac8b1c 100644
--- a/pkgs/os-specific/linux/amdgpu-pro/default.nix
+++ b/pkgs/os-specific/linux/amdgpu-pro/default.nix
@@ -90,6 +90,9 @@ in stdenv.mkDerivation rec {
modules = [
"amd/amdgpu/amdgpu.ko"
"amd/amdkcl/amdkcl.ko"
+ "amd/amdkfd/amdkfd.ko"
+ "amd/lib/amdchash.ko"
+ "scheduler/amd-sched.ko"
"ttm/amdttm.ko"
];
That got the module loading for me, so KVM works. I started looking at user-space stuff. Xorg crashes on startup:
|
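One rough way to check the `modules` list against what the driver build actually produces is to enumerate the compressed module files in the unpacked source tree. This is only a sketch: the function name `list_built_modules` is made up for illustration, and it assumes the modules carry a `.ko.xz` suffix under a single directory, as the `install -Dm444 ... ${m}.xz` line in default.nix suggests.

```shell
# Hypothetical helper: print module paths relative to the given directory,
# in the same form the Nix `modules` list uses (e.g. amd/amdgpu/amdgpu.ko).
list_built_modules() {
  # $1: directory holding the built modules (layout is an assumption)
  (
    cd "$1" || exit 1
    find . -type f -name '*.ko.xz' \
      | sed -e 's:^\./::' -e 's:\.xz$::' \
      | LC_ALL=C sort
  )
}
```

Running it against the directory the expression installs from and comparing the output with the `modules` list should show any entries that are missing, such as the three added in the diff above.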
Requires at least 4.15 kernel; I've updated it to work with 4.18.
Force-pushed from e1d2fdd to e35666d.
I fixed that and a couple other files whose names changed. Here's my crash log:
|
I noticed in an strace log that it was failing to load [...]:
diff --git a/nixos/modules/hardware/video/amdgpu-pro.nix b/nixos/modules/hardware/video/amdgpu-pro.nix
index ed60aad1047..3f65ae9a1ee 100644
--- a/nixos/modules/hardware/video/amdgpu-pro.nix
+++ b/nixos/modules/hardware/video/amdgpu-pro.nix
@@ -48,6 +48,7 @@ in
mkdir -p /run/lib
ln -sfn ${package}/lib ${package.libCompatDir}
ln -sfn ${package} /run/amdgpu-pro
+ ln -sfn ${package} /run/amdgpu
'' + optionalString opengl.driSupport32Bit ''
ln -sfn ${package32}/lib ${package32.libCompatDir}
'';
diff --git a/pkgs/os-specific/linux/amdgpu-pro/default.nix b/pkgs/os-specific/linux/amdgpu-pro/default.nix
index f96a08ed426..0ee9d83b830 100644
--- a/pkgs/os-specific/linux/amdgpu-pro/default.nix
+++ b/pkgs/os-specific/linux/amdgpu-pro/default.nix
@@ -138,6 +138,11 @@ in stdenv.mkDerivation rec {
'' + ''
popd
+ '' + optionalString (!libsOnly) ''
+ pushd opt/amdgpu
+ cp -r share/libdrm $out/share
+ popd
+
'' + optionalString (!libsOnly)
(concatMapStrings (m:
"install -Dm444 usr/src/amdgpu-${build}/${m}.xz $out/lib/modules/${kernel.modDirVersion}/kernel/drivers/gpu/drm/${m}.xz\n") modules)
@@ -165,6 +170,9 @@ in stdenv.mkDerivation rec {
for lib in dri/amdgpu_dri.so libdrm_amdgpu.so.1.0.0 libgbm.so.1.0.0 libkms.so.1.0.0 libamdocl${bitness}.so; do
perl -pi -e 's:/opt/amdgpu-pro/:/run/amdgpu-pro/:g' "$out/lib/$lib"
done
+ for lib in libdrm_amdgpu.so.1.0.0; do
+ perl -pi -e 's:/opt/amdgpu/:/run/amdgpu/:g' "$out/lib/$lib"
+ done
substituteInPlace "$out/share/vulkan/icd.d/amd_icd${bitness}.json" --replace "/opt/amdgpu-pro/lib/${libArch}" "$out/lib"
'' + optionalString (!libsOnly) ''
for lib in libglamoregl.so; do
It didn't fix the assert though. amdgpu.ids just looks like human-readable names for GPUs. The actual assert message I'm getting is:
Which is probably a result of:
I didn't see anything else in the strace logs that looked like it might be failing due to hardcoded paths. |
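Since strace only shows paths that are actually opened at run time, another way to look for leftover hardcoded prefixes is to scan the installed files directly. A minimal sketch, assuming the prefix of interest is `/opt/amdgpu` (which also matches `/opt/amdgpu-pro`); the function name is hypothetical:

```shell
# Hypothetical helper: list files under a directory that still contain
# the literal string /opt/amdgpu (this also matches /opt/amdgpu-pro).
find_hardcoded_opt_paths() {
  # -r: recurse, -l: print only the names of matching files,
  # -F: treat the pattern as a fixed string rather than a regex.
  # grep -l reports binary files as well, so shared objects are covered.
  LC_ALL=C grep -rlF -- '/opt/amdgpu' "$1" | sort
}
```

Pointing it at the package's `lib` directory after the build would list any library the perl substitutions missed. Those substitutions are safe on binaries because `/opt/` and `/run/` have the same length, so offsets inside the files do not shift.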
The only thing I have to go on now is some errors from LD_DEBUG=files:
I'm not sure what that's about though, because libEGL does seem to export that symbol, and it looks like it gets loaded correctly. |
Thank you for your contributions.
|
It might still be important, as it is mentioned by some still-open issues. |
I marked this as stale due to inactivity. |
I did a pass on 21.30 in #151019. I currently have xorg and 64-bit opengl working, but probably nothing else. |