Message ID | 20210309032356.20800-1-Felix.Kuehling@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/1] drm/amdkfd: fix build error with AMD_IOMMU_V2=m | expand |
On Tue, Mar 9, 2021 at 4:23 AM Felix Kuehling <Felix.Kuehling@amd.com> wrote: > > Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link > against the exported functions. If the GPU driver is built-in but the > IOMMU driver is a loadable module, the kfd_iommu.c file is indeed > built but does not work: > > x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device': > kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid' > x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process': > kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid' > x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend': > kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device' > x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume': > kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device' > x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid' > x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb' > x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device' > > Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols > are reachable by the amdkfd driver. Output a warning if they are not, > because that may not be what the user was expecting. This would fix the compile-time failure, but it still fails my CI because you introduce a compile-time warning. Can you change it into a runtime warning instead? Neither type of warning is likely to actually reach the person trying to debug the problem, so you still end up with machines that don't do what they should, but I can live with the runtime warning as long as the build doesn't break. I think the proper fix would be to not rely on custom hooks into a particular IOMMU driver, but to instead ensure that the amdgpu driver can do everything it needs through the regular linux/iommu.h interfaces. I realize this is more work, but I wonder if you've tried that, and why it didn't work out. Arnd
Am 2021-03-09 um 3:58 a.m. schrieb Arnd Bergmann: > On Tue, Mar 9, 2021 at 4:23 AM Felix Kuehling <Felix.Kuehling@amd.com> wrote: >> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link >> against the exported functions. If the GPU driver is built-in but the >> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed >> built but does not work: >> >> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device': >> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid' >> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process': >> kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid' >> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend': >> kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device' >> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume': >> kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb' >> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device' >> >> Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols >> are reachable by the amdkfd driver. Output a warning if they are not, >> because that may not be what the user was expecting. > This would fix the compile-time failure, but it still fails my CI > because you introduce > a compile-time warning. Can you change it into a runtime warning instead? Sure. > > Neither type of warning is likely to actually reach the person trying > to debug the > problem, so you still end up with machines that don't do what they should, > but I can live with the runtime warning as long as the build doesn't break. OK. > > I think the proper fix would be to not rely on custom hooks into a particular > IOMMU driver, but to instead ensure that the amdgpu driver can do everything > it needs through the regular linux/iommu.h interfaces. I realize this > is more work, > but I wonder if you've tried that, and why it didn't work out. As far as I know this hasn't been tried. I see that intel-iommu has its own SVM thing, which seems to be similar to what our IOMMUv2 does. I guess we'd have to abstract that into a common API. Regards, Felix > > Arnd
Hi Felix, On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote: > > I think the proper fix would be to not rely on custom hooks into a particular > > IOMMU driver, but to instead ensure that the amdgpu driver can do everything > > it needs through the regular linux/iommu.h interfaces. I realize this > > is more work, > > but I wonder if you've tried that, and why it didn't work out. > > As far as I know this hasn't been tried. I see that intel-iommu has its > own SVM thing, which seems to be similar to what our IOMMUv2 does. I > guess we'd have to abstract that into a common API. The common API was added in 26b25a2b98e4 and implemented by the Intel driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU ops: .dev_has_feat() .dev_feat_enabled() .dev_enable_feat() .dev_disable_feat() .sva_bind() .sva_unbind() .sva_get_pasid() And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA) followed by iommu_sva_bind_device(). If I remember correctly the biggest obstacle for KFD is the PASID allocation, done by the GPU driver instead of the IOMMU driver, but there may be others. Thanks, Jean
On Tue, Mar 9, 2021 at 12:55 PM Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > Hi Felix, > > On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote: > > > I think the proper fix would be to not rely on custom hooks into a particular > > > IOMMU driver, but to instead ensure that the amdgpu driver can do everything > > > it needs through the regular linux/iommu.h interfaces. I realize this > > > is more work, > > > but I wonder if you've tried that, and why it didn't work out. > > > > As far as I know this hasn't been tried. I see that intel-iommu has its > > own SVM thing, which seems to be similar to what our IOMMUv2 does. I > > guess we'd have to abstract that into a common API. > > The common API was added in 26b25a2b98e4 and implemented by the Intel > driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU > ops: > .dev_has_feat() > .dev_feat_enabled() > .dev_enable_feat() > .dev_disable_feat() > .sva_bind() > .sva_unbind() > .sva_get_pasid() > > And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA) > followed by iommu_sva_bind_device(). > > If I remember correctly the biggest obstacle for KFD is the PASID > allocation, done by the GPU driver instead of the IOMMU driver, but there > may be others. IIRC, we tried to make the original IOMMUv2 functionality generic but other vendors were not interested at the time, so it ended up being AMD specific and since nothing else was using the pasid allocations we put them in the GPU driver. I guess if this is generic now, it could be moved to a common API and taken out of the driver. Alex
Am 09.03.21 um 18:59 schrieb Alex Deucher: > On Tue, Mar 9, 2021 at 12:55 PM Jean-Philippe Brucker > <jean-philippe@linaro.org> wrote: >> Hi Felix, >> >> On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote: >>>> I think the proper fix would be to not rely on custom hooks into a particular >>>> IOMMU driver, but to instead ensure that the amdgpu driver can do everything >>>> it needs through the regular linux/iommu.h interfaces. I realize this >>>> is more work, >>>> but I wonder if you've tried that, and why it didn't work out. >>> As far as I know this hasn't been tried. I see that intel-iommu has its >>> own SVM thing, which seems to be similar to what our IOMMUv2 does. I >>> guess we'd have to abstract that into a common API. >> The common API was added in 26b25a2b98e4 and implemented by the Intel >> driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU >> ops: >> .dev_has_feat() >> .dev_feat_enabled() >> .dev_enable_feat() >> .dev_disable_feat() >> .sva_bind() >> .sva_unbind() >> .sva_get_pasid() >> >> And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA) >> followed by iommu_sva_bind_device(). >> >> If I remember correctly the biggest obstacle for KFD is the PASID >> allocation, done by the GPU driver instead of the IOMMU driver, but there >> may be others. > IIRC, we tried to make the original IOMMUv2 functionality generic but > other vendors were not interested at the time, so it ended up being > AMD specific and since nothing else was using the pasid allocations we > put them in the GPU driver. I guess if this is generic now, it could > be moved to a common API and taken out of the driver. There has been quite some effort for this already for generic PASID interface etc.. But it looks like that effort is stalled by now. Anyway at least I'm perfectly fine to have the IOMMUv2 || !IOMMUv2 dependency on the core amdgpu driver for x86. That should solve the build problem at hand quite nicely. Regards, Christian. > > Alex
On Tue, Mar 9, 2021 at 7:34 PM Christian König <christian.koenig@amd.com> wrote: > Am 09.03.21 um 18:59 schrieb Alex Deucher: > > There has been quite some effort for this already for generic PASID > interface etc.. But it looks like that effort is stalled by now. > > Anyway at least I'm perfectly fine to have the IOMMUv2 || !IOMMUv2 > dependency on the core amdgpu driver for x86. > > That should solve the build problem at hand quite nicely. Right, that sounds better than the IS_REACHABLE() hack, and would fix the immediate build regression until the driver is changed to use the proper IOMMU interfaces. Arnd
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c index 66bbca61e3ef..7199eb833f66 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c @@ -20,6 +20,10 @@ * OTHER DEALINGS IN THE SOFTWARE. */ +#include <linux/kconfig.h> + +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2) + #include <linux/printk.h> #include <linux/device.h> #include <linux/slab.h> @@ -355,3 +359,9 @@ int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev) return 0; } + +#else + +#warning "Moldular IOMMU-V2 is not usable by built-in KFD" + +#endif diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h index dd23d9fdf6a8..b25365fc2c4e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h @@ -23,7 +23,9 @@ #ifndef __KFD_IOMMU_H__ #define __KFD_IOMMU_H__ -#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2) +#include <linux/kconfig.h> + +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2) #define KFD_SUPPORT_IOMMU_V2 @@ -73,6 +75,6 @@ static inline int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev) return 0; } -#endif /* defined(CONFIG_AMD_IOMMU_V2) */ +#endif /* IS_REACHABLE(CONFIG_AMD_IOMMU_V2) */ #endif /* __KFD_IOMMU_H__ */
Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link against the exported functions. If the GPU driver is built-in but the IOMMU driver is a loadable module, the kfd_iommu.c file is indeed built but does not work: x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device': kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid' x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process': kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid' x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend': kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device' x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume': kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device' x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid' x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb' x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device' Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols are reachable by the amdkfd driver. Output a warning if they are not, because that may not be what the user was expecting. Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> --- drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 10 ++++++++++ drivers/gpu/drm/amd/amdkfd/kfd_iommu.h | 6 ++++-- 2 files changed, 14 insertions(+), 2 deletions(-)