Message ID | 20241002122422.287276-3-thomas.hellstrom@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/ttm: Add an option to report graphics memory OOM |
On 02.10.24 at 14:24, Thomas Hellström wrote:
> Some graphics APIs differentiate between out-of-graphics-memory and
> out-of-host-memory (system memory). Add a device init flag to
> have -ENOSPC propagated from the resource managers instead of being
> converted to -ENOMEM, to aid driver stacks in determining what
> error code to return or whether corrective action can be taken at
> the driver level.
>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Independent of how we communicate flags to the TTM device init function,
this looks like the right approach to me.

So feel free to add Reviewed-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

> ---
>  drivers/gpu/drm/ttm/ttm_bo.c     |  2 +-
>  drivers/gpu/drm/ttm/ttm_device.c |  1 +
>  include/drm/ttm/ttm_device.h     | 13 +++++++++++++
>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 320592435252..c4bec2ad301b 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -835,7 +835,7 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
>
>  	/* For backward compatibility with userspace */
>  	if (ret == -ENOSPC)
> -		return -ENOMEM;
> +		return bo->bdev->propagate_enospc ? ret : -ENOMEM;
>
>  	/*
>  	 * We might need to add a TTM.
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 0c85d10e5e0b..aee9d52d745b 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -203,6 +203,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>  	}
>
>  	bdev->funcs = funcs;
> +	bdev->propagate_enospc = flags.propagate_enospc;
>
>  	ttm_sys_man_init(bdev);
>
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 1534bd946c78..f9da78bbd925 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -266,6 +266,13 @@ struct ttm_device {
>  	 * @wq: Work queue structure for the delayed delete workqueue.
>  	 */
>  	struct workqueue_struct *wq;
> +
> +	/**
> +	 * @propagate_enospc: Whether -ENOSPC should be propagated to the caller after
> +	 * graphics memory allocation failure. If false, this will be converted to
> +	 * -ENOMEM, which is the default behaviour.
> +	 */
> +	bool propagate_enospc;
>  };
>
>  int ttm_global_swapout(struct ttm_operation_ctx *ctx, gfp_t gfp_flags);
> @@ -295,6 +302,12 @@ struct ttm_device_init_flags {
>  	u32 use_dma_alloc : 1;
>  	/** @use_dma32: If we should use GFP_DMA32 for device memory allocations. */
>  	u32 use_dma32 : 1;
> +	/**
> +	 * @propagate_enospc: Whether -ENOSPC should be propagated to the caller after
> +	 * graphics memory allocation failure. If false, this will be converted to
> +	 * -ENOMEM, which is the default behaviour.
> +	 */
> +	u32 propagate_enospc : 1;
>  };
>
>  int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
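For readers following along, here is a minimal userspace sketch of the behaviour the patch describes. It is not kernel code and not part of the series: struct sketch_device, enum sketch_result, sketch_translate_errno() and sketch_map_error() are hypothetical names made up for illustration. Only the ternary in sketch_translate_errno() is taken from the ttm_bo_validate() hunk; the mapping onto API-level results is an assumption about what a driver stack might choose to do with the propagated error.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single struct ttm_device field the patch adds. */
struct sketch_device {
	bool propagate_enospc;
};

/*
 * Hypothetical API-level results, loosely modelled on graphics APIs that
 * distinguish graphics-memory exhaustion from host-memory exhaustion.
 */
enum sketch_result {
	SKETCH_SUCCESS,
	SKETCH_OUT_OF_HOST_MEMORY,
	SKETCH_OUT_OF_DEVICE_MEMORY,
	SKETCH_UNKNOWN_ERROR,
};

/*
 * Mirrors the patched error path of ttm_bo_validate(): -ENOSPC from a
 * resource manager is passed through only if the device opted in,
 * otherwise it is converted to -ENOMEM as before.
 */
static int sketch_translate_errno(const struct sketch_device *dev, int ret)
{
	if (ret == -ENOSPC)
		return dev->propagate_enospc ? ret : -ENOMEM;

	return ret;
}

/* Assumed driver-side mapping; a real driver stack may decide differently. */
static enum sketch_result sketch_map_error(int ret)
{
	switch (ret) {
	case 0:
		return SKETCH_SUCCESS;
	case -ENOSPC:
		return SKETCH_OUT_OF_DEVICE_MEMORY;
	case -ENOMEM:
		return SKETCH_OUT_OF_HOST_MEMORY;
	default:
		return SKETCH_UNKNOWN_ERROR;
	}
}
```

With propagate_enospc left at its default of false, a resource manager's -ENOSPC still reaches the caller as -ENOMEM, so existing userspace sees no change. Only a device that sets the flag can tell the two conditions apart and, for example, try eviction or report a graphics-memory error instead.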