Message ID | 20240902074859.2992849-2-raag.jadav@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Introduce DRM device wedged event |
On Mon, 02 Sep 2024, Raag Jadav <raag.jadav@intel.com> wrote:
> Introduce device wedged event, which will notify userspace of wedged
> (hanged/unusable) state of the DRM device through a uevent. This is
> useful especially in cases where the device is in unrecoverable state
> and requires userspace intervention for recovery.
>
> Purpose of this implementation is to be vendor agnostic. Userspace
> consumers (sysadmin) can define udev rules to parse this event and
> take respective action to recover the device.
>
> Consumer expectations:
> ----------------------
> 1) Unbind driver
> 2) Reset bus device
> 3) Re-bind driver
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>  drivers/gpu/drm/drm_drv.c | 21 +++++++++++++++++++++
>  include/drm/drm_drv.h     |  1 +
>  2 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 93543071a500..dc55cc237d89 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -499,6 +499,27 @@ void drm_dev_unplug(struct drm_device *dev)
>  }
>  EXPORT_SYMBOL(drm_dev_unplug);
>
> +/**
> + * drm_dev_wedged - declare DRM device as wedged
> + * @dev: DRM device
> + *
> + * This declares a DRM device specified by @dev as wedged (hanged/unusable)
> + * and generates a uevent for it, on the basis of which, userspace may take
> + * respective action to recover the device.
> + * Currently we only set WEDGED=1 in the uevent environment, but this can
> + * be expanded in the future.
> + */
> +void drm_dev_wedged(struct drm_device *dev)
> +{
> +	char *event_string = "WEDGED=1";
> +	char *envp[] = { event_string, NULL };
> +
> +	DRM_INFO("%s: device wedged, generating uevent\n", dev_name(dev->dev));

drm_info() please, and you can drop that handrolled dev_name().

BR,
Jani.

> +
> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_dev_wedged);
> +
>  /*
>   * DRM internal mount
>   * We want to be able to allocate our own "struct address_space" to control
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index cd37936c3926..a0b2d1435b86 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -489,6 +489,7 @@ void drm_put_dev(struct drm_device *dev);
>  bool drm_dev_enter(struct drm_device *dev, int *idx);
>  void drm_dev_exit(int idx);
>  void drm_dev_unplug(struct drm_device *dev);
> +void drm_dev_wedged(struct drm_device *dev);
>
>  /**
>   * drm_dev_is_unplugged - is a DRM device unplugged
On 02/09/24 13:18, Raag Jadav wrote:
> Introduce device wedged event, which will notify userspace of wedged
> (hanged/unusable) state of the DRM device through a uevent. This is
> useful especially in cases where the device is in unrecoverable state
> and requires userspace intervention for recovery.
>
> Purpose of this implementation is to be vendor agnostic. Userspace
> consumers (sysadmin) can define udev rules to parse this event and
> take respective action to recover the device.
>
> Consumer expectations:
> ----------------------
> 1) Unbind driver
> 2) Reset bus device
> 3) Re-bind driver
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>  drivers/gpu/drm/drm_drv.c | 21 +++++++++++++++++++++
>  include/drm/drm_drv.h     |  1 +
>  2 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 93543071a500..dc55cc237d89 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -499,6 +499,27 @@ void drm_dev_unplug(struct drm_device *dev)
>  }
>  EXPORT_SYMBOL(drm_dev_unplug);
>
> +/**
> + * drm_dev_wedged - declare DRM device as wedged
> + * @dev: DRM device
> + *
> + * This declares a DRM device specified by @dev as wedged (hanged/unusable)

this doesn't seem to set any drm state as wedged, it is just sending an
uevent. you might need to correct the above statement.

Thanks,
Aravind.

> + * and generates a uevent for it, on the basis of which, userspace may take
> + * respective action to recover the device.
> + * Currently we only set WEDGED=1 in the uevent environment, but this can
> + * be expanded in the future.
> + */
> +void drm_dev_wedged(struct drm_device *dev)
> +{
> +	char *event_string = "WEDGED=1";
> +	char *envp[] = { event_string, NULL };
> +
> +	DRM_INFO("%s: device wedged, generating uevent\n", dev_name(dev->dev));
> +
> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_dev_wedged);
> +
>  /*
>   * DRM internal mount
>   * We want to be able to allocate our own "struct address_space" to control
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index cd37936c3926..a0b2d1435b86 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -489,6 +489,7 @@ void drm_put_dev(struct drm_device *dev);
>  bool drm_dev_enter(struct drm_device *dev, int *idx);
>  void drm_dev_exit(int idx);
>  void drm_dev_unplug(struct drm_device *dev);
> +void drm_dev_wedged(struct drm_device *dev);
>
>  /**
>   * drm_dev_is_unplugged - is a DRM device unplugged
On Mon, Sep 02, 2024 at 02:44:21PM +0530, Aravind Iddamsetty wrote:
>
> On 02/09/24 13:18, Raag Jadav wrote:
> > Introduce device wedged event, which will notify userspace of wedged
> > (hanged/unusable) state of the DRM device through a uevent. This is
> > useful especially in cases where the device is in unrecoverable state
> > and requires userspace intervention for recovery.
> >
> > Purpose of this implementation is to be vendor agnostic. Userspace
> > consumers (sysadmin) can define udev rules to parse this event and
> > take respective action to recover the device.
> >
> > Consumer expectations:
> > ----------------------
> > 1) Unbind driver
> > 2) Reset bus device
> > 3) Re-bind driver
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> >  drivers/gpu/drm/drm_drv.c | 21 +++++++++++++++++++++
> >  include/drm/drm_drv.h     |  1 +
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 93543071a500..dc55cc237d89 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -499,6 +499,27 @@ void drm_dev_unplug(struct drm_device *dev)
> >  }
> >  EXPORT_SYMBOL(drm_dev_unplug);
> >
> > +/**
> > + * drm_dev_wedged - declare DRM device as wedged
> > + * @dev: DRM device
> > + *
> > + * This declares a DRM device specified by @dev as wedged (hanged/unusable)
> this doesn't seem to set any drm state as wedged, it is just sending an
> uevent. you might need to correct the above statement.

On a second thought, perhaps this warrants any action on drm_device?

Raag
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 93543071a500..dc55cc237d89 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -499,6 +499,27 @@ void drm_dev_unplug(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_dev_unplug);
 
+/**
+ * drm_dev_wedged - declare DRM device as wedged
+ * @dev: DRM device
+ *
+ * This declares a DRM device specified by @dev as wedged (hanged/unusable)
+ * and generates a uevent for it, on the basis of which, userspace may take
+ * respective action to recover the device.
+ * Currently we only set WEDGED=1 in the uevent environment, but this can
+ * be expanded in the future.
+ */
+void drm_dev_wedged(struct drm_device *dev)
+{
+	char *event_string = "WEDGED=1";
+	char *envp[] = { event_string, NULL };
+
+	DRM_INFO("%s: device wedged, generating uevent\n", dev_name(dev->dev));
+
+	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_dev_wedged);
+
 /*
  * DRM internal mount
  * We want to be able to allocate our own "struct address_space" to control
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index cd37936c3926..a0b2d1435b86 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -489,6 +489,7 @@ void drm_put_dev(struct drm_device *dev);
 bool drm_dev_enter(struct drm_device *dev, int *idx);
 void drm_dev_exit(int idx);
 void drm_dev_unplug(struct drm_device *dev);
+void drm_dev_wedged(struct drm_device *dev);
 
 /**
  * drm_dev_is_unplugged - is a DRM device unplugged
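For anyone writing a consumer for this event, the sketch below shows one possible way to check a raw uevent payload for the WEDGED=1 entry that drm_dev_wedged() hands to kobject_uevent_env(). It assumes the payload is the usual sequence of NUL-terminated strings as read from the kernel uevent netlink socket. The helper name uevent_has_entry() is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return true if the raw
 * uevent payload contains a string exactly equal to @entry, for example
 * "WEDGED=1". The payload is assumed to be a run of NUL-terminated
 * strings; a missing final NUL is tolerated.
 */
bool uevent_has_entry(const char *buf, size_t len, const char *entry)
{
	size_t entry_len;
	size_t pos = 0;

	assert(buf != NULL && entry != NULL);
	entry_len = strlen(entry);

	while (pos < len) {
		/* Find the end of the current string within the buffer. */
		const char *end = memchr(buf + pos, '\0', len - pos);
		size_t n = end ? (size_t)(end - (buf + pos)) : len - pos;

		/* Exact match only, so "WEDGED" does not match "WEDGED=1". */
		if (n == entry_len && memcmp(buf + pos, entry, n) == 0)
			return true;

		pos += n + 1; /* skip the string and its NUL terminator */
	}

	return false;
}
```

A consumer would call uevent_has_entry(buf, len, "WEDGED=1") on each message received for the DRM device before starting any recovery. In practice a udev rule matching on the same environment variable, as the commit message suggests, may be the simpler route.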
Introduce device wedged event, which will notify userspace of wedged
(hanged/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is in unrecoverable state
and requires userspace intervention for recovery.

Purpose of this implementation is to be vendor agnostic. Userspace
consumers (sysadmin) can define udev rules to parse this event and
take respective action to recover the device.

Consumer expectations:
----------------------
1) Unbind driver
2) Reset bus device
3) Re-bind driver

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
 drivers/gpu/drm/drm_drv.c | 21 +++++++++++++++++++++
 include/drm/drm_drv.h     |  1 +
 2 files changed, 22 insertions(+)
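The three consumer expectations above map naturally onto sysfs writes. As a rough sketch, and assuming the wedged device sits on the PCI bus (the patch itself does not assume any particular bus), the helper below builds the three sysfs paths a recovery tool would write to, in order. The struct and function names are hypothetical; the driver name and PCI address are supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the sysfs files used by the three steps. */
struct wedge_recovery_paths {
	char unbind[256]; /* step 1: write the PCI address here */
	char reset[256];  /* step 2: write "1" here */
	char bind[256];   /* step 3: write the PCI address here */
};

/* Format one path; return 0 on success, -1 on error or truncation. */
static int format_path(char *dst, size_t size, const char *fmt, const char *arg)
{
	int n = snprintf(dst, size, fmt, arg);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Fill @p for a PCI device at address @bdf (e.g. "0000:03:00.0") bound to
 * @driver. Returns 0 on success, -1 on invalid input or truncation.
 */
int wedge_recovery_paths_init(struct wedge_recovery_paths *p,
			      const char *driver, const char *bdf)
{
	assert(p != NULL && driver != NULL && bdf != NULL);

	/* Reject empty names and anything containing a path separator. */
	if (!*driver || !*bdf || strchr(driver, '/') || strchr(bdf, '/'))
		return -1;

	if (format_path(p->unbind, sizeof(p->unbind),
			"/sys/bus/pci/drivers/%s/unbind", driver))
		return -1;
	if (format_path(p->reset, sizeof(p->reset),
			"/sys/bus/pci/devices/%s/reset", bdf))
		return -1;
	if (format_path(p->bind, sizeof(p->bind),
			"/sys/bus/pci/drivers/%s/bind", driver))
		return -1;

	return 0;
}
```

A recovery tool would then write the PCI address to the unbind path, write 1 to the reset path, and write the PCI address to the bind path. Whether the reset file exists depends on the device supporting a reset method, so a real consumer should check for it before unbinding.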