[2/6] drm/drv: Prepare to remove drm_dev_unplug()

Message ID 20190203154200.61479-3-noralf@tronnes.org (mailing list archive)
State New, archived
Series drm/drv: Remove drm_dev_unplug()

Commit Message

Noralf Trønnes Feb. 3, 2019, 3:41 p.m. UTC
The only thing now that makes drm_dev_unplug() special is that it sets
drm_device->unplugged. Move this code to drm_dev_unregister() so that we
can remove drm_dev_unplug().

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
---

Maybe s/unplugged/unregistered/ ?

I looked at drm_device->registered, but using that would mean that
drm_dev_is_unplugged() would return true before drm_device is registered.
And given that its current purpose is to prevent race against connector
registration, I stayed away from it.

Noralf.


 drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
 include/drm/drm_drv.h     | 10 ++++------
 2 files changed, 19 insertions(+), 18 deletions(-)
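
To make the effect of the move concrete, here is a minimal userspace model of
it. It is only a sketch: the model_* names are hypothetical stand-ins rather
than the kernel's types or functions, it is single-threaded, and the
synchronize_srcu() step is noted in a comment rather than modeled. The point
it illustrates is that once unregistering also sets ->unplugged, any hardware
access guarded by an enter/exit pair is refused afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; these are not the kernel's types or functions. */
struct model_dev {
	bool registered;
	bool unplugged;
	int hw_writes;	/* counts accesses that reached the "hardware" */
};

/* Models drm_dev_enter(): refuse entry once the device is unplugged. */
static bool model_dev_enter(const struct model_dev *dev)
{
	return !dev->unplugged;
}

/* Models drm_dev_exit(); the real one ends an SRCU read-side section. */
static void model_dev_exit(const struct model_dev *dev)
{
	(void)dev;
}

/* With this patch, unregistering also marks the device unplugged. */
static void model_dev_unregister(struct model_dev *dev)
{
	assert(dev->registered);
	dev->unplugged = true;
	/* The kernel calls synchronize_srcu() here; not modeled. */
	dev->registered = false;
}

/* A driver path that touches hardware only inside enter/exit. */
static bool model_hw_write(struct model_dev *dev)
{
	if (!model_dev_enter(dev))
		return false;
	dev->hw_writes++;
	model_dev_exit(dev);
	return true;
}
```

Calling model_hw_write() before model_dev_unregister() succeeds, and calling
it afterwards is rejected without touching the counter.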

Comments

Oleksandr Andrushchenko Feb. 4, 2019, 10:19 a.m. UTC | #1
On 2/3/19 5:41 PM, Noralf Trønnes wrote:
> The only thing now that makes drm_dev_unplug() special is that it sets
> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> can remove drm_dev_unplug().
>
> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
>
> Maybe s/unplugged/unregistered/ ?
>
> I looked at drm_device->registered, but using that would mean that
> drm_dev_is_unplugged() would return before drm_device is registered.
> And given that its current purpose is to prevent race against connector
> registration, I stayed away from it.
>
> Noralf.
>
>
>   drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>   include/drm/drm_drv.h     | 10 ++++------
>   2 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 05bbc2b622fc..e0941200edc6 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>    */
>   void drm_dev_unplug(struct drm_device *dev)
>   {
> -	/*
> -	 * After synchronizing any critical read section is guaranteed to see
> -	 * the new value of ->unplugged, and any critical section which might
> -	 * still have seen the old value of ->unplugged is guaranteed to have
> -	 * finished.
> -	 */
> -	dev->unplugged = true;
> -	synchronize_srcu(&drm_unplug_srcu);
> -
>   	drm_dev_unregister(dev);
>   	drm_dev_put(dev);
>   }
> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>    * drm_dev_register() but does not deallocate the device. The caller must call
>    * drm_dev_put() to drop their final reference.
>    *
> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> - * which can be called while there are still open users of @dev.
> + * This function can be called while there are still open users of @dev as long
> + * as the driver protects its device resources using drm_dev_enter() and
> + * drm_dev_exit().
>    *
>    * This should be called first in the device teardown code to make sure
> - * userspace can't access the device instance any more.
> + * userspace can't access the device instance any more. Drivers that support
> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> + * in order to disable the hardware on regular driver module unload.
>    */
>   void drm_dev_unregister(struct drm_device *dev)
>   {
> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>   	if (drm_core_check_feature(dev, DRIVER_LEGACY))
>   		drm_lastclose(dev);
>   
> +	/*
> +	 * After synchronizing any critical read section is guaranteed to see
> +	 * the new value of ->unplugged, and any critical section which might
> +	 * still have seen the old value of ->unplugged is guaranteed to have
> +	 * finished.
> +	 */
> +	dev->unplugged = true;
> +	synchronize_srcu(&drm_unplug_srcu);
> +
>   	dev->registered = false;
>   
>   	drm_client_dev_unregister(dev);
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index ca46a45a9cce..c50696c82a42 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>    * drm_dev_is_unplugged - is a DRM device unplugged
>    * @dev: DRM device
>    *
> - * This function can be called to check whether a hotpluggable is unplugged.
> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> - * unplugged, these two functions guarantee that any store before calling
> - * drm_dev_unplug() is visible to callers of this function after it completes
> + * This function can be called to check whether @dev is unregistered. This can
> + * be used to detect that the underlying parent device is gone.
>    *
> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> - * recommended that drivers instead use the underlying drm_dev_enter() and
> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>    * drm_dev_exit() function pairs.
>    */
>   static inline bool drm_dev_is_unplugged(struct drm_device *dev)
Daniel Vetter Feb. 4, 2019, 3:41 p.m. UTC | #2
On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> The only thing now that makes drm_dev_unplug() special is that it sets
> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> can remove drm_dev_unplug().
> 
> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> ---
> 
> Maybe s/unplugged/unregistered/ ?
> 
> I looked at drm_device->registered, but using that would mean that
> drm_dev_is_unplugged() would return before drm_device is registered.
> And given that its current purpose is to prevent race against connector
> registration, I stayed away from it.

Yeah I think we need to keep the registered state separate from unplugged.
IIRC this exact scenario is what we discussed when you revamped the
unplug infrastructure.

> 
> Noralf.
> 
> 
>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>  include/drm/drm_drv.h     | 10 ++++------
>  2 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 05bbc2b622fc..e0941200edc6 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>   */
>  void drm_dev_unplug(struct drm_device *dev)
>  {
> -	/*
> -	 * After synchronizing any critical read section is guaranteed to see
> -	 * the new value of ->unplugged, and any critical section which might
> -	 * still have seen the old value of ->unplugged is guaranteed to have
> -	 * finished.
> -	 */
> -	dev->unplugged = true;
> -	synchronize_srcu(&drm_unplug_srcu);
> -
>  	drm_dev_unregister(dev);
>  	drm_dev_put(dev);
>  }
> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>   * drm_dev_register() but does not deallocate the device. The caller must call
>   * drm_dev_put() to drop their final reference.
>   *
> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> - * which can be called while there are still open users of @dev.
> + * This function can be called while there are still open users of @dev as long
> + * as the driver protects its device resources using drm_dev_enter() and
> + * drm_dev_exit().
>   *
>   * This should be called first in the device teardown code to make sure
> - * userspace can't access the device instance any more.
> + * userspace can't access the device instance any more. Drivers that support
> + * device unplug will probably want to call drm_atomic_helper_shutdown() first

Read once more with a bit more coffee, spotted this:

s/first/afterwards/ - shutting down the hw before we've taken it away from
userspace is kinda the wrong way round. It should be the inverse of driver
load, which is 1) allocate structures 2) prep hw 3) register driver with
the world (simplified ofc).
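
The ordering argument can be sketched as a small userspace model that records
each step in a log. The function names and step labels are illustrative
assumptions, not DRM API; the sketch only shows teardown mirroring load in
reverse.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Comma-separated record of the steps taken, in order. */
struct step_log {
	char buf[128];
};

static void log_step(struct step_log *log, const char *step)
{
	size_t used = strlen(log->buf);

	assert(used < sizeof(log->buf));
	snprintf(log->buf + used, sizeof(log->buf) - used, "%s%s",
		 used ? "," : "", step);
}

/* Load: 1) allocate structures 2) prep hw 3) register with the world. */
static void model_load(struct step_log *log)
{
	log_step(log, "alloc");
	log_step(log, "hw_init");
	log_step(log, "register");
}

/* Unload is the inverse: unregister first, then shut down hw, then free. */
static void model_unload(struct step_log *log)
{
	log_step(log, "unregister");
	log_step(log, "hw_shutdown");
	log_step(log, "free");
}
```

Running model_load() followed by model_unload() on an empty log records the
six steps with the second half in reverse order of the first.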

> + * in order to disable the hardware on regular driver module unload.
>   */
>  void drm_dev_unregister(struct drm_device *dev)
>  {
> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
>  		drm_lastclose(dev);
>  
> +	/*
> +	 * After synchronizing any critical read section is guaranteed to see
> +	 * the new value of ->unplugged, and any critical section which might
> +	 * still have seen the old value of ->unplugged is guaranteed to have
> +	 * finished.
> +	 */
> +	dev->unplugged = true;
> +	synchronize_srcu(&drm_unplug_srcu);
> +
>  	dev->registered = false;
>  
>  	drm_client_dev_unregister(dev);
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index ca46a45a9cce..c50696c82a42 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>   * drm_dev_is_unplugged - is a DRM device unplugged
>   * @dev: DRM device
>   *
> - * This function can be called to check whether a hotpluggable is unplugged.
> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> - * unplugged, these two functions guarantee that any store before calling
> - * drm_dev_unplug() is visible to callers of this function after it completes
> + * This function can be called to check whether @dev is unregistered. This can
> + * be used to detect that the underlying parent device is gone.

I think it'd be good to keep the first part, and just update the reference
to drm_dev_unregister. So:

 * This function can be called to check whether a hotpluggable device is
 * unplugged. Unplugging itself is signalled through drm_dev_unregister(). If a
 * device is unplugged, these two functions guarantee that any store before
 * calling drm_dev_unregister() is visible to callers of this function after it
 * completes.

I think your version sweeps a few important details under the rug. With
those nits addressed:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Cheers, Daniel

>   *
> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> - * recommended that drivers instead use the underlying drm_dev_enter() and
> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>   * drm_dev_exit() function pairs.
>   */
>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
> -- 
> 2.20.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Noralf Trønnes Feb. 4, 2019, 5:35 p.m. UTC | #3
Den 04.02.2019 16.41, skrev Daniel Vetter:
> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
>> The only thing now that makes drm_dev_unplug() special is that it sets
>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
>> can remove drm_dev_unplug().
>>
>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
>> ---

[...]

>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>>  include/drm/drm_drv.h     | 10 ++++------
>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 05bbc2b622fc..e0941200edc6 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>>   */
>>  void drm_dev_unplug(struct drm_device *dev)
>>  {
>> -	/*
>> -	 * After synchronizing any critical read section is guaranteed to see
>> -	 * the new value of ->unplugged, and any critical section which might
>> -	 * still have seen the old value of ->unplugged is guaranteed to have
>> -	 * finished.
>> -	 */
>> -	dev->unplugged = true;
>> -	synchronize_srcu(&drm_unplug_srcu);
>> -
>>  	drm_dev_unregister(dev);
>>  	drm_dev_put(dev);
>>  }
>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>>   * drm_dev_register() but does not deallocate the device. The caller must call
>>   * drm_dev_put() to drop their final reference.
>>   *
>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
>> - * which can be called while there are still open users of @dev.
>> + * This function can be called while there are still open users of @dev as long
>> + * as the driver protects its device resources using drm_dev_enter() and
>> + * drm_dev_exit().
>>   *
>>   * This should be called first in the device teardown code to make sure
>> - * userspace can't access the device instance any more.
>> + * userspace can't access the device instance any more. Drivers that support
>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> 
> Read once more with a bit more coffee, spotted this:
> 
> s/first/afterwards/ - shutting down the hw before we've taken it away from
> userspace is kinda the wrong way round. It should be the inverse of driver
> load, which is 1) allocate structures 2) prep hw 3) register driver with
> the world (simplified ofc).
> 

The problem is that drm_dev_unregister() sets the device as unplugged
and if drm_atomic_helper_shutdown() is called afterwards it's not
allowed to touch hardware.

I know it's the wrong order, but the only way to do it in the right
order is to have a separate function that sets unplugged:

	drm_dev_unregister();
	drm_atomic_helper_shutdown();
	drm_dev_set_unplugged();
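
The conflict can be shown with a small userspace model. This is a sketch under
assumptions: the ord_* names are hypothetical, ord_set_unplugged() stands in
for the proposed drm_dev_set_unplugged(), and the shutdown step is assumed to
guard its hardware access with the unplugged check.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; not the kernel's drm_device. */
struct ord_dev {
	bool registered;
	bool unplugged;
	bool hw_enabled;
};

/* Hypothetical stand-in for the proposed drm_dev_set_unplugged(). */
static void ord_set_unplugged(struct ord_dev *dev)
{
	dev->unplugged = true;
}

/* Unregister as in this patch: also marks the device unplugged. */
static void ord_unregister_patched(struct ord_dev *dev)
{
	assert(dev->registered);
	ord_set_unplugged(dev);
	dev->registered = false;
}

/* Unregister that only removes the userspace interface. */
static void ord_unregister_only(struct ord_dev *dev)
{
	assert(dev->registered);
	dev->registered = false;
}

/* Models a shutdown whose hardware access is guarded by the unplugged check. */
static void ord_shutdown(struct ord_dev *dev)
{
	if (dev->unplugged)
		return;	/* not allowed to touch hardware */
	dev->hw_enabled = false;
}
```

With ord_unregister_patched() followed by ord_shutdown(), the hardware stays
enabled because the shutdown is refused. With ord_unregister_only(),
ord_shutdown(), then ord_set_unplugged(), the hardware is disabled before the
device is marked unplugged.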

Noralf.

>> + * in order to disable the hardware on regular driver module unload.
>>   */
>>  void drm_dev_unregister(struct drm_device *dev)
>>  {
>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
>>  		drm_lastclose(dev);
>>  
>> +	/*
>> +	 * After synchronizing any critical read section is guaranteed to see
>> +	 * the new value of ->unplugged, and any critical section which might
>> +	 * still have seen the old value of ->unplugged is guaranteed to have
>> +	 * finished.
>> +	 */
>> +	dev->unplugged = true;
>> +	synchronize_srcu(&drm_unplug_srcu);
>> +
>>  	dev->registered = false;
>>  
>>  	drm_client_dev_unregister(dev);
>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>> index ca46a45a9cce..c50696c82a42 100644
>> --- a/include/drm/drm_drv.h
>> +++ b/include/drm/drm_drv.h
>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>>   * drm_dev_is_unplugged - is a DRM device unplugged
>>   * @dev: DRM device
>>   *
>> - * This function can be called to check whether a hotpluggable is unplugged.
>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
>> - * unplugged, these two functions guarantee that any store before calling
>> - * drm_dev_unplug() is visible to callers of this function after it completes
>> + * This function can be called to check whether @dev is unregistered. This can
>> + * be used to detect that the underlying parent device is gone.
> 
> I think it'd be good to keep the first part, and just update the reference
> to drm_dev_unregister. So:
> 
>  * This function can be called to check whether a hotpluggable is unplugged.
>  * Unplugging itself is singalled through drm_dev_unregister(). If a device is
>  * unplugged, these two functions guarantee that any store before calling
>  * drm_dev_unregister() is visible to callers of this function after it
>  * completes.
> 
> I think your version shrugs a few important details under the rug. With
> those nits addressed:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> Cheers, Daniel
> 
>>   *
>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
>> - * recommended that drivers instead use the underlying drm_dev_enter() and
>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>>   * drm_dev_exit() function pairs.
>>   */
>>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
>> -- 
>> 2.20.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
Daniel Vetter Feb. 5, 2019, 9:11 a.m. UTC | #4
On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
> 
> 
> Den 04.02.2019 16.41, skrev Daniel Vetter:
> > On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> >> The only thing now that makes drm_dev_unplug() special is that it sets
> >> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> >> can remove drm_dev_unplug().
> >>
> >> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> >> ---
> 
> [...]
> 
> >>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
> >>  include/drm/drm_drv.h     | 10 ++++------
> >>  2 files changed, 19 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >> index 05bbc2b622fc..e0941200edc6 100644
> >> --- a/drivers/gpu/drm/drm_drv.c
> >> +++ b/drivers/gpu/drm/drm_drv.c
> >> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
> >>   */
> >>  void drm_dev_unplug(struct drm_device *dev)
> >>  {
> >> -	/*
> >> -	 * After synchronizing any critical read section is guaranteed to see
> >> -	 * the new value of ->unplugged, and any critical section which might
> >> -	 * still have seen the old value of ->unplugged is guaranteed to have
> >> -	 * finished.
> >> -	 */
> >> -	dev->unplugged = true;
> >> -	synchronize_srcu(&drm_unplug_srcu);
> >> -
> >>  	drm_dev_unregister(dev);
> >>  	drm_dev_put(dev);
> >>  }
> >> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
> >>   * drm_dev_register() but does not deallocate the device. The caller must call
> >>   * drm_dev_put() to drop their final reference.
> >>   *
> >> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> >> - * which can be called while there are still open users of @dev.
> >> + * This function can be called while there are still open users of @dev as long
> >> + * as the driver protects its device resources using drm_dev_enter() and
> >> + * drm_dev_exit().
> >>   *
> >>   * This should be called first in the device teardown code to make sure
> >> - * userspace can't access the device instance any more.
> >> + * userspace can't access the device instance any more. Drivers that support
> >> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> > 
> > Read once more with a bit more coffee, spotted this:
> > 
> > s/first/afterwards/ - shutting down the hw before we've taken it away from
> > userspace is kinda the wrong way round. It should be the inverse of driver
> > load, which is 1) allocate structures 2) prep hw 3) register driver with
> > the world (simplified ofc).
> > 
> 
> The problem is that drm_dev_unregister() sets the device as unplugged
> and if drm_atomic_helper_shutdown() is called afterwards it's not
> allowed to touch hardware.
> 
> I know it's the wrong order, but the only way to do it in the right
> order is to have a separate function that sets unplugged:
> 
> 	drm_dev_unregister();
> 	drm_atomic_helper_shutdown();
> 	drm_dev_set_unplugged();

Annoying ... but yeah, calling _shutdown() before we've stopped userspace is
also not going to work, because userspace could quickly re-enable
something, and then the refcounts would be all wrong again and we'd leak
objects.

I'm getting the feeling that we're over-optimizing here by trying to devm-ize
drm_dev_register. Just getting drm_device correctly devm-ized is a big
step forward already, and will open up a lot of TODO items across a lot of
drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
structs, which gets released together with drm_device. I think that's a
much clearer path forward. I think we all agree that getting the kfree out
of the driver code is a good thing, and it would allow us to do this
correctly.
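
A rough userspace sketch of the drm_dev_kzalloc idea: allocations are linked
to the device and freed in one place when the device is released, so drivers
need no kfree of their own. The managed_* names are hypothetical, and calloc()
and free() stand in for the kernel allocator. This shows the shape of the
idea, not an actual DRM interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every payload; the union keeps it aligned. */
union managed_hdr {
	union managed_hdr *next;
	max_align_t align;
};

/* Userspace model of a device that owns its allocations. */
struct managed_dev {
	union managed_hdr *allocs;
};

/* Zeroed allocation whose lifetime is tied to @dev. */
static void *managed_kzalloc(struct managed_dev *dev, size_t size)
{
	union managed_hdr *hdr;

	assert(size > 0);
	hdr = calloc(1, sizeof(*hdr) + size);
	if (!hdr)
		return NULL;
	hdr->next = dev->allocs;
	dev->allocs = hdr;
	return hdr + 1;	/* payload starts right after the header */
}

/* Frees everything allocated against @dev; returns the number freed. */
static int managed_release(struct managed_dev *dev)
{
	int freed = 0;

	while (dev->allocs) {
		union managed_hdr *hdr = dev->allocs;

		dev->allocs = hdr->next;
		free(hdr);
		freed++;
	}
	return freed;
}
```

A driver in this model would call managed_kzalloc() for each of its structs
and never free them itself; the single managed_release() call at the end of
the device's lifetime does it.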

Then once we have that and have rolled it out to a few drivers, we can
reconsider the entire unregister/shutdown Gordian knot here. Atm I just have
no idea how to do this properly :-/

Thoughts, other ideas?

Cheers, Daniel

> Noralf.
> 
> >> + * in order to disable the hardware on regular driver module unload.
> >>   */
> >>  void drm_dev_unregister(struct drm_device *dev)
> >>  {
> >> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
> >>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
> >>  		drm_lastclose(dev);
> >>  
> >> +	/*
> >> +	 * After synchronizing any critical read section is guaranteed to see
> >> +	 * the new value of ->unplugged, and any critical section which might
> >> +	 * still have seen the old value of ->unplugged is guaranteed to have
> >> +	 * finished.
> >> +	 */
> >> +	dev->unplugged = true;
> >> +	synchronize_srcu(&drm_unplug_srcu);
> >> +
> >>  	dev->registered = false;
> >>  
> >>  	drm_client_dev_unregister(dev);
> >> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> >> index ca46a45a9cce..c50696c82a42 100644
> >> --- a/include/drm/drm_drv.h
> >> +++ b/include/drm/drm_drv.h
> >> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
> >>   * drm_dev_is_unplugged - is a DRM device unplugged
> >>   * @dev: DRM device
> >>   *
> >> - * This function can be called to check whether a hotpluggable is unplugged.
> >> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> >> - * unplugged, these two functions guarantee that any store before calling
> >> - * drm_dev_unplug() is visible to callers of this function after it completes
> >> + * This function can be called to check whether @dev is unregistered. This can
> >> + * be used to detect that the underlying parent device is gone.
> > 
> > I think it'd be good to keep the first part, and just update the reference
> > to drm_dev_unregister. So:
> > 
> >  * This function can be called to check whether a hotpluggable is unplugged.
> >  * Unplugging itself is singalled through drm_dev_unregister(). If a device is
> >  * unplugged, these two functions guarantee that any store before calling
> >  * drm_dev_unregister() is visible to callers of this function after it
> >  * completes.
> > 
> > I think your version shrugs a few important details under the rug. With
> > those nits addressed:
> > 
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > 
> > Cheers, Daniel
> > 
> >>   *
> >> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> >> - * recommended that drivers instead use the underlying drm_dev_enter() and
> >> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> >> + * is recommended that drivers instead use the underlying drm_dev_enter() and
> >>   * drm_dev_exit() function pairs.
> >>   */
> >>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
> >> -- 
> >> 2.20.1
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
Noralf Trønnes Feb. 5, 2019, 10:20 a.m. UTC | #5
Den 05.02.2019 10.11, skrev Daniel Vetter:
> On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
>>
>>
>> Den 04.02.2019 16.41, skrev Daniel Vetter:
>>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
>>>> The only thing now that makes drm_dev_unplug() special is that it sets
>>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
>>>> can remove drm_dev_unplug().
>>>>
>>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
>>>> ---
>>
>> [...]
>>
>>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>>>>  include/drm/drm_drv.h     | 10 ++++------
>>>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>> index 05bbc2b622fc..e0941200edc6 100644
>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>>>>   */
>>>>  void drm_dev_unplug(struct drm_device *dev)
>>>>  {
>>>> -	/*
>>>> -	 * After synchronizing any critical read section is guaranteed to see
>>>> -	 * the new value of ->unplugged, and any critical section which might
>>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
>>>> -	 * finished.
>>>> -	 */
>>>> -	dev->unplugged = true;
>>>> -	synchronize_srcu(&drm_unplug_srcu);
>>>> -
>>>>  	drm_dev_unregister(dev);
>>>>  	drm_dev_put(dev);
>>>>  }
>>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>>>>   * drm_dev_register() but does not deallocate the device. The caller must call
>>>>   * drm_dev_put() to drop their final reference.
>>>>   *
>>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
>>>> - * which can be called while there are still open users of @dev.
>>>> + * This function can be called while there are still open users of @dev as long
>>>> + * as the driver protects its device resources using drm_dev_enter() and
>>>> + * drm_dev_exit().
>>>>   *
>>>>   * This should be called first in the device teardown code to make sure
>>>> - * userspace can't access the device instance any more.
>>>> + * userspace can't access the device instance any more. Drivers that support
>>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
>>>
>>> Read once more with a bit more coffee, spotted this:
>>>
>>> s/first/afterwards/ - shutting down the hw before we've taken it away from
>>> userspace is kinda the wrong way round. It should be the inverse of driver
>>> load, which is 1) allocate structures 2) prep hw 3) register driver with
>>> the world (simplified ofc).
>>>
>>
>> The problem is that drm_dev_unregister() sets the device as unplugged
>> and if drm_atomic_helper_shutdown() is called afterwards it's not
>> allowed to touch hardware.
>>
>> I know it's the wrong order, but the only way to do it in the right
>> order is to have a separate function that sets unplugged:
>>
>> 	drm_dev_unregister();
>> 	drm_atomic_helper_shutdown();
>> 	drm_dev_set_unplugged();
> 
> Annoying ... but yeah calling _shutdown() before we stopped userspace is
> also not going to work. Because userspace could quickly re-enable
> something, and then the refcounts would be all wrong again and leaking
> objects.
> 

What happens with a USB device that is unplugged while userspace still has
it open? Will that leak objects?

> I get a bit the feeling we're over-optimizing here with trying to devm-ize
> drm_dev_register. Just getting drm_device correctly devm-ized is a big
> step forward already, and will open up a lot of TODO items across a lot of
> drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
> structs, which gets released together with drm_device. I think that's a
> much clearer path forward, I think we all agree that getting the kfree out
> of the driver codes is a good thing, and it would allow us to do this
> correctly.
> 
> Then once we have that and rolled out to a few drivers we can reconsider
> the entire unregister/shutdown gordian knot here. Atm I just have no idea
> how to do this properly :-/
> 
> Thoughts, other ideas?
> 

Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
make much sense if we need a driver remove callback anyways.

I think devm_drm_dev_init() makes sense because it yields a cleaner
probe() function. An additional benefit is that it requires a
drm_driver->release function which is a step in the right direction to
get the drm_device lifetime right.

Do we agree that a drm_dev_set_unplugged() function is necessary to get
the remove/disconnect order right?

What about drm_dev_unplug()? Maybe I should just leave it be.

- amd uses drm_driver->unload, so that one takes some work to get right
  to support unplug. It doesn't check the unplugged state, so really
  doesn't need drm_dev_unplug() I guess. Do they have cards that can be
  hotplugged?
- udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
  It has only one drm_dev_is_unplugged() check and that is in its
  fbdev->open callback.
- xen calls drm_atomic_helper_shutdown() in its drm_driver->release
  callback which I guess is not correct.

This is what it will look like with a drm_dev_set_unplugged() function:

void drm_dev_unplug(struct drm_device *dev)
{
	drm_dev_set_unplugged(dev);
	drm_dev_unregister(dev);
	drm_dev_put(dev);
}

Hm, I should probably remove it to avoid further use of it since it's
not correct for a modern driver.

Noralf.

> Cheers, Daniel
> 
>> Noralf.
>>
>>>> + * in order to disable the hardware on regular driver module unload.
>>>>   */
>>>>  void drm_dev_unregister(struct drm_device *dev)
>>>>  {
>>>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>>>>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
>>>>  		drm_lastclose(dev);
>>>>  
>>>> +	/*
>>>> +	 * After synchronizing any critical read section is guaranteed to see
>>>> +	 * the new value of ->unplugged, and any critical section which might
>>>> +	 * still have seen the old value of ->unplugged is guaranteed to have
>>>> +	 * finished.
>>>> +	 */
>>>> +	dev->unplugged = true;
>>>> +	synchronize_srcu(&drm_unplug_srcu);
>>>> +
>>>>  	dev->registered = false;
>>>>  
>>>>  	drm_client_dev_unregister(dev);
>>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>>>> index ca46a45a9cce..c50696c82a42 100644
>>>> --- a/include/drm/drm_drv.h
>>>> +++ b/include/drm/drm_drv.h
>>>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>>>>   * drm_dev_is_unplugged - is a DRM device unplugged
>>>>   * @dev: DRM device
>>>>   *
>>>> - * This function can be called to check whether a hotpluggable is unplugged.
>>>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
>>>> - * unplugged, these two functions guarantee that any store before calling
>>>> - * drm_dev_unplug() is visible to callers of this function after it completes
>>>> + * This function can be called to check whether @dev is unregistered. This can
>>>> + * be used to detect that the underlying parent device is gone.
>>>
>>> I think it'd be good to keep the first part, and just update the reference
>>> to drm_dev_unregister. So:
>>>
>>>  * This function can be called to check whether a hotpluggable is unplugged.
>>>  * Unplugging itself is singalled through drm_dev_unregister(). If a device is
>>>  * unplugged, these two functions guarantee that any store before calling
>>>  * drm_dev_unregister() is visible to callers of this function after it
>>>  * completes.
>>>
>>> I think your version shrugs a few important details under the rug. With
>>> those nits addressed:
>>>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>
>>> Cheers, Daniel
>>>
>>>>   *
>>>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
>>>> - * recommended that drivers instead use the underlying drm_dev_enter() and
>>>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
>>>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>>>>   * drm_dev_exit() function pairs.
>>>>   */
>>>>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
>>>> -- 
>>>> 2.20.1
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>
Daniel Vetter Feb. 5, 2019, 4:31 p.m. UTC | #6
On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
> 
> 
> Den 05.02.2019 10.11, skrev Daniel Vetter:
> > On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
> >>
> >>
> >> Den 04.02.2019 16.41, skrev Daniel Vetter:
> >>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> >>>> The only thing now that makes drm_dev_unplug() special is that it sets
> >>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> >>>> can remove drm_dev_unplug().
> >>>>
> >>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> >>>> ---
> >>
> >> [...]
> >>
> >>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
> >>>>  include/drm/drm_drv.h     | 10 ++++------
> >>>>  2 files changed, 19 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>> index 05bbc2b622fc..e0941200edc6 100644
> >>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
> >>>>   */
> >>>>  void drm_dev_unplug(struct drm_device *dev)
> >>>>  {
> >>>> -	/*
> >>>> -	 * After synchronizing any critical read section is guaranteed to see
> >>>> -	 * the new value of ->unplugged, and any critical section which might
> >>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
> >>>> -	 * finished.
> >>>> -	 */
> >>>> -	dev->unplugged = true;
> >>>> -	synchronize_srcu(&drm_unplug_srcu);
> >>>> -
> >>>>  	drm_dev_unregister(dev);
> >>>>  	drm_dev_put(dev);
> >>>>  }
> >>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
> >>>>   * drm_dev_register() but does not deallocate the device. The caller must call
> >>>>   * drm_dev_put() to drop their final reference.
> >>>>   *
> >>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> >>>> - * which can be called while there are still open users of @dev.
> >>>> + * This function can be called while there are still open users of @dev as long
> >>>> + * as the driver protects its device resources using drm_dev_enter() and
> >>>> + * drm_dev_exit().
> >>>>   *
> >>>>   * This should be called first in the device teardown code to make sure
> >>>> - * userspace can't access the device instance any more.
> >>>> + * userspace can't access the device instance any more. Drivers that support
> >>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> >>>
> >>> Read once more with a bit more coffee, spotted this:
> >>>
> >>> s/first/afterwards/ - shutting down the hw before we've taken it away from
> >>> userspace is kinda the wrong way round. It should be the inverse of driver
> >>> load, which is 1) allocate structures 2) prep hw 3) register driver with
> >>> the world (simplified ofc).
> >>>
> >>
> >> The problem is that drm_dev_unregister() sets the device as unplugged
> >> and if drm_atomic_helper_shutdown() is called afterwards it's not
> >> allowed to touch hardware.
> >>
> >> I know it's the wrong order, but the only way to do it in the right
> >> order is to have a separate function that sets unplugged:
> >>
> >> 	drm_dev_unregister();
> >> 	drm_atomic_helper_shutdown();
> >> 	drm_dev_set_unplugged();
> > 
> > Annoying ... but yeah calling _shutdown() before we stopped userspace is
> > also not going to work. Because userspace could quickly re-enable
> > something, and then the refcounts would be all wrong again and leaking
> > objects.
> > 
> 
> What happens with a USB device that is unplugged with open userspace,
> will that leak objects?

Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
as normal, the only thing that should be skipped is actually touching the
hw (as long as the driver doesn't protect too much with
drm_dev_enter/exit). So all the software updates (including refcounting
updates) will still be done. Ofc current udl is not yet atomic, so in
reality something else happens.
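
To illustrate the split between software and hardware updates, here is a
rough userspace model, not kernel code. The names fake_dev, fake_dev_enter()
and fake_shutdown() are made up for this sketch, and the SRCU read-side
section that the real drm_dev_enter() takes is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct drm_device; not the kernel struct. */
struct fake_dev {
	bool unplugged;		/* models drm_device->unplugged */
	int fb_refcount;	/* software state: references on a framebuffer */
	int hw_writes;		/* register writes that reached the "hardware" */
};

/*
 * Models drm_dev_enter(): returns true if the hardware may be touched.
 * The real function also enters an SRCU read-side section, omitted here.
 */
static bool fake_dev_enter(const struct fake_dev *dev)
{
	return !dev->unplugged;
}

/* Models a shutdown helper: software cleanup always, hw access only if allowed. */
static void fake_shutdown(struct fake_dev *dev)
{
	assert(dev->fb_refcount >= 0);

	/* Software update: drop the reference whether plugged or not. */
	if (dev->fb_refcount > 0)
		dev->fb_refcount--;

	/* Hardware update: only inside the enter/exit section. */
	if (fake_dev_enter(dev)) {
		dev->hw_writes++;	/* e.g. write "display off" to a register */
		/* the matching exit call would go here */
	}
}
```

With unplugged set, fake_shutdown() still drops the reference but performs
no hardware write, which is the behaviour described above for a driver that
guards only its hardware access.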

And we ofc still have the same issue that if you just unload the driver,
then the hw will stay on (which might really confuse the driver on next
load, when it assumes that it only gets loaded from cold boot where
everything is off - which usually is the case on an arm soc at least).

> > I get a bit the feeling we're over-optimizing here with trying to devm-ize
> > drm_dev_register. Just getting drm_device correctly devm-ized is a big
> > step forward already, and will open up a lot of TODO items across a lot of
> > drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
> > structs, which gets released together with drm_device. I think that's a
> > much clearer path forward, I think we all agree that getting the kfree out
> > of the driver codes is a good thing, and it would allow us to do this
> > correctly.
> > 
> > Then once we have that and rolled out to a few drivers we can reconsider
> > the entire unregister/shutdown gordian knot here. Atm I just have no idea
> > how to do this properly :-/
> > 
> > Thoughts, other ideas?
> > 
> 
> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
> make much sense if we need a driver remove callback anyways.

Yup, that's what I meant with the above:
- merge devm_drm_dev_register, use it a lot. This is definitely a good
  idea.
- create a drm_dev_kzalloc, which automatically releases memory on the
  final drm_dev_put. Use it everywhere in drivers for drm structures,
  especially in those drivers that currently use devm (which releases
  those allocations potentially too early on unplug).
- figure out the next steps after that
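
A rough userspace model of the drm_dev_kzalloc idea, as a sketch only: the
function does not exist yet, so everything here (fake_dev, fake_dev_kzalloc(),
fake_dev_get(), fake_dev_put(), the freed counter) is a made-up illustration
of "memory lives until the final put", not a proposed implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of each tracked allocation (hypothetical). */
struct managed_alloc {
	struct managed_alloc *next;
	max_align_t data[];	/* caller's memory, suitably aligned */
};

/* Hypothetical stand-in for struct drm_device. */
struct fake_dev {
	int refcount;
	struct managed_alloc *allocs;	/* everything owned by this device */
	int freed;			/* allocations released, for demonstration */
};

/* Models the proposed drm_dev_kzalloc(): zeroed memory owned by the device. */
static void *fake_dev_kzalloc(struct fake_dev *dev, size_t size)
{
	struct managed_alloc *m = calloc(1, sizeof(*m) + size);

	if (!m)
		return NULL;
	m->next = dev->allocs;
	dev->allocs = m;
	return m->data;
}

/* Models drm_dev_get(): take a reference, e.g. for an open file handle. */
static void fake_dev_get(struct fake_dev *dev)
{
	dev->refcount++;
}

/* Models drm_dev_put(): only the final put releases the tracked memory. */
static void fake_dev_put(struct fake_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount > 0)
		return;

	while (dev->allocs) {
		struct managed_alloc *m = dev->allocs;

		dev->allocs = m->next;
		free(m);
		dev->freed++;
	}
}
```

The point of the model: if userspace still holds a reference when the driver
drops its own, the allocation stays valid until that last reference goes
away, unlike a devm allocation which is released when the driver unbinds.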

> I think devm_drm_dev_init() makes sense because it yields a cleaner
> probe() function. An additional benefit is that it requires a
> drm_driver->release function which is a step in the right direction to
> get the drm_device lifetime right.
> 
> Do we agree that a drm_dev_set_unplugged() function is necessary to get
> the remove/disconnect order right?
> 
> What about drm_dev_unplug() maybe I should just leave it be?
> 
> - amd uses drm_driver->unload, so that one takes some work to get right
>   to support unplug. It doesn't check the unplugged state, so really
>   doesn't need drm_dev_unplug() I guess. Do they have cards that can be
>   hotplugged?

Yeah amd still uses ->load and ->unload, which is not great unfortunately.
I just stumbled over that when I tried to make a series to disable the
global drm_global_mutex for most drivers ...

> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
>   It has only one drm_dev_is_unplugged() check and that is in its
>   fbdev->open callback.

udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
refcounting issues when something is left on iirc. I think udl is written
under the assumption that ->unload is always called for an unplug, never
for an actual unload.

> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
>   callback which I guess is not correct.

Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
could be put there, not sure honestly. But definitely not _shutdown().

> This is what it will look like with a set unplugged function:
> 
> void drm_dev_unplug(struct drm_device *dev)
> {
> 	drm_dev_set_unplugged(dev);
> 	drm_dev_unregister(dev);
> 	drm_dev_put(dev);
> }
> 
> Hm, I should probably remove it to avoid further use of it since it's
> not correct for a modern driver.

I think something flew over my head ... what's the drm_dev_set_unplugged
for? I think I'm not following you ...

Thanks, Daniel

Noralf Trønnes Feb. 5, 2019, 5:57 p.m. UTC | #7
Den 05.02.2019 17.31, skrev Daniel Vetter:
> On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
>>
>>
>> Den 05.02.2019 10.11, skrev Daniel Vetter:
>>> On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
>>>>
>>>>
>>>> Den 04.02.2019 16.41, skrev Daniel Vetter:
>>>>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
>>>>>> The only thing now that makes drm_dev_unplug() special is that it sets
>>>>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
>>>>>> can remove drm_dev_unplug().
>>>>>>
>>>>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
>>>>>> ---
>>>>
>>>> [...]
>>>>
>>>>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>>>>>>  include/drm/drm_drv.h     | 10 ++++------
>>>>>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>>>> index 05bbc2b622fc..e0941200edc6 100644
>>>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>>>>>>   */
>>>>>>  void drm_dev_unplug(struct drm_device *dev)
>>>>>>  {
>>>>>> -	/*
>>>>>> -	 * After synchronizing any critical read section is guaranteed to see
>>>>>> -	 * the new value of ->unplugged, and any critical section which might
>>>>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
>>>>>> -	 * finished.
>>>>>> -	 */
>>>>>> -	dev->unplugged = true;
>>>>>> -	synchronize_srcu(&drm_unplug_srcu);
>>>>>> -
>>>>>>  	drm_dev_unregister(dev);
>>>>>>  	drm_dev_put(dev);
>>>>>>  }
>>>>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>>>>>>   * drm_dev_register() but does not deallocate the device. The caller must call
>>>>>>   * drm_dev_put() to drop their final reference.
>>>>>>   *
>>>>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
>>>>>> - * which can be called while there are still open users of @dev.
>>>>>> + * This function can be called while there are still open users of @dev as long
>>>>>> + * as the driver protects its device resources using drm_dev_enter() and
>>>>>> + * drm_dev_exit().
>>>>>>   *
>>>>>>   * This should be called first in the device teardown code to make sure
>>>>>> - * userspace can't access the device instance any more.
>>>>>> + * userspace can't access the device instance any more. Drivers that support
>>>>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
>>>>>
>>>>> Read once more with a bit more coffee, spotted this:
>>>>>
>>>>> s/first/afterwards/ - shutting down the hw before we've taken it away from
>>>>> userspace is kinda the wrong way round. It should be the inverse of driver
>>>>> load, which is 1) allocate structures 2) prep hw 3) register driver with
>>>>> the world (simplified ofc).
>>>>>
>>>>
>>>> The problem is that drm_dev_unregister() sets the device as unplugged
>>>> and if drm_atomic_helper_shutdown() is called afterwards it's not
>>>> allowed to touch hardware.
>>>>
>>>> I know it's the wrong order, but the only way to do it in the right
>>>> order is to have a separate function that sets unplugged:
>>>>
>>>> 	drm_dev_unregister();
>>>> 	drm_atomic_helper_shutdown();
>>>> 	drm_dev_set_unplugged();
>>>
>>> Annoying ... but yeah calling _shutdown() before we stopped userspace is
>>> also not going to work. Because userspace could quickly re-enable
>>> something, and then the refcounts would be all wrong again and leaking
>>> objects.
>>>
>>
>> What happens with a USB device that is unplugged with open userspace,
>> will that leak objects?
> 
> Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
> as normal, the only thing that should be skipped is actually touching the
> hw (as long as the driver doesn't protect too much with
> drm_dev_enter/exit). So all the software updates (including refcounting
> updates) will still be done. Ofc current udl is not yet atomic, so in
> reality something else happens.
> 
> And we ofc still have the same issue that if you just unload the driver,
> then the hw will stay on (which might really confuse the driver on next
> load, when it assumes that it only gets loaded from cold boot where
> everything is off - which usually is the case on an arm soc at least).
> 
>>> I get a bit the feeling we're over-optimizing here with trying to devm-ize
>>> drm_dev_register. Just getting drm_device correctly devm-ized is a big
>>> step forward already, and will open up a lot of TODO items across a lot of
>>> drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
>>> structs, which gets released together with drm_device. I think that's a
>>> much clearer path forward, I think we all agree that getting the kfree out
>>> of the driver codes is a good thing, and it would allow us to do this
>>> correctly.
>>>
>>> Then once we have that and rolled out to a few drivers we can reconsider
>>> the entire unregister/shutdown gordian knot here. Atm I just have no idea
>>> how to do this properly :-/
>>>
>>> Thoughts, other ideas?
>>>
>>
>> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
>> make much sense if we need a driver remove callback anyways.
> 
> Yup, that's what I meant with the above:
> - merge devm_drm_dev_register, use it a lot. This is definitely a good
>   idea.
> - create a drm_dev_kzalloc, which automatically releases memory on the
>   final drm_dev_put. Use it everywhere in drivers for drm structures,
>   especially in those drivers that currently use devm (which releases
>   those allocations potentially too early on unplug).
> - figure out the next steps after that
> 
>> I think devm_drm_dev_init() makes sense because it yields a cleaner
>> probe() function. An additional benefit is that it requires a
>> drm_driver->release function which is a step in the right direction to
>> get the drm_device lifetime right.
>>
>> Do we agree that a drm_dev_set_unplugged() function is necessary to get
>> the remove/disconnect order right?
>>
>> What about drm_dev_unplug() maybe I should just leave it be?
>>
>> - amd uses drm_driver->unload, so that one takes some work to get right
>>   to support unplug. It doesn't check the unplugged state, so really
>>   doesn't need drm_dev_unplug() I guess. Do they have cards that can be
>>   hotplugged?
> 
> Yeah amd still uses ->load and ->unload, which is not great unfortunately.
> I just stumbled over that when I tried to make a series to disable the
> global drm_global_mutex for most drivers ...
> 
>> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
>>   It has only one drm_dev_is_unplugged() check and that is in its
>>   fbdev->open callback.
> 
> udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
> refcounting issues when something is left on iirc. I think udl is written
> under the assumption that ->unload is always called for an unplug, never
> for an actual unload.
> 
>> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
>>   callback which I guess is not correct.
> 
> Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
> structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
> could be put there, not sure honestly. But definitely not _shutdown().
> 
>> This is what it will look like with a set unplugged function:
>>
>> void drm_dev_unplug(struct drm_device *dev)
>> {
>> 	drm_dev_set_unplugged(dev);
>> 	drm_dev_unregister(dev);
>> 	drm_dev_put(dev);
>> }
>>
>> Hm, I should probably remove it to avoid further use of it since it's
>> not correct for a modern driver.
> 
> I think something flew over my head ... what's the drm_dev_set_unplugged
> for? I think I'm not following you ...
> 

Taking it a few steps back:

This patch proposes to move 'dev->unplugged = true' from
drm_dev_unplug() to drm_dev_unregister().

Additionally I proposed this change to the drm_dev_unregister() docs:

  * This should be called first in the device teardown code to make sure
- * userspace can't access the device instance any more.
+ * userspace can't access the device instance any more. Drivers that support
+ * device unplug will probably want to call drm_atomic_helper_shutdown() first
+ * in order to disable the hardware on regular driver module unload.

Which would give a driver remove callback like this:

static int driver_remove(struct drm_device *dev)
{
	drm_atomic_helper_shutdown(dev);
	drm_dev_unregister(dev);

	return 0;
}

Your reaction was that drm_atomic_helper_shutdown() needs to be called
after drm_dev_unregister() to avoid a race resulting in leaked objects.
However if we call it afterwards, ->unplugged will be true and the
driver can't touch hardware.

Then I proposed moving the unplugged state setting to:

void drm_dev_set_unplugged(struct drm_device *dev)
{
	dev->unplugged = true;
	synchronize_srcu(&drm_unplug_srcu);
}

Now it is possible to avoid the race and still touch hardware:

static int driver_remove(struct drm_device *dev)
{
	drm_dev_unregister(dev);
	drm_atomic_helper_shutdown(dev);
	drm_dev_set_unplugged(dev);

	return 0;
}

But now I'm back to the question: Is it the driver or the device that is
going away?

If it's the driver we are fine touching hardware, if it's the device it
depends on how we access the device resource and whether the subsystem
has protection in place to handle access after the device is gone. I
think USB can handle and block device access up until the disconnect
callback has finished (no point in doing so though, since the normal
operation is that the device is gone, not the driver unloading).

Is there a way to determine who's going away without changes to the
device core?

Maybe. The driver can only be unregistered if there are no open file
handles because a ref is taken on the driver module.

So maybe something along these lines:

int drm_dev_open_count(struct drm_device *dev)
{
	int open_count;

	mutex_lock(&drm_global_mutex);
	open_count = dev->open_count;
	mutex_unlock(&drm_global_mutex);

	return open_count;
}

static int driver_remove(struct drm_device *dev)
{
	int open_count;

	drm_dev_unregister(dev);

	open_count = drm_dev_open_count(dev);

	/* Open fd's, device is going away, block access */
	if (open_count)
		drm_dev_set_unplugged(dev);

	drm_atomic_helper_shutdown(dev);

	/* No open fd's, driver is going away */
	if (!open_count)
		drm_dev_set_unplugged(dev);

	return 0;
}
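
To check the two paths by hand, the same ordering can be modelled in plain
userspace C. This is only a sketch: fake_dev and the fake_*() helpers are
made-up stand-ins for the DRM functions above, with no locking or SRCU, and
the shutdown stand-in merely records whether it was allowed to reach the
hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct drm_device. */
struct fake_dev {
	int open_count;		/* models drm_device->open_count */
	bool registered;
	bool unplugged;
	bool hw_shut_down;	/* true once shutdown reached the hardware */
};

static void fake_unregister(struct fake_dev *dev)
{
	dev->registered = false;
}

static void fake_set_unplugged(struct fake_dev *dev)
{
	dev->unplugged = true;
}

/* Stand-in for the shutdown helper: hw is only touched while not unplugged. */
static void fake_shutdown(struct fake_dev *dev)
{
	if (!dev->unplugged)
		dev->hw_shut_down = true;
}

/* Same ordering as the driver_remove() sketch above. */
static void fake_remove(struct fake_dev *dev)
{
	int open_count;

	assert(dev->open_count >= 0);

	fake_unregister(dev);
	open_count = dev->open_count;

	/* Open fd's: device is going away, block access first. */
	if (open_count)
		fake_set_unplugged(dev);

	fake_shutdown(dev);

	/* No open fd's: driver is going away, hw was shut down above. */
	if (!open_count)
		fake_set_unplugged(dev);
}
```

With open_count > 0 the model ends up unplugged without a hardware shutdown;
with open_count == 0 the hardware is shut down before unplugged is set.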


Personally I have 2 use cases:
- tinydrm SPI drivers
  The only hotpluggable SPI controllers I know of are USB which should
  handle device access while unregistering.
  SPI devices can be removed, but the controller driver is still in
  place so it's safe.
- A future USB driver (that I hope to work on when all this core stuff
  is in place).
  There's no point in touching the hw, so DRM can be set unplugged right
  away in the disconnect() callback.

This means that I don't need to know who's going away for my use cases.

This turned out to be quite long-winded, but the bottom line is that
having a separate function to set the unplugged state seems to give a
lot of flexibility for various use cases.

I hope I didn't muddy the waters even more :-)

Noralf.

Daniel Vetter Feb. 6, 2019, 3:26 p.m. UTC | #8
On Tue, Feb 05, 2019 at 06:57:50PM +0100, Noralf Trønnes wrote:
> 
> 
> Den 05.02.2019 17.31, skrev Daniel Vetter:
> > On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
> >>
> >>
> >> Den 05.02.2019 10.11, skrev Daniel Vetter:
> >>> On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
> >>>>
> >>>>
> >>>> Den 04.02.2019 16.41, skrev Daniel Vetter:
> >>>>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> >>>>>> The only thing now that makes drm_dev_unplug() special is that it sets
> >>>>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> >>>>>> can remove drm_dev_unplug().
> >>>>>>
> >>>>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> >>>>>> ---
> >>>>
> >>>> [...]
> >>>>
> >>>>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
> >>>>>>  include/drm/drm_drv.h     | 10 ++++------
> >>>>>>  2 files changed, 19 insertions(+), 18 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>> index 05bbc2b622fc..e0941200edc6 100644
> >>>>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
> >>>>>>   */
> >>>>>>  void drm_dev_unplug(struct drm_device *dev)
> >>>>>>  {
> >>>>>> -	/*
> >>>>>> -	 * After synchronizing any critical read section is guaranteed to see
> >>>>>> -	 * the new value of ->unplugged, and any critical section which might
> >>>>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
> >>>>>> -	 * finished.
> >>>>>> -	 */
> >>>>>> -	dev->unplugged = true;
> >>>>>> -	synchronize_srcu(&drm_unplug_srcu);
> >>>>>> -
> >>>>>>  	drm_dev_unregister(dev);
> >>>>>>  	drm_dev_put(dev);
> >>>>>>  }
> >>>>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
> >>>>>>   * drm_dev_register() but does not deallocate the device. The caller must call
> >>>>>>   * drm_dev_put() to drop their final reference.
> >>>>>>   *
> >>>>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> >>>>>> - * which can be called while there are still open users of @dev.
> >>>>>> + * This function can be called while there are still open users of @dev as long
> >>>>>> + * as the driver protects its device resources using drm_dev_enter() and
> >>>>>> + * drm_dev_exit().
> >>>>>>   *
> >>>>>>   * This should be called first in the device teardown code to make sure
> >>>>>> - * userspace can't access the device instance any more.
> >>>>>> + * userspace can't access the device instance any more. Drivers that support
> >>>>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> >>>>>
> >>>>> Read once more with a bit more coffee, spotted this:
> >>>>>
> >>>>> s/first/afterwards/ - shutting down the hw before we've taken it away from
> >>>>> userspace is kinda the wrong way round. It should be the inverse of driver
> >>>>> load, which is 1) allocate structures 2) prep hw 3) register driver with
> >>>>> the world (simplified ofc).
> >>>>>
> >>>>
> >>>> The problem is that drm_dev_unregister() sets the device as unplugged
> >>>> and if drm_atomic_helper_shutdown() is called afterwards it's not
> >>>> allowed to touch hardware.
> >>>>
> >>>> I know it's the wrong order, but the only way to do it in the right
> >>>> order is to have a separate function that sets unplugged:
> >>>>
> >>>> 	drm_dev_unregister();
> >>>> 	drm_atomic_helper_shutdown();
> >>>> 	drm_dev_set_unplugged();
> >>>
> >>> Annoying ... but yeah calling _shutdown() before we stopped userspace is
> >>> also not going to work. Because userspace could quickly re-enable
> >>> something, and then the refcounts would be all wrong again and leaking
> >>> objects.
> >>>
> >>
> >> What happens with a USB device that is unplugged with open userspace,
> >> will that leak objects?
> > 
> > Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
> > as normal, the only thing that should be skipped is actually touching the
> > hw (as long as the driver doesn't protect too much with
> > drm_dev_enter/exit). So all the software updates (including refcounting
> > updates) will still be done. Ofc current udl is not yet atomic, so in
> > reality something else happens.
> > 
> > And we ofc still have the same issue that if you just unload the driver,
> > then the hw will stay on (which might really confuse the driver on next
> > load, when it assumes that it only gets loaded from cold boot where
> > everything is off - which usually is the case on an arm soc at least).
> > 
> >>> I get a bit the feeling we're over-optimizing here with trying to devm-ize
> >>> drm_dev_register. Just getting drm_device correctly devm-ized is a big
> >>> step forward already, and will open up a lot of TODO items across a lot of
> >>> drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
> >>> structs, which gets released together with drm_device. I think that's a
> >>> much clearer path forward, I think we all agree that getting the kfree out
> >>> of the driver codes is a good thing, and it would allow us to do this
> >>> correctly.
> >>>
> >>> Then once we have that and rolled out to a few drivers we can reconsider
> >>> the entire unregister/shutdown gordian knot here. Atm I just have no idea
> >>> how to do this properly :-/
> >>>
> >>> Thoughts, other ideas?
> >>>
> >>
> >> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
> >> make much sense if we need a driver remove callback anyways.
> > 
> > Yup, that's what I meant with the above:
> > - merge devm_drm_dev_register, use it a lot. This is definitely a good
> >   idea.
> > - create a drm_dev_kzalloc, which automatically releases memory on the
> >   final drm_dev_put. Use it everywhere in drivers for drm structures,
> >   especially in those drivers that currently use devm (which releases
> >   those allocations potentially too early on unplug).
> > - figure out the next steps after that
> > 
> >> I think devm_drm_dev_init() makes sense because it yields a cleaner
> >> probe() function. An additional benefit is that it requires a
> >> drm_driver->release function which is a step in the right direction to
> >> get the drm_device lifetime right.
> >>
> >> Do we agree that a drm_dev_set_unplugged() function is necessary to get
> >> the remove/disconnect order right?
> >>
> >> What about drm_dev_unplug() maybe I should just leave it be?
> >>
> >> - amd uses drm_driver->unload, so that one takes some work to get right
> >>   to support unplug. It doesn't check the unplugged state, so really
> >>   doesn't need drm_dev_unplug() I guess. Do they have cards that can be
> >>   hotplugged?
> > 
> > Yeah amd still uses ->load and ->unload, which is not great unfortunately.
> > I just stumbled over that when I tried to make a series to disable the
> > global drm_global_mutex for most drivers ...
> > 
> >> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
> >>   It has only one drm_dev_is_unplugged() check and that is in its
> >>   fbdev->open callback.
> > 
> > udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
> > refcounting issues when something is left on iirc. I think udl is written
> > under the assumption that ->unload is always called for an unplug, never
> > for an actual unload.
> > 
> >> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
> >>   callback which I guess is not correct.
> > 
> > Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
> > structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
> > could be put there, not sure honestly. But definitely not _shutdown().
> > 
> >> This is what it will look like with a set unplugged function:
> >>
> >> void drm_dev_unplug(struct drm_device *dev)
> >> {
> >> 	drm_dev_set_unplugged(dev);
> >> 	drm_dev_unregister(dev);
> >> 	drm_dev_put(dev);
> >> }
> >>
> >> Hm, I should probably remove it to avoid further use of it since it's
> >> not correct for a modern driver.
> > 
> > I think something flew over my head ... what's the drm_dev_set_unplugged
> > for? I think I'm not following you ...
> > 
> 
> Taking it a few steps back:
> 
> This patch proposes to move 'dev->unplugged = true' from
> drm_dev_unplug() to drm_dev_unregister().
> 
> Additionally I proposed this change to the drm_dev_unregister() docs:
> 
>   * This should be called first in the device teardown code to make sure
> - * userspace can't access the device instance any more.
> + * userspace can't access the device instance any more. Drivers that support
> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> + * in order to disable the hardware on regular driver module unload.
> 
> Which would give a driver remove callback like this:
> 
> static int driver_remove()
> {
> 	drm_atomic_helper_shutdown();
> 	drm_dev_unregister();
> }
> 
> Your reaction was that drm_atomic_helper_shutdown() needs to be called
> after drm_dev_unregister() to avoid a race resulting in leaked objects.
> However if we call it afterwards, ->unplugged will be true and the
> driver can't touch hardware.
> 
> Then I proposed moving the unplugged state setting to:
> 
> void drm_dev_set_unplugged(struct drm_device *dev)
> {
> 	dev->unplugged = true;
> 	synchronize_srcu(&drm_unplug_srcu);
> }
> 
> Now it is possible to avoid the race and still touch hardware:
> 
> static int driver_remove()
> {
> 	drm_dev_unregister();
> 	drm_atomic_helper_shutdown();
> 	drm_dev_set_unplugged();
> }
> 
> But now I'm back to the question: Is it the driver or the device that is
> going away?
> 
> If it's the driver we are fine touching hardware, if it's the device it
> depends on how we access the device resource and whether the subsystem
> has protection in place to handle access after the device is gone. I
> think USB can handle and block device access up until the disconnect
> callback has finished (no point in doing so though, since the normal
> operation is that the device is gone, not the driver unloading).
> 
> Is there a way to determine who's going away without changes to the
> device core?
> 
> Maybe. The driver can only be unregistered if there are no open file
> handles because a ref is taken on the driver module.

This isn't true. You can unbind through sysfs (not many people do, but
it's possible). The module reference only keeps the cpu instructions
valid, nothing else.

> So maybe something along these lines:
> 
> int drm_dev_open_count(struct drm_device *dev)
> {
> 	int open_count;
> 
> 	mutex_lock(&drm_global_mutex);
> 	open_count = dev->open_count;
> 	mutex_unlock(&drm_global_mutex);

Random style nit: READ_ONCE, no locking needed (the locks don't change
anything, except if you have really strange locking rules). Serves
double-duty as a huge warning sign that something tricky is happening.

> 	return open_count;
> }
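
A minimal user-space sketch of the READ_ONCE variant suggested above, for
illustration only. The READ_ONCE() macro here is a simplified stand-in for
the kernel one (it relies on the GCC/Clang __typeof__ extension), and
model_drm_device / model_drm_dev_open_count() are hypothetical names, not
existing DRM API. The mutex adds nothing because the count can change again
as soon as the lock is dropped, so the caller only ever gets a snapshot:

```c
/*
 * Simplified user-space stand-in for the kernel's READ_ONCE(): a volatile
 * access that makes the compiler emit exactly one load.
 * Assumption: GCC or Clang (uses the __typeof__ extension).
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down model of struct drm_device. */
struct model_drm_device {
	int open_count;
};

/*
 * Lockless snapshot of the open count. The returned value may already be
 * stale when the caller looks at it, which is exactly what READ_ONCE()
 * is meant to flag to the reader.
 */
int model_drm_dev_open_count(struct model_drm_device *dev)
{
	return READ_ONCE(dev->open_count);
}
```

Any decision taken on the returned value still races with concurrent
open/close, with or without the mutex.
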
> 
> static int driver_remove()
> {
> 	drm_dev_unregister();
> 
> 	open_count = drm_dev_open_count();
> 
> 	/* Open fd's, device is going away, block access */
> 	if (open_count)
> 		drm_dev_set_unplugged();
> 
> 	drm_atomic_helper_shutdown();
> 
> 	/* No open fd's, driver is going away */
> 	if (!open_count)
> 		drm_dev_set_unplugged();
> }
> 
> 
> Personally I have 2 use cases:
> - tinydrm SPI drivers
>   The only hotpluggable SPI controllers I know of are USB which should
>   handle device access while unregistering.
>   SPI devices can be removed, but the controller driver is still in
>   place so it's safe.
> - A future USB driver (that I hope to work on when all this core stuff
>   is in place).
> >   There's no point in touching the hw, so DRM can be set unplugged right
>   away in the disconnect() callback.
> 
> This means that I don't need to know who's going away for my use cases.
> 
> This turned out to be quite long-winded, but the bottom line is that
> having a separate function to set the unplugged state seems to give a
> lot of flexibility for various use cases.
> 
> I hope I didn't muddy the waters even more :-)

Hm, I think I get your idea. Since I'm still not sure what the best option
is, I'm leaning towards just leaving stuff as-is. I.e. drivers which want
to shut down hw can do the sequence 1. drm_dev_unregister() 2.
drm_atomic_helper_shutdown(). Drivers which care more about hotunplug
can just call drm_dev_unplug().
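
The two remove paths described above can be sketched with a small
user-space model. Everything here is a hypothetical stand-in (struct
model_dev and the model_*() functions are not DRM API); the model only
tracks whether userspace can still see the device, whether hardware access
is blocked, and whether the hardware was switched off. The reference drop
done by drm_dev_unplug() is left out:

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of this is the real DRM API. */
struct model_dev {
	bool registered;	/* still visible to userspace */
	bool unplugged;		/* hardware access is blocked */
	bool hw_on;		/* display hardware still enabled */
};

/* Stand-in for drm_dev_unregister(): take the device away from userspace. */
void model_unregister(struct model_dev *dev)
{
	dev->registered = false;
}

/*
 * Stand-in for drm_atomic_helper_shutdown(): the hardware is only touched
 * while the device is not marked unplugged.
 */
void model_shutdown(struct model_dev *dev)
{
	if (!dev->unplugged)
		dev->hw_on = false;
}

/* Stand-in for drm_dev_unplug(): mark unplugged first, then unregister. */
void model_unplug(struct model_dev *dev)
{
	dev->unplugged = true;
	model_unregister(dev);
}

/* Remove path for a driver that wants the hardware shut down. */
void model_remove_shutdown_hw(struct model_dev *dev)
{
	model_unregister(dev);	/* 1. stop userspace */
	model_shutdown(dev);	/* 2. disable the hardware */
}

/* Remove path for a driver that mainly cares about hot-unplug. */
void model_remove_hotunplug(struct model_dev *dev)
{
	model_unplug(dev);
}
```

In this model the first path leaves the hardware off and never sets the
unplugged flag, while the second path blocks hardware access and leaves the
(presumably absent) hardware untouched.
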

Yes it's messy and unsatisfactory, but in case of serious doubt I like to
wait until maybe in the future we have a good idea. Meanwhile leaving a
bit of a mess around is better than charging ahead in a possibly totally
wrong direction.

There are cases where the clue still hasn't hit me (or anyone else), even
years later. That's just how it is sometimes.

Zooming out and looking at the big picture, I'd say all your work in the
past few years has enormously simplified drm for simple drivers already.
If we can't resolve this one here right now that just means you "only"
made drm 98% simpler instead of maybe 99%. It's still an epic win :-)

Cheers, Daniel


> Noralf.
> 
> > Thanks, Daniel
> > 
> >>
> >> Noralf.
> >>
> >>> Cheers, Daniel
> >>>
> >>>> Noralf.
> >>>>
> >>>>>> + * in order to disable the hardware on regular driver module unload.
> >>>>>>   */
> >>>>>>  void drm_dev_unregister(struct drm_device *dev)
> >>>>>>  {
> >>>>>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
> >>>>>>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
> >>>>>>  		drm_lastclose(dev);
> >>>>>>  
> >>>>>> +	/*
> >>>>>> +	 * After synchronizing any critical read section is guaranteed to see
> >>>>>> +	 * the new value of ->unplugged, and any critical section which might
> >>>>>> +	 * still have seen the old value of ->unplugged is guaranteed to have
> >>>>>> +	 * finished.
> >>>>>> +	 */
> >>>>>> +	dev->unplugged = true;
> >>>>>> +	synchronize_srcu(&drm_unplug_srcu);
> >>>>>> +
> >>>>>>  	dev->registered = false;
> >>>>>>  
> >>>>>>  	drm_client_dev_unregister(dev);
> >>>>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> >>>>>> index ca46a45a9cce..c50696c82a42 100644
> >>>>>> --- a/include/drm/drm_drv.h
> >>>>>> +++ b/include/drm/drm_drv.h
> >>>>>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
> >>>>>>   * drm_dev_is_unplugged - is a DRM device unplugged
> >>>>>>   * @dev: DRM device
> >>>>>>   *
> >>>>>> - * This function can be called to check whether a hotpluggable is unplugged.
> >>>>>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> >>>>>> - * unplugged, these two functions guarantee that any store before calling
> >>>>>> - * drm_dev_unplug() is visible to callers of this function after it completes
> >>>>>> + * This function can be called to check whether @dev is unregistered. This can
> >>>>>> + * be used to detect that the underlying parent device is gone.
> >>>>>
> >>>>> I think it'd be good to keep the first part, and just update the reference
> >>>>> to drm_dev_unregister. So:
> >>>>>
> >>>>>  * This function can be called to check whether a hotpluggable is unplugged.
> >>>>>  * Unplugging itself is signalled through drm_dev_unregister(). If a device is
> >>>>>  * unplugged, these two functions guarantee that any store before calling
> >>>>>  * drm_dev_unregister() is visible to callers of this function after it
> >>>>>  * completes.
> >>>>>
> >>>>> I think your version shrugs a few important details under the rug. With
> >>>>> those nits addressed:
> >>>>>
> >>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>>
> >>>>> Cheers, Daniel
> >>>>>
> >>>>>>   *
> >>>>>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> >>>>>> - * recommended that drivers instead use the underlying drm_dev_enter() and
> >>>>>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> >>>>>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
> >>>>>>   * drm_dev_exit() function pairs.
> >>>>>>   */
> >>>>>>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
> >>>>>> -- 
> >>>>>> 2.20.1
> >>>>>>
> >>>>>
> >>>
> >
Noralf Trønnes Feb. 6, 2019, 4:46 p.m. UTC | #9
Den 06.02.2019 16.26, skrev Daniel Vetter:
> On Tue, Feb 05, 2019 at 06:57:50PM +0100, Noralf Trønnes wrote:
>>
>>
>> Den 05.02.2019 17.31, skrev Daniel Vetter:
>>> On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
>>>>
>>>>
>>>> Den 05.02.2019 10.11, skrev Daniel Vetter:
>>>>> On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
>>>>>>
>>>>>>
>>>>>> Den 04.02.2019 16.41, skrev Daniel Vetter:
>>>>>>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
>>>>>>>> The only thing now that makes drm_dev_unplug() special is that it sets
>>>>>>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
>>>>>>>> can remove drm_dev_unplug().
>>>>>>>>
>>>>>>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
>>>>>>>> ---
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
>>>>>>>>  include/drm/drm_drv.h     | 10 ++++------
>>>>>>>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>>>>>> index 05bbc2b622fc..e0941200edc6 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>>>>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
>>>>>>>>   */
>>>>>>>>  void drm_dev_unplug(struct drm_device *dev)
>>>>>>>>  {
>>>>>>>> -	/*
>>>>>>>> -	 * After synchronizing any critical read section is guaranteed to see
>>>>>>>> -	 * the new value of ->unplugged, and any critical section which might
>>>>>>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
>>>>>>>> -	 * finished.
>>>>>>>> -	 */
>>>>>>>> -	dev->unplugged = true;
>>>>>>>> -	synchronize_srcu(&drm_unplug_srcu);
>>>>>>>> -
>>>>>>>>  	drm_dev_unregister(dev);
>>>>>>>>  	drm_dev_put(dev);
>>>>>>>>  }
>>>>>>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
>>>>>>>>   * drm_dev_register() but does not deallocate the device. The caller must call
>>>>>>>>   * drm_dev_put() to drop their final reference.
>>>>>>>>   *
>>>>>>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
>>>>>>>> - * which can be called while there are still open users of @dev.
>>>>>>>> + * This function can be called while there are still open users of @dev as long
>>>>>>>> + * as the driver protects its device resources using drm_dev_enter() and
>>>>>>>> + * drm_dev_exit().
>>>>>>>>   *
>>>>>>>>   * This should be called first in the device teardown code to make sure
>>>>>>>> - * userspace can't access the device instance any more.
>>>>>>>> + * userspace can't access the device instance any more. Drivers that support
>>>>>>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
>>>>>>>
>>>>>>> Read once more with a bit more coffee, spotted this:
>>>>>>>
>>>>>>> s/first/afterwards/ - shutting down the hw before we've taken it away from
>>>>>>> userspace is kinda the wrong way round. It should be the inverse of driver
>>>>>>> load, which is 1) allocate structures 2) prep hw 3) register driver with
>>>>>>> the world (simplified ofc).
>>>>>>>
>>>>>>
>>>>>> The problem is that drm_dev_unregister() sets the device as unplugged
>>>>>> and if drm_atomic_helper_shutdown() is called afterwards it's not
>>>>>> allowed to touch hardware.
>>>>>>
>>>>>> I know it's the wrong order, but the only way to do it in the right
>>>>>> order is to have a separate function that sets unplugged:
>>>>>>
>>>>>> 	drm_dev_unregister();
>>>>>> 	drm_atomic_helper_shutdown();
>>>>>> 	drm_dev_set_unplugged();
>>>>>
>>>>> Annoying ... but yeah calling _shutdown() before we stopped userspace is
>>>>> also not going to work. Because userspace could quickly re-enable
>>>>> something, and then the refcounts would be all wrong again and leaking
>>>>> objects.
>>>>>
>>>>
>>>> What happens with a USB device that is unplugged with open userspace,
>>>> will that leak objects?
>>>
>>> Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
>>> as normal, the only thing that should be skipped is actually touching the
>>> hw (as long as the driver doesn't protect too much with
>>> drm_dev_enter/exit). So all the software updates (including refcounting
>>> updates) will still be done. Ofc current udl is not yet atomic, so in
>>> reality something else happens.
>>>
>>> And we ofc still have the same issue that if you just unload the driver,
>>> then the hw will stay on (which might really confuse the driver on next
>>> load, when it assumes that it only gets loaded from cold boot where
>>> everything is off - which usually is the case on an arm soc at least).
>>>
>>>>> I get a bit the feeling we're over-optimizing here with trying to devm-ize
>>>>> drm_dev_register. Just getting drm_device correctly devm-ized is a big
>>>>> step forward already, and will open up a lot of TODO items across a lot of
>>>>> drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
>>>>> structs, which gets released together with drm_device. I think that's a
>>>>> much clearer path forward, I think we all agree that getting the kfree out
>>>>> of the driver codes is a good thing, and it would allow us to do this
>>>>> correctly.
>>>>>
>>>>> Then once we have that and rolled out to a few drivers we can reconsider
>>>>> the entire unregister/shutdown gordian knot here. Atm I just have no idea
>>>>> how to do this properly :-/
>>>>>
>>>>> Thoughts, other ideas?
>>>>>
>>>>
>>>> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
>>>> make much sense if we need a driver remove callback anyways.
>>>
>>> Yup, that's what I meant with the above:
>>> - merge devm_drm_dev_register, use it a lot. This is definitely a good
>>>   idea.
>>> - create a drm_dev_kzalloc, which automatically releases memory on the
>>>   final drm_dev_put. Use it everywhere in drivers for drm structures,
>>>   especially in those drivers that currently use devm (which releases
>>>   those allocations potentially too early on unplug).
>>> - figure out the next steps after that
>>>
>>>> I think devm_drm_dev_init() makes sense because it yields a cleaner
>>>> probe() function. An additional benefit is that it requires a
>>>> drm_driver->release function which is a step in the right direction to
>>>> get the drm_device lifetime right.
>>>>
>>>> Do we agree that a drm_dev_set_unplugged() function is necessary to get
>>>> the remove/disconnect order right?
>>>>
>>>> What about drm_dev_unplug() maybe I should just leave it be?
>>>>
>>>> - amd uses drm_driver->unload, so that one takes some work to get right
>>>>   to support unplug. It doesn't check the unplugged state, so really
>>>>   doesn't need drm_dev_unplug() I guess. Do they have cards that can be
>>>>   hotplugged?
>>>
>>> Yeah amd still uses ->load and ->unload, which is not great unfortunately.
>>> I just stumbled over that when I tried to make a series to disable the
>>> global drm_global_mutex for most drivers ...
>>>
>>>> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
>>>>   It has only one drm_dev_is_unplugged() check and that is in its
>>>>   fbdev->open callback.
>>>
>>> udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
>>> refcounting issues when something is left on iirc. I think udl is written
>>> under the assumption that ->unload is always called for an unplug, never
>>> for an actual unload.
>>>
>>>> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
>>>>   callback which I guess is not correct.
>>>
>>> Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
>>> structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
>>> could be put there, not sure honestly. But definitely not _shutdown().
>>>
>>>> This is what it will look like with a set unplugged function:
>>>>
>>>> void drm_dev_unplug(struct drm_device *dev)
>>>> {
>>>> 	drm_dev_set_unplugged(dev);
>>>> 	drm_dev_unregister(dev);
>>>> 	drm_dev_put(dev);
>>>> }
>>>>
>>>> Hm, I should probably remove it to avoid further use of it since it's
>>>> not correct for a modern driver.
>>>
>>> I think something flew over my head ... what's the drm_dev_set_unplugged
>>> for? I think I'm not following you ...
>>>
>>
>> Taking it a few steps back:
>>
>> This patch proposes to move 'dev->unplugged = true' from
>> drm_dev_unplug() to drm_dev_unregister().
>>
>> Additionally I proposed this change to the drm_dev_unregister() docs:
>>
>>   * This should be called first in the device teardown code to make sure
>> - * userspace can't access the device instance any more.
>> + * userspace can't access the device instance any more. Drivers that support
>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
>> + * in order to disable the hardware on regular driver module unload.
>>
>> Which would give a driver remove callback like this:
>>
>> static int driver_remove()
>> {
>> 	drm_atomic_helper_shutdown();
>> 	drm_dev_unregister();
>> }
>>
>> Your reaction was that drm_atomic_helper_shutdown() needs to be called
>> after drm_dev_unregister() to avoid a race resulting in leaked objects.
>> However if we call it afterwards, ->unplugged will be true and the
>> driver can't touch hardware.
>>
>> Then I proposed moving the unplugged state setting to:
>>
>> void drm_dev_set_unplugged(struct drm_device *dev)
>> {
>> 	dev->unplugged = true;
>> 	synchronize_srcu(&drm_unplug_srcu);
>> }
>>
>> Now it is possible to avoid the race and still touch hardware:
>>
>> static int driver_remove()
>> {
>> 	drm_dev_unregister();
>> 	drm_atomic_helper_shutdown();
>> 	drm_dev_set_unplugged();
>> }
>>
>> But now I'm back to the question: Is it the driver or the device that is
>> going away?
>>
>> If it's the driver we are fine touching hardware, if it's the device it
>> depends on how we access the device resource and whether the subsystem
>> has protection in place to handle access after the device is gone. I
>> think USB can handle and block device access up until the disconnect
>> callback has finished (no point in doing so though, since the normal
>> operation is that the device is gone, not the driver unloading).
>>
>> Is there a way to determine who's going away without changes to the
>> device core?
>>
>> Maybe. The driver can only be unregistered if there are no open file
>> handles because a ref is taken on the driver module.
> 
> This isn't true. You can unbind through sysfs (not many people do, but
> it's possible). The module reference only keeps the cpu instructions
> valid, nothing else.
> 
>> So maybe something along these lines:
>>
>> int drm_dev_open_count(struct drm_device *dev)
>> {
>> 	int open_count;
>>
>> 	mutex_lock(&drm_global_mutex);
>> 	open_count = dev->open_count;
>> 	mutex_unlock(&drm_global_mutex);
> 
> Random style nit: READ_ONCE, no locking needed (the locks don't change
> anything, except if you have really strange locking rules). Serves
> double-duty as a huge warning sign that something tricky is happening.
> 
>> 	return open_count;
>> }
>>
>> static int driver_remove()
>> {
>> 	drm_dev_unregister();
>>
>> 	open_count = drm_dev_open_count();
>>
>> 	/* Open fd's, device is going away, block access */
>> 	if (open_count)
>> 		drm_dev_set_unplugged();
>>
>> 	drm_atomic_helper_shutdown();
>>
>> 	/* No open fd's, driver is going away */
>> 	if (!open_count)
>> 		drm_dev_set_unplugged();
>> }
>>
>>
>> Personally I have 2 use cases:
>> - tinydrm SPI drivers
>>   The only hotpluggable SPI controllers I know of are USB which should
>>   handle device access while unregistering.
>>   SPI devices can be removed, but the controller driver is still in
>>   place so it's safe.
>> - A future USB driver (that I hope to work on when all this core stuff
>>   is in place).
>>   There's no point in touching the hw, so DRM can be set unplugged right
>>   away in the disconnect() callback.
>>
>> This means that I don't need to know who's going away for my use cases.
>>
>> This turned out to be quite long-winded, but the bottom line is that
>> having a separate function to set the unplugged state seems to give a
>> lot of flexibility for various use cases.
>>
>> I hope I didn't muddy the waters even more :-)
> 
> Hm, I think I get your idea. Since I'm still not sure what the best option
> is, I'm leaning towards just leaving stuff as-is. I.e. drivers which want
> to shut down hw can do the sequence 1. drm_dev_unregister() 2.
> drm_atomic_helper_shutdown(). Drivers which care more about hotunplug
> can just call drm_dev_unplug().
> 
> Yes it's messy and unsatisfactory, but in case of serious doubt I like to
> wait until maybe in the future we have a good idea. Meanwhile leaving a
> bit of a mess around is better than charging ahead in a possibly totally
> wrong direction.
> 
> There are cases where the clue still hasn't hit me (or anyone else), even
> years later. That's just how it is sometimes.
> 

Ok, I'll drop this.

This means that I'll drop devm_drm_dev_init() as well: it doesn't play
well with drm_dev_unplug() because both will do drm_dev_put(). Not a big
deal really, it just means that I need to add one error path in the probe
function so I can drm_dev_put() on error.
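
A rough user-space sketch of that single error path, under stated
assumptions: struct model_dev, model_dev_alloc(), model_dev_put(),
model_probe() and model_remove() are hypothetical stand-ins for the
drm_device refcounting, not the real API, and the failing setup step is
made up. The point is that the reference is dropped exactly once, either
on the probe error path or in remove:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted device; a stand-in for struct drm_device. */
struct model_dev {
	int refcount;
	bool registered;
};

/* Number of devices currently allocated, used to spot leaks. */
int model_live_devices;

/* Stand-in for allocating and initializing the device (one reference). */
struct model_dev *model_dev_alloc(void)
{
	struct model_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	model_live_devices++;
	return dev;
}

/* Stand-in for drm_dev_put(): free the device on the final reference. */
void model_dev_put(struct model_dev *dev)
{
	if (--dev->refcount == 0) {
		model_live_devices--;
		free(dev);
	}
}

/* Probe: every failure after allocation funnels into one put. */
int model_probe(bool fail_hw_init, struct model_dev **out)
{
	struct model_dev *dev = model_dev_alloc();
	int ret;

	if (!dev)
		return -ENOMEM;

	if (fail_hw_init) {		/* hypothetical failing setup step */
		ret = -EIO;
		goto err_put;
	}

	dev->registered = true;		/* stands in for drm_dev_register() */
	*out = dev;
	return 0;

err_put:
	model_dev_put(dev);
	return ret;
}

/*
 * Remove: the unplug stand-in unregisters and drops the reference itself,
 * so no additional managed (devm-style) put may run afterwards.
 */
void model_remove(struct model_dev *dev)
{
	dev->registered = false;
	model_dev_put(dev);
}
```

With a managed init that also dropped the reference on unbind, the put in
model_remove() would run a second time on an already freed device, which is
the conflict described above.
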

Are you still ok with the first bug fix patch in this series?

> Zooming out and looking at the big picture, I'd say all your work in the
> past few years has enormously simplified drm for simple drivers already.
> If we can't resolve this one here right now that just means you "only"
> made drm 98% simpler instead of maybe 99%. It's still an epic win :-)
> 

Thanks, your mentoring and reviews helped turn my rough ideas into
something useful :-)

Noralf.

> Cheers, Daniel
> 
> 
>> Noralf.
>>
>>> Thanks, Daniel
>>>
>>>>
>>>> Noralf.
>>>>
>>>>> Cheers, Daniel
>>>>>
>>>>>> Noralf.
>>>>>>
>>>>>>>> + * in order to disable the hardware on regular driver module unload.
>>>>>>>>   */
>>>>>>>>  void drm_dev_unregister(struct drm_device *dev)
>>>>>>>>  {
>>>>>>>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
>>>>>>>>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
>>>>>>>>  		drm_lastclose(dev);
>>>>>>>>  
>>>>>>>> +	/*
>>>>>>>> +	 * After synchronizing any critical read section is guaranteed to see
>>>>>>>> +	 * the new value of ->unplugged, and any critical section which might
>>>>>>>> +	 * still have seen the old value of ->unplugged is guaranteed to have
>>>>>>>> +	 * finished.
>>>>>>>> +	 */
>>>>>>>> +	dev->unplugged = true;
>>>>>>>> +	synchronize_srcu(&drm_unplug_srcu);
>>>>>>>> +
>>>>>>>>  	dev->registered = false;
>>>>>>>>  
>>>>>>>>  	drm_client_dev_unregister(dev);
>>>>>>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>>>>>>>> index ca46a45a9cce..c50696c82a42 100644
>>>>>>>> --- a/include/drm/drm_drv.h
>>>>>>>> +++ b/include/drm/drm_drv.h
>>>>>>>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
>>>>>>>>   * drm_dev_is_unplugged - is a DRM device unplugged
>>>>>>>>   * @dev: DRM device
>>>>>>>>   *
>>>>>>>> - * This function can be called to check whether a hotpluggable is unplugged.
>>>>>>>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
>>>>>>>> - * unplugged, these two functions guarantee that any store before calling
>>>>>>>> - * drm_dev_unplug() is visible to callers of this function after it completes
>>>>>>>> + * This function can be called to check whether @dev is unregistered. This can
>>>>>>>> + * be used to detect that the underlying parent device is gone.
>>>>>>>
>>>>>>> I think it'd be good to keep the first part, and just update the reference
>>>>>>> to drm_dev_unregister. So:
>>>>>>>
>>>>>>>  * This function can be called to check whether a hotpluggable is unplugged.
>>>>>>>  * Unplugging itself is signalled through drm_dev_unregister(). If a device is
>>>>>>>  * unplugged, these two functions guarantee that any store before calling
>>>>>>>  * drm_dev_unregister() is visible to callers of this function after it
>>>>>>>  * completes.
>>>>>>>
>>>>>>> I think your version shrugs a few important details under the rug. With
>>>>>>> those nits addressed:
>>>>>>>
>>>>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>>>
>>>>>>> Cheers, Daniel
>>>>>>>
>>>>>>>>   *
>>>>>>>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
>>>>>>>> - * recommended that drivers instead use the underlying drm_dev_enter() and
>>>>>>>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
>>>>>>>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
>>>>>>>>   * drm_dev_exit() function pairs.
>>>>>>>>   */
>>>>>>>>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
>>>>>>>> -- 
>>>>>>>> 2.20.1
>>>>>>>>
>>>>>>>
>>>>>
>>>
>
Eric Anholt Feb. 6, 2019, 6:10 p.m. UTC | #10
Daniel Vetter <daniel@ffwll.ch> writes:
>
> Zooming out and looking at the big picture, I'd say all your work in the
> past few years has enormously simplified drm for simple drivers already.
> If we can't resolve this one here right now that just means you "only"
> made drm 98% simpler instead of maybe 99%. It's still an epic win :-)

I'd like to second this.  With so many of Noralf's cleanups I think "oof,
that's a lot of work for a little cleanup here".  But we've benefited
immensely from them accumulating over the years.  Thanks again!
Daniel Vetter Feb. 6, 2019, 8:36 p.m. UTC | #11
On Wed, Feb 06, 2019 at 05:46:51PM +0100, Noralf Trønnes wrote:
> 
> 
> Den 06.02.2019 16.26, skrev Daniel Vetter:
> > On Tue, Feb 05, 2019 at 06:57:50PM +0100, Noralf Trønnes wrote:
> >>
> >>
> >> Den 05.02.2019 17.31, skrev Daniel Vetter:
> >>> On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
> >>>>
> >>>>
> >>>> Den 05.02.2019 10.11, skrev Daniel Vetter:
> >>>>> On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
> >>>>>>
> >>>>>>
> >>>>>> Den 04.02.2019 16.41, skrev Daniel Vetter:
> >>>>>>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> >>>>>>>> The only thing now that makes drm_dev_unplug() special is that it sets
> >>>>>>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> >>>>>>>> can remove drm_dev_unplug().
> >>>>>>>>
> >>>>>>>> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> >>>>>>>> ---
> >>>>>>
> >>>>>> [...]
> >>>>>>
> >>>>>>>>  drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
> >>>>>>>>  include/drm/drm_drv.h     | 10 ++++------
> >>>>>>>>  2 files changed, 19 insertions(+), 18 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>>>> index 05bbc2b622fc..e0941200edc6 100644
> >>>>>>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>>>>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>>>>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
> >>>>>>>>   */
> >>>>>>>>  void drm_dev_unplug(struct drm_device *dev)
> >>>>>>>>  {
> >>>>>>>> -	/*
> >>>>>>>> -	 * After synchronizing any critical read section is guaranteed to see
> >>>>>>>> -	 * the new value of ->unplugged, and any critical section which might
> >>>>>>>> -	 * still have seen the old value of ->unplugged is guaranteed to have
> >>>>>>>> -	 * finished.
> >>>>>>>> -	 */
> >>>>>>>> -	dev->unplugged = true;
> >>>>>>>> -	synchronize_srcu(&drm_unplug_srcu);
> >>>>>>>> -
> >>>>>>>>  	drm_dev_unregister(dev);
> >>>>>>>>  	drm_dev_put(dev);
> >>>>>>>>  }
> >>>>>>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
> >>>>>>>>   * drm_dev_register() but does not deallocate the device. The caller must call
> >>>>>>>>   * drm_dev_put() to drop their final reference.
> >>>>>>>>   *
> >>>>>>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> >>>>>>>> - * which can be called while there are still open users of @dev.
> >>>>>>>> + * This function can be called while there are still open users of @dev as long
> >>>>>>>> + * as the driver protects its device resources using drm_dev_enter() and
> >>>>>>>> + * drm_dev_exit().
> >>>>>>>>   *
> >>>>>>>>   * This should be called first in the device teardown code to make sure
> >>>>>>>> - * userspace can't access the device instance any more.
> >>>>>>>> + * userspace can't access the device instance any more. Drivers that support
> >>>>>>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> >>>>>>>
> >>>>>>> Read once more with a bit more coffee, spotted this:
> >>>>>>>
> >>>>>>> s/first/afterwards/ - shutting down the hw before we've taken it away from
> >>>>>>> userspace is kinda the wrong way round. It should be the inverse of driver
> >>>>>>> load, which is 1) allocate structures 2) prep hw 3) register driver with
> >>>>>>> the world (simplified ofc).
> >>>>>>>
> >>>>>>
> >>>>>> The problem is that drm_dev_unregister() sets the device as unplugged
> >>>>>> and if drm_atomic_helper_shutdown() is called afterwards it's not
> >>>>>> allowed to touch hardware.
> >>>>>>
> >>>>>> I know it's the wrong order, but the only way to do it in the right
> >>>>>> order is to have a separate function that sets unplugged:
> >>>>>>
> >>>>>> 	drm_dev_unregister();
> >>>>>> 	drm_atomic_helper_shutdown();
> >>>>>> 	drm_dev_set_unplugged();
> >>>>>
> >>>>> Annoying ... but yeah calling _shutdown() before we stopped userspace is
> >>>>> also not going to work. Because userspace could quickly re-enable
> >>>>> something, and then the refcounts would be all wrong again and leaking
> >>>>> objects.
> >>>>>
> >>>>
> >>>> What happens with a USB device that is unplugged with open userspace,
> >>>> will that leak objects?
> >>>
> >>> Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
> >>> as normal, the only thing that should be skipped is actually touching the
> >>> hw (as long as the driver doesn't protect too much with
> >>> drm_dev_enter/exit). So all the software updates (including refcounting
> >>> updates) will still be done. Ofc current udl is not yet atomic, so in
> >>> reality something else happens.
> >>>
> >>> And we ofc still have the same issue that if you just unload the driver,
> >>> then the hw will stay on (which might really confuse the driver on next
> >>> load, when it assumes that it only gets loaded from cold boot where
> >>> everything is off - which usually is the case on an arm soc at least).
> >>>
> >>>>> I get a bit the feeling we're over-optimizing here with trying to devm-ize
> >>>>> drm_dev_register. Just getting drm_device correctly devm-ized is a big
> >>>>> step forward already, and will open up a lot of TODO items across a lot of
> >>>>> drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
> >>>>> structs, which gets released together with drm_device. I think that's a
> >>>>> much clearer path forward, I think we all agree that getting the kfree out
> >>>>> of the driver codes is a good thing, and it would allow us to do this
> >>>>> correctly.
> >>>>>
> >>>>> Then once we have that and rolled out to a few drivers we can reconsider
> >>>>> the entire unregister/shutdown gordian knot here. Atm I just have no idea
> >>>>> how to do this properly :-/
> >>>>>
> >>>>> Thoughts, other ideas?
> >>>>>
> >>>>
> >>>> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
> >>>> make much sense if we need a driver remove callback anyways.
> >>>
> >>> Yup, that's what I meant with the above:
> >>> - merge devm_drm_dev_register, use it a lot. This is definitely a good
> >>>   idea.
> >>> - create a drm_dev_kzalloc, which automatically releases memory on the
> >>>   final drm_dev_put. Use it everywhere in drivers for drm structures,
> >>>   especially in those drivers that currently use devm (which releases
> >>>   those allocations potentially too early on unplug).
> >>> - figure out the next steps after that
> >>>
> >>>> I think devm_drm_dev_init() makes sense because it yields a cleaner
> >>>> probe() function. An additional benefit is that it requires a
> >>>> drm_driver->release function which is a step in the right direction to
> >>>> get the drm_device lifetime right.
> >>>>
> >>>> Do we agree that a drm_dev_set_unplugged() function is necessary to get
> >>>> the remove/disconnect order right?
> >>>>
> >>>> What about drm_dev_unplug()? Maybe I should just leave it be?
> >>>>
> >>>> - amd uses drm_driver->unload, so that one takes some work to get right
> >>>>   to support unplug. It doesn't check the unplugged state, so really
> >>>>   doesn't need drm_dev_unplug() I guess. Do they have cards that can be
> >>>>   hotplugged?
> >>>
> >>> Yeah amd still uses ->load and ->unload, which is not great unfortunately.
> >>> I just stumbled over that when I tried to make a series to disable the
> >>> global drm_global_mutex for most drivers ...
> >>>
> >>>> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
> >>>>   It has only one drm_dev_is_unplugged() check and that is in its
> >>>>   fbdev->open callback.
> >>>
> >>> udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
> >>> refcounting issues when something is left on iirc. I think udl is written
> >>> under the assumption that ->unload is always called for an unplug, never
> >>> for an actual unload.
> >>>
> >>>> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
> >>>>   callback which I guess is not correct.
> >>>
> >>> Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
> >>> structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
> >>> could be put there, not sure honestly. But definitely not _shutdown().
> >>>
> >>>> This is what it will look like with a set unplugged function:
> >>>>
> >>>> void drm_dev_unplug(struct drm_device *dev)
> >>>> {
> >>>> 	drm_dev_set_unplugged(dev);
> >>>> 	drm_dev_unregister(dev);
> >>>> 	drm_dev_put(dev);
> >>>> }
> >>>>
> >>>> Hm, I should probably remove it to avoid further use of it since it's
> >>>> not correct for a modern driver.
> >>>
> >>> I think something flew over my head ... what's the drm_dev_set_unplugged
> >>> for? I think I'm not following you ...
> >>>
> >>
> >> Taking it a few steps back:
> >>
> >> This patch proposes to move 'dev->unplugged = true' from
> >> drm_dev_unplug() to drm_dev_unregister().
> >>
> >> Additionally I proposed this change to the drm_dev_unregister() docs:
> >>
> >>   * This should be called first in the device teardown code to make sure
> >> - * userspace can't access the device instance any more.
> >> + * userspace can't access the device instance any more. Drivers that support
> >> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> >> + * in order to disable the hardware on regular driver module unload.
> >>
> >> Which would give a driver remove callback like this:
> >>
> >> static int driver_remove()
> >> {
> >> 	drm_atomic_helper_shutdown();
> >> 	drm_dev_unregister();
> >> }
> >>
> >> Your reaction was that drm_atomic_helper_shutdown() needs to be called
> >> after drm_dev_unregister() to avoid a race resulting in leaked objects.
> >> However if we call it afterwards, ->unplugged will be true and the
> >> driver can't touch hardware.
> >>
> >> Then I proposed moving the unplugged state setting to:
> >>
> >> void drm_dev_set_unplugged(struct drm_device *dev)
> >> {
> >> 	dev->unplugged = true;
> >> 	synchronize_srcu(&drm_unplug_srcu);
> >> }
> >>
> >> Now it is possible to avoid the race and still touch hardware:
> >>
> >> static int driver_remove()
> >> {
> >> 	drm_dev_unregister();
> >> 	drm_atomic_helper_shutdown();
> >> 	drm_dev_set_unplugged();
> >> }
> >>
> >> But now I'm back to the question: Is it the driver or the device that is
> >> going away?
> >>
> >> If it's the driver we are fine touching hardware, if it's the device it
> >> depends on how we access the device resource and whether the subsystem
> >> has protection in place to handle access after the device is gone. I
> >> think USB can handle and block device access up until the disconnect
> >> callback has finished (no point in doing so though, since the normal
> >> operation is that the device is gone, not the driver unloading).
> >>
> >> Is there a way to determine who's going away without changes to the
> >> device core?
> >>
> >> Maybe. The driver can only be unregistered if there are no open file
> >> handles because a ref is taken on the driver module.
> > 
> > This isn't true. You can unbind through sysfs (not many people do, but
> > it's possible). The module reference only keeps the cpu instructions valid,
> > nothing else.
> > 
> >> So maybe something along these lines:
> >>
> >> int drm_dev_open_count(struct drm_device *dev)
> >> {
> >> 	int open_count;
> >>
> >> 	mutex_lock(&drm_global_mutex);
> >> 	open_count = dev->open_count;
> >> 	mutex_unlock(&drm_global_mutex);
> > 
> > Random style nit: READ_ONCE, no locking needed (the locks don't change
> > anything, except if you have really strange locking rules). Serves
> > double-duty as a huge warning sign that something tricky is happening.
> > 
> >> 	return open_count;
> >> }
> >>
> >> static int driver_remove()
> >> {
> >> 	drm_dev_unregister();
> >>
> >> 	open_count = drm_dev_open_count();
> >>
> >> 	/* Open fd's, device is going away, block access */
> >> 	if (open_count)
> >> 		drm_dev_set_unplugged();
> >>
> >> 	drm_atomic_helper_shutdown();
> >>
> >> 	/* No open fd's, driver is going away */
> >> 	if (!open_count)
> >> 		drm_dev_set_unplugged();
> >> }
> >>
> >>
> >> Personally I have 2 use cases:
> >> - tinydrm SPI drivers
> >>   The only hotpluggable SPI controllers I know of are USB which should
> >>   handle device access while unregistering.
> >>   SPI devices can be removed, but the controller driver is still in
> >>   place so it's safe.
> >> - A future USB driver (that I hope to work on when all this core stuff
> >>   is in place).
> >>   There's no point in touching the hw, so DRM can be set unplugged right
> >>   away in the disconnect() callback.
> >>
> >> This means that I don't need to know who's going away for my use cases.
> >>
> >> This turned out to be quite long-winded, but the bottom line is that
> >> having a separate function to set the unplugged state seems to give a
> >> lot of flexibility for various use cases.
> >>
> >> I hope I didn't muddy the waters even more :-)
> > 
> > Hm, I think I get your idea. Since I'm still not sure what the best option
> > is I'm leaning towards just leaving stuff as-is. I.e. drivers which want
> > to shut down hw can do the 1. drm_dev_unregister() 2.
> > drm_atomic_helper_shutdown() sequence. Drivers which care about hotunplug
> > more can do just the drm_dev_unplug().
> > 
> > Yes it's messy and unsatisfactory, but in case of serious doubt I like to
> > wait until maybe in the future we have a good idea. Meanwhile leaving a
> > bit of a mess around is better than charging ahead in a possibly totally
> > wrong direction.
> > 
> > There are cases where the clue still hasn't hit me (or anyone else), even
> > years later. That's just how it is sometimes.
> > 
> 
> Ok, I'll drop this.
> 
> This means that I'll drop devm_drm_dev_init() as well since it doesn't
> play well with drm_dev_unplug() since both will do drm_dev_put(). Not a
> big deal really, it just means that I need to add one error path in the
> probe function so I can drm_dev_put() on error.

Hm, I think we could just move the drm_dev_put() out of drm_dev_unplug(),
and then encourage everyone to use devm_drm_dev_init() everywhere. I would
like to get started with automatic cleanup in drm somehow, and
devm_drm_dev_init() doing the drm_dev_put() looks like a really good idea.
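
To make the refcount argument concrete, here is a rough userspace sketch,
not kernel code. Every mock_* name is made up for illustration; the sketch
only assumes what was said above: the devm action owns the one reference
and drops it on unbind, and unplug no longer does a put of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct drm_device; NOT the kernel structure. */
struct mock_dev {
	int refcount;    /* models the drm_device reference count */
	bool registered;
	bool unplugged;
	bool released;   /* set once the final reference is dropped */
};

/*
 * Models devm_drm_dev_init(): the device starts with one reference,
 * and that reference belongs to the devm cleanup action.
 */
static void mock_dev_init(struct mock_dev *dev)
{
	dev->refcount = 1;
	dev->registered = false;
	dev->unplugged = false;
	dev->released = false;
}

/* Models drm_dev_register(): only the visible state changes. */
static void mock_dev_register(struct mock_dev *dev)
{
	dev->registered = true;
}

/* Models drm_dev_put(): the final put releases the device. */
static void mock_dev_put(struct mock_dev *dev)
{
	assert(dev->refcount > 0); /* a put without a reference is a bug */
	if (--dev->refcount == 0)
		dev->released = true;
}

/*
 * Assumed new shape of unplug: mark the device unplugged and unregister
 * it, but leave the reference alone.
 */
static void mock_dev_unplug(struct mock_dev *dev)
{
	dev->unplugged = true;
	dev->registered = false;
}

/* Models the devm action that runs when the parent device is unbound. */
static void mock_devm_release(struct mock_dev *dev)
{
	mock_dev_put(dev);
}
```

With that split, a disconnect path calls mock_dev_unplug() and the single
put happens later in mock_devm_release(). If unplug also did a put, the
devm action would hit the assert in mock_dev_put(), which is the conflict
you describe.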

> Are you still ok with the first bug fix patch in this series?

Yeah that one still looks good.

Cheers, Daniel

> 
> > Zooming out more looking at the big picture I'd say all your work in the
> > past few years has enormously simplified drm for simple drivers already.
> > If we can't resolve this one here right now that just means you "only"
> > made drm 98% simpler instead of maybe 99%. It's still an epic win :-)
> > 
> 
> Thanks, your mentoring and reviews helped turn my rough ideas into
> something useful :-)
> 
> Noralf.
> 
> > Cheers, Daniel
> > 
> > 
> >> Noralf.
> >>
> >>> Thanks, Daniel
> >>>
> >>>>
> >>>> Noralf.
> >>>>
> >>>>> Cheers, Daniel
> >>>>>
> >>>>>> Noralf.
> >>>>>>
> >>>>>>>> + * in order to disable the hardware on regular driver module unload.
> >>>>>>>>   */
> >>>>>>>>  void drm_dev_unregister(struct drm_device *dev)
> >>>>>>>>  {
> >>>>>>>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
> >>>>>>>>  	if (drm_core_check_feature(dev, DRIVER_LEGACY))
> >>>>>>>>  		drm_lastclose(dev);
> >>>>>>>>  
> >>>>>>>> +	/*
> >>>>>>>> +	 * After synchronizing any critical read section is guaranteed to see
> >>>>>>>> +	 * the new value of ->unplugged, and any critical section which might
> >>>>>>>> +	 * still have seen the old value of ->unplugged is guaranteed to have
> >>>>>>>> +	 * finished.
> >>>>>>>> +	 */
> >>>>>>>> +	dev->unplugged = true;
> >>>>>>>> +	synchronize_srcu(&drm_unplug_srcu);
> >>>>>>>> +
> >>>>>>>>  	dev->registered = false;
> >>>>>>>>  
> >>>>>>>>  	drm_client_dev_unregister(dev);
> >>>>>>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> >>>>>>>> index ca46a45a9cce..c50696c82a42 100644
> >>>>>>>> --- a/include/drm/drm_drv.h
> >>>>>>>> +++ b/include/drm/drm_drv.h
> >>>>>>>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
> >>>>>>>>   * drm_dev_is_unplugged - is a DRM device unplugged
> >>>>>>>>   * @dev: DRM device
> >>>>>>>>   *
> >>>>>>>> - * This function can be called to check whether a hotpluggable is unplugged.
> >>>>>>>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> >>>>>>>> - * unplugged, these two functions guarantee that any store before calling
> >>>>>>>> - * drm_dev_unplug() is visible to callers of this function after it completes
> >>>>>>>> + * This function can be called to check whether @dev is unregistered. This can
> >>>>>>>> + * be used to detect that the underlying parent device is gone.
> >>>>>>>
> >>>>>>> I think it'd be good to keep the first part, and just update the reference
> >>>>>>> to drm_dev_unregister. So:
> >>>>>>>
> >>>>>>>  * This function can be called to check whether a hotpluggable is unplugged.
> >>>>>>>  * Unplugging itself is signalled through drm_dev_unregister(). If a device is
> >>>>>>>  * unplugged, these two functions guarantee that any store before calling
> >>>>>>>  * drm_dev_unregister() is visible to callers of this function after it
> >>>>>>>  * completes.
> >>>>>>>
> >>>>>>> I think your version sweeps a few important details under the rug. With
> >>>>>>> those nits addressed:
> >>>>>>>
> >>>>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>>>>
> >>>>>>> Cheers, Daniel
> >>>>>>>
> >>>>>>>>   *
> >>>>>>>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> >>>>>>>> - * recommended that drivers instead use the underlying drm_dev_enter() and
> >>>>>>>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> >>>>>>>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
> >>>>>>>>   * drm_dev_exit() function pairs.
> >>>>>>>>   */
> >>>>>>>>  static inline bool drm_dev_is_unplugged(struct drm_device *dev)
> >>>>>>>> -- 
> >>>>>>>> 2.20.1
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >
Sean Paul Feb. 7, 2019, 9:07 p.m. UTC | #12
On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> The only thing now that makes drm_dev_unplug() special is that it sets
> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> can remove drm_dev_unplug().
> 
> Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> ---
> 
> Maybe s/unplugged/unregistered/ ?
> 
> I looked at drm_device->registered, but using that would mean that
> drm_dev_is_unplugged() would return before drm_device is registered.
> And given that its current purpose is to prevent race against connector
> registration, I stayed away from it.

Makes sense. I think having unregistered along with registered might cause some
confusion unless it's really clearly documented. Perhaps
s/unplugged/unregistered/ and s/registered/initialized/ but init has its own
meaning...

I'd hate to hold up the patchset over naming bikeshed, so maybe this would be
better off as a follow-on series where everyone can pile on opinions. So,

Reviewed-by: Sean Paul <sean@poorly.run>


Sean Paul Feb. 7, 2019, 9:52 p.m. UTC | #13
On Thu, Feb 07, 2019 at 04:07:33PM -0500, Sean Paul wrote:
> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> > The only thing now that makes drm_dev_unplug() special is that it sets
> > drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> > can remove drm_dev_unplug().
> > 
> > Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
> > ---
> > 
> > Maybe s/unplugged/unregistered/ ?
> > 
> > I looked at drm_device->registered, but using that would mean that
> > drm_dev_is_unplugged() would return before drm_device is registered.
> > And given that its current purpose is to prevent race against connector
> > registration, I stayed away from it.
> 
> Makes sense. I think having unregistered along with registered might cause some
> confusion unless it's really clearly documented. Perhaps
> s/unplugged/unregistered/ and s/registered/initialized/ but init has its own
> meaning...
> 
> I'd hate to hold up the patchset over naming bikeshed, so maybe this would be
> better off as a follow-on series where everyone can pile on opinions. So,
> 
> Reviewed-by: Sean Paul <sean@poorly.run>

Looks like I was dropped from Cc, so I didn't see the replies after Oleksandr's
review. So go ahead and disregard this :-)

Sean

Patch

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 05bbc2b622fc..e0941200edc6 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -366,15 +366,6 @@  EXPORT_SYMBOL(drm_dev_exit);
  */
 void drm_dev_unplug(struct drm_device *dev)
 {
-	/*
-	 * After synchronizing any critical read section is guaranteed to see
-	 * the new value of ->unplugged, and any critical section which might
-	 * still have seen the old value of ->unplugged is guaranteed to have
-	 * finished.
-	 */
-	dev->unplugged = true;
-	synchronize_srcu(&drm_unplug_srcu);
-
 	drm_dev_unregister(dev);
 	drm_dev_put(dev);
 }
@@ -832,11 +823,14 @@  EXPORT_SYMBOL(drm_dev_register);
  * drm_dev_register() but does not deallocate the device. The caller must call
  * drm_dev_put() to drop their final reference.
  *
- * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
- * which can be called while there are still open users of @dev.
+ * This function can be called while there are still open users of @dev as long
+ * as the driver protects its device resources using drm_dev_enter() and
+ * drm_dev_exit().
  *
  * This should be called first in the device teardown code to make sure
- * userspace can't access the device instance any more.
+ * userspace can't access the device instance any more. Drivers that support
+ * device unplug will probably want to call drm_atomic_helper_shutdown() first
+ * in order to disable the hardware on regular driver module unload.
  */
 void drm_dev_unregister(struct drm_device *dev)
 {
@@ -845,6 +839,15 @@  void drm_dev_unregister(struct drm_device *dev)
 	if (drm_core_check_feature(dev, DRIVER_LEGACY))
 		drm_lastclose(dev);
 
+	/*
+	 * After synchronizing any critical read section is guaranteed to see
+	 * the new value of ->unplugged, and any critical section which might
+	 * still have seen the old value of ->unplugged is guaranteed to have
+	 * finished.
+	 */
+	dev->unplugged = true;
+	synchronize_srcu(&drm_unplug_srcu);
+
 	dev->registered = false;
 
 	drm_client_dev_unregister(dev);
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index ca46a45a9cce..c50696c82a42 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -736,13 +736,11 @@  void drm_dev_unplug(struct drm_device *dev);
  * drm_dev_is_unplugged - is a DRM device unplugged
  * @dev: DRM device
  *
- * This function can be called to check whether a hotpluggable is unplugged.
- * Unplugging itself is singalled through drm_dev_unplug(). If a device is
- * unplugged, these two functions guarantee that any store before calling
- * drm_dev_unplug() is visible to callers of this function after it completes
+ * This function can be called to check whether @dev is unregistered. This can
+ * be used to detect that the underlying parent device is gone.
  *
- * WARNING: This function fundamentally races against drm_dev_unplug(). It is
- * recommended that drivers instead use the underlying drm_dev_enter() and
+ * WARNING: This function fundamentally races against drm_dev_unregister(). It
+ * is recommended that drivers instead use the underlying drm_dev_enter() and
  * drm_dev_exit() function pairs.
  */
 static inline bool drm_dev_is_unplugged(struct drm_device *dev)