diff mbox

Can no longer shutdown after drm/radeon: Implement radeon_pci_shutdown

Message ID 20131212025800.GA273@x4 (mailing list archive)
State New, archived
Headers show

Commit Message

Markus Trippelsdorf Dec. 12, 2013, 2:58 a.m. UTC
On 2013.12.11 at 23:46 +0000, Deucher, Alexander wrote:
> > -----Original Message-----
> > From: Peter Chubb [mailto:peter.chubb@nicta.com.au]
> > Sent: Wednesday, December 11, 2013 5:11 PM
> > To: Markus Trippelsdorf
> > Cc: Peter Chubb; Deucher, Alexander; airlied@linux.ie; dri-
> > devel@lists.freedesktop.org
> > Subject: Re: Can no longer shutdown after drm/radeon: Implement
> > radeon_pci_shutdown
> > 
> > >>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de> writes:
> > 
> > Markus> On 2013.12.11 at 11:37 +1100, Peter Chubb wrote:
> > 
> > Markus> It would be interesting to know where exactly it hangs.  Could
> > Markus> you comment out the *_fini(rdev) calls in
> > Markus> radeon_driver_unload_kms
> > (drivers/gpu/drm/radeon/radeon_kms.c)
> > Markus> one after the other to find out which one is responsible?
> > 
> > It's radeon_device_fini() that is the problem.
> 
> I think the problem is that the drm subsystem tears down the device
> via drm_driver.unload in drm_dev_unregister(), but now that we have a
> pci_driver.shutdown callback (which is needed for kexec) that gets
> called too so the driver gets torn down twice.

If that is the case the following patch should fix the issue.
Can you give it a try, Peter?

Comments

Peter Chubb Dec. 12, 2013, 3:10 a.m. UTC | #1
>>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de> writes:
>> > >>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de>


Markus> If that is the case the following patch should fix the issue.
Markus> Can you give it a try, Peter?

Thanks that works.  I tested shutdown, kexec, and s2disk --- all work
correctly.


Peter C
--
Dr Peter Chubb				        peter.chubb AT nicta.com.au
http://www.ssrg.nicta.com.au          Software Systems Research Group/NICTA
Alex Deucher Dec. 12, 2013, 3:27 a.m. UTC | #2
> -----Original Message-----
> From: Markus Trippelsdorf [mailto:markus@trippelsdorf.de]
> Sent: Wednesday, December 11, 2013 9:58 PM
> To: Deucher, Alexander
> Cc: Peter Chubb; airlied@linux.ie; dri-devel@lists.freedesktop.org
> Subject: Re: Can no longer shutdown after drm/radeon: Implement
> radeon_pci_shutdown
> 
> On 2013.12.11 at 23:46 +0000, Deucher, Alexander wrote:
> > > -----Original Message-----
> > > From: Peter Chubb [mailto:peter.chubb@nicta.com.au]
> > > Sent: Wednesday, December 11, 2013 5:11 PM
> > > To: Markus Trippelsdorf
> > > Cc: Peter Chubb; Deucher, Alexander; airlied@linux.ie; dri-
> > > devel@lists.freedesktop.org
> > > Subject: Re: Can no longer shutdown after drm/radeon: Implement
> > > radeon_pci_shutdown
> > >
> > > >>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de>
> writes:
> > >
> > > Markus> On 2013.12.11 at 11:37 +1100, Peter Chubb wrote:
> > >
> > > Markus> It would be interesting to know where exactly it hangs.  Could
> > > Markus> you comment out the *_fini(rdev) calls in
> > > Markus> radeon_driver_unload_kms
> > > (drivers/gpu/drm/radeon/radeon_kms.c)
> > > Markus> one after the other to find out which one is responsible?
> > >
> > > It's radeon_device_fini() that is the problem.
> >
> > I think the problem is that the drm subsystem tears down the device
> > via drm_driver.unload in drm_dev_unregister(), but now that we have a
> > pci_driver.shutdown callback (which is needed for kexec) that gets
> > called too so the driver gets torn down twice.
> 
> If that is the case the following patch should fix the issue.
> Can you give it a try, Peter?

That may work, but I think it's just papering over a race which may still bite someone else depending on the timing.

Alex

> 
> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c
> b/drivers/gpu/drm/radeon/radeon_kms.c
> index 55d0b474bd37..539e5f1ff5e3 100644
> --- a/drivers/gpu/drm/radeon/radeon_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -59,7 +59,8 @@ int radeon_driver_unload_kms(struct drm_device
> *dev)
>  	radeon_acpi_fini(rdev);
> 
>  	radeon_modeset_fini(rdev);
> -	radeon_device_fini(rdev);
> +	if (!rdev->shutdown)
> +		radeon_device_fini(rdev);
> 
>  done_free:
>  	kfree(rdev);
> --
> Markus
Markus Trippelsdorf Dec. 12, 2013, 1:56 p.m. UTC | #3
On 2013.12.12 at 03:27 +0000, Deucher, Alexander wrote:
> > On 2013.12.11 at 23:46 +0000, Deucher, Alexander wrote:
> > > > -----Original Message-----
> > > > From: Peter Chubb [mailto:peter.chubb@nicta.com.au]
> > > > Sent: Wednesday, December 11, 2013 5:11 PM
> > > > To: Markus Trippelsdorf
> > > > Cc: Peter Chubb; Deucher, Alexander; airlied@linux.ie; dri-
> > > > devel@lists.freedesktop.org
> > > > Subject: Re: Can no longer shutdown after drm/radeon: Implement
> > > > radeon_pci_shutdown
> > > >
> > > > >>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de>
> > writes:
> > > >
> > > > Markus> On 2013.12.11 at 11:37 +1100, Peter Chubb wrote:
> > > >
> > > > Markus> It would be interesting to know where exactly it hangs.  Could
> > > > Markus> you comment out the *_fini(rdev) calls in
> > > > Markus> radeon_driver_unload_kms
> > > > (drivers/gpu/drm/radeon/radeon_kms.c)
> > > > Markus> one after the other to find out which one is responsible?
> > > >
> > > > It's radeon_device_fini() that is the problem.
> > >
> > > I think the problem is that the drm subsystem tears down the device
> > > via drm_driver.unload in drm_dev_unregister(), but now that we have a
> > > pci_driver.shutdown callback (which is needed for kexec) that gets
> > > called too so the driver gets torn down twice.
> > 
> > If that is the case the following patch should fix the issue.
> > Can you give it a try, Peter?
(Peter:)
> Thanks that works.  I tested shutdown, kexec, and s2disk --- all work
> correctly.

> That may work, but I think it's just papering over a race which may
> still bite someone else depending on the timing.

This leaves three possibilities:

1) Revert 846ae41ae99d now and come up with a solution with proper
locking for 3.14
2) Add my simple fix now and implement additional locking if the need
arises for 3.14.
3) Implement a fix with proper locking now.

It's your choice Alex.
Alex Deucher Dec. 12, 2013, 2:11 p.m. UTC | #4
On Thu, Dec 12, 2013 at 8:56 AM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2013.12.12 at 03:27 +0000, Deucher, Alexander wrote:
>> > On 2013.12.11 at 23:46 +0000, Deucher, Alexander wrote:
>> > > > -----Original Message-----
>> > > > From: Peter Chubb [mailto:peter.chubb@nicta.com.au]
>> > > > Sent: Wednesday, December 11, 2013 5:11 PM
>> > > > To: Markus Trippelsdorf
>> > > > Cc: Peter Chubb; Deucher, Alexander; airlied@linux.ie; dri-
>> > > > devel@lists.freedesktop.org
>> > > > Subject: Re: Can no longer shutdown after drm/radeon: Implement
>> > > > radeon_pci_shutdown
>> > > >
>> > > > >>>>> "Markus" == Markus Trippelsdorf <markus@trippelsdorf.de>
>> > writes:
>> > > >
>> > > > Markus> On 2013.12.11 at 11:37 +1100, Peter Chubb wrote:
>> > > >
>> > > > Markus> It would be interesting to know where exactly it hangs.  Could
>> > > > Markus> you comment out the *_fini(rdev) calls in
>> > > > Markus> radeon_driver_unload_kms
>> > > > (drivers/gpu/drm/radeon/radeon_kms.c)
>> > > > Markus> one after the other to find out which one is responsible?
>> > > >
>> > > > It's radeon_device_fini() that is the problem.
>> > >
>> > > I think the problem is that the drm subsystem tears down the device
>> > > via drm_driver.unload in drm_dev_unregister(), but now that we have a
>> > > pci_driver.shutdown callback (which is needed for kexec) that gets
>> > > called too so the driver gets torn down twice.
>> >
>> > If that is the case the following patch should fix the issue.
>> > Can you give it a try, Peter?
> (Peter:)
>> Thanks that works.  I tested shutdown, kexec, and s2disk --- all work
>> correctly.
>
>> That may work, but I think it's just papering over a race which may
>> still bite someone else depending on the timing.
>
> This leaves three possibilities:
>
> 1) Revert 846ae41ae99d now and come up with a solution with proper
> locking for 3.14
> 2) Add my simple fix now and implement additional locking if the need
> arises for 3.14.
> 3) Implement a fix with proper locking now.
>
> It's your choice Alex.

I guess I'll revert it for now.  I need to think about it a bit more.
Thanks for your help.

Alex

>
> --
> Markus
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
diff mbox

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 55d0b474bd37..539e5f1ff5e3 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -59,7 +59,8 @@  int radeon_driver_unload_kms(struct drm_device *dev)
 	radeon_acpi_fini(rdev);
 	
 	radeon_modeset_fini(rdev);
-	radeon_device_fini(rdev);
+	if (!rdev->shutdown)
+		radeon_device_fini(rdev);
 
 done_free:
 	kfree(rdev);