diff mbox series

gpu: drm: radeon: Set DPM_FLAG_NEVER_SKIP when enabling PM-runtime

Message ID 2281684.8tZHfIXjiu@aspire.rjw.lan (mailing list archive)
State Not Applicable, archived
Headers show
Series gpu: drm: radeon: Set DPM_FLAG_NEVER_SKIP when enabling PM-runtime | expand

Commit Message

Rafael J. Wysocki Feb. 14, 2019, 10:46 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
and the direct-complete optimization is used for the radeon device
during system-wide suspend, the system doesn't resume.

Preventing direct-complete from being used with the radeon device by
setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
go away, which indicates that direct-complete is not safe for the
radeon driver in general and should not be used with it (at least
for now).

This fixes a regression introduced by commit c62ec4610c40
("PM / core: Fix direct_complete handling for devices with no
callbacks") which allowed direct-complete to be applied to
devices without PM callbacks (again) which in turn unlocked
direct-complete for radeon on HP ProBook 4540s.

Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519
Reported-by: Ярослав Семченко <ukrkyi@gmail.com>
Tested-by: Ярослав Семченко <ukrkyi@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/gpu/drm/radeon/radeon_kms.c |    1 +
 1 file changed, 1 insertion(+)

Comments

Alex Deucher Feb. 15, 2019, 4:01 p.m. UTC | #1
On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> and the direct-complete optimization is used for the radeon device
> during system-wide suspend, the system doesn't resume.
>
> Preventing direct-complete from being used with the radeon device by
> setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> go away, which indicates that direct-complete is not safe for the
> radeon driver in general and should not be used with it (at least
> for now).
>
> This fixes a regression introduced by commit c62ec4610c40
> ("PM / core: Fix direct_complete handling for devices with no
> callbacks") which allowed direct-complete to be applied to
> devices without PM callbacks (again) which in turn unlocked
> direct-complete for radeon on HP ProBook 4540s.

Do other similar drivers like amdgpu and nouveau need the same fix?
I'm not too familiar with the direct_complete feature in general.

Alex

>
> Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519
> Reported-by: Ярослав Семченко <ukrkyi@gmail.com>
> Tested-by: Ярослав Семченко <ukrkyi@gmail.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/gpu/drm/radeon/radeon_kms.c |    1 +
>  1 file changed, 1 insertion(+)
>
> Index: linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> ===================================================================
> --- linux-pm.orig/drivers/gpu/drm/radeon/radeon_kms.c
> +++ linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -172,6 +172,7 @@ int radeon_driver_load_kms(struct drm_de
>         }
>
>         if (radeon_is_px(dev)) {
> +               dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
>                 pm_runtime_use_autosuspend(dev->dev);
>                 pm_runtime_set_autosuspend_delay(dev->dev, 5000);
>                 pm_runtime_set_active(dev->dev);
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Lukas Wunner Feb. 16, 2019, 6 a.m. UTC | #2
On Fri, Feb 15, 2019 at 11:01:04AM -0500, Alex Deucher wrote:
> On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> > and the direct-complete optimization is used for the radeon device
> > during system-wide suspend, the system doesn't resume.
> >
> > Preventing direct-complete from being used with the radeon device by
> > setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> > go away, which indicates that direct-complete is not safe for the
> > radeon driver in general and should not be used with it (at least
> > for now).
> >
> > This fixes a regression introduced by commit c62ec4610c40
> > ("PM / core: Fix direct_complete handling for devices with no
> > callbacks") which allowed direct-complete to be applied to
> > devices without PM callbacks (again) which in turn unlocked
> > direct-complete for radeon on HP ProBook 4540s.
> 
> Do other similar drivers like amdgpu and nouveau need the same fix?
> I'm not too familiar with the direct_complete feature in general.

direct_complete means that a discrete GPU which is in D3cold upon
entering system sleep is left as is, i.e. it is not woken.  It is
also expected to still be in D3cold when resuming from system sleep
from the PM core's point of view.  (If it is in D0uninitialized, the
GPU's driver needs to ensure it is transitioned to D3cold again.)

I know for a fact that resuming the discrete GPU is not necessary
on my MacBook Pro with Nvidia GPU.  I'd expect those with AMD GPUs
to behave the same.  The apple-gmux driver takes care of putting
the GPU into D3cold on resume from system sleep if it was in D3cold
when entering system sleep (see drivers/platform/x86/apple-gmux.c,
gmux_resume()).

I think it is desirable to use direct_complete because it saves power
(no need to gratuitously wake the GPU upon entering system sleep,
only to immediately cut its power) and it also speeds up the suspend
process by about half a second.

The root cause on the HP ProBook 4540s needs to be debugged, I'd
suspect a BIOS issue which could be adressed by a quirk, either for
this particular machine or for a certain class of devices (e.g. all
machines which use PR3 to transition to D3cold) if that is necessary
to behave identically to Windows.  Or maybe the atpx vga_switcheroo
handler needs to be amended to put the GPU into D3cold on resume from
system sleep if it was runtime suspended before.

Is this machine using s2idle or does it suspend to S3?

Thanks,

Lukas

> > Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519
> > Reported-by: ?????????????? ???????????????? <ukrkyi@gmail.com>
> > Tested-by: ?????????????? ???????????????? <ukrkyi@gmail.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/gpu/drm/radeon/radeon_kms.c |    1 +
> >  1 file changed, 1 insertion(+)
> >
> > Index: linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > ===================================================================
> > --- linux-pm.orig/drivers/gpu/drm/radeon/radeon_kms.c
> > +++ linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > @@ -172,6 +172,7 @@ int radeon_driver_load_kms(struct drm_de
> >         }
> >
> >         if (radeon_is_px(dev)) {
> > +               dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
> >                 pm_runtime_use_autosuspend(dev->dev);
> >                 pm_runtime_set_autosuspend_delay(dev->dev, 5000);
> >                 pm_runtime_set_active(dev->dev);
Alex Deucher Feb. 16, 2019, 11:37 p.m. UTC | #3
On Sat, Feb 16, 2019 at 1:01 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Fri, Feb 15, 2019 at 11:01:04AM -0500, Alex Deucher wrote:
> > On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> > > and the direct-complete optimization is used for the radeon device
> > > during system-wide suspend, the system doesn't resume.
> > >
> > > Preventing direct-complete from being used with the radeon device by
> > > setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> > > go away, which indicates that direct-complete is not safe for the
> > > radeon driver in general and should not be used with it (at least
> > > for now).
> > >
> > > This fixes a regression introduced by commit c62ec4610c40
> > > ("PM / core: Fix direct_complete handling for devices with no
> > > callbacks") which allowed direct-complete to be applied to
> > > devices without PM callbacks (again) which in turn unlocked
> > > direct-complete for radeon on HP ProBook 4540s.
> >
> > Do other similar drivers like amdgpu and nouveau need the same fix?
> > I'm not too familiar with the direct_complete feature in general.
>
> direct_complete means that a discrete GPU which is in D3cold upon
> entering system sleep is left as is, i.e. it is not woken.  It is
> also expected to still be in D3cold when resuming from system sleep
> from the PM core's point of view.  (If it is in D0uninitialized, the
> GPU's driver needs to ensure it is transitioned to D3cold again.)
>
> I know for a fact that resuming the discrete GPU is not necessary
> on my MacBook Pro with Nvidia GPU.  I'd expect those with AMD GPUs
> to behave the same.  The apple-gmux driver takes care of putting
> the GPU into D3cold on resume from system sleep if it was in D3cold
> when entering system sleep (see drivers/platform/x86/apple-gmux.c,
> gmux_resume()).
>
> I think it is desirable to use direct_complete because it saves power
> (no need to gratuitously wake the GPU upon entering system sleep,
> only to immediately cut its power) and it also speeds up the suspend
> process by about half a second.

Thanks for the info.  It sounds like we need a similar patch for
amdgpu.  With dGPUs controlled by the ACPI ATPX method, I believe the
dGPU is powered by automatically on resume from S3/S4.  I think there
may be a way to change that behavior in some revisions of ATPX (i.e.,
to keep the state across suspend cycles), but it's not the default.
I'm not sure about the newer _PR3 stuff in Hybrid Graphics laptops.  I
think it retains state.  In both radeon and amdgpu we probably need to
check if the system is using ATPX or _PR3 and disable direct complete
for ATPX at least.

Alex

>
> The root cause on the HP ProBook 4540s needs to be debugged, I'd
> suspect a BIOS issue which could be adressed by a quirk, either for
> this particular machine or for a certain class of devices (e.g. all
> machines which use PR3 to transition to D3cold) if that is necessary
> to behave identically to Windows.  Or maybe the atpx vga_switcheroo
> handler needs to be amended to put the GPU into D3cold on resume from
> system sleep if it was runtime suspended before.
>
> Is this machine using s2idle or does it suspend to S3?
>
> Thanks,
>
> Lukas
>
> > > Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks")
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519
> > > Reported-by: ?????????????? ???????????????? <ukrkyi@gmail.com>
> > > Tested-by: ?????????????? ???????????????? <ukrkyi@gmail.com>
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/gpu/drm/radeon/radeon_kms.c |    1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > Index: linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/gpu/drm/radeon/radeon_kms.c
> > > +++ linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
> > > @@ -172,6 +172,7 @@ int radeon_driver_load_kms(struct drm_de
> > >         }
> > >
> > >         if (radeon_is_px(dev)) {
> > > +               dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
> > >                 pm_runtime_use_autosuspend(dev->dev);
> > >                 pm_runtime_set_autosuspend_delay(dev->dev, 5000);
> > >                 pm_runtime_set_active(dev->dev);
Rafael J. Wysocki Feb. 17, 2019, 9:25 p.m. UTC | #4
On Sun, Feb 17, 2019 at 12:37 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Sat, Feb 16, 2019 at 1:01 AM Lukas Wunner <lukas@wunner.de> wrote:
> >
> > On Fri, Feb 15, 2019 at 11:01:04AM -0500, Alex Deucher wrote:
> > > On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > > On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> > > > and the direct-complete optimization is used for the radeon device
> > > > during system-wide suspend, the system doesn't resume.
> > > >
> > > > Preventing direct-complete from being used with the radeon device by
> > > > setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> > > > go away, which indicates that direct-complete is not safe for the
> > > > radeon driver in general and should not be used with it (at least
> > > > for now).
> > > >
> > > > This fixes a regression introduced by commit c62ec4610c40
> > > > ("PM / core: Fix direct_complete handling for devices with no
> > > > callbacks") which allowed direct-complete to be applied to
> > > > devices without PM callbacks (again) which in turn unlocked
> > > > direct-complete for radeon on HP ProBook 4540s.
> > >
> > > Do other similar drivers like amdgpu and nouveau need the same fix?
> > > I'm not too familiar with the direct_complete feature in general.
> >
> > direct_complete means that a discrete GPU which is in D3cold upon
> > entering system sleep is left as is, i.e. it is not woken.  It is
> > also expected to still be in D3cold when resuming from system sleep
> > from the PM core's point of view.  (If it is in D0uninitialized, the
> > GPU's driver needs to ensure it is transitioned to D3cold again.)
> >
> > I know for a fact that resuming the discrete GPU is not necessary
> > on my MacBook Pro with Nvidia GPU.  I'd expect those with AMD GPUs
> > to behave the same.  The apple-gmux driver takes care of putting
> > the GPU into D3cold on resume from system sleep if it was in D3cold
> > when entering system sleep (see drivers/platform/x86/apple-gmux.c,
> > gmux_resume()).
> >
> > I think it is desirable to use direct_complete because it saves power
> > (no need to gratuitously wake the GPU upon entering system sleep,
> > only to immediately cut its power) and it also speeds up the suspend
> > process by about half a second.
>
> Thanks for the info.  It sounds like we need a similar patch for
> amdgpu.  With dGPUs controlled by the ACPI ATPX method, I believe the
> dGPU is powered by automatically on resume from S3/S4.  I think there
> may be a way to change that behavior in some revisions of ATPX (i.e.,
> to keep the state across suspend cycles), but it's not the default.
> I'm not sure about the newer _PR3 stuff in Hybrid Graphics laptops.  I
> think it retains state.  In both radeon and amdgpu we probably need to
> check if the system is using ATPX or _PR3 and disable direct complete
> for ATPX at least.

I would disable direct-complete entirely for them then and possibly
consider using DPM_FLAG_SMART_SUSPEND in the cases when that would be
safe.

Anyway, I posted this patch for radeon, because it addresses a
specific regression and I'm not super-familiar with GPU drivers in
general.

Cheers,
Rafael
Alex Deucher Feb. 18, 2019, 10:21 p.m. UTC | #5
On Sun, Feb 17, 2019 at 4:26 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Sun, Feb 17, 2019 at 12:37 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Sat, Feb 16, 2019 at 1:01 AM Lukas Wunner <lukas@wunner.de> wrote:
> > >
> > > On Fri, Feb 15, 2019 at 11:01:04AM -0500, Alex Deucher wrote:
> > > > On Fri, Feb 15, 2019 at 10:39 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > > > On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
> > > > > and the direct-complete optimization is used for the radeon device
> > > > > during system-wide suspend, the system doesn't resume.
> > > > >
> > > > > Preventing direct-complete from being used with the radeon device by
> > > > > setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
> > > > > go away, which indicates that direct-complete is not safe for the
> > > > > radeon driver in general and should not be used with it (at least
> > > > > for now).
> > > > >
> > > > > This fixes a regression introduced by commit c62ec4610c40
> > > > > ("PM / core: Fix direct_complete handling for devices with no
> > > > > callbacks") which allowed direct-complete to be applied to
> > > > > devices without PM callbacks (again) which in turn unlocked
> > > > > direct-complete for radeon on HP ProBook 4540s.
> > > >
> > > > Do other similar drivers like amdgpu and nouveau need the same fix?
> > > > I'm not too familiar with the direct_complete feature in general.
> > >
> > > direct_complete means that a discrete GPU which is in D3cold upon
> > > entering system sleep is left as is, i.e. it is not woken.  It is
> > > also expected to still be in D3cold when resuming from system sleep
> > > from the PM core's point of view.  (If it is in D0uninitialized, the
> > > GPU's driver needs to ensure it is transitioned to D3cold again.)
> > >
> > > I know for a fact that resuming the discrete GPU is not necessary
> > > on my MacBook Pro with Nvidia GPU.  I'd expect those with AMD GPUs
> > > to behave the same.  The apple-gmux driver takes care of putting
> > > the GPU into D3cold on resume from system sleep if it was in D3cold
> > > when entering system sleep (see drivers/platform/x86/apple-gmux.c,
> > > gmux_resume()).
> > >
> > > I think it is desirable to use direct_complete because it saves power
> > > (no need to gratuitously wake the GPU upon entering system sleep,
> > > only to immediately cut its power) and it also speeds up the suspend
> > > process by about half a second.
> >
> > Thanks for the info.  It sounds like we need a similar patch for
> > amdgpu.  With dGPUs controlled by the ACPI ATPX method, I believe the
> > dGPU is powered by automatically on resume from S3/S4.  I think there
> > may be a way to change that behavior in some revisions of ATPX (i.e.,
> > to keep the state across suspend cycles), but it's not the default.
> > I'm not sure about the newer _PR3 stuff in Hybrid Graphics laptops.  I
> > think it retains state.  In both radeon and amdgpu we probably need to
> > check if the system is using ATPX or _PR3 and disable direct complete
> > for ATPX at least.
>
> I would disable direct-complete entirely for them then and possibly
> consider using DPM_FLAG_SMART_SUSPEND in the cases when that would be
> safe.
>
> Anyway, I posted this patch for radeon, because it addresses a
> specific regression and I'm not super-familiar with GPU drivers in
> general.

Thanks.  I've applied this patch and sent out a patch for amdgpu.

Alex
diff mbox series

Patch

Index: linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
===================================================================
--- linux-pm.orig/drivers/gpu/drm/radeon/radeon_kms.c
+++ linux-pm/drivers/gpu/drm/radeon/radeon_kms.c
@@ -172,6 +172,7 @@  int radeon_driver_load_kms(struct drm_de
 	}
 
 	if (radeon_is_px(dev)) {
+		dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
 		pm_runtime_use_autosuspend(dev->dev);
 		pm_runtime_set_autosuspend_delay(dev->dev, 5000);
 		pm_runtime_set_active(dev->dev);