diff mbox

drm/amd/powerplay: don't give up if DPM is already running

Message ID CADnq5_M=Lt1qoOG2=T+t+TjaJNJQpU=_vPqUZbusAq+6O7Hq9Q@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Alex Deucher Oct. 13, 2016, 3:29 p.m. UTC
On Thu, Oct 13, 2016 at 3:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote:
> Hi all,
>
> The attached patches were also for this issue.
> Disable dpm when rmmod amdgpu.
>
> Please help to review.

Patch 1:
+    r = adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->hw_fini((void *)adev);
+    if (r)
+        DRM_DEBUG("hw_fini of IP block <%s> failed %d\n",
+            adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->name, r);
+
+    adev->ip_block_status[AMD_IP_BLOCK_TYPE_SMC].hw = false;

You can't use AMD_IP_BLOCK_TYPE_* as index.  Some chips may not have
the IP block.  Maybe something like the attached patch.  That said, I
think longer term it makes more sense to allows the SOCs to specify
the order for init, fini, etc. otherwise we'll have lots of variable
logic in the common code.

Patch 2:
+    if (1 == PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,
CGS_IND_REG__SMC, FEATURE_STATUS, AVS_ON)) {
+        PP_ASSERT_WITH_CODE((0 == smum_send_msg_to_smc(hwmgr->smumgr,
PPSMC_MSG_DisableAvfs)),
+                    "Failed to disable AVFS!",
+                    return -1;);
+    }

Use a proper error code there like -EINVAL or something.  With that fixed:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Alex

>
> Best Regards
> Rex
>
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Zhu, Rex
> Sent: Wednesday, October 12, 2016 9:45 PM
> To: Alex Deucher; Grazvydas Ignotas
> Cc: amd-gfx list; Maling list - DRI developers
> Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is already running
>
> Hi Grazvydas and Alex,
>
> We needed to disable dpm when rmmod amdgpu for this issue.
>
> I am checking the function of disable dpm task.
>
> Best Regards
> Rex
>
> -----Original Message-----
> From: Alex Deucher [mailto:alexdeucher@gmail.com]
> Sent: Wednesday, October 12, 2016 4:01 AM
> To: Grazvydas Ignotas; Zhu, Rex
> Cc: Maling list - DRI developers; amd-gfx list
> Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running
>
> +Rex to review this.
>
> Alex
>
> On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com> wrote:
>> Currently the driver crashes if smu7_enable_dpm_tasks() returns early,
>> which it does if DPM is already active. It seems to be better just to
>> continue anyway, at least I haven't noticed any ill effects. It's also
>> unclear at what state the hardware was left by the previous driver, so
>> IMO it's better to always fully initialize.
>>
>> Way to reproduce:
>> $ modprobe amdgpu
>> $ rmmod amdgpu
>> $ modprobe amdgpu
>> ...
>> DPM is already running right now, no need to enable DPM!
>> ...
>> failed to send message 18b ret is 0
>> BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call
>> Trace:
>>  smu7_set_power_state_tasks+0x499/0x1940 [amdgpu]
>>  phm_set_power_state+0xcb/0x120 [amdgpu]
>>  psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu]
>>  pem_task_adjust_power_state+0xb9/0xd0 [amdgpu]
>>  pem_excute_event_chain+0x7d/0xe0 [amdgpu]
>>  pem_handle_event_unlocked+0x49/0x60 [amdgpu]
>>  pem_handle_event+0xe/0x10 [amdgpu]
>>  pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu]
>>  amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu]
>>  dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu]
>>  dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu]
>>  drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper]
>>  amdgpu_fbdev_init+0x13e/0x170 [amdgpu]
>>  amdgpu_device_init+0x1aeb/0x2490 [amdgpu]
>>
>> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
>> ---
>>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>> index f6afa6a..327030b 100644
>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
>> @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr
>> *hwmgr)
>>
>>         tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1;
>>         PP_ASSERT_WITH_CODE(tmp_result == 0,
>> -                       "DPM is already running right now, no need to enable DPM!",
>> -                       return 0);
>> +                       "DPM is already running",
>> +                       );
>>
>>         if (smu7_voltage_control(hwmgr)) {
>>                 tmp_result = smu7_enable_voltage_control(hwmgr);
>> --
>> 2.7.4
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Comments

Zhu, Rex Oct. 14, 2016, 12:12 p.m. UTC | #1
Hi Alex,

Your new attached patch is Tested-by and Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>

Just one question,

Do we need to set clock_gate for smu? 

Best Regards
Rex


-----Original Message-----
From: Alex Deucher [mailto:alexdeucher@gmail.com] 

Sent: Thursday, October 13, 2016 11:29 PM
To: Zhu, Rex
Cc: Grazvydas Ignotas; amd-gfx list; Maling list - DRI developers
Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running

 On Thu, Oct 13, 2016 at 3:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote:
> Hi all,

>

> The attached patches were also for this issue.

> Disable dpm when rmmod amdgpu.

>

> Please help to review.


Patch 1:
+    r = adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->hw_fini((void *)adev);
+    if (r)
+        DRM_DEBUG("hw_fini of IP block <%s> failed %d\n",
+            adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->name, r);
+
+    adev->ip_block_status[AMD_IP_BLOCK_TYPE_SMC].hw = false;

You can't use AMD_IP_BLOCK_TYPE_* as index.  Some chips may not have the IP block.  Maybe something like the attached patch.  That said, I think longer term it makes more sense to allows the SOCs to specify the order for init, fini, etc. otherwise we'll have lots of variable logic in the common code.

Patch 2:
+    if (1 == PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,
CGS_IND_REG__SMC, FEATURE_STATUS, AVS_ON)) {
+        PP_ASSERT_WITH_CODE((0 == smum_send_msg_to_smc(hwmgr->smumgr,
PPSMC_MSG_DisableAvfs)),
+                    "Failed to disable AVFS!",
+                    return -1;);
+    }

Use a proper error code there like -EINVAL or something.  With that fixed:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


Alex

>

> Best Regards

> Rex

>

> -----Original Message-----

> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf 

> Of Zhu, Rex

> Sent: Wednesday, October 12, 2016 9:45 PM

> To: Alex Deucher; Grazvydas Ignotas

> Cc: amd-gfx list; Maling list - DRI developers

> Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is 

> already running

>

> Hi Grazvydas and Alex,

>

> We needed to disable dpm when rmmod amdgpu for this issue.

>

> I am checking the function of disable dpm task.

>

> Best Regards

> Rex

>

> -----Original Message-----

> From: Alex Deucher [mailto:alexdeucher@gmail.com]

> Sent: Wednesday, October 12, 2016 4:01 AM

> To: Grazvydas Ignotas; Zhu, Rex

> Cc: Maling list - DRI developers; amd-gfx list

> Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is 

> already running

>

> +Rex to review this.

>

> Alex

>

> On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com> wrote:

>> Currently the driver crashes if smu7_enable_dpm_tasks() returns 

>> early, which it does if DPM is already active. It seems to be better 

>> just to continue anyway, at least I haven't noticed any ill effects. 

>> It's also unclear at what state the hardware was left by the previous 

>> driver, so IMO it's better to always fully initialize.

>>

>> Way to reproduce:

>> $ modprobe amdgpu

>> $ rmmod amdgpu

>> $ modprobe amdgpu

>> ...

>> DPM is already running right now, no need to enable DPM!

>> ...

>> failed to send message 18b ret is 0

>> BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call

>> Trace:

>>  smu7_set_power_state_tasks+0x499/0x1940 [amdgpu]

>>  phm_set_power_state+0xcb/0x120 [amdgpu]

>>  psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu]

>>  pem_task_adjust_power_state+0xb9/0xd0 [amdgpu]

>>  pem_excute_event_chain+0x7d/0xe0 [amdgpu]

>>  pem_handle_event_unlocked+0x49/0x60 [amdgpu]

>>  pem_handle_event+0xe/0x10 [amdgpu]

>>  pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu]

>>  amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu]

>>  dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu]

>>  dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu]

>>  drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper]

>>  amdgpu_fbdev_init+0x13e/0x170 [amdgpu]

>>  amdgpu_device_init+0x1aeb/0x2490 [amdgpu]

>>

>> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>

>> ---

>>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--

>>  1 file changed, 2 insertions(+), 2 deletions(-)

>>

>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

>> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

>> index f6afa6a..327030b 100644

>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

>> @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct 

>> pp_hwmgr

>> *hwmgr)

>>

>>         tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1;

>>         PP_ASSERT_WITH_CODE(tmp_result == 0,

>> -                       "DPM is already running right now, no need to enable DPM!",

>> -                       return 0);

>> +                       "DPM is already running",

>> +                       );

>>

>>         if (smu7_voltage_control(hwmgr)) {

>>                 tmp_result = smu7_enable_voltage_control(hwmgr);

>> --

>> 2.7.4

>>

>> _______________________________________________

>> amd-gfx mailing list

>> amd-gfx@lists.freedesktop.org

>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

> _______________________________________________

> amd-gfx mailing list

> amd-gfx@lists.freedesktop.org

> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Alex Deucher Oct. 14, 2016, 1:37 p.m. UTC | #2
> -----Original Message-----

> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf

> Of Zhu, Rex

> Sent: Friday, October 14, 2016 8:12 AM

> To: Alex Deucher

> Cc: Maling list - DRI developers; amd-gfx list

> Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is already

> running

> 

>  Hi Alex,

> 

> Your new attached patch is Tested-by and Reviewed-by: Rex Zhu

> <Rex.Zhu@amd.com>

> 

> Just one question,

> 

> Do we need to set clock_gate for smu?


At the moment, no, but if we ever have have clockgating for smu related IPs, we might as well include it.  It matches the current behavior with respect to IP tear down.

Alex

> 

> Best Regards

> Rex

> 

> 

> -----Original Message-----

> From: Alex Deucher [mailto:alexdeucher@gmail.com]

> Sent: Thursday, October 13, 2016 11:29 PM

> To: Zhu, Rex

> Cc: Grazvydas Ignotas; amd-gfx list; Maling list - DRI developers

> Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already

> running

> 

>  On Thu, Oct 13, 2016 at 3:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote:

> > Hi all,

> >

> > The attached patches were also for this issue.

> > Disable dpm when rmmod amdgpu.

> >

> > Please help to review.

> 

> Patch 1:

> +    r = adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->hw_fini((void

> *)adev);

> +    if (r)

> +        DRM_DEBUG("hw_fini of IP block <%s> failed %d\n",

> +            adev->ip_blocks[AMD_IP_BLOCK_TYPE_SMC].funcs->name, r);

> +

> +    adev->ip_block_status[AMD_IP_BLOCK_TYPE_SMC].hw = false;

> 

> You can't use AMD_IP_BLOCK_TYPE_* as index.  Some chips may not have

> the IP block.  Maybe something like the attached patch.  That said, I think

> longer term it makes more sense to allows the SOCs to specify the order for

> init, fini, etc. otherwise we'll have lots of variable logic in the common code.

> 

> Patch 2:

> +    if (1 == PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,

> CGS_IND_REG__SMC, FEATURE_STATUS, AVS_ON)) {

> +        PP_ASSERT_WITH_CODE((0 == smum_send_msg_to_smc(hwmgr-

> >smumgr,

> PPSMC_MSG_DisableAvfs)),

> +                    "Failed to disable AVFS!",

> +                    return -1;);

> +    }

> 

> Use a proper error code there like -EINVAL or something.  With that fixed:

> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> 

> Alex

> 

> >

> > Best Regards

> > Rex

> >

> > -----Original Message-----

> > From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf

> > Of Zhu, Rex

> > Sent: Wednesday, October 12, 2016 9:45 PM

> > To: Alex Deucher; Grazvydas Ignotas

> > Cc: amd-gfx list; Maling list - DRI developers

> > Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is

> > already running

> >

> > Hi Grazvydas and Alex,

> >

> > We needed to disable dpm when rmmod amdgpu for this issue.

> >

> > I am checking the function of disable dpm task.

> >

> > Best Regards

> > Rex

> >

> > -----Original Message-----

> > From: Alex Deucher [mailto:alexdeucher@gmail.com]

> > Sent: Wednesday, October 12, 2016 4:01 AM

> > To: Grazvydas Ignotas; Zhu, Rex

> > Cc: Maling list - DRI developers; amd-gfx list

> > Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is

> > already running

> >

> > +Rex to review this.

> >

> > Alex

> >

> > On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com>

> wrote:

> >> Currently the driver crashes if smu7_enable_dpm_tasks() returns

> >> early, which it does if DPM is already active. It seems to be better

> >> just to continue anyway, at least I haven't noticed any ill effects.

> >> It's also unclear at what state the hardware was left by the previous

> >> driver, so IMO it's better to always fully initialize.

> >>

> >> Way to reproduce:

> >> $ modprobe amdgpu

> >> $ rmmod amdgpu

> >> $ modprobe amdgpu

> >> ...

> >> DPM is already running right now, no need to enable DPM!

> >> ...

> >> failed to send message 18b ret is 0

> >> BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call

> >> Trace:

> >>  smu7_set_power_state_tasks+0x499/0x1940 [amdgpu]

> >>  phm_set_power_state+0xcb/0x120 [amdgpu]

> >>  psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu]

> >>  pem_task_adjust_power_state+0xb9/0xd0 [amdgpu]

> >>  pem_excute_event_chain+0x7d/0xe0 [amdgpu]

> >>  pem_handle_event_unlocked+0x49/0x60 [amdgpu]

> >>  pem_handle_event+0xe/0x10 [amdgpu]

> >>  pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu]

> >>  amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu]

> >>  dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu]

> >>  dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu]

> >>  drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper]

> >>  amdgpu_fbdev_init+0x13e/0x170 [amdgpu]

> >>  amdgpu_device_init+0x1aeb/0x2490 [amdgpu]

> >>

> >> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>

> >> ---

> >>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++--

> >>  1 file changed, 2 insertions(+), 2 deletions(-)

> >>

> >> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

> >> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

> >> index f6afa6a..327030b 100644

> >> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

> >> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c

> >> @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct

> >> pp_hwmgr

> >> *hwmgr)

> >>

> >>         tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1;

> >>         PP_ASSERT_WITH_CODE(tmp_result == 0,

> >> -                       "DPM is already running right now, no need to enable DPM!",

> >> -                       return 0);

> >> +                       "DPM is already running",

> >> +                       );

> >>

> >>         if (smu7_voltage_control(hwmgr)) {

> >>                 tmp_result = smu7_enable_voltage_control(hwmgr);

> >> --

> >> 2.7.4

> >>

> >> _______________________________________________

> >> amd-gfx mailing list

> >> amd-gfx@lists.freedesktop.org

> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

> > _______________________________________________

> > amd-gfx mailing list

> > amd-gfx@lists.freedesktop.org

> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

> _______________________________________________

> amd-gfx mailing list

> amd-gfx@lists.freedesktop.org

> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
diff mbox

Patch

From 0e338a127bbe5eb9de37a41330a62b0b0f5e1983 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 13 Oct 2016 11:22:17 -0400
Subject: [PATCH] drm/amdgpu: disable smu hw first on tear down

Otherwise, you can't disable dpm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b88a3b7..5a99a43 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1463,6 +1463,30 @@  static int amdgpu_fini(struct amdgpu_device *adev)
 {
 	int i, r;
 
+	/* need to disable SMC first */
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_block_status[i].hw)
+			continue;
+		if (adev->ip_blocks[i].type == AMD_IP_BLOCK_TYPE_SMC) {
+			/* ungate blocks before hw fini so that we can shutdown the blocks safely */
+			r = adev->ip_blocks[i].funcs->set_clockgating_state((void *)adev,
+									    AMD_CG_STATE_UNGATE);
+			if (r) {
+				DRM_ERROR("set_clockgating_state(ungate) of IP block <%s> failed %d\n",
+					  adev->ip_blocks[i].funcs->name, r);
+				return r;
+			}
+			r = adev->ip_blocks[i].funcs->hw_fini((void *)adev);
+			/* XXX handle errors */
+			if (r) {
+				DRM_DEBUG("hw_fini of IP block <%s> failed %d\n",
+					  adev->ip_blocks[i].funcs->name, r);
+			}
+			adev->ip_block_status[i].hw = false;
+			break;
+		}
+	}
+
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_block_status[i].hw)
 			continue;
-- 
2.5.5