Message ID | 1476040988-11676-1-git-send-email-notasas@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> On 10/10/2016 06:23 AM, Grazvydas Ignotas wrote: > Currently the driver crashes if smu7_enable_dpm_tasks() returns early, > which it does if DPM is already active. It seems to be better just to > continue anyway, at least I haven't noticed any ill effects. It's also > unclear at what state the hardware was left by the previous driver, so > IMO it's better to always fully initialize. > > Way to reproduce: > $ modprobe amdgpu > $ rmmod amdgpu > $ modprobe amdgpu > ... > DPM is already running right now, no need to enable DPM! > ... > failed to send message 18b ret is 0 > BUG: unable to handle kernel paging request at ffffed01fc9ab21f > Call Trace: > smu7_set_power_state_tasks+0x499/0x1940 [amdgpu] > phm_set_power_state+0xcb/0x120 [amdgpu] > psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu] > pem_task_adjust_power_state+0xb9/0xd0 [amdgpu] > pem_excute_event_chain+0x7d/0xe0 [amdgpu] > pem_handle_event_unlocked+0x49/0x60 [amdgpu] > pem_handle_event+0xe/0x10 [amdgpu] > pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu] > amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu] > dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu] > dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu] > drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper] > amdgpu_fbdev_init+0x13e/0x170 [amdgpu] > amdgpu_device_init+0x1aeb/0x2490 [amdgpu] > > Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> > --- > drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > index f6afa6a..327030b 100644 > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr *hwmgr) > > tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1; > PP_ASSERT_WITH_CODE(tmp_result == 0, > - "DPM is already running right now, no need to enable DPM!", > - return 0); > + "DPM is already running", > + ); > > if (smu7_voltage_control(hwmgr)) { > tmp_result = smu7_enable_voltage_control(hwmgr); >
+Rex to review this. Alex On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com> wrote: > Currently the driver crashes if smu7_enable_dpm_tasks() returns early, > which it does if DPM is already active. It seems to be better just to > continue anyway, at least I haven't noticed any ill effects. It's also > unclear at what state the hardware was left by the previous driver, so > IMO it's better to always fully initialize. > > Way to reproduce: > $ modprobe amdgpu > $ rmmod amdgpu > $ modprobe amdgpu > ... > DPM is already running right now, no need to enable DPM! > ... > failed to send message 18b ret is 0 > BUG: unable to handle kernel paging request at ffffed01fc9ab21f > Call Trace: > smu7_set_power_state_tasks+0x499/0x1940 [amdgpu] > phm_set_power_state+0xcb/0x120 [amdgpu] > psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu] > pem_task_adjust_power_state+0xb9/0xd0 [amdgpu] > pem_excute_event_chain+0x7d/0xe0 [amdgpu] > pem_handle_event_unlocked+0x49/0x60 [amdgpu] > pem_handle_event+0xe/0x10 [amdgpu] > pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu] > amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu] > dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu] > dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu] > drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper] > amdgpu_fbdev_init+0x13e/0x170 [amdgpu] > amdgpu_device_init+0x1aeb/0x2490 [amdgpu] > > Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> > --- > drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > index f6afa6a..327030b 100644 > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr *hwmgr) > > tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1; > PP_ASSERT_WITH_CODE(tmp_result == 0, > - "DPM is already running right now, no need to enable DPM!", > - return 0); > + "DPM is already running", > + ); > > if (smu7_voltage_control(hwmgr)) { > tmp_result = smu7_enable_voltage_control(hwmgr); > -- > 2.7.4 > > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Hi Grazvydas and Alex, We needed to disable dpm when rmmod amdgpu for this issue. I am checking the function of disable dpm task. Best Regards Rex -----Original Message----- From: Alex Deucher [mailto:alexdeucher@gmail.com] Sent: Wednesday, October 12, 2016 4:01 AM To: Grazvydas Ignotas; Zhu, Rex Cc: Maling list - DRI developers; amd-gfx list Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running +Rex to review this. Alex On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com> wrote: > Currently the driver crashes if smu7_enable_dpm_tasks() returns early, > which it does if DPM is already active. It seems to be better just to > continue anyway, at least I haven't noticed any ill effects. It's also > unclear at what state the hardware was left by the previous driver, so > IMO it's better to always fully initialize. > > Way to reproduce: > $ modprobe amdgpu > $ rmmod amdgpu > $ modprobe amdgpu > ... > DPM is already running right now, no need to enable DPM! > ... > failed to send message 18b ret is 0 > BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call > Trace: > smu7_set_power_state_tasks+0x499/0x1940 [amdgpu] > phm_set_power_state+0xcb/0x120 [amdgpu] > psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu] > pem_task_adjust_power_state+0xb9/0xd0 [amdgpu] > pem_excute_event_chain+0x7d/0xe0 [amdgpu] > pem_handle_event_unlocked+0x49/0x60 [amdgpu] > pem_handle_event+0xe/0x10 [amdgpu] > pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu] > amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu] > dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu] > dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu] > drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper] > amdgpu_fbdev_init+0x13e/0x170 [amdgpu] > amdgpu_device_init+0x1aeb/0x2490 [amdgpu] > > Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> > --- > drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > index f6afa6a..327030b 100644 > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr > *hwmgr) > > tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1; > PP_ASSERT_WITH_CODE(tmp_result == 0, > - "DPM is already running right now, no need to enable DPM!", > - return 0); > + "DPM is already running", > + ); > > if (smu7_voltage_control(hwmgr)) { > tmp_result = smu7_enable_voltage_control(hwmgr); > -- > 2.7.4 > > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Hi all, The attached patches were also for this issue. Disable dpm when rmmod amdgpu. Please help to review. Best Regards Rex -----Original Message----- From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Zhu, Rex Sent: Wednesday, October 12, 2016 9:45 PM To: Alex Deucher; Grazvydas Ignotas Cc: amd-gfx list; Maling list - DRI developers Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is already running Hi Grazvydas and Alex, We needed to disable dpm when rmmod amdgpu for this issue. I am checking the function of disable dpm task. Best Regards Rex -----Original Message----- From: Alex Deucher [mailto:alexdeucher@gmail.com] Sent: Wednesday, October 12, 2016 4:01 AM To: Grazvydas Ignotas; Zhu, Rex Cc: Maling list - DRI developers; amd-gfx list Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running +Rex to review this. Alex On Sun, Oct 9, 2016 at 3:23 PM, Grazvydas Ignotas <notasas@gmail.com> wrote: > Currently the driver crashes if smu7_enable_dpm_tasks() returns early, > which it does if DPM is already active. It seems to be better just to > continue anyway, at least I haven't noticed any ill effects. It's also > unclear at what state the hardware was left by the previous driver, so > IMO it's better to always fully initialize. > > Way to reproduce: > $ modprobe amdgpu > $ rmmod amdgpu > $ modprobe amdgpu > ... > DPM is already running right now, no need to enable DPM! > ... > failed to send message 18b ret is 0 > BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call > Trace: > smu7_set_power_state_tasks+0x499/0x1940 [amdgpu] > phm_set_power_state+0xcb/0x120 [amdgpu] > psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu] > pem_task_adjust_power_state+0xb9/0xd0 [amdgpu] > pem_excute_event_chain+0x7d/0xe0 [amdgpu] > pem_handle_event_unlocked+0x49/0x60 [amdgpu] > pem_handle_event+0xe/0x10 [amdgpu] > pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu] > amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu] > dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu] > dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu] > drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper] > amdgpu_fbdev_init+0x13e/0x170 [amdgpu] > amdgpu_device_init+0x1aeb/0x2490 [amdgpu] > > Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> > --- > drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > index f6afa6a..327030b 100644 > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr > *hwmgr) > > tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1; > PP_ASSERT_WITH_CODE(tmp_result == 0, > - "DPM is already running right now, no need to enable DPM!", > - return 0); > + "DPM is already running", > + ); > > if (smu7_voltage_control(hwmgr)) { > tmp_result = smu7_enable_voltage_control(hwmgr); > -- > 2.7.4 > > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Hi, On Thu, Oct 13, 2016 at 10:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote: > > The attached patches were also for this issue. > Disable dpm when rmmod amdgpu. It works for modprobe-rmmod-modprobe test, thanks. However with GPU passthrough (giving control of the GPU to a Windows virtual machine using iommu, then shutting down the VM and loading amdgpu) the problem is still there, same backtrace as in my commit message. It seems the Windows driver leaves DPM enabled on shutdown. With my patch the crash goes away. It would be nice to have GPU passthrough working, this is the only issue that breaks it. Windows can drive the GPU after amdgpu fine already, only amdgpu has problems to run after Windows. Gražvydas
Hi Gražvydas, Sorry for so late response. I can reproduce this issue in gpu passthrough case. When we forced power off the VM or forced reset the VM without shutting down the Os, the rmmod will not be called. So dpm was still running. If we skipped the enable dpm tasks, some clock tables were not be initialized, so kernel panic when we visited those tables in set power state. I will send the fix patch for review tomorrow. Thanks. Best Regards Rex -----Original Message----- From: Grazvydas Ignotas [mailto:notasas@gmail.com] Sent: Sunday, October 16, 2016 2:55 AM To: Zhu, Rex Cc: Alex Deucher; amd-gfx list; Maling list - DRI developers Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running Hi, On Thu, Oct 13, 2016 at 10:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote: > > The attached patches were also for this issue. > Disable dpm when rmmod amdgpu. It works for modprobe-rmmod-modprobe test, thanks. However with GPU passthrough (giving control of the GPU to a Windows virtual machine using iommu, then shutting down the VM and loading amdgpu) the problem is still there, same backtrace as in my commit message. It seems the Windows driver leaves DPM enabled on shutdown. With my patch the crash goes away. It would be nice to have GPU passthrough working, this is the only issue that breaks it. Windows can drive the GPU after amdgpu fine already, only amdgpu has problems to run after Windows. Gražvydas
Hi Gražvydas, In GPU passthrough case, I can't find better solution than yours from driver. So will apply your patch and just update reproduce step. Thanks. Best Regards Rex -----Original Message----- From: Zhu, Rex Sent: Wednesday, October 19, 2016 10:00 PM To: 'Grazvydas Ignotas' Cc: Alex Deucher; amd-gfx list; Maling list - DRI developers Subject: RE: [PATCH] drm/amd/powerplay: don't give up if DPM is already running Hi Gražvydas, Sorry for so late response. I can reproduce this issue in gpu passthrough case. When we forced power off the VM or forced reset the VM without shutting down the Os, the rmmod will not be called. So dpm was still running. If we skipped the enable dpm tasks, some clock tables were not be initialized, so kernel panic when we visited those tables in set power state. I will send the fix patch for review tomorrow. Thanks. Best Regards Rex -----Original Message----- From: Grazvydas Ignotas [mailto:notasas@gmail.com] Sent: Sunday, October 16, 2016 2:55 AM To: Zhu, Rex Cc: Alex Deucher; amd-gfx list; Maling list - DRI developers Subject: Re: [PATCH] drm/amd/powerplay: don't give up if DPM is already running Hi, On Thu, Oct 13, 2016 at 10:45 AM, Zhu, Rex <Rex.Zhu@amd.com> wrote: > > The attached patches were also for this issue. > Disable dpm when rmmod amdgpu. It works for modprobe-rmmod-modprobe test, thanks. However with GPU passthrough (giving control of the GPU to a Windows virtual machine using iommu, then shutting down the VM and loading amdgpu) the problem is still there, same backtrace as in my commit message. It seems the Windows driver leaves DPM enabled on shutdown. With my patch the crash goes away. It would be nice to have GPU passthrough working, this is the only issue that breaks it. Windows can drive the GPU after amdgpu fine already, only amdgpu has problems to run after Windows. Gražvydas
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c index f6afa6a..327030b 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c @@ -1166,8 +1166,8 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr *hwmgr) tmp_result = (!smum_is_dpm_running(hwmgr)) ? 0 : -1; PP_ASSERT_WITH_CODE(tmp_result == 0, - "DPM is already running right now, no need to enable DPM!", - return 0); + "DPM is already running", + ); if (smu7_voltage_control(hwmgr)) { tmp_result = smu7_enable_voltage_control(hwmgr);
Currently the driver crashes if smu7_enable_dpm_tasks() returns early, which it does if DPM is already active. It seems to be better just to continue anyway, at least I haven't noticed any ill effects. It's also unclear at what state the hardware was left by the previous driver, so IMO it's better to always fully initialize. Way to reproduce: $ modprobe amdgpu $ rmmod amdgpu $ modprobe amdgpu ... DPM is already running right now, no need to enable DPM! ... failed to send message 18b ret is 0 BUG: unable to handle kernel paging request at ffffed01fc9ab21f Call Trace: smu7_set_power_state_tasks+0x499/0x1940 [amdgpu] phm_set_power_state+0xcb/0x120 [amdgpu] psm_adjust_power_state_dynamic+0x11e/0x1b0 [amdgpu] pem_task_adjust_power_state+0xb9/0xd0 [amdgpu] pem_excute_event_chain+0x7d/0xe0 [amdgpu] pem_handle_event_unlocked+0x49/0x60 [amdgpu] pem_handle_event+0xe/0x10 [amdgpu] pp_dpm_dispatch_tasks+0xe0/0x190 [amdgpu] amdgpu_pm_compute_clocks+0x10c/0xc60 [amdgpu] dce_v11_0_crtc_dpms+0x7d/0x150 [amdgpu] dce_v11_0_crtc_disable+0x90/0x4a0 [amdgpu] drm_helper_disable_unused_functions+0x67/0x80 [drm_kms_helper] amdgpu_fbdev_init+0x13e/0x170 [amdgpu] amdgpu_device_init+0x1aeb/0x2490 [amdgpu] Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> --- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)