Message ID | 1654249343-24959-1-git-send-email-quic_vpolimer@quicinc.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | [v2] drm/msm: add null checks for drm device to avoid crash during probe defer |
On 03/06/2022 12:42, Vinod Polimera wrote:
> During probe defer, drm device is not initialized and an external
> trigger to shutdown is trying to clean up drm device leading to crash.
> Add checks to avoid drm device cleanup in such cases.
>
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000b8
>
> Call trace:
>
> drm_atomic_helper_shutdown+0x44/0x144
> msm_pdev_shutdown+0x2c/0x38
> platform_shutdown+0x2c/0x38
> device_shutdown+0x158/0x210
> kernel_restart_prepare+0x40/0x4c
> kernel_restart+0x20/0x6c
> __arm64_sys_reboot+0x194/0x23c
> invoke_syscall+0x50/0x13c
> el0_svc_common+0xa0/0x17c
> do_el0_svc_compat+0x28/0x34
> el0_svc_compat+0x20/0x70
> el0t_32_sync_handler+0xa8/0xcc
> el0t_32_sync+0x1a8/0x1ac
>
> Changes in v2:
> - Add fixes tag.

I'm still waiting for an answer to the questions raised in v1 review.

>
> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components failed to bind")
> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>
> ---
>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 4448536..d62ac66 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
>  	struct msm_drm_private *priv = dev->dev_private;
>  	struct msm_kms *kms = priv->kms;
>
> +	if (!irq_has_action(kms->irq))
> +		return;
> +
>  	kms->funcs->irq_uninstall(kms);
>  	if (kms->irq_requested)
>  		free_irq(kms->irq, dev);
> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
>
>  	ddev->dev_private = NULL;
>  	drm_dev_put(ddev);
> +	priv->dev = NULL;
>
>  	destroy_workqueue(priv->wq);
>
> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>  	struct drm_device *drm = priv ? priv->dev : NULL;
>
> -	if (!priv || !priv->kms)
> +	if (!priv || !priv->kms || !drm)
>  		return;
>
>  	drm_atomic_helper_shutdown(drm);
On 03/06/2022 12:42, Vinod Polimera wrote:
> During probe defer, drm device is not initialized and an external
> trigger to shutdown is trying to clean up drm device leading to crash.
> Add checks to avoid drm device cleanup in such cases.
>
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000b8
>
> Call trace:
>
> drm_atomic_helper_shutdown+0x44/0x144
> msm_pdev_shutdown+0x2c/0x38
> platform_shutdown+0x2c/0x38
> device_shutdown+0x158/0x210
> kernel_restart_prepare+0x40/0x4c
> kernel_restart+0x20/0x6c
> __arm64_sys_reboot+0x194/0x23c
> invoke_syscall+0x50/0x13c
> el0_svc_common+0xa0/0x17c
> do_el0_svc_compat+0x28/0x34
> el0_svc_compat+0x20/0x70
> el0t_32_sync_handler+0xa8/0xcc
> el0t_32_sync+0x1a8/0x1ac
>
> Changes in v2:
> - Add fixes tag.
>
> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components failed to bind")
> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>

Also please remove bouncing quicinc.com emails from cc list

> ---
>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 4448536..d62ac66 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
>  	struct msm_drm_private *priv = dev->dev_private;
>  	struct msm_kms *kms = priv->kms;
>
> +	if (!irq_has_action(kms->irq))
> +		return;
> +
>  	kms->funcs->irq_uninstall(kms);
>  	if (kms->irq_requested)
>  		free_irq(kms->irq, dev);
> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
>
>  	ddev->dev_private = NULL;
>  	drm_dev_put(ddev);
> +	priv->dev = NULL;
>
>  	destroy_workqueue(priv->wq);
>
> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>  	struct drm_device *drm = priv ? priv->dev : NULL;
>
> -	if (!priv || !priv->kms)
> +	if (!priv || !priv->kms || !drm)
>  		return;
>
>  	drm_atomic_helper_shutdown(drm);
On 03/06/2022 12:42, Vinod Polimera wrote:
> During probe defer, drm device is not initialized and an external
> trigger to shutdown is trying to clean up drm device leading to crash.
> Add checks to avoid drm device cleanup in such cases.
>
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000b8
>
> Call trace:
>
> drm_atomic_helper_shutdown+0x44/0x144
> msm_pdev_shutdown+0x2c/0x38
> platform_shutdown+0x2c/0x38
> device_shutdown+0x158/0x210
> kernel_restart_prepare+0x40/0x4c
> kernel_restart+0x20/0x6c
> __arm64_sys_reboot+0x194/0x23c
> invoke_syscall+0x50/0x13c
> el0_svc_common+0xa0/0x17c
> do_el0_svc_compat+0x28/0x34
> el0_svc_compat+0x20/0x70
> el0t_32_sync_handler+0xa8/0xcc
> el0t_32_sync+0x1a8/0x1ac
>
> Changes in v2:
> - Add fixes tag.
>
> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components failed to bind")
> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

> ---
>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 4448536..d62ac66 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
>  	struct msm_drm_private *priv = dev->dev_private;
>  	struct msm_kms *kms = priv->kms;
>
> +	if (!irq_has_action(kms->irq))
> +		return;
> +
>  	kms->funcs->irq_uninstall(kms);
>  	if (kms->irq_requested)
>  		free_irq(kms->irq, dev);
> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
>
>  	ddev->dev_private = NULL;
>  	drm_dev_put(ddev);
> +	priv->dev = NULL;
>
>  	destroy_workqueue(priv->wq);
>
> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>  	struct drm_device *drm = priv ? priv->dev : NULL;
>
> -	if (!priv || !priv->kms)
> +	if (!priv || !priv->kms || !drm)
>  		return;
>
>  	drm_atomic_helper_shutdown(drm);
On 03/06/2022 12:42, Vinod Polimera wrote:
> During probe defer, drm device is not initialized and an external
> trigger to shutdown is trying to clean up drm device leading to crash.
> Add checks to avoid drm device cleanup in such cases.
>
> BUG: unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000b8
>
> Call trace:
>
> drm_atomic_helper_shutdown+0x44/0x144
> msm_pdev_shutdown+0x2c/0x38
> platform_shutdown+0x2c/0x38
> device_shutdown+0x158/0x210
> kernel_restart_prepare+0x40/0x4c
> kernel_restart+0x20/0x6c
> __arm64_sys_reboot+0x194/0x23c
> invoke_syscall+0x50/0x13c
> el0_svc_common+0xa0/0x17c
> do_el0_svc_compat+0x28/0x34
> el0_svc_compat+0x20/0x70
> el0t_32_sync_handler+0xa8/0xcc
> el0t_32_sync+0x1a8/0x1ac
>
> Changes in v2:
> - Add fixes tag.
>
> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components failed to bind")
> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>
> ---
>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 4448536..d62ac66 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
>  	struct msm_drm_private *priv = dev->dev_private;
>  	struct msm_kms *kms = priv->kms;
>
> +	if (!irq_has_action(kms->irq))
> +		return;

As a second thought I'd still prefer a variable here. irq_has_action
would check that there is _any_ IRQ handler for this IRQ. While we do
not have anybody sharing this IRQ, I'd prefer to be clear here, that we
do not want to uninstall our IRQ handler rather than any IRQ handler.

> +
>  	kms->funcs->irq_uninstall(kms);
>  	if (kms->irq_requested)
>  		free_irq(kms->irq, dev);
> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
>
>  	ddev->dev_private = NULL;
>  	drm_dev_put(ddev);
> +	priv->dev = NULL;
>
>  	destroy_workqueue(priv->wq);
>
> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>  	struct drm_device *drm = priv ? priv->dev : NULL;
>
> -	if (!priv || !priv->kms)
> +	if (!priv || !priv->kms || !drm)
>  		return;
>
>  	drm_atomic_helper_shutdown(drm);
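
Editorial sketch: the distinction drawn in the review comment above (any handler on the IRQ line versus this driver's own handler) can be modelled in user space. Everything here is an assumption made for illustration: `irq_line`, `kms_model`, `line_has_action()`, `line_request()`, `line_free()`, `kms_irq_install()` and `kms_irq_uninstall()` are hypothetical names, not kernel APIs, and the model assumes a line that another device could share. Only the flag name `irq_requested` is taken from the quoted context.

```c
/* User-space model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 4

/* One interrupt line that several devices may share (hypothetical). */
struct irq_line {
	const void *dev_ids[MAX_ACTIONS]; /* cookie of each installed handler */
	size_t nr_actions;
};

/* Hypothetical driver state: the flag records only *our* request. */
struct kms_model {
	struct irq_line *line;
	bool irq_requested;
};

/* Models the "is there any handler at all?" style of check. */
static bool line_has_action(const struct irq_line *line)
{
	return line->nr_actions > 0;
}

static bool line_request(struct irq_line *line, const void *dev_id)
{
	if (line->nr_actions == MAX_ACTIONS)
		return false;
	line->dev_ids[line->nr_actions++] = dev_id;
	return true;
}

static void line_free(struct irq_line *line, const void *dev_id)
{
	for (size_t i = 0; i < line->nr_actions; i++) {
		if (line->dev_ids[i] == dev_id) {
			/* Move the last entry into the freed slot. */
			line->dev_ids[i] = line->dev_ids[--line->nr_actions];
			return;
		}
	}
	assert(!"freeing a handler that was never installed");
}

static bool kms_irq_install(struct kms_model *kms)
{
	kms->irq_requested = line_request(kms->line, kms);
	return kms->irq_requested;
}

/* Returns true only when our own handler was actually removed. */
static bool kms_irq_uninstall(struct kms_model *kms)
{
	if (!kms->irq_requested)
		return false;
	line_free(kms->line, kms);
	kms->irq_requested = false;
	return true;
}
```

In this model, a second device on the same line makes `line_has_action()` return true even though the driver never installed its handler, while the per-driver flag still answers the narrower question. Whether that matters for an unshared IRQ is the judgment call being discussed in the thread.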
On 15/06/2022 15:23, Dmitry Baryshkov wrote:
> On 03/06/2022 12:42, Vinod Polimera wrote:
>> During probe defer, drm device is not initialized and an external
>> trigger to shutdown is trying to clean up drm device leading to crash.
>> Add checks to avoid drm device cleanup in such cases.
>>
>> BUG: unable to handle kernel NULL pointer dereference at virtual
>> address 00000000000000b8
>>
>> Call trace:
>>
>> drm_atomic_helper_shutdown+0x44/0x144
>> msm_pdev_shutdown+0x2c/0x38
>> platform_shutdown+0x2c/0x38
>> device_shutdown+0x158/0x210
>> kernel_restart_prepare+0x40/0x4c
>> kernel_restart+0x20/0x6c
>> __arm64_sys_reboot+0x194/0x23c
>> invoke_syscall+0x50/0x13c
>> el0_svc_common+0xa0/0x17c
>> do_el0_svc_compat+0x28/0x34
>> el0_svc_compat+0x20/0x70
>> el0t_32_sync_handler+0xa8/0xcc
>> el0t_32_sync+0x1a8/0x1ac
>>
>> Changes in v2:
>> - Add fixes tag.
>>
>> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components
>> failed to bind")
>> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>
>> ---
>>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/msm/msm_drv.c
>> b/drivers/gpu/drm/msm/msm_drv.c
>> index 4448536..d62ac66 100644
>> --- a/drivers/gpu/drm/msm/msm_drv.c
>> +++ b/drivers/gpu/drm/msm/msm_drv.c
>> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
>>  	struct msm_drm_private *priv = dev->dev_private;
>>  	struct msm_kms *kms = priv->kms;
>> +	if (!irq_has_action(kms->irq))
>> +		return;
>
> As a second thought I'd still prefer a variable here. irq_has_action
> would check that there is _any_ IRQ handler for this IRQ. While we do
> not have anybody sharing this IRQ, I'd prefer to be clear here, that we
> do not want to uninstall our IRQ handler rather than any IRQ handler.

Vinod, do we still want to pursue this fix? If so, could you please
update it according to the comment.

>
>> +
>>  	kms->funcs->irq_uninstall(kms);
>>  	if (kms->irq_requested)
>>  		free_irq(kms->irq, dev);
>> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
>>  	ddev->dev_private = NULL;
>>  	drm_dev_put(ddev);
>> +	priv->dev = NULL;
>>  	destroy_workqueue(priv->wq);
>> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
>>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>>  	struct drm_device *drm = priv ? priv->dev : NULL;
>> -	if (!priv || !priv->kms)
>> +	if (!priv || !priv->kms || !drm)
>>  		return;
>>  	drm_atomic_helper_shutdown(drm);
>
>
> -----Original Message-----
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Sent: Friday, August 26, 2022 2:11 PM
> To: Vinod Polimera (QUIC) <quic_vpolimer@quicinc.com>;
> dri-devel@lists.freedesktop.org; linux-arm-msm@vger.kernel.org;
> freedreno@lists.freedesktop.org; devicetree@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; robdclark@gmail.com;
> dianders@chromium.org; vpolimer@quicinc.com; swboyd@chromium.org;
> kalyant@quicinc.com
> Subject: Re: [v2] drm/msm: add null checks for drm device to avoid crash
> during probe defer
>
> On 15/06/2022 15:23, Dmitry Baryshkov wrote:
> > On 03/06/2022 12:42, Vinod Polimera wrote:
> >> During probe defer, drm device is not initialized and an external
> >> trigger to shutdown is trying to clean up drm device leading to crash.
> >> Add checks to avoid drm device cleanup in such cases.
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at virtual
> >> address 00000000000000b8
> >>
> >> Call trace:
> >>
> >> drm_atomic_helper_shutdown+0x44/0x144
> >> msm_pdev_shutdown+0x2c/0x38
> >> platform_shutdown+0x2c/0x38
> >> device_shutdown+0x158/0x210
> >> kernel_restart_prepare+0x40/0x4c
> >> kernel_restart+0x20/0x6c
> >> __arm64_sys_reboot+0x194/0x23c
> >> invoke_syscall+0x50/0x13c
> >> el0_svc_common+0xa0/0x17c
> >> do_el0_svc_compat+0x28/0x34
> >> el0_svc_compat+0x20/0x70
> >> el0t_32_sync_handler+0xa8/0xcc
> >> el0t_32_sync+0x1a8/0x1ac
> >>
> >> Changes in v2:
> >> - Add fixes tag.
> >>
> >> Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components
> >> failed to bind")
> >> Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>
> >> ---
> >>  drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/msm_drv.c
> >> b/drivers/gpu/drm/msm/msm_drv.c
> >> index 4448536..d62ac66 100644
> >> --- a/drivers/gpu/drm/msm/msm_drv.c
> >> +++ b/drivers/gpu/drm/msm/msm_drv.c
> >> @@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
> >>  	struct msm_drm_private *priv = dev->dev_private;
> >>  	struct msm_kms *kms = priv->kms;
> >> +	if (!irq_has_action(kms->irq))
> >> +		return;
> >
> > As a second thought I'd still prefer a variable here. irq_has_action
> > would check that there is _any_ IRQ handler for this IRQ. While we do
> > not have anybody sharing this IRQ, I'd prefer to be clear here, that we
> > do not want to uninstall our IRQ handler rather than any IRQ handler.
>
> Vinod, do we still want to pursue this fix? If so, could you please
> update it according to the comment.
>

I looked into this and found that many kernel drivers use irq_has_action()
to check whether the interrupt has been requested, so it appears to me to be
an agreeable way of doing it. Having a variable to track the state seems
unnecessary, as it would need to be managed race-free. Let me know your
views on it.

> >
> >> +
> >>  	kms->funcs->irq_uninstall(kms);
> >>  	if (kms->irq_requested)
> >>  		free_irq(kms->irq, dev);
> >> @@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
> >>  	ddev->dev_private = NULL;
> >>  	drm_dev_put(ddev);
> >> +	priv->dev = NULL;
> >>  	destroy_workqueue(priv->wq);
> >> @@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
> >>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
> >>  	struct drm_device *drm = priv ? priv->dev : NULL;
> >> -	if (!priv || !priv->kms)
> >> +	if (!priv || !priv->kms || !drm)
> >>  		return;
> >>  	drm_atomic_helper_shutdown(drm);
> >
> >
>
> --
> With best wishes
> Dmitry

- Vinod P.
Hello Vinod and Dmitry,

On 9/27/22 09:31, Vinod Polimera wrote:
>> -----Original Message-----
>> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> Sent: Friday, August 26, 2022 2:11 PM
>> To: Vinod Polimera (QUIC) <quic_vpolimer@quicinc.com>;
>> dri-devel@lists.freedesktop.org; linux-arm-msm@vger.kernel.org;
>> freedreno@lists.freedesktop.org; devicetree@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org; robdclark@gmail.com;
>> dianders@chromium.org; vpolimer@quicinc.com; swboyd@chromium.org;
>> kalyant@quicinc.com
>> Subject: Re: [v2] drm/msm: add null checks for drm device to avoid crash
>> during probe defer
>>

[...]

>> Vinod, do we still want to pursue this fix? If so, could you please
>> update it according to the comment.
>>

I don't think this patch is needed anymore, since AFAICT the issue has
been fixed by commit 0a58d2ae572a ("drm/msm: Make .remove and .shutdown
HW shutdown consistent") which is already in the drm/drm-next branch.
> -----Original Message-----
> From: Javier Martinez Canillas <javierm@redhat.com>
> Sent: Tuesday, September 27, 2022 2:33 PM
> To: Vinod Polimera <vpolimer@qti.qualcomm.com>;
> dmitry.baryshkov@linaro.org; Vinod Polimera (QUIC)
> <quic_vpolimer@quicinc.com>; dri-devel@lists.freedesktop.org;
> linux-arm-msm@vger.kernel.org; freedreno@lists.freedesktop.org;
> devicetree@vger.kernel.org
> Cc: dianders@chromium.org; vpolimer@quicinc.com; Abhinav Kumar
> <abhinavk@quicinc.com>; linux-kernel@vger.kernel.org;
> swboyd@chromium.org; kalyant@quicinc.com
> Subject: Re: [v2] drm/msm: add null checks for drm device to avoid crash
> during probe defer
>
> Hello Vinod and Dmitry,
>
> On 9/27/22 09:31, Vinod Polimera wrote:
> >> -----Original Message-----
> >> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> >> Sent: Friday, August 26, 2022 2:11 PM
> >> To: Vinod Polimera (QUIC) <quic_vpolimer@quicinc.com>;
> >> dri-devel@lists.freedesktop.org; linux-arm-msm@vger.kernel.org;
> >> freedreno@lists.freedesktop.org; devicetree@vger.kernel.org
> >> Cc: linux-kernel@vger.kernel.org; robdclark@gmail.com;
> >> dianders@chromium.org; vpolimer@quicinc.com; swboyd@chromium.org;
> >> kalyant@quicinc.com
> >> Subject: Re: [v2] drm/msm: add null checks for drm device to avoid crash
> >> during probe defer
> >>
>
> [...]
>
> >> Vinod, do we still want to pursue this fix? If so, could you please
> >> update it according to the comment.
> >>
>
> I don't think this patch is needed anymore, since AFAICT the issue has
> been fixed by commit 0a58d2ae572a ("drm/msm: Make .remove and .shutdown
> HW shutdown consistent") which is already in the drm/drm-next branch.

Yes, the issue will be fixed by commit 0a58d2ae572a ("drm/msm: Make .remove
and .shutdown HW shutdown consistent"). Hence we can drop this patch.

>
> --
> Best regards,
>
> Javier Martinez Canillas
> Core Platforms
> Red Hat

- Vinod P.
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 4448536..d62ac66 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -142,6 +142,9 @@ static void msm_irq_uninstall(struct drm_device *dev)
 	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_kms *kms = priv->kms;
 
+	if (!irq_has_action(kms->irq))
+		return;
+
 	kms->funcs->irq_uninstall(kms);
 	if (kms->irq_requested)
 		free_irq(kms->irq, dev);
@@ -259,6 +262,7 @@ static int msm_drm_uninit(struct device *dev)
 
 	ddev->dev_private = NULL;
 	drm_dev_put(ddev);
+	priv->dev = NULL;
 
 	destroy_workqueue(priv->wq);
 
@@ -1167,7 +1171,7 @@ void msm_drv_shutdown(struct platform_device *pdev)
 	struct msm_drm_private *priv = platform_get_drvdata(pdev);
 	struct drm_device *drm = priv ? priv->dev : NULL;
 
-	if (!priv || !priv->kms)
+	if (!priv || !priv->kms || !drm)
 		return;
 
 	drm_atomic_helper_shutdown(drm);
During probe defer, drm device is not initialized and an external
trigger to shutdown is trying to clean up drm device leading to crash.
Add checks to avoid drm device cleanup in such cases.

BUG: unable to handle kernel NULL pointer dereference at virtual
address 00000000000000b8

Call trace:

drm_atomic_helper_shutdown+0x44/0x144
msm_pdev_shutdown+0x2c/0x38
platform_shutdown+0x2c/0x38
device_shutdown+0x158/0x210
kernel_restart_prepare+0x40/0x4c
kernel_restart+0x20/0x6c
__arm64_sys_reboot+0x194/0x23c
invoke_syscall+0x50/0x13c
el0_svc_common+0xa0/0x17c
do_el0_svc_compat+0x28/0x34
el0_svc_compat+0x20/0x70
el0t_32_sync_handler+0xa8/0xcc
el0t_32_sync+0x1a8/0x1ac

Changes in v2:
- Add fixes tag.

Fixes: 623f279c778 ("drm/msm: fix shutdown hook in case GPU components failed to bind")
Signed-off-by: Vinod Polimera <quic_vpolimer@quicinc.com>
---
 drivers/gpu/drm/msm/msm_drv.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
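
Editorial sketch: the guard described in the commit message above can be modelled in user space as follows. This is not the driver code. `drm_model`, `priv_model`, `drm_model_shutdown()`, `pdev_shutdown_model()` and `uninit_model()` are hypothetical stand-ins, and the model assumes that the device pointer is NULL both before initialization completes and again after teardown, which is what the patch tries to arrange.

```c
/* User-space model of the shutdown guard; these are stand-ins, not kernel structs. */
#include <stdbool.h>
#include <stddef.h>

struct drm_model {
	int shutdown_calls;
};

struct priv_model {
	struct drm_model *dev; /* assumed NULL until init completes and after uninit */
	void *kms;             /* assumed NULL until the KMS side is set up */
};

/* Stand-in for a helper that dereferences the device unconditionally. */
static void drm_model_shutdown(struct drm_model *drm)
{
	drm->shutdown_calls++;
}

/* Returns true if the device was shut down, false if the hook bailed out. */
static bool pdev_shutdown_model(struct priv_model *priv)
{
	struct drm_model *drm = priv ? priv->dev : NULL;

	/* Same shape as the patched condition: skip when anything is missing. */
	if (!priv || !priv->kms || !drm)
		return false;

	drm_model_shutdown(drm);
	return true;
}

/* Models clearing the pointer on teardown so that a later shutdown is a no-op. */
static void uninit_model(struct priv_model *priv)
{
	priv->dev = NULL;
}
```

The point of the model is only that the early return keeps the unconditional dereference from ever seeing a NULL device, whether the shutdown request arrives before initialization or after teardown.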