diff mbox series

[v6,6/7] media: mediatek: vcodec: prevent kernel crash when scp ipi timeout

Message ID 20220513092526.9670-7-yunfei.dong@mediatek.com (mailing list archive)
State New, archived
Headers show
Series support mt8195 decoder | expand

Commit Message

Yunfei Dong May 13, 2022, 9:25 a.m. UTC
When SCP timeout during playing video, kernel crashes with following
message. It's caused by accessing NULL pointer in vpu_dec_ipi_handler.
This patch doesn't solve the root cause of NULL pointer, but merely
prevent kernel crashed when encounter the NULL pointer.

After applied this patch, kernel keeps alive, only the video player turns
to green screen.

[67242.065474] pc : vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec]
[67242.065485] [MTK_V4L2] level=0 fops_vcodec_open(),334:
18000000.vcodec_dec decoder [135]
[67242.065523] lr : scp_ipi_handler+0x11c/0x244 [mtk_scp]
[67242.065540] sp : ffffffbb4207fb10
[67242.065557] x29: ffffffbb4207fb30 x28: ffffffd00a1d5000
[67242.065592] x27: 1ffffffa0143aa24 x26: 0000000000000000
[67242.065625] x25: dfffffd000000000 x24: ffffffd0168bfdb0
[67242.065659] x23: 1ffffff76840ff74 x22: ffffffbb41fa8a88
[67242.065692] x21: ffffffbb4207fb9c x20: ffffffbb4207fba0
[67242.065725] x19: ffffffbb4207fb98 x18: 0000000000000000
[67242.065758] x17: 0000000000000000 x16: ffffffd042022094
[67242.065791] x15: 1ffffff77ed4b71a x14: 1ffffff77ed4b719
[67242.065824] x13: 0000000000000000 x12: 0000000000000000
[67242.065857] x11: 0000000000000000 x10: dfffffd000000001
[67242.065890] x9 : 0000000000000000 x8 : 0000000000000002
[67242.065923] x7 : 0000000000000000 x6 : 000000000000003f
[67242.065956] x5 : 0000000000000040 x4 : ffffffffffffffe0
[67242.065989] x3 : ffffffd043b841b8 x2 : 0000000000000000
[67242.066021] x1 : 0000000000000010 x0 : 0000000000000010
[67242.066055] Call trace:
[67242.066092]  vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec
12220d230d83a7426fc38c56b3e7bc6066955bae]
[67242.066119]  scp_ipi_handler+0x11c/0x244 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066145]  mt8192_scp_irq_handler+0x70/0x128 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066172]  scp_irq_handler+0xa0/0x114 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066200]  irq_thread_fn+0x84/0xf8
[67242.066220]  irq_thread+0x170/0x1ec
[67242.066242]  kthread+0x2f8/0x3b8
[67242.066264]  ret_from_fork+0x10/0x30
[67242.066292] Code: 38f96908 35003628 91004340 d343fc08 (38f96908)

Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com>
---
 drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Hans Verkuil May 18, 2022, 9:37 a.m. UTC | #1
Hi Yunfei,

On 5/13/22 11:25, Yunfei Dong wrote:
> When SCP timeout during playing video, kernel crashes with following
> message. It's caused by accessing NULL pointer in vpu_dec_ipi_handler.
> This patch doesn't solve the root cause of NULL pointer, but merely
> prevent kernel crashed when encounter the NULL pointer.

Is the root cause being addressed as well? Where is the root cause? Is it
in this driver or in the scp (i.e. the remoteproc) driver?

I need a bit more information to decide whether this series is ready to
be merged for 5.20 or not.

Regards,

	Hans

> 
> After applied this patch, kernel keeps alive, only the video player turns
> to green screen.
> 
> [67242.065474] pc : vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec]
> [67242.065485] [MTK_V4L2] level=0 fops_vcodec_open(),334:
> 18000000.vcodec_dec decoder [135]
> [67242.065523] lr : scp_ipi_handler+0x11c/0x244 [mtk_scp]
> [67242.065540] sp : ffffffbb4207fb10
> [67242.065557] x29: ffffffbb4207fb30 x28: ffffffd00a1d5000
> [67242.065592] x27: 1ffffffa0143aa24 x26: 0000000000000000
> [67242.065625] x25: dfffffd000000000 x24: ffffffd0168bfdb0
> [67242.065659] x23: 1ffffff76840ff74 x22: ffffffbb41fa8a88
> [67242.065692] x21: ffffffbb4207fb9c x20: ffffffbb4207fba0
> [67242.065725] x19: ffffffbb4207fb98 x18: 0000000000000000
> [67242.065758] x17: 0000000000000000 x16: ffffffd042022094
> [67242.065791] x15: 1ffffff77ed4b71a x14: 1ffffff77ed4b719
> [67242.065824] x13: 0000000000000000 x12: 0000000000000000
> [67242.065857] x11: 0000000000000000 x10: dfffffd000000001
> [67242.065890] x9 : 0000000000000000 x8 : 0000000000000002
> [67242.065923] x7 : 0000000000000000 x6 : 000000000000003f
> [67242.065956] x5 : 0000000000000040 x4 : ffffffffffffffe0
> [67242.065989] x3 : ffffffd043b841b8 x2 : 0000000000000000
> [67242.066021] x1 : 0000000000000010 x0 : 0000000000000010
> [67242.066055] Call trace:
> [67242.066092]  vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec
> 12220d230d83a7426fc38c56b3e7bc6066955bae]
> [67242.066119]  scp_ipi_handler+0x11c/0x244 [mtk_scp
> 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> [67242.066145]  mt8192_scp_irq_handler+0x70/0x128 [mtk_scp
> 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> [67242.066172]  scp_irq_handler+0xa0/0x114 [mtk_scp
> 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> [67242.066200]  irq_thread_fn+0x84/0xf8
> [67242.066220]  irq_thread+0x170/0x1ec
> [67242.066242]  kthread+0x2f8/0x3b8
> [67242.066264]  ret_from_fork+0x10/0x30
> [67242.066292] Code: 38f96908 35003628 91004340 d343fc08 (38f96908)
> 
> Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com>
> ---
>  drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> index 35f4d5583084..1041dd663e76 100644
> --- a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> @@ -91,6 +91,11 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
>  	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
>  					(unsigned long)msg->ap_inst_addr;
>  
> +	if (!vpu) {
> +		mtk_v4l2_err("ap_inst_addr is NULL");
> +		return;
> +	}
> +
>  	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
>  
>  	vpu->failure = msg->status;
Yunfei Dong May 18, 2022, 11:29 a.m. UTC | #2
Dear Hans,

Thanks for your review.
On Wed, 2022-05-18 at 11:37 +0200, Hans Verkuil wrote:
> Hi Yunfei,
> 
> On 5/13/22 11:25, Yunfei Dong wrote:
> > When SCP timeout during playing video, kernel crashes with
> > following
> > message. It's caused by accessing NULL pointer in
> > vpu_dec_ipi_handler.
> > This patch doesn't solve the root cause of NULL pointer, but merely
> > prevent kernel crashed when encounter the NULL pointer.
> 
> Is the root cause being addressed as well? Where is the root cause?
> Is it
> in this driver or in the scp (i.e. the remoteproc) driver?
> 
> I need a bit more information to decide whether this series is ready
> to
> be merged for 5.20 or not.
> 
> Regards,
> 
> 	Hans
> 
Vpu will be NUll when scp(micro processor) is hang or crash. Need to
keep kernel works well , so add this patch.

Best Regards,
Yunfei Dong
> > 
> > After applied this patch, kernel keeps alive, only the video player
> > turns
> > to green screen.
> > 
> > [67242.065474] pc : vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec]
> > [67242.065485] [MTK_V4L2] level=0 fops_vcodec_open(),334:
> > 18000000.vcodec_dec decoder [135]
> > [67242.065523] lr : scp_ipi_handler+0x11c/0x244 [mtk_scp]
> > [67242.065540] sp : ffffffbb4207fb10
> > [67242.065557] x29: ffffffbb4207fb30 x28: ffffffd00a1d5000
> > [67242.065592] x27: 1ffffffa0143aa24 x26: 0000000000000000
> > [67242.065625] x25: dfffffd000000000 x24: ffffffd0168bfdb0
> > [67242.065659] x23: 1ffffff76840ff74 x22: ffffffbb41fa8a88
> > [67242.065692] x21: ffffffbb4207fb9c x20: ffffffbb4207fba0
> > [67242.065725] x19: ffffffbb4207fb98 x18: 0000000000000000
> > [67242.065758] x17: 0000000000000000 x16: ffffffd042022094
> > [67242.065791] x15: 1ffffff77ed4b71a x14: 1ffffff77ed4b719
> > [67242.065824] x13: 0000000000000000 x12: 0000000000000000
> > [67242.065857] x11: 0000000000000000 x10: dfffffd000000001
> > [67242.065890] x9 : 0000000000000000 x8 : 0000000000000002
> > [67242.065923] x7 : 0000000000000000 x6 : 000000000000003f
> > [67242.065956] x5 : 0000000000000040 x4 : ffffffffffffffe0
> > [67242.065989] x3 : ffffffd043b841b8 x2 : 0000000000000000
> > [67242.066021] x1 : 0000000000000010 x0 : 0000000000000010
> > [67242.066055] Call trace:
> > [67242.066092]  vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec
> > 12220d230d83a7426fc38c56b3e7bc6066955bae]
> > [67242.066119]  scp_ipi_handler+0x11c/0x244 [mtk_scp
> > 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> > [67242.066145]  mt8192_scp_irq_handler+0x70/0x128 [mtk_scp
> > 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> > [67242.066172]  scp_irq_handler+0xa0/0x114 [mtk_scp
> > 8fb69c2ef141dd3192518b952b65aba35627b8bf]
> > [67242.066200]  irq_thread_fn+0x84/0xf8
> > [67242.066220]  irq_thread+0x170/0x1ec
> > [67242.066242]  kthread+0x2f8/0x3b8
> > [67242.066264]  ret_from_fork+0x10/0x30
> > [67242.066292] Code: 38f96908 35003628 91004340 d343fc08 (38f96908)
> > 
> > Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com>
> > ---
> >  drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > index 35f4d5583084..1041dd663e76 100644
> > --- a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > +++ b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > @@ -91,6 +91,11 @@ static void vpu_dec_ipi_handler(void *data,
> > unsigned int len, void *priv)
> >  	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
> >  					(unsigned long)msg-
> > >ap_inst_addr;
> >  
> > +	if (!vpu) {
> > +		mtk_v4l2_err("ap_inst_addr is NULL");
> > +		return;
> > +	}
> > +
> >  	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
> >  
> >  	vpu->failure = msg->status;
Hans Verkuil May 18, 2022, 11:34 a.m. UTC | #3
On 5/18/22 13:29, yunfei.dong@mediatek.com wrote:
> Dear Hans,
> 
> Thanks for your review.
> On Wed, 2022-05-18 at 11:37 +0200, Hans Verkuil wrote:
>> Hi Yunfei,
>>
>> On 5/13/22 11:25, Yunfei Dong wrote:
>>> When SCP timeout during playing video, kernel crashes with
>>> following
>>> message. It's caused by accessing NULL pointer in
>>> vpu_dec_ipi_handler.
>>> This patch doesn't solve the root cause of NULL pointer, but merely
>>> prevent kernel crashed when encounter the NULL pointer.
>>
>> Is the root cause being addressed as well? Where is the root cause?
>> Is it
>> in this driver or in the scp (i.e. the remoteproc) driver?
>>
>> I need a bit more information to decide whether this series is ready
>> to
>> be merged for 5.20 or not.
>>
>> Regards,
>>
>> 	Hans
>>
> Vpu will be NUll when scp(micro processor) is hang or crash. Need to
> keep kernel works well , so add this patch.

OK, I think this should be stated in the commit log, and also in the code
(see below).

> 
> Best Regards,
> Yunfei Dong

<snip>

>>> diff --git a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
>>> b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
>>> index 35f4d5583084..1041dd663e76 100644
>>> --- a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
>>> +++ b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
>>> @@ -91,6 +91,11 @@ static void vpu_dec_ipi_handler(void *data,
>>> unsigned int len, void *priv)
>>>  	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
>>>  					(unsigned long)msg-
>>>> ap_inst_addr;
>>>  
>>> +	if (!vpu) {
>>> +		mtk_v4l2_err("ap_inst_addr is NULL");

E.g., either add a comment here or perhaps change the error message to:

"ap_inst_addr is NULL, did the SCP hang?"

Or something along those lines.

Shouldn't there be a \n at the end of this message as well? Or does
mtk_v4l2_err add that?

Regards,

	Hans

>>> +		return;
>>> +	}
>>> +
>>>  	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
>>>  
>>>  	vpu->failure = msg->status;
>
Yunfei Dong May 18, 2022, 12:06 p.m. UTC | #4
Dear Hans,

Thanks for your suggestion.

On Wed, 2022-05-18 at 13:34 +0200, Hans Verkuil wrote:
> 
> On 5/18/22 13:29, yunfei.dong@mediatek.com wrote:
> > Dear Hans,
> > 
> > Thanks for your review.
> > On Wed, 2022-05-18 at 11:37 +0200, Hans Verkuil wrote:
> > > Hi Yunfei,
> > > 
> > > On 5/13/22 11:25, Yunfei Dong wrote:
> > > > When SCP timeout during playing video, kernel crashes with
> > > > following
> > > > message. It's caused by accessing NULL pointer in
> > > > vpu_dec_ipi_handler.
> > > > This patch doesn't solve the root cause of NULL pointer, but
> > > > merely
> > > > prevent kernel crashed when encounter the NULL pointer.
> > > 
> > > Is the root cause being addressed as well? Where is the root
> > > cause?
> > > Is it
> > > in this driver or in the scp (i.e. the remoteproc) driver?
> > > 
> > > I need a bit more information to decide whether this series is
> > > ready
> > > to
> > > be merged for 5.20 or not.
> > > 
> > > Regards,
> > > 
> > > 	Hans
> > > 
> > 
> > Vpu will be NUll when scp(micro processor) is hang or crash. Need
> > to
> > keep kernel works well , so add this patch.
> 
> OK, I think this should be stated in the commit log, and also in the
> code
> (see below).
> 
> > 
> > Best Regards,
> > Yunfei Dong
> 
> <snip>
> 
> > > > diff --git
> > > > a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > > > b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > > > index 35f4d5583084..1041dd663e76 100644
> > > > --- a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > > > +++ b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
> > > > @@ -91,6 +91,11 @@ static void vpu_dec_ipi_handler(void *data,
> > > > unsigned int len, void *priv)
> > > >  	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
> > > >  					(unsigned long)msg-
> > > > > ap_inst_addr;
> > > > 
> > > >  
> > > > +	if (!vpu) {
> > > > +		mtk_v4l2_err("ap_inst_addr is NULL");
> 
> E.g., either add a comment here or perhaps change the error message
> to:
> 
> "ap_inst_addr is NULL, did the SCP hang?"
> 
> Or something along those lines.
> 
I will change the message in next patch like below.

mtk_v4l2_err("ap_inst_addr is NULL, did the SCP hang or crash?");

> Shouldn't there be a \n at the end of this message as well? Or does
> mtk_v4l2_err add that?
> 
mtk_v4l2_err add '\n' in the end.
> Regards,
> 
> 	Hans
> 
Best Regards,
Yunfei Dong
> > > > +		return;
> > > > +	}
> > > > +
> > > >  	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
> > > >  
> > > >  	vpu->failure = msg->status;
diff mbox series

Patch

diff --git a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
index 35f4d5583084..1041dd663e76 100644
--- a/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mediatek/vcodec/vdec_vpu_if.c
@@ -91,6 +91,11 @@  static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
 					(unsigned long)msg->ap_inst_addr;
 
+	if (!vpu) {
+		mtk_v4l2_err("ap_inst_addr is NULL");
+		return;
+	}
+
 	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
 
 	vpu->failure = msg->status;