diff mbox series

[v7,04/12] iommu/mediatek: Add device_link between the consumer and the larb devices

Message ID 20210730025238.22456-5-yong.wu@mediatek.com (mailing list archive)
State New, archived
Series Clean up "mediatek,larb"

Commit Message

Yong Wu July 30, 2021, 2:52 a.m. UTC
The MediaTek IOMMU-SMI topology is shown below. Every consumer connects to
an smi-larb, which in turn connects to smi-common.

        M4U
         |
    smi-common
         |
  -------------
  |         |    ...
  |         |
larb1     larb2
  |         |
vdec       venc

When a consumer is active, it must enable the smi-larb's power, which in
turn requires the smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer to the smi-larbs,
then add a device link between the smi-larb and smi-common.

This patch adds device_link between the consumer and the larbs.

When calling device_link_add, I add the flag DL_FLAG_STATELESS to avoid
calling pm_runtime_xx, which keeps the original status of the clocks. This
avoids two issues:
1) The display HW shows the fastlogo abnormally, as reported in [1]. At the
beginning, all the clocks are enabled before entering the kernel, but the
clocks for the display HW (always in larb0) will be gated after clk_enable
and clk_disable are called from device_link_add(->pm_runtime_resume) and
rpm_idle. This clock operation happens before the display driver probes,
so the display HW behaves abnormally at that time.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx and avoid the deadlock.

Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
device_link_remove must be called explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Tested-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> # on mt8173
---
 drivers/iommu/mtk_iommu.c    | 22 ++++++++++++++++++++++
 drivers/iommu/mtk_iommu_v1.c | 20 +++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

Comments

Dafna Hirschfeld Aug. 5, 2021, 1:22 p.m. UTC | #1
On 30.07.21 04:52, Yong Wu wrote:
> The MediaTek IOMMU-SMI topology is shown below. Every consumer connects to
> an smi-larb, which in turn connects to smi-common.
> 
>          M4U
>           |
>      smi-common
>           |
>    -------------
>    |         |    ...
>    |         |
> larb1     larb2
>    |         |
> vdec       venc
> 
> When a consumer is active, it must enable the smi-larb's power, which in
> turn requires the smi-common's power to be enabled first.
> 
> Thus, first use a device link to connect the consumer to the smi-larbs,
> then add a device link between the smi-larb and smi-common.
> 
> This patch adds device_link between the consumer and the larbs.
> 
> When calling device_link_add, I add the flag DL_FLAG_STATELESS to avoid
> calling pm_runtime_xx, which keeps the original status of the clocks. This
> avoids two issues:
> 1) The display HW shows the fastlogo abnormally, as reported in [1]. At the
> beginning, all the clocks are enabled before entering the kernel, but the
> clocks for the display HW (always in larb0) will be gated after clk_enable
> and clk_disable are called from device_link_add(->pm_runtime_resume) and
> rpm_idle. This clock operation happens before the display driver probes,
> so the display HW behaves abnormally at that time.
> 
> 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
> pm_runtime_xx and avoid the deadlock.
> 
> Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
> device_link_remove must be called explicitly.
> 
> [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
> [2] https://lore.kernel.org/patchwork/patch/1086569/
> 
> Suggested-by: Tomasz Figa <tfiga@chromium.org>
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> Tested-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> # on mt8173

Hi, unfortunately, I have to take back the Tested-by tag.
I am now testing the mtk-vcodec with the latest kernel + patches sent from the mailing list:
https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14
which includes this patchset.

On chromeos I open a video conference with Google Meet, which causes the mtk-vcodec vp8 encoder to run.
If I kill it with `killall -9 chrome`, I get some page fault messages from the iommu:

[  837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
[  837.400002] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read

In addition, running the encoder tests from the shell:

sudo --user=#1000 /usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests --gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple*  --codec=vp8 /usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv --disable_validator

At some point it fails with the error

[ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0!
[ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 failed
[ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5


Do you have any idea what the problem might be or how to debug it?

Thanks,
Dafna

> ---
>   drivers/iommu/mtk_iommu.c    | 22 ++++++++++++++++++++++
>   drivers/iommu/mtk_iommu_v1.c | 20 +++++++++++++++++++-
>   2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index a02dde094788..ee742900cf4b 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
>   {
>   	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   	struct mtk_iommu_data *data;
> +	struct device_link *link;
> +	struct device *larbdev;
> +	unsigned int larbid;
>   
>   	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
>   		return ERR_PTR(-ENODEV); /* Not a iommu client device */
>   
>   	data = dev_iommu_priv_get(dev);
>   
> +	/*
> +	 * Link the consumer device with the smi-larb device (supplier).
> +	 * The device in each larb is an independent HW, thus only link
> +	 * one larb here.
> +	 */
> +	larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
> +	larbdev = data->larb_imu[larbid].dev;
> +	link = device_link_add(dev, larbdev,
> +			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> +	if (!link)
> +		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
>   	return &data->iommu;
>   }
>   
>   static void mtk_iommu_release_device(struct device *dev)
>   {
>   	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +	struct mtk_iommu_data *data;
> +	struct device *larbdev;
> +	unsigned int larbid;
>   
>   	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
>   		return;
>   
> +	data = dev_iommu_priv_get(dev);
> +	larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
> +	larbdev = data->larb_imu[larbid].dev;
> +	device_link_remove(dev, larbdev);
> +
>   	iommu_fwspec_free(dev);
>   }
>   
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index c259433f1130..806d4200665b 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -424,7 +424,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
>   	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   	struct of_phandle_args iommu_spec;
>   	struct mtk_iommu_data *data;
> -	int err, idx = 0;
> +	int err, idx = 0, larbid;
> +	struct device_link *link;
> +	struct device *larbdev;
>   
>   	/*
>   	 * In the deferred case, free the existed fwspec if the dev already has,
> @@ -454,6 +456,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
>   
>   	data = dev_iommu_priv_get(dev);
>   
> +	/* Link the consumer device with the smi-larb device(supplier) */
> +	larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
> +	larbdev = data->larb_imu[larbid].dev;
> +	link = device_link_add(dev, larbdev,
> +			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> +	if (!link)
> +		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
> +
>   	return &data->iommu;
>   }
>   
> @@ -474,10 +484,18 @@ static void mtk_iommu_probe_finalize(struct device *dev)
>   static void mtk_iommu_release_device(struct device *dev)
>   {
>   	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +	struct mtk_iommu_data *data;
> +	struct device *larbdev;
> +	unsigned int larbid;
>   
>   	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
>   		return;
>   
> +	data = dev_iommu_priv_get(dev);
> +	larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
> +	larbdev = data->larb_imu[larbid].dev;
> +	device_link_remove(dev, larbdev);
> +
>   	iommu_fwspec_free(dev);
>   }
>   
>
Yong Wu Aug. 9, 2021, 8 a.m. UTC | #2
On Thu, 2021-08-05 at 15:22 +0200, Dafna Hirschfeld wrote:
> 
> On 30.07.21 04:52, Yong Wu wrote:
> > The MediaTek IOMMU-SMI topology is shown below. Every consumer
> > connects to an smi-larb, which in turn connects to smi-common.
> > 
> >          M4U
> >           |
> >      smi-common
> >           |
> >    -------------
> >    |         |    ...
> >    |         |
> > larb1     larb2
> >    |         |
> > vdec       venc
> > 
> > When a consumer is active, it must enable the smi-larb's power, which
> > in turn requires the smi-common's power to be enabled first.
> > 
> > Thus, first use a device link to connect the consumer to the
> > smi-larbs, then add a device link between the smi-larb and smi-common.
> > 
> > This patch adds device_link between the consumer and the larbs.
> > 
> > When calling device_link_add, I add the flag DL_FLAG_STATELESS to
> > avoid calling pm_runtime_xx, which keeps the original status of the
> > clocks. This avoids two issues:
> > 1) The display HW shows the fastlogo abnormally, as reported in [1].
> > At the beginning, all the clocks are enabled before entering the
> > kernel, but the clocks for the display HW (always in larb0) will be
> > gated after clk_enable and clk_disable are called from
> > device_link_add(->pm_runtime_resume) and rpm_idle. This clock
> > operation happens before the display driver probes, so the display HW
> > behaves abnormally at that time.
> > 
> > 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
> > pm_runtime_xx and avoid the deadlock.
> > 
> > Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
> > device_link_remove must be called explicitly.
> > 
> > [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
> > [2] https://lore.kernel.org/patchwork/patch/1086569/
> > 
> > Suggested-by: Tomasz Figa <tfiga@chromium.org>
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > Tested-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> # on mt8173
> 
> Hi, unfortunately, I have to take back the Tested-by tag.

Sorry for the inconvenience.

(And sorry for the late reply; there was something wrong with my local mail
server.)

> I am now testing the mtk-vcodec with the latest kernel + patches sent
> from the mailing list:
> https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14
> which includes this patchset.
> 
> On chromeos I open a video conference with Google Meet, which causes the
> mtk-vcodec vp8 encoder to run.
> If I kill it with `killall -9 chrome`, I get some page fault messages
> from the iommu:

Does "git bisect" point to this patch?

If you don't kill it, do the errors below still appear?

I don't know what "killall -9 chrome" does. Does it cause some buffer to be
freed?

> 
> [  837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read

This means a translation fault on "larb0 port0".

If I am not wrong, you are working on mt8173; from [0], this is DISP_OVL0.

Maybe "killall -9 chrome" frees the buffer (iova: 0xfcff0000) that
DISP_OVL is accessing, and then the iommu complains that it is not a valid
iova.

[0] https://elixir.bootlin.com/linux/v5.14-rc1/source/include/dt-bindings/memory/mt8173-larb-port.h#L19
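
To make the reading of the fault line above concrete, here is a minimal userspace sketch that pulls the larb and port numbers out of one log line and maps them to a port name. It assumes the log format quoted in this thread. The names `struct m4u_fault`, `parse_m4u_fault` and `mt8173_port_name` are hypothetical helpers, not kernel APIs, and the lookup only knows the single larb0/port0 mapping (DISP_OVL0) that is stated above.

```c
#include <stdio.h>
#include <string.h>

/* Fields taken from one "mtk-iommu ... fault" log line (hypothetical type). */
struct m4u_fault {
	unsigned int type;
	unsigned long iova;
	unsigned int larb;
	unsigned int port;
};

/*
 * Parse a fault line in the format quoted in this thread.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_m4u_fault(const char *line, struct m4u_fault *f)
{
	const char *p = strstr(line, "fault type=");
	unsigned long pa; /* parsed to keep the format in step, otherwise unused */

	if (!p)
		return -1;
	if (sscanf(p, "fault type=%x iova=%lx pa=%lx larb=%u port=%u",
		   &f->type, &f->iova, &pa, &f->larb, &f->port) != 5)
		return -1;
	return 0;
}

/*
 * Illustrative lookup: only the larb0/port0 entry mentioned in this
 * thread is filled in; everything else is reported as unknown.
 */
static const char *mt8173_port_name(unsigned int larb, unsigned int port)
{
	if (larb == 0 && port == 0)
		return "DISP_OVL0";
	return "unknown";
}
```

Feeding the first fault line from the report into `parse_m4u_fault()` and passing the resulting larb and port to `mt8173_port_name()` yields the display overlay port rather than a video encoder port, which is the point being made here.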

> [  837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> [  837.400002] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read
> 
> In addition, running the encoder tests from the shell:
> 
> sudo --user=#1000 /usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests --gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple*  --codec=vp8 /usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv --disable_validator
> 
> At some point it fails with the error
> 
> [ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0!
> [ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 failed
> [ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5

+our venc guy Irui.

It looks like the VENC HW doesn't start working. Is this caused by this
patchset? This patchset only changes the power flow.

I guess we should check whether the power/clock for venc is enabled
here?

> 
> 
> Do you have any idea what the problem might be or how to debug it?
> 
> Thanks,
> Dafna
> 
> > --

Patch

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@  static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct mtk_iommu_data *data;
+	struct device_link *link;
+	struct device *larbdev;
+	unsigned int larbid;
 
 	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
 		return ERR_PTR(-ENODEV); /* Not a iommu client device */
 
 	data = dev_iommu_priv_get(dev);
 
+	/*
+	 * Link the consumer device with the smi-larb device (supplier).
+	 * The device in each larb is an independent HW, thus only link
+	 * one larb here.
+	 */
+	larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+	larbdev = data->larb_imu[larbid].dev;
+	link = device_link_add(dev, larbdev,
+			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+	if (!link)
+		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
 	return &data->iommu;
 }
 
 static void mtk_iommu_release_device(struct device *dev)
 {
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct mtk_iommu_data *data;
+	struct device *larbdev;
+	unsigned int larbid;
 
 	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
 		return;
 
+	data = dev_iommu_priv_get(dev);
+	larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+	larbdev = data->larb_imu[larbid].dev;
+	device_link_remove(dev, larbdev);
+
 	iommu_fwspec_free(dev);
 }
 
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index c259433f1130..806d4200665b 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -424,7 +424,9 @@  static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct of_phandle_args iommu_spec;
 	struct mtk_iommu_data *data;
-	int err, idx = 0;
+	int err, idx = 0, larbid;
+	struct device_link *link;
+	struct device *larbdev;
 
 	/*
 	 * In the deferred case, free the existed fwspec if the dev already has,
@@ -454,6 +456,14 @@  static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 
 	data = dev_iommu_priv_get(dev);
 
+	/* Link the consumer device with the smi-larb device(supplier) */
+	larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+	larbdev = data->larb_imu[larbid].dev;
+	link = device_link_add(dev, larbdev,
+			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+	if (!link)
+		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
 	return &data->iommu;
 }
 
@@ -474,10 +484,18 @@  static void mtk_iommu_probe_finalize(struct device *dev)
 static void mtk_iommu_release_device(struct device *dev)
 {
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct mtk_iommu_data *data;
+	struct device *larbdev;
+	unsigned int larbid;
 
 	if (!fwspec || fwspec->ops != &mtk_iommu_ops)
 		return;
 
+	data = dev_iommu_priv_get(dev);
+	larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+	larbdev = data->larb_imu[larbid].dev;
+	device_link_remove(dev, larbdev);
+
 	iommu_fwspec_free(dev);
 }
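
Both drivers choose the supplier larb from fwspec->ids[0] alone. The following is a minimal userspace sketch of that decode step, not the kernel code itself. The `EX_M4U_*` macros are hypothetical stand-ins for the kernel's MTK_M4U_ID / MTK_M4U_TO_LARB / MTK_M4U_TO_PORT, and the 5-bit field widths are an assumption for illustration; check the dt-bindings header of the kernel in use for the real layout. `ex_first_larb` is likewise a made-up helper name.

```c
/*
 * Hypothetical stand-ins for the kernel macros; the 5-bit larb and port
 * field widths are an assumption, not copied from the kernel headers.
 */
#define EX_M4U_ID(larb, port)	(((larb) << 5) | (port))
#define EX_M4U_TO_LARB(id)	(((id) >> 5) & 0x1f)
#define EX_M4U_TO_PORT(id)	((id) & 0x1f)

/*
 * Mirror of the patch's lookup: only the first ID decides which larb
 * becomes the device-link supplier. Returns -1 when there are no IDs.
 */
static int ex_first_larb(const unsigned int *ids, unsigned int num_ids)
{
	if (!num_ids)
		return -1;
	return (int)EX_M4U_TO_LARB(ids[0]);
}
```

With two IDs built as `EX_M4U_ID(3, 7)` and `EX_M4U_ID(3, 8)`, `ex_first_larb()` returns 3 for both orderings, which matches the comment in the patch that one consumer's ports sit in a single larb, so linking one larb is enough.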