diff mbox series

iommu/arm-smmu-v3: disable stall for quiet_cd

Message ID 20231206005727.46150-1-zhangfei.gao@linaro.org (mailing list archive)
State New, archived
Headers show
Series iommu/arm-smmu-v3: disable stall for quiet_cd | expand

Commit Message

Zhangfei Gao Dec. 6, 2023, 12:57 a.m. UTC
From: Wenkai Lin <linwenkai6@hisilicon.com>

In the stall model, invalid transactions were expected to be
stalled and aborted by the IOPF handler.

However, when killing a test case with a huge amount of data, the
accelerator streamline can not stop until all data is consumed
even if the page fault handler reports errors. As a result, the
kill may take a long time, about 10 seconds with numerous iopf
interrupts.

So disable stall for quiet_cd in the non-force stall model, since
force stall model (STALL_MODEL==0b10) requires CD.S must be 1.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jason Gunthorpe Dec. 6, 2023, 1:30 a.m. UTC | #1
On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote:
> From: Wenkai Lin <linwenkai6@hisilicon.com>
> 
> In the stall model, invalid transactions were expected to be
> stalled and aborted by the IOPF handler.
> 
> However, when killing a test case with a huge amount of data, the
> accelerator streamline can not stop until all data is consumed
> even if the page fault handler reports errors. As a result, the
> kill may take a long time, about 10 seconds with numerous iopf
> interrupts.
> 
> So disable stall for quiet_cd in the non-force stall model, since
> force stall model (STALL_MODEL==0b10) requires CD.S must be 1.

I think this force-stall thing should get a closer look, it doesn't
look completely implemented and what does it mean for, eg, non-SVA
domains attached to the device (as we now support with S2 and soon
with PASID)

The manual says:

0b10 Stall is forced (all faults eligible to stall cause stall),
     STE.S2S and CD.S must be 1.

And there is a note:

 Note: For faulting transactions that are associated with client
 devices that have been configured to stall, but where
 the system has not explicitly advertised the client devices to be
 usable with the stall model, Arm recommends for
 software to expect that events might be recorded with Stall == 0.

Which makes it seem like it isn't actually "force" per-say, but
something else.

I notice the driver never sets STE.S2S, and it isn't entirely clear
what software should even do for a standard non-faulting domain where
non-present means failure? Take the fault event and always respond
with failure? What is the purpose?

Aside from that the change looks OK to me:

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
Jean-Philippe Brucker Dec. 6, 2023, 9:57 a.m. UTC | #2
On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote:
> From: Wenkai Lin <linwenkai6@hisilicon.com>
> 
> In the stall model, invalid transactions were expected to be
> stalled and aborted by the IOPF handler.
> 
> However, when killing a test case with a huge amount of data, the
> accelerator streamline can not stop until all data is consumed
> even if the page fault handler reports errors. As a result, the
> kill may take a long time, about 10 seconds with numerous iopf
> interrupts.
> 
> So disable stall for quiet_cd in the non-force stall model, since
> force stall model (STALL_MODEL==0b10) requires CD.S must be 1.
> 
> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
> Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 7445454c2af2..7086e5fa41ff 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1063,6 +1063,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
>  	bool cd_live;
>  	__le64 *cdptr;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> +	struct arm_smmu_device *smmu = master->smmu;
>  
>  	if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
>  		return -E2BIG;
> @@ -1077,6 +1078,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
>  	if (!cd) { /* (5) */
>  		val = 0;
>  	} else if (cd == &quiet_cd) { /* (4) */
> +		if (!(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
> +			val &= ~(CTXDESC_CD_0_S | CTXDESC_CD_0_R);
>  		val |= CTXDESC_CD_0_TCR_EPD0;
>  	} else if (cd_live) { /* (3) */
>  		val &= ~CTXDESC_CD_0_ASID;
> -- 
> 2.39.3 (Apple Git-145)
>
Zhangfei Gao Dec. 6, 2023, 1:46 p.m. UTC | #3
On Wed, 6 Dec 2023 at 17:57, Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote:
> > From: Wenkai Lin <linwenkai6@hisilicon.com>
> >
> > In the stall model, invalid transactions were expected to be
> > stalled and aborted by the IOPF handler.
> >
> > However, when killing a test case with a huge amount of data, the
> > accelerator streamline can not stop until all data is consumed
> > even if the page fault handler reports errors. As a result, the
> > kill may take a long time, about 10 seconds with numerous iopf
> > interrupts.
> >
> > So disable stall for quiet_cd in the non-force stall model, since
> > force stall model (STALL_MODEL==0b10) requires CD.S must be 1.
> >
> > Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> > Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
> > Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>
> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Thanks Jean.
Zhangfei Gao Dec. 6, 2023, 1:47 p.m. UTC | #4
On Wed, 6 Dec 2023 at 09:30, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote:
> > From: Wenkai Lin <linwenkai6@hisilicon.com>
> >
> > In the stall model, invalid transactions were expected to be
> > stalled and aborted by the IOPF handler.
> >
> > However, when killing a test case with a huge amount of data, the
> > accelerator streamline can not stop until all data is consumed
> > even if the page fault handler reports errors. As a result, the
> > kill may take a long time, about 10 seconds with numerous iopf
> > interrupts.
> >
> > So disable stall for quiet_cd in the non-force stall model, since
> > force stall model (STALL_MODEL==0b10) requires CD.S must be 1.
>
> I think this force-stall thing should get a closer look, it doesn't
> look completely implemented and what does it mean for, eg, non-SVA
> domains attached to the device (as we now support with S2 and soon
> with PASID)
>
> The manual says:
>
> 0b10 Stall is forced (all faults eligible to stall cause stall),
>      STE.S2S and CD.S must be 1.
>
> And there is a note:
>
>  Note: For faulting transactions that are associated with client
>  devices that have been configured to stall, but where
>  the system has not explicitly advertised the client devices to be
>  usable with the stall model, Arm recommends for
>  software to expect that events might be recorded with Stall == 0.
>
> Which makes it seem like it isn't actually "force" per-say, but
> something else.
>
> I notice the driver never sets STE.S2S, and it isn't entirely clear
> what software should even do for a standard non-faulting domain where
> non-present means failure? Take the fault event and always respond
> with failure? What is the purpose?

Sorry, I have no idea :)

>
> Aside from that the change looks OK to me:
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Thanks Jason.
Will Deacon Dec. 12, 2023, 5:21 p.m. UTC | #5
On Wed, 6 Dec 2023 08:57:27 +0800, Zhangfei Gao wrote:
> From: Wenkai Lin <linwenkai6@hisilicon.com>
> 
> In the stall model, invalid transactions were expected to be
> stalled and aborted by the IOPF handler.
> 
> However, when killing a test case with a huge amount of data, the
> accelerator streamline can not stop until all data is consumed
> even if the page fault handler reports errors. As a result, the
> kill may take a long time, about 10 seconds with numerous iopf
> interrupts.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: disable stall for quiet_cd
      https://git.kernel.org/will/c/b41932f54458

Cheers,
Zhangfei Gao Dec. 13, 2023, 1:36 a.m. UTC | #6
On Wed, 13 Dec 2023 at 01:21, Will Deacon <will@kernel.org> wrote:
>
> On Wed, 6 Dec 2023 08:57:27 +0800, Zhangfei Gao wrote:
> > From: Wenkai Lin <linwenkai6@hisilicon.com>
> >
> > In the stall model, invalid transactions were expected to be
> > stalled and aborted by the IOPF handler.
> >
> > However, when killing a test case with a huge amount of data, the
> > accelerator streamline can not stop until all data is consumed
> > even if the page fault handler reports errors. As a result, the
> > kill may take a long time, about 10 seconds with numerous iopf
> > interrupts.
> >
> > [...]
>
> Applied to will (for-joerg/arm-smmu/updates), thanks!
>
> [1/1] iommu/arm-smmu-v3: disable stall for quiet_cd
>       https://git.kernel.org/will/c/b41932f54458

Thanks Will
diff mbox series

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7445454c2af2..7086e5fa41ff 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1063,6 +1063,7 @@  int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
 	bool cd_live;
 	__le64 *cdptr;
 	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
+	struct arm_smmu_device *smmu = master->smmu;
 
 	if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
 		return -E2BIG;
@@ -1077,6 +1078,8 @@  int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
 	if (!cd) { /* (5) */
 		val = 0;
 	} else if (cd == &quiet_cd) { /* (4) */
+		if (!(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
+			val &= ~(CTXDESC_CD_0_S | CTXDESC_CD_0_R);
 		val |= CTXDESC_CD_0_TCR_EPD0;
 	} else if (cd_live) { /* (3) */
 		val &= ~CTXDESC_CD_0_ASID;