Message ID | 20231206005727.46150-1-zhangfei.gao@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iommu/arm-smmu-v3: disable stall for quiet_cd | expand |
On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote: > From: Wenkai Lin <linwenkai6@hisilicon.com> > > In the stall model, invalid transactions were expected to be > stalled and aborted by the IOPF handler. > > However, when killing a test case with a huge amount of data, the > accelerator streamline can not stop until all data is consumed > even if the page fault handler reports errors. As a result, the > kill may take a long time, about 10 seconds with numerous iopf > interrupts. > > So disable stall for quiet_cd in the non-force stall model, since > force stall model (STALL_MODEL==0b10) requires CD.S must be 1. I think this force-stall thing should get a closer look, it doesn't look completely implemented and what does it mean for, eg, non-SVA domains attached to the device (as we now support with S2 and soon with PASID) The manual says: 0b10 Stall is forced (all faults eligible to stall cause stall), STE.S2S and CD.S must be 1. And there is a note: Note: For faulting transactions that are associated with client devices that have been configured to stall, but where the system has not explicitly advertised the client devices to be usable with the stall model, Arm recommends for software to expect that events might be recorded with Stall == 0. Which makes it seem like it isn't actually "force" per-say, but something else. I notice the driver never sets STE.S2S, and it isn't entirely clear what software should even do for a standard non-faulting domain where non-present means failure? Take the fault event and always respond with failure? What is the purpose? Aside from that the change looks OK to me: Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Jason
On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote: > From: Wenkai Lin <linwenkai6@hisilicon.com> > > In the stall model, invalid transactions were expected to be > stalled and aborted by the IOPF handler. > > However, when killing a test case with a huge amount of data, the > accelerator streamline can not stop until all data is consumed > even if the page fault handler reports errors. As a result, the > kill may take a long time, about 10 seconds with numerous iopf > interrupts. > > So disable stall for quiet_cd in the non-force stall model, since > force stall model (STALL_MODEL==0b10) requires CD.S must be 1. > > Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> > Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com> > Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > index 7445454c2af2..7086e5fa41ff 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > @@ -1063,6 +1063,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, > bool cd_live; > __le64 *cdptr; > struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; > + struct arm_smmu_device *smmu = master->smmu; > > if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) > return -E2BIG; > @@ -1077,6 +1078,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, > if (!cd) { /* (5) */ > val = 0; > } else if (cd == &quiet_cd) { /* (4) */ > + if (!(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) > + val &= ~(CTXDESC_CD_0_S | CTXDESC_CD_0_R); > val |= CTXDESC_CD_0_TCR_EPD0; > } else if (cd_live) { /* (3) */ > val &= ~CTXDESC_CD_0_ASID; > -- > 2.39.3 (Apple Git-145) >
On Wed, 6 Dec 2023 at 17:57, Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote: > > From: Wenkai Lin <linwenkai6@hisilicon.com> > > > > In the stall model, invalid transactions were expected to be > > stalled and aborted by the IOPF handler. > > > > However, when killing a test case with a huge amount of data, the > > accelerator streamline can not stop until all data is consumed > > even if the page fault handler reports errors. As a result, the > > kill may take a long time, about 10 seconds with numerous iopf > > interrupts. > > > > So disable stall for quiet_cd in the non-force stall model, since > > force stall model (STALL_MODEL==0b10) requires CD.S must be 1. > > > > Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> > > Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com> > > Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Thanks Jean.
On Wed, 6 Dec 2023 at 09:30, Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Wed, Dec 06, 2023 at 08:57:27AM +0800, Zhangfei Gao wrote: > > From: Wenkai Lin <linwenkai6@hisilicon.com> > > > > In the stall model, invalid transactions were expected to be > > stalled and aborted by the IOPF handler. > > > > However, when killing a test case with a huge amount of data, the > > accelerator streamline can not stop until all data is consumed > > even if the page fault handler reports errors. As a result, the > > kill may take a long time, about 10 seconds with numerous iopf > > interrupts. > > > > So disable stall for quiet_cd in the non-force stall model, since > > force stall model (STALL_MODEL==0b10) requires CD.S must be 1. > > I think this force-stall thing should get a closer look, it doesn't > look completely implemented and what does it mean for, eg, non-SVA > domains attached to the device (as we now support with S2 and soon > with PASID) > > The manual says: > > 0b10 Stall is forced (all faults eligible to stall cause stall), > STE.S2S and CD.S must be 1. > > And there is a note: > > Note: For faulting transactions that are associated with client > devices that have been configured to stall, but where > the system has not explicitly advertised the client devices to be > usable with the stall model, Arm recommends for > software to expect that events might be recorded with Stall == 0. > > Which makes it seem like it isn't actually "force" per-say, but > something else. > > I notice the driver never sets STE.S2S, and it isn't entirely clear > what software should even do for a standard non-faulting domain where > non-present means failure? Take the fault event and always respond > with failure? What is the purpose? Sorry, I have no idea :) > > Aside from that the change looks OK to me: > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Thanks Jason.
On Wed, 6 Dec 2023 08:57:27 +0800, Zhangfei Gao wrote: > From: Wenkai Lin <linwenkai6@hisilicon.com> > > In the stall model, invalid transactions were expected to be > stalled and aborted by the IOPF handler. > > However, when killing a test case with a huge amount of data, the > accelerator streamline can not stop until all data is consumed > even if the page fault handler reports errors. As a result, the > kill may take a long time, about 10 seconds with numerous iopf > interrupts. > > [...] Applied to will (for-joerg/arm-smmu/updates), thanks! [1/1] iommu/arm-smmu-v3: disable stall for quiet_cd https://git.kernel.org/will/c/b41932f54458 Cheers,
On Wed, 13 Dec 2023 at 01:21, Will Deacon <will@kernel.org> wrote: > > On Wed, 6 Dec 2023 08:57:27 +0800, Zhangfei Gao wrote: > > From: Wenkai Lin <linwenkai6@hisilicon.com> > > > > In the stall model, invalid transactions were expected to be > > stalled and aborted by the IOPF handler. > > > > However, when killing a test case with a huge amount of data, the > > accelerator streamline can not stop until all data is consumed > > even if the page fault handler reports errors. As a result, the > > kill may take a long time, about 10 seconds with numerous iopf > > interrupts. > > > > [...] > > Applied to will (for-joerg/arm-smmu/updates), thanks! > > [1/1] iommu/arm-smmu-v3: disable stall for quiet_cd > https://git.kernel.org/will/c/b41932f54458 Thanks Will
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7445454c2af2..7086e5fa41ff 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1063,6 +1063,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, bool cd_live; __le64 *cdptr; struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; + struct arm_smmu_device *smmu = master->smmu; if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) return -E2BIG; @@ -1077,6 +1078,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, if (!cd) { /* (5) */ val = 0; } else if (cd == &quiet_cd) { /* (4) */ + if (!(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) + val &= ~(CTXDESC_CD_0_S | CTXDESC_CD_0_R); val |= CTXDESC_CD_0_TCR_EPD0; } else if (cd_live) { /* (3) */ val &= ~CTXDESC_CD_0_ASID;