Message ID | 20230809011204.v5.8.Idedc0f496231e2faab3df057219c5e2d937bbfe4@changeid (mailing list archive) |
---|---|
State | New, archived |
Series | Refactor the SMMU's CD table ownership |
On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> This commit explicitly keeps track of whether a CD table is installed in
> an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> was previously achieved through the domain->devices list, but we are
> moving to a model where arm_smmu_sync_cd directly operates on a master
> and the master's CD table instead of a domain.

Why is this path worth optimising?

> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f5ad386cc8760..488d12dd2d4aa 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -985,6 +985,9 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
>  		},
>  	};
>
> +	if (!master->cd_table.installed)
> +		return;

Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
think happens after zapping the STE?

>  	cmds.num = 0;
>  	for (i = 0; i < master->num_streams; i++) {
>  		cmd.cfgi.sid = master->streams[i].id;
> @@ -1091,7 +1094,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
>  	cdptr[3] = cpu_to_le64(cd->mair);
>
>  	/*
> -	 * STE is live, and the SMMU might read dwords of this CD in any
> +	 * STE may be live, and the SMMU might read dwords of this CD in any
>  	 * order. Ensure that it observes valid values before reading
>  	 * V=1.
>  	 */

Why does this patch need to update this comment?

Will
On Wed, Aug 9, 2023 at 9:50 PM Will Deacon <will@kernel.org> wrote:
>
> On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> > This commit explicitly keeps track of whether a CD table is installed in
> > an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> > was previously achieved through the domain->devices list, but we are
> > moving to a model where arm_smmu_sync_cd directly operates on a master
> > and the master's CD table instead of a domain.
>
> Why is this path worth optimising?

I have no idea what the practical impact of this optimization is, but
the motivation here was to make the overall series as close to a nop
as possible. This optimization existed before but is "broken" by the
previous patch. This patch restores it.

> Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
> think happens after zapping the STE?

The arm_smmu_write_ctx_desc call added in arm_smmu_detach_dev() was
inserted after zapping the STE precisely so that we could skip the
sync. Is there a concern that a stale CD could be used when the
CDtable is re-inserted into the STE?

> >  	/*
> > -	 * STE is live, and the SMMU might read dwords of this CD in any
> > +	 * STE may be live, and the SMMU might read dwords of this CD in any
> >  	 * order. Ensure that it observes valid values before reading
> >  	 * V=1.
> >  	 */
>
> Why does this patch need to update this comment?

This is a drive-by to make this comment more accurate. Note how
(before this patch series) arm_smmu_domain_finalise_s1 explicitly
mentions that it calls arm_smmu_write_ctx_desc while the STE isn't
installed yet. Yet this comment asserts the STE *is* live.
On Thu, Aug 10, 2023 at 04:34:39PM +0800, Michael Shavit wrote:
> On Wed, Aug 9, 2023 at 9:50 PM Will Deacon <will@kernel.org> wrote:
> >
> > On Wed, Aug 09, 2023 at 01:12:04AM +0800, Michael Shavit wrote:
> > > This commit explicitly keeps track of whether a CD table is installed in
> > > an STE so that arm_smmu_sync_cd can skip the sync when unnecessary. This
> > > was previously achieved through the domain->devices list, but we are
> > > moving to a model where arm_smmu_sync_cd directly operates on a master
> > > and the master's CD table instead of a domain.
> >
> > Why is this path worth optimising?
>
> I have no idea what the practical impact of this optimization is, but
> the motivation here was to make the overall series as close to a nop
> as possible. This optimization existed before but is "broken" by the
> previous patch. This patch restores it.

I'm not sure it's necessary, tbh. It's not like we're calling
arm_smmu_sync_cd() all over the place -- it's used when we're actually
working with the CD.

> > Doesn't this interact badly with the sync in arm_smmu_detach_dev(), which I
> > think happens after zapping the STE?
>
> The arm_smmu_write_ctx_desc call added in arm_smmu_detach_dev() was
> inserted after zapping the STE precisely so that we could skip the
> sync. Is there a concern that a stale CD could be used when the
> CDtable is re-inserted into the STE?

Ah, sorry, I went and looked at the architecture and it says for
CMD_CFGI_STE:

  | This command invalidates all Context descriptors (including L1CD)
  | that were cached using the given StreamID.

so as long as we make the CD unreachable in the STE before the STE
invalidation (which I think we do by setting the Config field to bypass
or abort), then I agree that we don't need the subsequent CD
invalidation.

> > >  	/*
> > > -	 * STE is live, and the SMMU might read dwords of this CD in any
> > > +	 * STE may be live, and the SMMU might read dwords of this CD in any
> > >  	 * order. Ensure that it observes valid values before reading
> > >  	 * V=1.
> > >  	 */
> >
> > Why does this patch need to update this comment?
>
> This is a drive-by to make this comment more accurate. Note how
> (before this patch series) arm_smmu_domain_finalise_s1 explicitly
> mentions that it calls arm_smmu_write_ctx_desc while the STE isn't
> installed yet. Yet this comment asserts the STE *is* live.

Can you do it as its own patch then, please?

Will
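The ordering argument above (make the CD table unreachable in the STE, then issue CMD_CFGI_STE, and no separate CD invalidation is needed) can be sketched with a toy model. This is only an illustration under stated assumptions: `model_ste`, `model_cfg_cache`, and every `model_*` function are hypothetical names invented here, and the cache behaviour is reduced to the single property quoted from the architecture, namely that CMD_CFGI_STE drops the CDs cached via the given StreamID. It is not the driver's code and does not describe real hardware in any further detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_SSIDS 4

/* In-memory STE, reduced to the one property that matters here. */
struct model_ste {
	bool s1_enabled;	/* Config selects stage 1, so the CD table is reachable */
};

/* Toy configuration cache for a single StreamID (hypothetical, not real hardware). */
struct model_cfg_cache {
	bool ste_valid;
	bool ste_s1_enabled;
	bool cd_valid[MODEL_NUM_SSIDS];
};

/* A transaction arrives: fill the cache from memory as a walker might. */
static void model_transaction(struct model_cfg_cache *c,
			      const struct model_ste *ste, size_t ssid)
{
	if (!c->ste_valid) {
		c->ste_valid = true;
		c->ste_s1_enabled = ste->s1_enabled;
	}
	/* A CD is only fetched (and cached) if the cached STE reaches the CD table. */
	if (c->ste_s1_enabled && ssid < MODEL_NUM_SSIDS)
		c->cd_valid[ssid] = true;
}

/* CMD_CFGI_STE as quoted: drops the STE and every CD cached via this StreamID. */
static void model_cfgi_ste(struct model_cfg_cache *c)
{
	size_t i;

	c->ste_valid = false;
	for (i = 0; i < MODEL_NUM_SSIDS; i++)
		c->cd_valid[i] = false;
}

/* Number of CDs currently held in the toy cache. */
static size_t model_cached_cds(const struct model_cfg_cache *c)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_NUM_SSIDS; i++)
		n += c->cd_valid[i];
	return n;
}

/* Detach in the order discussed: zap the STE in memory, then invalidate. */
static size_t model_detach_demo(void)
{
	struct model_ste ste = { .s1_enabled = true };
	struct model_cfg_cache cache = { 0 };

	model_transaction(&cache, &ste, 1);	/* CD 1 is now cached */
	assert(model_cached_cds(&cache) == 1);

	ste.s1_enabled = false;			/* CD table unreachable in memory */
	model_cfgi_ste(&cache);			/* drops the STE and the cached CDs */
	model_transaction(&cache, &ste, 1);	/* refetch sees bypass/abort */

	return model_cached_cds(&cache);	/* nothing left to invalidate */
}
```

In this model `model_detach_demo()` ends with no cached CDs. If the two steps are swapped, a transaction arriving between the invalidation and the STE update can cache a CD again, which is why the "unreachable before invalidation" condition matters.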
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f5ad386cc8760..488d12dd2d4aa 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -985,6 +985,9 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 		},
 	};
 
+	if (!master->cd_table.installed)
+		return;
+
 	cmds.num = 0;
 	for (i = 0; i < master->num_streams; i++) {
 		cmd.cfgi.sid = master->streams[i].id;
@@ -1091,7 +1094,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
 	cdptr[3] = cpu_to_le64(cd->mair);
 
 	/*
-	 * STE is live, and the SMMU might read dwords of this CD in any
+	 * STE may be live, and the SMMU might read dwords of this CD in any
 	 * order. Ensure that it observes valid values before reading
 	 * V=1.
 	 */
@@ -1333,6 +1336,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		 */
 		if (smmu)
 			arm_smmu_sync_ste_for_sid(smmu, sid);
+		master->cd_table.installed = false;
 		return;
 	}
 
@@ -1360,6 +1364,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				   cd_table->l1_desc ?
 					   STRTAB_STE_0_S1FMT_64K_L2 :
 					   STRTAB_STE_0_S1FMT_LINEAR);
+		cd_table->installed = true;
+	} else {
+		master->cd_table.installed = false;
 	}
 
 	if (s2_cfg) {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1f3b370257779..e76452e735a04 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -599,6 +599,8 @@ struct arm_smmu_ctx_desc_cfg {
 	u8				max_cds_bits;
 	/* Whether CD entries in this table have the stall bit set. */
 	u8				stall_enabled:1;
+	/* Whether this CD table is installed in any STE */
+	u8				installed:1;
 };
 
 struct arm_smmu_s2_cfg {
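The effect of the `installed` bit in the diff above can be shown in isolation with a small standalone model. This is a sketch, not kernel code: `model_master`, `model_cd_table`, `model_sync_cd()`, and `model_write_ste()` are hypothetical stand-ins for the driver's structures and functions, and the command queue is replaced by a simple counter of simulated CMD_CFGI_CD commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the kernel definitions. */
struct model_cd_table {
	bool installed;			/* mirrors the `installed` bit added by the patch */
};

struct model_master {
	struct model_cd_table cd_table;
	size_t num_streams;
	size_t cfgi_cd_issued;		/* counts simulated CMD_CFGI_CD commands */
};

/* Simulated arm_smmu_sync_cd(): one command per stream, skipped if not installed. */
static void model_sync_cd(struct model_master *m)
{
	if (!m->cd_table.installed)
		return;
	m->cfgi_cd_issued += m->num_streams;
}

/*
 * Simulated STE write: stage1 == true installs the CD table; false stands for
 * the bypass/abort and stage-2-only cases, where the flag is cleared.
 */
static void model_write_ste(struct model_master *m, bool stage1)
{
	m->cd_table.installed = stage1;
}

/* Walk through attach and detach, checking when the sync actually runs. */
static void model_sync_demo(void)
{
	struct model_master m = { .num_streams = 2 };

	model_sync_cd(&m);		/* table not installed yet: skipped */
	assert(m.cfgi_cd_issued == 0);

	model_write_ste(&m, true);
	model_sync_cd(&m);		/* installed: one command per stream */
	assert(m.cfgi_cd_issued == 2);

	model_write_ste(&m, false);
	model_sync_cd(&m);		/* STE zapped: skipped again */
	assert(m.cfgi_cd_issued == 2);
}
```

Calling `model_sync_demo()` from a `main()` exercises all three cases. The last one corresponds to the detach path discussed in the thread, where the CD is written after the STE has been zapped so that the sync can be skipped.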