Message ID | 09a290f1-27a0-5ee3-16b9-659ef2ba99dc@free.fr (mailing list archive) |
---|---|
State | Mainlined, archived |
Commit | 4e4abae311e4b44aaf61f18a826fd7136037f199 |
Headers | show |
Series | [v3] iommu/arm-smmu: Avoid constant zero in TLBI writes | expand |
[+Joerg on To:] On Mon, Jun 03, 2019 at 02:15:37PM +0200, Marc Gonzalez wrote: > From: Robin Murphy <robin.murphy@arm.com> > > Apparently, some Qualcomm arm64 platforms which appear to expose their > SMMU global register space are still, in fact, using a hypervisor to > mediate it by trapping and emulating register accesses. Sadly, some > deployed versions of said trapping code have bugs wherein they go > horribly wrong for stores using r31 (i.e. XZR/WZR) as the source > register. > > While this can be mitigated for GCC today by tweaking the constraints > for the implementation of writel_relaxed(), to avoid any potential > arms race with future compilers more aggressively optimising register > allocation, the simple way is to just remove all the problematic > constant zeros. For the write-only TLB operations, the actual value is > irrelevant anyway and any old nearby variable will provide a suitable > GPR to encode. The one point at which we really do need a zero to clear > a context bank happens before any of the TLB maintenance where crashes > have been reported, so is apparently not a problem... :/ > > Reported-by: AngeloGioacchino Del Regno <kholk11@gmail.com> > Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Acked-by: Will Deacon <will.deacon@arm.com> Joerg -- Please can you take this as a fix for 5.2, with a Cc stable? Cheers, Will
On 05/06/2019 14:19, Will Deacon wrote: > On Mon, Jun 03, 2019 at 02:15:37PM +0200, Marc Gonzalez wrote: > >> From: Robin Murphy <robin.murphy@arm.com> >> >> Apparently, some Qualcomm arm64 platforms which appear to expose their >> SMMU global register space are still, in fact, using a hypervisor to >> mediate it by trapping and emulating register accesses. Sadly, some >> deployed versions of said trapping code have bugs wherein they go >> horribly wrong for stores using r31 (i.e. XZR/WZR) as the source >> register. >> >> While this can be mitigated for GCC today by tweaking the constraints >> for the implementation of writel_relaxed(), to avoid any potential >> arms race with future compilers more aggressively optimising register >> allocation, the simple way is to just remove all the problematic >> constant zeros. For the write-only TLB operations, the actual value is >> irrelevant anyway and any old nearby variable will provide a suitable >> GPR to encode. The one point at which we really do need a zero to clear >> a context bank happens before any of the TLB maintenance where crashes >> have been reported, so is apparently not a problem... :/ >> >> Reported-by: AngeloGioacchino Del Regno <kholk11@gmail.com> >> Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > > Acked-by: Will Deacon <will.deacon@arm.com> > > Joerg -- Please can you take this as a fix for 5.2, with a Cc stable? Hello Joerg, Can you ping this thread once this patch hits linux-next, so I can ask Bjorn to pick up the 8998 ANOC1 DT node, and the PCIe DT node that requires ANOC1. Bjorn: for ANOC1, a small fixup: s/arm,smmu/iommu/ https://patchwork.kernel.org/project/linux-arm-msm/list/?series=99701 https://patchwork.kernel.org/patch/10895341/ Regards.
On Mon, Jun 03, 2019 at 02:15:37PM +0200, Marc Gonzalez wrote: > drivers/iommu/arm-smmu.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) Applied, thanks. It should show up in linux-next in the next days.
[ Trimming recipients to minimize inconvenience ] On 12/06/2019 10:10, Joerg Roedel wrote: > On Mon, Jun 03, 2019 at 02:15:37PM +0200, Marc Gonzalez wrote: > >> drivers/iommu/arm-smmu.c | 15 ++++++++++++--- >> 1 file changed, 12 insertions(+), 3 deletions(-) > > Applied, thanks. It should show up in linux-next in the next days. Almost there... Should be in tomorrow's next. https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/log/?h=next https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=next&id=4e4abae311e4b44aaf61f18a826fd7136037f199 Regards.
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 930c07635956..9435e4a7759f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -59,6 +59,15 @@ #include "arm-smmu-regs.h" +/* + * Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU + * global register space are still, in fact, using a hypervisor to mediate it + * by trapping and emulating register accesses. Sadly, some deployed versions + * of said trapping code have bugs wherein they go horribly wrong for stores + * using r31 (i.e. XZR/WZR) as the source register. + */ +#define QCOM_DUMMY_VAL -1 + #define ARM_MMU500_ACTLR_CPRE (1 << 1) #define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) @@ -423,7 +432,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, { unsigned int spin_cnt, delay; - writel_relaxed(0, sync); + writel_relaxed(QCOM_DUMMY_VAL, sync); for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) { for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE)) @@ -1761,8 +1770,8 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) } /* Invalidate the TLB, just in case */ - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); + writel_relaxed(QCOM_DUMMY_VAL, gr0_base + ARM_SMMU_GR0_TLBIALLH); + writel_relaxed(QCOM_DUMMY_VAL, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);