[1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE

Message ID d2a3ddb17b3270e268e2f1adf7682ea938823941.1600213517.git.robin.murphy@arm.com
State New, archived
Series drm: panfrost: Coherency support

Commit Message

Robin Murphy Sept. 15, 2020, 11:51 p.m. UTC
Midgard GPUs have ACE-Lite master interfaces which allows systems to
integrate them in an I/O-coherent manner. It seems that from the GPU's
viewpoint, the rest of the system is its outer shareable domain, and so
even when snoop signals are wired up, they are only emitted for outer
shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
indeed get coherent pagetable walks working nicely for the coherent
T620 in the Arm Juno SoC.

Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/io-pgtable-arm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
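
For readers without the kernel tree to hand, the two decisions this patch touches can be modelled as a small standalone sketch. Every DEMO_* name and bit value below is a hypothetical stand-in chosen for illustration, not a definition copied from io-pgtable-arm.c, and the page-alignment assert is an added assumption rather than something the patch checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the page table formats; not kernel enums. */
enum demo_fmt { DEMO_FMT_GENERIC_LPAE, DEMO_FMT_MALI_LPAE };

/* Illustrative bit values only; treat all of them as assumptions. */
#define DEMO_PROT_CACHE         (1u << 2)    /* stand-in for IOMMU_CACHE */
#define DEMO_PTE_SH_OS          (2ull << 8)  /* outer shareable */
#define DEMO_PTE_SH_IS          (3ull << 8)  /* inner shareable */
#define DEMO_TTBR_ADRMODE_TABLE (3ull << 0)
#define DEMO_TTBR_READ_INNER    (1ull << 2)
#define DEMO_TTBR_SHARE_OUTER   (1ull << 4)

/*
 * Shareability bits for a leaf PTE. Cacheable mappings are inner shareable
 * for the generic format, but the Mali format always uses outer shareable,
 * because from the GPU's viewpoint the rest of the system is its outer
 * shareable domain.
 */
static uint64_t demo_pte_shareability(unsigned int prot, enum demo_fmt fmt)
{
	if ((prot & DEMO_PROT_CACHE) && fmt != DEMO_FMT_MALI_LPAE)
		return DEMO_PTE_SH_IS;
	return DEMO_PTE_SH_OS;
}

/*
 * Translation table base value for the Mali format. The outer shareable
 * bit is set only when page table walks are meant to be coherent.
 */
static uint64_t demo_mali_transtab(uint64_t pgd_phys, bool coherent_walk)
{
	uint64_t transtab;

	/* Assumption: the table base is 4K-aligned, so flag bits are free. */
	assert((pgd_phys & 0xfffull) == 0);

	transtab = pgd_phys | DEMO_TTBR_READ_INNER | DEMO_TTBR_ADRMODE_TABLE;
	if (coherent_walk)
		transtab |= DEMO_TTBR_SHARE_OUTER;
	return transtab;
}
```

In this model, a cacheable mapping in the Mali format comes back outer shareable, while the same mapping in the generic format comes back inner shareable. The walk attribute in the table base is controlled separately by the coherent_walk flag.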

Comments

Will Deacon Sept. 21, 2020, 5:57 p.m. UTC | #1
On Wed, Sep 16, 2020 at 12:51:05AM +0100, Robin Murphy wrote:
> Midgard GPUs have ACE-Lite master interfaces which allows systems to
> integrate them in an I/O-coherent manner. It seems that from the GPU's
> viewpoint, the rest of the system is its outer shareable domain, and so
> even when snoop signals are wired up, they are only emitted for outer
> shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
> indeed get coherent pagetable walks working nicely for the coherent
> T620 in the Arm Juno SoC.

I can't help but think some of this commentary deserves to be in the code
as well.

Do you know if this sort of thing is done for other SoCs too, or is this
just a Juno quirk?

Will

Robin Murphy Sept. 21, 2020, 9:53 p.m. UTC | #2
On 2020-09-21 18:57, Will Deacon wrote:
> On Wed, Sep 16, 2020 at 12:51:05AM +0100, Robin Murphy wrote:
>> Midgard GPUs have ACE-Lite master interfaces which allows systems to
>> integrate them in an I/O-coherent manner. It seems that from the GPU's
>> viewpoint, the rest of the system is its outer shareable domain, and so
>> even when snoop signals are wired up, they are only emitted for outer
>> shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
>> indeed get coherent pagetable walks working nicely for the coherent
>> T620 in the Arm Juno SoC.
> 
> I can't help but think some of this commentary deserves to be in the code
> as well.

Sure, if you want.

> Do you know if this sort of thing is done for other SoCs too, or is this
> just a Juno quirk?

Yup, this is a "Midgard working as designed" thing. Juno is the coherent 
example I have to hand, but off the top of my head I believe some of the 
Exynos SoCs can also use their GPUs coherently if a switch is flipped in 
the interconnect to change routing between the CCI and a direct-to-RAM 
path; I expect there are probably further Midgard examples that I'm not 
aware of. Then there are definitely coherent Bifrost GPUs like the 
Amlogic S922/A311 that prompted me to revive this patch, which we 
currently drive in "Legacy" mode and thus behave the same way as Midgard 
(Bifrost's "AArch64" mode realigns Ish and Osh with the rest of the 
system, and instead invents a new "Internal Shareable" value in between 
Nsh and Ish to represent the shareability between cores within the GPU 
for which Midgard hijacked Ish).

Robin.

Will Deacon Sept. 21, 2020, 10:24 p.m. UTC | #3
On Mon, Sep 21, 2020 at 10:53:23PM +0100, Robin Murphy wrote:
> On 2020-09-21 18:57, Will Deacon wrote:
> > On Wed, Sep 16, 2020 at 12:51:05AM +0100, Robin Murphy wrote:
> > > Midgard GPUs have ACE-Lite master interfaces which allows systems to
> > > integrate them in an I/O-coherent manner. It seems that from the GPU's
> > > viewpoint, the rest of the system is its outer shareable domain, and so
> > > even when snoop signals are wired up, they are only emitted for outer
> > > shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
> > > indeed get coherent pagetable walks working nicely for the coherent
> > > T620 in the Arm Juno SoC.
> > 
> > I can't help but think some of this commentary deserves to be in the code
> > as well.
> 
> Sure, if you want.

Yes, please.

> > Do you know if this sort of thing is done for other SoCs too, or is this
> > just a Juno quirk?
> 
> Yup, this is a "Midgard working as designed" thing. Juno is the coherent
> example I have to hand, but off the top of my head I believe some of the
> Exynos SoCs can also use their GPUs coherently if a switch is flipped in the
> interconnect to change routing between the CCI and a direct-to-RAM path; I
> expect there are probably further Midgard examples that I'm not aware of.
> Then there are definitely coherent Bifrost GPUs like the Amlogic S922/A311
> that prompted me to revive this patch, which we currently drive in "Legacy"
> mode and thus behave the same way as Midgard (Bifrost's "AArch64" mode
> realigns Ish and Osh with the rest of the system, and instead invents a new
> "Internal Shareable" value in between Nsh and Ish to represent the
> shareability between cores within the GPU for which Midgard hijacked Ish).

That is more than I wanted to know :) "Internal Shareable", jeez...

Thanks,

Will

Patch

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index dc7bcf858b6d..e47012006dcc 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -440,7 +440,7 @@  static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
 	}
 
-	if (prot & IOMMU_CACHE)
+	if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
 		pte |= ARM_LPAE_PTE_SH_IS;
 	else
 		pte |= ARM_LPAE_PTE_SH_OS;
@@ -1049,6 +1049,9 @@  arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 	cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
 					  ARM_MALI_LPAE_TTBR_READ_INNER |
 					  ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
+	if (cfg->coherent_walk)
+		cfg->arm_mali_lpae_cfg.transtab |= ARM_MALI_LPAE_TTBR_SHARE_OUTER;
+
 	return &data->iop;
 
 out_free_data: