diff mbox series

[v9,08/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode

Message ID 1565423901-17008-9-git-send-email-yong.wu@mediatek.com (mailing list archive)
State New, archived
Headers show
Series MT8183 IOMMU SUPPORT | expand

Commit Message

Yong Wu (吴勇) Aug. 10, 2019, 7:58 a.m. UTC
MediaTek extend the arm v7s descriptor to support the dram over 4GB.

In the mt2712 and mt8173, it's called "4GB mode", the physical address
is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
bit32 is always enabled. thus, in the M4U, we always enable the bit9
for all PTEs which means to enable bit32 of physical address. Here is
the detailed remap relationship in the "4GB mode":
CPU PA         ->    HW PA
0x4000_0000          0x1_4000_0000 (Add bit32)
0x8000_0000          0x1_8000_0000 ...
0xc000_0000          0x1_c000_0000 ...
0x1_0000_0000        0x1_0000_0000 (No change)

but in mt8183, M4U support the dram from 0x4000_0000 to 0x3_ffff_ffff
which isn't remaped. We extend the PTEs: the bit9 represent bit32 of
PA and the bit4 represent bit33 of PA. Meanwhile the iova still is
32bits.

The "4GB mode" is so special, it always add bit32 in paddr_to_iopte and
it only add bit32 while pa < 0x40000000UL in the iopte_to_paddr since the
valid pa is from 0x4000_0000 to 0x1_3fff_ffff. Thus I add a special macro
ARM_V7S_MTK_4GB_OAS(oas: 33) for "4GB mode".

Regarding whether the pagetable address could be over 4GB, the mt8183
support it while the previous mt8173 don't. thus keep it as is.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/io-pgtable-arm-v7s.c | 36 +++++++++++++++++++++++++++++-------
 drivers/iommu/mtk_iommu.c          | 23 +++++++++++++----------
 drivers/iommu/mtk_iommu.h          |  1 +
 include/linux/io-pgtable.h         |  9 +++++----
 4 files changed, 48 insertions(+), 21 deletions(-)

Comments

Will Deacon Aug. 14, 2019, 2:41 p.m. UTC | #1
Hi Yong Wu,

Sorry, but I'm still deeply confused by this patch.

On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> 
> In the mt2712 and mt8173, it's called "4GB mode", the physical address
> is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> bit32 is always enabled. thus, in the M4U, we always enable the bit9
> for all PTEs which means to enable bit32 of physical address. Here is
> the detailed remap relationship in the "4GB mode":
> CPU PA         ->    HW PA
> 0x4000_0000          0x1_4000_0000 (Add bit32)
> 0x8000_0000          0x1_8000_0000 ...
> 0xc000_0000          0x1_c000_0000 ...
> 0x1_0000_0000        0x1_0000_0000 (No change)

So in this example, there are no PAs below 0x4000_0000 yet you later
add code to deal with that:

> +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> +		paddr |= BIT_ULL(32);

Why? Mainline currently doesn't do anything like this for the "4gb mode"
support as far as I can tell. In fact, we currently unconditionally set
bit 32 in the physical address returned by iova_to_phys() which wouldn't
match your CPU PAs listed above, so I'm confused about how this is supposed
to work.

The way I would like this quirk to work is that the io-pgtable code
basically sets bit 9 in the pte when bit 32 is set in the physical address,
and sets bit 4 in the pte when bit 33 is set in the physical address. It
would then do the opposite when converting a pte to a physical address.

That way, your driver can call the page table code directly with the high
addresses and we don't have to do any manual offsetting or range checking
in the page table code.

Please can you explain to me why the diff below doesn't work on top of
this series? I'm happy to chat on IRC if you think it would be easier,
because I have a horrible feeling that we've been talking past each other
and I'd like to see this support merged for 5.4.

Will

--->8

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index ab12ef5f8b03..d8d84617c822 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
 	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
 
 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
-		if ((paddr & BIT_ULL(32)) || cfg->oas == ARM_V7S_MTK_4GB_OAS)
+		if (paddr & BIT_ULL(32))
 			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
 		if (paddr & BIT_ULL(33))
 			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
@@ -206,17 +206,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
 		mask = ARM_V7S_LVL_MASK(lvl);
 
 	paddr = pte & mask;
-	if (cfg->oas == 32 || !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
-		return paddr;
 
-	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
-		paddr |= BIT_ULL(33);
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
+		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
+			paddr |= BIT_ULL(32);
+		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
+			paddr |= BIT_ULL(33);
+	}
 
-	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
-	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
-		paddr |= BIT_ULL(32);
-	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
-		paddr |= BIT_ULL(32);
 	return paddr;
 }
 
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d5b9454352fd..3ae54dedede0 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -286,7 +286,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 	if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
 		dom->cfg.oas = 32;
 	else if (data->enable_4GB)
-		dom->cfg.oas = ARM_V7S_MTK_4GB_OAS;
+		dom->cfg.oas = 33;
 	else
 		dom->cfg.oas = 34;
 
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 27337395bd42..a2a52c349fe4 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -113,8 +113,6 @@ struct io_pgtable_cfg {
 	};
 };
 
-#define ARM_V7S_MTK_4GB_OAS			33
-
 /**
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
Yong Wu (吴勇) Aug. 15, 2019, 8:47 a.m. UTC | #2
On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> Hi Yong Wu,
> 
> Sorry, but I'm still deeply confused by this patch.

Sorry for this. the "4GB mode" really is a bit odd...

> 
> On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > 
> > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > for all PTEs which means to enable bit32 of physical address. Here is
> > the detailed remap relationship in the "4GB mode":
> > CPU PA         ->    HW PA
> > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > 0x8000_0000          0x1_8000_0000 ...
> > 0xc000_0000          0x1_c000_0000 ...
> > 0x1_0000_0000        0x1_0000_0000 (No change)
> 
> So in this example, there are no PAs below 0x4000_0000 yet you later
> add code to deal with that:
> 
> > +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > +		paddr |= BIT_ULL(32);
> 
> Why? Mainline currently doesn't do anything like this for the "4gb mode"
> support as far as I can tell. In fact, we currently unconditionally set
> bit 32 in the physical address returned by iova_to_phys() which wouldn't
> match your CPU PAs listed above, so I'm confused about how this is supposed
> to work.

Actually current mainline have a bug for this. So I tried to use another
special patch[1] for it in v8.

But the issue is not critical since MediaTek multimedia consumer(v4l2
and drm) don't call iommu_iova_to_phys currently.

> 
> The way I would like this quirk to work is that the io-pgtable code
> basically sets bit 9 in the pte when bit 32 is set in the physical address,
> and sets bit 4 in the pte when bit 33 is set in the physical address. It
> would then do the opposite when converting a pte to a physical address.
> 
> That way, your driver can call the page table code directly with the high
> addresses and we don't have to do any manual offsetting or range checking
> in the page table code.

In this case, the mt8183 can work successfully while the "4gb
mode"(mt8173/mt2712) can not.

In the "4gb mode", As the remap relationship above, we should always add
bit32 in pte as we did in [2]. and need add bit32 in the
"iova_to_phys"(Not always add.). That means the "4gb mode" has a special
flow:
a. Always add bit32 in paddr_to_iopte.
b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.

> 
> Please can you explain to me why the diff below doesn't work on top of
> this series?

The diff below is just I did in v8[3]. The different is that I move the
"4gb mode" special flow in the mtk_iommu.c in v8, the code is like
[4]below. When I sent v9, I found that I can distinguish the "4gb mode"
with "oas == 33" in v7s. then I can "simply" add the 4gb special flow[5]
based on your diff.


>  I'm happy to chat on IRC if you think it would be easier,
> because I have a horrible feeling that we've been talking past each other
> and I'd like to see this support merged for 5.4.

Thanks very much for your view, I'm sorry that I don't have IRC. I will
send the next version quickly if we have a conclusion here. Then Which
way is better? If you'd like keep the pagetable code clean, I will add
the "4gb mode" special flow into mtk_iommu.c.

Thanks.


[1]http://lists.infradead.org/pipermail/linux-mediatek/2019-June/020988.html
[2]
https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/iommu/io-pgtable-arm-v7s.c#L299
[3]http://lists.infradead.org/pipermail/linux-mediatek/2019-June/020991.html

[4]======4gb mode special flow in mtk_iommu.c======================

+#define MTK_IOMMU_4GB_MODE_REMAP_BASE	 0x140000000UL

@@ -380,12 +379,16 @@ static int mtk_iommu_map(struct iommu_domain
*domain, unsigned long iova,
 			 phys_addr_t paddr, size_t size, int prot)
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 	unsigned long flags;
 	int ret;
 
+	/* The "4GB mode" M4U physically can not use the lower remap of Dram.
*/
+	if (data->enable_4GB)
+		paddr |= BIT_ULL(32);
+
 	spin_lock_irqsave(&dom->pgtlock, flags);
-	ret = dom->iop->map(dom->iop, iova, paddr & DMA_BIT_MASK(32),
-			    size, prot);
+	ret = dom->iop->map(dom->iop, iova, paddr, size, prot);
 	spin_unlock_irqrestore(&dom->pgtlock, flags);
 
 	return ret;
@@ -422,8 +425,8 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct
iommu_domain *domain,
 	pa = dom->iop->iova_to_phys(dom->iop, iova);
 	spin_unlock_irqrestore(&dom->pgtlock, flags);
 
-	if (data->enable_4GB && pa < MTK_IOMMU_4GB_MODE_REMAP_BASE)
-		pa |= BIT_ULL(32);
+	if (data->enable_4GB && pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
+		pa &= ~BIT_ULL(32);
 
 	return pa;
 }
=============================================================

[5]:
=========================================================
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c
b/drivers/iommu/io-pgtable-arm-v7s.c
index 78fd11e..8e974a5 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t
paddr, int lvl,
 	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
 
 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
-		if (paddr & BIT_ULL(32))
+		if (paddr & BIT_ULL(32) || cfg->oas == 33)
 			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
 		if (paddr & BIT_ULL(33))
 			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
@@ -207,7 +207,9 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte,
int lvl,
 
 	paddr = pte & mask;
 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
-		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
+		if (cfg->oas == 33 && paddr < 0x40000000UL)
+			paddr |= BIT_ULL(32);
+		else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
 			paddr |= BIT_ULL(32);
 		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
 			paddr |= BIT_ULL(33);
============================================================

> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index ab12ef5f8b03..d8d84617c822 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
>  	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
>  
>  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> -		if ((paddr & BIT_ULL(32)) || cfg->oas == ARM_V7S_MTK_4GB_OAS)
> +		if (paddr & BIT_ULL(32))
>  			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
>  		if (paddr & BIT_ULL(33))
>  			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> @@ -206,17 +206,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
>  		mask = ARM_V7S_LVL_MASK(lvl);
>  
>  	paddr = pte & mask;
> -	if (cfg->oas == 32 || !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
> -		return paddr;
>  
> -	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> -		paddr |= BIT_ULL(33);
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> +			paddr |= BIT_ULL(32);
> +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> +			paddr |= BIT_ULL(33);
> +	}
>  
> -	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> -	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> -		paddr |= BIT_ULL(32);
> -	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> -		paddr |= BIT_ULL(32);
>  	return paddr;
>  }
>  
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index d5b9454352fd..3ae54dedede0 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -286,7 +286,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
>  	if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
>  		dom->cfg.oas = 32;
>  	else if (data->enable_4GB)
> -		dom->cfg.oas = ARM_V7S_MTK_4GB_OAS;
> +		dom->cfg.oas = 33;
>  	else
>  		dom->cfg.oas = 34;
>  
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 27337395bd42..a2a52c349fe4 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -113,8 +113,6 @@ struct io_pgtable_cfg {
>  	};
>  };
>  
> -#define ARM_V7S_MTK_4GB_OAS			33
> -
>  /**
>   * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
>   *
Will Deacon Aug. 15, 2019, 9:51 a.m. UTC | #3
On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > 
> > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > for all PTEs which means to enable bit32 of physical address. Here is
> > > the detailed remap relationship in the "4GB mode":
> > > CPU PA         ->    HW PA
> > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > 0x8000_0000          0x1_8000_0000 ...
> > > 0xc000_0000          0x1_c000_0000 ...
> > > 0x1_0000_0000        0x1_0000_0000 (No change)
> > 
> > So in this example, there are no PAs below 0x4000_0000 yet you later
> > add code to deal with that:
> > 
> > > +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > > +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > > +		paddr |= BIT_ULL(32);
> > 
> > Why? Mainline currently doesn't do anything like this for the "4gb mode"
> > support as far as I can tell. In fact, we currently unconditionally set
> > bit 32 in the physical address returned by iova_to_phys() which wouldn't
> > match your CPU PAs listed above, so I'm confused about how this is supposed
> > to work.
> 
> Actually current mainline have a bug for this. So I tried to use another
> special patch[1] for it in v8.

If you're fixing a bug in mainline, I'd prefer to see that as a separate
patch.

> But the issue is not critical since MediaTek multimedia consumer(v4l2
> and drm) don't call iommu_iova_to_phys currently.
> 
> > 
> > The way I would like this quirk to work is that the io-pgtable code
> > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > would then do the opposite when converting a pte to a physical address.
> > 
> > That way, your driver can call the page table code directly with the high
> > addresses and we don't have to do any manual offsetting or range checking
> > in the page table code.
> 
> In this case, the mt8183 can work successfully while the "4gb
> mode"(mt8173/mt2712) can not.
> 
> In the "4gb mode", As the remap relationship above, we should always add
> bit32 in pte as we did in [2]. and need add bit32 in the
> "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> flow:
> a. Always add bit32 in paddr_to_iopte.
> b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.

I think this is probably at the heart of my misunderstanding. What is so
special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
or something else?

> > Please can you explain to me why the diff below doesn't work on top of
> > this series?
> 
> The diff below is just I did in v8[3]. The different is that I move the
> "4gb mode" special flow in the mtk_iommu.c in v8, the code is like
> [4]below. When I sent v9, I found that I can distinguish the "4gb mode"
> with "oas == 33" in v7s. then I can "simply" add the 4gb special flow[5]
> based on your diff.
> 
> 
> >  I'm happy to chat on IRC if you think it would be easier,
> > because I have a horrible feeling that we've been talking past each other
> > and I'd like to see this support merged for 5.4.
> 
> Thanks very much for your view, I'm sorry that I don't have IRC. I will
> send the next version quickly if we have a conclusion here. Then Which
> way is better? If you'd like keep the pagetable code clean, I will add
> the "4gb mode" special flow into mtk_iommu.c.

I mean, we could even talk on the phone if necessary because I can't accept
this code unless I understand how it works!

To be blunt, I'd like to avoid the io-pgtable changes looking different to
what I suggested:

> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > index ab12ef5f8b03..d8d84617c822 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> >  	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> >  
> >  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > -		if ((paddr & BIT_ULL(32)) || cfg->oas == ARM_V7S_MTK_4GB_OAS)
> > +		if (paddr & BIT_ULL(32))
> >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> >  		if (paddr & BIT_ULL(33))
> >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > @@ -206,17 +206,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> >  		mask = ARM_V7S_LVL_MASK(lvl);
> >  
> >  	paddr = pte & mask;
> > -	if (cfg->oas == 32 || !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
> > -		return paddr;
> >  
> > -	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > -		paddr |= BIT_ULL(33);
> > +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > +			paddr |= BIT_ULL(32);
> > +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > +			paddr |= BIT_ULL(33);
> > +	}
> >  
> > -	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > -	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > -		paddr |= BIT_ULL(32);
> > -	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > -		paddr |= BIT_ULL(32);
> >  	return paddr;
> >  }

so anything else should ideally go in the driver. The change above gives
the driver control over bits 4 and 9 in the pte, which I hope should be
sufficient. That said, yet another thing I don't understand is how the
IOMMU page table walker views physical addresses :/

Anyway, in your diff here...

> [5]:
> =========================================================
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index 78fd11e..8e974a5 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t
> paddr, int lvl,
>  	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
>  
>  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> -		if (paddr & BIT_ULL(32))
> +		if (paddr & BIT_ULL(32) || cfg->oas == 33)
>  			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;

... I'd like to drop the oas check, because the driver should be passing
in physical addresses with bit 32 set in this case, and...

>  		if (paddr & BIT_ULL(33))
>  			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> @@ -207,7 +207,9 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte,
> int lvl,
>  
>  	paddr = pte & mask;
>  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> -		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> +		if (cfg->oas == 33 && paddr < 0x40000000UL)
> +			paddr |= BIT_ULL(32);

... here I simply don't understand the significance of 0x40000000.

Will
Yong Wu (吴勇) Aug. 15, 2019, 10:03 a.m. UTC | #4
On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > 
> > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > the detailed remap relationship in the "4GB mode":
> > > > CPU PA         ->    HW PA
> > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > 0x8000_0000          0x1_8000_0000 ...
> > > > 0xc000_0000          0x1_c000_0000 ...
> > > > 0x1_0000_0000        0x1_0000_0000 (No change)
> > > 
> > > So in this example, there are no PAs below 0x4000_0000 yet you later
> > > add code to deal with that:
> > > 
> > > > +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > > > +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > > > +		paddr |= BIT_ULL(32);
> > > 
> > > Why? Mainline currently doesn't do anything like this for the "4gb mode"
> > > support as far as I can tell. In fact, we currently unconditionally set
> > > bit 32 in the physical address returned by iova_to_phys() which wouldn't
> > > match your CPU PAs listed above, so I'm confused about how this is supposed
> > > to work.
> > 
> > Actually current mainline have a bug for this. So I tried to use another
> > special patch[1] for it in v8.
> 
> If you're fixing a bug in mainline, I'd prefer to see that as a separate
> patch.
> 
> > But the issue is not critical since MediaTek multimedia consumer(v4l2
> > and drm) don't call iommu_iova_to_phys currently.
> > 
> > > 
> > > The way I would like this quirk to work is that the io-pgtable code
> > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > would then do the opposite when converting a pte to a physical address.
> > > 
> > > That way, your driver can call the page table code directly with the high
> > > addresses and we don't have to do any manual offsetting or range checking
> > > in the page table code.
> > 
> > In this case, the mt8183 can work successfully while the "4gb
> > mode"(mt8173/mt2712) can not.
> > 
> > In the "4gb mode", As the remap relationship above, we should always add
> > bit32 in pte as we did in [2]. and need add bit32 in the
> > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > flow:
> > a. Always add bit32 in paddr_to_iopte.
> > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> 
> I think this is probably at the heart of my misunderstanding. What is so
> special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> or something else?

SRAM and the HW registers.

> 
> > > Please can you explain to me why the diff below doesn't work on top of
> > > this series?
> > 
> > The diff below is just I did in v8[3]. The different is that I move the
> > "4gb mode" special flow in the mtk_iommu.c in v8, the code is like
> > [4]below. When I sent v9, I found that I can distinguish the "4gb mode"
> > with "oas == 33" in v7s. then I can "simply" add the 4gb special flow[5]
> > based on your diff.
> > 
> > 
> > >  I'm happy to chat on IRC if you think it would be easier,
> > > because I have a horrible feeling that we've been talking past each other
> > > and I'd like to see this support merged for 5.4.
> > 
> > Thanks very much for your view, I'm sorry that I don't have IRC. I will
> > send the next version quickly if we have a conclusion here. Then Which
> > way is better? If you'd like keep the pagetable code clean, I will add
> > the "4gb mode" special flow into mtk_iommu.c.
> 
> I mean, we could even talk on the phone if necessary because I can't accept
> this code unless I understand how it works!
> 
> To be blunt, I'd like to avoid the io-pgtable changes looking different to
> what I suggested:
> 
> > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > > index ab12ef5f8b03..d8d84617c822 100644
> > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > @@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> > >  	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> > >  
> > >  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > > -		if ((paddr & BIT_ULL(32)) || cfg->oas == ARM_V7S_MTK_4GB_OAS)
> > > +		if (paddr & BIT_ULL(32))
> > >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> > >  		if (paddr & BIT_ULL(33))
> > >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > > @@ -206,17 +206,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> > >  		mask = ARM_V7S_LVL_MASK(lvl);
> > >  
> > >  	paddr = pte & mask;
> > > -	if (cfg->oas == 32 || !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
> > > -		return paddr;
> > >  
> > > -	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > -		paddr |= BIT_ULL(33);
> > > +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > > +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > +			paddr |= BIT_ULL(32);
> > > +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > > +			paddr |= BIT_ULL(33);
> > > +	}
> > >  
> > > -	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > > -	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > > -		paddr |= BIT_ULL(32);
> > > -	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > > -		paddr |= BIT_ULL(32);
> > >  	return paddr;
> > >  }
> 
> so anything else should ideally go in the driver. The change above gives
> the driver control over bits 4 and 9 in the pte, which I hope should be
> sufficient. That said, yet another thing I don't understand is how the
> IOMMU page table walker views physical addresses :/

Thanks the confirm. I will keep this way in the next version.

> 
> Anyway, in your diff here...
> 
> > [5]:
> > =========================================================
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c
> > b/drivers/iommu/io-pgtable-arm-v7s.c
> > index 78fd11e..8e974a5 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -184,7 +184,7 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t
> > paddr, int lvl,
> >  	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> >  
> >  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > -		if (paddr & BIT_ULL(32))
> > +		if (paddr & BIT_ULL(32) || cfg->oas == 33)
> >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
> 
> ... I'd like to drop the oas check, because the driver should be passing
> in physical addresses with bit 32 set in this case, and...
> 
> >  		if (paddr & BIT_ULL(33))
> >  			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
> > @@ -207,7 +207,9 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte,
> > int lvl,
> >  
> >  	paddr = pte & mask;
> >  	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
> > -		if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> > +		if (cfg->oas == 33 && paddr < 0x40000000UL)
> > +			paddr |= BIT_ULL(32);
> 
> ... here I simply don't understand the significance of 0x40000000.
> 
> Will
Will Deacon Aug. 15, 2019, 10:04 a.m. UTC | #5
On Thu, Aug 15, 2019 at 06:03:30PM +0800, Yong Wu wrote:
> On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> > On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > > 
> > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > > the detailed remap relationship in the "4GB mode":
> > > > > CPU PA         ->    HW PA
> > > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > > 0x8000_0000          0x1_8000_0000 ...
> > > > > 0xc000_0000          0x1_c000_0000 ...
> > > > > 0x1_0000_0000        0x1_0000_0000 (No change)
> > > > 
> > > > So in this example, there are no PAs below 0x4000_0000 yet you later
> > > > add code to deal with that:
> > > > 
> > > > > +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > > > > +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > > > > +		paddr |= BIT_ULL(32);
> > > > 
> > > > Why? Mainline currently doesn't do anything like this for the "4gb mode"
> > > > support as far as I can tell. In fact, we currently unconditionally set
> > > > bit 32 in the physical address returned by iova_to_phys() which wouldn't
> > > > match your CPU PAs listed above, so I'm confused about how this is supposed
> > > > to work.
> > > 
> > > Actually current mainline have a bug for this. So I tried to use another
> > > special patch[1] for it in v8.
> > 
> > If you're fixing a bug in mainline, I'd prefer to see that as a separate
> > patch.
> > 
> > > But the issue is not critical since MediaTek multimedia consumer(v4l2
> > > and drm) don't call iommu_iova_to_phys currently.
> > > 
> > > > 
> > > > The way I would like this quirk to work is that the io-pgtable code
> > > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > > would then do the opposite when converting a pte to a physical address.
> > > > 
> > > > That way, your driver can call the page table code directly with the high
> > > > addresses and we don't have to do any manual offsetting or range checking
> > > > in the page table code.
> > > 
> > > In this case, the mt8183 can work successfully while the "4gb
> > > mode"(mt8173/mt2712) can not.
> > > 
> > > In the "4gb mode", As the remap relationship above, we should always add
> > > bit32 in pte as we did in [2]. and need add bit32 in the
> > > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > > flow:
> > > a. Always add bit32 in paddr_to_iopte.
> > > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> > 
> > I think this is probably at the heart of my misunderstanding. What is so
> > special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> > or something else?
> 
> SRAM and the HW registers.

Do we actually need to be able to map those in the IOMMU?

Will
Yong Wu (吴勇) Aug. 15, 2019, 10:18 a.m. UTC | #6
On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > 
> > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > the detailed remap relationship in the "4GB mode":
> > > > CPU PA         ->    HW PA
> > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > 0x8000_0000          0x1_8000_0000 ...
> > > > 0xc000_0000          0x1_c000_0000 ...
> > > > 0x1_0000_0000        0x1_0000_0000 (No change)
> > > 
> > > So in this example, there are no PAs below 0x4000_0000 yet you later
> > > add code to deal with that:
> > > 
> > > > +	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
> > > > +	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
> > > > +		paddr |= BIT_ULL(32);
> > > 
> > > Why? Mainline currently doesn't do anything like this for the "4gb mode"
> > > support as far as I can tell. In fact, we currently unconditionally set
> > > bit 32 in the physical address returned by iova_to_phys() which wouldn't
> > > match your CPU PAs listed above, so I'm confused about how this is supposed
> > > to work.
> > 
> > Actually current mainline have a bug for this. So I tried to use another
> > special patch[1] for it in v8.
> 
> If you're fixing a bug in mainline, I'd prefer to see that as a separate
> patch.
> 
> > But the issue is not critical since MediaTek multimedia consumer(v4l2
> > and drm) don't call iommu_iova_to_phys currently.
> > 
> > > 
> > > The way I would like this quirk to work is that the io-pgtable code
> > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > would then do the opposite when converting a pte to a physical address.
> > > 
> > > That way, your driver can call the page table code directly with the high
> > > addresses and we don't have to do any manual offsetting or range checking
> > > in the page table code.
> > 
> > In this case, the mt8183 can work successfully while the "4gb
> > mode"(mt8173/mt2712) can not.
> > 
> > In the "4gb mode", As the remap relationship above, we should always add
> > bit32 in pte as we did in [2]. and need add bit32 in the
> > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > flow:
> > a. Always add bit32 in paddr_to_iopte.
> > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> 
> I think this is probably at the heart of my misunderstanding. What is so
> special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> or something else?

SRAM and HW register that IOMMU can not access.

(sorry, My mailbox has something wrong.)
Will Deacon Aug. 15, 2019, 11:50 a.m. UTC | #7
Ok, I think speaking to Robin helped me a bit with this...

On Thu, Aug 15, 2019 at 06:18:38PM +0800, Yong Wu wrote:
> On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> > On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > > 
> > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > > the detailed remap relationship in the "4GB mode":
> > > > > CPU PA         ->    HW PA
> > > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > > 0x8000_0000          0x1_8000_0000 ...
> > > > > 0xc000_0000          0x1_c000_0000 ...
> > > > > 0x1_0000_0000        0x1_0000_0000 (No change)

[...]

> > > > The way I would like this quirk to work is that the io-pgtable code
> > > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > > would then do the opposite when converting a pte to a physical address.
> > > > 
> > > > That way, your driver can call the page table code directly with the high
> > > > addresses and we don't have to do any manual offsetting or range checking
> > > > in the page table code.
> > > 
> > > In this case, the mt8183 can work successfully while the "4gb
> > > mode"(mt8173/mt2712) can not.
> > > 
> > > In the "4gb mode", As the remap relationship above, we should always add
> > > bit32 in pte as we did in [2]. and need add bit32 in the
> > > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > > flow:
> > > a. Always add bit32 in paddr_to_iopte.
> > > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> > 
> > I think this is probably at the heart of my misunderstanding. What is so
> > special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> > or something else?
> 
> SRAM and HW register that IOMMU can not access.

Ok, so redrawing your table from above, I think we can say something like:


CPU Physical address
====================

0G	1G	2G	3G	4G	5G
|---A---|---B---|---C---|---D---|---E---|
+--I/O--+------------Memory-------------+


IOMMU output physical address
=============================

				4G	5G	6G	7G	8G
				|---E---|---B---|---C---|---D---|
				+------------Memory-------------+


Do you agree? If so, what happens to region 'A' (the I/O region) in the
IOMMU output physical address space. Is it accessible?

Anyway, I think it's the job of the driver to convert between the two
address spaces, so that:

  - On ->map(), bit 32 of the CPU physical address is set before calling
    into the iopgtable code

  - The result from ->iova_to_phys() should be the result from the
    iopgtable code, but with the top bit cleared for addresses over
    5G.

This assumes that:

  1. We're ok setting bit 9 in the ptes mapping region 'E'.
  2. The IOMMU page-table walker uses CPU physical addresses

Are those true?

Thanks,

Will
Yong Wu (吴勇) Aug. 16, 2019, 7:22 a.m. UTC | #8
On Thu, 2019-08-15 at 12:50 +0100, Will Deacon wrote:
> Ok, I think speaking to Robin helped me a bit with this...
> 
> On Thu, Aug 15, 2019 at 06:18:38PM +0800, Yong Wu wrote:
> > On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> > > On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > > > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > > > 
> > > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > > > the detailed remap relationship in the "4GB mode":
> > > > > > CPU PA         ->    HW PA
> > > > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > > > 0x8000_0000          0x1_8000_0000 ...
> > > > > > 0xc000_0000          0x1_c000_0000 ...
> > > > > > 0x1_0000_0000        0x1_0000_0000 (No change)
> 
> [...]
> 
> > > > > The way I would like this quirk to work is that the io-pgtable code
> > > > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > > > would then do the opposite when converting a pte to a physical address.
> > > > > 
> > > > > That way, your driver can call the page table code directly with the high
> > > > > addresses and we don't have to do any manual offsetting or range checking
> > > > > in the page table code.
> > > > 
> > > > In this case, the mt8183 can work successfully while the "4gb
> > > > mode"(mt8173/mt2712) can not.
> > > > 
> > > > In the "4gb mode", As the remap relationship above, we should always add
> > > > bit32 in pte as we did in [2]. and need add bit32 in the
> > > > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > > > flow:
> > > > a. Always add bit32 in paddr_to_iopte.
> > > > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> > > 
> > > I think this is probably at the heart of my misunderstanding. What is so
> > > special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> > > or something else?
> > 
> > SRAM and HW register that IOMMU can not access.
> 
> Ok, so redrawing your table from above, I think we can say something like:
> 
> 
> CPU Physical address
> ====================
> 
> 0G	1G	2G	3G	4G	5G
> |---A---|---B---|---C---|---D---|---E---|
> +--I/O--+------------Memory-------------+
> 
> 
> IOMMU output physical address
> =============================
> 
> 				4G	5G	6G	7G	8G
> 				|---E---|---B---|---C---|---D---|
> 				+------------Memory-------------+
> 
> 
> Do you agree? 

Quite right.


> If so, what happens to region 'A' (the I/O region) in the
> IOMMU output physical address space. Is it accessible?

No. IOMMU can not access region 'A' above.

> 
> Anyway, I think it's the job of the driver to convert between the two
> address spaces, so that:
> 
>   - On ->map(), bit 32 of the CPU physical address is set before calling
>     into the iopgtable code
> 
>   - The result from ->iova_to_phys() should be the result from the
>     iopgtable code, but with the top bit cleared for addresses over
>     5G.
> 
> This assumes that:
> 
>   1. We're ok setting bit 9 in the ptes mapping region 'E'.
>   2. The IOMMU page-table walker uses CPU physical addresses
> 
> Are those true?

Yes. Then this patch would be close to the one[1] I sent in v8.

Do I need to split this patch into 2 ones?:
a).the pagetable code that support 34bit PA when MTK quirk is enabled.
It only has the symmetric code handle BIT32/BIT33. Besides, I will add
CONFIG_PHYS_ADDR_T_64BIT in the iopte_to_addr as commented before.

b) MTK code that apply the special "4gb mode" flow. And the "oas" will
always is 34 bit since v7s has already supported our case.

[1]http://lists.infradead.org/pipermail/linux-mediatek/2019-June/020991.html

> 
> Thanks,
> 
> Will
Will Deacon Aug. 19, 2019, 11:24 a.m. UTC | #9
On Fri, Aug 16, 2019 at 03:22:20PM +0800, Yong Wu wrote:
> On Thu, 2019-08-15 at 12:50 +0100, Will Deacon wrote:
> > Ok, I think speaking to Robin helped me a bit with this...
> > 
> > On Thu, Aug 15, 2019 at 06:18:38PM +0800, Yong Wu wrote:
> > > On Thu, 2019-08-15 at 10:51 +0100, Will Deacon wrote:
> > > > On Thu, Aug 15, 2019 at 04:47:49PM +0800, Yong Wu wrote:
> > > > > On Wed, 2019-08-14 at 15:41 +0100, Will Deacon wrote:
> > > > > > On Sat, Aug 10, 2019 at 03:58:08PM +0800, Yong Wu wrote:
> > > > > > > MediaTek extend the arm v7s descriptor to support the dram over 4GB.
> > > > > > > 
> > > > > > > In the mt2712 and mt8173, it's called "4GB mode", the physical address
> > > > > > > is from 0x4000_0000 to 0x1_3fff_ffff, but from EMI point of view, it
> > > > > > > is remapped to high address from 0x1_0000_0000 to 0x1_ffff_ffff, the
> > > > > > > bit32 is always enabled. thus, in the M4U, we always enable the bit9
> > > > > > > for all PTEs which means to enable bit32 of physical address. Here is
> > > > > > > the detailed remap relationship in the "4GB mode":
> > > > > > > CPU PA         ->    HW PA
> > > > > > > 0x4000_0000          0x1_4000_0000 (Add bit32)
> > > > > > > 0x8000_0000          0x1_8000_0000 ...
> > > > > > > 0xc000_0000          0x1_c000_0000 ...
> > > > > > > 0x1_0000_0000        0x1_0000_0000 (No change)
> > 
> > [...]
> > 
> > > > > > The way I would like this quirk to work is that the io-pgtable code
> > > > > > basically sets bit 9 in the pte when bit 32 is set in the physical address,
> > > > > > and sets bit 4 in the pte when bit 33 is set in the physical address. It
> > > > > > would then do the opposite when converting a pte to a physical address.
> > > > > > 
> > > > > > That way, your driver can call the page table code directly with the high
> > > > > > addresses and we don't have to do any manual offsetting or range checking
> > > > > > in the page table code.
> > > > > 
> > > > > In this case, the mt8183 can work successfully while the "4gb
> > > > > mode"(mt8173/mt2712) can not.
> > > > > 
> > > > > In the "4gb mode", As the remap relationship above, we should always add
> > > > > bit32 in pte as we did in [2]. and need add bit32 in the
> > > > > "iova_to_phys"(Not always add.). That means the "4gb mode" has a special
> > > > > flow:
> > > > > a. Always add bit32 in paddr_to_iopte.
> > > > > b. Add bit32 only when PA < 0x40000000 in iopte_to_paddr.
> > > > 
> > > > I think this is probably at the heart of my misunderstanding. What is so
> > > > special about PAs (is this HW PA or CPU PA?) below 0x40000000? Is this RAM
> > > > or something else?
> > > 
> > > SRAM and HW register that IOMMU can not access.
> > 
> > Ok, so redrawing your table from above, I think we can say something like:
> > 
> > 
> > CPU Physical address
> > ====================
> > 
> > 0G	1G	2G	3G	4G	5G
> > |---A---|---B---|---C---|---D---|---E---|
> > +--I/O--+------------Memory-------------+
> > 
> > 
> > IOMMU output physical address
> > =============================
> > 
> > 				4G	5G	6G	7G	8G
> > 				|---E---|---B---|---C---|---D---|
> > 				+------------Memory-------------+
> > 
> > 
> > Do you agree? 
> 
> Quite right.

Woohoo! So I finally got something right about this :) I'd be up for
including the diagrams above either in the commit message or in the IOMMU
driver code, along with a comment saying that region 'A' cannot be mapped
by the IOMMU and that the page-table walker uses CPU physical addresses.

> > If so, what happens to region 'A' (the I/O region) in the
> > IOMMU output physical address space. Is it accessible?
> 
> No. IOMMU can not access region 'A' above.

Got it. Thanks.

> > Anyway, I think it's the job of the driver to convert between the two
> > address spaces, so that:
> > 
> >   - On ->map(), bit 32 of the CPU physical address is set before calling
> >     into the iopgtable code
> > 
> >   - The result from ->iova_to_phys() should be the result from the
> >     iopgtable code, but with the top bit cleared for addresses over
> >     5G.
> > 
> > This assumes that:
> > 
> >   1. We're ok setting bit 9 in the ptes mapping region 'E'.
> >   2. The IOMMU page-table walker uses CPU physical addresses
> > 
> > Are those true?
> 
> Yes. Then this patch would be close to the one[1] I sent in v8.
> 
> Do I need to split this patch into 2 ones?:

Up to you. If you want to fix the current mainline behaviour of always
setting bit 4, then that should be a separate patch at the start of the
series which can be backported to stable. Is there a reason this doesn't go
wrong in practice?

> a).the pagetable code that support 34bit PA when MTK quirk is enabled.
> It only has the symmetric code handle BIT32/BIT33. Besides, I will add
> CONFIG_PHYS_ADDR_T_64BIT in the iopte_to_addr as commented before.

Hmm. I would prefer that the iopgtable code:

	* Range checks the paddr against the oas in ->map()
	* Refuses to accept an oas > 32 in ->alloc()

Then it's up to you whether you just want to pass an oas of 34 from the
IOMMU driver.

Will
diff mbox series

Patch

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 77cc1eb..ab12ef5 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -112,7 +112,9 @@ 
 #define ARM_V7S_TEX_MASK		0x7
 #define ARM_V7S_ATTR_TEX(val)		(((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
 
-#define ARM_V7S_ATTR_MTK_4GB		BIT(9) /* MTK extend it for 4GB mode */
+/* MediaTek extend the two bits for PA 32bit/33bit */
+#define ARM_V7S_ATTR_MTK_PA_BIT32	BIT(9)
+#define ARM_V7S_ATTR_MTK_PA_BIT33	BIT(4)
 
 /* *well, except for TEX on level 2 large pages, of course :( */
 #define ARM_V7S_CONT_PAGE_TEX_SHIFT	6
@@ -179,13 +181,22 @@  static dma_addr_t __arm_v7s_dma_addr(void *pages)
 static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
 				    struct io_pgtable_cfg *cfg)
 {
-	return paddr & ARM_V7S_LVL_MASK(lvl);
+	arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
+		if ((paddr & BIT_ULL(32)) || cfg->oas == ARM_V7S_MTK_4GB_OAS)
+			pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
+		if (paddr & BIT_ULL(33))
+			pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
+	}
+	return pte;
 }
 
 static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
 				  struct io_pgtable_cfg *cfg)
 {
 	arm_v7s_iopte mask;
+	phys_addr_t paddr;
 
 	if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
 		mask = ARM_V7S_TABLE_MASK;
@@ -194,7 +205,19 @@  static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
 	else
 		mask = ARM_V7S_LVL_MASK(lvl);
 
-	return pte & mask;
+	paddr = pte & mask;
+	if (cfg->oas == 32 || !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
+		return paddr;
+
+	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
+		paddr |= BIT_ULL(33);
+
+	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000.*/
+	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
+		paddr |= BIT_ULL(32);
+	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
+		paddr |= BIT_ULL(32);
+	return paddr;
 }
 
 static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
@@ -315,9 +338,6 @@  static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
 	if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
 		pte |= ARM_V7S_ATTR_NS_SECTION;
 
-	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)
-		pte |= ARM_V7S_ATTR_MTK_4GB;
-
 	return pte;
 }
 
@@ -731,7 +751,9 @@  static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 {
 	struct arm_v7s_io_pgtable *data;
 
-	if (cfg->ias > ARM_V7S_ADDR_BITS || cfg->oas > ARM_V7S_ADDR_BITS)
+	if (cfg->ias > ARM_V7S_ADDR_BITS ||
+	    (cfg->oas > ARM_V7S_ADDR_BITS &&
+	     !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)))
 		return NULL;
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 8300489..1a37db0 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -263,16 +263,20 @@  static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 	dom->cfg = (struct io_pgtable_cfg) {
 		.quirks = IO_PGTABLE_QUIRK_ARM_NS |
 			IO_PGTABLE_QUIRK_NO_PERMS |
-			IO_PGTABLE_QUIRK_TLBI_ON_MAP,
+			IO_PGTABLE_QUIRK_TLBI_ON_MAP |
+			IO_PGTABLE_QUIRK_ARM_MTK_EXT,
 		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
 		.ias = 32,
-		.oas = 32,
 		.tlb = &mtk_iommu_gather_ops,
 		.iommu_dev = data->dev,
 	};
 
-	if (data->enable_4GB)
-		dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_EXT;
+	if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
+		dom->cfg.oas = 32;
+	else if (data->enable_4GB)
+		dom->cfg.oas = ARM_V7S_MTK_4GB_OAS;
+	else
+		dom->cfg.oas = 34;
 
 	dom->iop = alloc_io_pgtable_ops(ARM_V7S, &dom->cfg, data);
 	if (!dom->iop) {
@@ -363,8 +367,7 @@  static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
 	int ret;
 
 	spin_lock_irqsave(&dom->pgtlock, flags);
-	ret = dom->iop->map(dom->iop, iova, paddr & DMA_BIT_MASK(32),
-			    size, prot);
+	ret = dom->iop->map(dom->iop, iova, paddr, size, prot);
 	spin_unlock_irqrestore(&dom->pgtlock, flags);
 
 	return ret;
@@ -393,7 +396,6 @@  static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 					  dma_addr_t iova)
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 	unsigned long flags;
 	phys_addr_t pa;
 
@@ -401,9 +403,6 @@  static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 	pa = dom->iop->iova_to_phys(dom->iop, iova);
 	spin_unlock_irqrestore(&dom->pgtlock, flags);
 
-	if (data->enable_4GB)
-		pa |= BIT_ULL(32);
-
 	return pa;
 }
 
@@ -594,6 +593,8 @@  static int mtk_iommu_probe(struct platform_device *pdev)
 
 	/* Whether the current dram is over 4GB */
 	data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
+	if (!data->plat_data->has_4gb_mode)
+		data->enable_4GB = false;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	data->base = devm_ioremap_resource(dev, res);
@@ -734,10 +735,12 @@  static int __maybe_unused mtk_iommu_resume(struct device *dev)
 
 static const struct mtk_iommu_plat_data mt2712_data = {
 	.m4u_plat     = M4U_MT2712,
+	.has_4gb_mode = true,
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
 	.m4u_plat     = M4U_MT8173,
+	.has_4gb_mode = true,
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 9725b08..c281c01 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -34,6 +34,7 @@  enum mtk_iommu_plat {
 
 struct mtk_iommu_plat_data {
 	enum mtk_iommu_plat m4u_plat;
+	bool                has_4gb_mode;
 };
 
 struct mtk_iommu_domain;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 915fb73..2733739 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -65,10 +65,9 @@  struct io_pgtable_cfg {
 	 *	(unmapped) entries but the hardware might do so anyway, perform
 	 *	TLB maintenance when mapping as well as when unmapping.
 	 *
-	 * IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) Set bit 9 in all
-	 *	PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
-	 *	when the SoC is in "4GB mode" and they can only access the high
-	 *	remap of DRAM (0x1_00000000 to 0x1_ffffffff).
+	 * IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) MediaTek IOMMUs extend
+	 *	to support up to 34 bits PA where the bit32 and bit33 are
+	 *	encoded in the bit9 and bit4 of the PTE respectively.
 	 *
 	 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 	 *	on unmap, for DMA domains using the flush queue mechanism for
@@ -114,6 +113,8 @@  struct io_pgtable_cfg {
 	};
 };
 
+#define ARM_V7S_MTK_4GB_OAS			33
+
 /**
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *