[0/3] make MSI IOVA base address and its length configurable

Message ID 20250116232307.1436693-1-shyamsaini@linux.microsoft.com (mailing list archive)

Message

Shyam Saini Jan. 16, 2025, 11:23 p.m. UTC
Hi,

Currently, the MSI_IOVA_BASE address is hard-coded to 0x8000000,
assuming that all platforms have this address available for MSI IOVA
reservation. However, this is not always the case, as some platforms
reserve this address for other purposes. Consequently, these platforms
cannot reserve the MSI_IOVA_BASE address for MSI.

There was an attempt [1] to fix this problem by passing the MSI IOVA
base as a kernel command line parameter. In the previous attempt,
Will suggested reserving the MSI IOVA at runtime whenever there is a
conflict with the default MSI_IOVA_BASE. However, dynamically reserving
this address has debuggability concerns, as it becomes difficult to
track IOMMU mapping failures.

This patch series aims to address the issue by introducing a new DTS
property, "arm,smmu-pci-msi-iova-data". This property allows the
configuration of MSI IOVA with a custom MSI base address and a custom
length for IOMMU/SMMU drivers. It accommodates platforms that do not
have the default MSI base address available for MSI reservation.
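
As a rough illustration of the fallback described above, the userspace C
sketch below models how a driver might choose between the built-in window
and one supplied by firmware. The helper msi_iova_resolve(), the struct, and
the validation rules are hypothetical and are not taken from the patches;
the defaults assume the mainline values 0x8000000 and 0x100000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed defaults, matching the mainline arm-smmu definitions. */
#define MSI_IOVA_BASE   0x8000000ULL
#define MSI_IOVA_LENGTH 0x100000ULL

struct msi_iova_window {
    uint64_t base;
    uint64_t length;
};

/*
 * Hypothetical helper, not taken from the patches: use the firmware-provided
 * window when one is present and sane, otherwise keep the built-in default.
 */
static struct msi_iova_window
msi_iova_resolve(bool have_prop, uint64_t prop_base, uint64_t prop_length)
{
    struct msi_iova_window win = { MSI_IOVA_BASE, MSI_IOVA_LENGTH };

    /* Reject an empty window or one that wraps past the end of the space. */
    if (have_prop && prop_length != 0 &&
        prop_base <= UINT64_MAX - prop_length) {
        win.base = prop_base;
        win.length = prop_length;
    }
    return win;
}
```

A platform without the property would call msi_iova_resolve(false, 0, 0)
and get the default window back unchanged.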

[1]: https://lore.kernel.org/lkml/20200914181307.117792-1-vemegava@linux.microsoft.com/

Thanks,
Shyam

Shyam Saini (3):
  dt-bindings: iommu: add "arm,smmu-pci-msi-iova-data" property
  iommu: consolidate MSI_IOVA macro definitions
  arm-smmu: use dts passed MSI IOVA address and length

 .../bindings/iommu/arm,smmu-v3.yaml           | 12 +++++
 .../devicetree/bindings/iommu/arm,smmu.yaml   | 12 +++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 10 ++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  3 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c         | 11 +++--
 drivers/iommu/virtio-iommu.c                  |  8 ++--
 include/linux/iommu.h                         | 44 +++++++++++++++++++
 7 files changed, 86 insertions(+), 14 deletions(-)

Comments

Jason Gunthorpe Jan. 20, 2025, 2:26 p.m. UTC | #1
On Thu, Jan 16, 2025 at 03:23:04PM -0800, Shyam Saini wrote:
> Hi,
> 
> Currently, the MSI_IOVA_BASE address is hard-coded to 0x8000000,
> assuming that all platforms have this address available for MSI IOVA
> reservation. However, this is not always the case, as some platforms
> reserve this address for other purposes.

Can you explain this some more? This address is in the kernel
controlled IOVA space, there are few ways a platform can impact this.

How is the platform impacting it? Is the non-functional IOVA always
reflected in the iommu_get_resv_regions()?

Why not avoid this conflict in your platform software?

> There was an attempt [1] to fix this problem by passing the MSI IOVA
> base as a kernel command line parameter. 

Yuk

> In the previous attempt,
> Will suggested reserving the MSI IOVA at runtime whenever there is a
> conflict with the default MSI_IOVA_BASE. However, dynamically reserving
> this address has debuggability concerns, as it becomes difficult to
> track IOMMU mapping failures.

Still, this approach seems like the best to me..

> This patch series aims to address the issue by introducing a new DTS
> property, "arm,smmu-pci-msi-iova-data". This property allows the
> configuration of MSI IOVA with a custom MSI base address and a custom
> length for IOMMU/SMMU drivers. It accommodates platforms that do not
> have the default MSI base address available for MSI reservation.

My understanding was that using DT to set kernel configurables was frowned
upon? Ultimately MSI_IOVA_BASE is an arbitrary choice by kernel
software.

Jason
Jacob Pan Jan. 21, 2025, 9:49 p.m. UTC | #2
Hi Jason,

On Mon, 20 Jan 2025 10:26:43 -0400
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Thu, Jan 16, 2025 at 03:23:04PM -0800, Shyam Saini wrote:
> > Hi,
> > 
> > Currently, the MSI_IOVA_BASE address is hard-coded to 0x8000000,
> > assuming that all platforms have this address available for MSI IOVA
> > reservation. However, this is not always the case, as some platforms
> > reserve this address for other purposes.  
> 
> Can you explain this some more? This address is in the kernel
> controlled IOVA space, there are few ways a platform can impact this.
> 
> How is the platform impacting it? Is the non-functional IOVA always
> reflected in the iommu_get_resv_regions()?

I don't know the platform impact but just to clarify, are you asking
whether this non-functional IOVA is also under IORT RMR or other FW
tables? I don't think it is.

But this special IOVA is reflected in iommu_get_resv_regions() the same
way as the hardcoded MSI_IOVA_BASE. So each iommu group's
reserved_regions should show it.

> Why not avoid this conflict in your platform software?
I had the same question, but it seems there is not enough difference
(from the standard SMMU) to justify platform code, i.e. a
platform-specific iommu_get_resv_regions(). Is that what you are suggesting?

> > There was an attempt [1] to fix this problem by passing the MSI IOVA
> > base as a kernel command line parameter.   
> 
> Yuk
> 
> > In the previous attempt,
> > Will suggested reserving the MSI IOVA at runtime whenever there is a
> > conflict with the default MSI_IOVA_BASE. However, dynamically
> > reserving this address has debuggability concerns, as it becomes
> > difficult to track IOMMU mapping failures.  
> 
> Still, this approach seems like the best to me..
> 
> > This patch series aims to address the issue by introducing a new DTS
> > property, "arm,smmu-pci-msi-iova-data". This property allows the
> > configuration of MSI IOVA with a custom MSI base address and a
> > custom length for IOMMU/SMMU drivers. It accommodates platforms
> > that do not have the default MSI base address available for MSI
> > reservation.  
> 
> My understanding was that using DT to set kernel configurables was frowned
> upon? Ultimately MSI_IOVA_BASE is an arbitrary choice by kernel
> software.
> 
> Jason
Jason Gunthorpe Jan. 22, 2025, 12:19 a.m. UTC | #3
On Tue, Jan 21, 2025 at 01:49:10PM -0800, Jacob Pan wrote:

> > On Thu, Jan 16, 2025 at 03:23:04PM -0800, Shyam Saini wrote:
> > > Hi,
> > > 
> > > Currently, the MSI_IOVA_BASE address is hard-coded to 0x8000000,
> > > assuming that all platforms have this address available for MSI IOVA
> > > reservation. However, this is not always the case, as some platforms
> > > reserve this address for other purposes.  
> > 
> > Can you explain this some more? This address is in the kernel
> > controlled IOVA space, there are few ways a platform can impact this.
> > 
> > How is the platform impacting it? Is the non-functional IOVA always
> > reflected in the iommu_get_resv_regions()?
> 
> I don't know the platform impact but just to clarify, are you asking
> whether this non-functional IOVA is also under IORT RMR or other FW
> tables? I don't think it is.

No, I'm asking how can you possibly have a HW platform where
MSI_IOVA_BASE is unable to be used for DMA?

MSI_IOVA_BASE is 128M, and most ARM platforms put DRAM starting at
0. Most ARM VMMs put DRAM starting at 0 too.

So a platform saying that DMA to 128M doesn't work is pretty broken,
to the point it is hard to believe there is a HW issue at work here?
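
For reference, the 128M figure can be checked with a few lines of C. This is
only a worked calculation, assuming the mainline values MSI_IOVA_BASE
0x8000000 and MSI_IOVA_LENGTH 0x100000; the to_mib() helper is hypothetical.

```c
#include <stdint.h>

/* Assumed values, matching the mainline arm-smmu definitions. */
#define MSI_IOVA_BASE   0x8000000ULL
#define MSI_IOVA_LENGTH 0x100000ULL
#define MIB             (1024ULL * 1024ULL)

/* Convert a byte address or size to whole MiB. */
static uint64_t to_mib(uint64_t bytes)
{
    return bytes / MIB;
}

/* Compile-time checks: the window starts at 128 MiB and is 1 MiB long. */
_Static_assert(MSI_IOVA_BASE / MIB == 128, "base is 128 MiB");
_Static_assert(MSI_IOVA_LENGTH / MIB == 1, "length is 1 MiB");
```

By the same arithmetic, a value with one more zero (0x80000000) would be
2 GiB rather than 128 MiB.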

> But this special IOVA is reflected in iommu_get_resv_regions() the same
> way as the hardcoded MSI_IOVA_BASE. So each iommu group's
> reserved_regions should show it.

That's great

> > Why not avoid this conflict in your platform software?
> I had the same question, but it seems there is not enough difference
> (from the standard SMMU) to justify platform code, i.e. a
> platform-specific iommu_get_resv_regions(). Is that what you are suggesting?

And here I mean, why not stop marking it reserved in the ACPI/DT
inside your firmware or hypervisor?

This smells like some SW component using the same address Linux uses
for some odd purpose. Just change it and let Linux keep using the
address it wants?

Jason
Shyam Saini Jan. 30, 2025, 11:21 p.m. UTC | #4
Hi Jason,

Apologies for the delayed response.

> On Tue, Jan 21, 2025 at 01:49:10PM -0800, Jacob Pan wrote:
> 
> > > On Thu, Jan 16, 2025 at 03:23:04PM -0800, Shyam Saini wrote:
> > > > Hi,
> > > > 
> > > > Currently, the MSI_IOVA_BASE address is hard-coded to 0x8000000,
> > > > assuming that all platforms have this address available for MSI IOVA
> > > > reservation. However, this is not always the case, as some platforms
> > > > reserve this address for other purposes.  
> > > 
> > > Can you explain this some more? This address is in the kernel
> > > controlled IOVA space, there are few ways a platform can impact this.
> > > 
> > > How is the platform impacting it? Is the non-functional IOVA always
> > > reflected in the iommu_get_resv_regions()?
> > 
> > I don't know the platform impact but just to clarify, are you asking
> > whether this non-functional IOVA is also under IORT RMR or other FW
> > tables? I don't think it is.
> 
> No, I'm asking how can you possibly have a HW platform where
> MSI_IOVA_BASE is unable to be used for DMA?
> 
> MSI_IOVA_BASE is 128M, and most ARM platforms put DRAM starting at
> 0. Most ARM VMMs put DRAM starting at 0 too.
> 
> So a platform saying that DMA to 128M doesn't work is pretty broken,
> to the point it is hard to believe there is a HW issue at work here?

Correct, this is a limitation of our HW.
Since we can't fix it in hardware, we would need to fix it in Linux.

> > But this special IOVA is reflected in iommu_get_resv_regions() the same
> > way as the hardcoded MSI_IOVA_BASE. So each iommu group's
> > reserved_regions should show it.
> 
> That's great
> 
> > > Why not avoid this conflict in your platform software?
> > I had the same question, but it seems there is not enough difference
> > (from the standard SMMU) to justify platform code, i.e. a
> > platform-specific iommu_get_resv_regions(). Is that what you are suggesting?
> 
> And here I mean, why not stop marking it reserved in the ACPI/DT
> inside your firmware or hypervisor?

Even if we reserve it in DTS, we would still need some address reserved for
the MSI IOVA.

> This smells like some SW component using the same address Linux uses
> for some odd purpose. Just change it and let Linux keep using the
> address it wants?

Unfortunately, it is an HW issue.

Are you okay with this approach of passing a custom MSI_IOVA via DTS?

Thanks,
Shyam
Jason Gunthorpe Jan. 31, 2025, 12:36 a.m. UTC | #5
On Thu, Jan 30, 2025 at 03:21:37PM -0800, Shyam Saini wrote:

> Unfortunately, it is an HW issue.

Well, that's pretty bad to have built HW that can't DMA to low
addresses at all.. But OK.
 
> Are you okay with this approach of passing a custom MSI_IOVA via DTS?

It isn't up to me, but I've understood the DT maintainers would reject
this as it isn't describing HW but just a random Linux software
knob.

I think you should make selecting the sw_msi dynamic in Linux.

Jason
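
One possible reading of "dynamic" selection, as a minimal userspace C sketch
rather than kernel code: start at the default base and move the window
upward until it no longer overlaps any reserved region. The struct, the
helper pick_sw_msi_base(), the search order, and the 'limit' parameter are
all assumptions made for illustration; the thread does not specify how the
selection should work. The defaults assume the mainline values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed defaults, matching the mainline arm-smmu definitions. */
#define MSI_IOVA_BASE   0x8000000ULL
#define MSI_IOVA_LENGTH 0x100000ULL

/* Simplified stand-in for a reserved IOVA range (hypothetical type). */
struct resv_region {
    uint64_t start;
    uint64_t length;
};

/* True if [a, a + alen) and [b, b + blen) share at least one address. */
static bool ranges_overlap(uint64_t a, uint64_t alen,
                           uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/*
 * Hypothetical helper: step upward from the default base in window-sized
 * increments until the window clears every reserved region. Returns false
 * if no free window fits below 'limit'. Assumes 'limit' and the regions
 * are small enough that the additions do not wrap around.
 */
static bool pick_sw_msi_base(const struct resv_region *resv, size_t n,
                             uint64_t limit, uint64_t *base_out)
{
    for (uint64_t base = MSI_IOVA_BASE; base + MSI_IOVA_LENGTH <= limit;
         base += MSI_IOVA_LENGTH) {
        bool conflict = false;

        for (size_t i = 0; i < n; i++) {
            if (ranges_overlap(base, MSI_IOVA_LENGTH,
                               resv[i].start, resv[i].length)) {
                conflict = true;
                break;
            }
        }
        if (!conflict) {
            *base_out = base;
            return true;
        }
    }
    return false;
}
```

With no conflicting regions the helper returns the default base, so
platforms that work today would see no change in behaviour.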