diff mbox series

[v2,4/8] ACPI/IORT: Support CANWBS memory access flag

Message ID 4-v2-621370057090+91fec-smmuv3_nesting_jgg@nvidia.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series Initial support for SMMUv3 nested translation | expand

Commit Message

Jason Gunthorpe Aug. 27, 2024, 3:51 p.m. UTC
From: Nicolin Chen <nicolinc@nvidia.com>

The IORT spec, Issue E.f (April 2024), adds a new CANWBS bit to the Memory
Access Flag field in the Memory Access Properties table, mainly for a PCI
Root Complex.

This CANWBS defines the coherency of memory accesses to be not marked IOWB
cacheable/shareable. Its value further implies the coherency impact from a
pair of mismatched memory attributes (e.g. in a nested translation case):
  0x0: Use of mismatched memory attributes for accesses made by this
       device may lead to a loss of coherency.
  0x1: Coherency of accesses made by this device to locations in
       Conventional memory are ensured as follows, even if the memory
       attributes for the accesses presented by the device or provided by
       the SMMU are different from Inner and Outer Write-back cacheable,
       Shareable.

Note that the loss of coherency on a CANWBS-unsupported HW typically could
occur to an SMMU that doesn't implement the S2FWB feature where additional
cache flush operations would be required to prevent that from happening.

Add a new ACPI_IORT_MF_CANWBS flag and set IOMMU_FWSPEC_PCI_RC_CANWBS upon
the presence of this new flag.

CANWBS and S2FWB are similar features, in that they both guarantee the VM
can not violate coherency, however S2FWB can be bypassed by PCI No Snoop
TLPs, while CANWBS cannot. Thus CANWBS meets the requirements to set
IOMMU_CAP_ENFORCE_CACHE_COHERENCY.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/acpi/arm64/iort.c | 13 +++++++++++++
 include/linux/iommu.h     |  2 ++
 2 files changed, 15 insertions(+)

Comments

Tian, Kevin Aug. 30, 2024, 7:52 a.m. UTC | #1
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, August 27, 2024 11:52 PM
> 
> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> The IORT spec, Issue E.f (April 2024), adds a new CANWBS bit to the Memory
> Access Flag field in the Memory Access Properties table, mainly for a PCI
> Root Complex.
> 
> This CANWBS defines the coherency of memory accesses to be not marked
> IOWB
> cacheable/shareable. Its value further implies the coherency impact from a
> pair of mismatched memory attributes (e.g. in a nested translation case):
>   0x0: Use of mismatched memory attributes for accesses made by this
>        device may lead to a loss of coherency.
>   0x1: Coherency of accesses made by this device to locations in
>        Conventional memory are ensured as follows, even if the memory
>        attributes for the accesses presented by the device or provided by
>        the SMMU are different from Inner and Outer Write-back cacheable,
>        Shareable.
> 
> Note that the loss of coherency on a CANWBS-unsupported HW typically could
> occur to an SMMU that doesn't implement the S2FWB feature where additional
> cache flush operations would be required to prevent that from happening.
> 
> Add a new ACPI_IORT_MF_CANWBS flag and set
> IOMMU_FWSPEC_PCI_RC_CANWBS upon
> the presence of this new flag.
> 
> CANWBS and S2FWB are similar features, in that they both guarantee the VM
> can not violate coherency, however S2FWB can be bypassed by PCI No Snoop
> TLPs, while CANWBS cannot. Thus CANWBS meets the requirements to set
> IOMMU_CAP_ENFORCE_CACHE_COHERENCY.
> 

I'm confused here. It is clear that we need a mechanism via which the VM
cannot bypass the cache, before Yan's series comes to relax.

But according to above description S2FWB cannot 100% guarantee it
due to PCI No Snoop. Does it suggest that we should only allow nesting
only for CANWBS, or disable/hide PCI No Snoop cap from the guest
in case of S2FWB?
Jason Gunthorpe Aug. 30, 2024, 1:54 p.m. UTC | #2
On Fri, Aug 30, 2024 at 07:52:41AM +0000, Tian, Kevin wrote:

> But according to above description S2FWB cannot 100% guarantee it
> due to PCI No Snoop. Does it suggest that we should only allow nesting
> only for CANWBS, or disable/hide PCI No Snoop cap from the guest
> in case of S2FWB?

ARM has always had an issue with no-snoop and VFIO. The ARM
expectation is that VFIO/VMM would block no-snoop in the PCI config
space.

From a VM perspective, any VMM on ARM has to take care to do this
today already.

For instance a VMM could choose to only assign devices which never use
no-snoop, which describes almost all of what people actually do :)

The purpose of S2FWB is to keep that approach working. If the VMM has
blocked no-snoop then S2FWB ensures that the VM can't use IOPTE bits
to break cachability and it remains safe.

From a VFIO perspective ARM has always had a security hole similer to
what Yan is trying to fix on Intel, that is a separate pre-existing
topic. Ideally the VFIO kernel would block PCI config space no-snoop
for alot of cases.

Jason
Tian, Kevin Sept. 3, 2024, 7:14 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, August 30, 2024 9:55 PM
> 
> On Fri, Aug 30, 2024 at 07:52:41AM +0000, Tian, Kevin wrote:
> 
> > But according to above description S2FWB cannot 100% guarantee it
> > due to PCI No Snoop. Does it suggest that we should only allow nesting
> > only for CANWBS, or disable/hide PCI No Snoop cap from the guest
> > in case of S2FWB?
> 
> ARM has always had an issue with no-snoop and VFIO. The ARM
> expectation is that VFIO/VMM would block no-snoop in the PCI config
> space.
> 
> From a VM perspective, any VMM on ARM has to take care to do this
> today already.
> 
> For instance a VMM could choose to only assign devices which never use
> no-snoop, which describes almost all of what people actually do :)
> 
> The purpose of S2FWB is to keep that approach working. If the VMM has
> blocked no-snoop then S2FWB ensures that the VM can't use IOPTE bits
> to break cachability and it remains safe.
> 
> From a VFIO perspective ARM has always had a security hole similer to
> what Yan is trying to fix on Intel, that is a separate pre-existing
> topic. Ideally the VFIO kernel would block PCI config space no-snoop
> for alot of cases.
> 

Make sense. It'd be helpful putting some words in the commit msg too.
diff mbox series

Patch

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 1b39e9ae7ac178..52f5836fa888db 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1218,6 +1218,17 @@  static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
 	return pci_rc->ats_attribute & ACPI_IORT_ATS_SUPPORTED;
 }
 
+static bool iort_pci_rc_supports_canwbs(struct acpi_iort_node *node)
+{
+	struct acpi_iort_memory_access *memory_access;
+	struct acpi_iort_root_complex *pci_rc;
+
+	pci_rc = (struct acpi_iort_root_complex *)node->node_data;
+	memory_access =
+		(struct acpi_iort_memory_access *)&pci_rc->memory_properties;
+	return memory_access->memory_flags & ACPI_IORT_MF_CANWBS;
+}
+
 static int iort_iommu_xlate(struct device *dev, struct acpi_iort_node *node,
 			    u32 streamid)
 {
@@ -1335,6 +1346,8 @@  int iort_iommu_configure_id(struct device *dev, const u32 *id_in)
 		fwspec = dev_iommu_fwspec_get(dev);
 		if (fwspec && iort_pci_rc_supports_ats(node))
 			fwspec->flags |= IOMMU_FWSPEC_PCI_RC_ATS;
+		if (fwspec && iort_pci_rc_supports_canwbs(node))
+			fwspec->flags |= IOMMU_FWSPEC_PCI_RC_CANWBS;
 	} else {
 		node = iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT,
 				      iort_match_node_callback, dev);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 15d7657509f662..d1660ec23f263b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -993,6 +993,8 @@  struct iommu_fwspec {
 
 /* ATS is supported */
 #define IOMMU_FWSPEC_PCI_RC_ATS			(1 << 0)
+/* CANWBS is supported */
+#define IOMMU_FWSPEC_PCI_RC_CANWBS		(1 << 1)
 
 /*
  * An iommu attach handle represents a relationship between an iommu domain