diff mbox series

[6/8] iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT

Message ID 6-v1-54e734311a7f+14f72-smmuv3_nesting_jgg@nvidia.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series Initial support for SMMUv3 nested translation | expand

Commit Message

Jason Gunthorpe Aug. 6, 2024, 11:41 p.m. UTC
For SMMUv3 the parent must be a S2 domain, which can be composed
into a IOMMU_DOMAIN_NESTED.

In future the S2 parent will also need a VMID linked to the VIOMMU and
even to KVM.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Robin Murphy Aug. 9, 2024, 3:06 p.m. UTC | #1
On 2024-08-07 12:41 am, Jason Gunthorpe wrote:
> For SMMUv3 the parent must be a S2 domain, which can be composed
> into a IOMMU_DOMAIN_NESTED.
> 
> In future the S2 parent will also need a VMID linked to the VIOMMU and
> even to KVM.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 6bbe4aa7b9511c..5faaccef707ef1 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3103,7 +3103,8 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>   			   const struct iommu_user_data *user_data)
>   {
>   	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> -	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> +	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
> +				 IOMMU_HWPT_ALLOC_NEST_PARENT;
>   	struct arm_smmu_domain *smmu_domain;
>   	int ret;
>   
> @@ -3116,6 +3117,14 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>   	if (!smmu_domain)
>   		return ERR_PTR(-ENOMEM);
>   
> +	if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) {
> +		if (!(master->smmu->features & ARM_SMMU_FEAT_TRANS_S2)) {

Nope, nesting needs to rely on FEAT_NESTING, that's why it exists. S2 
alone isn't sufficient - without S1 there's nothing to expose to 
userspace, so zero point in having a "nested" domain with nothing to 
nest into it - but furthermore we need S2 *without* unsafe broken TLBs.

Thanks,
Robin.

> +			ret = -EOPNOTSUPP;
> +			goto err_free;
> +		}
> +		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> +	}
> +
>   	smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
>   	smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
>   	ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, flags);
Jason Gunthorpe Aug. 9, 2024, 4:09 p.m. UTC | #2
On Fri, Aug 09, 2024 at 04:06:22PM +0100, Robin Murphy wrote:
> On 2024-08-07 12:41 am, Jason Gunthorpe wrote:
> > For SMMUv3 the parent must be a S2 domain, which can be composed
> > into a IOMMU_DOMAIN_NESTED.
> > 
> > In future the S2 parent will also need a VMID linked to the VIOMMU and
> > even to KVM.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++++-
> >   1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 6bbe4aa7b9511c..5faaccef707ef1 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -3103,7 +3103,8 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> >   			   const struct iommu_user_data *user_data)
> >   {
> >   	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > -	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
> > +	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
> > +				 IOMMU_HWPT_ALLOC_NEST_PARENT;
> >   	struct arm_smmu_domain *smmu_domain;
> >   	int ret;
> > @@ -3116,6 +3117,14 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
> >   	if (!smmu_domain)
> >   		return ERR_PTR(-ENOMEM);
> > +	if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) {
> > +		if (!(master->smmu->features & ARM_SMMU_FEAT_TRANS_S2)) {
> 
> Nope, nesting needs to rely on FEAT_NESTING, that's why it exists. S2 alone
> isn't sufficient - without S1 there's nothing to expose to userspace, so
> zero point in having a "nested" domain with nothing to nest into it - but
> furthermore we need S2 *without* unsafe broken TLBs.

I do tend to agree we should fail earlier if IOMMU_DOMAIN_NESTED is
not possible so let's narrow it.

However, the above was matching how the driver already worked (ie the
old arm_smmu_enable_nesting()) where just asking for a normal S2 was
gated only by FEAT_S2.

This does add a CMDQ_OP_TLBI_NH_ALL, but I didn't think that hit an
errata?

The nesting specific stuff that touches things that FEAT_NESTING
covers in the driver is checked here:

static struct iommu_domain *
arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
			      struct iommu_domain *parent,
			      const struct iommu_user_data *user_data)
{
	if (!(master->smmu->features & ARM_SMMU_FEAT_NESTING))
		return ERR_PTR(-EOPNOTSUPP);

Which prevents creating a IOMMU_DOMAIN_NESTED, meaning you can't get a
CD table on top of the S2 or issue any S1 invalidations.

Thanks,
Jason
Robin Murphy Aug. 9, 2024, 6:34 p.m. UTC | #3
On 2024-08-09 5:09 pm, Jason Gunthorpe wrote:
> On Fri, Aug 09, 2024 at 04:06:22PM +0100, Robin Murphy wrote:
>> On 2024-08-07 12:41 am, Jason Gunthorpe wrote:
>>> For SMMUv3 the parent must be a S2 domain, which can be composed
>>> into a IOMMU_DOMAIN_NESTED.
>>>
>>> In future the S2 parent will also need a VMID linked to the VIOMMU and
>>> even to KVM.
>>>
>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>> ---
>>>    drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ++++++++++-
>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index 6bbe4aa7b9511c..5faaccef707ef1 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -3103,7 +3103,8 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>>>    			   const struct iommu_user_data *user_data)
>>>    {
>>>    	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>>> -	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
>>> +	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
>>> +				 IOMMU_HWPT_ALLOC_NEST_PARENT;
>>>    	struct arm_smmu_domain *smmu_domain;
>>>    	int ret;
>>> @@ -3116,6 +3117,14 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
>>>    	if (!smmu_domain)
>>>    		return ERR_PTR(-ENOMEM);
>>> +	if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) {
>>> +		if (!(master->smmu->features & ARM_SMMU_FEAT_TRANS_S2)) {
>>
>> Nope, nesting needs to rely on FEAT_NESTING, that's why it exists. S2 alone
>> isn't sufficient - without S1 there's nothing to expose to userspace, so
>> zero point in having a "nested" domain with nothing to nest into it - but
>> furthermore we need S2 *without* unsafe broken TLBs.
> 
> I do tend to agree we should fail earlier if IOMMU_DOMAIN_NESTED is
> not possible so let's narrow it.
> 
> However, the above was matching how the driver already worked (ie the
> old arm_smmu_enable_nesting()) where just asking for a normal S2 was
> gated only by FEAT_S2.

Ohhhh, I see, so actually the same old subtlety is still there - 
ALLOC_NEST_PARENT isn't a definite "allocate the parent domain for my 
nested setup", it's "allocate a domain which will be capable of being 
upgraded to nesting later *if* I choose to do so". Is the intent that 
someone could still use this if they had no intention of nesting but 
just wanted to ensure S2 format for their single stage of translation 
for some reason? It remains somewhat confusing since S2 domains on 
S2-only SMMUs are still fundamentally incapable of ever becoming a 
nested parent, but admittedly I'm struggling to think of a name which 
would be more accurate while still generic, so maybe it's OK...

> This does add a CMDQ_OP_TLBI_NH_ALL, but I didn't think that hit an
> errata?

Indeed, all the really nasty errata depend on both stages being active 
(such that S1 translation requests interact with concurrent S2 
invalidations)

Thanks,
Robin.

> The nesting specific stuff that touches things that FEAT_NESTING
> covers in the driver is checked here:
> 
> static struct iommu_domain *
> arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
> 			      struct iommu_domain *parent,
> 			      const struct iommu_user_data *user_data)
> {
> 	if (!(master->smmu->features & ARM_SMMU_FEAT_NESTING))
> 		return ERR_PTR(-EOPNOTSUPP);
> 
> Which prevents creating a IOMMU_DOMAIN_NESTED, meaning you can't get a
> CD table on top of the S2 or issue any S1 invalidations.
> 
> Thanks,
> Jason
Jason Gunthorpe Aug. 13, 2024, 2:35 p.m. UTC | #4
On Fri, Aug 09, 2024 at 07:34:20PM +0100, Robin Murphy wrote:

> > However, the above was matching how the driver already worked (ie the
> > old arm_smmu_enable_nesting()) where just asking for a normal S2 was
> > gated only by FEAT_S2.
> 
> Ohhhh, I see, so actually the same old subtlety is still there -
> ALLOC_NEST_PARENT isn't a definite "allocate the parent domain for my nested
> setup", it's "allocate a domain which will be capable of being upgraded to
> nesting later *if* I choose to do so". 

Yes. All PAGING type domains are expected to be able to be attached
nakedly without creating the DOMAIN_NESTED.

> Is the intent that someone could still use this if they had no
> intention of nesting but just wanted to ensure S2 format for their
> single stage of translation for some reason?

Sort of, yes..

When booting a VM with DMA default to bypass there are two flows for
the time before the vIOMMU is enabled.

The first flow is to allocate the S2 NESTING_PARENT and attach it
directly to the RID. This is a normal S2 paging domain. The VMM would
later switch to a DOMAIN_NESTED (maybe with bypass) when the vIOMMU is
enabled by the VM and the vSTEs are parsed.

The second flow, which is probably going to be the better way, is the
VMM will create a DOMAIN_NESTED with a bypass vSTE and attach that
instead of directly attaching the S2.

When we worked through VIOMMU it turned out we always want the
DOMAIN_NESTED to be the attached domain and the bypass/abort cases are
handled through vSTE settings instead of naked domains. This allows
the VIOMMU object to always be associated with the SID which is how we
will link the event queues, the VMID and so on.

> It remains somewhat confusing since S2 domains on S2-only SMMUs are
> still fundamentally incapable of ever becoming a nested parent, but
> admittedly I'm struggling to think of a name which would be more
> accurate while still generic, so maybe it's OK...

I think for now we can block it, as we know no use case to request
NESTING_PARENT without intending to go on and create a DOMAIN_NESTED.

Someday we may want to allow userspace to specify the page table
parameters more exactly and that might be a good API to also force a
S2 if someone has a (performance?) use case for S2 outside of nesting.

Jason
diff mbox series

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6bbe4aa7b9511c..5faaccef707ef1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3103,7 +3103,8 @@  arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
 			   const struct iommu_user_data *user_data)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
+	const u32 PAGING_FLAGS = IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
+				 IOMMU_HWPT_ALLOC_NEST_PARENT;
 	struct arm_smmu_domain *smmu_domain;
 	int ret;
 
@@ -3116,6 +3117,14 @@  arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
 	if (!smmu_domain)
 		return ERR_PTR(-ENOMEM);
 
+	if (flags & IOMMU_HWPT_ALLOC_NEST_PARENT) {
+		if (!(master->smmu->features & ARM_SMMU_FEAT_TRANS_S2)) {
+			ret = -EOPNOTSUPP;
+			goto err_free;
+		}
+		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
+	}
+
 	smmu_domain->domain.type = IOMMU_DOMAIN_UNMANAGED;
 	smmu_domain->domain.ops = arm_smmu_ops.default_domain_ops;
 	ret = arm_smmu_domain_finalise(smmu_domain, master->smmu, flags);