mbox series

[v5,00/18] SMMUv3 nested translation support

Message ID 20240715084519.1189624-1-smostafa@google.com (mailing list archive)
Headers show
Series SMMUv3 nested translation support | expand

Message

Mostafa Saleh July 15, 2024, 8:45 a.m. UTC
Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
but not nested instances.
This patch series adds support for nested translation in SMMUv3,
this is controlled by property “arm-smmuv3.stage=nested”, and
advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)

Main changes(architecture):
============================
1) CDs are considered IPA and translated with stage-2.
2) TTBx and tables for stage-1 are considered IPA and translated
   with stage-2.
3) Translate the IPA address with stage-2.

TLBs:
======
TLBs are the most tricky part.

1) General design
   Unified(Combined) design is used, where entries with ASID=-1 are
   IPAs(cached from stage-2 config)

   TLBs are also modified to cache 2 permissions, a new permission added
   "parent_perm."

   For non-nested configuration, perm == parent_perm and nothing
   changes. This is used to know which stage to use in case there is
   a permission fault from a TLB entry.

2) Caching in TLB
   Stage-1 and stage-2 are inserted in the TLB as is.
   For nested translation, both entries are combined into one TLB
   entry. The size (level and granule) are chosen from the smallest entries.
   That means that a stage-1 translation can be cached with sage-2
   granule in key, this is taken into account for lookup.

3) TLB Lookup
   TLB lookup already uses ASID in key, so it can distinguish between
   stage-1 and stage-2.
   And as mentioned above, the granule for stage-1 can be different,
   If stage-1 lookup failed, we try again with the stage-2 granule.

4) TLB invalidation
   - Address invalidation is split, for IOVA(CMD_TLBI_NH_VA
     /CMD_TLBI_NH_VAA) and IPA(CMD_TLBI_S2_IPA) based on ASID value
   - CMD_TLBI_NH_ASID/CMD_TLBI_NH_ALL: Consider VMID if stage-2 is
     supported, and invalidate stage-1 only by VMIDs

As far as I understand, this is compliant with the ARM architecture:
- ARM ARM DDI 0487J.a: RLGSCG, RTVTYQ, RGNJPZ
- ARM IHI 0070F.b: 16.2 Caching

An alternative approach would be to instantiate 2 TLBs, one per each
stage. I haven’t investigated that.

Others
=======
- OAS: A typical setup with nesting is to share CPU stage-2 with the
  SMMU, and according to the user manual, SMMU OAS must match the
  system physical address.

  This was discussed before in
  https://lore.kernel.org/all/20230226220650.1480786-11-smostafa@google.com/
  This series doesn’t implement that, but reworks OAS to make it easier
  to configure in the future.

- For nested configuration, IOVA notifier only notifies for stage-1
  invalidations (as far as I understand this is the intended
  behaviour as it notifies for IOVA).

- Rework class in events.

- Stop ignoring VMID for stage-1 if stage-2 is also supported.

Future improvements:
=====================
1) One small improvement, that I don’t think it’s worth the extra
   complexity, is in case of Stage-1 TLB miss for nested translation,
   we can do stage-1 walk and lookup for stage-2 TLBs, instead of
   doing the full walk.

Testing
========
1) IOMMUFD + VFIO
   Kernel: https://lore.kernel.org/all/cover.1683688960.git.nicolinc@nvidia.com/
   VMM: https://qemu-devel.nongnu.narkive.com/o815DqpI/rfc-v5-0-8-arm-smmuv3-emulation-support

   By assigning “virtio-net-pci,netdev=net0,disable-legacy=on,iommu_platform=on,ats=on”,
   to a guest VM (on top of QEMU guest) with VIFO and IOMMUFD.

2) Work in progress prototype I am hacking on for nesting on KVM
   (this is nowhere near complete, and misses many stuff but it
   doesn't require VMs/VFIO) also with virtio-net-pci and git
   cloning a bunch of stuff and also observing traces.
   https://android-kvm.googlesource.com/linux/+log/refs/heads/smostafa/android15-6.6-smmu-nesting-wip

Overall I tested the following configurations
(S1 = 4k, S2 = 4k):
- S1 level = 1 and S2 level = 1
- S1 level = 1 and S2 level = 2
- S1 level = 1 and S2 level = 3
- S1 level = 2 and S2 level = 1
- S1 level = 2 and S2 level = 2
- S1 level = 2 and S2 level = 3
- S1 level = 3 and S2 level = 2
- S1 level = 3 and S2 level = 3
(S1 = 16k, S2 = 4k)
- S1 level = 2 and S2 level = 1
- S1 level = 2 and S2 level = 2
- S1 level = 2 and S2 level = 3
- S1 level = 3 and S2 level = 1
- S1 level = 3 and S2 level = 2
- S1 level = 3 and S2 level = 3
(S1 = 64K, S2 = 4K)
- S1 level = 2 and S2 level = 1
- S1 level = 2 and S2 level = 2
- S1 level = 2 and S2 level = 3
- S1 level = 3 and S2 level = 1
- S1 level = 3 and S2 level = 2
- S1 level = 3 and S2 level = 3
(S1 = 4K, S2 = 16k)
- S1 level = 1 and S2 level = 2
- S1 level = 1 and S2 level = 3
- S1 level = 2 and S2 level = 2
- S1 level = 2 and S2 level = 3
- S1 level = 3 and S2 level = 2
- S1 level = 3 and S2 level = 3
(S1 = 4K, S2 = 64K)
- S1 level = 1 and S2 level = 2
- S1 level = 1 and S2 level = 3
- S1 level = 2 and S2 level = 2
- S1 level = 2 and S2 level = 3
- S1 level = 3 and S2 level = 2
- S1 level = 3 and S2 level = 3


hw/arm/smmuv3: Split smmuv3_translate() better viewed with --color-moved

The first 3 patches are fixes.

Changes in v5:
v4: https://lore.kernel.org/qemu-devel/20240701110241.2005222-1-smostafa@google.com/
- Collect Eric and Jean Rbs
- Fix a bug with nested lookup granule and iova mask
- Fix InputAddr for events for cd and ttbx translation faults
- Fix class in translation fault events
- Fix smmuv3_notify_iova
- Fix CACHED_ENTRY_TO_ADDR macro
- Drop FWB patch
- Fix bisectability by moving smmu_iotlb_inv_asid_vmid

Changes in v4:
v3: https://lore.kernel.org/qemu-devel/20240429032403.74910-1-smostafa@google.com/
- Collected Eric and Alex Rbs
- Rebased on master
- Dropped RFC tag
- Dropped last 2 patches about oas changes to avoid blocking this series
  and I will post them after as RFC
- Split patch 7, and introduce CACHED_ENTRY_TO_ADDR in a separate patch
- Reorder patch 8 and 9 (combine tlb and tlb lookup)
- Split patch 12, and introduce smmu_iotlb_inv_asid_vmid in a separate patch
- Split patch 14, to have fault changes in a separate patch
- Update commit messages and include Fixes sha
- Minor updates, renames and a lot of comments based on review

Changes in v3
v2: https://lore.kernel.org/qemu-devel/20240408140818.3799590-1-smostafa@google.com/
- Collected Eric Rbs.
- Rebased on master.
- Fix an existing bug in class encoding.
- Fix an existing bug in S2 events missing IPA.
- Fix nesting event population (missing class and wrong events)
- Remove CALL_FUNC_CFG_S2.
- Rework TLB combination logic to cache the largest possible entries.
- Refactor nested translation code to be more clear.
- Split patch 05 to 4 patches.
- Convert asid/vmid in trace events to int also.
- Remove some extra traces as it was not needed.
- Improve commit messages.

Changes in v2:
v1: https://lore.kernel.org/qemu-devel/20240325101442.1306300-1-smostafa@google.com/
- Collected Eric Rbs
- Rework TLB to rely on VMID/ASID instead of an extra key.
- Fixed TLB issue with large stage-1 reported by Julian.
- Cap the OAS to 48 bits as PTW doesn’t support 52 bits.
- Fix ASID/VMID representation in some contexts as 16 bits while
  they can be -1
- Increase visibility in trace points

Mostafa Saleh (18):
  hw/arm/smmu-common: Add missing size check for stage-1
  hw/arm/smmu: Fix IPA for stage-2 events
  hw/arm/smmuv3: Fix encoding of CLASS in events
  hw/arm/smmu: Use enum for SMMU stage
  hw/arm/smmu: Split smmuv3_translate()
  hw/arm/smmu: Consolidate ASID and VMID types
  hw/arm/smmu: Introduce CACHED_ENTRY_TO_ADDR
  hw/arm/smmuv3: Translate CD and TT using stage-2 table
  hw/arm/smmu-common: Rework TLB lookup for nesting
  hw/arm/smmu-common: Add support for nested TLB
  hw/arm/smmu-common: Support nested translation
  hw/arm/smmu: Support nesting in smmuv3_range_inval()
  hw/arm/smmu: Introduce smmu_iotlb_inv_asid_vmid
  hw/arm/smmu: Support nesting in the rest of commands
  hw/arm/smmuv3: Support nested SMMUs in smmuv3_notify_iova()
  hw/arm/smmuv3: Handle translation faults according to SMMUPTWEventInfo
  hw/arm/smmuv3: Support and advertise nesting
  hw/arm/smmu: Refactor SMMU OAS

 hw/arm/smmu-common.c         | 312 ++++++++++++++++++++---
 hw/arm/smmuv3-internal.h     |  19 +-
 hw/arm/smmuv3.c              | 467 +++++++++++++++++++++++------------
 hw/arm/trace-events          |  26 +-
 include/hw/arm/smmu-common.h |  46 +++-
 5 files changed, 640 insertions(+), 230 deletions(-)

Comments

Jean-Philippe Brucker July 17, 2024, 3:09 p.m. UTC | #1
On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
> but not nested instances.
> This patch series adds support for nested translation in SMMUv3,
> this is controlled by property “arm-smmuv3.stage=nested”, and
> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)

For the whole series (3-9, 11, 12, 15, 16, 18):

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

(and I think patch 16 is missing Eric's R-b)
Eric Auger July 17, 2024, 5:43 p.m. UTC | #2
Hi Peter, Richard,

On 7/17/24 17:09, Jean-Philippe Brucker wrote:
> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
>> but not nested instances.
>> This patch series adds support for nested translation in SMMUv3,
>> this is controlled by property “arm-smmuv3.stage=nested”, and
>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
> For the whole series (3-9, 11, 12, 15, 16, 18):
>
> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>
> (and I think patch 16 is missing Eric's R-b) 

Jean-Philippe and I have followed up the progress of this series,
Mostafa took into account all our comments and all the patches were
reviewed. It seems to be in a pretty decent state so if you don't have
any objection, please consider pulling it for 9.1.

On my end I did some testing in non nesting mode with virtio-net/vhost
and I have not noticed any regression.
Would be nice if someone could send his T-b for the nested part though
(Julien?).

Thanks

Eric
Peter Maydell July 17, 2024, 7:06 p.m. UTC | #3
On Wed, 17 Jul 2024 at 18:44, Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Peter, Richard,
>
> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
> > On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
> >> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
> >> but not nested instances.
> >> This patch series adds support for nested translation in SMMUv3,
> >> this is controlled by property “arm-smmuv3.stage=nested”, and
> >> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
> > For the whole series (3-9, 11, 12, 15, 16, 18):
> >
> > Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >
> > (and I think patch 16 is missing Eric's R-b)
>
> Jean-Philippe and I have followed up the progress of this series,
> Mostafa took into account all our comments and all the patches were
> reviewed. It seems to be in a pretty decent state so if you don't have
> any objection, please consider pulling it for 9.1.

Yep, I've had this series on my radar since you said in a
comment on the previous revision that it was close to
being ready. Thanks to both you and Jean-Philippe for doing
the heavy lifting on the code review here.

Applied to target-arm.next for 9.1, thanks.

-- PMM
Julien Grall July 18, 2024, 9:43 a.m. UTC | #4
Hi Eric,

On 17/07/2024 18:43, Eric Auger wrote:
> Hi Peter, Richard,
> 
> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
>> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
>>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
>>> but not nested instances.
>>> This patch series adds support for nested translation in SMMUv3,
>>> this is controlled by property “arm-smmuv3.stage=nested”, and
>>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
>> For the whole series (3-9, 11, 12, 15, 16, 18):
>>
>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>
>> (and I think patch 16 is missing Eric's R-b)
> 
> Jean-Philippe and I have followed up the progress of this series,
> Mostafa took into account all our comments and all the patches were
> reviewed. It seems to be in a pretty decent state so if you don't have
> any objection, please consider pulling it for 9.1.
> 
> On my end I did some testing in non nesting mode with virtio-net/vhost
> and I have not noticed any regression.
> Would be nice if someone could send his T-b for the nested part though
> (Julien?).

I haven't yet tried the latest version. I will do that in the next 
couple of days.

Cheers,
Julien Grall July 19, 2024, 3:36 p.m. UTC | #5
Hi,

On 18/07/2024 10:43, Julien Grall wrote:
> Hi Eric,
> 
> On 17/07/2024 18:43, Eric Auger wrote:
>> Hi Peter, Richard,
>>
>> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
>>> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
>>>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
>>>> but not nested instances.
>>>> This patch series adds support for nested translation in SMMUv3,
>>>> this is controlled by property “arm-smmuv3.stage=nested”, and
>>>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
>>> For the whole series (3-9, 11, 12, 15, 16, 18):
>>>
>>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>
>>> (and I think patch 16 is missing Eric's R-b)
>>
>> Jean-Philippe and I have followed up the progress of this series,
>> Mostafa took into account all our comments and all the patches were
>> reviewed. It seems to be in a pretty decent state so if you don't have
>> any objection, please consider pulling it for 9.1.
>>
>> On my end I did some testing in non nesting mode with virtio-net/vhost
>> and I have not noticed any regression.
>> Would be nice if someone could send his T-b for the nested part though
>> (Julien?).
> 
> I haven't yet tried the latest version. I will do that in the next 
> couple of days.
I see this is already merged. If this still matters:

Tested-by: Julien Grall <jgrall@amazon.com>

Cheers,
Peter Maydell July 19, 2024, 3:57 p.m. UTC | #6
On Fri, 19 Jul 2024 at 16:36, Julien Grall <julien@xen.org> wrote:
>
> Hi,
>
> On 18/07/2024 10:43, Julien Grall wrote:
> > Hi Eric,
> >
> > On 17/07/2024 18:43, Eric Auger wrote:
> >> Hi Peter, Richard,
> >>
> >> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
> >>> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
> >>>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
> >>>> but not nested instances.
> >>>> This patch series adds support for nested translation in SMMUv3,
> >>>> this is controlled by property “arm-smmuv3.stage=nested”, and
> >>>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
> >>> For the whole series (3-9, 11, 12, 15, 16, 18):
> >>>
> >>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>>
> >>> (and I think patch 16 is missing Eric's R-b)
> >>
> >> Jean-Philippe and I have followed up the progress of this series,
> >> Mostafa took into account all our comments and all the patches were
> >> reviewed. It seems to be in a pretty decent state so if you don't have
> >> any objection, please consider pulling it for 9.1.
> >>
> >> On my end I did some testing in non nesting mode with virtio-net/vhost
> >> and I have not noticed any regression.
> >> Would be nice if someone could send his T-b for the nested part though
> >> (Julien?).
> >
> > I haven't yet tried the latest version. I will do that in the next
> > couple of days.
> I see this is already merged. If this still matters:
>
> Tested-by: Julien Grall <jgrall@amazon.com>

We can't retrospectively add the tag, but the testing itself
is still important -- thanks for doing it.

Q: is there any reason not to:
 (a) change the default to "nested" rather than "1"
 (b) make the virt board (for new virt machine versions) use
     "nested"?

AIUI "nested" should be a superset of "stage-1 only", the guest
can just ignore stage-2 if it doesn't care about it. Or is
there a performance hit from having stage-2 around even if the
guest doesn't enable it?

thanks
-- PMM
Mostafa Saleh July 20, 2024, 10:11 p.m. UTC | #7
Hi Peter,

On Fri, Jul 19, 2024 at 04:57:18PM +0100, Peter Maydell wrote:
> On Fri, 19 Jul 2024 at 16:36, Julien Grall <julien@xen.org> wrote:
> >
> > Hi,
> >
> > On 18/07/2024 10:43, Julien Grall wrote:
> > > Hi Eric,
> > >
> > > On 17/07/2024 18:43, Eric Auger wrote:
> > >> Hi Peter, Richard,
> > >>
> > >> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
> > >>> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
> > >>>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
> > >>>> but not nested instances.
> > >>>> This patch series adds support for nested translation in SMMUv3,
> > >>>> this is controlled by property “arm-smmuv3.stage=nested”, and
> > >>>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
> > >>> For the whole series (3-9, 11, 12, 15, 16, 18):
> > >>>
> > >>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > >>>
> > >>> (and I think patch 16 is missing Eric's R-b)
> > >>
> > >> Jean-Philippe and I have followed up the progress of this series,
> > >> Mostafa took into account all our comments and all the patches were
> > >> reviewed. It seems to be in a pretty decent state so if you don't have
> > >> any objection, please consider pulling it for 9.1.
> > >>
> > >> On my end I did some testing in non nesting mode with virtio-net/vhost
> > >> and I have not noticed any regression.
> > >> Would be nice if someone could send his T-b for the nested part though
> > >> (Julien?).
> > >
> > > I haven't yet tried the latest version. I will do that in the next
> > > couple of days.
> > I see this is already merged. If this still matters:
> >
> > Tested-by: Julien Grall <jgrall@amazon.com>
> 
> We can't retrospectively add the tag, but the testing itself
> is still important -- thanks for doing it.
> 
> Q: is there any reason not to:
>  (a) change the default to "nested" rather than "1"
>  (b) make the virt board (for new virt machine versions) use
>      "nested"?
> 
> AIUI "nested" should be a superset of "stage-1 only", the guest
> can just ignore stage-2 if it doesn't care about it. Or is
> there a performance hit from having stage-2 around even if the
> guest doesn't enable it?

I didn’t do benchmarks, but from the code, I don’t think there
would be a difference from using stage-1 only or nested stages
with stage-1 config.
I didn’t make “nested” the default stage or used it for the virt
board, as I was worried about compatibility issues (I think that
breaks backward migration), but otherwise I don’t see issues.

But if I understand correctly, setting that for virt board 9.1
(virt_machine_9_1_options) would be fine?

Thanks,
Mostafa

> 
> thanks
> -- PMM
Peter Maydell July 22, 2024, 9:35 a.m. UTC | #8
On Sat, 20 Jul 2024 at 23:11, Mostafa Saleh <smostafa@google.com> wrote:
>
> Hi Peter,
>
> On Fri, Jul 19, 2024 at 04:57:18PM +0100, Peter Maydell wrote:
> > On Fri, 19 Jul 2024 at 16:36, Julien Grall <julien@xen.org> wrote:
> > >
> > > Hi,
> > >
> > > On 18/07/2024 10:43, Julien Grall wrote:
> > > > Hi Eric,
> > > >
> > > > On 17/07/2024 18:43, Eric Auger wrote:
> > > >> Hi Peter, Richard,
> > > >>
> > > >> On 7/17/24 17:09, Jean-Philippe Brucker wrote:
> > > >>> On Mon, Jul 15, 2024 at 08:45:00AM +0000, Mostafa Saleh wrote:
> > > >>>> Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs
> > > >>>> but not nested instances.
> > > >>>> This patch series adds support for nested translation in SMMUv3,
> > > >>>> this is controlled by property “arm-smmuv3.stage=nested”, and
> > > >>>> advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2)
> > > >>> For the whole series (3-9, 11, 12, 15, 16, 18):
> > > >>>
> > > >>> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > >>>
> > > >>> (and I think patch 16 is missing Eric's R-b)
> > > >>
> > > >> Jean-Philippe and I have followed up the progress of this series,
> > > >> Mostafa took into account all our comments and all the patches were
> > > >> reviewed. It seems to be in a pretty decent state so if you don't have
> > > >> any objection, please consider pulling it for 9.1.
> > > >>
> > > >> On my end I did some testing in non nesting mode with virtio-net/vhost
> > > >> and I have not noticed any regression.
> > > >> Would be nice if someone could send his T-b for the nested part though
> > > >> (Julien?).
> > > >
> > > > I haven't yet tried the latest version. I will do that in the next
> > > > couple of days.
> > > I see this is already merged. If this still matters:
> > >
> > > Tested-by: Julien Grall <jgrall@amazon.com>
> >
> > We can't retrospectively add the tag, but the testing itself
> > is still important -- thanks for doing it.
> >
> > Q: is there any reason not to:
> >  (a) change the default to "nested" rather than "1"
> >  (b) make the virt board (for new virt machine versions) use
> >      "nested"?
> >
> > AIUI "nested" should be a superset of "stage-1 only", the guest
> > can just ignore stage-2 if it doesn't care about it. Or is
> > there a performance hit from having stage-2 around even if the
> > guest doesn't enable it?
>
> I didn’t do benchmarks, but from the code, I don’t think there
> would be a difference from using stage-1 only or nested stages
> with stage-1 config.
> I didn’t make “nested” the default stage or used it for the virt
> board, as I was worried about compatibility issues (I think that
> breaks backward migration), but otherwise I don’t see issues.
>
> But if I understand correctly, setting that for virt board 9.1
> (virt_machine_9_1_options) would be fine?

Yes, we would need to retain the old stage-1-only config on
the older machine version types, but we could switch to
nested by default on the newer ones. It's a little late
I think to make that change for the virt board at this
point in the 9.1 release cycle, but we could do it for 9.2.

thanks
-- PMM