mbox series

[RFC,v1,0/3] Update tlb invalidation routines for FEAT_LPA2

Message ID 20231027115634.1432154-1-ryan.roberts@arm.com (mailing list archive)
Headers show
Series Update tlb invalidation routines for FEAT_LPA2 | expand

Message

Ryan Roberts Oct. 27, 2023, 11:56 a.m. UTC
Hi All,

As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
changes to properly support LPA2 before Ard's changes get merged. So far those
changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
the TLBI changes to make them independent. The idea is that this series would go
in first, then Ard's and the rest of my series can race eachother and it doesn't
really matter who wins.

I've attempted to address all of Marc's feedback against the versions of these
patches posted at [2], including adding benchmark data (see patch 1). Although
if people are still nervous that this could regress non-lpa2 performance in some
cases, I could rework so that there are lpa2 and non-lpa2 variants of
__flush_tlb_range_op(), and the correct version is chosen at the higher level
(based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).

It turns out that we won't be able to key LPA2 usage off the same static key for
both the kernel and kvm usage because the kernel usage additionally depends on
CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
(lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
defines and implements lpa2_is_enabled() in his series, so there will be a minor
conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
static key for kvm in my series. Marc, would you be happy with this approach?

Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
I'll re-post on 6.7-rc1.

[1] https://lore.kernel.org/linux-arm-kernel/5651bb31-9ef6-4dfc-b146-64606279bbf7@arm.com/
[2] https://lore.kernel.org/kvmarm/20231009185008.3803879-1-ryan.roberts@arm.com/

Thanks,
Ryan

Ryan Roberts (3):
  arm64/mm: Modify range-based tlbi to decrement scale
  arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
  arm64/mm: Update tlb invalidation routines for FEAT_LPA2

 arch/arm64/include/asm/kvm_mmu.h      |   3 +
 arch/arm64/include/asm/pgtable-prot.h |   2 +
 arch/arm64/include/asm/tlb.h          |  15 ++--
 arch/arm64/include/asm/tlbflush.h     | 100 ++++++++++++++++----------
 4 files changed, 78 insertions(+), 42 deletions(-)

--
2.25.1

Comments

Ryan Roberts Nov. 13, 2023, 11:55 a.m. UTC | #1
On 27/10/2023 12:56, Ryan Roberts wrote:
> Hi All,
> 
> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> changes to properly support LPA2 before Ard's changes get merged. So far those
> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> the TLBI changes to make them independent. The idea is that this series would go
> in first, then Ard's and the rest of my series can race eachother and it doesn't
> really matter who wins.
> 
> I've attempted to address all of Marc's feedback against the versions of these
> patches posted at [2], including adding benchmark data (see patch 1). Although
> if people are still nervous that this could regress non-lpa2 performance in some
> cases, I could rework so that there are lpa2 and non-lpa2 variants of
> __flush_tlb_range_op(), and the correct version is chosen at the higher level
> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> 
> It turns out that we won't be able to key LPA2 usage off the same static key for
> both the kernel and kvm usage because the kernel usage additionally depends on
> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> defines and implements lpa2_is_enabled() in his series, so there will be a minor
> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> static key for kvm in my series. Marc, would you be happy with this approach?
> 
> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> I'll re-post on 6.7-rc1.

Marc, All,

I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
be the first 3 of the series. But if you have an issue with the approach it
would be good to work out an alternative plan to avoid wasting effort preparing
the series.

Thanks,
Ryan

> 
> [1] https://lore.kernel.org/linux-arm-kernel/5651bb31-9ef6-4dfc-b146-64606279bbf7@arm.com/
> [2] https://lore.kernel.org/kvmarm/20231009185008.3803879-1-ryan.roberts@arm.com/
> 
> Thanks,
> Ryan
> 
> Ryan Roberts (3):
>   arm64/mm: Modify range-based tlbi to decrement scale
>   arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
>   arm64/mm: Update tlb invalidation routines for FEAT_LPA2
> 
>  arch/arm64/include/asm/kvm_mmu.h      |   3 +
>  arch/arm64/include/asm/pgtable-prot.h |   2 +
>  arch/arm64/include/asm/tlb.h          |  15 ++--
>  arch/arm64/include/asm/tlbflush.h     | 100 ++++++++++++++++----------
>  4 files changed, 78 insertions(+), 42 deletions(-)
> 
> --
> 2.25.1
>
Marc Zyngier Nov. 13, 2023, 12:20 p.m. UTC | #2
On Mon, 13 Nov 2023 11:55:01 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
> On 27/10/2023 12:56, Ryan Roberts wrote:
> > Hi All,
> > 
> > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > changes to properly support LPA2 before Ard's changes get merged. So far those
> > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > the TLBI changes to make them independent. The idea is that this series would go
> > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > really matter who wins.
> > 
> > I've attempted to address all of Marc's feedback against the versions of these
> > patches posted at [2], including adding benchmark data (see patch 1). Although
> > if people are still nervous that this could regress non-lpa2 performance in some
> > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> > 
> > It turns out that we won't be able to key LPA2 usage off the same static key for
> > both the kernel and kvm usage because the kernel usage additionally depends on
> > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > static key for kvm in my series. Marc, would you be happy with this approach?
> > 
> > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > I'll re-post on 6.7-rc1.
> 
> Marc, All,
> 
> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
> be the first 3 of the series. But if you have an issue with the approach it
> would be good to work out an alternative plan to avoid wasting effort preparing
> the series.

No specific concern on the approach. Having a static-key for the KVM
side is good, and the uplift on the range stuff seems compelling.

Thanks,

	M.
Ard Biesheuvel Nov. 13, 2023, 12:29 p.m. UTC | #3
On Mon, 13 Nov 2023 at 22:20, Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 13 Nov 2023 11:55:01 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> > On 27/10/2023 12:56, Ryan Roberts wrote:
> > > Hi All,
> > >
> > > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > > changes to properly support LPA2 before Ard's changes get merged. So far those
> > > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > > the TLBI changes to make them independent. The idea is that this series would go
> > > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > > really matter who wins.
> > >
> > > I've attempted to address all of Marc's feedback against the versions of these
> > > patches posted at [2], including adding benchmark data (see patch 1). Although
> > > if people are still nervous that this could regress non-lpa2 performance in some
> > > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> > >
> > > It turns out that we won't be able to key LPA2 usage off the same static key for
> > > both the kernel and kvm usage because the kernel usage additionally depends on
> > > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > > static key for kvm in my series. Marc, would you be happy with this approach?
> > >
> > > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > > I'll re-post on 6.7-rc1.
> >
> > Marc, All,
> >
> > I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> > series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
> > be the first 3 of the series. But if you have an issue with the approach it
> > would be good to work out an alternative plan to avoid wasting effort preparing
> > the series.
>
> No specific concern on the approach. Having a static-key for the KVM
> side is good, and the uplift on the range stuff seems compelling.
>

OK, I'll put these at the start of my ++v as well.

Thanks,
Ryan Roberts Nov. 13, 2023, 12:42 p.m. UTC | #4
On 13/11/2023 12:29, Ard Biesheuvel wrote:
> On Mon, 13 Nov 2023 at 22:20, Marc Zyngier <maz@kernel.org> wrote:
>>
>> On Mon, 13 Nov 2023 11:55:01 +0000,
>> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>
>>> On 27/10/2023 12:56, Ryan Roberts wrote:
>>>> Hi All,
>>>>
>>>> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
>>>> changes to properly support LPA2 before Ard's changes get merged. So far those
>>>> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
>>>> the TLBI changes to make them independent. The idea is that this series would go
>>>> in first, then Ard's and the rest of my series can race eachother and it doesn't
>>>> really matter who wins.
>>>>
>>>> I've attempted to address all of Marc's feedback against the versions of these
>>>> patches posted at [2], including adding benchmark data (see patch 1). Although
>>>> if people are still nervous that this could regress non-lpa2 performance in some
>>>> cases, I could rework so that there are lpa2 and non-lpa2 variants of
>>>> __flush_tlb_range_op(), and the correct version is chosen at the higher level
>>>> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
>>>>
>>>> It turns out that we won't be able to key LPA2 usage off the same static key for
>>>> both the kernel and kvm usage because the kernel usage additionally depends on
>>>> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
>>>> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
>>>> defines and implements lpa2_is_enabled() in his series, so there will be a minor
>>>> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
>>>> static key for kvm in my series. Marc, would you be happy with this approach?
>>>>
>>>> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
>>>> I'll re-post on 6.7-rc1.
>>>
>>> Marc, All,
>>>
>>> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
>>> series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
>>> be the first 3 of the series. But if you have an issue with the approach it
>>> would be good to work out an alternative plan to avoid wasting effort preparing
>>> the series.
>>
>> No specific concern on the approach. Having a static-key for the KVM
>> side is good, and the uplift on the range stuff seems compelling.
>>
> 
> OK, I'll put these at the start of my ++v as well.

Great - thanks both!

> 
> Thanks,
>
Catalin Marinas Nov. 13, 2023, 1:27 p.m. UTC | #5
On Mon, Nov 13, 2023 at 11:55:01AM +0000, Ryan Roberts wrote:
> On 27/10/2023 12:56, Ryan Roberts wrote:
> > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > changes to properly support LPA2 before Ard's changes get merged. So far those
> > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > the TLBI changes to make them independent. The idea is that this series would go
> > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > really matter who wins.
> > 
> > I've attempted to address all of Marc's feedback against the versions of these
> > patches posted at [2], including adding benchmark data (see patch 1). Although
> > if people are still nervous that this could regress non-lpa2 performance in some
> > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> > 
> > It turns out that we won't be able to key LPA2 usage off the same static key for
> > both the kernel and kvm usage because the kernel usage additionally depends on
> > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > static key for kvm in my series. Marc, would you be happy with this approach?
> > 
> > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > I'll re-post on 6.7-rc1.
[...]
> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> series on top of v6.7-rc1 in the next day or 2.

I suspect that's what most maintainers wait for ;). Usually patches
posted just before or during the merging window get ignored, unless they
are urgent fixes.

> By default, these 3 patches will
> be the first 3 of the series. But if you have an issue with the approach it
> would be good to work out an alternative plan to avoid wasting effort preparing
> the series.

If these patches are needed for Ard's series (I think they do), they
could be posted together. But first we need to split Ard's series into
at least two: reworking the memory map together with moving (some of)
the init code to C and the actual LPA2 stage 1 support. We might even
through LPA2 stage 2 on top if we feel brave ;). We can queue them all
together but having separate series gives us an option to drop bits if
needed.
Ryan Roberts Nov. 13, 2023, 4:44 p.m. UTC | #6
On 13/11/2023 13:27, Catalin Marinas wrote:
> On Mon, Nov 13, 2023 at 11:55:01AM +0000, Ryan Roberts wrote:
>> On 27/10/2023 12:56, Ryan Roberts wrote:
>>> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
>>> changes to properly support LPA2 before Ard's changes get merged. So far those
>>> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
>>> the TLBI changes to make them independent. The idea is that this series would go
>>> in first, then Ard's and the rest of my series can race eachother and it doesn't
>>> really matter who wins.
>>>
>>> I've attempted to address all of Marc's feedback against the versions of these
>>> patches posted at [2], including adding benchmark data (see patch 1). Although
>>> if people are still nervous that this could regress non-lpa2 performance in some
>>> cases, I could rework so that there are lpa2 and non-lpa2 variants of
>>> __flush_tlb_range_op(), and the correct version is chosen at the higher level
>>> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
>>>
>>> It turns out that we won't be able to key LPA2 usage off the same static key for
>>> both the kernel and kvm usage because the kernel usage additionally depends on
>>> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
>>> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
>>> defines and implements lpa2_is_enabled() in his series, so there will be a minor
>>> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
>>> static key for kvm in my series. Marc, would you be happy with this approach?
>>>
>>> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
>>> I'll re-post on 6.7-rc1.
> [...]
>> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
>> series on top of v6.7-rc1 in the next day or 2.
> 
> I suspect that's what most maintainers wait for ;). Usually patches
> posted just before or during the merging window get ignored, unless they
> are urgent fixes.
> 
>> By default, these 3 patches will
>> be the first 3 of the series. But if you have an issue with the approach it
>> would be good to work out an alternative plan to avoid wasting effort preparing
>> the series.
> 
> If these patches are needed for Ard's series (I think they do), they
> could be posted together. But first we need to split Ard's series into
> at least two: reworking the memory map together with moving (some of)
> the init code to C and the actual LPA2 stage 1 support. We might even
> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> together but having separate series gives us an option to drop bits if
> needed.

Sorry I'm not completely sure what you are suggesting I do here. In the context
of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
posted against v6.7-rc1 (which is what I'm gearing up for now).

My series and Ard's are independent except that they both depend on the 3
patches in this RFC. Ard has just suggested he would prefix his series with
these 3 patches and I would continue to do the same, then they can be handled as
2 independent series (or more if Ard is splitting his) - the first 3 patches
will just desolve to nothing for the series that goes in second.

Does that work for you, are are you suggesting I should re-post this as an
independent series?

Thanks,
Ryan
Catalin Marinas Nov. 13, 2023, 5:49 p.m. UTC | #7
On Mon, Nov 13, 2023 at 04:44:58PM +0000, Ryan Roberts wrote:
> On 13/11/2023 13:27, Catalin Marinas wrote:
> > If these patches are needed for Ard's series (I think they do), they
> > could be posted together. But first we need to split Ard's series into
> > at least two: reworking the memory map together with moving (some of)
> > the init code to C and the actual LPA2 stage 1 support. We might even
> > through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> > together but having separate series gives us an option to drop bits if
> > needed.
> 
> Sorry I'm not completely sure what you are suggesting I do here. In the context
> of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
> posted against v6.7-rc1 (which is what I'm gearing up for now).

If we merge your series independently of Ard's, that's fine, these three
patches can go together as a prefix.

> My series and Ard's are independent except that they both depend on the 3
> patches in this RFC. Ard has just suggested he would prefix his series with
> these 3 patches and I would continue to do the same, then they can be handled as
> 2 independent series (or more if Ard is splitting his) - the first 3 patches
> will just desolve to nothing for the series that goes in second.
> 
> Does that work for you, are are you suggesting I should re-post this as an
> independent series?

No need to. Just post your LPA2 patches with these three as a prefix.
Ard can include them as well and if it gets to merging both, we'll sort
out a small stable branch for these three patches.
Ryan Roberts Nov. 14, 2023, 11:25 a.m. UTC | #8
On 13/11/2023 17:49, Catalin Marinas wrote:
> On Mon, Nov 13, 2023 at 04:44:58PM +0000, Ryan Roberts wrote:
>> On 13/11/2023 13:27, Catalin Marinas wrote:
>>> If these patches are needed for Ard's series (I think they do), they
>>> could be posted together. But first we need to split Ard's series into
>>> at least two: reworking the memory map together with moving (some of)
>>> the init code to C and the actual LPA2 stage 1 support. We might even
>>> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
>>> together but having separate series gives us an option to drop bits if
>>> needed.
>>
>> Sorry I'm not completely sure what you are suggesting I do here. In the context
>> of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
>> posted against v6.7-rc1 (which is what I'm gearing up for now).
> 
> If we merge your series independently of Ard's, that's fine, these three
> patches can go together as a prefix.
> 
>> My series and Ard's are independent except that they both depend on the 3
>> patches in this RFC. Ard has just suggested he would prefix his series with
>> these 3 patches and I would continue to do the same, then they can be handled as
>> 2 independent series (or more if Ard is splitting his) - the first 3 patches
>> will just desolve to nothing for the series that goes in second.
>>
>> Does that work for you, are are you suggesting I should re-post this as an
>> independent series?
> 
> No need to. Just post your LPA2 patches with these three as a prefix.
> Ard can include them as well and if it gets to merging both, we'll sort
> out a small stable branch for these three patches.
> 

Great - thanks for the clarification!
Ard Biesheuvel Nov. 15, 2023, 10:33 p.m. UTC | #9
On Mon, 13 Nov 2023 at 23:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> But first we need to split Ard's series into
> at least two: reworking the memory map together with moving (some of)
> the init code to C and the actual LPA2 stage 1 support. We might even
> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> together but having separate series gives us an option to drop bits if
> needed.
>

I have rebased the changes onto v6.7-rc1 and squashed or added some of
the followup fixes:

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-lpa2-v5

I am happy to carve this up any way you like. A natural split would be
to include everything up to and including

arm64: mmu: Make __cpu_replace_ttbr1() out of line

and queue the rest later, but perhaps you prefer to start with a smaller subset?
Catalin Marinas Nov. 23, 2023, 5:04 p.m. UTC | #10
On Thu, Nov 16, 2023 at 08:33:40AM +1000, Ard Biesheuvel wrote:
> On Mon, 13 Nov 2023 at 23:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >
> > But first we need to split Ard's series into
> > at least two: reworking the memory map together with moving (some of)
> > the init code to C and the actual LPA2 stage 1 support. We might even
> > through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> > together but having separate series gives us an option to drop bits if
> > needed.
> >
> 
> I have rebased the changes onto v6.7-rc1 and squashed or added some of
> the followup fixes:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-lpa2-v5
> 
> I am happy to carve this up any way you like. A natural split would be
> to include everything up to and including
> 
> arm64: mmu: Make __cpu_replace_ttbr1() out of line
> 
> and queue the rest later, but perhaps you prefer to start with a smaller subset?

This looks fine, just split this whole lot in two, the first one to
__cpu_replace_ttbr1().