
[kvm-unit-tests,3/6] arm: pmu: Add extra DSB barriers in the mem_access loop

Message ID 20230315110725.1215523-4-eric.auger@redhat.com (mailing list archive)
State New, archived
Series arm: pmu: Fix random failures of pmu-chain-promotion

Commit Message

Eric Auger March 15, 2023, 11:07 a.m. UTC
The mem_access loop currently features only ISB barriers. However,
the loop counts the number of accesses made to memory, and an ISB
does not guarantee that the PE cannot reorder memory accesses. Let's
add a DSB ISH before the write to PMCR_EL0 that enables the PMU,
and another one after the last iteration, before disabling the PMU.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>

---

This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
---
 arm/pmu.c | 2 ++
 1 file changed, 2 insertions(+)
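
In barrier terms: the ISB ensures the write to PMCR_EL0 is seen by
subsequent instructions, but it imposes no ordering on memory accesses,
so loads belonging to the loop could complete outside the window in
which the PMU is enabled. A DSB ISH additionally waits for all
outstanding memory accesses in the Inner Shareable domain to complete.
Annotated, the patched sequence reads (a summary, not part of the patch
itself):

	dsb     ish                // prior accesses complete; they are not counted
	msr     pmcr_el0, %[pmcr]  // enable the PMU
	isb                        // synchronize the PMCR_EL0 write
	// ... counted loop: ldr x9, [%[addr]], 'loop' times ...
	dsb     ish                // loop accesses complete; all are counted
	msr     pmcr_el0, xzr      // disable the PMU
	isb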

Comments

Alexandru Elisei April 21, 2023, 10:25 a.m. UTC | #1
Hi,

On Wed, Mar 15, 2023 at 12:07:22PM +0100, Eric Auger wrote:
> The mem_access loop currently features only ISB barriers. However,
> the loop counts the number of accesses made to memory, and an ISB
> does not guarantee that the PE cannot reorder memory accesses. Let's
> add a DSB ISH before the write to PMCR_EL0 that enables the PMU,
> and another one after the last iteration, before disabling the PMU.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> ---
> 
> This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
> ---
>  arm/pmu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index b88366a8..dde399e2 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -301,6 +301,7 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
>  {
>  	uint64_t pmcr64 = pmcr;
>  asm volatile(
> +	"       dsb     ish\n"

I think it might still be possible for memory accesses which are part of
the loop to be reordered after the DSB above and before the PMU is enabled
below. But the DSB above is needed to make sure that previous memory
accesses, which shouldn't be counted as part of the loop, have completed.

I would put another DSB after the ISB which enables the PMU; that way all
of the loop's memory accesses are neatly sandwiched between two DSBs.

Having 3 DSBs might look like overdoing it, but I believe it to be correct.
What do you think?

Thanks,
Alex

>  	"       msr     pmcr_el0, %[pmcr]\n"
>  	"       isb\n"
>  	"       mov     x10, %[loop]\n"
> @@ -308,6 +309,7 @@ asm volatile(
>  	"       ldr	x9, [%[addr]]\n"
>  	"       cmp     x10, #0x0\n"
>  	"       b.gt    1b\n"
> +	"       dsb     ish\n"
>  	"       msr     pmcr_el0, xzr\n"
>  	"       isb\n"
>  	:
> -- 
> 2.38.1
> 
>
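
For reference, here is a sketch of mem_access_loop() with the third DSB
Alexandru suggests above. The placement of the middle barrier, the
loop-decrement line and the asm operand list are not visible in the
quoted hunks and are reconstructed here, so treat them as assumptions
rather than the final code:

static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
{
	uint64_t pmcr64 = pmcr;
asm volatile(
	/* complete prior accesses so they are not counted */
	"       dsb     ish\n"
	"       msr     pmcr_el0, %[pmcr]\n"
	"       isb\n"
	/* proposed third DSB: hold back the loop accesses until the
	 * PMU has been enabled (assumed placement) */
	"       dsb     ish\n"
	"       mov     x10, %[loop]\n"
	"1:     sub     x10, x10, #1\n"
	"       ldr     x9, [%[addr]]\n"
	"       cmp     x10, #0x0\n"
	"       b.gt    1b\n"
	/* complete the loop accesses before the PMU is disabled */
	"       dsb     ish\n"
	"       msr     pmcr_el0, xzr\n"
	"       isb\n"
	:
	: [addr] "r" (addr), [pmcr] "r" (pmcr64), [loop] "r" (loop)
	: "x9", "x10", "cc");
}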
Eric Auger April 24, 2023, 8:11 p.m. UTC | #2
Hi Alexandru,

On 4/21/23 12:25, Alexandru Elisei wrote:
> Hi,
>
> On Wed, Mar 15, 2023 at 12:07:22PM +0100, Eric Auger wrote:
>> The mem_access loop currently features only ISB barriers. However,
>> the loop counts the number of accesses made to memory, and an ISB
>> does not guarantee that the PE cannot reorder memory accesses. Let's
>> add a DSB ISH before the write to PMCR_EL0 that enables the PMU,
>> and another one after the last iteration, before disabling the PMU.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>
>> ---
>>
>> This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
>> ---
>>  arm/pmu.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arm/pmu.c b/arm/pmu.c
>> index b88366a8..dde399e2 100644
>> --- a/arm/pmu.c
>> +++ b/arm/pmu.c
>> @@ -301,6 +301,7 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
>>  {
>>  	uint64_t pmcr64 = pmcr;
>>  asm volatile(
>> +	"       dsb     ish\n"
> I think it might still be possible for memory accesses which are part of
> the loop to be reordered after the DSB above and before the PMU is enabled
> below. But the DSB above is needed to make sure that previous memory
> accesses, which shouldn't be counted as part of the loop, have completed.
>
> I would put another DSB after the ISB which enables the PMU; that way all
> of the loop's memory accesses are neatly sandwiched between two DSBs.
>
> Having 3 DSBs might look like overdoing it, but I believe it to be correct.
> What do you think?
I need more time to investigate this. I will come back to you next week
as I am OoO this week. Sorry for the inconvenience.
Thank you for the review!

Eric
>
> Thanks,
> Alex
>
>>  	"       msr     pmcr_el0, %[pmcr]\n"
>>  	"       isb\n"
>>  	"       mov     x10, %[loop]\n"
>> @@ -308,6 +309,7 @@ asm volatile(
>>  	"       ldr	x9, [%[addr]]\n"
>>  	"       cmp     x10, #0x0\n"
>>  	"       b.gt    1b\n"
>> +	"       dsb     ish\n"
>>  	"       msr     pmcr_el0, xzr\n"
>>  	"       isb\n"
>>  	:
>> -- 
>> 2.38.1
>>
>>
Alexandru Elisei April 25, 2023, 1 p.m. UTC | #3
Hi,

On Mon, Apr 24, 2023 at 10:11:11PM +0200, Eric Auger wrote:
> Hi Alexandru,
> 
> On 4/21/23 12:25, Alexandru Elisei wrote:
> > Hi,
> >
> > On Wed, Mar 15, 2023 at 12:07:22PM +0100, Eric Auger wrote:
> >> The mem_access loop currently features only ISB barriers. However,
> >> the loop counts the number of accesses made to memory, and an ISB
> >> does not guarantee that the PE cannot reorder memory accesses. Let's
> >> add a DSB ISH before the write to PMCR_EL0 that enables the PMU,
> >> and another one after the last iteration, before disabling the PMU.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>
> >>
> >> ---
> >>
> >> This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
> >> ---
> >>  arm/pmu.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arm/pmu.c b/arm/pmu.c
> >> index b88366a8..dde399e2 100644
> >> --- a/arm/pmu.c
> >> +++ b/arm/pmu.c
> >> @@ -301,6 +301,7 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
> >>  {
> >>  	uint64_t pmcr64 = pmcr;
> >>  asm volatile(
> >> +	"       dsb     ish\n"
> > I think it might still be possible for memory accesses which are part of
> > the loop to be reordered after the DSB above and before the PMU is enabled
> > below. But the DSB above is needed to make sure that previous memory
> > accesses, which shouldn't be counted as part of the loop, have completed.
> >
> > I would put another DSB after the ISB which enables the PMU; that way all
> > of the loop's memory accesses are neatly sandwiched between two DSBs.
> >
> > Having 3 DSBs might look like overdoing it, but I believe it to be correct.
> > What do you think?
> I need more time to investigate this. I will come back to you next week
> as I am OoO this week. Sorry for the inconvenience.

That's fine, I'm swamped with other things too, so don't expect a quick
reply :)

Thanks,
Alex

> Thank you for the review!
> 
> Eric
> >
> > Thanks,
> > Alex
> >
> >>  	"       msr     pmcr_el0, %[pmcr]\n"
> >>  	"       isb\n"
> >>  	"       mov     x10, %[loop]\n"
> >> @@ -308,6 +309,7 @@ asm volatile(
> >>  	"       ldr	x9, [%[addr]]\n"
> >>  	"       cmp     x10, #0x0\n"
> >>  	"       b.gt    1b\n"
> >> +	"       dsb     ish\n"
> >>  	"       msr     pmcr_el0, xzr\n"
> >>  	"       isb\n"
> >>  	:
> >> -- 
> >> 2.38.1
> >>
> >>
>

Patch

diff --git a/arm/pmu.c b/arm/pmu.c
index b88366a8..dde399e2 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -301,6 +301,7 @@  static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
 {
 	uint64_t pmcr64 = pmcr;
 asm volatile(
+	"       dsb     ish\n"
 	"       msr     pmcr_el0, %[pmcr]\n"
 	"       isb\n"
 	"       mov     x10, %[loop]\n"
@@ -308,6 +309,7 @@  asm volatile(
 	"       ldr	x9, [%[addr]]\n"
 	"       cmp     x10, #0x0\n"
 	"       b.gt    1b\n"
+	"       dsb     ish\n"
 	"       msr     pmcr_el0, xzr\n"
 	"       isb\n"
 	: