KVM: x86/pmu: Avoid ternary operator by directly referring to counters->type

Message ID 20221205113718.1487-1-likexu@tencent.com (mailing list archive)
State New, archived

Commit Message

Like Xu Dec. 5, 2022, 11:37 a.m. UTC
From: Like Xu <likexu@tencent.com>

In either case, counters points to the fixed or the GP pmc array, so
counters->type already identifies the counter type. Taking advantage of
the C pointer, it's reasonable to use this cheap, predictable memory load
directly instead of a ternary operator that may disturb the branch predictor.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Sean Christopherson Dec. 5, 2022, 4:46 p.m. UTC | #1
On Mon, Dec 05, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> In either case, counters points to the fixed or the GP pmc array, so
> counters->type already identifies the counter type. Taking advantage of
> the C pointer, it's reasonable to use this cheap, predictable memory load
> directly instead of a ternary operator that may disturb the branch predictor.

The compiler is extremely unlikely to generate a branch for this, e.g. gcc-12 uses
setne and clang-14 shifts "fixed" by 30.  FWIW, clang is also clever enough to
use a cmov to load the address of counters, i.e. the happy path will have no taken
branches for either type of counter.
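
To illustrate why no branch is needed, here is a minimal standalone sketch,
not the kernel code. It assumes KVM_PMC_GP is 0, KVM_PMC_FIXED is 1, and that
bit 30 of the RDPMC index is the "fixed" flag; the helper names are
hypothetical. With those values, the ternary reduces to a compare-and-set
(setne) or to a shift by 30, which is roughly what the compilers above emit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: GP = 0, FIXED = 1. */
enum pmc_type { KVM_PMC_GP = 0, KVM_PMC_FIXED = 1 };

/* Assumed: bit 30 of the RDPMC index selects the fixed counters. */
#define RDPMC_FIXED_FLAG (1u << 30)

/* Ternary form, as written in the current code. */
static inline unsigned int type_by_ternary(uint32_t idx)
{
	uint32_t fixed = idx & RDPMC_FIXED_FLAG;

	return fixed ? KVM_PMC_FIXED : KVM_PMC_GP;
}

/* Comparison form: "is the flag non-zero?" maps to a setne-style sequence. */
static inline unsigned int type_by_setne(uint32_t idx)
{
	return (idx & RDPMC_FIXED_FLAG) != 0;
}

/* Shift form: move the flag bit down to bit 0. */
static inline unsigned int type_by_shift(uint32_t idx)
{
	return (idx >> 30) & 1u;
}

/* Hypothetical self-check: all three forms agree for a given index. */
static inline void check_equivalence(uint32_t idx)
{
	assert(type_by_ternary(idx) == type_by_setne(idx));
	assert(type_by_ternary(idx) == type_by_shift(idx));
}
```

Whether a given compiler picks the comparison or the shift is its own
choice; the point is only that neither form requires a conditional jump.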

> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index e5cec07ca8d9..28b0a784f6e9 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>  	}
>  	if (idx >= num_counters)
>  		return NULL;
> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> +	*mask &= pmu->counter_bitmask[counters->type];

In terms of readability, I have a slight preference for the current code as I
don't have to look at counters->type to understand its possible values.
Like Xu Dec. 6, 2022, 2:18 a.m. UTC | #2
On 6/12/2022 12:46 am, Sean Christopherson wrote:
> On Mon, Dec 05, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> In either case, counters points to the fixed or the GP pmc array, so
>> counters->type already identifies the counter type. Taking advantage of
>> the C pointer, it's reasonable to use this cheap, predictable memory load
>> directly instead of a ternary operator that may disturb the branch predictor.
> 
> The compiler is extremely unlikely to generate a branch for this, e.g. gcc-12 uses
> setne and clang-14 shifts "fixed" by 30.  FWIW, clang is also clever enough to
> use a cmov to load the address of counters, i.e. the happy path will have no taken
> branches for either type of counter.

If so, that is good news for users of newer toolchains. I would hope the Linux
project also deserves some credit for supporting legacy toolchains, even if only a little.

> 
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   arch/x86/kvm/vmx/pmu_intel.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index e5cec07ca8d9..28b0a784f6e9 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>>   	}
>>   	if (idx >= num_counters)
>>   		return NULL;
>> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
>> +	*mask &= pmu->counter_bitmask[counters->type];
> 
> In terms of readability, I have a slight preference for the current code as I
> don't have to look at counters->type to understand its possible values.
When someone tries to add a new pmc type, the code breaks. Also,
this change makes all uses of pmu->counter_bitmask[] more consistent.

Please reconsider this minor diff if it does no harm.
Sean Christopherson Dec. 6, 2022, 5:19 p.m. UTC | #3
On Tue, Dec 06, 2022, Like Xu wrote:
> > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > index e5cec07ca8d9..28b0a784f6e9 100644
> > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> > >   	}
> > >   	if (idx >= num_counters)
> > >   		return NULL;
> > > -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> > > +	*mask &= pmu->counter_bitmask[counters->type];
> > 
> > In terms of readability, I have a slight preference for the current code as I
> > don't have to look at counters->type to understand its possible values.
> When someone tries to add a new pmc type, the code breaks.

Are there new types coming along?  If so, I definitely would not object to refactoring
this code in the context of a series that adds a new type(s).  But "fixing" this one
case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
needs to be updated.  Actually, even this function would need additional updates
to perform a similar sanity check.

	if (fixed) {
		counters = pmu->fixed_counters;
		num_counters = pmu->nr_arch_fixed_counters;
	} else {
		counters = pmu->gp_counters;
		num_counters = pmu->nr_arch_gp_counters;
	}
	if (idx >= num_counters)
		return NULL;

> Also, this change makes all uses of pmu->counter_bitmask[] more consistent.

How's that?  There's literally one instance of using ->type

  static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
  {
	struct kvm_pmu *pmu = pmc_to_pmu(pmc);

	return pmu->counter_bitmask[pmc->type];
  }

everything else is hardcoded.  And using pmc->type there makes perfect sense in
that case.  But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
so IMO switching to ->type makes that function somewhat inconsistent with itself.
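
For readers following along, the two index expressions can be compared in a
simplified standalone model. Everything below is an assumption made for
illustration: the struct layouts, array sizes, bitmask widths, the handling of
the top two index bits, and the names pmu_model_init() and
rdpmc_to_pmc_model() are hypothetical, and the real function also applies
array_index_nospec(). The sketch only shows that, with two types, the ternary
and counters->type select the same counter_bitmask[] entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration: GP = 0, FIXED = 1. */
enum pmc_type { KVM_PMC_GP = 0, KVM_PMC_FIXED = 1 };

/* Hypothetical, reduced stand-ins for struct kvm_pmc / struct kvm_pmu. */
struct pmc_model {
	enum pmc_type type;
	uint64_t counter;
};

struct pmu_model {
	unsigned int nr_gp;
	unsigned int nr_fixed;
	struct pmc_model gp_counters[8];
	struct pmc_model fixed_counters[4];
	uint64_t counter_bitmask[2];
};

/* Fill in types and arbitrary, distinct bitmasks so results are observable. */
static void pmu_model_init(struct pmu_model *pmu)
{
	size_t i;

	pmu->nr_gp = 4;
	pmu->nr_fixed = 3;
	for (i = 0; i < 8; i++)
		pmu->gp_counters[i] = (struct pmc_model){ KVM_PMC_GP, 0 };
	for (i = 0; i < 4; i++)
		pmu->fixed_counters[i] = (struct pmc_model){ KVM_PMC_FIXED, 0 };
	pmu->counter_bitmask[KVM_PMC_GP] = (1ull << 48) - 1;
	pmu->counter_bitmask[KVM_PMC_FIXED] = (1ull << 40) - 1;
}

static struct pmc_model *rdpmc_to_pmc_model(struct pmu_model *pmu,
					    uint32_t idx, uint64_t *mask)
{
	int fixed = !!(idx & (1u << 30));	/* assumed flag bit */
	struct pmc_model *counters;
	unsigned int num_counters;

	idx &= ~(3u << 30);			/* assumed: strip the top two bits */
	if (fixed) {
		counters = pmu->fixed_counters;
		num_counters = pmu->nr_fixed;
	} else {
		counters = pmu->gp_counters;
		num_counters = pmu->nr_gp;
	}
	if (idx >= num_counters)
		return NULL;

	/* With exactly two types, both index expressions are equivalent. */
	assert(counters->type == (fixed ? KVM_PMC_FIXED : KVM_PMC_GP));
	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
	return &counters[idx];
}
```

Note that a hypothetical third type would still need changes to the if/else
that picks counters and num_counters, independent of which index expression
is used.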
Like Xu Dec. 7, 2022, 8:44 a.m. UTC | #4
On 7/12/2022 1:19 am, Sean Christopherson wrote:
> On Tue, Dec 06, 2022, Like Xu wrote:
>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>> index e5cec07ca8d9..28b0a784f6e9 100644
>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>>>>    	}
>>>>    	if (idx >= num_counters)
>>>>    		return NULL;
>>>> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
>>>> +	*mask &= pmu->counter_bitmask[counters->type];
>>>
>>> In terms of readability, I have a slight preference for the current code as I

IMO, using counters->type directly, just like pmc_bitmask() does, improves readability
and opportunistically helps some older compilers generate better code.

>>> don't have to look at counters->type to understand its possible values.
>> When someone tries to add a new pmc type, the code breaks.
> 
> Are there new types coming along?  If so, I definitely would not object to refactoring
> this code in the context of a series that adds a new type(s).  But "fixing" this one
> case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
> needs to be updated.  Actually, even this function would need additional updates
> to perform a similar sanity check.

True, but that part of the change would be a semantic change, which should not
be mixed into a harmless generic optimization like this one. Right?

> 
> 	if (fixed) {
> 		counters = pmu->fixed_counters;
> 		num_counters = pmu->nr_arch_fixed_counters;
> 	} else {
> 		counters = pmu->gp_counters;
> 		num_counters = pmu->nr_arch_gp_counters;
> 	}
> 	if (idx >= num_counters)
> 		return NULL;
> 
>> Also, this change makes all uses of pmu->counter_bitmask[] more consistent.
> 
> How's that?  There's literally one instance of using ->type
> 
>    static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>    {
> 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> 
> 	return pmu->counter_bitmask[pmc->type];
>    }
> 
> everything else is hardcoded.  And using pmc->type there makes perfect sense in
> that case.  But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
> so IMO switching to ->type makes that function somewhat inconsistent with itself.

Moreover, it's rare to see code like "[a ? b : c]" in either KVM or x86.
Good practice (branchless code) should be spread everywhere, not the other way
around.

I have absolutely no objection to your "slight preference". Thanks for your time 
in reviewing this.
Sean Christopherson Dec. 7, 2022, 5:48 p.m. UTC | #5
On Wed, Dec 07, 2022, Like Xu wrote:
> On 7/12/2022 1:19 am, Sean Christopherson wrote:
> > On Tue, Dec 06, 2022, Like Xu wrote:
> > > > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > index e5cec07ca8d9..28b0a784f6e9 100644
> > > > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> > > > >    	}
> > > > >    	if (idx >= num_counters)
> > > > >    		return NULL;
> > > > > -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> > > > > +	*mask &= pmu->counter_bitmask[counters->type];
> > > > 
> > > > In terms of readability, I have a slight preference for the current code as I
> 
> IMO, using counters->type directly, just like pmc_bitmask() does, improves readability
> and opportunistically helps some older compilers generate better code.

Anyone that cares about this level of micro-optimization absolutely should be
using a toolchain that's at or near the bleeding edge.

> > > > don't have to look at counters->type to understand its possible values.
> > > When someone tries to add a new pmc type, the code breaks.
> > 
> > Are there new types coming along?  If so, I definitely would not object to refactoring
> > this code in the context of a series that adds a new type(s).  But "fixing" this one
> > case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
> > needs to be updated.  Actually, even this function would need additional updates
> > to perform a similar sanity check.
> 
> True, but that part of the change would be a semantic change, which should not
> be mixed into a harmless generic optimization like this one. Right?

For modern compilers, it's not an optimization.

> > 	if (fixed) {
> > 		counters = pmu->fixed_counters;
> > 		num_counters = pmu->nr_arch_fixed_counters;
> > 	} else {
> > 		counters = pmu->gp_counters;
> > 		num_counters = pmu->nr_arch_gp_counters;
> > 	}
> > 	if (idx >= num_counters)
> > 		return NULL;
> > 
> > > Also, this change makes all uses of pmu->counter_bitmask[] more consistent.
> > 
> > How's that?  There's literally one instance of using ->type
> > 
> >    static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
> >    {
> > 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> > 
> > 	return pmu->counter_bitmask[pmc->type];
> >    }
> > 
> > everything else is hardcoded.  And using pmc->type there makes perfect sense in
> > that case.  But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
> > so IMO switching to ->type makes that function somewhat inconsistent with itself.
> 
> Moreover, it's rare to see code like "[a ? b : c]" in either KVM or x86.

There are a few false positives here, but ternary operators are common.

  $ git grep ? arch/x86/kvm | wc -l
  292

If you're saying that indexing an array with a ternary operator is rare, then sure,
but only because there is almost never anything that fits such a pattern, not because
it's an inherently bad pattern.

> Good practice (branchless code) should be spread everywhere, not the other
> way around.

Once again, modern compilers will not generate branches for this code.

Patch

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e5cec07ca8d9..28b0a784f6e9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -142,7 +142,7 @@  static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 	}
 	if (idx >= num_counters)
 		return NULL;
-	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
+	*mask &= pmu->counter_bitmask[counters->type];
 	return &counters[array_index_nospec(idx, num_counters)];
 }