Message ID | 20221205113718.1487-1-likexu@tencent.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/pmu: Avoid ternary operator by directly referring to counters->type |
On Mon, Dec 05, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> In either case, the counters will point to fixed or gp pmc array, and
> taking advantage of the C pointer, it's reasonable to use an almost known
> mem load operation directly without disturbing the branch predictor.

The compiler is extremely unlikely to generate a branch for this, e.g. gcc-12 uses
setne and clang-14 shifts "fixed" by 30. FWIW, clang is also clever enough to
use a cmov to load the address of counters, i.e. the happy path will have no taken
branches for either type of counter.

> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index e5cec07ca8d9..28b0a784f6e9 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>  	}
>  	if (idx >= num_counters)
>  		return NULL;
> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> +	*mask &= pmu->counter_bitmask[counters->type];

In terms of readability, I have a slight preference for the current code as I
don't have to look at counters->type to understand its possible values.
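
For readers following the thread, the two forms being compared can be put side by side in a small userspace sketch. This is not the kernel code: the struct layouts, array sizes, mask widths, and the pmu_init() helper are hypothetical stand-ins, and the enum values (GP = 0, FIXED = 1) and the decoding of "fixed" from bit 30 of the index are assumptions made for illustration. Under those assumptions both forms index the same slot of counter_bitmask[]; the ternary needs only the boolean, while the ->type form needs an extra load through the counters pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM types; enum values are assumed. */
enum pmc_type { KVM_PMC_GP = 0, KVM_PMC_FIXED = 1 };

struct pmc {
	enum pmc_type type;
};

struct pmu {
	uint64_t counter_bitmask[2];
	struct pmc gp_counters[4];
	struct pmc fixed_counters[3];
};

/* Illustrative setup: tag each counter with its type, pick arbitrary widths. */
static void pmu_init(struct pmu *pmu)
{
	pmu->counter_bitmask[KVM_PMC_GP] = (1ULL << 48) - 1;
	pmu->counter_bitmask[KVM_PMC_FIXED] = (1ULL << 40) - 1;
	for (int i = 0; i < 4; i++)
		pmu->gp_counters[i].type = KVM_PMC_GP;
	for (int i = 0; i < 3; i++)
		pmu->fixed_counters[i].type = KVM_PMC_FIXED;
}

/* Form used by the current code: index chosen from the boolean alone. */
static uint64_t mask_ternary(const struct pmu *pmu, uint32_t idx)
{
	bool fixed = idx & (1u << 30);	/* assumed encoding of "fixed" */

	return pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
}

/* Form proposed by the patch: index read back from the selected array. */
static uint64_t mask_by_type(const struct pmu *pmu, uint32_t idx)
{
	bool fixed = idx & (1u << 30);
	const struct pmc *counters = fixed ? pmu->fixed_counters
					   : pmu->gp_counters;

	return pmu->counter_bitmask[counters->type];
}
```

Whether either form compiles to a branch depends on the compiler and flags; the sketch only shows that the two forms are functionally interchangeable when every counter's type field matches the array it lives in.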
On 6/12/2022 12:46 am, Sean Christopherson wrote:
> On Mon, Dec 05, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> In either case, the counters will point to fixed or gp pmc array, and
>> taking advantage of the C pointer, it's reasonable to use an almost known
>> mem load operation directly without disturbing the branch predictor.
>
> The compiler is extremely unlikely to generate a branch for this, e.g. gcc-12 uses
> setne and clang-14 shifts "fixed" by 30. FWIW, clang is also clever enough to
> use a cmov to load the address of counters, i.e. the happy path will have no taken
> branches for either type of counter.

If so, good news for users of the new tool chain. I assume our Linux project is
also to be commended when it comes to supporting legacy issues even if just a little.

>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>  arch/x86/kvm/vmx/pmu_intel.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index e5cec07ca8d9..28b0a784f6e9 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>>  	}
>>  	if (idx >= num_counters)
>>  		return NULL;
>> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
>> +	*mask &= pmu->counter_bitmask[counters->type];
>
> In terms of readability, I have a slight preference for the current code as I
> don't have to look at counters->type to understand its possible values.

When someone tries to add a new type of pmc type, the code bugs up.

And, this one will make all usage of pmu->counter_bitmask[] more consistent.

Please reconsider this minor diff if it does no harm.
On Tue, Dec 06, 2022, Like Xu wrote:
> > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > index e5cec07ca8d9..28b0a784f6e9 100644
> > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> > >  	}
> > >  	if (idx >= num_counters)
> > >  		return NULL;
> > > -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> > > +	*mask &= pmu->counter_bitmask[counters->type];
> >
> > In terms of readability, I have a slight preference for the current code as I
> > don't have to look at counters->type to understand its possible values.
> When someone tries to add a new type of pmc type, the code bugs up.

Are there new types coming along? If so, I definitely would not object to refactoring
this code in the context of a series that adds a new type(s). But "fixing" this one
case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
needs to be updated. Actually, even this function would need additional updates
to perform a similar sanity check.

	if (fixed) {
		counters = pmu->fixed_counters;
		num_counters = pmu->nr_arch_fixed_counters;
	} else {
		counters = pmu->gp_counters;
		num_counters = pmu->nr_arch_gp_counters;
	}
	if (idx >= num_counters)
		return NULL;

> And, this one will make all usage of pmu->counter_bitmask[] more consistent.

How's that? There's literally one instance of using ->type

  static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
  {
	struct kvm_pmu *pmu = pmc_to_pmu(pmc);

	return pmu->counter_bitmask[pmc->type];
  }

everything else is hardcoded. And using pmc->type there make perfect sense in
that case. But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
so IMO switching to ->type makes that function somewhat inconsistent with itself.
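
The point that changing the bitmask index alone would not make the function ready for a third counter type can be seen in a simplified userspace sketch of the lookup. The struct definitions, array sizes, mask widths, and the pmu_init() helper below are hypothetical; the index encoding (bit 30 selects the fixed array, the top two bits are stripped before the bounds check) is an assumption for illustration; and the speculation hardening that the kernel applies via array_index_nospec() is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM types; enum values are assumed. */
enum pmc_type { KVM_PMC_GP = 0, KVM_PMC_FIXED = 1 };

struct pmc {
	enum pmc_type type;
};

struct pmu {
	unsigned int nr_arch_gp_counters;
	unsigned int nr_arch_fixed_counters;
	uint64_t counter_bitmask[2];
	struct pmc gp_counters[8];
	struct pmc fixed_counters[3];
};

/* Illustrative setup with 4 GP and 3 fixed counters and arbitrary widths. */
static void pmu_init(struct pmu *pmu)
{
	pmu->nr_arch_gp_counters = 4;
	pmu->nr_arch_fixed_counters = 3;
	pmu->counter_bitmask[KVM_PMC_GP] = (1ULL << 48) - 1;
	pmu->counter_bitmask[KVM_PMC_FIXED] = (1ULL << 40) - 1;
	for (int i = 0; i < 8; i++)
		pmu->gp_counters[i].type = KVM_PMC_GP;
	for (int i = 0; i < 3; i++)
		pmu->fixed_counters[i].type = KVM_PMC_FIXED;
}

/* Simplified lookup: every step below is keyed off the two-way "fixed" flag. */
static struct pmc *rdpmc_ecx_to_pmc(struct pmu *pmu, uint32_t idx,
				    uint64_t *mask)
{
	bool fixed = idx & (1u << 30);	/* assumed encoding */
	struct pmc *counters;
	unsigned int num_counters;

	idx &= ~(3u << 30);		/* assumed: strip the flag bits */
	if (fixed) {
		counters = pmu->fixed_counters;
		num_counters = pmu->nr_arch_fixed_counters;
	} else {
		counters = pmu->gp_counters;
		num_counters = pmu->nr_arch_gp_counters;
	}
	if (idx >= num_counters)
		return NULL;

	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
	return &counters[idx];		/* kernel also clamps the index here */
}
```

In this sketch the flag decode, the array selection, and the counter count are all two-way choices, so a hypothetical third type would need changes in each of those places regardless of how the bitmask is indexed.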
On 7/12/2022 1:19 am, Sean Christopherson wrote:
> On Tue, Dec 06, 2022, Like Xu wrote:
>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>> index e5cec07ca8d9..28b0a784f6e9 100644
>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>> @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
>>>>  	}
>>>>  	if (idx >= num_counters)
>>>>  		return NULL;
>>>> -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
>>>> +	*mask &= pmu->counter_bitmask[counters->type];
>>>
>>> In terms of readability, I have a slight preference for the current code as I

IMO, using counters->type directly just like pmc_bitmask() will add more readability
and opportunistically helps some stale compilers behave better.

>>> don't have to look at counters->type to understand its possible values.
>> When someone tries to add a new type of pmc type, the code bugs up.
>
> Are there new types coming along? If so, I definitely would not object to refactoring
> this code in the context of a series that adds a new type(s). But "fixing" this one
> case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
> needs to be updated. Actually, even this function would need additional updates
> to perform a similar sanity check.

True but this part of the change is semantically relevant, which should not
be present in a harmless generic optimization like this one. Right ?

>
> 	if (fixed) {
> 		counters = pmu->fixed_counters;
> 		num_counters = pmu->nr_arch_fixed_counters;
> 	} else {
> 		counters = pmu->gp_counters;
> 		num_counters = pmu->nr_arch_gp_counters;
> 	}
> 	if (idx >= num_counters)
> 		return NULL;
>
>> And, this one will make all usage of pmu->counter_bitmask[] more consistent.
>
> How's that? There's literally one instance of using ->type
>
>   static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>   {
> 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>
> 	return pmu->counter_bitmask[pmc->type];
>   }
>
> everything else is hardcoded. And using pmc->type there make perfect sense in
> that case. But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
> so IMO switching to ->type makes that function somewhat inconsistent with itself.

More, it's rare to see code like " [ a ? b : c] " in the world of both KVM and x86.
Good practice (branchless) should be scattered everywhere and not the other way around.

I have absolutely no objection to your "slight preference".
Thanks for your time in reviewing this.
On Wed, Dec 07, 2022, Like Xu wrote:
> On 7/12/2022 1:19 am, Sean Christopherson wrote:
> > On Tue, Dec 06, 2022, Like Xu wrote:
> > > > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > index e5cec07ca8d9..28b0a784f6e9 100644
> > > > > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > > > > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > > > > @@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> > > > >  	}
> > > > >  	if (idx >= num_counters)
> > > > >  		return NULL;
> > > > > -	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> > > > > +	*mask &= pmu->counter_bitmask[counters->type];
> > > >
> > > > In terms of readability, I have a slight preference for the current code as I
>
> IMO, using counters->type directly just like pmc_bitmask() will add more readability
> and opportunistically helps some stale compilers behave better.

Anyone that cares about this level of micro-optimization absolutely should be using
a toolchain that's at or near the bleeding edge.

> > > > don't have to look at counters->type to understand its possible values.
> > > When someone tries to add a new type of pmc type, the code bugs up.
> >
> > Are there new types coming along? If so, I definitely would not object to refactoring
> > this code in the context of a series that adds a new type(s). But "fixing" this one
> > case is not sufficient to support a new type, e.g. intel_is_valid_rdpmc_ecx() also
> > needs to be updated. Actually, even this function would need additional updates
> > to perform a similar sanity check.
>
> True but this part of the change is semantically relevant, which should not
> be present in a harmless generic optimization like this one. Right ?

For modern compilers, it's not an optimization.

> > 	if (fixed) {
> > 		counters = pmu->fixed_counters;
> > 		num_counters = pmu->nr_arch_fixed_counters;
> > 	} else {
> > 		counters = pmu->gp_counters;
> > 		num_counters = pmu->nr_arch_gp_counters;
> > 	}
> > 	if (idx >= num_counters)
> > 		return NULL;
> >
> > > And, this one will make all usage of pmu->counter_bitmask[] more consistent.
> >
> > How's that? There's literally one instance of using ->type
> >
> >   static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
> >   {
> > 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> >
> > 	return pmu->counter_bitmask[pmc->type];
> >   }
> >
> > everything else is hardcoded. And using pmc->type there make perfect sense in
> > that case. But in intel_rdpmc_ecx_to_pmc(), there is already usage of "fixed",
> > so IMO switching to ->type makes that function somewhat inconsistent with itself.
>
> More, it's rare to see code like " [ a ? b : c] " in the world of both KVM and x86.

There are a few false positives here, but ternary operators are common.

  $ git grep ? arch/x86/kvm | wc -l
  292

If you're saying that indexing an array with a ternary operator is rare, then sure,
but only because there is almost never anything that fits such a pattern, not because
it's an inherently bad pattern.

> Good practice (branchless) should be scattered everywhere and not the other
> way around.

Once again, modern compilers will not generate branches for this code.
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e5cec07ca8d9..28b0a784f6e9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -142,7 +142,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 	}
 	if (idx >= num_counters)
 		return NULL;
-	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
+	*mask &= pmu->counter_bitmask[counters->type];
 	return &counters[array_index_nospec(idx, num_counters)];
 }