[v3,2/2,XTF] xtf/vpmu: MSR read/write tests for VPMU

Message ID	20170424175430.395-3-mohit.gambhir@oracle.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> From: Mohit Gambhir <mohit.gambhir@oracle.com> To: andrew.cooper3@citrix.com, xen-devel@lists.xen.org Date: Mon, 24 Apr 2017 13:54:30 -0400 Message-Id: <20170424175430.395-3-mohit.gambhir@oracle.com> In-Reply-To: <20170424175430.395-1-mohit.gambhir@oracle.com> References: <20170424175430.395-1-mohit.gambhir@oracle.com> Cc: boris.ostrovsky@oracle.com, Mohit Gambhir <mohit.gambhir@oracle.com> Subject: [Xen-devel] [PATCH v3 2/2][XTF] xtf/vpmu: MSR read/write tests for VPMU Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>

Mohit Gambhir April 24, 2017, 5:54 p.m. UTC

This patch tests VPMU functionality in the hypervisor on Intel machines.
The tests write to all valid bits in the MSRs that get exposed to the guests
when VPMU is enabled. The tests also write invalid values to the bits
that should be masked and expect the wrmsr call to fault.

The tests are currently unsupported for AMD machines and PV guests.

Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com>
---
 tests/vpmu/Makefile |   9 +
 tests/vpmu/main.c   | 504 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 513 insertions(+)
 create mode 100644 tests/vpmu/Makefile
 create mode 100644 tests/vpmu/main.c

Andrew Cooper April 25, 2017, 6:20 p.m. UTC | #1

On 24/04/17 18:54, Mohit Gambhir wrote:
> This patch tests VPMU functionality in the hypervisor on Intel machines.
> The tests write to all valid bits in the MSRs that get exposed to the guests
> when VPMU is enabled. The tests also write invalid values to the bits
> that should be masked and expect the wrmsr call to fault.
>
> The tests are currently unsupported for AMD machines and PV guests.
>
> Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com>
> ---
>  tests/vpmu/Makefile |   9 +
>  tests/vpmu/main.c   | 504 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 513 insertions(+)
>  create mode 100644 tests/vpmu/Makefile
>  create mode 100644 tests/vpmu/main.c
>
> diff --git a/tests/vpmu/Makefile b/tests/vpmu/Makefile
> new file mode 100644
> index 0000000..1eaf436
> --- /dev/null
> +++ b/tests/vpmu/Makefile
> @@ -0,0 +1,9 @@
> +include $(ROOT)/build/common.mk
> +
> +NAME      := vpmu
> +CATEGORY  := utility

utilities don't get run automatically.  Is this intentional?  If it
isn't, what is the plan for making it automatically run?  vpmu is still
disabled by default in all branches due to the security vulnerabilities,
so even if this vpmu test was automatically run, it would skip due to
vpmu not being visible.

As a tangent, I wonder if it would be a useful to have a separate
category for incomplete tests, but which are still useful to have for
manual running.

> +TEST-ENVS := $(ALL_ENVIRONMENTS)

You build for all environments, but then have PV unilaterally skip. 
Again, is this intentional?

For HVM, why all environments?  does PMU have any interaction with the
operating or paging mode in use at the time of the samples?

> +
> +obj-perenv += main.o
> +
> +include $(ROOT)/build/gen.mk
> diff --git a/tests/vpmu/main.c b/tests/vpmu/main.c
> new file mode 100644
> index 0000000..ed00953
> --- /dev/null
> +++ b/tests/vpmu/main.c
> @@ -0,0 +1,504 @@
> +/**
> + * @file tests/vpmu/main.c
> + * @ref test-vpmu
> + *
> + * @page test-vpmu vpmu
> + *
> + * Test MSRs exposed by VPMU for various versions of Architectural Performance
> + * Monitoring Unit

This needs rather more information.  What tests are performed?  Take a
look at http://xenbits.xen.org/docs/xtf/test-index.html, particularly
the functional tests.

> + *
> + * @see tests/vpmu/main.c
> + */
> +
> +#include <xtf.h>
> +#include <arch/msr-index.h>
> +#include <arch/mm.h>
> +
> +#define EVENT_UOPS_RETIRED                   0x004101c2
> +#define EVENT_UOPS_RETIRED_ANYTHREAD         0x006101c2
> +#define FIXED_CTR_CTL_BITS                   4
> +#define FIXED_CTR_ENABLE                     0x0000000A
> +#define FIXED_CTR_ENABLE_ANYTHREAD           0x0000000E
> +#define IA32_PERFEVENTSELx_ENABLE_ANYTHREAD  (1ull << 21)
> +#define IA32_PERFEVENTSELx_ENABLE_PCB        (1ull << 19)
> +#define IA32_DEBUGCTL_Freeze_LBR_ON_PMI      (1ull << 11)
> +#define IA32_DEBUGCTL_Freeze_PerfMon_On_PMI  (1ull << 12)
> +
> +const char test_title[] = "Test vpmu";
> +
> +static void get_intel_pmu_version(uint8_t *v, uint8_t *ng, uint8_t *nf,
> +                                  uint8_t *ngb)
> +{

IMO, It looks like the logic would be easier to follow if this wasn't
broken out of test_main().

> +    /* Inter Software Development Manual Vol 2A
> +     * Table 3-8  Information Returned by CPUID Instruction */

Please follow Xen Hypervisor coding style.  Therefore,

/* Single line comments like this please. */

and

/*
 * Multi-line comments
 * like this please.
 */

> +
> +    uint32_t leaf = 0x0a, subleaf = 0;
> +
> +    uint32_t eax, ebx, ecx, edx;
> +
> +    cpuid_count(leaf, subleaf, &eax, &ebx, &ecx, &edx);

You need to check max_leaf >= 0x0a before this is valid to do.  I have
just pushed a change which exposes max_leaf and max_extd_leaf from the
boot-time cpuid collection logic, which you can use.

> +
> +    /* Extract the version ID - EAX 7:0 */
> +    *v =  (eax & 0xff);
> +
> +    /* Extract number of general purpose counter regs - EAX 15:8 */
> +    *ng = (eax >> 8) & 0xff;
> +
> +    /* Extract number of fixed function counter regs - EDX 4:0 */
> +    *nf = edx & 0x1f;
> +
> +    /* Extract number of bits in general purpose counter registers bits */
> +    *ngb = (eax >> 16)  & 0xff;
> +}
> +
> +static bool test_valid_msr_write(uint32_t idx, uint64_t wval)
> +{
> +    const char *pfx = "Error: test_valid_msr_write:";
> +
> +    uint64_t rval = 0;

I don't see much value with spacing these declarations out.

> +
> +    printk("Writing 0x%" PRIx64 " to MSR 0x%x\n", wval, idx);

%#x is shorter than 0x%x when you don't care about the width of the
number printed.

However, it is generally easier to read the logs when numbers like this
are printed at full width,

> +
> +    if( wrmsr_safe(idx, wval) )
> +    {
> +        xtf_error("%s wrmsr for MSR 0x%x resulted in a fault!\n", pfx, idx);

If you are expecting the write to succeed and it doesn't, that is a test
failure, rather than an error (which is used to signify something
expected going wrong).

The trailing ! is unnecessary in the log, and is it possible to reduce
the verbosity of the logging?  This could easily be xtf_failure("Fail:
wrmsr(%08x, %016"PRIx64") got unexpected fault\n") ?

I have also found it generally helpful to not print the success cases,
as they quickly grow out of control.  Logging just the top level areas
of test (e.g. your vPMU version function), and the failures makes is far
easier in the case that something does go wrong.

> +        return false;
> +    }
> +
> +    /* check to see if the values were written correctly */
> +    if ( rdmsr_safe(idx, &rval) )
> +    {
> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault!\n", pfx, idx);
> +        return false;
> +    }
> +    if ( rval != wval )
> +    {
> +        xtf_error("%s rdmsr value mismatch for MSR 0x%x Expected: %"PRIx64
> +                ", Found %"PRIx64"\n", pfx, idx, wval, rval);
> +        return false;
> +    }
> +
> +    return true;

As a general note, having true and false returned like this are
counterproductive in comprehensive tests like this.  It causes you to
bail out at the first failure, rather than try everything and log all
errors.  A single call to xtf_{error,failure}() will cause the overall
result to be the most severe, so you don't need to pass back an extra
status like this (unless a failure at this point actively prohibits a
later bit of testing, which doesn't appear to be the case).

> +}
> +
> +static bool test_invalid_msr_write(uint32_t idx, uint64_t wval)
> +{
> +    printk("Write invalid value %"PRIx64" to MSR 0x%x\n", wval, idx);
> +
> +    /* wrmsr_safe must return false after faulting */
> +    if( !wrmsr_safe(idx, wval) )
> +    {
> +        xtf_error("Error: test_invalid_msr_write wrmsr for MSR 0x%x did not "
> +                  "fault!\n", idx);
> +        return false;
> +    }
> +
> +    printk("wrmsr to MSR 0x%x faulted as expected.\n", idx);
> +
> +    return true;
> +}
> +
> +static bool test_transparent_msr_write(uint32_t idx, uint64_t wval)
> +{
> +    const char *pfx = "Error: test_transparent_msr_write:";
> +
> +    uint64_t rval1 = 0;
> +
> +    uint64_t rval2 = 0;

Again, no need for these to be so spaced out.

> +
> +    printk("Attempting to write a value with no effect to MSR 0x%x\n", idx);
> +
> +    /* read current value */
> +    if ( rdmsr_safe(idx, &rval1) )
> +    {
> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
> +        return false;
> +    }
> +
> +    /* wrmsr should not fault but should not write anything either */
> +    if  ( wrmsr_safe(idx, wval) )
> +    {
> +        xtf_error("%s wrmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
> +        return false;
> +    }
> +
> +    /* read new value */
> +    if ( rdmsr_safe(idx, &rval2) )
> +    {
> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
> +        return false;
> +    }
> +
> +    /* Check if the new value is the same as the one before wrmsr */
> +
> +    if ( rval1 != rval2 )
> +    {
> +        xtf_error("%s rdmsr value mismatch for MSR 0x%x Expected %"PRIx64
> +                ". Found %"PRIx64"\n", pfx, idx, rval1, rval2);
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
> +static bool test_intel_pmu_ver1(uint8_t ng)
> +{
> +    /* Intel Sofwtare Development Manual Vol. 3B,
> +     * Section 18.2.1 - Architectural Performance Monitoring Version 1
> +     *
> +     * MSRs made available by the VPMU -
> +     *
> +     * IA32_PMCx (start at address 0xc1)
> +     * IA32_PERFEVENTSELx (start at address 0x186)
> +     */

Very helpful comment! I thoroughly approve.

> +
> +    uint32_t idx;
> +
> +    uint64_t wval = 0;
> +
> +    uint8_t i;

This doesn't need to be a uint8_t.  A plain unsigned int will be fine.

> +
> +    printk("-----------------\n");
> +    printk("Testing version 1\n");
> +    printk("-----------------\n");
> +
> +    /* for all general purpose counters */
> +    for ( i = 0; i < ng; i++ )
> +    {
> +        /* test writing to IA32_PMCx */
> +        idx = MSR_IA32_PMC(i);
> +
> +        /* test we can write to all valid bits in the counters */
> +        /* don't set bit 31 since that gets sign-extended */

What is wrong with bit 31 being sign extended?

> +        wval = ((1ull << 31) - 1) ;
> +
> +        if ( !test_valid_msr_write(idx, wval) )
> +            return false;
> +
> +        /* set all valid bits in MSR_IA32_EVENTSELx */
> +        idx = MSR_IA32_PERFEVTSEL(i);
> +        wval = ((1ull << 32) - 1) ^ (IA32_PERFEVENTSELx_ENABLE_ANYTHREAD |
> +                IA32_PERFEVENTSELx_ENABLE_PCB);

What is wrong with Pin Control in this register?  Architecturally, it is
a software configurable bit.

> +
> +        if ( !test_valid_msr_write(idx, wval) )
> +            return false;
> +
> +        /* test writing an invalid value and assert that it faults */
> +        wval = ~((1ull << 32) - 1);
> +
> +        if ( !test_invalid_msr_write(idx, wval) )
> +            return false;
> +    }
> +
> +    return true;
> +}
> +
> +static bool test_intel_pmu_ver2(uint8_t ng, uint8_t nf)
> +{
> +    /* Intel Sofwtare Development Manual Vol. 3B,
> +     * Section 18.2.2 - Architectural Performance Monitoring Version 2
> +     *
> +     * MSRs made available by the VPMU -
> +     *
> +     * IA32_FIXED_CTRx (start at address 0x309)
> +     * IA32_FIXED_CTR_CTRL
> +     * IA32_PERF_GLOBAL_CTRL
> +     * IA32_PERF_GLOBAL_STATUS (read-only)
> +     * IA32_PERF_GLOBAL_OVF_CTRL
> +     * IA32_DEBUGCTL_Freeze_LBR_ON_PMI
> +     * IA32_DEBUGCTL_Freeze_PerfMon_On_PMI
> +     */
> +
> +    uint32_t idx;
> +
> +    uint64_t wval, rval;
> +
> +    uint8_t i;
> +
> +    printk("-----------------\n");
> +    printk("Testing version 2\n");
> +    printk("-----------------\n");
> +
> +    for ( i = 0; i < nf; i++ )
> +    {
> +        /* test writing to IA32_FIXED_CTRx */
> +        idx = MSR_IA32_FIXED_CTR(i);
> +
> +        wval = (1ull << 32) - 1;
> +
> +        if ( !test_valid_msr_write(idx, wval) )
> +             return false;
> +
> +        /* test invalid write to IA32_FIXED_CTRx */
> +        wval = ~wval;
> +
> +        if ( !test_invalid_msr_write(idx, wval) )
> +             return false;
> +    }
> +
> +    /* test IA32_FIXED_CTR_CTRL */
> +    idx = MSR_IA32_FIXED_CTR_CTRL;
> +
> +    /* enable all fixed counters */
> +    wval = 0;
> +
> +    for ( i = 0; i < nf; i++ )
> +        wval |= (FIXED_CTR_ENABLE << (FIXED_CTR_CTL_BITS * i));
> +
> +    if ( !test_valid_msr_write(idx, wval) )
> +        return false;
> +
> +    /* invert wval to test writing an invalid value */
> +    wval = ~wval;
> +
> +    if ( !test_invalid_msr_write(idx, wval) )
> +      return false;
> +
> +    /* test IA32_PERF_GLOBAL_CTRL */
> +    idx = MSR_IA32_PERF_GLOBAL_CTRL;
> +
> +    wval = 0;
> +
> +    /* set all fixed function counters enable bits */
> +    for ( i=0; i < nf; i ++ )
> +        wval |= ((uint64_t)1 << (32 + i));
> +
> +    /* set all general purpose counters enable bits*/
> +    for ( i = 0; i < ng; i++ )
> +        wval |= (1 << i);
> +
> +    if ( !test_valid_msr_write(idx, wval) )
> +         return false;
> +
> +    /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_CTRL*/
> +    wval = ~wval;
> +
> +    if ( !test_invalid_msr_write(idx, wval) )
> +        return false;
> +
> +    /* test IA32_PERF_GLOBAL_OVF_CTRL */
> +    idx = MSR_IA32_PERF_GLOBAL_OVF_CTRL;
> +
> +    /* set all valid bits in MSR_IA32_PERF_GLOBAL_OVF_CTRL */
> +    wval = 0xC000000000000000 | (((1ULL << nf) - 1) << 32) | ((1ULL << ng) - 1);
> +
> +    /* Writing to MSR_IA32_PERF_GLOBAL_OVF_CTRL clears
> +     * MSR_IA32_PERF_GLOBAL_STATUS but but always returns 0 when read so
> +     * it is tested as a transparent write */
> +
> +    if ( !test_transparent_msr_write(idx, wval) )
> +        return false;
> +
> +    /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_OVF_CTRL*/
> +    wval = ~wval;
> +
> +    if ( !test_invalid_msr_write(idx, wval) )
> +        return false;
> +
> +    /* test IA32_PERF_GLOBAL_STATUS (read-only) */
> +    idx = MSR_IA32_PERF_GLOBAL_STATUS;
> +
> +    if ( rdmsr_safe(idx, &rval) )
> +    {
> +        xtf_error("Error: test_intel_pmu_ver2: "
> +                  "rdmsr_safe for MSR 0x%x resulted in a fault!\n", idx);
> +        return false;
> +    }
> +
> +    /* try to write the IA32_PERF_GLOBAL_STATUS register and expect it to fail*/
> +    if ( !test_invalid_msr_write(idx, rval) )
> +        return false;
> +
> +
> +    /* test IA32_DEBUGCTL */
> +    idx = MSR_IA32_DEBUGCTL;
> +
> +    /* Test IA32_DEBUGCTL facilities enabled by v2 */
> +    wval = IA32_DEBUGCTL_Freeze_LBR_ON_PMI | IA32_DEBUGCTL_Freeze_PerfMon_On_PMI;
> +
> +    /* FIXME: This should really be a valid write but it isnt supported by the
> +     * VPMU yet */

In which case the test should be correct here, and highlight that there
is a bug in Xen.

> +    if ( !test_transparent_msr_write(idx, wval) )
> +        return false;
> +
> +    return true;
> +}
> +
> +static bool test_intel_pmu_ver3(uint8_t ng, uint8_t nf)
> +{
> +    /* Intel Sofwtare Development Manual Vol. 3B
> +     * Section 18.2.3 Architectural Performance Monitoring Version 3
> +     *
> +     * MSRs made available by the VPMU
> +     *
> +     * IA32_PERFEVENTSELx.ANYTHREAD (Bit 21)
> +     * IA32_FIXED_CTR_CTRL.ANYTHREADx (Bit 2, 6, 11)
> +     *
> +     * Version 3 introduces ANYTHREAD bit but VPMU does not support it to
> +     * ensure that a VCPU isn't able to read values from physical resources that
> +     * are not allocated to it. This test case validates that we are unable to
> +     * write to .ANYTHREAD bit in IA32_PERFEVENTSELx and IA32_FIXED_CTR_CTRL
> +     */
> +
> +    uint64_t wval;
> +
> +    uint32_t idx;
> +
> +    uint8_t i;
> +
> +    printk("-----------------\n");
> +    printk("Testing version 3\n");
> +    printk("-----------------\n");
> +
> +    /* test IA32_PERFEVENTSELx.ANYTHREAD is disabled */
> +    for ( i = 0; i < ng; i++ )
> +    {
> +        idx = MSR_IA32_PERFEVTSEL(i);
> +
> +        wval = EVENT_UOPS_RETIRED_ANYTHREAD;
> +
> +        if ( !test_invalid_msr_write(idx, wval) )
> +            return false;
> +    }
> +
> +
> +    /* test IA32_FIXED_CTR_CTL.ANYTHREAD is disabled */
> +    idx = MSR_IA32_FIXED_CTR_CTRL;
> +
> +    wval = 0;
> +
> +    for ( i = 0; i < nf; i++ )
> +        wval |= (FIXED_CTR_ENABLE_ANYTHREAD << (4 * i)) ;
> +
> +    if ( !test_invalid_msr_write(idx, wval) )
> +        return false;
> +
> +    return true;
> +}
> +
> +static bool test_full_width_cnts(uint8_t ng, uint8_t ngb)
> +{
> +    uint64_t caps;
> +
> +    uint32_t idx;
> +
> +    uint64_t wval;
> +
> +    uint8_t i;
> +
> +    /* Get perf capabilties */
> +    if ( rdmsr_safe(MSR_IA32_PERF_CAPABILITIES, &caps) )
> +    {
> +        printk("Fault while reading MSR_IA32_PERF_CAPABILITIES\n");

The PERF_CAPABILITIES MSR is only valid to read if PDCM is advertised in
CPUID.

We currently never advertise PCDM, and is a bug that
MSR_PERF_CAPABILITIES is accessable at all in guest context. 
(specifically, it is because Xen leaks almost all host MSR state into
guests).

> +        return false;
> +    }
> +
> +    if ( !(caps >> 13) & 1 )

This lacks sufficient brackets for what you are intending to do.  Also,
it looks like you probably want a #define at the top of the file for
this constant.

> +    {
> +        printk("Full width counters not supported\n");
> +        return true;
> +    }
> +
> +    /* test writes to full width counters */
> +    for ( i = 0; i < ng; i++)
> +    {
> +        idx = MSR_IA32_A_PMC(i);
> +
> +        wval = ((1ull << ngb) - 1) ;
> +
> +        if ( !test_valid_msr_write(idx, wval) )
> +                return false;
> +
> +        /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_OVF_CTRL*/
> +        wval = ~wval;
> +
> +        if ( !test_invalid_msr_write(idx, wval) )
> +            return false;
> +    }
> +
> +    return true;
> +
> +}
> +
> +void test_main(void)
> +{
> +    /* Architectural Performance Monitoring Version */
> +    uint8_t ver;
> +
> +    /* Number of general purpose counter registers */
> +    uint8_t ngregs;
> +
> +    /* Number of fixed function counter registers */
> +    uint8_t nfregs;
> +
> +    /* Bit width of general-purpose, performance monitoring counter */
> +    uint8_t ngbits;
> +
> +    if ( IS_DEFINED(CONFIG_PV) )
> +    {
> +        printk("VPMU testing for PV guests currently unsupported\n");
> +        goto skip;

From test_main(), use return xtf_skip("...\n").  You don't need any goto
skip at all.

> +    }
> +
> +    if ( vendor_is_amd )
> +    {
> +        printk("VPMU testing for AMD currently unsupported\n");
> +        goto skip;
> +    }
> +
> +    get_intel_pmu_version(&ver, &ngregs, &nfregs, &ngbits);
> +
> +    printk("Ver=%u: GPR=%u: FFR=%x GPB=%u\n", ver, ngregs, nfregs, ngbits);

Please don't abbreviate general purpose information like this.  Also,
where possible, please use "FOO $val, BAR $val2, ..." instead if mixing
punctuation like that.

Finally, please make sure that you don't mix decimal and hex without a
leading prefix in the same print message.  It is very confusing to read.

> +
> +    switch (ver)
> +    {
> +
> +        case 4:
> +                /* version 4 facilities are not supported
> +                 * VPMU  emulates version 3 */
> +                /* fall through */

The default case should be here, with an xtf_warning("Test doesn't
support version %u\n", ver), falling through into case 3.

> +
> +        case 3:
> +                /* test version 3 facilities */
> +                if ( !test_intel_pmu_ver3( ngregs, nfregs) )
> +                    return xtf_failure("Fail: Failed VPMU version 3\n");
> +                /* fall through */
> +
> +        case 2:
> +                /* test version 2 facilities */
> +                if ( !test_intel_pmu_ver2(ngregs, nfregs) )
> +                    return xtf_failure("Fail: Failed VPMU version 2\n");
> +
> +                /* test version 1 facilities */
> +                if ( !test_intel_pmu_ver1(ngregs) )
> +                    return xtf_failure("Fail: Failed VPMU version 1\n");
> +
> +                /* test full width counters */
> +                if ( !test_full_width_cnts(ngregs, ngbits) )
> +                    return xtf_failure("Fail: Failed full width test\n");
> +
> +                break;
> +
> +        case 1:
> +                /* version 1 unsupported */

You have a v1 test function, yet it is unsupported?

> +
> +        default:

This should be a case 0 specifically for vpmu unavailable.

~Andrew

> +                printk("VMPU not supported!\n");
> +                goto skip;
> +    }
> +
> +    return xtf_success(NULL);
> +
> +skip:
> +    return xtf_skip(NULL);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */

Mohit Gambhir April 25, 2017, 9:45 p.m. UTC | #2

On 04/25/2017 02:20 PM, Andrew Cooper wrote:
> On 24/04/17 18:54, Mohit Gambhir wrote:
>> This patch tests VPMU functionality in the hypervisor on Intel machines.
>> The tests write to all valid bits in the MSRs that get exposed to the guests
>> when VPMU is enabled. The tests also write invalid values to the bits
>> that should be masked and expect the wrmsr call to fault.
>>
>> The tests are currently unsupported for AMD machines and PV guests.
>>
>> Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com>
>> ---
>>   tests/vpmu/Makefile |   9 +
>>   tests/vpmu/main.c   | 504 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 513 insertions(+)
>>   create mode 100644 tests/vpmu/Makefile
>>   create mode 100644 tests/vpmu/main.c
>>
>> diff --git a/tests/vpmu/Makefile b/tests/vpmu/Makefile
>> new file mode 100644
>> index 0000000..1eaf436
>> --- /dev/null
>> +++ b/tests/vpmu/Makefile
>> @@ -0,0 +1,9 @@
>> +include $(ROOT)/build/common.mk
>> +
>> +NAME      := vpmu
>> +CATEGORY  := utility
> utilities don't get run automatically.  Is this intentional?  If it
> isn't, what is the plan for making it automatically run?  vpmu is still
> disabled by default in all branches due to the security vulnerabilities,
> so even if this vpmu test was automatically run, it would skip due to
> vpmu not being visible.
The reason I wanted it to not run automatically was because I thought it 
will fail when vpmu
is not enabled - which is the default case. But if XTF will skip vpmu 
tests when it is disabled
then we can run the tests automatically. Should the CATEGORY be 
functional in that case?
> As a tangent, I wonder if it would be a useful to have a separate
> category for incomplete tests, but which are still useful to have for
> manual running.
Possibly. Utility category seems to do just that but I can see that it 
is really not meant for
functional tests. There are situations where writing tests before or 
while implementing a
feature is useful  and in that case this new category can be beneficial. 
It would also benefit
to have some kind of "expected failure" return type.
>> +TEST-ENVS := $(ALL_ENVIRONMENTS)
> You build for all environments, but then have PV unilaterally skip.
> Again, is this intentional?
Yes, I thought I would go back and add PV/AMD tests in V2 but I can 
leave those TEST-ENVS
out for now and "skip the skip" :) if that's preferable.
> For HVM, why all environments?  does PMU have any interaction with the
> operating or paging mode in use at the time of the samples?
Not that I know of. So should it just be hvm64?
>> +
>> +obj-perenv += main.o
>> +
>> +include $(ROOT)/build/gen.mk
>> diff --git a/tests/vpmu/main.c b/tests/vpmu/main.c
>> new file mode 100644
>> index 0000000..ed00953
>> --- /dev/null
>> +++ b/tests/vpmu/main.c
>> @@ -0,0 +1,504 @@
>> +/**
>> + * @file tests/vpmu/main.c
>> + * @ref test-vpmu
>> + *
>> + * @page test-vpmu vpmu
>> + *
>> + * Test MSRs exposed by VPMU for various versions of Architectural Performance
>> + * Monitoring Unit
> This needs rather more information.  What tests are performed?  Take a
> look at http://xenbits.xen.org/docs/xtf/test-index.html, particularly
> the functional tests.
OK, will fix it in the next version of the patch.
>> + *
>> + * @see tests/vpmu/main.c
>> + */
>> +
>> +#include <xtf.h>
>> +#include <arch/msr-index.h>
>> +#include <arch/mm.h>
>> +
>> +#define EVENT_UOPS_RETIRED                   0x004101c2
>> +#define EVENT_UOPS_RETIRED_ANYTHREAD         0x006101c2
>> +#define FIXED_CTR_CTL_BITS                   4
>> +#define FIXED_CTR_ENABLE                     0x0000000A
>> +#define FIXED_CTR_ENABLE_ANYTHREAD           0x0000000E
>> +#define IA32_PERFEVENTSELx_ENABLE_ANYTHREAD  (1ull << 21)
>> +#define IA32_PERFEVENTSELx_ENABLE_PCB        (1ull << 19)
>> +#define IA32_DEBUGCTL_Freeze_LBR_ON_PMI      (1ull << 11)
>> +#define IA32_DEBUGCTL_Freeze_PerfMon_On_PMI  (1ull << 12)
>> +
>> +const char test_title[] = "Test vpmu";
>> +
>> +static void get_intel_pmu_version(uint8_t *v, uint8_t *ng, uint8_t *nf,
>> +                                  uint8_t *ngb)
>> +{
> IMO, It looks like the logic would be easier to follow if this wasn't
> broken out of test_main().
>
OK, will fix it in the next version of the patch.
>> +    /* Inter Software Development Manual Vol 2A
>> +     * Table 3-8  Information Returned by CPUID Instruction */
> Please follow Xen Hypervisor coding style.  Therefore,
>
> /* Single line comments like this please. */
>
> and
>
> /*
>   * Multi-line comments
>   * like this please.
>   */
>
OK, will fix it in the next version of the patch.
>> +
>> +    uint32_t leaf = 0x0a, subleaf = 0;
>> +
>> +    uint32_t eax, ebx, ecx, edx;
>> +
>> +    cpuid_count(leaf, subleaf, &eax, &ebx, &ecx, &edx);
> You need to check max_leaf >= 0x0a before this is valid to do.  I have
> just pushed a change which exposes max_leaf and max_extd_leaf from the
> boot-time cpuid collection logic, which you can use.
OK, will fix it in the next version of the patch.
>> +
>> +    /* Extract the version ID - EAX 7:0 */
>> +    *v =  (eax & 0xff);
>> +
>> +    /* Extract number of general purpose counter regs - EAX 15:8 */
>> +    *ng = (eax >> 8) & 0xff;
>> +
>> +    /* Extract number of fixed function counter regs - EDX 4:0 */
>> +    *nf = edx & 0x1f;
>> +
>> +    /* Extract number of bits in general purpose counter registers bits */
>> +    *ngb = (eax >> 16)  & 0xff;
>> +}
>> +
>> +static bool test_valid_msr_write(uint32_t idx, uint64_t wval)
>> +{
>> +    const char *pfx = "Error: test_valid_msr_write:";
>> +
>> +    uint64_t rval = 0;
> I don't see much value with spacing these declarations out.
>
   OK, will fix it in the next version of the patch.
>> +
>> +    printk("Writing 0x%" PRIx64 " to MSR 0x%x\n", wval, idx);
> %#x is shorter than 0x%x when you don't care about the width of the
> number printed.
>
> However, it is generally easier to read the logs when numbers like this
> are printed at full width,
>
   OK, will fix it in the next version of the patch.
>> +
>> +    if( wrmsr_safe(idx, wval) )
>> +    {
>> +        xtf_error("%s wrmsr for MSR 0x%x resulted in a fault!\n", pfx, idx);
> If you are expecting the write to succeed and it doesn't, that is a test
> failure, rather than an error (which is used to signify something
> expected going wrong).
>
> The trailing ! is unnecessary in the log, and is it possible to reduce
> the verbosity of the logging?  This could easily be xtf_failure("Fail:
> wrmsr(%08x, %016"PRIx64") got unexpected fault\n") ?
>
> I have also found it generally helpful to not print the success cases,
> as they quickly grow out of control.  Logging just the top level areas
> of test (e.g. your vPMU version function), and the failures makes is far
> easier in the case that something does go wrong.
   OK, will fix it in the next version of the patch.
>> +        return false;
>> +    }
>> +
>> +    /* check to see if the values were written correctly */
>> +    if ( rdmsr_safe(idx, &rval) )
>> +    {
>> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault!\n", pfx, idx);
>> +        return false;
>> +    }
>> +    if ( rval != wval )
>> +    {
>> +        xtf_error("%s rdmsr value mismatch for MSR 0x%x Expected: %"PRIx64
>> +                ", Found %"PRIx64"\n", pfx, idx, wval, rval);
>> +        return false;
>> +    }
>> +
>> +    return true;
> As a general note, having true and false returned like this are
> counterproductive in comprehensive tests like this.  It causes you to
> bail out at the first failure, rather than try everything and log all
> errors.  A single call to xtf_{error,failure}() will cause the overall
> result to be the most severe, so you don't need to pass back an extra
> status like this (unless a failure at this point actively prohibits a
> later bit of testing, which doesn't appear to be the case).
True. Will change this.
>> +}
>> +
>> +static bool test_invalid_msr_write(uint32_t idx, uint64_t wval)
>> +{
>> +    printk("Write invalid value %"PRIx64" to MSR 0x%x\n", wval, idx);
>> +
>> +    /* wrmsr_safe must return false after faulting */
>> +    if( !wrmsr_safe(idx, wval) )
>> +    {
>> +        xtf_error("Error: test_invalid_msr_write wrmsr for MSR 0x%x did not "
>> +                  "fault!\n", idx);
>> +        return false;
>> +    }
>> +
>> +    printk("wrmsr to MSR 0x%x faulted as expected.\n", idx);
>> +
>> +    return true;
>> +}
>> +
>> +static bool test_transparent_msr_write(uint32_t idx, uint64_t wval)
>> +{
>> +    const char *pfx = "Error: test_transparent_msr_write:";
>> +
>> +    uint64_t rval1 = 0;
>> +
>> +    uint64_t rval2 = 0;
> Again, no need for these to be so spaced out.
OK.
>> +
>> +    printk("Attempting to write a value with no effect to MSR 0x%x\n", idx);
>> +
>> +    /* read current value */
>> +    if ( rdmsr_safe(idx, &rval1) )
>> +    {
>> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
>> +        return false;
>> +    }
>> +
>> +    /* wrmsr should not fault but should not write anything either */
>> +    if  ( wrmsr_safe(idx, wval) )
>> +    {
>> +        xtf_error("%s wrmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
>> +        return false;
>> +    }
>> +
>> +    /* read new value */
>> +    if ( rdmsr_safe(idx, &rval2) )
>> +    {
>> +        xtf_error("%s rdmsr for MSR 0x%x resulted in a fault\n", pfx, idx);
>> +        return false;
>> +    }
>> +
>> +    /* Check if the new value is the same as the one before wrmsr */
>> +
>> +    if ( rval1 != rval2 )
>> +    {
>> +        xtf_error("%s rdmsr value mismatch for MSR 0x%x Expected %"PRIx64
>> +                ". Found %"PRIx64"\n", pfx, idx, rval1, rval2);
>> +        return false;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +static bool test_intel_pmu_ver1(uint8_t ng)
>> +{
>> +    /* Intel Sofwtare Development Manual Vol. 3B,
>> +     * Section 18.2.1 - Architectural Performance Monitoring Version 1
>> +     *
>> +     * MSRs made available by the VPMU -
>> +     *
>> +     * IA32_PMCx (start at address 0xc1)
>> +     * IA32_PERFEVENTSELx (start at address 0x186)
>> +     */
> Very helpful comment! I thoroughly approve.
Awesome!
>> +
>> +    uint32_t idx;
>> +
>> +    uint64_t wval = 0;
>> +
>> +    uint8_t i;
> This doesn't need to be a uint8_t.  A plain unsigned int will be fine.
>
OK.
>> +
>> +    printk("-----------------\n");
>> +    printk("Testing version 1\n");
>> +    printk("-----------------\n");
>> +
>> +    /* for all general purpose counters */
>> +    for ( i = 0; i < ng; i++ )
>> +    {
>> +        /* test writing to IA32_PMCx */
>> +        idx = MSR_IA32_PMC(i);
>> +
>> +        /* test we can write to all valid bits in the counters */
>> +        /* don't set bit 31 since that gets sign-extended */
> What is wrong with bit 31 being sign extended?
The problem there is that setting PMCx to 0xffffffff sign extends 31st 
bit to
the remaining bits. So PMCx becomes 0xffffffffffff (when bit width is 
48). rdmsr then
reads all 48 bits and the value mismatches the one that was written.

A smarter test would provide both - write value to write and read value 
to check against.
But test_valid_msr_write() only takes wval and checks to see if we read 
the same value
that we wrote.

>
>> +        wval = ((1ull << 31) - 1) ;
>> +
>> +        if ( !test_valid_msr_write(idx, wval) )
>> +            return false;
>> +
>> +        /* set all valid bits in MSR_IA32_EVENTSELx */
>> +        idx = MSR_IA32_PERFEVTSEL(i);
>> +        wval = ((1ull << 32) - 1) ^ (IA32_PERFEVENTSELx_ENABLE_ANYTHREAD |
>> +                IA32_PERFEVENTSELx_ENABLE_PCB);
> What is wrong with Pin Control in this register?  Architecturally, it is
> a software configurable bit.
Look at this patch on xen-devel

[PATCH] Fix hypervisor crash when writing to VPMU MSR

I found that out while writing this test.

>> +
>> +        if ( !test_valid_msr_write(idx, wval) )
>> +            return false;
>> +
>> +        /* test writing an invalid value and assert that it faults */
>> +        wval = ~((1ull << 32) - 1);
>> +
>> +        if ( !test_invalid_msr_write(idx, wval) )
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +static bool test_intel_pmu_ver2(uint8_t ng, uint8_t nf)
>> +{
>> +    /* Intel Sofwtare Development Manual Vol. 3B,
>> +     * Section 18.2.2 - Architectural Performance Monitoring Version 2
>> +     *
>> +     * MSRs made available by the VPMU -
>> +     *
>> +     * IA32_FIXED_CTRx (start at address 0x309)
>> +     * IA32_FIXED_CTR_CTRL
>> +     * IA32_PERF_GLOBAL_CTRL
>> +     * IA32_PERF_GLOBAL_STATUS (read-only)
>> +     * IA32_PERF_GLOBAL_OVF_CTRL
>> +     * IA32_DEBUGCTL_Freeze_LBR_ON_PMI
>> +     * IA32_DEBUGCTL_Freeze_PerfMon_On_PMI
>> +     */
>> +
>> +    uint32_t idx;
>> +
>> +    uint64_t wval, rval;
>> +
>> +    uint8_t i;
>> +
>> +    printk("-----------------\n");
>> +    printk("Testing version 2\n");
>> +    printk("-----------------\n");
>> +
>> +    for ( i = 0; i < nf; i++ )
>> +    {
>> +        /* test writing to IA32_FIXED_CTRx */
>> +        idx = MSR_IA32_FIXED_CTR(i);
>> +
>> +        wval = (1ull << 32) - 1;
>> +
>> +        if ( !test_valid_msr_write(idx, wval) )
>> +             return false;
>> +
>> +        /* test invalid write to IA32_FIXED_CTRx */
>> +        wval = ~wval;
>> +
>> +        if ( !test_invalid_msr_write(idx, wval) )
>> +             return false;
>> +    }
>> +
>> +    /* test IA32_FIXED_CTR_CTRL */
>> +    idx = MSR_IA32_FIXED_CTR_CTRL;
>> +
>> +    /* enable all fixed counters */
>> +    wval = 0;
>> +
>> +    for ( i = 0; i < nf; i++ )
>> +        wval |= (FIXED_CTR_ENABLE << (FIXED_CTR_CTL_BITS * i));
>> +
>> +    if ( !test_valid_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    /* invert wval to test writing an invalid value */
>> +    wval = ~wval;
>> +
>> +    if ( !test_invalid_msr_write(idx, wval) )
>> +      return false;
>> +
>> +    /* test IA32_PERF_GLOBAL_CTRL */
>> +    idx = MSR_IA32_PERF_GLOBAL_CTRL;
>> +
>> +    wval = 0;
>> +
>> +    /* set all fixed function counters enable bits */
>> +    for ( i=0; i < nf; i ++ )
>> +        wval |= ((uint64_t)1 << (32 + i));
>> +
>> +    /* set all general purpose counters enable bits*/
>> +    for ( i = 0; i < ng; i++ )
>> +        wval |= (1 << i);
>> +
>> +    if ( !test_valid_msr_write(idx, wval) )
>> +         return false;
>> +
>> +    /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_CTRL*/
>> +    wval = ~wval;
>> +
>> +    if ( !test_invalid_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    /* test IA32_PERF_GLOBAL_OVF_CTRL */
>> +    idx = MSR_IA32_PERF_GLOBAL_OVF_CTRL;
>> +
>> +    /* set all valid bits in MSR_IA32_PERF_GLOBAL_OVF_CTRL */
>> +    wval = 0xC000000000000000 | (((1ULL << nf) - 1) << 32) | ((1ULL << ng) - 1);
>> +
>> +    /* Writing to MSR_IA32_PERF_GLOBAL_OVF_CTRL clears
>> +     * MSR_IA32_PERF_GLOBAL_STATUS but but always returns 0 when read so
>> +     * it is tested as a transparent write */
>> +
>> +    if ( !test_transparent_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_OVF_CTRL*/
>> +    wval = ~wval;
>> +
>> +    if ( !test_invalid_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    /* test IA32_PERF_GLOBAL_STATUS (read-only) */
>> +    idx = MSR_IA32_PERF_GLOBAL_STATUS;
>> +
>> +    if ( rdmsr_safe(idx, &rval) )
>> +    {
>> +        xtf_error("Error: test_intel_pmu_ver2: "
>> +                  "rdmsr_safe for MSR 0x%x resulted in a fault!\n", idx);
>> +        return false;
>> +    }
>> +
>> +    /* try to write the IA32_PERF_GLOBAL_STATUS register and expect it to fail*/
>> +    if ( !test_invalid_msr_write(idx, rval) )
>> +        return false;
>> +
>> +
>> +    /* test IA32_DEBUGCTL */
>> +    idx = MSR_IA32_DEBUGCTL;
>> +
>> +    /* Test IA32_DEBUGCTL facilities enabled by v2 */
>> +    wval = IA32_DEBUGCTL_Freeze_LBR_ON_PMI | IA32_DEBUGCTL_Freeze_PerfMon_On_PMI;
>> +
>> +    /* FIXME: This should really be a valid write but it isnt supported by the
>> +     * VPMU yet */
> In which case the test should be correct here, and highlight that there
> is a bug in Xen.
But then, should the main test return xtf_success, xtf_skip or xtf_failure?
>> +    if ( !test_transparent_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>> +static bool test_intel_pmu_ver3(uint8_t ng, uint8_t nf)
>> +{
>> +    /* Intel Sofwtare Development Manual Vol. 3B
>> +     * Section 18.2.3 Architectural Performance Monitoring Version 3
>> +     *
>> +     * MSRs made available by the VPMU
>> +     *
>> +     * IA32_PERFEVENTSELx.ANYTHREAD (Bit 21)
>> +     * IA32_FIXED_CTR_CTRL.ANYTHREADx (Bit 2, 6, 11)
>> +     *
>> +     * Version 3 introduces ANYTHREAD bit but VPMU does not support it to
>> +     * ensure that a VCPU isn't able to read values from physical resources that
>> +     * are not allocated to it. This test case validates that we are unable to
>> +     * write to .ANYTHREAD bit in IA32_PERFEVENTSELx and IA32_FIXED_CTR_CTRL
>> +     */
>> +
>> +    uint64_t wval;
>> +
>> +    uint32_t idx;
>> +
>> +    uint8_t i;
>> +
>> +    printk("-----------------\n");
>> +    printk("Testing version 3\n");
>> +    printk("-----------------\n");
>> +
>> +    /* test IA32_PERFEVENTSELx.ANYTHREAD is disabled */
>> +    for ( i = 0; i < ng; i++ )
>> +    {
>> +        idx = MSR_IA32_PERFEVTSEL(i);
>> +
>> +        wval = EVENT_UOPS_RETIRED_ANYTHREAD;
>> +
>> +        if ( !test_invalid_msr_write(idx, wval) )
>> +            return false;
>> +    }
>> +
>> +
>> +    /* test IA32_FIXED_CTR_CTL.ANYTHREAD is disabled */
>> +    idx = MSR_IA32_FIXED_CTR_CTRL;
>> +
>> +    wval = 0;
>> +
>> +    for ( i = 0; i < nf; i++ )
>> +        wval |= (FIXED_CTR_ENABLE_ANYTHREAD << (4 * i)) ;
>> +
>> +    if ( !test_invalid_msr_write(idx, wval) )
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>> +static bool test_full_width_cnts(uint8_t ng, uint8_t ngb)
>> +{
>> +    uint64_t caps;
>> +
>> +    uint32_t idx;
>> +
>> +    uint64_t wval;
>> +
>> +    uint8_t i;
>> +
>> +    /* Get perf capabilties */
>> +    if ( rdmsr_safe(MSR_IA32_PERF_CAPABILITIES, &caps) )
>> +    {
>> +        printk("Fault while reading MSR_IA32_PERF_CAPABILITIES\n");
> The PERF_CAPABILITIES MSR is only valid to read if PDCM is advertised in
> CPUID.
>
> We currently never advertise PCDM, and is a bug that
> MSR_PERF_CAPABILITIES is accessable at all in guest context.
> (specifically, it is because Xen leaks almost all host MSR state into
> guests).
So I suppose I can't test full-bit width writes then?
>> +        return false;
>> +    }
>> +
>> +    if ( !(caps >> 13) & 1 )
> This lacks sufficient brackets for what you are intending to do.  Also,
> it looks like you probably want a #define at the top of the file for
> this constant.
! and & have the same precedence and right to left associativity. So 
technically it should
do what it is intended to do. But I will make it look better.
>
>> +    {
>> +        printk("Full width counters not supported\n");
>> +        return true;
>> +    }
>> +
>> +    /* test writes to full width counters */
>> +    for ( i = 0; i < ng; i++)
>> +    {
>> +        idx = MSR_IA32_A_PMC(i);
>> +
>> +        wval = ((1ull << ngb) - 1) ;
>> +
>> +        if ( !test_valid_msr_write(idx, wval) )
>> +                return false;
>> +
>> +        /* invert wval to test invalid write to MSR_IA32_PERF_GLOBAL_OVF_CTRL*/
>> +        wval = ~wval;
>> +
>> +        if ( !test_invalid_msr_write(idx, wval) )
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +
>> +}
>> +
>> +void test_main(void)
>> +{
>> +    /* Architectural Performance Monitoring Version */
>> +    uint8_t ver;
>> +
>> +    /* Number of general purpose counter registers */
>> +    uint8_t ngregs;
>> +
>> +    /* Number of fixed function counter registers */
>> +    uint8_t nfregs;
>> +
>> +    /* Bit width of general-purpose, performance monitoring counter */
>> +    uint8_t ngbits;
>> +
>> +    if ( IS_DEFINED(CONFIG_PV) )
>> +    {
>> +        printk("VPMU testing for PV guests currently unsupported\n");
>> +        goto skip;
>  From test_main(), use return xtf_skip("...\n").  You don't need any goto
> skip at all.
Ok.
>> +    }
>> +
>> +    if ( vendor_is_amd )
>> +    {
>> +        printk("VPMU testing for AMD currently unsupported\n");
>> +        goto skip;
>> +    }
>> +
>> +    get_intel_pmu_version(&ver, &ngregs, &nfregs, &ngbits);
>> +
>> +    printk("Ver=%u: GPR=%u: FFR=%x GPB=%u\n", ver, ngregs, nfregs, ngbits);
> Please don't abbreviate general purpose information like this.  Also,
> where possible, please use "FOO $val, BAR $val2, ..." instead if mixing
> punctuation like that.
>
> Finally, please make sure that you don't mix decimal and hex without a
> leading prefix in the same print message.  It is very confusing to read.
OK, will fix this.
>> +
>> +    switch (ver)
>> +    {
>> +
>> +        case 4:
>> +                /* version 4 facilities are not supported
>> +                 * VPMU  emulates version 3 */
>> +                /* fall through */
> The default case should be here, with an xtf_warning("Test doesn't
> support version %u\n", ver), falling through into case 3.
OK
>> +
>> +        case 3:
>> +                /* test version 3 facilities */
>> +                if ( !test_intel_pmu_ver3( ngregs, nfregs) )
>> +                    return xtf_failure("Fail: Failed VPMU version 3\n");
>> +                /* fall through */
>> +
>> +        case 2:
>> +                /* test version 2 facilities */
>> +                if ( !test_intel_pmu_ver2(ngregs, nfregs) )
>> +                    return xtf_failure("Fail: Failed VPMU version 2\n");
>> +
>> +                /* test version 1 facilities */
>> +                if ( !test_intel_pmu_ver1(ngregs) )
>> +                    return xtf_failure("Fail: Failed VPMU version 1\n");
>> +
>> +                /* test full width counters */
>> +                if ( !test_full_width_cnts(ngregs, ngbits) )
>> +                    return xtf_failure("Fail: Failed full width test\n");
>> +
>> +                break;
>> +
>> +        case 1:
>> +                /* version 1 unsupported */
> You have a v1 test function, yet it is unsupported?
Yes, from VPMUs perspective, v1 in itself is unsupported because that would
have required disabling access to PERF_GLOBAL_STATUS and 
PERF_GLOBAL_CTRL MSRs.
However, each version does support facilities introduced in the previous 
version.
Thus, test_intel_pmu_ver1() is intended for v2 testing and helps 
breaking the
test_intel_pmu_ver2()  function into facilities introduced by the 
architecture in
two separate versions - v1 and v2
>> +
>> +        default:
> This should be a case 0 specifically for vpmu unavailable.
OK
> ~Andrew
>
>> +                printk("VMPU not supported!\n");
>> +                goto skip;
>> +    }
>> +
>> +    return xtf_success(NULL);
>> +
>> +skip:
>> +    return xtf_skip(NULL);
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */

Andrew Cooper May 3, 2017, 8:41 p.m. UTC | #3

On 25/04/17 22:45, Mohit Gambhir wrote:
>>> diff --git a/tests/vpmu/Makefile b/tests/vpmu/Makefile
>>> new file mode 100644
>>> index 0000000..1eaf436
>>> --- /dev/null
>>> +++ b/tests/vpmu/Makefile
>>> @@ -0,0 +1,9 @@
>>> +include $(ROOT)/build/common.mk
>>> +
>>> +NAME      := vpmu
>>> +CATEGORY  := utility
>> utilities don't get run automatically.  Is this intentional?  If it
>> isn't, what is the plan for making it automatically run?  vpmu is still
>> disabled by default in all branches due to the security vulnerabilities,
>> so even if this vpmu test was automatically run, it would skip due to
>> vpmu not being visible.
> The reason I wanted it to not run automatically was because I thought
> it will fail when vpmu
> is not enabled - which is the default case. But if XTF will skip vpmu
> tests when it is disabled
> then we can run the tests automatically. Should the CATEGORY be
> functional in that case?
>> As a tangent, I wonder if it would be a useful to have a separate
>> category for incomplete tests, but which are still useful to have for
>> manual running.
> Possibly. Utility category seems to do just that but I can see that it
> is really not meant for
> functional tests. There are situations where writing tests before or
> while implementing a
> feature is useful  and in that case this new category can be
> beneficial. It would also benefit
> to have some kind of "expected failure" return type.

Hopefully I have covered all of these points sufficiently in the 0/2
thread.  In the meantime, I'll consider how best to introduce a "not
quite fully baked yet" category.

>>> +TEST-ENVS := $(ALL_ENVIRONMENTS)
>> You build for all environments, but then have PV unilaterally skip.
>> Again, is this intentional?
> Yes, I thought I would go back and add PV/AMD tests in V2 but I can
> leave those TEST-ENVS
> out for now and "skip the skip" :) if that's preferable.
>> For HVM, why all environments?  does PMU have any interaction with the
>> operating or paging mode in use at the time of the samples?
> Not that I know of. So should it just be hvm64?

For now, I'd stick with just hvm64.  (A complication which I have yet to
sort out is the TEST-REVISION builds, so adding new logical content to
an existing test doesn't interfere with OSSTests bisection logic, but
that isn't an immediate problem.)

>
>>> +
>>> +    printk("-----------------\n");
>>> +    printk("Testing version 1\n");
>>> +    printk("-----------------\n");
>>> +
>>> +    /* for all general purpose counters */
>>> +    for ( i = 0; i < ng; i++ )
>>> +    {
>>> +        /* test writing to IA32_PMCx */
>>> +        idx = MSR_IA32_PMC(i);
>>> +
>>> +        /* test we can write to all valid bits in the counters */
>>> +        /* don't set bit 31 since that gets sign-extended */
>> What is wrong with bit 31 being sign extended?
> The problem there is that setting PMCx to 0xffffffff sign extends 31st
> bit to
> the remaining bits. So PMCx becomes 0xffffffffffff (when bit width is
> 48). rdmsr then
> reads all 48 bits and the value mismatches the one that was written.

That behaviour sounds mad, but I see it now described in the manuals. 
Oh well...

>
> A smarter test would provide both - write value to write and read
> value to check against.
> But test_valid_msr_write() only takes wval and checks to see if we
> read the same value
> that we wrote.

Well - you did introduce that function.  If altering it would be a
smarter test, then perhaps that is the better option to take.

>
>>
>>> +        wval = ((1ull << 31) - 1) ;
>>> +
>>> +        if ( !test_valid_msr_write(idx, wval) )
>>> +            return false;
>>> +
>>> +        /* set all valid bits in MSR_IA32_EVENTSELx */
>>> +        idx = MSR_IA32_PERFEVTSEL(i);
>>> +        wval = ((1ull << 32) - 1) ^
>>> (IA32_PERFEVENTSELx_ENABLE_ANYTHREAD |
>>> +                IA32_PERFEVENTSELx_ENABLE_PCB);
>> What is wrong with Pin Control in this register?  Architecturally, it is
>> a software configurable bit.
> Look at this patch on xen-devel
>
> [PATCH] Fix hypervisor crash when writing to VPMU MSR
>
> I found that out while writing this test.

Welcome to the very grey areas of processors which XTF has a habit of
finding.  (I really need to correlate all the suspected errata in the
doxygen comments).

Obviously, Xen shouldn't crash, but we are going to need feedback from
Intel before working out how to proceed.

>
>>> +    /* test IA32_DEBUGCTL */
>>> +    idx = MSR_IA32_DEBUGCTL;
>>> +
>>> +    /* Test IA32_DEBUGCTL facilities enabled by v2 */
>>> +    wval = IA32_DEBUGCTL_Freeze_LBR_ON_PMI |
>>> IA32_DEBUGCTL_Freeze_PerfMon_On_PMI;
>>> +
>>> +    /* FIXME: This should really be a valid write but it isnt
>>> supported by the
>>> +     * VPMU yet */
>> In which case the test should be correct here, and highlight that there
>> is a bug in Xen.
> But then, should the main test return xtf_success, xtf_skip or
> xtf_failure?

Failure.  Xen is provably malfunctioning.

>>>
>>> +    /* Get perf capabilties */
>>> +    if ( rdmsr_safe(MSR_IA32_PERF_CAPABILITIES, &caps) )
>>> +    {
>>> +        printk("Fault while reading MSR_IA32_PERF_CAPABILITIES\n");
>> The PERF_CAPABILITIES MSR is only valid to read if PDCM is advertised in
>> CPUID.
>>
>> We currently never advertise PCDM, and is a bug that
>> MSR_PERF_CAPABILITIES is accessable at all in guest context.
>> (specifically, it is because Xen leaks almost all host MSR state into
>> guests).
> So I suppose I can't test full-bit width writes then?

Well, the code should be architecturally correct to start with.  After
that, it is reasonable to explain that Xen used to leak
PERF_CAPABILITIES into guests view, at which point probing it is a
legitimate thing to do.

However, you should take care that your logic doesn't begin to fail at
the point that Xen is fixed to behave architecturally (and actually
return #GP[0] if PDCM isn't advertised to the guest), because I will
definitely be fixing the MSR leakage problems (if noone beats me to it).

>>> +        return false;
>>> +    }
>>> +
>>> +    if ( !(caps >> 13) & 1 )
>> This lacks sufficient brackets for what you are intending to do.  Also,
>> it looks like you probably want a #define at the top of the file for
>> this constant.
> ! and & have the same precedence and right to left associativity. So
> technically it should
> do what it is intended to do. But I will make it look better.

Logical NOT and Address-of have the same precedence.  Logical NOT has a
higher precedence than Bitwise AND.

And to resolve any doubt, here is what Clang things:

andrewcoop@andrewcoop:/local/xen-test-framework.git$ make CC=clang-4.0
-j4 -s
main.c:396:10: error: logical not is only applied to the left hand side
of this bitwise operator [-Werror,-Wlogical-not-parentheses]
    if ( !(caps >> 13) & 1 )
         ^             ~

>
>>> +
>>> +        case 3:
>>> +                /* test version 3 facilities */
>>> +                if ( !test_intel_pmu_ver3( ngregs, nfregs) )
>>> +                    return xtf_failure("Fail: Failed VPMU version
>>> 3\n");
>>> +                /* fall through */
>>> +
>>> +        case 2:
>>> +                /* test version 2 facilities */
>>> +                if ( !test_intel_pmu_ver2(ngregs, nfregs) )
>>> +                    return xtf_failure("Fail: Failed VPMU version
>>> 2\n");
>>> +
>>> +                /* test version 1 facilities */
>>> +                if ( !test_intel_pmu_ver1(ngregs) )
>>> +                    return xtf_failure("Fail: Failed VPMU version
>>> 1\n");
>>> +
>>> +                /* test full width counters */
>>> +                if ( !test_full_width_cnts(ngregs, ngbits) )
>>> +                    return xtf_failure("Fail: Failed full width
>>> test\n");
>>> +
>>> +                break;
>>> +
>>> +        case 1:
>>> +                /* version 1 unsupported */
>> You have a v1 test function, yet it is unsupported?
> Yes, from VPMUs perspective, v1 in itself is unsupported because that
> would
> have required disabling access to PERF_GLOBAL_STATUS and
> PERF_GLOBAL_CTRL MSRs.
> However, each version does support facilities introduced in the
> previous version.
> Thus, test_intel_pmu_ver1() is intended for v2 testing and helps
> breaking the
> test_intel_pmu_ver2()  function into facilities introduced by the
> architecture in
> two separate versions - v1 and v2

You are logically confusing Xen's reasoning for supporting different
versions of hardware PMU, with the version it decides to advertise to
guests.

It is certainly reasonable for the vPMU driver in Xen to decide that,
without PMU v2, it is unacceptably complicated to deal with context
switching the counter state.  However, this has no bearing on which
version of PMU is emulated to the guest, and it would certainly should
be possible to configure Xen to only advertise v1 to guests, while the
real hardware actually supports v2.  (After all, we already downgrade v4
to v3 in a guests view).

On another note about the MSR leakage, the v$N test case should ideally
check that none of the v$N+1 capabilities/features are leaked into the
guest, but this smells like a similar quantity of "redoing it properly
the 2nd time around" as my attempts to fix CPUID handling in Xen.

~Andrew

Mohit Gambhir May 4, 2017, 9:13 p.m. UTC | #4

On 05/03/2017 04:41 PM, Andrew Cooper wrote:
> On 25/04/17 22:45, Mohit Gambhir wrote:
>>>> diff --git a/tests/vpmu/Makefile b/tests/vpmu/Makefile
>>>> new file mode 100644
>>>> index 0000000..1eaf436
>>>> --- /dev/null
>>>> +++ b/tests/vpmu/Makefile
>>>> @@ -0,0 +1,9 @@
>>>> +include $(ROOT)/build/common.mk
>>>> +
>>>> +NAME      := vpmu
>>>> +CATEGORY  := utility
>>> utilities don't get run automatically.  Is this intentional?  If it
>>> isn't, what is the plan for making it automatically run?  vpmu is still
>>> disabled by default in all branches due to the security vulnerabilities,
>>> so even if this vpmu test was automatically run, it would skip due to
>>> vpmu not being visible.
>> The reason I wanted it to not run automatically was because I thought
>> it will fail when vpmu
>> is not enabled - which is the default case. But if XTF will skip vpmu
>> tests when it is disabled
>> then we can run the tests automatically. Should the CATEGORY be
>> functional in that case?
>>> As a tangent, I wonder if it would be a useful to have a separate
>>> category for incomplete tests, but which are still useful to have for
>>> manual running.
>> Possibly. Utility category seems to do just that but I can see that it
>> is really not meant for
>> functional tests. There are situations where writing tests before or
>> while implementing a
>> feature is useful  and in that case this new category can be
>> beneficial. It would also benefit
>> to have some kind of "expected failure" return type.
> Hopefully I have covered all of these points sufficiently in the 0/2
> thread.  In the meantime, I'll consider how best to introduce a "not
> quite fully baked yet" category.
>
>>>> +TEST-ENVS := $(ALL_ENVIRONMENTS)
>>> You build for all environments, but then have PV unilaterally skip.
>>> Again, is this intentional?
>> Yes, I thought I would go back and add PV/AMD tests in V2 but I can
>> leave those TEST-ENVS
>> out for now and "skip the skip" :) if that's preferable.
>>> For HVM, why all environments?  does PMU have any interaction with the
>>> operating or paging mode in use at the time of the samples?
>> Not that I know of. So should it just be hvm64?
> For now, I'd stick with just hvm64.  (A complication which I have yet to
> sort out is the TEST-REVISION builds, so adding new logical content to
> an existing test doesn't interfere with OSSTests bisection logic, but
> that isn't an immediate problem.)
Fixed in v5.
>>>> +
>>>> +    printk("-----------------\n");
>>>> +    printk("Testing version 1\n");
>>>> +    printk("-----------------\n");
>>>> +
>>>> +    /* for all general purpose counters */
>>>> +    for ( i = 0; i < ng; i++ )
>>>> +    {
>>>> +        /* test writing to IA32_PMCx */
>>>> +        idx = MSR_IA32_PMC(i);
>>>> +
>>>> +        /* test we can write to all valid bits in the counters */
>>>> +        /* don't set bit 31 since that gets sign-extended */
>>> What is wrong with bit 31 being sign extended?
>> The problem there is that setting PMCx to 0xffffffff sign extends 31st
>> bit to
>> the remaining bits. So PMCx becomes 0xffffffffffff (when bit width is
>> 48). rdmsr then
>> reads all 48 bits and the value mismatches the one that was written.
> That behaviour sounds mad, but I see it now described in the manuals.
> Oh well...
>
>> A smarter test would provide both - write value to write and read
>> value to check against.
>> But test_valid_msr_write() only takes wval and checks to see if we
>> read the same value
>> that we wrote.
> Well - you did introduce that function.  If altering it would be a
> smarter test, then perhaps that is the better option to take.
Fixed in v5
>>>> +        wval = ((1ull << 31) - 1) ;
>>>> +
>>>> +        if ( !test_valid_msr_write(idx, wval) )
>>>> +            return false;
>>>> +
>>>> +        /* set all valid bits in MSR_IA32_EVENTSELx */
>>>> +        idx = MSR_IA32_PERFEVTSEL(i);
>>>> +        wval = ((1ull << 32) - 1) ^
>>>> (IA32_PERFEVENTSELx_ENABLE_ANYTHREAD |
>>>> +                IA32_PERFEVENTSELx_ENABLE_PCB);
>>> What is wrong with Pin Control in this register?  Architecturally, it is
>>> a software configurable bit.
>> Look at this patch on xen-devel
>>
>> [PATCH] Fix hypervisor crash when writing to VPMU MSR
>>
>> I found that out while writing this test.
> Welcome to the very grey areas of processors which XTF has a habit of
> finding.  (I really need to correlate all the suspected errata in the
> doxygen comments).
>
> Obviously, Xen shouldn't crash, but we are going to need feedback from
> Intel before working out how to proceed.
>
>>>> +    /* test IA32_DEBUGCTL */
>>>> +    idx = MSR_IA32_DEBUGCTL;
>>>> +
>>>> +    /* Test IA32_DEBUGCTL facilities enabled by v2 */
>>>> +    wval = IA32_DEBUGCTL_Freeze_LBR_ON_PMI |
>>>> IA32_DEBUGCTL_Freeze_PerfMon_On_PMI;
>>>> +
>>>> +    /* FIXME: This should really be a valid write but it isnt
>>>> supported by the
>>>> +     * VPMU yet */
>>> In which case the test should be correct here, and highlight that there
>>> is a bug in Xen.
>> But then, should the main test return xtf_success, xtf_skip or
>> xtf_failure?
> Failure.  Xen is provably malfunctioning.
Fixed in v5. The tests now fail.
>
>>>> +    /* Get perf capabilties */
>>>> +    if ( rdmsr_safe(MSR_IA32_PERF_CAPABILITIES, &caps) )
>>>> +    {
>>>> +        printk("Fault while reading MSR_IA32_PERF_CAPABILITIES\n");
>>> The PERF_CAPABILITIES MSR is only valid to read if PDCM is advertised in
>>> CPUID.
>>>
>>> We currently never advertise PCDM, and is a bug that
>>> MSR_PERF_CAPABILITIES is accessable at all in guest context.
>>> (specifically, it is because Xen leaks almost all host MSR state into
>>> guests).
>> So I suppose I can't test full-bit width writes then?
> Well, the code should be architecturally correct to start with.  After
> that, it is reasonable to explain that Xen used to leak
> PERF_CAPABILITIES into guests view, at which point probing it is a
> legitimate thing to do.
>
> However, you should take care that your logic doesn't begin to fail at
> the point that Xen is fixed to behave architecturally (and actually
> return #GP[0] if PDCM isn't advertised to the guest), because I will
> definitely be fixing the MSR leakage problems (if noone beats me to it).
Fixed in v5 ( I think!)
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    if ( !(caps >> 13) & 1 )
>>> This lacks sufficient brackets for what you are intending to do.  Also,
>>> it looks like you probably want a #define at the top of the file for
>>> this constant.
>> ! and & have the same precedence and right to left associativity. So
>> technically it should
>> do what it is intended to do. But I will make it look better.
> Logical NOT and Address-of have the same precedence.  Logical NOT has a
> higher precedence than Bitwise AND.
>
> And to resolve any doubt, here is what Clang things:
>
> andrewcoop@andrewcoop:/local/xen-test-framework.git$ make CC=clang-4.0
> -j4 -s
> main.c:396:10: error: logical not is only applied to the left hand side
> of this bitwise operator [-Werror,-Wlogical-not-parentheses]
>      if ( !(caps >> 13) & 1 )
>           ^             ~
Hard to argue that. Fixed in v5.
>>>> +
>>>> +        case 3:
>>>> +                /* test version 3 facilities */
>>>> +                if ( !test_intel_pmu_ver3( ngregs, nfregs) )
>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>> 3\n");
>>>> +                /* fall through */
>>>> +
>>>> +        case 2:
>>>> +                /* test version 2 facilities */
>>>> +                if ( !test_intel_pmu_ver2(ngregs, nfregs) )
>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>> 2\n");
>>>> +
>>>> +                /* test version 1 facilities */
>>>> +                if ( !test_intel_pmu_ver1(ngregs) )
>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>> 1\n");
>>>> +
>>>> +                /* test full width counters */
>>>> +                if ( !test_full_width_cnts(ngregs, ngbits) )
>>>> +                    return xtf_failure("Fail: Failed full width
>>>> test\n");
>>>> +
>>>> +                break;
>>>> +
>>>> +        case 1:
>>>> +                /* version 1 unsupported */
>>> You have a v1 test function, yet it is unsupported?
>> Yes, from VPMUs perspective, v1 in itself is unsupported because that
>> would
>> have required disabling access to PERF_GLOBAL_STATUS and
>> PERF_GLOBAL_CTRL MSRs.
>> However, each version does support facilities introduced in the
>> previous version.
>> Thus, test_intel_pmu_ver1() is intended for v2 testing and helps
>> breaking the
>> test_intel_pmu_ver2()  function into facilities introduced by the
>> architecture in
>> two separate versions - v1 and v2
> You are logically confusing Xen's reasoning for supporting different
> versions of hardware PMU, with the version it decides to advertise to
> guests.
>
> It is certainly reasonable for the vPMU driver in Xen to decide that,
> without PMU v2, it is unacceptably complicated to deal with context
> switching the counter state.  However, this has no bearing on which
> version of PMU is emulated to the guest, and it would certainly should
> be possible to configure Xen to only advertise v1 to guests, while the
> real hardware actually supports v2.  (After all, we already downgrade v4
> to v3 in a guests view).
Yes, I think I understand what you are saying. I still do stand by the 
decision of  having a
separate function to test v1 facilities for -
A) modularity
B) Localizing the knowledge of xen internals to test_main() function. 
The rest of the code
is written to test the facilities  as described in the SDM.
> On another note about the MSR leakage, the v$N test case should ideally
> check that none of the v$N+1 capabilities/features are leaked into the
> guest, but this smells like a similar quantity of "redoing it properly
> the 2nd time around" as my attempts to fix CPUID handling in Xen.
Maybe something that we can add in a future patch?
> ~Andrew
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

Mohit Gambhir May 4, 2017, 9:36 p.m. UTC | #5

Adding personal email id for future correspondence on this thread - 
mgambhir@outlook.com

On 05/04/2017 05:13 PM, Mohit Gambhir wrote:
>
>
> On 05/03/2017 04:41 PM, Andrew Cooper wrote:
>> On 25/04/17 22:45, Mohit Gambhir wrote:
>>>>> diff --git a/tests/vpmu/Makefile b/tests/vpmu/Makefile
>>>>> new file mode 100644
>>>>> index 0000000..1eaf436
>>>>> --- /dev/null
>>>>> +++ b/tests/vpmu/Makefile
>>>>> @@ -0,0 +1,9 @@
>>>>> +include $(ROOT)/build/common.mk
>>>>> +
>>>>> +NAME      := vpmu
>>>>> +CATEGORY  := utility
>>>> utilities don't get run automatically.  Is this intentional?  If it
>>>> isn't, what is the plan for making it automatically run? vpmu is still
>>>> disabled by default in all branches due to the security 
>>>> vulnerabilities,
>>>> so even if this vpmu test was automatically run, it would skip due to
>>>> vpmu not being visible.
>>> The reason I wanted it to not run automatically was because I thought
>>> it will fail when vpmu
>>> is not enabled - which is the default case. But if XTF will skip vpmu
>>> tests when it is disabled
>>> then we can run the tests automatically. Should the CATEGORY be
>>> functional in that case?
>>>> As a tangent, I wonder if it would be a useful to have a separate
>>>> category for incomplete tests, but which are still useful to have for
>>>> manual running.
>>> Possibly. Utility category seems to do just that but I can see that it
>>> is really not meant for
>>> functional tests. There are situations where writing tests before or
>>> while implementing a
>>> feature is useful  and in that case this new category can be
>>> beneficial. It would also benefit
>>> to have some kind of "expected failure" return type.
>> Hopefully I have covered all of these points sufficiently in the 0/2
>> thread.  In the meantime, I'll consider how best to introduce a "not
>> quite fully baked yet" category.
>>
>>>>> +TEST-ENVS := $(ALL_ENVIRONMENTS)
>>>> You build for all environments, but then have PV unilaterally skip.
>>>> Again, is this intentional?
>>> Yes, I thought I would go back and add PV/AMD tests in V2 but I can
>>> leave those TEST-ENVS
>>> out for now and "skip the skip" :) if that's preferable.
>>>> For HVM, why all environments?  does PMU have any interaction with the
>>>> operating or paging mode in use at the time of the samples?
>>> Not that I know of. So should it just be hvm64?
>> For now, I'd stick with just hvm64.  (A complication which I have yet to
>> sort out is the TEST-REVISION builds, so adding new logical content to
>> an existing test doesn't interfere with OSSTests bisection logic, but
>> that isn't an immediate problem.)
> Fixed in v5.
>>>>> +
>>>>> +    printk("-----------------\n");
>>>>> +    printk("Testing version 1\n");
>>>>> +    printk("-----------------\n");
>>>>> +
>>>>> +    /* for all general purpose counters */
>>>>> +    for ( i = 0; i < ng; i++ )
>>>>> +    {
>>>>> +        /* test writing to IA32_PMCx */
>>>>> +        idx = MSR_IA32_PMC(i);
>>>>> +
>>>>> +        /* test we can write to all valid bits in the counters */
>>>>> +        /* don't set bit 31 since that gets sign-extended */
>>>> What is wrong with bit 31 being sign extended?
>>> The problem there is that setting PMCx to 0xffffffff sign extends 31st
>>> bit to
>>> the remaining bits. So PMCx becomes 0xffffffffffff (when bit width is
>>> 48). rdmsr then
>>> reads all 48 bits and the value mismatches the one that was written.
>> That behaviour sounds mad, but I see it now described in the manuals.
>> Oh well...
>>
>>> A smarter test would provide both - write value to write and read
>>> value to check against.
>>> But test_valid_msr_write() only takes wval and checks to see if we
>>> read the same value
>>> that we wrote.
>> Well - you did introduce that function.  If altering it would be a
>> smarter test, then perhaps that is the better option to take.
> Fixed in v5
>>>>> +        wval = ((1ull << 31) - 1) ;
>>>>> +
>>>>> +        if ( !test_valid_msr_write(idx, wval) )
>>>>> +            return false;
>>>>> +
>>>>> +        /* set all valid bits in MSR_IA32_EVENTSELx */
>>>>> +        idx = MSR_IA32_PERFEVTSEL(i);
>>>>> +        wval = ((1ull << 32) - 1) ^
>>>>> (IA32_PERFEVENTSELx_ENABLE_ANYTHREAD |
>>>>> +                IA32_PERFEVENTSELx_ENABLE_PCB);
>>>> What is wrong with Pin Control in this register? Architecturally, 
>>>> it is
>>>> a software configurable bit.
>>> Look at this patch on xen-devel
>>>
>>> [PATCH] Fix hypervisor crash when writing to VPMU MSR
>>>
>>> I found that out while writing this test.
>> Welcome to the very grey areas of processors which XTF has a habit of
>> finding.  (I really need to correlate all the suspected errata in the
>> doxygen comments).
>>
>> Obviously, Xen shouldn't crash, but we are going to need feedback from
>> Intel before working out how to proceed.
>>
>>>>> +    /* test IA32_DEBUGCTL */
>>>>> +    idx = MSR_IA32_DEBUGCTL;
>>>>> +
>>>>> +    /* Test IA32_DEBUGCTL facilities enabled by v2 */
>>>>> +    wval = IA32_DEBUGCTL_Freeze_LBR_ON_PMI |
>>>>> IA32_DEBUGCTL_Freeze_PerfMon_On_PMI;
>>>>> +
>>>>> +    /* FIXME: This should really be a valid write but it isnt
>>>>> supported by the
>>>>> +     * VPMU yet */
>>>> In which case the test should be correct here, and highlight that 
>>>> there
>>>> is a bug in Xen.
>>> But then, should the main test return xtf_success, xtf_skip or
>>> xtf_failure?
>> Failure.  Xen is provably malfunctioning.
> Fixed in v5. The tests now fail.
>>
>>>>> +    /* Get perf capabilties */
>>>>> +    if ( rdmsr_safe(MSR_IA32_PERF_CAPABILITIES, &caps) )
>>>>> +    {
>>>>> +        printk("Fault while reading MSR_IA32_PERF_CAPABILITIES\n");
>>>> The PERF_CAPABILITIES MSR is only valid to read if PDCM is 
>>>> advertised in
>>>> CPUID.
>>>>
>>>> We currently never advertise PCDM, and is a bug that
>>>> MSR_PERF_CAPABILITIES is accessable at all in guest context.
>>>> (specifically, it is because Xen leaks almost all host MSR state into
>>>> guests).
>>> So I suppose I can't test full-bit width writes then?
>> Well, the code should be architecturally correct to start with. After
>> that, it is reasonable to explain that Xen used to leak
>> PERF_CAPABILITIES into guests view, at which point probing it is a
>> legitimate thing to do.
>>
>> However, you should take care that your logic doesn't begin to fail at
>> the point that Xen is fixed to behave architecturally (and actually
>> return #GP[0] if PDCM isn't advertised to the guest), because I will
>> definitely be fixing the MSR leakage problems (if noone beats me to it).
> Fixed in v5 ( I think!)
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    if ( !(caps >> 13) & 1 )
>>>> This lacks sufficient brackets for what you are intending to do.  
>>>> Also,
>>>> it looks like you probably want a #define at the top of the file for
>>>> this constant.
>>> ! and & have the same precedence and right to left associativity. So
>>> technically it should
>>> do what it is intended to do. But I will make it look better.
>> Logical NOT and Address-of have the same precedence.  Logical NOT has a
>> higher precedence than Bitwise AND.
>>
>> And to resolve any doubt, here is what Clang things:
>>
>> andrewcoop@andrewcoop:/local/xen-test-framework.git$ make CC=clang-4.0
>> -j4 -s
>> main.c:396:10: error: logical not is only applied to the left hand side
>> of this bitwise operator [-Werror,-Wlogical-not-parentheses]
>>      if ( !(caps >> 13) & 1 )
>>           ^             ~
> Hard to argue that. Fixed in v5.
>>>>> +
>>>>> +        case 3:
>>>>> +                /* test version 3 facilities */
>>>>> +                if ( !test_intel_pmu_ver3( ngregs, nfregs) )
>>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>>> 3\n");
>>>>> +                /* fall through */
>>>>> +
>>>>> +        case 2:
>>>>> +                /* test version 2 facilities */
>>>>> +                if ( !test_intel_pmu_ver2(ngregs, nfregs) )
>>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>>> 2\n");
>>>>> +
>>>>> +                /* test version 1 facilities */
>>>>> +                if ( !test_intel_pmu_ver1(ngregs) )
>>>>> +                    return xtf_failure("Fail: Failed VPMU version
>>>>> 1\n");
>>>>> +
>>>>> +                /* test full width counters */
>>>>> +                if ( !test_full_width_cnts(ngregs, ngbits) )
>>>>> +                    return xtf_failure("Fail: Failed full width
>>>>> test\n");
>>>>> +
>>>>> +                break;
>>>>> +
>>>>> +        case 1:
>>>>> +                /* version 1 unsupported */
>>>> You have a v1 test function, yet it is unsupported?
>>> Yes, from VPMUs perspective, v1 in itself is unsupported because that
>>> would
>>> have required disabling access to PERF_GLOBAL_STATUS and
>>> PERF_GLOBAL_CTRL MSRs.
>>> However, each version does support facilities introduced in the
>>> previous version.
>>> Thus, test_intel_pmu_ver1() is intended for v2 testing and helps
>>> breaking the
>>> test_intel_pmu_ver2()  function into facilities introduced by the
>>> architecture in
>>> two separate versions - v1 and v2
>> You are logically confusing Xen's reasoning for supporting different
>> versions of hardware PMU, with the version it decides to advertise to
>> guests.
>>
>> It is certainly reasonable for the vPMU driver in Xen to decide that,
>> without PMU v2, it is unacceptably complicated to deal with context
>> switching the counter state.  However, this has no bearing on which
>> version of PMU is emulated to the guest, and it would certainly should
>> be possible to configure Xen to only advertise v1 to guests, while the
>> real hardware actually supports v2.  (After all, we already downgrade v4
>> to v3 in a guests view).
> Yes, I think I understand what you are saying. I still do stand by the 
> decision of  having a
> separate function to test v1 facilities for -
> A) modularity
> B) Localizing the knowledge of xen internals to test_main() function. 
> The rest of the code
> is written to test the facilities  as described in the SDM.
>> On another note about the MSR leakage, the v$N test case should ideally
>> check that none of the v$N+1 capabilities/features are leaked into the
>> guest, but this smells like a similar quantity of "redoing it properly
>> the 2nd time around" as my attempts to fix CPUID handling in Xen.
> Maybe something that we can add in a future patch?
>> ~Andrew
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
>

[v3,2/2,XTF] xtf/vpmu: MSR read/write tests for VPMU

Commit Message

Comments

Patch