Message ID | 20210809131057.1694145-7-danielhb413@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | PMU-EBB support for PPC64 TCG | expand |
On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: > So far the PMU logic was using PMC5 for instruction counting (linux > kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using > PMCs 1-4. > > Let's enable all PMCs to count these 2 events we already provide. The > logic used to calculate PMC5 is now being provided by > update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in > update_PMC_PM_CYC(). > > The enablement of these 2 events for all PMUs are done by using the > Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e > for PM_CYC, I'm confused by this. Surely the specific values here should be defined by the hardware, not by Linux. > all of those defined by specific bits in MMCR1 for each PMC. > PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and > PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of > MMCR1 setup. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > --- > target/ppc/cpu.h | 8 +++++ > target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- > 2 files changed, 65 insertions(+), 3 deletions(-) > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h > index 8cea8f2aca..afd9cd402b 100644 > --- a/target/ppc/cpu.h > +++ b/target/ppc/cpu.h > @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { > #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ > #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ > > +#define MMCR1_PMC1SEL_SHIFT (63 - 39) > +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) > +#define MMCR1_PMC2SEL_SHIFT (63 - 47) > +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) > +#define MMCR1_PMC3SEL_SHIFT (63 - 55) > +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) > +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) > + > /* LPCR bits */ > #define LPCR_VPM0 PPC_BIT(0) > #define LPCR_VPM1 PPC_BIT(1) > diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c > index 0994531f65..99e62bd37b 100644 > --- a/target/ppc/pmu_book3s_helper.c > +++ b/target/ppc/pmu_book3s_helper.c > @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) > return insns * 4; > } > > +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, > + uint64_t curr_icount) > +{ > + env->spr[sprn] += curr_icount - env->pmu_base_icount; > +} > + > +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, > + uint64_t curr_icount) > +{ > + uint64_t insns = curr_icount - env->pmu_base_icount; > + env->spr[sprn] += get_cycles(insns); > +} > + > +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, > + uint64_t curr_icount) > +{ > + int event; > + > + switch (sprn) { > + case SPR_POWER_PMC1: > + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; > + event = event >> MMCR1_PMC1SEL_SHIFT; > + break; > + case SPR_POWER_PMC2: > + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; > + event = event >> MMCR1_PMC2SEL_SHIFT; > + break; > + case SPR_POWER_PMC3: > + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; > + event = event >> MMCR1_PMC3SEL_SHIFT; > + break; > + case SPR_POWER_PMC4: > + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; > + break; > + default: > + return; > + } > + > + switch (event) { > + case 0x2: > + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); > + break; > + case 0x1E: > + update_PMC_PM_CYC(env, sprn, curr_icount); > + break; > + default: > + return; > + } > +} > + > /* > * Set all PMCs values after a PMU freeze via MMCR0_FC. > * > @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) > static void update_PMCs_on_freeze(CPUPPCState *env) > { > uint64_t curr_icount = get_insns(); > + int sprn; > + > + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { > + update_programmable_PMC_reg(env, sprn, curr_icount); > + } > > - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; > - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - > - env->pmu_base_icount); > + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); > + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); > } > > void helper_store_mmcr0(CPUPPCState *env, target_ulong value)
On 8/10/21 12:42 AM, David Gibson wrote: > On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: >> So far the PMU logic was using PMC5 for instruction counting (linux >> kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using >> PMCs 1-4. >> >> Let's enable all PMCs to count these 2 events we already provide. The >> logic used to calculate PMC5 is now being provided by >> update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in >> update_PMC_PM_CYC(). >> >> The enablement of these 2 events for all PMUs are done by using the >> Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e >> for PM_CYC, > > I'm confused by this. Surely the specific values here should be > defined by the hardware, not by Linux. The hardware/PowerISA defines these events as follows for all counters: 00 Disable events. (No events occur.) 01-BF Implementation-dependent C0-DF Reserved And then hardware events defined by the ISA goes from E0 to FF. Each counter has a different set of hardware defined events in this range. The Linux perf driver defines some events in the 'Implementation-dependent' area, allowing for events codes such as '0x02' to count instructions completed in PMC1 even though this event is not defined in the ISA for this PMC. I am assuming that the real hardware - at least the ones that IBM produces - does this mapping internally. I'll ask around to see if I find out whether it's the HW or some part of the Perf subsystem that I don't comprehend that are doing it. I am not particularly happy about having to rely on 'implementation-dependent' fields that are defined by the Perf subsystem of Linux in the emulator code that should be OS-agnostic. Unfortunately, I didn't find any alternative to make the kernel operate this PMU implementation other than baking these event codes into the PMU logic. Thanks, Daniel > >> all of those defined by specific bits in MMCR1 for each PMC. >> PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and >> PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of >> MMCR1 setup. >> >> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >> --- >> target/ppc/cpu.h | 8 +++++ >> target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- >> 2 files changed, 65 insertions(+), 3 deletions(-) >> >> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h >> index 8cea8f2aca..afd9cd402b 100644 >> --- a/target/ppc/cpu.h >> +++ b/target/ppc/cpu.h >> @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { >> #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ >> #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ >> >> +#define MMCR1_PMC1SEL_SHIFT (63 - 39) >> +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) >> +#define MMCR1_PMC2SEL_SHIFT (63 - 47) >> +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) >> +#define MMCR1_PMC3SEL_SHIFT (63 - 55) >> +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) >> +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) >> + >> /* LPCR bits */ >> #define LPCR_VPM0 PPC_BIT(0) >> #define LPCR_VPM1 PPC_BIT(1) >> diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c >> index 0994531f65..99e62bd37b 100644 >> --- a/target/ppc/pmu_book3s_helper.c >> +++ b/target/ppc/pmu_book3s_helper.c >> @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) >> return insns * 4; >> } >> >> +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, >> + uint64_t curr_icount) >> +{ >> + env->spr[sprn] += curr_icount - env->pmu_base_icount; >> +} >> + >> +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, >> + uint64_t curr_icount) >> +{ >> + uint64_t insns = curr_icount - env->pmu_base_icount; >> + env->spr[sprn] += get_cycles(insns); >> +} >> + >> +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, >> + uint64_t curr_icount) >> +{ >> + int event; >> + >> + switch (sprn) { >> + case SPR_POWER_PMC1: >> + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; >> + event = event >> MMCR1_PMC1SEL_SHIFT; >> + break; >> + case SPR_POWER_PMC2: >> + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; >> + event = event >> MMCR1_PMC2SEL_SHIFT; >> + break; >> + case SPR_POWER_PMC3: >> + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; >> + event = event >> MMCR1_PMC3SEL_SHIFT; >> + break; >> + case SPR_POWER_PMC4: >> + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; >> + break; >> + default: >> + return; >> + } >> + >> + switch (event) { >> + case 0x2: >> + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); >> + break; >> + case 0x1E: >> + update_PMC_PM_CYC(env, sprn, curr_icount); >> + break; >> + default: >> + return; >> + } >> +} >> + >> /* >> * Set all PMCs values after a PMU freeze via MMCR0_FC. >> * >> @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) >> static void update_PMCs_on_freeze(CPUPPCState *env) >> { >> uint64_t curr_icount = get_insns(); >> + int sprn; >> + >> + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { >> + update_programmable_PMC_reg(env, sprn, curr_icount); >> + } >> >> - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; >> - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - >> - env->pmu_base_icount); >> + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); >> + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); >> } >> >> void helper_store_mmcr0(CPUPPCState *env, target_ulong value) >
On 8/10/21 12:03 PM, Daniel Henrique Barboza wrote: > > > On 8/10/21 12:42 AM, David Gibson wrote: >> On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: >>> So far the PMU logic was using PMC5 for instruction counting (linux >>> kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using >>> PMCs 1-4. >>> >>> Let's enable all PMCs to count these 2 events we already provide. The >>> logic used to calculate PMC5 is now being provided by >>> update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in >>> update_PMC_PM_CYC(). >>> >>> The enablement of these 2 events for all PMUs are done by using the >>> Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e >>> for PM_CYC, >> >> I'm confused by this. Surely the specific values here should be >> defined by the hardware, not by Linux. > > The hardware/PowerISA defines these events as follows for all counters: > > 00 Disable events. (No events occur.) > 01-BF Implementation-dependent > C0-DF Reserved > > And then hardware events defined by the ISA goes from E0 to FF. Each counter > has a different set of hardware defined events in this range. > > The Linux perf driver defines some events in the 'Implementation-dependent' > area, allowing for events codes such as '0x02' to count instructions > completed in PMC1 even though this event is not defined in the ISA for this > PMC. I am assuming that the real hardware - at least the ones that IBM > produces - does this mapping internally. I'll ask around to see if I find > out whether it's the HW or some part of the Perf subsystem that I don't > comprehend that are doing it. The kernel guys confirmed that the HW is aware of the implementantion-dependent Perf event codes that the Linux kernel uses. Also, by reading the kernel code it is safe to say that this is true since Power7 at least. What we can do here to play nicer with other non-IBM PowerPC chips, that might not have the same implementation-dependent Perf events in the hardware, is to make this mapping only for emulation of IBM processors. That way we can emulate these events that IBM PMU implements while not making any assumptions for every other PowerPC chip that implements Book3s. Thanks, Daniel > > > I am not particularly happy about having to rely on 'implementation-dependent' > fields that are defined by the Perf subsystem of Linux in the emulator > code that should be OS-agnostic. Unfortunately, I didn't find any alternative > to make the kernel operate this PMU implementation other than baking these > event codes into the PMU logic. > > > Thanks, > > > Daniel > > >> >>> all of those defined by specific bits in MMCR1 for each PMC. >>> PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and >>> PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of >>> MMCR1 setup. >>> >>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >>> --- >>> target/ppc/cpu.h | 8 +++++ >>> target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- >>> 2 files changed, 65 insertions(+), 3 deletions(-) >>> >>> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h >>> index 8cea8f2aca..afd9cd402b 100644 >>> --- a/target/ppc/cpu.h >>> +++ b/target/ppc/cpu.h >>> @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { >>> #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ >>> #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ >>> +#define MMCR1_PMC1SEL_SHIFT (63 - 39) >>> +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) >>> +#define MMCR1_PMC2SEL_SHIFT (63 - 47) >>> +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) >>> +#define MMCR1_PMC3SEL_SHIFT (63 - 55) >>> +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) >>> +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) >>> + >>> /* LPCR bits */ >>> #define LPCR_VPM0 PPC_BIT(0) >>> #define LPCR_VPM1 PPC_BIT(1) >>> diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c >>> index 0994531f65..99e62bd37b 100644 >>> --- a/target/ppc/pmu_book3s_helper.c >>> +++ b/target/ppc/pmu_book3s_helper.c >>> @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) >>> return insns * 4; >>> } >>> +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, >>> + uint64_t curr_icount) >>> +{ >>> + env->spr[sprn] += curr_icount - env->pmu_base_icount; >>> +} >>> + >>> +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, >>> + uint64_t curr_icount) >>> +{ >>> + uint64_t insns = curr_icount - env->pmu_base_icount; >>> + env->spr[sprn] += get_cycles(insns); >>> +} >>> + >>> +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, >>> + uint64_t curr_icount) >>> +{ >>> + int event; >>> + >>> + switch (sprn) { >>> + case SPR_POWER_PMC1: >>> + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; >>> + event = event >> MMCR1_PMC1SEL_SHIFT; >>> + break; >>> + case SPR_POWER_PMC2: >>> + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; >>> + event = event >> MMCR1_PMC2SEL_SHIFT; >>> + break; >>> + case SPR_POWER_PMC3: >>> + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; >>> + event = event >> MMCR1_PMC3SEL_SHIFT; >>> + break; >>> + case SPR_POWER_PMC4: >>> + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; >>> + break; >>> + default: >>> + return; >>> + } >>> + >>> + switch (event) { >>> + case 0x2: >>> + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); >>> + break; >>> + case 0x1E: >>> + update_PMC_PM_CYC(env, sprn, curr_icount); >>> + break; >>> + default: >>> + return; >>> + } >>> +} >>> + >>> /* >>> * Set all PMCs values after a PMU freeze via MMCR0_FC. >>> * >>> @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) >>> static void update_PMCs_on_freeze(CPUPPCState *env) >>> { >>> uint64_t curr_icount = get_insns(); >>> + int sprn; >>> + >>> + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { >>> + update_programmable_PMC_reg(env, sprn, curr_icount); >>> + } >>> - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; >>> - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - >>> - env->pmu_base_icount); >>> + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); >>> + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); >>> } >>> void helper_store_mmcr0(CPUPPCState *env, target_ulong value) >>
On Tue, Aug 10, 2021 at 12:03:15PM -0300, Daniel Henrique Barboza wrote: > > > On 8/10/21 12:42 AM, David Gibson wrote: > > On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: > > > So far the PMU logic was using PMC5 for instruction counting (linux > > > kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using > > > PMCs 1-4. > > > > > > Let's enable all PMCs to count these 2 events we already provide. The > > > logic used to calculate PMC5 is now being provided by > > > update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in > > > update_PMC_PM_CYC(). > > > > > > The enablement of these 2 events for all PMUs are done by using the > > > Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e > > > for PM_CYC, > > > > I'm confused by this. Surely the specific values here should be > > defined by the hardware, not by Linux. > > The hardware/PowerISA defines these events as follows for all counters: > > 00 Disable events. (No events occur.) > 01-BF Implementation-dependent > C0-DF Reserved > > And then hardware events defined by the ISA goes from E0 to FF. Each counter > has a different set of hardware defined events in this range. > > The Linux perf driver defines some events in the 'Implementation-dependent' > area, allowing for events codes such as '0x02' to count instructions > completed in PMC1 even though this event is not defined in the ISA for this > PMC. I am assuming that the real hardware - at least the ones that IBM > produces - does this mapping internally. I'll ask around to see if I find > out whether it's the HW or some part of the Perf subsystem that I don't > comprehend that are doing it. > > > I am not particularly happy about having to rely on 'implementation-dependent' > fields that are defined by the Perf subsystem of Linux in the emulator > code that should be OS-agnostic. Unfortunately, I didn't find any alternative > to make the kernel operate this PMU implementation other than baking these > event codes into the PMU logic. Ok, so they are hardware defined, just not architecture defined, IIUC. I think that just needs a change to the description - and making it more explicit that this is the power9 (?) PMU you're modelling specifically not the (barely) architected "book3s" PMU. > > > Thanks, > > > Daniel > > > > > > > all of those defined by specific bits in MMCR1 for each PMC. > > > PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and > > > PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of > > > MMCR1 setup. > > > > > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > > --- > > > target/ppc/cpu.h | 8 +++++ > > > target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- > > > 2 files changed, 65 insertions(+), 3 deletions(-) > > > > > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h > > > index 8cea8f2aca..afd9cd402b 100644 > > > --- a/target/ppc/cpu.h > > > +++ b/target/ppc/cpu.h > > > @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { > > > #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ > > > #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ > > > +#define MMCR1_PMC1SEL_SHIFT (63 - 39) > > > +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) > > > +#define MMCR1_PMC2SEL_SHIFT (63 - 47) > > > +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) > > > +#define MMCR1_PMC3SEL_SHIFT (63 - 55) > > > +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) > > > +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) > > > + > > > /* LPCR bits */ > > > #define LPCR_VPM0 PPC_BIT(0) > > > #define LPCR_VPM1 PPC_BIT(1) > > > diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c > > > index 0994531f65..99e62bd37b 100644 > > > --- a/target/ppc/pmu_book3s_helper.c > > > +++ b/target/ppc/pmu_book3s_helper.c > > > @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) > > > return insns * 4; > > > } > > > +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, > > > + uint64_t curr_icount) > > > +{ > > > + env->spr[sprn] += curr_icount - env->pmu_base_icount; > > > +} > > > + > > > +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, > > > + uint64_t curr_icount) > > > +{ > > > + uint64_t insns = curr_icount - env->pmu_base_icount; > > > + env->spr[sprn] += get_cycles(insns); > > > +} > > > + > > > +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, > > > + uint64_t curr_icount) > > > +{ > > > + int event; > > > + > > > + switch (sprn) { > > > + case SPR_POWER_PMC1: > > > + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; > > > + event = event >> MMCR1_PMC1SEL_SHIFT; > > > + break; > > > + case SPR_POWER_PMC2: > > > + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; > > > + event = event >> MMCR1_PMC2SEL_SHIFT; > > > + break; > > > + case SPR_POWER_PMC3: > > > + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; > > > + event = event >> MMCR1_PMC3SEL_SHIFT; > > > + break; > > > + case SPR_POWER_PMC4: > > > + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; > > > + break; > > > + default: > > > + return; > > > + } > > > + > > > + switch (event) { > > > + case 0x2: > > > + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); > > > + break; > > > + case 0x1E: > > > + update_PMC_PM_CYC(env, sprn, curr_icount); > > > + break; > > > + default: > > > + return; > > > + } > > > +} > > > + > > > /* > > > * Set all PMCs values after a PMU freeze via MMCR0_FC. > > > * > > > @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) > > > static void update_PMCs_on_freeze(CPUPPCState *env) > > > { > > > uint64_t curr_icount = get_insns(); > > > + int sprn; > > > + > > > + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { > > > + update_programmable_PMC_reg(env, sprn, curr_icount); > > > + } > > > - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; > > > - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - > > > - env->pmu_base_icount); > > > + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); > > > + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); > > > } > > > void helper_store_mmcr0(CPUPPCState *env, target_ulong value) > > >
On 8/10/21 8:08 PM, Daniel Henrique Barboza wrote: > > > On 8/10/21 12:03 PM, Daniel Henrique Barboza wrote: >> >> >> On 8/10/21 12:42 AM, David Gibson wrote: >>> On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: >>>> So far the PMU logic was using PMC5 for instruction counting (linux >>>> kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using >>>> PMCs 1-4. >>>> >>>> Let's enable all PMCs to count these 2 events we already provide. The >>>> logic used to calculate PMC5 is now being provided by >>>> update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in >>>> update_PMC_PM_CYC(). >>>> >>>> The enablement of these 2 events for all PMUs are done by using the >>>> Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e >>>> for PM_CYC, >>> >>> I'm confused by this. Surely the specific values here should be >>> defined by the hardware, not by Linux. >> >> The hardware/PowerISA defines these events as follows for all counters: >> >> 00 Disable events. (No events occur.) >> 01-BF Implementation-dependent >> C0-DF Reserved >> >> And then hardware events defined by the ISA goes from E0 to FF. Each counter >> has a different set of hardware defined events in this range. >> >> The Linux perf driver defines some events in the 'Implementation-dependent' >> area, allowing for events codes such as '0x02' to count instructions >> completed in PMC1 even though this event is not defined in the ISA for this >> PMC. I am assuming that the real hardware - at least the ones that IBM >> produces - does this mapping internally. I'll ask around to see if I find >> out whether it's the HW or some part of the Perf subsystem that I don't >> comprehend that are doing it. > > The kernel guys confirmed that the HW is aware of the implementantion-dependent > Perf event codes that the Linux kernel uses. Also, by reading the kernel code it > is safe to say that this is true since Power7 at least. > > What we can do here to play nicer with other non-IBM PowerPC chips, that might > not have the same implementation-dependent Perf events in the hardware, is to make > this mapping only for emulation of IBM processors. That way we can emulate these > events that IBM PMU implements while not making any assumptions for every other > PowerPC chip that implements Book3s. Scratch that. I got told by the kernel folks that, starting in v5.14, the generic-compat-pmu events are being calculated by using the architected ISA events only. They did that to not rely on implementation-dependent events in the Perf subsystem. What I'll attempt here is implement some architected events (cycles, instructions and a small variant of those 2 that uses the run latch) and see how the PMU behaves in the selftests. Daniel > > > Thanks, > > > Daniel > > >> >> >> I am not particularly happy about having to rely on 'implementation-dependent' >> fields that are defined by the Perf subsystem of Linux in the emulator >> code that should be OS-agnostic. Unfortunately, I didn't find any alternative >> to make the kernel operate this PMU implementation other than baking these >> event codes into the PMU logic. >> >> >> Thanks, >> >> >> Daniel >> >> >>> >>>> all of those defined by specific bits in MMCR1 for each PMC. >>>> PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and >>>> PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of >>>> MMCR1 setup. >>>> >>>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >>>> --- >>>> target/ppc/cpu.h | 8 +++++ >>>> target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- >>>> 2 files changed, 65 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h >>>> index 8cea8f2aca..afd9cd402b 100644 >>>> --- a/target/ppc/cpu.h >>>> +++ b/target/ppc/cpu.h >>>> @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { >>>> #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ >>>> #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ >>>> +#define MMCR1_PMC1SEL_SHIFT (63 - 39) >>>> +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) >>>> +#define MMCR1_PMC2SEL_SHIFT (63 - 47) >>>> +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) >>>> +#define MMCR1_PMC3SEL_SHIFT (63 - 55) >>>> +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) >>>> +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) >>>> + >>>> /* LPCR bits */ >>>> #define LPCR_VPM0 PPC_BIT(0) >>>> #define LPCR_VPM1 PPC_BIT(1) >>>> diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c >>>> index 0994531f65..99e62bd37b 100644 >>>> --- a/target/ppc/pmu_book3s_helper.c >>>> +++ b/target/ppc/pmu_book3s_helper.c >>>> @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) >>>> return insns * 4; >>>> } >>>> +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, >>>> + uint64_t curr_icount) >>>> +{ >>>> + env->spr[sprn] += curr_icount - env->pmu_base_icount; >>>> +} >>>> + >>>> +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, >>>> + uint64_t curr_icount) >>>> +{ >>>> + uint64_t insns = curr_icount - env->pmu_base_icount; >>>> + env->spr[sprn] += get_cycles(insns); >>>> +} >>>> + >>>> +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, >>>> + uint64_t curr_icount) >>>> +{ >>>> + int event; >>>> + >>>> + switch (sprn) { >>>> + case SPR_POWER_PMC1: >>>> + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; >>>> + event = event >> MMCR1_PMC1SEL_SHIFT; >>>> + break; >>>> + case SPR_POWER_PMC2: >>>> + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; >>>> + event = event >> MMCR1_PMC2SEL_SHIFT; >>>> + break; >>>> + case SPR_POWER_PMC3: >>>> + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; >>>> + event = event >> MMCR1_PMC3SEL_SHIFT; >>>> + break; >>>> + case SPR_POWER_PMC4: >>>> + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; >>>> + break; >>>> + default: >>>> + return; >>>> + } >>>> + >>>> + switch (event) { >>>> + case 0x2: >>>> + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); >>>> + break; >>>> + case 0x1E: >>>> + update_PMC_PM_CYC(env, sprn, curr_icount); >>>> + break; >>>> + default: >>>> + return; >>>> + } >>>> +} >>>> + >>>> /* >>>> * Set all PMCs values after a PMU freeze via MMCR0_FC. >>>> * >>>> @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) >>>> static void update_PMCs_on_freeze(CPUPPCState *env) >>>> { >>>> uint64_t curr_icount = get_insns(); >>>> + int sprn; >>>> + >>>> + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { >>>> + update_programmable_PMC_reg(env, sprn, curr_icount); >>>> + } >>>> - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; >>>> - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - >>>> - env->pmu_base_icount); >>>> + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); >>>> + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); >>>> } >>>> void helper_store_mmcr0(CPUPPCState *env, target_ulong value) >>>
On Tue, Aug 10, 2021 at 08:08:04PM -0300, Daniel Henrique Barboza wrote: > > > On 8/10/21 12:03 PM, Daniel Henrique Barboza wrote: > > > > > > On 8/10/21 12:42 AM, David Gibson wrote: > > > On Mon, Aug 09, 2021 at 10:10:44AM -0300, Daniel Henrique Barboza wrote: > > > > So far the PMU logic was using PMC5 for instruction counting (linux > > > > kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using > > > > PMCs 1-4. > > > > > > > > Let's enable all PMCs to count these 2 events we already provide. The > > > > logic used to calculate PMC5 is now being provided by > > > > update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in > > > > update_PMC_PM_CYC(). > > > > > > > > The enablement of these 2 events for all PMUs are done by using the > > > > Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e > > > > for PM_CYC, > > > > > > I'm confused by this. Surely the specific values here should be > > > defined by the hardware, not by Linux. > > > > The hardware/PowerISA defines these events as follows for all counters: > > > > 00 Disable events. (No events occur.) > > 01-BF Implementation-dependent > > C0-DF Reserved > > > > And then hardware events defined by the ISA goes from E0 to FF. Each counter > > has a different set of hardware defined events in this range. > > > > The Linux perf driver defines some events in the 'Implementation-dependent' > > area, allowing for events codes such as '0x02' to count instructions > > completed in PMC1 even though this event is not defined in the ISA for this > > PMC. I am assuming that the real hardware - at least the ones that IBM > > produces - does this mapping internally. I'll ask around to see if I find > > out whether it's the HW or some part of the Perf subsystem that I don't > > comprehend that are doing it. > > The kernel guys confirmed that the HW is aware of the implementantion-dependent > Perf event codes that the Linux kernel uses. Also, by reading the kernel code it > is safe to say that this is true since Power7 at least. Ok. I'm pretty sure POWER6 and POWER5 have totally different PMUs though. So best to be explicit that this is the POWER7 (and later compatible chips) PMU model you're building here. > What we can do here to play nicer with other non-IBM PowerPC chips, that might > not have the same implementation-dependent Perf events in the hardware, is to make > this mapping only for emulation of IBM processors. Well.. I'm not sure there are any other chips claiming to implement the architecture (at least recent versions). But for the broad swath of things that might be considered book3s it's much worse than that: some of the POWER generations have completely different PMU designs. It's not just different event values, but different numbers of PMCs, different controls in the MMCRs different all sorts of things. > That way we can emulate these > events that IBM PMU implements while not making any assumptions for every other > PowerPC chip that implements Book3s. > > > Thanks, > > > Daniel > > > > > > > > I am not particularly happy about having to rely on 'implementation-dependent' > > fields that are defined by the Perf subsystem of Linux in the emulator > > code that should be OS-agnostic. Unfortunately, I didn't find any alternative > > to make the kernel operate this PMU implementation other than baking these > > event codes into the PMU logic. > > > > > > Thanks, > > > > > > Daniel > > > > > > > > > > > all of those defined by specific bits in MMCR1 for each PMC. > > > > PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and > > > > PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of > > > > MMCR1 setup. > > > > > > > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > > > --- > > > > target/ppc/cpu.h | 8 +++++ > > > > target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- > > > > 2 files changed, 65 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h > > > > index 8cea8f2aca..afd9cd402b 100644 > > > > --- a/target/ppc/cpu.h > > > > +++ b/target/ppc/cpu.h > > > > @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { > > > > #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ > > > > #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ > > > > +#define MMCR1_PMC1SEL_SHIFT (63 - 39) > > > > +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) > > > > +#define MMCR1_PMC2SEL_SHIFT (63 - 47) > > > > +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) > > > > +#define MMCR1_PMC3SEL_SHIFT (63 - 55) > > > > +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) > > > > +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) > > > > + > > > > /* LPCR bits */ > > > > #define LPCR_VPM0 PPC_BIT(0) > > > > #define LPCR_VPM1 PPC_BIT(1) > > > > diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c > > > > index 0994531f65..99e62bd37b 100644 > > > > --- a/target/ppc/pmu_book3s_helper.c > > > > +++ b/target/ppc/pmu_book3s_helper.c > > > > @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) > > > > return insns * 4; > > > > } > > > > +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, > > > > + uint64_t curr_icount) > > > > +{ > > > > + env->spr[sprn] += curr_icount - env->pmu_base_icount; > > > > +} > > > > + > > > > +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, > > > > + uint64_t curr_icount) > > > > +{ > > > > + uint64_t insns = curr_icount - env->pmu_base_icount; > > > > + env->spr[sprn] += get_cycles(insns); > > > > +} > > > > + > > > > +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, > > > > + uint64_t curr_icount) > > > > +{ > > > > + int event; > > > > + > > > > + switch (sprn) { > > > > + case SPR_POWER_PMC1: > > > > + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; > > > > + event = event >> MMCR1_PMC1SEL_SHIFT; > > > > + break; > > > > + case SPR_POWER_PMC2: > > > > + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; > > > > + event = event >> MMCR1_PMC2SEL_SHIFT; > > > > + break; > > > > + case SPR_POWER_PMC3: > > > > + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; > > > > + event = event >> MMCR1_PMC3SEL_SHIFT; > > > > + break; > > > > + case SPR_POWER_PMC4: > > > > + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; > > > > + break; > > > > + default: > > > > + return; > > > > + } > > > > + > > > > + switch (event) { > > > > + case 0x2: > > > > + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); > > > > + break; > > > > + case 0x1E: > > > > + update_PMC_PM_CYC(env, sprn, curr_icount); > > > > + break; > > > > + default: > > > > + return; > > > > + } > > > > +} > > > > + > > > > /* > > > > * Set all PMCs values after a PMU freeze via MMCR0_FC. > > > > * > > > > @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) > > > > static void update_PMCs_on_freeze(CPUPPCState *env) > > > > { > > > > uint64_t curr_icount = get_insns(); > > > > + int sprn; > > > > + > > > > + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { > > > > + update_programmable_PMC_reg(env, sprn, curr_icount); > > > > + } > > > > - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; > > > > - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - > > > > - env->pmu_base_icount); > > > > + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); > > > > + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); > > > > } > > > > void helper_store_mmcr0(CPUPPCState *env, target_ulong value) > > > >
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 8cea8f2aca..afd9cd402b 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -350,6 +350,14 @@ typedef struct ppc_v3_pate_t { #define MMCR0_FCECE PPC_BIT(38) /* FC on Enabled Cond or Event */ #define MMCR0_PMCC PPC_BITMASK(44, 45) /* PMC Control */ +#define MMCR1_PMC1SEL_SHIFT (63 - 39) +#define MMCR1_PMC1SEL PPC_BITMASK(32, 39) +#define MMCR1_PMC2SEL_SHIFT (63 - 47) +#define MMCR1_PMC2SEL PPC_BITMASK(40, 47) +#define MMCR1_PMC3SEL_SHIFT (63 - 55) +#define MMCR1_PMC3SEL PPC_BITMASK(48, 55) +#define MMCR1_PMC4SEL PPC_BITMASK(56, 63) + /* LPCR bits */ #define LPCR_VPM0 PPC_BIT(0) #define LPCR_VPM1 PPC_BIT(1) diff --git a/target/ppc/pmu_book3s_helper.c b/target/ppc/pmu_book3s_helper.c index 0994531f65..99e62bd37b 100644 --- a/target/ppc/pmu_book3s_helper.c +++ b/target/ppc/pmu_book3s_helper.c @@ -28,6 +28,56 @@ static uint64_t get_cycles(uint64_t insns) return insns * 4; } +static void update_PMC_PM_INST_CMPL(CPUPPCState *env, int sprn, + uint64_t curr_icount) +{ + env->spr[sprn] += curr_icount - env->pmu_base_icount; +} + +static void update_PMC_PM_CYC(CPUPPCState *env, int sprn, + uint64_t curr_icount) +{ + uint64_t insns = curr_icount - env->pmu_base_icount; + env->spr[sprn] += get_cycles(insns); +} + +static void update_programmable_PMC_reg(CPUPPCState *env, int sprn, + uint64_t curr_icount) +{ + int event; + + switch (sprn) { + case SPR_POWER_PMC1: + event = MMCR1_PMC1SEL & env->spr[SPR_POWER_MMCR1]; + event = event >> MMCR1_PMC1SEL_SHIFT; + break; + case SPR_POWER_PMC2: + event = MMCR1_PMC2SEL & env->spr[SPR_POWER_MMCR1]; + event = event >> MMCR1_PMC2SEL_SHIFT; + break; + case SPR_POWER_PMC3: + event = MMCR1_PMC3SEL & env->spr[SPR_POWER_MMCR1]; + event = event >> MMCR1_PMC3SEL_SHIFT; + break; + case SPR_POWER_PMC4: + event = MMCR1_PMC4SEL & env->spr[SPR_POWER_MMCR1]; + break; + default: + return; + } + + switch (event) { + case 0x2: + update_PMC_PM_INST_CMPL(env, sprn, curr_icount); + break; + case 0x1E: + update_PMC_PM_CYC(env, sprn, curr_icount); + break; + default: + return; + } +} + /* * Set all PMCs values after a PMU freeze via MMCR0_FC. * @@ -37,10 +87,14 @@ static uint64_t get_cycles(uint64_t insns) static void update_PMCs_on_freeze(CPUPPCState *env) { uint64_t curr_icount = get_insns(); + int sprn; + + for (sprn = SPR_POWER_PMC1; sprn < SPR_POWER_PMC5; sprn++) { + update_programmable_PMC_reg(env, sprn, curr_icount); + } - env->spr[SPR_POWER_PMC5] += curr_icount - env->pmu_base_icount; - env->spr[SPR_POWER_PMC6] += get_cycles(curr_icount - - env->pmu_base_icount); + update_PMC_PM_INST_CMPL(env, SPR_POWER_PMC5, curr_icount); + update_PMC_PM_CYC(env, SPR_POWER_PMC6, curr_icount); } void helper_store_mmcr0(CPUPPCState *env, target_ulong value)
So far the PMU logic was using PMC5 for instruction counting (linux kernel PM_INST_CMPL) and PMC6 to count cycles (PM_CYC). We aren't using PMCs 1-4. Let's enable all PMCs to count these 2 events we already provide. The logic used to calculate PMC5 is now being provided by update_PMC_PM_INST_CMPL() and PMC6 logic is now implemented in update_PMC_PM_CYC(). The enablement of these 2 events for all PMUs are done by using the Linux kernel definition of those events: 0x02 for PM_INST_CMPL and 0x1e for PM_CYC, all of those defined by specific bits in MMCR1 for each PMC. PMCs 1-4 relies on the correct event to be defined in MMCR1. PMC5 and PMC6 will count PM_INST_CMPL and PMC_CYC, respectively, regardless of MMCR1 setup. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 8 +++++ target/ppc/pmu_book3s_helper.c | 60 ++++++++++++++++++++++++++++++++-- 2 files changed, 65 insertions(+), 3 deletions(-)