Message ID | 20181006065113.669-3-rajneesh.bhardwaj@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Delegated to: | Andy Shevchenko |
Headers | show |
Series | [v2,1/4] platform/x86: intel_pmc_core: Show Latency Tolerance info | expand |
On Sat, Oct 6, 2018 at 9:54 AM Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com> wrote: > > The LTR values follow PCIE LTR encoding format and can be decoded as per > https://pcisig.com/sites/default/files/specification_documents/ECN_LatencyTolnReporting_14Aug08.pdf > > This adds support to translate the raw LTR values as read from the PMC > to meaningful values in nanosecond units of time. While I have pushed this to my review and testing queue, it needs a bit more work. See my comments below. > +static u32 convert_ltr_scale(u32 val) > +{ > + u32 scale = 0; Redundant, see below. > + /* > + * As per PCIE specification supporting document > + * ECN_LatencyTolnReporting_14Aug08.pdf the Latency > + * Tolerance Reporting data payload is encoded in a > + * 3 bit scale and 10 bit value fields. Values are > + * multiplied by the indicated scale to yield an absolute time > + * value, expressible in a range from 1 nanosecond to > + * 2^25*(2^10-1) = 34,326,183,936 nanoseconds. > + * > + * scale encoding is as follows: > + * > + * ---------------------------------------------- > + * |scale factor | Multiplier (ns) | > + * ---------------------------------------------- > + * | 0 | 1 | > + * | 1 | 32 | > + * | 2 | 1024 | > + * | 3 | 32768 | > + * | 4 | 1048576 | > + * | 5 | 33554432 | > + * | 6 | Invalid | > + * | 7 | Invalid | > + * ---------------------------------------------- > + */ > + if (val > 5) > + pr_warn("Invalid LTR scale factor.\n"); if (...) { pr_warn(...); // Btw, Does it recoverable state? What user will get with returned 0 as a multiplier? return 0; // Btw, is 0 fits better than ~0? How hw would behave with this value? } > + else > + scale = 1U << (5 * (val)); > + > + return scale; return 1U << (5 * val); > +} > for (index = 0; map[index].name ; index++) { > - seq_printf(s, "IP %-2d :%-32s\tRAW LTR: 0x%x\n", index, > - map[index].name, > - pmc_core_reg_read(pmcdev, map[index].bit_mask)); We use 32 characters for the names. Here are two minor issues: - inconsistency with the rest - ping-pong style of programming (you changed 32 to 24 in the same series where you introduced 32 in the first place). > + decoded_snoop_ltr = decoded_non_snoop_ltr = 0; > + ltr_raw_data = pmc_core_reg_read(pmcdev, > + map[index].bit_mask); > + snoop_ltr = ltr_raw_data & ~MTPMC_MASK; > + nonsnoop_ltr = (ltr_raw_data >> 0x10) & ~MTPMC_MASK; > + > + if (FIELD_GET(LTR_REQ_NONSNOOP, ltr_raw_data)) { > + scale = FIELD_GET(LTR_DECODED_SCALE, nonsnoop_ltr); > + val = FIELD_GET(LTR_DECODED_VAL, nonsnoop_ltr); > + decoded_non_snoop_ltr = val * convert_ltr_scale(scale); > + } > + > + if (FIELD_GET(LTR_REQ_SNOOP, ltr_raw_data)) { > + scale = FIELD_GET(LTR_DECODED_SCALE, snoop_ltr); > + val = FIELD_GET(LTR_DECODED_VAL, snoop_ltr); > + decoded_snoop_ltr = val * convert_ltr_scale(scale); > + } > + > + seq_printf(s, "IP %-2d :%-24s\tRaw LTR: 0x%-16x\t Non-Snoop LTR (ns): %-16llu\t Snoop LTR (ns): %-16llu\n", Here 0x%-16x would look a bit strange and difficult to parse. 0x%016x much better. After you remove the index, it would give you 4 more characters, though it 4 less than 8 you got from reducing 32 to 24. OTOH, those long texts perhaps may be compressed somehow, at least remove LTR duplicating from the last two. Remove spaces after '\t' as well. > + index, map[index].name, ltr_raw_data, > + decoded_non_snoop_ltr, > + decoded_snoop_ltr); > } > return 0; > } > --- a/drivers/platform/x86/intel_pmc_core.h > +++ b/drivers/platform/x86/intel_pmc_core.h > @@ -177,6 +177,11 @@ enum ppfear_regs { It might be good idea to include linux/bits.h here. > +#define LTR_REQ_NONSNOOP BIT(31) > +#define LTR_REQ_SNOOP BIT(15) > +#define LTR_DECODED_VAL GENMASK(9, 0) > +#define LTR_DECODED_SCALE GENMASK(12, 10)
Hi Andy, Thanks for your review. My comments below. If you agree then i can quickly send v3 addressing all suggestions so we can make it in time for 4.20 merge window. On 19-Oct-18 6:04 PM, Andy Shevchenko wrote: > On Sat, Oct 6, 2018 at 9:54 AM Rajneesh Bhardwaj > <rajneesh.bhardwaj@linux.intel.com> wrote: >> The LTR values follow PCIE LTR encoding format and can be decoded as per >> https://pcisig.com/sites/default/files/specification_documents/ECN_LatencyTolnReporting_14Aug08.pdf >> >> This adds support to translate the raw LTR values as read from the PMC >> to meaningful values in nanosecond units of time. > While I have pushed this to my review and testing queue, it needs a > bit more work. See my comments below. > >> +static u32 convert_ltr_scale(u32 val) >> +{ >> + u32 scale = 0; > Redundant, see below. > >> + /* >> + * As per PCIE specification supporting document >> + * ECN_LatencyTolnReporting_14Aug08.pdf the Latency >> + * Tolerance Reporting data payload is encoded in a >> + * 3 bit scale and 10 bit value fields. Values are >> + * multiplied by the indicated scale to yield an absolute time >> + * value, expressible in a range from 1 nanosecond to >> + * 2^25*(2^10-1) = 34,326,183,936 nanoseconds. >> + * >> + * scale encoding is as follows: >> + * >> + * ---------------------------------------------- >> + * |scale factor | Multiplier (ns) | >> + * ---------------------------------------------- >> + * | 0 | 1 | >> + * | 1 | 32 | >> + * | 2 | 1024 | >> + * | 3 | 32768 | >> + * | 4 | 1048576 | >> + * | 5 | 33554432 | >> + * | 6 | Invalid | >> + * | 7 | Invalid | >> + * ---------------------------------------------- >> + */ >> + if (val > 5) >> + pr_warn("Invalid LTR scale factor.\n"); > if (...) { > pr_warn(...); // Btw, Does it recoverable state? What user will get > with returned 0 as a multiplier? > return 0; // Btw, is 0 fits better than ~0? How hw would behave with > this value? > } We show 0 LTR for invalid scale as PMC output is junk sometimes. > >> + else >> + scale = 1U << (5 * (val)); >> + >> + return scale; > return 1U << (5 * val); We intend to return 0 so for invalid LTR scale. This will make retuen non zero and we dont want that. > >> +} >> for (index = 0; map[index].name ; index++) { >> - seq_printf(s, "IP %-2d :%-32s\tRAW LTR: 0x%x\n", index, >> - map[index].name, >> - pmc_core_reg_read(pmcdev, map[index].bit_mask)); > We use 32 characters for the names. Here are two minor issues: > - inconsistency with the rest intentional. > - ping-pong style of programming (you changed 32 to 24 in the same > series where you introduced 32 in the first place). This is because the formatted output looks better with this width. I used 32 for the earlier patch because its consistent with rest and also does not look bad on screen. > > >> + decoded_snoop_ltr = decoded_non_snoop_ltr = 0; >> + ltr_raw_data = pmc_core_reg_read(pmcdev, >> + map[index].bit_mask); >> + snoop_ltr = ltr_raw_data & ~MTPMC_MASK; >> + nonsnoop_ltr = (ltr_raw_data >> 0x10) & ~MTPMC_MASK; >> + >> + if (FIELD_GET(LTR_REQ_NONSNOOP, ltr_raw_data)) { >> + scale = FIELD_GET(LTR_DECODED_SCALE, nonsnoop_ltr); >> + val = FIELD_GET(LTR_DECODED_VAL, nonsnoop_ltr); >> + decoded_non_snoop_ltr = val * convert_ltr_scale(scale); >> + } >> + >> + if (FIELD_GET(LTR_REQ_SNOOP, ltr_raw_data)) { >> + scale = FIELD_GET(LTR_DECODED_SCALE, snoop_ltr); >> + val = FIELD_GET(LTR_DECODED_VAL, snoop_ltr); >> + decoded_snoop_ltr = val * convert_ltr_scale(scale); >> + } >> + >> + seq_printf(s, "IP %-2d :%-24s\tRaw LTR: 0x%-16x\t Non-Snoop LTR (ns): %-16llu\t Snoop LTR (ns): %-16llu\n", > Here 0x%-16x would look a bit strange and difficult to parse. 0x%016x > much better. Sure, I can test how it looks with 0x%016x and modify it. > After you remove the index, it would give you 4 more characters, > though it 4 less than 8 you got from reducing 32 to 24. I plan to keep the index as is. Reason for this is mentioned in previous patch that introduces this index. > > OTOH, those long texts perhaps may be compressed somehow, at least > remove LTR duplicating from the last two. Remove spaces after '\t' as > well. Noted. > >> + index, map[index].name, ltr_raw_data, >> + decoded_non_snoop_ltr, >> + decoded_snoop_ltr); >> } >> return 0; >> } >> --- a/drivers/platform/x86/intel_pmc_core.h >> +++ b/drivers/platform/x86/intel_pmc_core.h >> @@ -177,6 +177,11 @@ enum ppfear_regs { > It might be good idea to include linux/bits.h here. Certainly. Luckily 0day bot didnt complain about randconfigs etc so is it really needed as it will increase size? > >> +#define LTR_REQ_NONSNOOP BIT(31) >> +#define LTR_REQ_SNOOP BIT(15) >> +#define LTR_DECODED_VAL GENMASK(9, 0) >> +#define LTR_DECODED_SCALE GENMASK(12, 10)
On Tue, Oct 30, 2018 at 9:40 AM Bhardwaj, Rajneesh <rajneesh.bhardwaj@linux.intel.com> wrote: > Thanks for your review. My comments below. > > If you agree then i can quickly send v3 addressing all suggestions so we > can make it in time for 4.20 merge window. I don't like `quickly` part — usual way to make the last minute mistakes. How long does testing take? > On 19-Oct-18 6:04 PM, Andy Shevchenko wrote: > > On Sat, Oct 6, 2018 at 9:54 AM Rajneesh Bhardwaj > > <rajneesh.bhardwaj@linux.intel.com> wrote: > >> The LTR values follow PCIE LTR encoding format and can be decoded as per > >> https://pcisig.com/sites/default/files/specification_documents/ECN_LatencyTolnReporting_14Aug08.pdf > >> > >> This adds support to translate the raw LTR values as read from the PMC > >> to meaningful values in nanosecond units of time. > > While I have pushed this to my review and testing queue, it needs a > > bit more work. See my comments below. > >> + u32 scale = 0; > > Redundant, see below. > >> + if (val > 5) > >> + pr_warn("Invalid LTR scale factor.\n"); > > if (...) { > > pr_warn(...); // Btw, Does it recoverable state? What user will get > > with returned 0 as a multiplier? > > return 0; // Btw, is 0 fits better than ~0? How hw would behave with > > this value? > > } > > We show 0 LTR for invalid scale as PMC output is junk sometimes. > >> + else > >> + scale = 1U << (5 * (val)); > >> + > >> + return scale; > > return 1U << (5 * val); > > We intend to return 0 so for invalid LTR scale. This will make retuen > non zero and we dont want that. And my example, if being read carefully, doesn't alter that. > >> +} > >> for (index = 0; map[index].name ; index++) { > >> - seq_printf(s, "IP %-2d :%-32s\tRAW LTR: 0x%x\n", index, > >> - map[index].name, > >> - pmc_core_reg_read(pmcdev, map[index].bit_mask)); > > We use 32 characters for the names. Here are two minor issues: > > - inconsistency with the rest > > intentional. I understand that, but this is not like the rest is printed. OK, this is bikeshedding, it's your call, but keep in mind the avoidance of ping-pong programming. > > - ping-pong style of programming (you changed 32 to 24 in the same > > series where you introduced 32 in the first place). > > This is because the formatted output looks better with this width. I > used 32 for the earlier patch because its consistent with rest and also > does not look bad on screen. See how it looks like with my proposals below. > > After you remove the index, it would give you 4 more characters, > > though it 4 less than 8 you got from reducing 32 to 24. > > I plan to keep the index as is. Reason for this is mentioned in previous > patch that introduces this index. No, please don't. We have a numerous userspace tools which are doing this pretty well. > >> --- a/drivers/platform/x86/intel_pmc_core.h > >> +++ b/drivers/platform/x86/intel_pmc_core.h > >> @@ -177,6 +177,11 @@ enum ppfear_regs { > > It might be good idea to include linux/bits.h here. > > Certainly. Luckily 0day bot didnt complain about randconfigs etc so is > it really needed as it will increase size? You are lucky because of ordering of inclusions. This is fragile. Since you are using macros from bits.h better to explicitly show this dependency.
diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c index c616cfedf2be..fbcab53456f3 100644 --- a/drivers/platform/x86/intel_pmc_core.c +++ b/drivers/platform/x86/intel_pmc_core.c @@ -21,6 +21,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/acpi.h> +#include <linux/bitfield.h> #include <linux/debugfs.h> #include <linux/delay.h> #include <linux/io.h> @@ -650,16 +651,73 @@ static int pmc_core_slps0_dbg_show(struct seq_file *s, void *unused) } DEFINE_SHOW_ATTRIBUTE(pmc_core_slps0_dbg); +static u32 convert_ltr_scale(u32 val) +{ + u32 scale = 0; + /* + * As per PCIE specification supporting document + * ECN_LatencyTolnReporting_14Aug08.pdf the Latency + * Tolerance Reporting data payload is encoded in a + * 3 bit scale and 10 bit value fields. Values are + * multiplied by the indicated scale to yield an absolute time + * value, expressible in a range from 1 nanosecond to + * 2^25*(2^10-1) = 34,326,183,936 nanoseconds. + * + * scale encoding is as follows: + * + * ---------------------------------------------- + * |scale factor | Multiplier (ns) | + * ---------------------------------------------- + * | 0 | 1 | + * | 1 | 32 | + * | 2 | 1024 | + * | 3 | 32768 | + * | 4 | 1048576 | + * | 5 | 33554432 | + * | 6 | Invalid | + * | 7 | Invalid | + * ---------------------------------------------- + */ + if (val > 5) + pr_warn("Invalid LTR scale factor.\n"); + else + scale = 1U << (5 * (val)); + + return scale; +} + static int pmc_core_ltr_show(struct seq_file *s, void *unused) { struct pmc_dev *pmcdev = s->private; const struct pmc_bit_map *map = pmcdev->map->ltr_show_sts; + u64 decoded_snoop_ltr, decoded_non_snoop_ltr; + u32 ltr_raw_data, scale, val; + u16 snoop_ltr, nonsnoop_ltr; int index; for (index = 0; map[index].name ; index++) { - seq_printf(s, "IP %-2d :%-32s\tRAW LTR: 0x%x\n", index, - map[index].name, - pmc_core_reg_read(pmcdev, map[index].bit_mask)); + decoded_snoop_ltr = decoded_non_snoop_ltr = 0; + ltr_raw_data = pmc_core_reg_read(pmcdev, + map[index].bit_mask); + snoop_ltr = ltr_raw_data & ~MTPMC_MASK; + nonsnoop_ltr = (ltr_raw_data >> 0x10) & ~MTPMC_MASK; + + if (FIELD_GET(LTR_REQ_NONSNOOP, ltr_raw_data)) { + scale = FIELD_GET(LTR_DECODED_SCALE, nonsnoop_ltr); + val = FIELD_GET(LTR_DECODED_VAL, nonsnoop_ltr); + decoded_non_snoop_ltr = val * convert_ltr_scale(scale); + } + + if (FIELD_GET(LTR_REQ_SNOOP, ltr_raw_data)) { + scale = FIELD_GET(LTR_DECODED_SCALE, snoop_ltr); + val = FIELD_GET(LTR_DECODED_VAL, snoop_ltr); + decoded_snoop_ltr = val * convert_ltr_scale(scale); + } + + seq_printf(s, "IP %-2d :%-24s\tRaw LTR: 0x%-16x\t Non-Snoop LTR (ns): %-16llu\t Snoop LTR (ns): %-16llu\n", + index, map[index].name, ltr_raw_data, + decoded_non_snoop_ltr, + decoded_snoop_ltr); } return 0; } diff --git a/drivers/platform/x86/intel_pmc_core.h b/drivers/platform/x86/intel_pmc_core.h index 7f8181057ec8..cc49cd4c86e9 100644 --- a/drivers/platform/x86/intel_pmc_core.h +++ b/drivers/platform/x86/intel_pmc_core.h @@ -177,6 +177,11 @@ enum ppfear_regs { #define CNP_PMC_LTR_EMMC 0x1BF4 #define CNP_PMC_LTR_UFSX2 0x1BF8 +#define LTR_REQ_NONSNOOP BIT(31) +#define LTR_REQ_SNOOP BIT(15) +#define LTR_DECODED_VAL GENMASK(9, 0) +#define LTR_DECODED_SCALE GENMASK(12, 10) + struct pmc_bit_map { const char *name; u32 bit_mask;
The LTR values follow PCIE LTR encoding format and can be decoded as per https://pcisig.com/sites/default/files/specification_documents/ECN_LatencyTolnReporting_14Aug08.pdf This adds support to translate the raw LTR values as read from the PMC to meaningful values in nanosecond units of time. Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com> --- Changes in v2: * Get rid of union and bitfields to decode LTR and use FIELD_GET macro * Change get_ltr_scale to convert_ltr_scale. * Other style fixes and misc. improvements suggested by Andy for v1. drivers/platform/x86/intel_pmc_core.c | 64 +++++++++++++++++++++++++-- drivers/platform/x86/intel_pmc_core.h | 5 +++ 2 files changed, 66 insertions(+), 3 deletions(-)