[v13,12/23] x86: refactor psr: L3 CAT: set value: implement write msr flow.

Message ID 1499305996-19029-13-git-send-email-yi.y.sun@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Yi Sun July 6, 2017, 1:53 a.m. UTC
Continue from previous patch:
'x86: refactor psr: L3 CAT: set value: implement cos id picking flow.'

With the feature value and the COS ID to set now known, write them to the MSRs
of the designated feature.

This completes the set value flow.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v13:
    - use 'skip_prior_features'.
      (suggested by Jan Beulich)
    - add 'const' for some variables.
      (suggested by Jan Beulich)
v12:
    - declare variables of the same type on one line.
      (suggested by Jan Beulich)
    - replace 'feat_type' with 'props' in 'struct cos_write_info'.
      (suggested by Jan Beulich)
    - assign the 'cos_num' to a local variable.
      (suggested by Jan Beulich)
    - use 'ASSERT_UNREACHABLE()' to record bug and return error code if feat
      exists but props does not exist.
      (suggested by Jan Beulich)
v11:
    - rename 'write_psr_msr' to 'write_psr_msrs'.
    - rename 'do_write_psr_msr' to 'do_write_psr_msrs'.
    - change parameters and codes of 'write_psr_msrs' to handle value array.
    - add 'feat_type' in 'struct cos_write_info' to handle props array.
    - in 'do_write_psr_msrs', write value array into msrs according to
      'props->type[i]'.
    - move 'feat->cos_reg_val' assignment and value comparison in 'write_msr'
      callback function out as generic codes.
      (suggested by Jan Beulich)
    - move check from 'do_write_psr_msrs' to 'write_psr_msrs'.
      (suggested by Jan Beulich)
    - change about 'cos_max'.
      (suggested by Jan Beulich)
    - change about 'feat_props'.
      (suggested by Jan Beulich)
v10:
    - remove 'type' from 'write_msr' parameter list. Will add it back when
      implementing CDP.
      (suggested by Jan Beulich)
    - remove unnecessary casts.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
v9:
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - add 'array_len' in 'struct cos_write_info' and check if val array
      exceeds it.
    - modify 'write_psr_msr' flow only to set one value a time. No need to
      set whole feature array values.
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - modify 'write_msr' callback function to 'void' because we have to set
      all features' cbm. When input cos exceeds some features' cos_max, just
      skip them but not break the iteration.
v5:
    - modify commit message to provide exact patch name to continue from.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - move type check out from callback functions to caller.
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - correct code format.
      (suggested by Jan Beulich)
v4:
    - create this patch to make the code easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 76 insertions(+), 1 deletion(-)

Comments

Jan Beulich July 12, 2017, 7:37 p.m. UTC | #1
>>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
>v13:
>    - use 'skip_prior_features'.
>    - add 'const' for some variables.

You didn't go quite far enough with this:

>+struct cos_write_info
>+{
>+    unsigned int cos;
>+    struct feat_node *feature;
>+    const uint32_t *val;

With this, ...

>static int write_psr_msrs(unsigned int socket, unsigned int cos,
>                          uint32_t val[], unsigned int array_len,

... I can't see why this can't be const too. Of course that would then affect an
earlier patch.

>                          enum psr_feat_type feat_type)
>{
>-    return -ENOENT;
>+    int ret;
>+    struct psr_socket_info *info = get_socket_info(socket);
>+    struct cos_write_info data =
>+    {
>+        .cos = cos,
>+        .feature = info->features[feat_type],
>+        .props = feat_props[feat_type],
>+    };
>+
>+    if ( cos > info->features[feat_type]->cos_max )
>+        return -EINVAL;
>+
>+    /* Skip to the feature's value head. */
>+    ret = skip_prior_features(&val, &array_len, feat_type);
>+    if ( ret )
>+        return ret;
>+
>+    if ( array_len < feat_props[feat_type]->cos_num )
>+        return -ENOSPC;
>+
>+    data.val = val;
>+
>+    if ( socket == cpu_to_socket(smp_processor_id()) )
>+        do_write_psr_msrs(&data);
>+    else
>+    {
>+        unsigned int cpu = get_socket_cpu(socket);
>+
>+        if ( cpu >= nr_cpu_ids )
>+            return -ENOTSOCK;
>+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);

How frequent an operation can this be? Considering that the actual MSR write(s)
in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
trying to avoid the IPI altogether, by pre-checking whether any write actually
needs doing.

Jan
Yi Sun July 13, 2017, 2:59 a.m. UTC | #2
On 17-07-12 13:37:02, Jan Beulich wrote:
> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
> >v13:
> >    - use 'skip_prior_features'.
> >    - add 'const' for some variables.
> 
> You didn't go quite far enough with this:
> 
> >+struct cos_write_info
> >+{
> >+    unsigned int cos;
> >+    struct feat_node *feature;
> >+    const uint32_t *val;
> 
> With this, ...
> 
> >static int write_psr_msrs(unsigned int socket, unsigned int cos,
> >                          uint32_t val[], unsigned int array_len,
> 
> ... I can't see why this can't be const too. Of course that would then affect an
> earlier patch.
> 
The 'val' is input into 'skip_prior_features'. In 'skip_prior_features', there
is '*val += props->cos_num;' to change the value. So, I do not add 'const' here.
Of course, I can change the way to skip value array, e.g. using a variable as
index. Which one do you like?

> >                          enum psr_feat_type feat_type)
> >{
> >-    return -ENOENT;
> >+    int ret;
> >+    struct psr_socket_info *info = get_socket_info(socket);
> >+    struct cos_write_info data =
> >+    {
> >+        .cos = cos,
> >+        .feature = info->features[feat_type],
> >+        .props = feat_props[feat_type],
> >+    };
> >+
> >+    if ( cos > info->features[feat_type]->cos_max )
> >+        return -EINVAL;
> >+
> >+    /* Skip to the feature's value head. */
> >+    ret = skip_prior_features(&val, &array_len, feat_type);
> >+    if ( ret )
> >+        return ret;
> >+
> >+    if ( array_len < feat_props[feat_type]->cos_num )
> >+        return -ENOSPC;
> >+
> >+    data.val = val;
> >+
> >+    if ( socket == cpu_to_socket(smp_processor_id()) )
> >+        do_write_psr_msrs(&data);
> >+    else
> >+    {
> >+        unsigned int cpu = get_socket_cpu(socket);
> >+
> >+        if ( cpu >= nr_cpu_ids )
> >+            return -ENOTSOCK;
> >+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
> 
> How frequent an operation can this be? Considering that the actual MSR write(s)
> in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
> trying to avoid the IPI altogether, by pre-checking whether any write actually
> needs doing.
> 
Yes, I think I can check if the value to set is same as 'feat->cos_reg_val[cos]'
before calling IPI.
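
A minimal sketch of such a pre-check, for illustration only and not part of the posted patch. The structures below are simplified, hypothetical stand-ins for Xen's 'struct feat_node' and 'struct feat_props' (the fixed array size is an assumption), and 'psr_write_needed' is an invented name. It also assumes the caller already holds whatever lock protects 'cos_reg_val', so reading the cache from another CPU is safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the Xen structures. */
struct feat_node {
    uint32_t cos_reg_val[16];   /* cached MSR values, cos_num entries per COS */
};

struct feat_props {
    unsigned int cos_num;       /* number of values one COS ID occupies */
};

/*
 * Return true if at least one cached value for this COS differs from the
 * requested one, i.e. if do_write_psr_msrs() would actually write an MSR.
 */
static bool psr_write_needed(const struct feat_node *feat,
                             const struct feat_props *props,
                             unsigned int cos, const uint32_t val[])
{
    unsigned int i;

    for ( i = 0; i < props->cos_num; i++ )
        if ( feat->cos_reg_val[cos * props->cos_num + i] != val[i] )
            return true;

    return false;
}
```

With a helper like this, 'write_psr_msrs' could return 0 early when nothing would change and only fall through to 'on_selected_cpus' otherwise.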

There is one more thing. During implementing MBA, I find there is an issue here.
The current codes in 'struct cos_write_info' and 'write_psr_msrs' only consider
one feature's value setting. In fact, we should consider to set all values in
'val' array to the MSRs with new cos id for all features.

So, the 'cos_write_info' should be something like below to input feature array
and props array to handle all features. Of course, we do not need skip value
array anymore.

struct cos_write_info
{
    unsigned int cos;
    struct feat_node **features;
    uint32_t *val;
    unsigned int array_len;
    const struct feat_props **props;
};
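
One way the handler could walk such a structure, sketched under stated assumptions rather than taken from any posted series: the types are simplified stand-ins, 'FEAT_TYPE_NUM' and 'MAX_COS_ENTRIES' are arbitrary here, 'write_msr' drops its 'type' parameter, 'val' is made const, and 'fake_write_msr' only counts calls instead of touching an MSR.

```c
#include <assert.h>
#include <stdint.h>

#define FEAT_TYPE_NUM   2   /* assumption: two features, for illustration */
#define MAX_COS_ENTRIES 8   /* assumption: size of the value cache */

struct feat_node {
    uint32_t cos_reg_val[MAX_COS_ENTRIES];
};

struct feat_props {
    unsigned int cos_num;
    void (*write_msr)(unsigned int cos, uint32_t val);
};

struct cos_write_info {
    unsigned int cos;
    struct feat_node **features;
    const uint32_t *val;
    unsigned int array_len;
    const struct feat_props **props;
};

static unsigned int msr_writes;     /* number of simulated MSR writes */

static void fake_write_msr(unsigned int cos, uint32_t val)
{
    (void)cos;
    (void)val;
    msr_writes++;
}

/* Write every feature's slice of the value array for the given COS ID. */
static void do_write_all(void *data)
{
    const struct cos_write_info *info = data;
    unsigned int f, i, index = 0;

    for ( f = 0; f < FEAT_TYPE_NUM; f++ )
    {
        struct feat_node *feat = info->features[f];
        const struct feat_props *props = info->props[f];

        if ( !feat || !props )
            continue;

        /* Stop if the value array has no room left for this feature. */
        if ( info->array_len - index < props->cos_num )
            return;

        for ( i = 0; i < props->cos_num; i++, index++ )
        {
            uint32_t *cached =
                &feat->cos_reg_val[info->cos * props->cos_num + i];

            /* Only touch the MSR when the cached value changes. */
            if ( *cached != info->val[index] )
            {
                *cached = info->val[index];
                props->write_msr(info->cos, info->val[index]);
            }
        }
    }
}
```

Because the handler consumes the value array front to back, no separate skipping step is needed in this variant.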

> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
Jan Beulich July 13, 2017, 5:20 a.m. UTC | #3
>>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 5:00 AM >>>
>On 17-07-12 13:37:02, Jan Beulich wrote:
>> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
>> >v13:
>> >    - use 'skip_prior_features'.
>> >    - add 'const' for some variables.
>> 
>> You didn't go quite far enough with this:
>> 
>> >+struct cos_write_info
>> >+{
>> >+    unsigned int cos;
>> >+    struct feat_node *feature;
>> >+    const uint32_t *val;
>> 
>> With this, ...
>> 
>> >static int write_psr_msrs(unsigned int socket, unsigned int cos,
>> >                          uint32_t val[], unsigned int array_len,
>> 
>> ... I can't see why this can't be const too. Of course that would then affect an
>> earlier patch.
>> 
>The 'val' is input into 'skip_prior_features'. In 'skip_prior_features', there
>is '*val += props->cos_num;' to change the value. So, I do not add 'const' here.
>Of course, I can change the way to skip value array, e.g. using a variable as
>index. Which one do you like?

Oh, I see. But yes, I still think it would be nice for const-ness to be
expressible irrespective of this helper function, so making it e.g. just update
"len" without passing in the array pointer at all (leaving that part to the caller)
would seem desirable. Or possibly not even pass "array_len" via indirection,
instead making the function return a non-negative increment value for the
caller to apply to both (keeping negative value to indicate errors). But if you
think it's better the way it is, I could also live with it.
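
The second variant described above (return a non-negative increment, keep negative values for errors) could look roughly like this. It is a sketch only: the enum, the props table and 'locate_feature_values' are hypothetical stand-ins, and the real 'skip_prior_features' in the series may differ in which features it counts.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Xen definitions. */
enum psr_feat_type { FEAT_TYPE_L3_CAT, FEAT_TYPE_L3_CDP, FEAT_TYPE_NUM };

struct feat_props {
    unsigned int cos_num;
};

static const struct feat_props l3_cat_props = { .cos_num = 1 };
static const struct feat_props l3_cdp_props = { .cos_num = 2 };

static const struct feat_props *feat_props[FEAT_TYPE_NUM] = {
    [FEAT_TYPE_L3_CAT] = &l3_cat_props,
    [FEAT_TYPE_L3_CDP] = &l3_cdp_props,
};

/*
 * Return how many array entries belong to features ordered before
 * feat_type, or a negative errno value if the array is too short.
 * The array pointer itself is never passed in.
 */
static int skip_prior_features(unsigned int array_len,
                               enum psr_feat_type feat_type)
{
    unsigned int i, skip = 0;

    for ( i = 0; i < (unsigned int)feat_type; i++ )
    {
        if ( !feat_props[i] )
            continue;

        if ( array_len - skip < feat_props[i]->cos_num )
            return -ENOSPC;

        skip += feat_props[i]->cos_num;
    }

    return (int)skip;
}

/* The caller applies the increment to both pointer and length. */
static int locate_feature_values(const uint32_t val[], unsigned int array_len,
                                 enum psr_feat_type feat_type,
                                 const uint32_t **out, unsigned int *out_len)
{
    int ret = skip_prior_features(array_len, feat_type);

    if ( ret < 0 )
        return ret;

    *out = val + ret;
    *out_len = array_len - (unsigned int)ret;

    return 0;
}
```

Since the helper no longer modifies the caller's pointer, 'val' can be declared 'const uint32_t val[]' all the way up the call chain.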

>> >+    if ( socket == cpu_to_socket(smp_processor_id()) )
>> >+        do_write_psr_msrs(&data);
>> >+    else
>> >+    {
>> >+        unsigned int cpu = get_socket_cpu(socket);
>> >+
>> >+        if ( cpu >= nr_cpu_ids )
>> >+            return -ENOTSOCK;
>> >+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
>> 
>> How frequent an operation can this be? Considering that the actual MSR write(s)
>> in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
>> trying to avoid the IPI altogether, by pre-checking whether any write actually
>> needs doing.
>> 
>Yes, I think I can check if the value to set is same as 'feat->cos_reg_val[cos]'
>before calling IPI.

Well, as said - whether it's worth the extra effort depends on whether there is
a (reasonable) scenario where this function may be executed frequently.

>There is one more thing. During implementing MBA, I find there is an issue here.
>The current codes in 'struct cos_write_info' and 'write_psr_msrs' only consider
>one feature's value setting. In fact, we should consider to set all values in
>'val' array to the MSRs with new cos id for all features.
>
>So, the 'cos_write_info' should be something like below to input feature array
>and props array to handle all features. Of course, we do not need skip value
>array anymore.
>
>struct cos_write_info
>{
>    unsigned int cos;
>    struct feat_node **features;
>    uint32_t *val;
>    unsigned int array_len;
>    const struct feat_props **props;
>};

As you can likely understand, I can't really judge on this without seeing what
you need this for. So I'd suggest to keep things the way they are in this series
and discuss changes to it in the context of that other series of yours.

Jan
Yi Sun July 13, 2017, 7:32 a.m. UTC | #4
On 17-07-12 23:20:24, Jan Beulich wrote:
> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 5:00 AM >>>
> >On 17-07-12 13:37:02, Jan Beulich wrote:
> >> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
> >> >v13:
> >> >    - use 'skip_prior_features'.
> >> >    - add 'const' for some variables.
> >> 
> >> You didn't go quite far enough with this:
> >> 
> >> >+struct cos_write_info
> >> >+{
> >> >+    unsigned int cos;
> >> >+    struct feat_node *feature;
> >> >+    const uint32_t *val;
> >> 
> >> With this, ...
> >> 
> >> >static int write_psr_msrs(unsigned int socket, unsigned int cos,
> >> >                          uint32_t val[], unsigned int array_len,
> >> 
> >> ... I can't see why this can't be const too. Of course that would then affect an
> >> earlier patch.
> >> 
> >The 'val' is input into 'skip_prior_features'. In 'skip_prior_features', there
> >is '*val += props->cos_num;' to change the value. So, I do not add 'const' here.
> >Of course, I can change the way to skip value array, e.g. using a variable as
> >index. Which one do you like?
> 
> Oh, I see. But yes, I still think it would be nice for const-ness to be
> expressible irrespective of this helper function, so making it e.g. just update
> "len" without passing in the array pointer at all (leaving that part to the caller)
> would seem desirable. Or possibly not even pass "array_len" via indirection,
> instead making the function return a non-negative increment value for the
> caller to apply to both (keeping negative value to indicate errors). But if you
> think it's better the way it is, I could also live with it.
> 
Thank you! I will try to implement a version according to your comments.

> >> >+    if ( socket == cpu_to_socket(smp_processor_id()) )
> >> >+        do_write_psr_msrs(&data);
> >> >+    else
> >> >+    {
> >> >+        unsigned int cpu = get_socket_cpu(socket);
> >> >+
> >> >+        if ( cpu >= nr_cpu_ids )
> >> >+            return -ENOTSOCK;
> >> >+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
> >> 
> >> How frequent an operation can this be? Considering that the actual MSR write(s)
> >> in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
> >> trying to avoid the IPI altogether, by pre-checking whether any write actually
> >> needs doing.
> >> 
> >Yes, I think I can check if the value to set is same as 'feat->cos_reg_val[cos]'
> >before calling IPI.
> 
> Well, as said - whether it's worth the extra effort depends on whether there is
> a (reasonable) scenario where this function may be executed frequently.
> 
This function runs when the 'psr-cat-set' command is executed. I consulted
a libvirt developer: this command may be executed frequently in some scenarios,
e.g. when a user dynamically adjusts the cache allocation for VMs according to
CMT results.

> >There is one more thing. During implementing MBA, I find there is an issue here.
> >The current codes in 'struct cos_write_info' and 'write_psr_msrs' only consider
> >one feature's value setting. In fact, we should consider to set all values in
> >'val' array to the MSRs with new cos id for all features.
> >
> >So, the 'cos_write_info' should be something like below to input feature array
> >and props array to handle all features. Of course, we do not need skip value
> >array anymore.
> >
> >struct cos_write_info
> >{
> >    unsigned int cos;
> >    struct feat_node **features;
> >    uint32_t *val;
> >    unsigned int array_len;
> >    const struct feat_props **props;
> >};
> 
> As you can likely understand, I can't really judge on this without seeing what
> you need this for. So I'd suggest to keep things the way they are in this series
> and discuss changes to it in the context of that other series of yours.
> 
OK, I will keep the code as it is in the current series and modify it in the
MBA patch set for review.
Jan Beulich July 13, 2017, 7:21 p.m. UTC | #5
>>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 9:34 AM >>>
>On 17-07-12 23:20:24, Jan Beulich wrote:
>> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 5:00 AM >>>
>> >On 17-07-12 13:37:02, Jan Beulich wrote:
>> >> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
>> >> >+    if ( socket == cpu_to_socket(smp_processor_id()) )
>> >> >+        do_write_psr_msrs(&data);
>> >> >+    else
>> >> >+    {
>> >> >+        unsigned int cpu = get_socket_cpu(socket);
>> >> >+
>> >> >+        if ( cpu >= nr_cpu_ids )
>> >> >+            return -ENOTSOCK;
>> >> >+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
>> >> 
>> >> How frequent an operation can this be? Considering that the actual MSR write(s)
>> >> in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
>> >> trying to avoid the IPI altogether, by pre-checking whether any write actually
>> >> needs doing.
>> >> 
>> >Yes, I think I can check if the value to set is same as 'feat->cos_reg_val[cos]'
>> >before calling IPI.
>> 
>> Well, as said - whether it's worth the extra effort depends on whether there is
>> a (reasonable) scenario where this function may be executed frequently.
>> 
>This function runs when the 'psr-cat-set' command is executed. I consulted
>a libvirt developer: this command may be executed frequently in some scenarios,
>e.g. when a user dynamically adjusts the cache allocation for VMs according to
>CMT results.

Hmm, that's not something I would call frequent - in the whole invocation of the
user mode process the IPI will be lost in the noise. "Frequent" would be something
the kernel does without direct user mode triggering, like on the context switch
path, in code running from a timer, or some such.

Jan
Yi Sun July 14, 2017, 1:38 a.m. UTC | #6
On 17-07-13 13:21:46, Jan Beulich wrote:
> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 9:34 AM >>>
> >On 17-07-12 23:20:24, Jan Beulich wrote:
> >> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/13/17 5:00 AM >>>
> >> >On 17-07-12 13:37:02, Jan Beulich wrote:
> >> >> >>> Yi Sun <yi.y.sun@linux.intel.com> 07/06/17 4:07 AM >>>
> >> >> >+    if ( socket == cpu_to_socket(smp_processor_id()) )
> >> >> >+        do_write_psr_msrs(&data);
> >> >> >+    else
> >> >> >+    {
> >> >> >+        unsigned int cpu = get_socket_cpu(socket);
> >> >> >+
> >> >> >+        if ( cpu >= nr_cpu_ids )
> >> >> >+            return -ENOTSOCK;
> >> >> >+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
> >> >> 
> >> >> How frequent an operation can this be? Considering that the actual MSR write(s)
> >> >> in the handler is (are) conditional I wonder whether it wouldn't be worthwhile
> >> >> trying to avoid the IPI altogether, by pre-checking whether any write actually
> >> >> needs doing.
> >> >> 
> >> >Yes, I think I can check if the value to set is same as 'feat->cos_reg_val[cos]'
> >> >before calling IPI.
> >> 
> >> Well, as said - whether it's worth the extra effort depends on whether there is
> >> a (reasonable) scenario where this function may be executed frequently.
> >> 
> >This function runs when the 'psr-cat-set' command is executed. I consulted
> >a libvirt developer: this command may be executed frequently in some scenarios,
> >e.g. when a user dynamically adjusts the cache allocation for VMs according to
> >CMT results.
> 
> Hmm, that's not something I would call frequent - in the whole invocation of the
> user mode process the IPI will be lost in the noise. "Frequent" would be something
> the kernel does without direct user mode triggering, like on the context switch
> path, in code running from a timer, or some such.
> 
Then it is not 'frequent'. This function is only triggered by the user, so I
will keep the current code. Thanks!

> Jan
> 
> 
Patch

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index cbe08ce..48dab60 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -113,6 +113,9 @@  static const struct feat_props {
     /* get_feat_info is used to return feature HW info through sysctl. */
     bool (*get_feat_info)(const struct feat_node *feat,
                           uint32_t data[], unsigned int array_len);
+
+    /* write_msr is used to write out feature MSR register. */
+    void (*write_msr)(unsigned int cos, uint32_t val, enum cbm_type type);
 } *feat_props[FEAT_TYPE_NUM];
 
 /*
@@ -289,11 +292,17 @@  static bool cat_get_feat_info(const struct feat_node *feat,
 }
 
 /* L3 CAT props */
+static void l3_cat_write_msr(unsigned int cos, uint32_t val, enum cbm_type type)
+{
+    wrmsrl(MSR_IA32_PSR_L3_MASK(cos), val);
+}
+
 static const struct feat_props l3_cat_props = {
     .cos_num = 1,
     .type[0] = PSR_CBM_TYPE_L3,
     .alt_type = PSR_CBM_TYPE_UNKNOWN,
     .get_feat_info = cat_get_feat_info,
+    .write_msr = l3_cat_write_msr,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -946,11 +955,77 @@  static int pick_avail_cos(const struct psr_socket_info *info,
     return -EOVERFLOW;
 }
 
+static unsigned int get_socket_cpu(unsigned int socket)
+{
+    if ( likely(socket < nr_sockets) )
+        return cpumask_any(socket_cpumask[socket]);
+
+    return nr_cpu_ids;
+}
+
+struct cos_write_info
+{
+    unsigned int cos;
+    struct feat_node *feature;
+    const uint32_t *val;
+    const struct feat_props *props;
+};
+
+static void do_write_psr_msrs(void *data)
+{
+    const struct cos_write_info *info = data;
+    struct feat_node *feat = info->feature;
+    const struct feat_props *props = info->props;
+    unsigned int i, cos = info->cos, cos_num = props->cos_num;
+
+    for ( i = 0; i < cos_num; i++ )
+    {
+        if ( feat->cos_reg_val[cos * cos_num + i] != info->val[i] )
+        {
+            feat->cos_reg_val[cos * cos_num + i] = info->val[i];
+            props->write_msr(cos, info->val[i], props->type[i]);
+        }
+    }
+}
+
 static int write_psr_msrs(unsigned int socket, unsigned int cos,
                           uint32_t val[], unsigned int array_len,
                           enum psr_feat_type feat_type)
 {
-    return -ENOENT;
+    int ret;
+    struct psr_socket_info *info = get_socket_info(socket);
+    struct cos_write_info data =
+    {
+        .cos = cos,
+        .feature = info->features[feat_type],
+        .props = feat_props[feat_type],
+    };
+
+    if ( cos > info->features[feat_type]->cos_max )
+        return -EINVAL;
+
+    /* Skip to the feature's value head. */
+    ret = skip_prior_features(&val, &array_len, feat_type);
+    if ( ret )
+        return ret;
+
+    if ( array_len < feat_props[feat_type]->cos_num )
+        return -ENOSPC;
+
+    data.val = val;
+
+    if ( socket == cpu_to_socket(smp_processor_id()) )
+        do_write_psr_msrs(&data);
+    else
+    {
+        unsigned int cpu = get_socket_cpu(socket);
+
+        if ( cpu >= nr_cpu_ids )
+            return -ENOTSOCK;
+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msrs, &data, 1);
+    }
+
+    return 0;
 }
 
 int psr_set_val(struct domain *d, unsigned int socket,