diff mbox

xen: stop_machine: fill fn_result only in case of error.

Message ID 20170530161906.17896-1-gregory.herrero@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Gregory Herrero May 30, 2017, 4:19 p.m. UTC
From: Gregory Herrero <gregory.herrero@oracle.com>

Since fn_result member is shared across all cpus, it must be filled
only if an error happens. Assume CPU1 detects an error and set fn_result
to -1, then CPU2 doesn't detect an error and set fn_result to 0. The
error detected by CPU1 will be ignored.

Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
---
 xen/common/stop_machine.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Jan Beulich May 31, 2017, 8:30 a.m. UTC | #1
>>> On 30.05.17 at 18:19, <gregory.herrero@oracle.com> wrote:
> Since fn_result member is shared across all cpus, it must be filled
> only if an error happens. Assume CPU1 detects an error and set fn_result
> to -1, then CPU2 doesn't detect an error and set fn_result to 0. The
> error detected by CPU1 will be ignored.

First of all there's a difference between stop_machine_run()'s last
argument being a valid CPU number or NR_CPUS - what you say
above applies to the latter case only, so you should also state that.

And then even after your fix it'll remain ambiguous which error is
being returned on case multiple failures occur (which isn't all that
unlikely). This (almost unavoidable) effect should also be spelled
out imo, and I think you want to use write_atomic() to actually
store the error code.

Finally there a various coding style violations in the code you add.

Jan
Gregory Herrero May 31, 2017, 9:51 a.m. UTC | #2
On Wed, May 31, 2017 at 02:30:29AM -0600, Jan Beulich wrote:
> >>> On 30.05.17 at 18:19, <gregory.herrero@oracle.com> wrote:
> > Since fn_result member is shared across all cpus, it must be filled
> > only if an error happens. Assume CPU1 detects an error and set fn_result
> > to -1, then CPU2 doesn't detect an error and set fn_result to 0. The
> > error detected by CPU1 will be ignored.
> 
> First of all there's a difference between stop_machine_run()'s last
> argument being a valid CPU number or NR_CPUS - what you say
> above applies to the latter case only, so you should also state that.
> 
Make sense, I will mention that.
> And then even after your fix it'll remain ambiguous which error is
> being returned on case multiple failures occur (which isn't all that
> unlikely). This (almost unavoidable) effect should also be spelled
> out imo, and I think you want to use write_atomic() to actually
> store the error code.
> 
I agree, that's indeed ambiguous which error happens on which cpu. I
will mention it in v2.  And I will also use write_atomic().
> Finally there a various coding style violations in the code you add.
> 
Sorry for that, I will fix it.

Thanks,
Gregory
diff mbox

Patch

diff --git a/xen/common/stop_machine.c b/xen/common/stop_machine.c
index 304b783aae..22d246c7a4 100644
--- a/xen/common/stop_machine.c
+++ b/xen/common/stop_machine.c
@@ -94,6 +94,7 @@  int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
     stopmachine_data.fn_data = data;
     stopmachine_data.nr_cpus = nr_cpus;
     stopmachine_data.fn_cpu = cpu;
+    stopmachine_data.fn_result = 0;
     atomic_set(&stopmachine_data.done, 0);
     stopmachine_data.state = STOPMACHINE_START;
 
@@ -112,7 +113,11 @@  int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
 
     stopmachine_set_state(STOPMACHINE_INVOKE);
     if ( (cpu == smp_processor_id()) || (cpu == NR_CPUS) )
-        stopmachine_data.fn_result = (*fn)(data);
+    {
+        ret = (*fn)(data);
+        if (ret)
+            stopmachine_data.fn_result = ret;
+    }
     stopmachine_wait_state();
     ret = stopmachine_data.fn_result;
 
@@ -150,8 +155,11 @@  static void stopmachine_action(unsigned long cpu)
         case STOPMACHINE_INVOKE:
             if ( (stopmachine_data.fn_cpu == smp_processor_id()) ||
                  (stopmachine_data.fn_cpu == NR_CPUS) )
-                stopmachine_data.fn_result =
-                    stopmachine_data.fn(stopmachine_data.fn_data);
+	    {
+                int ret = stopmachine_data.fn(stopmachine_data.fn_data);
+                if (ret)
+                    stopmachine_data.fn_result = ret;
+	    }
             break;
         default:
             break;