
[v2] xen/sched: add some diagnostic info in the run queue keyhandler

Message ID 20200211122736.16714-1-jgross@suse.com (mailing list archive)
State New, archived
Series [v2] xen/sched: add some diagnostic info in the run queue keyhandler

Commit Message

Jürgen Groß Feb. 11, 2020, 12:27 p.m. UTC
When dumping the run queue information, add some more data regarding
the current and (if known) previous vcpu for each physical cpu.

With core scheduling activated, the printed data will be, e.g.:

(XEN) CPUs info:
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[01] current=d[IDLE]v1
(XEN) CPU[02] current=d[IDLE]v2, curr=d[IDLE]v2, prev=NULL
(XEN) CPU[03] current=d[IDLE]v3

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2: add proper locking
---
 xen/common/sched/core.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

Comments

Jan Beulich Feb. 11, 2020, 1:01 p.m. UTC | #1
On 11.02.2020 13:27, Juergen Gross wrote:
> When dumping the run queue information, add some more data regarding
> the current and (if known) previous vcpu for each physical cpu.
> 
> With core scheduling activated, the printed data will be, e.g.:
> 
> (XEN) CPUs info:
> (XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
> (XEN) CPU[01] current=d[IDLE]v1
> (XEN) CPU[02] current=d[IDLE]v2, curr=d[IDLE]v2, prev=NULL
> (XEN) CPU[03] current=d[IDLE]v3
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2: add proper locking

"Proper" is ambiguous in the context of dumping functions. In a
number of places we use try-lock, to avoid the dumping hanging
on something else monopolizing the lock. I'd like to suggest to
do so here, too.
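
(A purely illustrative sketch of that try-lock pattern, assuming Xen's
spinlock API; the function name and message are made up:)

/* Sketch of the try-lock pattern used in dumping paths: if the lock
 * is busy, skip the dump instead of risking a hang of the keyhandler.
 * All names here are illustrative. */
static void example_dump_state(spinlock_t *lock)
{
    if ( !spin_trylock(lock) )
    {
        printk("lock busy - skipping dump\n");
        return;
    }

    /* ... print the state protected by the lock ... */

    spin_unlock(lock);
}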

Jan
Jürgen Groß Feb. 11, 2020, 1:10 p.m. UTC | #2
On 11.02.20 14:01, Jan Beulich wrote:
> On 11.02.2020 13:27, Juergen Gross wrote:
>> [...]
>> V2: add proper locking
> 
> "Proper" is ambiguous in the context of dumping functions. In a
> number of places we use try-lock, to avoid the dumping hanging
> on something else monopolizing the lock. I'd like to suggest to
> do so here, too.

All the scheduler-related dumping functions use the "real" locks,
so using trylock in this single case wouldn't help at all.
Additionally, using only trylock would make a crash while dumping
the data more probable, so I'm not sure we want that.

Instead of unconditionally using trylock in dumping functions, I
could imagine having a "dumplock" that uses proper locking by default
and can be toggled to trylock when needed (or maybe automatically, by
adding a timeout to the dumplock variant).
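
(Purely as an illustration, such a toggleable "dumplock" wrapper might
look like the sketch below, assuming Xen's spinlock API; the helper
and the global flag are hypothetical names:)

/* Hypothetical "dumplock" wrapper: takes the real lock by default,
 * but can be switched to try-lock behaviour via a purely illustrative
 * global toggle.  Returns true if the lock was taken. */
static bool dump_use_trylock;

static bool dump_lock(spinlock_t *lock)
{
    if ( dump_use_trylock )
        return spin_trylock(lock);

    spin_lock(lock);
    return true;
}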


Juergen
Jan Beulich Feb. 11, 2020, 4:46 p.m. UTC | #3
On 11.02.2020 14:10, Jürgen Groß wrote:
> On 11.02.20 14:01, Jan Beulich wrote:
>> [...]
>>
>> "Proper" is ambiguous in the context of dumping functions. In a
>> number of places we use try-lock, to avoid the dumping hanging
>> on something else monopolizing the lock. I'd like to suggest to
>> do so here, too.
> 
> All the scheduler-related dumping functions use the "real" locks,
> so using trylock in this single case wouldn't help at all.
> Additionally, using only trylock would make a crash while dumping
> the data more probable, so I'm not sure we want that.

Why would it make a crash more likely? If you can't get the lock,
you'd simply skip dumping.

Jan
Jürgen Groß Feb. 11, 2020, 4:54 p.m. UTC | #4
On 11.02.20 17:46, Jan Beulich wrote:
> On 11.02.2020 14:10, Jürgen Groß wrote:
>> [...]
>>
>> All the scheduler-related dumping functions use the "real" locks,
>> so using trylock in this single case wouldn't help at all.
>> Additionally, using only trylock would make a crash while dumping
>> the data more probable, so I'm not sure we want that.
> 
> Why would it make a crash more likely? If you can't get the lock,
> you'd simply skip dumping.

Ah, okay, then I misunderstood your intention.

I still think that this should be done not just in this one place,
but in a more general fashion. I'd rather give up only after trying
for some time (1 millisecond by default?) and apply the same scheme
to all dumping functions.
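
(Again purely as an illustration, a timed variant could look like this
sketch, assuming Xen's NOW(), MILLISECS() and spin_trylock()
primitives; the helper name is hypothetical:)

/* Hypothetical helper: keep retrying a lock for a bounded time when
 * dumping, instead of blocking indefinitely or giving up immediately.
 * Returns true if the lock was acquired within the timeout. */
static bool dump_trylock_timeout(spinlock_t *lock, s_time_t timeout)
{
    s_time_t deadline = NOW() + timeout;

    do {
        if ( spin_trylock(lock) )
            return true;
        cpu_relax();
    } while ( NOW() < deadline );

    return false;
}

/* A caller would then e.g. skip the dump when
 * dump_trylock_timeout(lock, MILLISECS(1)) returns false. */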

I can give such a series a try if you agree on taking a more general
approach.


Juergen
Jan Beulich Feb. 11, 2020, 5:04 p.m. UTC | #5
On 11.02.2020 17:54, Jürgen Groß wrote:
> [...]
> 
> Ah, okay, then I misunderstood your intention.
> 
> I still think that this should be done not just in this one place,
> but in a more general fashion. I'd rather give up only after trying
> for some time (1 millisecond by default?) and apply the same scheme
> to all dumping functions.
> 
> I can give such a series a try if you agree on taking a more general
> approach.

Getting behavior consistent across key handlers would of course
be very nice.

Jan
Dario Faggioli Feb. 20, 2020, 9:57 a.m. UTC | #6
On Tue, 2020-02-11 at 13:27 +0100, Juergen Gross wrote:
> [...]
>
I'm fine with this patch, i.e., I think the additional logging it
adds is useful, and the locking is correct and in line with current
(scheduling) keyhandler locking practice.

And thanks for making the requested adjustments to the changelog. :-)

So, I think it should go in:

Reviewed-by: Dario Faggioli <dfaggioli@suse.com>

If/when, as a result of the discussion going on in the thread about
keyhandler locking, we decide to change the locks for all the
keyhandlers, we'll change them here as well.

Regards

Patch

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 2e43f8029f..6fbc30e678 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -3234,7 +3234,7 @@ void scheduler_free(struct scheduler *sched)
 
 void schedule_dump(struct cpupool *c)
 {
-    unsigned int      i;
+    unsigned int      i, j;
     struct scheduler *sched;
     cpumask_t        *cpus;
 
@@ -3245,7 +3245,7 @@ void schedule_dump(struct cpupool *c)
     if ( c != NULL )
     {
         sched = c->sched;
-        cpus = c->cpu_valid;
+        cpus = c->res_valid;
         printk("Scheduler: %s (%s)\n", sched->name, sched->opt_name);
         sched_dump_settings(sched);
     }
@@ -3255,11 +3255,25 @@ void schedule_dump(struct cpupool *c)
         cpus = &cpupool_free_cpus;
     }
 
-    if ( sched->dump_cpu_state != NULL )
+    printk("CPUs info:\n");
+    for_each_cpu (i, cpus)
     {
-        printk("CPUs info:\n");
-        for_each_cpu (i, cpus)
-            sched_dump_cpu_state(sched, i);
+        struct sched_resource *sr = get_sched_res(i);
+        unsigned long flags;
+        spinlock_t *lock;
+
+        lock = pcpu_schedule_lock_irqsave(i, &flags);
+
+        printk("CPU[%02d] current=%pv, curr=%pv, prev=%pv\n", i,
+               get_cpu_current(i), sr->curr ? sr->curr->vcpu_list : NULL,
+               sr->prev ? sr->prev->vcpu_list : NULL);
+        for_each_cpu (j, sr->cpus)
+            if ( i != j )
+                printk("CPU[%02d] current=%pv\n", j, get_cpu_current(j));
+
+        pcpu_schedule_unlock_irqrestore(lock, flags, i);
+
+        sched_dump_cpu_state(sched, i);
     }
 
     rcu_read_unlock(&sched_res_rculock);