diff mbox series

xen/watchdog: Identify which domain watchdog fired

Message ID 20250213164618.38167-1-andrew.cooper3@citrix.com (mailing list archive)
State New
Headers show
Series xen/watchdog: Identify which domain watchdog fired | expand

Commit Message

Andrew Cooper Feb. 13, 2025, 4:46 p.m. UTC
When a watchdog fires, the domain is crashed and can't dump any state.

Xen allows a domain to have two separate watchdogs.  Therefore, for a
domain running multiple watchdogs (e.g. one based around network, one
for disk), it is important for diagnostics to know which watchdog
fired.

As the printk() is in a timer callback, this is a bit awkward to
arrange, but there are 12 spare bits in the bottom of the domain
pointer owing to its alignment.

Reuse these bits to encode the watchdog id too, so the one which fired
is identified when the domain is crashed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Anthony PERARD <anthony.perard@vates.tech>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>

In XenServer's case, it's two different watchdogs for dom0, so this
printk() is the final piece of useful information out of the system
which has deemed itself to have lost connectivity.
---
 xen/common/sched/core.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Comments

Jan Beulich Feb. 13, 2025, 5 p.m. UTC | #1
On 13.02.2025 17:46, Andrew Cooper wrote:
> When a watchdog fires, the domain is crashed and can't dump any state.
> 
> Xen allows a domain to have two separate watchdogs.  Therefore, for a
> domain running multiple watchdogs (e.g. one based around network, one
> for disk), it is important for diagnostics to know which watchdog
> fired.
> 
> As the printk() is in a timer callback, this is a bit awkward to
> arrange, but there are 12 spare bits in the bottom of the domain
> pointer owing to its alignment.
> 
> Reuse these bits to encode the watchdog id too, so the one which fired
> is identified when the domain is crashed.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Anthony PERARD <anthony.perard@vates.tech>
> CC: Michal Orzel <michal.orzel@amd.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien@xen.org>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>

You'll eventually need a scheduler maintainer's ack, yet you didn't Cc any
of them.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1534,12 +1534,17 @@ long vcpu_yield(void)
>  
>  static void cf_check domain_watchdog_timeout(void *data)
>  {
> -    struct domain *d = data;
> +    /*
> +     * The data parameter encodes the watchdog id in the low bits of
> +     * the domain pointer.
> +     */
> +    struct domain *d = _p((unsigned long)data & PAGE_MASK);
> +    unsigned int id = (unsigned long)data & ~PAGE_MASK;
>  
>      if ( d->is_shutting_down || d->is_dying )
>          return;
>  
> -    printk("Watchdog timer fired for domain %u\n", d->domain_id);
> +    printk("Watchdog timer %u fired for %pd\n", id, d);

And apriori knowledge will be required to associate the number with whichever
watchdog it was (network or disk in your example)? (No question that logging
the number is better than not doing so.)

> @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d)
>      d->watchdog_inuse_map = 0;
>  
>      for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
> -        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
> +    {
> +        void *data = d;
> +
> +        BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE);
> +
> +        /*
> +         * For the timer callback parameter, encode the watchdog id in
> +         * the low bits of the domain pointer.
> +         */
> +        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0);
> +    }

This way we'll be promising to ourselves that we're never going to alter
the allocation mechanism of struct domain instances, always requiring
them to have at least page alignment. If someone wanted to change that,
they'll have a hard time spotting the logic here. Sadly I have no good
suggestion towards improving the situation.

Jan
Andrew Cooper Feb. 14, 2025, 3:09 p.m. UTC | #2
On 13/02/2025 5:00 pm, Jan Beulich wrote:
> On 13.02.2025 17:46, Andrew Cooper wrote:
>> When a watchdog fires, the domain is crashed and can't dump any state.
>>
>> Xen allows a domain to have two separate watchdogs.  Therefore, for a
>> domain running multiple watchdogs (e.g. one based around network, one
>> for disk), it is important for diagnostics to know which watchdog
>> fired.
>>
>> As the printk() is in a timer callback, this is a bit awkward to
>> arrange, but there are 12 spare bits in the bottom of the domain
>> pointer owing to its alignment.
>>
>> Reuse these bits to encode the watchdog id too, so the one which fired
>> is identified when the domain is crashed.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Anthony PERARD <anthony.perard@vates.tech>
>> CC: Michal Orzel <michal.orzel@amd.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> CC: Julien Grall <julien@xen.org>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Stefano Stabellini <sstabellini@kernel.org>
> You'll eventually need a scheduler maintainer's ack, yet you didn't Cc any
> of them.

Oops yes.  Although really SCHEDOP and the scheduler shouldn't be mixed
like this.

>
>> --- a/xen/common/sched/core.c
>> +++ b/xen/common/sched/core.c
>> @@ -1534,12 +1534,17 @@ long vcpu_yield(void)
>>  
>>  static void cf_check domain_watchdog_timeout(void *data)
>>  {
>> -    struct domain *d = data;
>> +    /*
>> +     * The data parameter encodes the watchdog id in the low bits of
>> +     * the domain pointer.
>> +     */
>> +    struct domain *d = _p((unsigned long)data & PAGE_MASK);
>> +    unsigned int id = (unsigned long)data & ~PAGE_MASK;
>>  
>>      if ( d->is_shutting_down || d->is_dying )
>>          return;
>>  
>> -    printk("Watchdog timer fired for domain %u\n", d->domain_id);
>> +    printk("Watchdog timer %u fired for %pd\n", id, d);
> And apriori knowledge will be required to associate the number with whichever
> watchdog it was (network or disk in your example)? (No question that logging
> the number is better than not doing so.)

Indeed, but that's up to the entities in the domain requesting the watchdog.

(Yes, we do have this logged.)

>
>> @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d)
>>      d->watchdog_inuse_map = 0;
>>  
>>      for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
>> -        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
>> +    {
>> +        void *data = d;
>> +
>> +        BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE);
>> +
>> +        /*
>> +         * For the timer callback parameter, encode the watchdog id in
>> +         * the low bits of the domain pointer.
>> +         */
>> +        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0);
>> +    }
> This way we'll be promising to ourselves that we're never going to alter
> the allocation mechanism of struct domain instances, always requiring
> them to have at least page alignment. If someone wanted to change that,
> they'll have a hard time spotting the logic here. Sadly I have no good
> suggestion towards improving the situation.

I wasn't terribly happy either, but something has occurred to me.

For both struct domain and vcpu, we could have an __aligned(PAGE_SIZE)
attribute.  It's accurate and unlikely to change (and helps x86 code
generation at least).

Then, we can use:

    BUILD_BUG_ON((NR_DOMAIN_WATCHDOG_TIMERS > alignof(d));

which should trigger cleanly if the precondition is violated.

For watchdog specifically, we only actually need uint16_t alignment to
be safe, and there's no way that's going to break in practice.

~Andrew
Jan Beulich Feb. 17, 2025, 7:30 a.m. UTC | #3
On 14.02.2025 16:09, Andrew Cooper wrote:
> On 13/02/2025 5:00 pm, Jan Beulich wrote:
>> On 13.02.2025 17:46, Andrew Cooper wrote:
>>> @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d)
>>>      d->watchdog_inuse_map = 0;
>>>  
>>>      for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
>>> -        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
>>> +    {
>>> +        void *data = d;
>>> +
>>> +        BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE);
>>> +
>>> +        /*
>>> +         * For the timer callback parameter, encode the watchdog id in
>>> +         * the low bits of the domain pointer.
>>> +         */
>>> +        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0);
>>> +    }
>> This way we'll be promising to ourselves that we're never going to alter
>> the allocation mechanism of struct domain instances, always requiring
>> them to have at least page alignment. If someone wanted to change that,
>> they'll have a hard time spotting the logic here. Sadly I have no good
>> suggestion towards improving the situation.
> 
> I wasn't terribly happy either, but something has occurred to me.
> 
> For both struct domain and vcpu, we could have an __aligned(PAGE_SIZE)
> attribute.  It's accurate and unlikely to change (and helps x86 code
> generation at least).
> 
> Then, we can use:
> 
>     BUILD_BUG_ON((NR_DOMAIN_WATCHDOG_TIMERS > alignof(d));
> 
> which should trigger cleanly if the precondition is violated.

Hmm, yes, why not. That would establish the missing link.

> For watchdog specifically, we only actually need uint16_t alignment to
> be safe, and there's no way that's going to break in practice.

Right.

Jan
diff mbox series

Patch

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d6296d99fdb9..38b20e5bbde6 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1534,12 +1534,17 @@  long vcpu_yield(void)
 
 static void cf_check domain_watchdog_timeout(void *data)
 {
-    struct domain *d = data;
+    /*
+     * The data parameter encodes the watchdog id in the low bits of
+     * the domain pointer.
+     */
+    struct domain *d = _p((unsigned long)data & PAGE_MASK);
+    unsigned int id = (unsigned long)data & ~PAGE_MASK;
 
     if ( d->is_shutting_down || d->is_dying )
         return;
 
-    printk("Watchdog timer fired for domain %u\n", d->domain_id);
+    printk("Watchdog timer %u fired for %pd\n", id, d);
     domain_shutdown(d, SHUTDOWN_watchdog);
 }
 
@@ -1593,7 +1598,17 @@  void watchdog_domain_init(struct domain *d)
     d->watchdog_inuse_map = 0;
 
     for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ )
-        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0);
+    {
+        void *data = d;
+
+        BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE);
+
+        /*
+         * For the timer callback parameter, encode the watchdog id in
+         * the low bits of the domain pointer.
+         */
+        init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0);
+    }
 }
 
 void watchdog_domain_destroy(struct domain *d)