Message ID | 20250213164618.38167-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | xen/watchdog: Identify which domain watchdog fired | expand |
On 13.02.2025 17:46, Andrew Cooper wrote: > When a watchdog fires, the domain is crashed and can't dump any state. > > Xen allows a domain to have two separate watchdogs. Therefore, for a > domain running multiple watchdogs (e.g. one based around network, one > for disk), it is important for diagnostics to know which watchdog > fired. > > As the printk() is in a timer callback, this is a bit awkward to > arrange, but there are 12 spare bits in the bottom of the domain > pointer owing to its alignment. > > Reuse these bits to encode the watchdog id too, so the one which fired > is identified when the domain is crashed. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Anthony PERARD <anthony.perard@vates.tech> > CC: Michal Orzel <michal.orzel@amd.com> > CC: Jan Beulich <jbeulich@suse.com> > CC: Julien Grall <julien@xen.org> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Stefano Stabellini <sstabellini@kernel.org> You'll eventually need a scheduler maintainer's ack, yet you didn't Cc any of them. > --- a/xen/common/sched/core.c > +++ b/xen/common/sched/core.c > @@ -1534,12 +1534,17 @@ long vcpu_yield(void) > > static void cf_check domain_watchdog_timeout(void *data) > { > - struct domain *d = data; > + /* > + * The data parameter encodes the watchdog id in the low bits of > + * the domain pointer. > + */ > + struct domain *d = _p((unsigned long)data & PAGE_MASK); > + unsigned int id = (unsigned long)data & ~PAGE_MASK; > > if ( d->is_shutting_down || d->is_dying ) > return; > > - printk("Watchdog timer fired for domain %u\n", d->domain_id); > + printk("Watchdog timer %u fired for %pd\n", id, d); And apriori knowledge will be required to associate the number with whichever watchdog it was (network or disk in your example)? (No question that logging the number is better than not doing so.) > @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d) > d->watchdog_inuse_map = 0; > > for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ ) > - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0); > + { > + void *data = d; > + > + BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE); > + > + /* > + * For the timer callback parameter, encode the watchdog id in > + * the low bits of the domain pointer. > + */ > + init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0); > + } This way we'll be promising to ourselves that we're never going to alter the allocation mechanism of struct domain instances, always requiring them to have at least page alignment. If someone wanted to change that, they'll have a hard time spotting the logic here. Sadly I have no good suggestion towards improving the situation. Jan
On 13/02/2025 5:00 pm, Jan Beulich wrote: > On 13.02.2025 17:46, Andrew Cooper wrote: >> When a watchdog fires, the domain is crashed and can't dump any state. >> >> Xen allows a domain to have two separate watchdogs. Therefore, for a >> domain running multiple watchdogs (e.g. one based around network, one >> for disk), it is important for diagnostics to know which watchdog >> fired. >> >> As the printk() is in a timer callback, this is a bit awkward to >> arrange, but there are 12 spare bits in the bottom of the domain >> pointer owing to its alignment. >> >> Reuse these bits to encode the watchdog id too, so the one which fired >> is identified when the domain is crashed. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> --- >> CC: Anthony PERARD <anthony.perard@vates.tech> >> CC: Michal Orzel <michal.orzel@amd.com> >> CC: Jan Beulich <jbeulich@suse.com> >> CC: Julien Grall <julien@xen.org> >> CC: Roger Pau Monné <roger.pau@citrix.com> >> CC: Stefano Stabellini <sstabellini@kernel.org> > You'll eventually need a scheduler maintainer's ack, yet you didn't Cc any > of them. Oops yes. Although really SCHEDOP and the scheduler shouldn't be mixed like this. > >> --- a/xen/common/sched/core.c >> +++ b/xen/common/sched/core.c >> @@ -1534,12 +1534,17 @@ long vcpu_yield(void) >> >> static void cf_check domain_watchdog_timeout(void *data) >> { >> - struct domain *d = data; >> + /* >> + * The data parameter encodes the watchdog id in the low bits of >> + * the domain pointer. >> + */ >> + struct domain *d = _p((unsigned long)data & PAGE_MASK); >> + unsigned int id = (unsigned long)data & ~PAGE_MASK; >> >> if ( d->is_shutting_down || d->is_dying ) >> return; >> >> - printk("Watchdog timer fired for domain %u\n", d->domain_id); >> + printk("Watchdog timer %u fired for %pd\n", id, d); > And apriori knowledge will be required to associate the number with whichever > watchdog it was (network or disk in your example)? (No question that logging > the number is better than not doing so.) Indeed, but that's up to the entities in the domain requesting the watchdog. (Yes, we do have this logged.) > >> @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d) >> d->watchdog_inuse_map = 0; >> >> for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ ) >> - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0); >> + { >> + void *data = d; >> + >> + BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE); >> + >> + /* >> + * For the timer callback parameter, encode the watchdog id in >> + * the low bits of the domain pointer. >> + */ >> + init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0); >> + } > This way we'll be promising to ourselves that we're never going to alter > the allocation mechanism of struct domain instances, always requiring > them to have at least page alignment. If someone wanted to change that, > they'll have a hard time spotting the logic here. Sadly I have no good > suggestion towards improving the situation. I wasn't terribly happy either, but something has occurred to me. For both struct domain and vcpu, we could have an __aligned(PAGE_SIZE) attribute. It's accurate and unlikely to change (and helps x86 code generation at least). Then, we can use: BUILD_BUG_ON((NR_DOMAIN_WATCHDOG_TIMERS > alignof(d)); which should trigger cleanly if the precondition is violated. For watchdog specifically, we only actually need uint16_t alignment to be safe, and there's no way that's going to break in practice. ~Andrew
On 14.02.2025 16:09, Andrew Cooper wrote: > On 13/02/2025 5:00 pm, Jan Beulich wrote: >> On 13.02.2025 17:46, Andrew Cooper wrote: >>> @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d) >>> d->watchdog_inuse_map = 0; >>> >>> for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ ) >>> - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0); >>> + { >>> + void *data = d; >>> + >>> + BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE); >>> + >>> + /* >>> + * For the timer callback parameter, encode the watchdog id in >>> + * the low bits of the domain pointer. >>> + */ >>> + init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0); >>> + } >> This way we'll be promising to ourselves that we're never going to alter >> the allocation mechanism of struct domain instances, always requiring >> them to have at least page alignment. If someone wanted to change that, >> they'll have a hard time spotting the logic here. Sadly I have no good >> suggestion towards improving the situation. > > I wasn't terribly happy either, but something has occurred to me. > > For both struct domain and vcpu, we could have an __aligned(PAGE_SIZE) > attribute. It's accurate and unlikely to change (and helps x86 code > generation at least). > > Then, we can use: > > BUILD_BUG_ON((NR_DOMAIN_WATCHDOG_TIMERS > alignof(d)); > > which should trigger cleanly if the precondition is violated. Hmm, yes, why not. That would establish the missing link. > For watchdog specifically, we only actually need uint16_t alignment to > be safe, and there's no way that's going to break in practice. Right. Jan
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c index d6296d99fdb9..38b20e5bbde6 100644 --- a/xen/common/sched/core.c +++ b/xen/common/sched/core.c @@ -1534,12 +1534,17 @@ long vcpu_yield(void) static void cf_check domain_watchdog_timeout(void *data) { - struct domain *d = data; + /* + * The data parameter encodes the watchdog id in the low bits of + * the domain pointer. + */ + struct domain *d = _p((unsigned long)data & PAGE_MASK); + unsigned int id = (unsigned long)data & ~PAGE_MASK; if ( d->is_shutting_down || d->is_dying ) return; - printk("Watchdog timer fired for domain %u\n", d->domain_id); + printk("Watchdog timer %u fired for %pd\n", id, d); domain_shutdown(d, SHUTDOWN_watchdog); } @@ -1593,7 +1598,17 @@ void watchdog_domain_init(struct domain *d) d->watchdog_inuse_map = 0; for ( i = 0; i < NR_DOMAIN_WATCHDOG_TIMERS; i++ ) - init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, d, 0); + { + void *data = d; + + BUILD_BUG_ON(NR_DOMAIN_WATCHDOG_TIMERS >= PAGE_SIZE); + + /* + * For the timer callback parameter, encode the watchdog id in + * the low bits of the domain pointer. + */ + init_timer(&d->watchdog_timer[i], domain_watchdog_timeout, data + i, 0); + } } void watchdog_domain_destroy(struct domain *d)
When a watchdog fires, the domain is crashed and can't dump any state. Xen allows a domain to have two separate watchdogs. Therefore, for a domain running multiple watchdogs (e.g. one based around network, one for disk), it is important for diagnostics to know which watchdog fired. As the printk() is in a timer callback, this is a bit awkward to arrange, but there are 12 spare bits in the bottom of the domain pointer owing to its alignment. Reuse these bits to encode the watchdog id too, so the one which fired is identified when the domain is crashed. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> In XenServer's case, it's two different watchdogs for dom0, so this printk() is the final piece of useful information out of the system which has deemed itself to have lost connectivity. --- xen/common/sched/core.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)