
[v2,13/13] tools: don't stop xenstore domain when stopping dom0

Message ID 1450444471-6454-14-git-send-email-jgross@suse.com (mailing list archive)
State New, archived

Commit Message

Jürgen Groß Dec. 18, 2015, 1:14 p.m. UTC
When restarting or shutting down dom0 the xendomains script tries to
stop all other domains. Don't do this for the xenstore domain, as it
might survive a dom0 reboot in the future.

The same applies to xl shutdown --all.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
 tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
 2 files changed, 32 insertions(+), 4 deletions(-)

Comments

Andrew Cooper Dec. 18, 2015, 2:42 p.m. UTC | #1
On 18/12/15 13:14, Juergen Gross wrote:
> When restarting or shutting down dom0 the xendomains script tries to
> stop all other domains. Don't do this for the xenstore domain, as it
> might survive a dom0 reboot in the future.
>
> The same applies to xl shutdown --all.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
>  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
>  2 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/tools/hotplug/Linux/xendomains.in b/tools/hotplug/Linux/xendomains.in
> index dfe0b33..70b7f16 100644
> --- a/tools/hotplug/Linux/xendomains.in
> +++ b/tools/hotplug/Linux/xendomains.in
> @@ -196,6 +196,17 @@ rdnames()
>      done
>  }
>  
> +# set xenstore domain id (or 0 if no xenstore domain)
> +get_xsdomid()

A get/set mismatch.

> +{
> +    ${bindir}/xenstore-exists /tool/xenstored/domid
> +    if test $? -ne 0; then
> +        XS_DOMID=0
> +    else
> +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
> +    fi

This is racy.  Can't you use a failure of xenstore-read as a signal that
the key doesn't exist?

~Andrew
Jürgen Groß Dec. 18, 2015, 2:53 p.m. UTC | #2
On 18/12/15 15:42, Andrew Cooper wrote:
> On 18/12/15 13:14, Juergen Gross wrote:
>> When restarting or shutting down dom0 the xendomains script tries to
>> stop all other domains. Don't do this for the xenstore domain, as it
>> might survive a dom0 reboot in the future.
>>
>> The same applies to xl shutdown --all.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
>>  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
>>  2 files changed, 32 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/hotplug/Linux/xendomains.in b/tools/hotplug/Linux/xendomains.in
>> index dfe0b33..70b7f16 100644
>> --- a/tools/hotplug/Linux/xendomains.in
>> +++ b/tools/hotplug/Linux/xendomains.in
>> @@ -196,6 +196,17 @@ rdnames()
>>      done
>>  }
>>  
>> +# set xenstore domain id (or 0 if no xenstore domain)
>> +get_xsdomid()
> 
> A get/set mismatch.

Hmm, depends.

It is getting the domid of the xenstore domain and is setting
XS_DOMID accordingly. The main semantics are to get the correct
domid.

> 
>> +{
>> +    ${bindir}/xenstore-exists /tool/xenstored/domid
>> +    if test $? -ne 0; then
>> +        XS_DOMID=0
>> +    else
>> +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
>> +    fi
> 
> This is racy.  Can't you use a failure of xenstore-read as a signal that
> the key doesn't exist?

In theory it is racy. OTOH the race would require the xenstore domain to
be started between the call of xenstore-exists and xenstore-read, but
xenstore-exists will block in case no xenstore is available. And no, I
don't have to check that. The whole script will bail out early in this
case as in the beginning xl is tested to work which will be the case
with xenstore available only.

And using xenstore-read alone is ugly as it will barf in case the key
doesn't exist.


Juergen
Ian Campbell Jan. 6, 2016, 4:33 p.m. UTC | #3
On Fri, 2015-12-18 at 15:53 +0100, Juergen Gross wrote:
> On 18/12/15 15:42, Andrew Cooper wrote:
> > On 18/12/15 13:14, Juergen Gross wrote:
> > > When restarting or shutting down dom0 the xendomains script tries to
> > > stop all other domains. Don't do this for the xenstore domain, as it
> > > might survive a dom0 reboot in the future.
> > > 
> > > The same applies to xl shutdown --all.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > ---
> > >  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
> > >  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
> > >  2 files changed, 32 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/tools/hotplug/Linux/xendomains.in
> > > b/tools/hotplug/Linux/xendomains.in
> > > index dfe0b33..70b7f16 100644
> > > --- a/tools/hotplug/Linux/xendomains.in
> > > +++ b/tools/hotplug/Linux/xendomains.in
> > > @@ -196,6 +196,17 @@ rdnames()
> > >      done
> > >  }
> > >  
> > > +# set xenstore domain id (or 0 if no xenstore domain)
> > > +get_xsdomid()
> > 
> > A get/set mismatch.
> 
> Hmm, depends.
> 
> It is getting the domid of the xenstore domain and is setting
> XS_DOMID accordingly. The main semantics are to get the correct
> domid.
> 
> > 
> > > +{
> > > +    ${bindir}/xenstore-exists /tool/xenstored/domid
> > > +    if test $? -ne 0; then
> > > +        XS_DOMID=0
> > > +    else
> > > +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`

Please update docs/misc/xenstore-paths.markdown with this.

Did you mean /tools?

Earlier in the series there was a patch which looped over xc_dom_info
looking for the xs domain -- if this is in xenstore can't it use that?

> > > +    fi
> > 
> > This is racy.  Can't you use a failure of xenstore-read as a signal
> > that
> > the key doesn't exist?
> 
> In theory it is racy. OTOH the race would require the xenstore domain to
> be started between the call of xenstore-exists and xenstore-read, but
> xenstore-exists will block in case no xenstore is available. And no, I
> don't have to check that. The whole script will bail out early in this
> case as in the beginning xl is tested to work which will be the case
> with xenstore available only.
> 
> And using xenstore-read alone is ugly as it will barf in case the key
> isn't existing.

XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid 2>/dev/null`

seems like it should work:
root@st40:~# xenstore-read /foo 2>/dev/null; echo $?
1
root@st40:~# xenstore-read /local/domain/0/name 2>/dev/null; echo $?
Domain-0
0

> 
> 
> Juergen
Jürgen Groß Jan. 7, 2016, 6:52 a.m. UTC | #4
On 06/01/16 17:33, Ian Campbell wrote:
> On Fri, 2015-12-18 at 15:53 +0100, Juergen Gross wrote:
>> On 18/12/15 15:42, Andrew Cooper wrote:
>>> On 18/12/15 13:14, Juergen Gross wrote:
>>>> When restarting or shutting down dom0 the xendomains script tries to
>>>> stop all other domains. Don't do this for the xenstore domain, as it
>>>> might survive a dom0 reboot in the future.
>>>>
>>>> The same applies to xl shutdown --all.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
>>>>  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
>>>>  2 files changed, 32 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/tools/hotplug/Linux/xendomains.in
>>>> b/tools/hotplug/Linux/xendomains.in
>>>> index dfe0b33..70b7f16 100644
>>>> --- a/tools/hotplug/Linux/xendomains.in
>>>> +++ b/tools/hotplug/Linux/xendomains.in
>>>> @@ -196,6 +196,17 @@ rdnames()
>>>>      done
>>>>  }
>>>>  
>>>> +# set xenstore domain id (or 0 if no xenstore domain)
>>>> +get_xsdomid()
>>>
>>> A get/set mismatch.
>>
>> Hmm, depends.
>>
>> It is getting the domid of the xenstore domain and is setting
>> XS_DOMID accordingly. The main semantics are to get the correct
>> domid.
>>
>>>
>>>> +{
>>>> +    ${bindir}/xenstore-exists /tool/xenstored/domid
>>>> +    if test $? -ne 0; then
>>>> +        XS_DOMID=0
>>>> +    else
>>>> +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
> 
> Please update docs/misc/xenstore-paths.markdown with this.

Okay.

> 
> Did you mean /tools?

No. The xenstore path is /tool/...

> 
> Earlier in the series there was a patch which looped over xc_dom_info
> looking for the xs domain -- if this is in xenstore can't it use that?

Hen and egg problem. You need to know how to connect to xenstore
(domain or daemon) before being able to read xenstore.

> 
>>>> +    fi
>>>
>>> This is racy.  Can't you use a failure of xenstore-read as a signal
>>> that
>>> the key doesn't exist?
>>
>> In theory it is racy. OTOH the race would require the xenstore domain to
>> be started between the call of xenstore-exists and xenstore-read, but
>> xenstore-exists will block in case no xenstore is available. And no, I
>> don't have to check that. The whole script will bail out early in this
>> case as in the beginning xl is tested to work which will be the case
>> with xenstore available only.
>>
>> And using xenstore-read alone is ugly as it will barf in case the key
>> isn't existing.
> 
> XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid 2>/dev/null`
> 
> seems like it should work:
> root@st40:~# xenstore-read /foo 2>/dev/null; echo $?
> 1
> root@st40:~# xenstore-read /local/domain/0/name 2>/dev/null; echo $?
> Domain-0
> 0

Okay, I'll change it.
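
For reference, the single-read variant agreed on here could be sketched as
below. This is only an illustration, not the code of a later patch revision:
`XS_READ` is a hypothetical override variable added so the function can be
tried without a running xenstore, and the plain `xenstore-read` default
assumes the tool is on `PATH` instead of using `${bindir}`.

```shell
# Sketch only: XS_READ is a hypothetical hook, not part of the patch.
# It defaults to xenstore-read looked up via PATH (an assumption).
XS_READ=${XS_READ:-xenstore-read}

# Set XS_DOMID to the xenstore domain id, or to 0 if the key cannot be read.
get_xsdomid()
{
    # One read instead of exists-then-read: there is no window between the
    # two calls, and stderr is discarded so a missing key stays silent.
    # The exit status of the assignment is that of the command substitution.
    XS_DOMID=`$XS_READ /tool/xenstored/domid 2>/dev/null` || XS_DOMID=0
}
```

The callers in the patch (is_running, all_zombies, stop) would not need to
change, since they only look at XS_DOMID after calling get_xsdomid.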


Juergen
Ian Campbell Jan. 7, 2016, 10:34 a.m. UTC | #5
On Thu, 2016-01-07 at 07:52 +0100, Juergen Gross wrote:
> On 06/01/16 17:33, Ian Campbell wrote:
> > On Fri, 2015-12-18 at 15:53 +0100, Juergen Gross wrote:
> > > On 18/12/15 15:42, Andrew Cooper wrote:
> > > > On 18/12/15 13:14, Juergen Gross wrote:
> > > > > When restarting or shutting down dom0 the xendomains script tries
> > > > > to
> > > > > stop all other domains. Don't do this for the xenstore domain, as
> > > > > it
> > > > > might survive a dom0 reboot in the future.
> > > > > 
> > > > > The same applies to xl shutdown --all.
> > > > > 
> > > > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > > > ---
> > > > >  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
> > > > >  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
> > > > >  2 files changed, 32 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/tools/hotplug/Linux/xendomains.in
> > > > > b/tools/hotplug/Linux/xendomains.in
> > > > > index dfe0b33..70b7f16 100644
> > > > > --- a/tools/hotplug/Linux/xendomains.in
> > > > > +++ b/tools/hotplug/Linux/xendomains.in
> > > > > @@ -196,6 +196,17 @@ rdnames()
> > > > >      done
> > > > >  }
> > > > >  
> > > > > +# set xenstore domain id (or 0 if no xenstore domain)
> > > > > +get_xsdomid()
> > > > 
> > > > A get/set mismatch.
> > > 
> > > Hmm, depends.
> > > 
> > > It is getting the domid of the xenstore domain and is setting
> > > XS_DOMID accordingly. The main semantics are to get the correct
> > > domid.
> > > 
> > > > 
> > > > > +{
> > > > > +    ${bindir}/xenstore-exists /tool/xenstored/domid
> > > > > +    if test $? -ne 0; then
> > > > > +        XS_DOMID=0
> > > > > +    else
> > > > > +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
> > 
> > Please update docs/misc/xenstore-paths.markdown with this.
> 
> Okay.
> 
> > 
> > Did you mean /tools?
> 
> No. The xenstore path is /tool/...

You mean there are preexisting uses of this path?

/me looks.

Oh, so there is. Undocumented too :-(

> > 
> > Earlier in the series there was a patch which looped over xc_dom_info
> > looking for the xs domain -- if this is in xenstore can't it use that?
> 
> Hen and egg problem. You need to know how to connect to xenstore
> (domain or daemon) before being able to read xenstore.

Oh, of course.

Can you not infer from the presence or absence of the sockets in the local
f/s or do they always exist (i.e. stale from a previous configuration)?

We did once try switching to always using the domain ring, even if the
client and server were co-located in the same domain, but that can result
in uninterruptible sleeps in the kernel IIRC (a bug which might have since
been fixed, not sure). Anyway, that probably rules out the "solution" of
always using the domain.

The daemon would drop a pid file, but I suppose that might also be stale.

I'm mostly just brainstorming here, I don't really have a problem with the
scan in the earlier patch.

(FWIW in English idiom we usually say chicken and egg BTW)

Ian.
Jürgen Groß Jan. 7, 2016, 10:45 a.m. UTC | #6
On 07/01/16 11:34, Ian Campbell wrote:
> On Thu, 2016-01-07 at 07:52 +0100, Juergen Gross wrote:
>> On 06/01/16 17:33, Ian Campbell wrote:
>>> On Fri, 2015-12-18 at 15:53 +0100, Juergen Gross wrote:
>>>> On 18/12/15 15:42, Andrew Cooper wrote:
>>>>> On 18/12/15 13:14, Juergen Gross wrote:
>>>>>> When restarting or shutting down dom0 the xendomains script tries
>>>>>> to
>>>>>> stop all other domains. Don't do this for the xenstore domain, as
>>>>>> it
>>>>>> might survive a dom0 reboot in the future.
>>>>>>
>>>>>> The same applies to xl shutdown --all.
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>> ---
>>>>>>  tools/hotplug/Linux/xendomains.in | 17 +++++++++++++++++
>>>>>>  tools/libxl/xl_cmdimpl.c          | 19 +++++++++++++++----
>>>>>>  2 files changed, 32 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/hotplug/Linux/xendomains.in
>>>>>> b/tools/hotplug/Linux/xendomains.in
>>>>>> index dfe0b33..70b7f16 100644
>>>>>> --- a/tools/hotplug/Linux/xendomains.in
>>>>>> +++ b/tools/hotplug/Linux/xendomains.in
>>>>>> @@ -196,6 +196,17 @@ rdnames()
>>>>>>      done
>>>>>>  }
>>>>>>  
>>>>>> +# set xenstore domain id (or 0 if no xenstore domain)
>>>>>> +get_xsdomid()
>>>>>
>>>>> A get/set mismatch.
>>>>
>>>> Hmm, depends.
>>>>
>>>> It is getting the domid of the xenstore domain and is setting
>>>> XS_DOMID accordingly. The main semantics are to get the correct
>>>> domid.
>>>>
>>>>>
>>>>>> +{
>>>>>> +    ${bindir}/xenstore-exists /tool/xenstored/domid
>>>>>> +    if test $? -ne 0; then
>>>>>> +        XS_DOMID=0
>>>>>> +    else
>>>>>> +        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
>>>
>>> Please update docs/misc/xenstore-paths.markdown with this.
>>
>> Okay.
>>
>>>
>>> Did you mean /tools?
>>
>> No. The xenstore path is /tool/...
> 
> You mean there are preexisting uses of this path?
> 
> /me looks.
> 
> Oh, so there is. Undocumented too :-(
> 
>>>
>>> Earlier in the series there was a patch which looped over xc_dom_info
>>> looking for the xs domain -- if this is in xenstore can't it use that?
>>
>> Hen and egg problem. You need to know how to connect to xenstore
>> (domain or daemon) before being able to read xenstore.
> 
> Oh, of course.
> 
> Can you not infer from the presence or absence of the sockets in the local
> f/s or do they always exist (i.e. stale from a previous configuration)?

No. Sockets will be created later even in the xenstored case. And still
you need to know the domain id of the xenstore domain in order to be
able to connect to it. This is available in the hypervisor and the
xenstore domain only at this time.

> We did once try switching to always using the domain ring, even if the
> client and server were co-located in the same domain, but that can result
> in uninterruptible sleeps in the kernel IIRC (a bug which might have since
> been fixed, not sure). Anyway, that probably rules out the "solution" of
> always using the domain.
> 
> The daemon would drop a pid file, but I suppose that might also be stale.
> 
> I'm mostly just brainstorming here, I don't really have a problem with the
> scan in the earlier patch.
> 
> (FWIW in English idiom we usually say chicken and egg BTW)

Aah, thanks.


Juergen

Patch

diff --git a/tools/hotplug/Linux/xendomains.in b/tools/hotplug/Linux/xendomains.in
index dfe0b33..70b7f16 100644
--- a/tools/hotplug/Linux/xendomains.in
+++ b/tools/hotplug/Linux/xendomains.in
@@ -196,6 +196,17 @@  rdnames()
     done
 }
 
+# set xenstore domain id (or 0 if no xenstore domain)
+get_xsdomid()
+{
+    ${bindir}/xenstore-exists /tool/xenstored/domid
+    if test $? -ne 0; then
+        XS_DOMID=0
+    else
+        XS_DOMID=`${bindir}/xenstore-read /tool/xenstored/domid`
+    fi
+}
+
 LIST_GREP='(domain\|(domid\|(name\|^    {$\|"name":\|"domid":'
 parseln()
 {
@@ -216,12 +227,14 @@  parseln()
 
 is_running()
 {
+    get_xsdomid
     rdname $1
     RC=1
     name=;id=
     while read LN; do
 	parseln "$LN" || continue
 	if test $id = 0; then continue; fi
+	if test $id = $XS_DOMID; then continue; fi
 	case $name in
 	    ($NM)
 		RC=0
@@ -302,10 +315,12 @@  start()
 
 all_zombies()
 {
+    get_xsdomid
     name=;id=
     while read LN; do
 	parseln "$LN" || continue
 	if test $id = 0; then continue; fi
+	if test $id = $XS_DOMID; then continue; fi
 	if test "$state" != "-b---d" -a "$state" != "-----d"; then
 	    return 1;
 	fi
@@ -351,11 +366,13 @@  stop()
     if test "$XENDOMAINS_AUTO_ONLY" = "true"; then
 	rdnames
     fi
+    get_xsdomid
     echo -n "Shutting down Xen domains:"
     name=;id=
     while read LN; do
 	parseln "$LN" || continue
 	if test $id = 0; then continue; fi
+	if test $id = $XS_DOMID; then continue; fi
 	echo -n " $name"
 	if test "$XENDOMAINS_AUTO_ONLY" = "true"; then
 	    eval "
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index f9933cb..bf30030 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4822,6 +4822,7 @@  static int main_shutdown_or_reboot(int do_reboot, int argc, char **argv)
     int opt, i, nb_domain;
     int wait_for_it = 0, all =0;
     int fallback_trigger = 0;
+    uint32_t domid;
     static struct option opts[] = {
         {"all", 0, 0, 'a'},
         {"wait", 0, 0, 'w'},
@@ -4846,32 +4847,42 @@  static int main_shutdown_or_reboot(int do_reboot, int argc, char **argv)
     }
 
     if (all) {
+        int ret;
         libxl_dominfo *dominfo;
         libxl_evgen_domain_death **deathws = NULL;
         if (!(dominfo = libxl_list_domain(ctx, &nb_domain))) {
             fprintf(stderr, "libxl_list_domain failed.\n");
             return -1;
         }
+        ret = libxl_xenstore_domid(ctx, &domid);
+        if (ret == ERROR_DOMAIN_NOTFOUND) {
+            domid = 0;
+        } else if (ret != 0) {
+            fprintf(stderr, "libxl_xenstore_domid failed.\n");
+            return -1;
+        }
 
         if (wait_for_it)
             deathws = calloc(nb_domain, sizeof(*deathws));
 
+        wait_for_it = 0;
         for (i = 0; i<nb_domain; i++) {
-            if (dominfo[i].domid == 0)
+            if (dominfo[i].domid == 0 || dominfo[i].domid == domid)
                 continue;
             fn(dominfo[i].domid, deathws ? &deathws[i] : NULL, i,
                fallback_trigger);
+            wait_for_it++;
         }
 
-        if (wait_for_it) {
-            wait_for_domain_deaths(deathws, nb_domain - 1 /* not dom 0 */);
+        if (deathws) {
+            wait_for_domain_deaths(deathws, wait_for_it);
             free(deathws);
         }
 
         libxl_dominfo_list_free(dominfo, nb_domain);
     } else {
         libxl_evgen_domain_death *deathw = NULL;
-        uint32_t domid = find_domain(argv[optind]);
+        domid = find_domain(argv[optind]);
 
         fn(domid, wait_for_it ? &deathw : NULL, 0, fallback_trigger);