Message ID | 1453756017-8747-2-git-send-email-konrad.wilk@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote: > To hopefully clarify what it meant. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > tools/libxc/xc_resume.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c > index 87d4324..19ba2a3 100644 > --- a/tools/libxc/xc_resume.c > +++ b/tools/libxc/xc_resume.c > @@ -248,9 +248,12 @@ out: > /* > * Resume execution of a domain after suspend shutdown. > * This can happen in one of two ways: > - * 1. Resume with special return code. > - * 2. Reset guest environment so it believes it is resumed in a new > + * 1. (fast=1) Resume with special return code (1) that the guest > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call which this in effect is. I think I'd say "Resume the guest without resetting the domain environment. The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1". (assuming that is true re resetting) > + * > + * 2. (fast=0) Reset guest environment so it believes it is resumed in a new > * domain context. with the above I would suggesting adding "The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 0". > + * > * (2) should be used only for guests which cannot handle the special > * new return code. (1) is always safe (but slower). Is this correct? I'd have said (2) was always safe but slow? And I would invert the first, that is to say that (1) should be used in preference with guests which support it. Ian.
Ian Campbell writes ("Re: [PATCH 1/3] libxc/xc_domain_resume: Update comment."): > On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote: > > To hopefully clarify what it meant. ... > > + * 1. (fast=1) Resume with special return code (1) that the guest > > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. > > "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call > which this in effect is. > > I think I'd say "Resume the guest without resetting the domain environment. > The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1". > > (assuming that is true re resetting) I'm not sure that `will return 1' is correct. IIRC there is some ... unpleasantness here, with something effectively corrupting the guest state in a way that the guest is supposed to expect and cooperate with. I haven't investigated the details recently. I do remember it being fiddly. Ian.
On Tue, 2016-01-26 at 16:22 +0000, Ian Jackson wrote: > Ian Campbell writes ("Re: [PATCH 1/3] libxc/xc_domain_resume: Update > comment."): > > On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote: > > > To hopefully clarify what it meant. > ... > > > + * 1. (fast=1) Resume with special return code (1) that the guest > > > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. > > > > "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call > > which this in effect is. > > > > I think I'd say "Resume the guest without resetting the domain > > environment. > > The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1". > > > > (assuming that is true re resetting) > > I'm not sure that `will return 1' is correct. IIRC there is some > ... unpleasantness here, with something effectively corrupting the > guest state in a way that the guest is supposed to expect and > cooperate with. The tools arrange for the hypercall to return 1, which the guest is indeed expected to expect and cooperate, as with any PV interface call it makes. They do this by intimate knowledge of the hypercall ABI (i.e. which register is the return value) and one could certainly argue it ought to be arranged in a less horrific way, but I think to characterise it as "corrupting" is probably going to far. > > I haven't investigated the details recently. I do remember it being > fiddly. > > Ian.
Ian Campbell writes ("Re: [PATCH 1/3] libxc/xc_domain_resume: Update comment."): > On Tue, 2016-01-26 at 16:22 +0000, Ian Jackson wrote: > > I'm not sure that `will return 1' is correct. IIRC there is some > > ... unpleasantness here, with something effectively corrupting the > > guest state in a way that the guest is supposed to expect and > > cooperate with. > > The tools arrange for the hypercall to return 1, which the guest is indeed > expected to expect and cooperate, as with any PV interface call it makes. > > They do this by intimate knowledge of the hypercall ABI (i.e. which > register is the return value) and one could certainly argue it ought to be > arranged in a less horrific way, but I think to characterise it as > "corrupting" is probably going to far. Ian C had a conversation about this in person. We think (ie, I am now convinced) that provided that this xc resume call is only made when the guest is suspended, that the worst outcome will indeed be that the guest experiences the hypercall returning 1, and then finding itself in a state it's not expecting. The guest will hopefully crash due to the unexpected return value but is in any case likely to implode soon due to event channel misconfiguration etc. Only if the `resume' is attempted with the guest running, would the guest's %eax actually be `corrupted' in this sense. Ian.
On Tue, Jan 26, 2016 at 04:52:06PM +0000, Ian Jackson wrote: > Ian Campbell writes ("Re: [PATCH 1/3] libxc/xc_domain_resume: Update comment."): > > On Tue, 2016-01-26 at 16:22 +0000, Ian Jackson wrote: > > > I'm not sure that `will return 1' is correct. IIRC there is some > > > ... unpleasantness here, with something effectively corrupting the > > > guest state in a way that the guest is supposed to expect and > > > cooperate with. > > > > The tools arrange for the hypercall to return 1, which the guest is indeed > > expected to expect and cooperate, as with any PV interface call it makes. > > > > They do this by intimate knowledge of the hypercall ABI (i.e. which > > register is the return value) and one could certainly argue it ought to be > > arranged in a less horrific way, but I think to characterise it as > > "corrupting" is probably going to far. > > Ian C had a conversation about this in person. We think (ie, I am now > convinced) that provided that this xc resume call is only made when > the guest is suspended, that the worst outcome will indeed be that the > guest experiences the hypercall returning 1, and then finding itself > in a state it's not expecting. The guest will hopefully crash due > to the unexpected return value but is in any case likely to implode > soon due to event channel misconfiguration etc. <nods> > > Only if the `resume' is attempted with the guest running, would the > guest's %eax actually be `corrupted' in this sense. Right. That code doing the set_vcpu manipulations is quite .. ahhh interesting! I will update the comment to be more clear. > > Ian.
On Tue, Jan 26, 2016 at 04:19:54PM +0000, Ian Campbell wrote: > On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote: > > To hopefully clarify what it meant. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > tools/libxc/xc_resume.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c > > index 87d4324..19ba2a3 100644 > > --- a/tools/libxc/xc_resume.c > > +++ b/tools/libxc/xc_resume.c > > @@ -248,9 +248,12 @@ out: > > /* > > * Resume execution of a domain after suspend shutdown. > > * This can happen in one of two ways: > > - * 1. Resume with special return code. > > - * 2. Reset guest environment so it believes it is resumed in a new > > + * 1. (fast=1) Resume with special return code (1) that the guest > > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. > > "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call > which this in effect is. > > I think I'd say "Resume the guest without resetting the domain environment. > The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1". > > (assuming that is true re resetting) > > > + * > > + * 2. (fast=0) Reset guest environment so it believes it is resumed in a new > > * domain context. > > with the above I would suggesting adding "The guests's call to > SCHEDOP_shutdown(SHUTDOWN_suspend) will return 0". > > > + * > > * (2) should be used only for guests which cannot handle the special > > * new return code. (1) is always safe (but slower). > > Is this correct? I'd have said (2) was always safe but slow? That does not sound right. It should have said that fast=1 would be fast but not safe. And 2) (fast=0) is safe but slower. Let me resend this - with it hopefully being more clear. > > And I would invert the first, that is to say that (1) should be used in > preference with guests which support it. Reading the 1) I am bit perplexed. It says "safe" but what it does is far from safe - it manipulates the vCPU eax register to be 1. Granted it does it on a "paused" vCPU and once the vCPU resume it can read it. I guess in the olden days this was considered safe. Along with driving without seatbelts. > > Ian.
On Tue, 2016-01-26 at 14:47 -0500, Konrad Rzeszutek Wilk wrote: > On Tue, Jan 26, 2016 at 04:19:54PM +0000, Ian Campbell wrote: > > On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote: > > > To hopefully clarify what it meant. > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > --- > > > tools/libxc/xc_resume.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c > > > index 87d4324..19ba2a3 100644 > > > --- a/tools/libxc/xc_resume.c > > > +++ b/tools/libxc/xc_resume.c > > > @@ -248,9 +248,12 @@ out: > > > /* > > > * Resume execution of a domain after suspend shutdown. > > > * This can happen in one of two ways: > > > - * 1. Resume with special return code. > > > - * 2. Reset guest environment so it believes it is resumed in a new > > > + * 1. (fast=1) Resume with special return code (1) that the guest > > > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. > > > > "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call > > which this in effect is. > > > > I think I'd say "Resume the guest without resetting the domain > > environment. > > The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1". > > > > (assuming that is true re resetting) > > > > > + * > > > + * 2. (fast=0) Reset guest environment so it believes it is resumed > > > in a new > > > * domain context. > > > > with the above I would suggesting adding "The guests's call to > > SCHEDOP_shutdown(SHUTDOWN_suspend) will return 0". > > > > > + * > > > * (2) should be used only for guests which cannot handle the > > > special > > > * new return code. (1) is always safe (but slower). > > > > Is this correct? I'd have said (2) was always safe but slow? > > That does not sound right. It should have said that fast=1 > would be fast but not safe. And 2) (fast=0) is safe but slower. > > Let me resend this - with it hopefully being more clear. > > > > And I would invert the first, that is to say that (1) should be used in > > preference with guests which support it. > > Reading the 1) I am bit perplexed. It says "safe" but what it does is far > from safe - it manipulates the vCPU eax register to be 1. Granted it does > it on a "paused" vCPU and once the vCPU resume it can read it. I think "(1) is always safe (but slower)" should have always read "(2) is always safe (but slower)". (1) has always been "fast and loose" and requires guest side support. Ian.
diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c index 87d4324..19ba2a3 100644 --- a/tools/libxc/xc_resume.c +++ b/tools/libxc/xc_resume.c @@ -248,9 +248,12 @@ out: /* * Resume execution of a domain after suspend shutdown. * This can happen in one of two ways: - * 1. Resume with special return code. - * 2. Reset guest environment so it believes it is resumed in a new + * 1. (fast=1) Resume with special return code (1) that the guest + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend. + * + * 2. (fast=0) Reset guest environment so it believes it is resumed in a new * domain context. + * * (2) should be used only for guests which cannot handle the special * new return code. (1) is always safe (but slower). */
To hopefully clarify what it meant. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- tools/libxc/xc_resume.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)