diff mbox series

tools/libxenguest: Fix migration's debug option

Message ID 20210702190342.31319-1-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series tools/libxenguest: Fix migration's debug option | expand

Commit Message

Andrew Cooper July 2, 2021, 7:03 p.m. UTC
The code has gone through many refactors, but the first refactor was the one
which broke it by inverting the check with respect to checkpointed streams.

Fixes: 7449fb36c6c8 ("migration/save: pass checkpointed_stream from libxl to libxc")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Olaf Hering <olaf@aepfle.de>

`xl migrate --debug` might not be perfect, but this at least brings it back to
mostly working.

I don't think dropping it is a sensible move.  In particular, it is invaluable
for testing the logdirty infrastructure when migrating a memtest VM.

If anyone has a clever idea to fix the grant problem, then we can.  It is
after all a debug option, without any specific prescribed behaviour.
---
 tools/libs/guest/xg_sr_save.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Olaf Hering July 5, 2021, 7:53 a.m. UTC | #1
Am Fri, 2 Jul 2021 20:03:42 +0100
schrieb Andrew Cooper <andrew.cooper3@citrix.com>:

> The code has gone through many refactors, but the first refactor was the one
> which broke it by inverting the check with respect to checkpointed streams.
> 
> Fixes: 7449fb36c6c8 ("migration/save: pass checkpointed_stream from libxl to libxc")
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Olaf Hering <olaf@aepfle.de>

Olaf
Jan Beulich July 5, 2021, 7:57 a.m. UTC | #2
On 02.07.2021 21:03, Andrew Cooper wrote:
> The code has gone through many refactors, but the first refactor was the one
> which broke it by inverting the check with respect to checkpointed streams.
> 
> Fixes: 7449fb36c6c8 ("migration/save: pass checkpointed_stream from libxl to libxc")
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
> CC: Ian Jackson <iwj@xenproject.org>
> CC: Wei Liu <wl@xen.org>
> CC: Olaf Hering <olaf@aepfle.de>
> 
> `xl migrate --debug` might not be perfect, but this at least brings it back to
> mostly working.
> 
> I don't think dropping it is a sensible move.  In particular, it is invaluable
> for testing the logdirty infrastructure when migrating a memtest VM.
> 
> If anyone has a clever idea to fix the grant problem, then we can.  It is
> after all a debug option, without any specific prescribed behaviour.

What is "the grant problem" referring to here? Neither anything above
nor the offending original commit has any reference to grants, or a
problem with them.

Jan
Olaf Hering July 5, 2021, 8:02 a.m. UTC | #3
Am Mon, 5 Jul 2021 09:57:21 +0200
schrieb Jan Beulich <jbeulich@suse.com>:

> What is "the grant problem" referring to here? Neither anything above
> nor the offending original commit has any reference to grants, or a
> problem with them.

When the guest is paused during final transit, the backends will continue to write into domU memory. As a result the final additional iteration to verify memory on both sides will always see errors. The code has no way to know for which pfn such mismatches in page content can safely be ignored.


Olaf
Jan Beulich July 5, 2021, 8:23 a.m. UTC | #4
On 05.07.2021 10:02, Olaf Hering wrote:
> Am Mon, 5 Jul 2021 09:57:21 +0200
> schrieb Jan Beulich <jbeulich@suse.com>:
> 
>> What is "the grant problem" referring to here? Neither anything above
>> nor the offending original commit has any reference to grants, or a
>> problem with them.
> 
> When the guest is paused during final transit, the backends will
> continue to write into domU memory. As a result the final additional
> iteration to verify memory on both sides will always see errors.

I see. A similar problem then exists with at least the FIFO event
channel per-vCPU control blocks?

> The code has no way to know for which pfn such mismatches in page
> content can safely be ignored.

Well, in principle this can be known, but it's expensive: For a
paused domain the grant table can't change anymore. Any pages
referenced by a valid non-r/o grant table entry could in principle
change.

Jan
Olaf Hering July 5, 2021, 8:32 a.m. UTC | #5
Am Mon, 5 Jul 2021 10:23:00 +0200
schrieb Jan Beulich <jbeulich@suse.com>:

> I see. A similar problem then exists with at least the FIFO event
> channel per-vCPU control blocks?

I have not done any debugging how the pages differ and what they are actually used for.

My guess was that it might be activity from the backends, particularly netback. I found no API to query the usage of a page, so I declared the interface broken..


Olaf
Jan Beulich July 5, 2021, 9:19 a.m. UTC | #6
On 05.07.2021 10:32, Olaf Hering wrote:
> Am Mon, 5 Jul 2021 10:23:00 +0200
> schrieb Jan Beulich <jbeulich@suse.com>:
> 
>> I see. A similar problem then exists with at least the FIFO event
>> channel per-vCPU control blocks?
> 
> I have not done any debugging how the pages differ and what they are actually used for.
> 
> My guess was that it might be activity from the backends, particularly
> netback. I found no API to query the usage of a page, so I declared the
> interface broken..

"The interface" being which one? The tool stack can map the guest's
grant table, so it is in the position to find out about all grants
without further hypervisor help. Afaict the same isn't true for the
FIFO event channel control blocks, though, and maybe there are
further ways by which pages may get modified for a paused guest.

Jan
Olaf Hering July 5, 2021, 9:25 a.m. UTC | #7
Am Mon, 5 Jul 2021 11:19:59 +0200
schrieb Jan Beulich <jbeulich@suse.com>:

> "The interface" being which one? The tool stack can map the guest's
> grant table, so it is in the position to find out about all grants
> without further hypervisor help.

The interface means the code behind verify_frames.

If there are indeed ways to query which pages belong to grants, how would the toolstack need to do that?


Olaf
Jan Beulich July 5, 2021, 9:31 a.m. UTC | #8
On 05.07.2021 11:25, Olaf Hering wrote:
> Am Mon, 5 Jul 2021 11:19:59 +0200
> schrieb Jan Beulich <jbeulich@suse.com>:
> 
>> "The interface" being which one? The tool stack can map the guest's
>> grant table, so it is in the position to find out about all grants
>> without further hypervisor help.
> 
> The interface means the code behind verify_frames.
> 
> If there are indeed ways to query which pages belong to grants, how would the toolstack need to do that?

Map the grant table of the guest and walk it, recording any MFN for
which at least one valid r/w grant exists.

Jan
Andrew Cooper July 5, 2021, 10:06 a.m. UTC | #9
On 05/07/2021 10:31, Jan Beulich wrote:
> On 05.07.2021 11:25, Olaf Hering wrote:
>> Am Mon, 5 Jul 2021 11:19:59 +0200
>> schrieb Jan Beulich <jbeulich@suse.com>:
>>
>>> "The interface" being which one? The tool stack can map the guest's
>>> grant table, so it is in the position to find out about all grants
>>> without further hypervisor help.
>> The interface means the code behind verify_frames.
>>
>> If there are indeed ways to query which pages belong to grants, how would the toolstack need to do that?
> Map the grant table of the guest and walk it, recording any MFN for
> which at least one valid r/w grant exists.

That doesn't help - Its still racy with in-flight IO.  Also with updates
from Xen such as the wallclocks.

The only way to fix the IO problem is to disconnect the blk/net rings
before doing the final sweep for frames, but that clobbers any ability
to restart the VM on the source side if things go wrong at the destination.

I don't have an answer at all for the vcpu info frames.

~Andrew
Jan Beulich July 5, 2021, 10:36 a.m. UTC | #10
On 05.07.2021 12:06, Andrew Cooper wrote:
> On 05/07/2021 10:31, Jan Beulich wrote:
>> On 05.07.2021 11:25, Olaf Hering wrote:
>>> Am Mon, 5 Jul 2021 11:19:59 +0200
>>> schrieb Jan Beulich <jbeulich@suse.com>:
>>>
>>>> "The interface" being which one? The tool stack can map the guest's
>>>> grant table, so it is in the position to find out about all grants
>>>> without further hypervisor help.
>>> The interface means the code behind verify_frames.
>>>
>>> If there are indeed ways to query which pages belong to grants, how would the toolstack need to do that?
>> Map the grant table of the guest and walk it, recording any MFN for
>> which at least one valid r/w grant exists.
> 
> That doesn't help - Its still racy with in-flight IO.

Well, I meant the recorded data to be used to simply not verify
those frames.

>  Also with updates from Xen such as the wallclocks.

This doesn't occur for a paused domain, does it?

> The only way to fix the IO problem is to disconnect the blk/net rings
> before doing the final sweep for frames, but that clobbers any ability
> to restart the VM on the source side if things go wrong at the destination.
> 
> I don't have an answer at all for the vcpu info frames.

Yeah, they fall in the same category as the FIFO control pages, as
they contain evtchn_{pending_sel,upcall_pending}.

Jan
diff mbox series

Patch

diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 2ba7c3200cd5..f0e2bd048d37 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -752,7 +752,7 @@  static int send_domain_memory_live(struct xc_sr_context *ctx)
     if ( rc )
         goto out;
 
-    if ( ctx->save.debug && ctx->stream_type != XC_STREAM_PLAIN )
+    if ( ctx->save.debug && ctx->stream_type == XC_STREAM_PLAIN )
     {
         rc = verify_frames(ctx);
         if ( rc )