Message ID | 1453187862-24331-4-git-send-email-wency@cn.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 2016-01-19 at 15:17 +0800, Wen Congyang wrote: > For example: if the secondary host is down, and we fail to send the data > to > the secondary host. xc_domain_save() returns 0. So in the function > libxl__xc_domain_save_done(), rc is 0 (the helper program exits > normally), > and retval is 0 (it is xc_domain_save()'s return value). In such case, we > just need to complete the stream. > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > tools/libxl/libxl.h | 4 ++++ > tools/libxl/libxl_stream_write.c | 14 ++++++++++++-- > 2 files changed, 16 insertions(+), 2 deletions(-) > > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h > index 7114491..df6c7a3 100644 > --- a/tools/libxl/libxl.h > +++ b/tools/libxl/libxl.h > @@ -1215,6 +1215,10 @@ int libxl_domain_resume(libxl_ctx *ctx, uint32_t > domid, int suspend_cancel, > const libxl_asyncop_how *ao_how) > LIBXL_EXTERNAL_CALLERS_ONLY; > > +/* > + * This function doesn't return until something is wrong, and we need to > + * do failover from secondary. This function runs on the primary, doesn't it? and failover would be from primary to secondary. So I think a more accurate wording would be: /* * This function doesn't return unless something has gone wrong with the * replication to the secondary. If this function returns then the caller * should resume the (primary) domain. */ I'm happy to edit the text on commit if you agree with the proposed wording. The code looks good. Thanks, Ian.
On 01/19/2016 06:55 PM, Ian Campbell wrote: > On Tue, 2016-01-19 at 15:17 +0800, Wen Congyang wrote: >> For example: if the secondary host is down, and we fail to send the data >> to >> the secondary host. xc_domain_save() returns 0. So in the function >> libxl__xc_domain_save_done(), rc is 0 (the helper program exits >> normally), >> and retval is 0 (it is xc_domain_save()'s return value). In such case, we >> just need to complete the stream. >> >> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> >> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> >> --- >> tools/libxl/libxl.h | 4 ++++ >> tools/libxl/libxl_stream_write.c | 14 ++++++++++++-- >> 2 files changed, 16 insertions(+), 2 deletions(-) >> >> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h >> index 7114491..df6c7a3 100644 >> --- a/tools/libxl/libxl.h >> +++ b/tools/libxl/libxl.h >> @@ -1215,6 +1215,10 @@ int libxl_domain_resume(libxl_ctx *ctx, uint32_t >> domid, int suspend_cancel, >> const libxl_asyncop_how *ao_how) >> LIBXL_EXTERNAL_CALLERS_ONLY; >> >> +/* >> + * This function doesn't return until something is wrong, and we need to >> + * do failover from secondary. > > This function runs on the primary, doesn't it? and failover would be from > primary to secondary. Yes, it runs on the primary > > So I think a more accurate wording would be: > > /* > * This function doesn't return unless something has gone wrong with the > * replication to the secondary. If this function returns then the caller > * should resume the (primary) domain. > */ > > I'm happy to edit the text on commit if you agree with the proposed > wording. The code looks good. I agree with that. Thanks Wen Congyang > > Thanks, > Ian. > > > > . >
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 7114491..df6c7a3 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1215,6 +1215,10 @@ int libxl_domain_resume(libxl_ctx *ctx, uint32_t domid, int suspend_cancel, const libxl_asyncop_how *ao_how) LIBXL_EXTERNAL_CALLERS_ONLY; +/* + * This function doesn't return until something is wrong, and we need to + * do failover from secondary. + */ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info, uint32_t domid, int send_fd, int recv_fd, const libxl_asyncop_how *ao_how) diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c index 80d9208..21b4b51 100644 --- a/tools/libxl/libxl_stream_write.c +++ b/tools/libxl/libxl_stream_write.c @@ -354,8 +354,18 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void, * alive, and check_all_finished() may have torn it down around us. * If the stream is not still alive, we must not continue any work. */ - if (libxl__stream_write_inuse(stream)) - write_emulator_xenstore_record(egc, stream); + if (libxl__stream_write_inuse(stream)) { + if (dss->remus) + /* + * For remus, if libxl__xc_domain_save_done() completes, + * there was an error sending data to the secondary. + * Resume the primary ASAP. The caller doesn't care of the + * return value (Please refer to libxl__remus_teardown()) + */ + stream_complete(egc, stream, 0); + else + write_emulator_xenstore_record(egc, stream); + } } static void write_emulator_xenstore_record(libxl__egc *egc,