diff mbox

[3/4] tools/libxc: Avoid generating inappropriate zero-length records

Message ID 1469121457-365-4-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andrew Cooper July 21, 2016, 5:17 p.m. UTC
It was never intended for records such as these to be sent with zero content.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/xc_sr_save_x86_hvm.c |  4 ++++
 tools/libxc/xc_sr_save_x86_pv.c  | 12 ++++++++++++
 2 files changed, 16 insertions(+)

Comments

Wei Liu July 25, 2016, 9:45 a.m. UTC | #1
On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
> It was never intended for records such as these to be sent with zero content.
> 

Wouldn't it be better to modify write_split_record to ignore zero
content instead of patching up different places?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxc/xc_sr_save_x86_hvm.c |  4 ++++
>  tools/libxc/xc_sr_save_x86_pv.c  | 12 ++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/tools/libxc/xc_sr_save_x86_hvm.c b/tools/libxc/xc_sr_save_x86_hvm.c
> index ba50a43..5401bf9 100644
> --- a/tools/libxc/xc_sr_save_x86_hvm.c
> +++ b/tools/libxc/xc_sr_save_x86_hvm.c
> @@ -112,6 +112,10 @@ static int write_hvm_params(struct xc_sr_context *ctx)
>          }
>      }
>  
> +    /* No params? Skip this record. */
> +    if ( hdr.count == 0 )
> +        return 0;
> +
>      rc = write_split_record(ctx, &rec, entries, hdr.count * sizeof(*entries));
>      if ( rc )
>          PERROR("Failed to write HVM_PARAMS record");
> diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
> index 4a29460..5fb9f2f 100644
> --- a/tools/libxc/xc_sr_save_x86_pv.c
> +++ b/tools/libxc/xc_sr_save_x86_pv.c
> @@ -607,6 +607,10 @@ static int write_one_vcpu_extended(struct xc_sr_context *ctx, uint32_t id)
>          return -1;
>      }
>  
> +    /* No content? Skip the record. */
> +    if ( domctl.u.ext_vcpucontext.size == 0 )
> +        return 0;
> +
>      return write_split_record(ctx, &rec, &domctl.u.ext_vcpucontext,
>                                domctl.u.ext_vcpucontext.size);
>  }
> @@ -662,6 +666,10 @@ static int write_one_vcpu_xsave(struct xc_sr_context *ctx, uint32_t id)
>          goto err;
>      }
>  
> +    /* No xsave state? Skip this record. */
> +    if ( domctl.u.vcpuextstate.size == 0 )
> +        goto out;
> +
>      rc = write_split_record(ctx, &rec, buffer, domctl.u.vcpuextstate.size);
>      if ( rc )
>          goto err;
> @@ -728,6 +736,10 @@ static int write_one_vcpu_msrs(struct xc_sr_context *ctx, uint32_t id)
>          goto err;
>      }
>  
> +    /* No MSRs? Skip this record. */
> +    if ( domctl.u.vcpu_msrs.msr_count == 0 )
> +        goto out;
> +
>      rc = write_split_record(ctx, &rec, buffer,
>                              domctl.u.vcpu_msrs.msr_count *
>                              sizeof(xen_domctl_vcpu_msr_t));
> -- 
> 2.1.4
>
Andrew Cooper July 25, 2016, 9:57 a.m. UTC | #2
On 25/07/16 10:45, Wei Liu wrote:
> On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
>> It was never intended for records such as these to be sent with zero content.
>>
> Wouldn't it be better to modify write_split_record to ignore zero
> content instead of patching up different places?

That would catch the HVM_PARAMS, but not the other 3, which have an
extra 8 byte header.

However, having write_split_record() conditionally drop records is a
violation of the principle of least surprise.  Choosing to skip a record
is an important decision and needs to be easy to spot in the code.

~Andrew
Wei Liu July 25, 2016, 10:14 a.m. UTC | #3
On Mon, Jul 25, 2016 at 10:57:13AM +0100, Andrew Cooper wrote:
> On 25/07/16 10:45, Wei Liu wrote:
> > On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
> >> It was never intended for records such as these to be sent with zero content.
> >>
> > Wouldn't it be better to modify write_split_record to ignore zero
> > content instead of patching up different places?
> 
> That would catch the HVM_PARAMS, but not the other 3, which have an
> extra 8 byte header.
> 
> However, having write_split_record() conditionally drop records is a
> violation of the principle of least surprise.  Choosing to skip a record
> is an important decision and needs to be easy to spot in the code.
> 

Fair enough.

> ~Andrew
David Vrabel July 25, 2016, 10:32 a.m. UTC | #4
On 21/07/16 18:17, Andrew Cooper wrote:
> It was never intended for records such as these to be sent with zero content.

As the original author of the specification I'm perhaps best placed to
say what the original intention is.

For records such as HVM_PARAMS which consist of a set of N items, the
intention was to most definitely send a record with 0 items.

For records that fetch an opaque blob from the hypervisor, again the
intention was to sent this blob as-is with no sort of processing or
other checking. i.e., if the hypervisor gives us a zero-length blob we
sent that as-is.

This makes all the streams look the same with all the same records,
regardless of what hardware platform it was run on.  Including
zero-length/count records also makes diagnosing problems easier -- the
empty record is visible in the stream instead of having to remember that
sometimes these records are deliberately omitted.

As such, this series should be limited to making the restore side handle
the zero count sets or zero length blobs if it does not do so already.

The specification should be clarified to note that some records may have
zero-length blobs or contain zero items.

David
Ian Jackson July 25, 2016, 11:44 a.m. UTC | #5
David Vrabel writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records"):
> For records such as HVM_PARAMS which consist of a set of N items, the
> intention was to most definitely send a record with 0 items.
> 
> For records that fetch an opaque blob from the hypervisor, again the
> intention was to sent this blob as-is with no sort of processing or
> other checking. i.e., if the hypervisor gives us a zero-length blob we
> sent that as-is.
> 
> This makes all the streams look the same with all the same records,
> regardless of what hardware platform it was run on.  Including
> zero-length/count records also makes diagnosing problems easier -- the
> empty record is visible in the stream instead of having to remember that
> sometimes these records are deliberately omitted.
> 
> As such, this series should be limited to making the restore side handle
> the zero count sets or zero length blobs if it does not do so already.
> 
> The specification should be clarified to note that some records may have
> zero-length blobs or contain zero items.

I think I prefer David's view here, but I don't quite feel I
understand what the underlying bug is.

Ian.
Ian Jackson July 25, 2016, 5:15 p.m. UTC | #6
David Vrabel writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records"):
> On 21/07/16 18:17, Andrew Cooper wrote:
> > It was never intended for records such as these to be sent with zero content.
> 
> As the original author of the specification I'm perhaps best placed to
> say what the original intention is.

I think this discussion of `the intent' is not particularly helpful,
when the authors of the spec document, and of the code, disagree.  It
is in any case not necessary to decide what `the original intent' is
or was.  Accordingly, can all participants please stop referring to
`the intent' in this way.  (It is of course fine to write `_my_
intent'.)

What is necessary is to decide what should be done now.

I am going to quote liberally from the rest of David's mail but adjust
some of the wording to try to make it something I can agree with:

  For records such as HVM_PARAMS which consist of a set of N items,
  David's intention in the spec, was to most definitely send a record
  with 0 items.

  For records that fetch an opaque blob from the hypervisor, again,
  David's intention was to send this blob as-is with no sort of
  processing or other checking. i.e., if the hypervisor gives us a
  zero-length blob we sent that as-is.

  This makes all the streams look the same with all the same records,
  regardless of what hardware platform it was run on.  Including
  zero-length/count records also makes diagnosing problems easier -- the
  empty record is visible in the stream instead of having to remember that
  sometimes these records are deliberately omitted.

  As such, IMO this series should be limited to making the restore
  side handle the zero count sets or zero length blobs if it does not
  do so already.

  The specification should be clarified to note that some records may have
  zero-length blobs or contain zero items.

I reviewed the spec in detail at the time and I agree with David's
point of view as I have rephrased (hopefully without annoying David)
above.

I see no reason why zero-content-length records should be treated as
any kind of special case.

Is the ultimate bug that we are tripping over here simply that the
code calls malloc(0) and then bails if the libc produces NULL (as it
is entitled to do) ?

Thanks,
Ian.
Wei Liu July 26, 2016, 9:23 a.m. UTC | #7
On Mon, Jul 25, 2016 at 06:15:37PM +0100, Ian Jackson wrote:
> David Vrabel writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records"):
> > On 21/07/16 18:17, Andrew Cooper wrote:
> > > It was never intended for records such as these to be sent with zero content.
> > 
> > As the original author of the specification I'm perhaps best placed to
> > say what the original intention is.
> 
> I think this discussion of `the intent' is not particularly helpful,
> when the authors of the spec document, and of the code, disagree.  It
> is in any case not necessary to decide what `the original intent' is
> or was.  Accordingly, can all participants please stop referring to
> `the intent' in this way.  (It is of course fine to write `_my_
> intent'.)
> 
> What is necessary is to decide what should be done now.
> 
> I am going to quote liberally from the rest of David's mail but adjust
> some of the wording to try to make it something I can agree with:
> 
>   For records such as HVM_PARAMS which consist of a set of N items,
>   David's intention in the spec, was to most definitely send a record
>   with 0 items.
> 
>   For records that fetch an opaque blob from the hypervisor, again,
>   David's intention was to send this blob as-is with no sort of
>   processing or other checking. i.e., if the hypervisor gives us a
>   zero-length blob we sent that as-is.
> 
>   This makes all the streams look the same with all the same records,
>   regardless of what hardware platform it was run on.  Including
>   zero-length/count records also makes diagnosing problems easier -- the
>   empty record is visible in the stream instead of having to remember that
>   sometimes these records are deliberately omitted.
> 
>   As such, IMO this series should be limited to making the restore
>   side handle the zero count sets or zero length blobs if it does not
>   do so already.
> 
>   The specification should be clarified to note that some records may have
>   zero-length blobs or contain zero items.
> 
> I reviewed the spec in detail at the time and I agree with David's
> point of view as I have rephrased (hopefully without annoying David)
> above.
> 
> I see no reason why zero-content-length records should be treated as
> any kind of special case.
> 
> Is the ultimate bug that we are tripping over here simply that the
> code calls malloc(0) and then bails if the libc produces NULL (as it
> is entitled to do) ?
> 

No, it isn't.

AIUI the issue is receiving end can't deal with zero-length record.  To
to more precise, it is the hypervisor that chokes when toolstack issues
an hypercall with the "malformed" data.

Hence the two approaches presented: one is to omit zero-length record,
the other is to tolerate zero-length record.

If we go with David's approach, I think hypervisor should be made
tolerant to zero-length record. That would be symmetric on both ends --
hv can spit out as well as accept zero-length records. Toolstack should
transparently send and receive records.

Wei.


> Thanks,
> Ian.
Ian Jackson July 26, 2016, 1:37 p.m. UTC | #8
Wei Liu writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records"):
> On Mon, Jul 25, 2016 at 06:15:37PM +0100, Ian Jackson wrote:
> > Is the ultimate bug that we are tripping over here simply that the
> > code calls malloc(0) and then bails if the libc produces NULL (as it
> > is entitled to do) ?
> 
> No, it isn't.
> 
> AIUI the issue is receiving end can't deal with zero-length record.  To
> to more precise, it is the hypervisor that chokes when toolstack issues
> an hypercall with the "malformed" data.

Oh.  So the hypervisor produces this zero-length data, but rejects it
when the same zero-length data is supplied back to it ?  That seems
like a bug in the hypervisor to me.

> If we go with David's approach, I think hypervisor should be made
> tolerant to zero-length record. That would be symmetric on both ends --
> hv can spit out as well as accept zero-length records. Toolstack should
> transparently send and receive records.

That would seem sensible to me.

Ian.
diff mbox

Patch

diff --git a/tools/libxc/xc_sr_save_x86_hvm.c b/tools/libxc/xc_sr_save_x86_hvm.c
index ba50a43..5401bf9 100644
--- a/tools/libxc/xc_sr_save_x86_hvm.c
+++ b/tools/libxc/xc_sr_save_x86_hvm.c
@@ -112,6 +112,10 @@  static int write_hvm_params(struct xc_sr_context *ctx)
         }
     }
 
+    /* No params? Skip this record. */
+    if ( hdr.count == 0 )
+        return 0;
+
     rc = write_split_record(ctx, &rec, entries, hdr.count * sizeof(*entries));
     if ( rc )
         PERROR("Failed to write HVM_PARAMS record");
diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
index 4a29460..5fb9f2f 100644
--- a/tools/libxc/xc_sr_save_x86_pv.c
+++ b/tools/libxc/xc_sr_save_x86_pv.c
@@ -607,6 +607,10 @@  static int write_one_vcpu_extended(struct xc_sr_context *ctx, uint32_t id)
         return -1;
     }
 
+    /* No content? Skip the record. */
+    if ( domctl.u.ext_vcpucontext.size == 0 )
+        return 0;
+
     return write_split_record(ctx, &rec, &domctl.u.ext_vcpucontext,
                               domctl.u.ext_vcpucontext.size);
 }
@@ -662,6 +666,10 @@  static int write_one_vcpu_xsave(struct xc_sr_context *ctx, uint32_t id)
         goto err;
     }
 
+    /* No xsave state? Skip this record. */
+    if ( domctl.u.vcpuextstate.size == 0 )
+        goto out;
+
     rc = write_split_record(ctx, &rec, buffer, domctl.u.vcpuextstate.size);
     if ( rc )
         goto err;
@@ -728,6 +736,10 @@  static int write_one_vcpu_msrs(struct xc_sr_context *ctx, uint32_t id)
         goto err;
     }
 
+    /* No MSRs? Skip this record. */
+    if ( domctl.u.vcpu_msrs.msr_count == 0 )
+        goto out;
+
     rc = write_split_record(ctx, &rec, buffer,
                             domctl.u.vcpu_msrs.msr_count *
                             sizeof(xen_domctl_vcpu_msr_t));