diff mbox

[v6,4/7] xen/x86: parse Dom0 kernel for PVHv2

Message ID 20170210123351.73526-5-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Roger Pau Monné Feb. 10, 2017, 12:33 p.m. UTC
Introduce a helper to parse the Dom0 kernel.

A new helper is also introduced to libelf, that's used to store the destination
vcpu of the domain. This parameter is needed when loading the kernel on a HVM
domain (PVHv2), since hvm_copy_to_guest_phys requires passing the destination
vcpu.

While there also fix image_base and image_start to be of type "void *", and do
the necessary fixup of related functions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes since v5:
 - s/hvm_copy_to_guest_phys_vcpu/hvm_copy_to_guest_phys/.
 - Use void * for image_base and image_start, make the necessary changes.
 - Introduce elf_set_vcpu in order to store the destination vcpu in
   elf_binary, and use it in elf_load_image. This avoids having to override
   current.
 - Style fixes.
 - Round up the position of the modlist/start_info to an aligned address
   depending on the kernel bitness.

Changes since v4:
 - s/hvm/pvh.
 - Use hvm_copy_to_guest_phys_vcpu.

Changes since v3:
 - Change one error message.
 - Indent "out" label by one space.
 - Introduce hvm_copy_to_phys and slightly simplify the code in hvm_load_kernel.

Changes since v2:
 - Remove debug messages.
 - Don't hardcode the number of modules to 1.
---
 xen/arch/x86/bzimage.c            |   3 +-
 xen/arch/x86/domain_build.c       | 136 +++++++++++++++++++++++++++++++++++++-
 xen/common/libelf/libelf-loader.c |  13 +++-
 xen/include/asm-x86/bzimage.h     |   2 +-
 xen/include/xen/libelf.h          |   6 ++
 5 files changed, 155 insertions(+), 5 deletions(-)

Comments

Ian Jackson Feb. 10, 2017, 2:34 p.m. UTC | #1
Roger Pau Monne writes ("[PATCH v6 4/7] xen/x86: parse Dom0 kernel for PVHv2"):
> Introduce a helper to parse the Dom0 kernel.
> 
> A new helper is also introduced to libelf, that's used to store the
> destination vcpu of the domain. This parameter is needed when
> loading the kernel on a HVM domain (PVHv2), since
> hvm_copy_to_guest_phys requires passing the destination vcpu.

The new helper and variable seems fine to me.

> While there also fix image_base and image_start to be of type "void
> *", and do the necessary fixup of related functions.

IMO this should be separate patch(es).

> +static int __init pvh_load_kernel(struct domain *d, const module_t *image,
> +                                  unsigned long image_headroom,
> +                                  module_t *initrd, void *image_base,
> +                                  char *cmdline, paddr_t *entry,
> +                                  paddr_t *start_info_addr)
> +{

FAOD this is used for dom0 only, right ?  In which case I don't feel
the need to review it.

> diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c
> index 1644f16..de140ed 100644
> --- a/xen/common/libelf/libelf-loader.c
> +++ b/xen/common/libelf/libelf-loader.c
> @@ -153,10 +153,19 @@ static elf_errorstatus elf_load_image(struct elf_binary *elf, elf_ptrval dst, el
>          return -1;
>      /* We trust the dom0 kernel image completely, so we don't care
>       * about overruns etc. here. */
> -    rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src), filesz);
> +    if ( is_hvm_vcpu(elf->vcpu) )
> +        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst),
> +                                    ELF_UNSAFE_PTR(src), filesz, elf->vcpu);
> +    else
> +        rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src),
> +                               filesz);
>      if ( rc != 0 )
>          return -1;
> -    rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);
> +    if ( is_hvm_vcpu(elf->vcpu) )
> +        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst + filesz),
> +                                    NULL, filesz, elf->vcpu);
> +    else
> +        rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);

This seems to involve open coding all four elements of a 2x2 matrix.
Couldn't you provide a helper function that:
 * Checks is_hvm_vcpu
 * Has the "NULL means clear" behaviour which I infer
   hvm_copy_to_guest_phys has
 * Calls hvm_copy_to_guest_phys or raw_{copy_to,clear}_guest
(Does raw_copy_to_guest have the "NULL means clear" feature ?  Maybe
that feature should be added, further lifting that into more general
code.)

Then the source and destination calculations would be done once for
each part, rather than twice, and the is_hvm_vcpu condition would be
done once rather than twice.

Thanks,
Ian.
Roger Pau Monné Feb. 13, 2017, 12:07 p.m. UTC | #2
On Fri, Feb 10, 2017 at 02:34:16PM +0000, Ian Jackson wrote:
> Roger Pau Monne writes ("[PATCH v6 4/7] xen/x86: parse Dom0 kernel for PVHv2"):
> > Introduce a helper to parse the Dom0 kernel.
> > 
> > A new helper is also introduced to libelf, that's used to store the
> > destination vcpu of the domain. This parameter is needed when
> > loading the kernel on a HVM domain (PVHv2), since
> > hvm_copy_to_guest_phys requires passing the destination vcpu.
> 
> The new helper and variable seems fine to me.
> 
> > While there also fix image_base and image_start to be of type "void
> > *", and do the necessary fixup of related functions.
> 
> IMO this should be separate patch(es).

Thanks, I've now splitted it.

> > +static int __init pvh_load_kernel(struct domain *d, const module_t *image,
> > +                                  unsigned long image_headroom,
> > +                                  module_t *initrd, void *image_base,
> > +                                  char *cmdline, paddr_t *entry,
> > +                                  paddr_t *start_info_addr)
> > +{
> 
> FAOD this is used for dom0 only, right ?  In which case I don't feel
> the need to review it.

Right, this is Dom0 only.

> > diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c
> > index 1644f16..de140ed 100644
> > --- a/xen/common/libelf/libelf-loader.c
> > +++ b/xen/common/libelf/libelf-loader.c
> > @@ -153,10 +153,19 @@ static elf_errorstatus elf_load_image(struct elf_binary *elf, elf_ptrval dst, el
> >          return -1;
> >      /* We trust the dom0 kernel image completely, so we don't care
> >       * about overruns etc. here. */
> > -    rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src), filesz);
> > +    if ( is_hvm_vcpu(elf->vcpu) )
> > +        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst),
> > +                                    ELF_UNSAFE_PTR(src), filesz, elf->vcpu);
> > +    else
> > +        rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src),
> > +                               filesz);
> >      if ( rc != 0 )
> >          return -1;
> > -    rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);
> > +    if ( is_hvm_vcpu(elf->vcpu) )
> > +        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst + filesz),
> > +                                    NULL, filesz, elf->vcpu);
> > +    else
> > +        rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);
> 
> This seems to involve open coding all four elements of a 2x2 matrix.
> Couldn't you provide a helper function that:
>  * Checks is_hvm_vcpu
>  * Has the "NULL means clear" behaviour which I infer
>    hvm_copy_to_guest_phys has
>  * Calls hvm_copy_to_guest_phys or raw_{copy_to,clear}_guest
> (Does raw_copy_to_guest have the "NULL means clear" feature ?  Maybe
> that feature should be added, further lifting that into more general
> code.)

In the pv case raw_copy_to_guest calls clear_user which is a specific function
to zero a memory area, and AFAICT raw_copy_to_guest doesn't have the NULL means
clear. I've added the following helper, let me know if that looks fine:

static elf_errorstatus elf_memcpy(struct vcpu *v, void *dst, void *src,
                                  uint64_t size)
{
    int rc = 0;

#ifdef CONFIG_X86
    if ( is_hvm_vcpu(v) &&
         hvm_copy_to_guest_phys((paddr_t)dst, src, size, v) != HVMCOPY_okay )
        rc = -1;
    else
#endif
        rc = src == NULL ? raw_clear_guest(dst, size) :
                           raw_copy_to_guest(dst, src, size);

    return rc;
}

Roger.
diff mbox

Patch

diff --git a/xen/arch/x86/bzimage.c b/xen/arch/x86/bzimage.c
index 50ebb84..124c386 100644
--- a/xen/arch/x86/bzimage.c
+++ b/xen/arch/x86/bzimage.c
@@ -104,7 +104,8 @@  unsigned long __init bzimage_headroom(char *image_start,
     return headroom;
 }
 
-int __init bzimage_parse(char *image_base, char **image_start, unsigned long *image_len)
+int __init bzimage_parse(void *image_base, void **image_start,
+                         unsigned long *image_len)
 {
     struct setup_header *hdr = (struct setup_header *)(*image_start);
     int err = bzimage_check(hdr, *image_len);
diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index be50b65..01c9348 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -39,6 +39,7 @@ 
 #include <asm/hpet.h>
 
 #include <public/version.h>
+#include <public/arch-x86/hvm/start_info.h>
 
 static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
@@ -1026,7 +1027,7 @@  static int __init construct_dom0_pv(
     unsigned long long value;
     char *image_base = bootstrap_map(image);
     unsigned long image_len = image->mod_end;
-    char *image_start = image_base + image_headroom;
+    void *image_start = image_base + image_headroom;
     unsigned long initrd_len = initrd ? initrd->mod_end : 0;
     l4_pgentry_t *l4tab = NULL, *l4start = NULL;
     l3_pgentry_t *l3tab = NULL, *l3start = NULL;
@@ -1457,6 +1458,7 @@  static int __init construct_dom0_pv(
     /* Copy the OS image and free temporary buffer. */
     elf.dest_base = (void*)vkern_start;
     elf.dest_size = vkern_end - vkern_start;
+    elf_set_vcpu(&elf, v);
     rc = elf_load_binary(&elf);
     if ( rc < 0 )
     {
@@ -2015,12 +2017,136 @@  static int __init pvh_setup_p2m(struct domain *d)
 #undef MB1_PAGES
 }
 
+static int __init pvh_load_kernel(struct domain *d, const module_t *image,
+                                  unsigned long image_headroom,
+                                  module_t *initrd, void *image_base,
+                                  char *cmdline, paddr_t *entry,
+                                  paddr_t *start_info_addr)
+{
+    void *image_start = image_base + image_headroom;
+    unsigned long image_len = image->mod_end;
+    struct elf_binary elf;
+    struct elf_dom_parms parms;
+    paddr_t last_addr;
+    struct hvm_start_info start_info = { 0 };
+    struct hvm_modlist_entry mod = { 0 };
+    struct vcpu *v = d->vcpu[0];
+    int rc;
+
+    if ( (rc = bzimage_parse(image_base, &image_start, &image_len)) != 0 )
+    {
+        printk("Error trying to detect bz compressed kernel\n");
+        return rc;
+    }
+
+    if ( (rc = elf_init(&elf, image_start, image_len)) != 0 )
+    {
+        printk("Unable to init ELF\n");
+        return rc;
+    }
+#ifdef VERBOSE
+    elf_set_verbose(&elf);
+#endif
+    elf_parse_binary(&elf);
+    if ( (rc = elf_xen_parse(&elf, &parms)) != 0 )
+    {
+        printk("Unable to parse kernel for ELFNOTES\n");
+        return rc;
+    }
+
+    if ( parms.phys_entry == UNSET_ADDR32 )
+    {
+        printk("Unable to find XEN_ELFNOTE_PHYS32_ENTRY address\n");
+        return -EINVAL;
+    }
+
+    printk("OS: %s version: %s loader: %s bitness: %s\n", parms.guest_os,
+           parms.guest_ver, parms.loader,
+           elf_64bit(&elf) ? "64-bit" : "32-bit");
+
+    /* Copy the OS image and free temporary buffer. */
+    elf.dest_base = (void *)(parms.virt_kstart - parms.virt_base);
+    elf.dest_size = parms.virt_kend - parms.virt_kstart;
+
+    elf_set_vcpu(&elf, v);
+    rc = elf_load_binary(&elf);
+    if ( rc < 0 )
+    {
+        printk("Failed to load kernel: %d\n", rc);
+        printk("Xen dom0 kernel broken ELF: %s\n", elf_check_broken(&elf));
+        return rc;
+    }
+
+    last_addr = ROUNDUP(parms.virt_kend - parms.virt_base, PAGE_SIZE);
+
+    if ( initrd != NULL )
+    {
+        rc = hvm_copy_to_guest_phys(last_addr, mfn_to_virt(initrd->mod_start),
+                                    initrd->mod_end, v);
+        if ( rc )
+        {
+            printk("Unable to copy initrd to guest\n");
+            return rc;
+        }
+
+        mod.paddr = last_addr;
+        mod.size = initrd->mod_end;
+        last_addr += ROUNDUP(initrd->mod_end, PAGE_SIZE);
+    }
+
+    /* Free temporary buffers. */
+    discard_initial_images();
+
+    if ( cmdline != NULL )
+    {
+        rc = hvm_copy_to_guest_phys(last_addr, cmdline, strlen(cmdline) + 1, v);
+        if ( rc )
+        {
+            printk("Unable to copy guest command line\n");
+            return rc;
+        }
+        start_info.cmdline_paddr = last_addr;
+        /*
+         * Round up to 32/64 bits (depending on the guest kernel bitness) so
+         * the modlist/start_info is aligned.
+         */
+        last_addr += ROUNDUP(strlen(cmdline) + 1, elf_64bit(&elf) ? 8 : 4);
+    }
+    if ( initrd != NULL )
+    {
+        rc = hvm_copy_to_guest_phys(last_addr, &mod, sizeof(mod), v);
+        if ( rc )
+        {
+            printk("Unable to copy guest modules\n");
+            return rc;
+        }
+        start_info.modlist_paddr = last_addr;
+        start_info.nr_modules = 1;
+        last_addr += sizeof(mod);
+    }
+
+    start_info.magic = XEN_HVM_START_MAGIC_VALUE;
+    start_info.flags = SIF_PRIVILEGED | SIF_INITDOMAIN;
+    rc = hvm_copy_to_guest_phys(last_addr, &start_info, sizeof(start_info), v);
+    if ( rc )
+    {
+        printk("Unable to copy start info to guest\n");
+        return rc;
+    }
+
+    *entry = parms.phys_entry;
+    *start_info_addr = last_addr;
+
+    return 0;
+}
+
 static int __init construct_dom0_pvh(struct domain *d, const module_t *image,
                                      unsigned long image_headroom,
                                      module_t *initrd,
                                      void *(*bootstrap_map)(const module_t *),
                                      char *cmdline)
 {
+    paddr_t entry, start_info;
     int rc;
 
     printk("** Building a PVH Dom0 **\n");
@@ -2034,6 +2160,14 @@  static int __init construct_dom0_pvh(struct domain *d, const module_t *image,
         return rc;
     }
 
+    rc = pvh_load_kernel(d, image, image_headroom, initrd, bootstrap_map(image),
+                         cmdline, &entry, &start_info);
+    if ( rc )
+    {
+        printk("Failed to load Dom0 kernel\n");
+        return rc;
+    }
+
     panic("Building a PVHv2 Dom0 is not yet supported.");
     return 0;
 }
diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c
index 1644f16..de140ed 100644
--- a/xen/common/libelf/libelf-loader.c
+++ b/xen/common/libelf/libelf-loader.c
@@ -153,10 +153,19 @@  static elf_errorstatus elf_load_image(struct elf_binary *elf, elf_ptrval dst, el
         return -1;
     /* We trust the dom0 kernel image completely, so we don't care
      * about overruns etc. here. */
-    rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src), filesz);
+    if ( is_hvm_vcpu(elf->vcpu) )
+        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst),
+                                    ELF_UNSAFE_PTR(src), filesz, elf->vcpu);
+    else
+        rc = raw_copy_to_guest(ELF_UNSAFE_PTR(dst), ELF_UNSAFE_PTR(src),
+                               filesz);
     if ( rc != 0 )
         return -1;
-    rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);
+    if ( is_hvm_vcpu(elf->vcpu) )
+        rc = hvm_copy_to_guest_phys((paddr_t)ELF_UNSAFE_PTR(dst + filesz),
+                                    NULL, filesz, elf->vcpu);
+    else
+        rc = raw_clear_guest(ELF_UNSAFE_PTR(dst + filesz), memsz - filesz);
     if ( rc != 0 )
         return -1;
     return 0;
diff --git a/xen/include/asm-x86/bzimage.h b/xen/include/asm-x86/bzimage.h
index b48a256..a51f583 100644
--- a/xen/include/asm-x86/bzimage.h
+++ b/xen/include/asm-x86/bzimage.h
@@ -6,7 +6,7 @@ 
 
 unsigned long bzimage_headroom(char *image_start, unsigned long image_length);
 
-int bzimage_parse(char *image_base, char **image_start,
+int bzimage_parse(void *image_base, void **image_start,
                   unsigned long *image_len);
 
 #endif /* __X86_BZIMAGE_H__ */
diff --git a/xen/include/xen/libelf.h b/xen/include/xen/libelf.h
index 1b763f3..b739981 100644
--- a/xen/include/xen/libelf.h
+++ b/xen/include/xen/libelf.h
@@ -212,6 +212,8 @@  struct elf_binary {
     /* misc */
     elf_log_callback *log_callback;
     void *log_caller_data;
+#else
+    struct vcpu *vcpu;
 #endif
     bool verbose;
     const char *broken;
@@ -351,6 +353,10 @@  elf_errorstatus elf_init(struct elf_binary *elf, const char *image, size_t size)
    */
 #ifdef __XEN__
 void elf_set_verbose(struct elf_binary *elf);
+static inline void elf_set_vcpu(struct elf_binary *elf, struct vcpu *v)
+{
+    elf->vcpu = v;
+}
 #else
 void elf_set_log(struct elf_binary *elf, elf_log_callback*,
                  void *log_caller_pointer, bool verbose);