From patchwork Tue Dec 22 18:45:03 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ian Jackson X-Patchwork-Id: 7906731 Return-Path: X-Original-To: patchwork-xen-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 1EABFBEEED for ; Tue, 22 Dec 2015 18:48:18 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id A29162057F for ; Tue, 22 Dec 2015 18:48:16 +0000 (UTC) Received: from lists.xen.org (lists.xenproject.org [50.57.142.19]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 39C9D20501 for ; Tue, 22 Dec 2015 18:48:15 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xen.org) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aBRwX-0001iz-Hk; Tue, 22 Dec 2015 18:45:53 +0000 Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aBRwW-0001hV-Ay for xen-devel@lists.xensource.com; Tue, 22 Dec 2015 18:45:52 +0000 Received: from [193.109.254.147] by server-6.bemta-14.messagelabs.com id 53/5A-16618-F5A99765; Tue, 22 Dec 2015 18:45:51 +0000 X-Env-Sender: prvs=791371759=Ian.Jackson@citrix.com X-Msg-Ref: server-3.tower-27.messagelabs.com!1450809948!12405671!2 X-Originating-IP: [66.165.176.63] X-SpamReason: No, hits=0.0 required=7.0 tests=sa_preprocessor: VHJ1c3RlZCBJUDogNjYuMTY1LjE3Ni42MyA9PiAzMDYwNDg=\n, received_headers: No Received headers X-StarScan-Received: X-StarScan-Version: 7.35.1; banners=-,-,- X-VirusChecked: Checked Received: (qmail 18113 invoked from network); 22 Dec 2015 18:45:50 -0000 Received: from smtp02.citrix.com (HELO SMTP02.CITRIX.COM) (66.165.176.63) by server-3.tower-27.messagelabs.com with RC4-SHA encrypted SMTP; 22 Dec 2015 18:45:50 -0000 X-IronPort-AV: E=Sophos;i="5.20,465,1444694400"; d="scan'208";a="327002731" From: Ian Jackson To: Date: Tue, 22 Dec 2015 18:45:03 +0000 Message-ID: <1450809903-3393-29-git-send-email-ian.jackson@eu.citrix.com> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1450809903-3393-1-git-send-email-ian.jackson@eu.citrix.com> References: <1450809903-3393-1-git-send-email-ian.jackson@eu.citrix.com> MIME-Version: 1.0 X-DLP: MIA1 Cc: Ian Jackson , Wei Liu , Ian Campbell , Stefano Stabellini Subject: [Xen-devel] [PATCH 28/28] libxl: xsrestrict QEMU X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If QEMU supports xsrestrict, pass xsrestrict=on to it (by default). XXX We need to do this only if xenstored supports it, and AFAICT there is not a particularly easy way to test this. Should we open a new test xenstore connection to query this information ? We could do this once per libxl ctx. Allow the user to specify that xsrestrict should be on, in which case if it qemu cannot be depriv'd, we fail. When qemu is depriv'd it still needs write access to the physmap records in xenstore. It will be running with the privilege of the domain, so we need to allow the domain write access to /local/domain/$LIBXL_TOOLSTACK_DOMID/device-model/$DOMID/physmap It doesn't contain any information the guest doesn't already know. However be aware that it still means relaxing a tools/guest interface. The guest physmap contains the memory layout of the guest. The guest is already in possession of this information. A malicious guest could modify the physmap entries causing QEMU to read wrong information at restore time. QEMU would populate its XenPhysmap list with wrong entries that wouldn't match its own device emulation addresses, leading to a failure in restoring the domain. But it could not compromise the security of the system because the addresses are only used for checks against QEMU's addresses. Is it dangerous to let the guest write values that later QEMU and libxl read back? Let's dig a bit deeper into the code. QEMU and libxl use very similar functions to parse the physmap entries. Let's start checking libxl. The function that reads the physmap is libxl__toolstack_save. It calls: * libxl__xs_directory on /physmap This is safe. * libxl__xs_read If the path is wrong (nothing is there), it returns NULL, and it is handled correctly. * strtoll on the values read The values are under guest control but strtoll can handle bad formats. strtoll always returns an unsigned long long. In case of errors, it can return LLONG_MIN or LLONG_MAX. libxl__toolstack_save doesn't check for conversion errors, but it never uses the returned values anywhere. That's OK. libxl__toolstack_restore writes back the values to xenstore. So in case of errors: 1) libxl__toolstack_save could return early with an error, if the xenstore paths are wrong (nothing is on xenstore) 2) libxl__toolstack_restore could write back to xenstore any unsigned long long, including LLONG_MIN or LLONG_MAX Either way libxl is fine. From the QEMU side, the code is very similar and uses strtoll to parse the entries, that can deal with bad input values. The values are used to avoid memory allocations at restore time for memory that has already been allocated (video memory). If the values are wrong, QEMU will attempt another memory allocation, failing because the maxmem limit has been reached. Either way QEMU is fine. There are no other out-of-guest consumers of this information. Signed-off-by: Stefano Stabellini Signed-off-by: Ian Jackson --- v6: Merge the xsrestrict and the permissions update patches, so that the permission relaxation is done only when needed. Rip out qemu feature checking code, now in new (earlier) patch Rip out emulator_id, now in new (earlier) patch Provide and document a config option to control restriction. Make the default depend on specified dm user, and on stubdomness. Arrange that when we are going to xsrestrict, we will run two qemus, as required Move rwperm initialisaton higher up the relevant function to help avoid future changes using it uninitialised. Grant write access on to the physmap subdirectory, and drop discussion of the rest from the commit message, and adjust the xenstore doc accordingly. Move the physmap number of entries limit into its own patch. v5 (permissions): improve commit message with security details v4 (permissions): add error message in case count > MAX_PHYSMAP_ENTRIES add a note to xenstore-paths.markdown about the possible change in privilege level only change permissions if xsrestrict is supported v4 (xsrestrict): update xenstore-paths.markdown v3 (xsrestrict): add emulator_ids mark as WIP v3 (permissions): use LIBXL_TOOLSTACK_DOMID instead of 0 in the commit message update commit message with more info on why it is safe add a limit on the number of physmap entries to save and restore v2 (permissioins): fix permissions to actually do what intended use LIBXL_TOOLSTACK_DOMID instead of 0 --- docs/man/xl.cfg.pod.5 | 14 ++++++++++ docs/misc/xenstore-paths.markdown | 7 +++++ tools/libxl/libxl_create.c | 51 +++++++++++++++++++++++++++++++++++-- tools/libxl/libxl_dm.c | 30 ++++++++++++++++++++-- tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c | 2 ++ 6 files changed, 101 insertions(+), 4 deletions(-) diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index dd93725..7c0bac3 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -1842,6 +1842,20 @@ Run the device model as user "username", instead of xen-qemudepriv-domid$domid or xen-qemudepriv-shared or root. The default is to use the first of these users which exists. +=item B + +Restrict the device model's access to hypervisor facilities (such as +xenstore). The default is true if device_model_user is not `root` and +the qemu in use supports this option (ie, when the restriction is both +possible and beneficial), or false otherwise. + +If the option is set to true (1) but the restriction is not possible +or not effective, the domain configuration is rejected. + +This option is ignored when a stub domain device model is in use, +because in that situation the device model is restricted by virtue of +being in its own domain. + =back =head2 Keymaps diff --git a/docs/misc/xenstore-paths.markdown b/docs/misc/xenstore-paths.markdown index 1dc54ac..27981de 100644 --- a/docs/misc/xenstore-paths.markdown +++ b/docs/misc/xenstore-paths.markdown @@ -339,6 +339,13 @@ A PV console backend. Described in [console.txt](console.txt) Information relating to device models running in the domain. $DOMID is the target domain of the device model. +#### ~/device-model/$DOMID/physmap [w] + +Information about an HVM domain's physical memory map. This is +written as well as read by device models which may run at the same +privilege level of the guest domain. When the device model ruus with +system privilege, this path is INTERNAL. + #### ~/device-model/$DOMID[/$EMULATOR_ID]/* [INTERNAL] Information relating to device models running in the domain. $DOMID is diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 59dfcd67..5182d2b 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -478,6 +478,49 @@ static int domcreate_setdefault_dm_user(libxl__gc *gc, return rc; } +static int domcreate_setdefault_dm_restrict(libxl__gc *gc, + libxl__domain_create_state *dcs) +{ + /* convenience aliases */ + libxl_domain_config *const d_config = dcs->guest_config; + libxl_domain_build_info *b_info = &d_config->b_info; + + if (!libxl_defbool_is_default(b_info->device_model_restrict) && + !libxl_defbool_val(b_info->device_model_restrict)) + /* explicitly false */ + return 0; + + if (b_info->type != LIBXL_DOMAIN_TYPE_HVM || + libxl_defbool_val(b_info->device_model_stubdomain)) { + /* in theory we don't care, so default it to true for + * the benefit of future callers or in case of bugs */ + libxl_defbool_setdefault(&b_info->device_model_restrict, 1); + return 0; + } + + const char *dm = libxl__domain_device_model(gc, b_info); + + const char *cannot_restrict = + !strcmp(b_info->device_model_user, "root") + ? "device model user is root" : + !libxl__dm_supported(gc, dm, libxl__dm_support_check__xsrestrict) + ? "device model does not support xs_restrict" : + !libxl__dm_supported(gc, dm, libxl__dm_support_check__emulator_id) + ? "device model does not support emulator_id" : + 0; + + libxl_defbool_setdefault(&b_info->device_model_restrict, !cannot_restrict); + + if (libxl_defbool_val(b_info->device_model_restrict) && + cannot_restrict) { + LOG(ERROR, "device model restriction desired but not possible: %s", + cannot_restrict); + return ERROR_INVAL; + } + + return 0; +} + static void init_console_info(libxl__gc *gc, libxl__device_console *console, int dev_num) @@ -1075,11 +1118,15 @@ static void domcreate_dm_support_checked(libxl__egc *egc, rc = domcreate_setdefault_dm_user(gc, dcs); if (rc) goto out; + rc = domcreate_setdefault_dm_restrict(gc, dcs); + if (rc) goto out; + /* run two qemus? */ if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM && !libxl_defbool_val(d_config->b_info.device_model_stubdomain) && - b_info->device_model_user && - strcmp(b_info->device_model_user, "root")) { + ((b_info->device_model_user && + strcmp(b_info->device_model_user, "root")) || + libxl_defbool_val(d_config->b_info.device_model_restrict))) { rc = libxl__dm_emuidmap_add(gc, domid, b_state, EMUID_SPLIT); if (rc) goto out; } diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index d805800..0ee956f 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -420,6 +420,14 @@ static int libxl__build_device_model_args_old(libxl__gc *gc, b_info->device_model_user); return ERROR_INVAL; } + if (libxl_defbool_val(b_info->device_model_restrict) && + !libxl_defbool_val(b_info->device_model_stubdomain)) { + /* The dm support check ought to prevent this, but in case + * someone has done something odd, we double check. */ + LOG(ERROR, + "device_model_restrict not supported by qemu-xen-traditional"); + return ERROR_INVAL; + } flexarray_vappend(dm_args, dm, "-d", GCSPRINTF("%d", domid), NULL); @@ -1238,8 +1246,14 @@ static int libxl__build_device_model_args_new(libxl__gc *gc, } if (state->emuidmap & (1u << EMUID_SPLIT)) { flexarray_append(dm_args, "-xenopts"); - flexarray_append(dm_args, - GCSPRINTF("emulator_id=%u", emuid)); + flexarray_append(dm_args, GCSPRINTF("%semulator_id=%u", + libxl_defbool_val(b_info->device_model_restrict) + ? "xsrestrict=on," : "", + emuid)); + } else { + /* xsrestrict only makes sense with split qemu because + * the pv devices need unrestricted access */ + assert(!libxl_defbool_val(b_info->device_model_restrict)); } } flexarray_append(dm_args, NULL); @@ -1760,11 +1774,17 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss, char **pass_stuff; const char *dm; int dm_state_fd = -1; + struct xs_permissions rwperm[2]; if (libxl_defbool_val(b_info->device_model_stubdomain)) { abort(); } + rwperm[0].id = LIBXL_TOOLSTACK_DOMID; + rwperm[0].perms = XS_PERM_NONE; + rwperm[1].id = domid; + rwperm[1].perms = XS_PERM_WRITE; + rc = libxl__dm_emuidmap_add(gc, domid, state, emuid); if (rc) goto out; @@ -1804,6 +1824,12 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss, domid, emuid, ""); xs_mkdir(ctx->xsh, XBT_NULL, path); + if (libxl_defbool_val(b_info->device_model_restrict)) { + rc = libxl__xs_mknod(gc, XBT_NULL, GCSPRINTF("%s/physmap", path), + rwperm, 2); + if (rc) goto out; + } + if (b_info->type == LIBXL_DOMAIN_TYPE_HVM && b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL) diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 9658356..d5890a8 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -443,6 +443,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("device_model_ssidref", uint32), ("device_model_ssid_label", string), ("device_model_user", string), + ("device_model_restrict", libxl_defbool), # extra parameters pass directly to qemu, NULL terminated ("extra", libxl_string_list), diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index f9933cb..bdef4dc 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -2176,6 +2176,8 @@ skip_vfb: } } + xlu_cfg_get_defbool(config, "device_model_restrict", + &b_info->device_model_restrict, 0); xlu_cfg_replace_string (config, "device_model_override", &b_info->device_model, 0);