From patchwork Thu Nov 17 20:06:08 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Ostrovsky <boris.ostrovsky@oracle.com>
X-Patchwork-Id: 9435273
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: avid.vrabel@citrix.com, jgross@suse.com
Date: Thu, 17 Nov 2016 15:06:08 -0500
Message-Id: <1479413168-27131-1-git-send-email-boris.ostrovsky@oracle.com>
X-Mailer: git-send-email 2.7.4
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
 xen-devel@lists.xenproject.org, olaf@aepfle.de,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [Xen-devel] [PATCH v3] xen/gntdev: Use mempolicy instead of VM_IO
 flag to avoid NUMA balancing

Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
NUMA balancing") set the VM_IO flag to prevent grant maps from being
subjected to NUMA balancing.

It was discovered recently that this flag causes get_user_pages() to
always fail with -EFAULT.

    check_vma_flags
    __get_user_pages
    __get_user_pages_locked
    __get_user_pages_unlocked
    get_user_pages_fast
    iov_iter_get_pages
    dio_refill_pages
    do_direct_IO
    do_blockdev_direct_IO
    do_blockdev_direct_IO
    ext4_direct_IO_read
    generic_file_read_iter
    aio_run_iocb

(which can happen if the guest's vdisk has the direct-io-safe option).

To avoid this, don't use vm_flags. Instead, use a mempolicy that
prohibits page migration (i.e. clear MPOL_F_MOF|MPOL_F_MORON) and make
sure we don't consult the task's policy (which may include those flags)
if the vma doesn't have one.

Reported-by: Olaf Hering
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
---
Changes in v3:
* Don't use __mpol_dup() and get_task_policy() which are not exported for use
  by drivers. Add vm_operations_struct.get_policy().
* Copy to stable

 drivers/xen/gntdev.c |   27 ++++++++++++++++++++++++++-
 1 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index bb95212..632edd4 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -433,10 +434,28 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	return map->pages[(addr - map->pages_vm_start) >> PAGE_SHIFT];
 }
 
+#ifdef CONFIG_NUMA
+/*
+ * We have this op to make sure callers (such as vma_policy_mof()) don't
+ * check current task's policy which may include migrate flags (MPOL_F_MOF
+ * or MPOL_F_MORON)
+ */
+static struct mempolicy *gntdev_vma_get_policy(struct vm_area_struct *vma,
+					       unsigned long addr)
+{
+	if (mpol_needs_cond_ref(vma->vm_policy))
+		mpol_get(vma->vm_policy);
+	return vma->vm_policy;
+}
+#endif
+
 static const struct vm_operations_struct gntdev_vmops = {
 	.open = gntdev_vma_open,
 	.close = gntdev_vma_close,
 	.find_special_page = gntdev_vma_find_special_page,
+#ifdef CONFIG_NUMA
+	.get_policy = gntdev_vma_get_policy,
+#endif
 };
 
 /* ------------------------------------------------------------------ */
@@ -1007,7 +1026,13 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 
 	vma->vm_ops = &gntdev_vmops;
 
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
+	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+
+#ifdef CONFIG_NUMA
+	/* Prevent NUMA balancing */
+	if (vma->vm_policy)
+		vma->vm_policy->flags &= ~(MPOL_F_MOF | MPOL_F_MORON);
+#endif
 
 	if (use_ptemod)
 		vma->vm_flags |= VM_DONTCOPY;
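
For readers who want to see why both pieces of the change are needed (the
.get_policy op and the flag clearing in gntdev_mmap()), here is a small
userspace model of the policy lookup. It is only a sketch: the struct and
function names (vm_area, vm_ops, model_policy_mof, model_gntdev_mmap) are
made up for illustration, the flag values are placeholders, reference
counting is left out, and the lookup order is an assumption based on how the
commit message describes callers such as vma_policy_mof(). It is not kernel
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for the model; the kernel defines the real ones. */
#define MPOL_F_MOF   (1u << 3)
#define MPOL_F_MORON (1u << 4)

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct mempolicy { unsigned int flags; };
struct vm_area;

struct vm_ops {
	struct mempolicy *(*get_policy)(struct vm_area *vma);
};

struct vm_area {
	const struct vm_ops *ops;
	struct mempolicy *policy;
};

/*
 * Assumed lookup order: if the mapping supplies get_policy, only its answer
 * counts; otherwise use the vma policy and fall back to the task policy.
 */
static bool model_policy_mof(struct vm_area *vma, struct mempolicy *task_policy)
{
	struct mempolicy *pol;

	if (vma->ops && vma->ops->get_policy) {
		pol = vma->ops->get_policy(vma);
		return pol && (pol->flags & MPOL_F_MOF);
	}

	pol = vma->policy ? vma->policy : task_policy;
	return pol && (pol->flags & MPOL_F_MOF);
}

/* Models gntdev_vma_get_policy(): return the vma policy, possibly NULL. */
static struct mempolicy *model_gntdev_get_policy(struct vm_area *vma)
{
	return vma->policy;
}

static const struct vm_ops model_gntdev_ops = {
	.get_policy = model_gntdev_get_policy,
};

/* Models the mmap-time part of the patch: install ops, clear migrate flags. */
static void model_gntdev_mmap(struct vm_area *vma)
{
	vma->ops = &model_gntdev_ops;
	if (vma->policy)
		vma->policy->flags &= ~(MPOL_F_MOF | MPOL_F_MORON);
}

/* Self-check: a task policy with migrate flags must not leak into the map. */
static void model_self_check(void)
{
	struct mempolicy task = { MPOL_F_MOF | MPOL_F_MORON };
	struct vm_area plain = { NULL, NULL };
	struct vm_area grant = { NULL, NULL };

	/* Without the op, the task policy is consulted and allows migration. */
	assert(model_policy_mof(&plain, &task));

	/* With the op and no vma policy, the task policy is never consulted. */
	model_gntdev_mmap(&grant);
	assert(!model_policy_mof(&grant, &task));
}
```

In this model, clearing the flags alone would not be enough when the vma has
no policy of its own, because the lookup would then fall through to the task
policy. That is the case the get_policy op is there to cover.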