From patchwork Wed May 15 15:11:22 2019
X-Patchwork-Submitter: Kirill Tkhai
X-Patchwork-Id: 10944907
Subject: [PATCH RFC 1/5] mm: Add process_vm_mmap() syscall declaration
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: akpm@linux-foundation.org, dan.j.williams@intel.com, ktkhai@virtuozzo.com,
    mhocko@suse.com, keith.busch@intel.com, kirill.shutemov@linux.intel.com,
    pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com,
    ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org,
    vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org,
    hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com,
    shakeelb@google.com, guro@fb.com, aarcange@redhat.com, hughd@google.com,
    jglisse@redhat.com, mgorman@techsingularity.net, daniel.m.jordan@oracle.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 15 May 2019 18:11:22 +0300
Message-ID: <155793308232.13922.18307403112092259417.stgit@localhost.localdomain>
In-Reply-To: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>

Similar to process_vm_readv() and process_vm_writev(), add the declaration
of a new syscall, process_vm_mmap(), which will allow mapping memory from
or to another process.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 arch/x86/entry/syscalls/syscall_32.tbl |    1 +
 arch/x86/entry/syscalls/syscall_64.tbl |    2 ++
 include/linux/syscalls.h               |    5 +++++
 include/uapi/asm-generic/unistd.h      |    5 ++++-
 init/Kconfig                           |    9 +++++----
 kernel/sys_ni.c                        |    2 ++
 6 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4cd5f982b1e5..bf8cc5de918f 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -438,3 +438,4 @@
 425	i386	io_uring_setup		sys_io_uring_setup		__ia32_sys_io_uring_setup
 426	i386	io_uring_enter		sys_io_uring_enter		__ia32_sys_io_uring_enter
 427	i386	io_uring_register	sys_io_uring_register		__ia32_sys_io_uring_register
+428	i386	process_vm_mmap		sys_process_vm_mmap		__ia32_compat_sys_process_vm_mmap
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 64ca0d06259a..5af619c2d512 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -355,6 +355,7 @@
 425	common	io_uring_setup		__x64_sys_io_uring_setup
 426	common	io_uring_enter		__x64_sys_io_uring_enter
 427	common	io_uring_register	__x64_sys_io_uring_register
+428	common	process_vm_mmap		__x64_sys_process_vm_mmap
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
@@ -398,3 +399,4 @@
 545	x32	execveat		__x32_compat_sys_execveat/ptregs
 546	x32	preadv2			__x32_compat_sys_preadv64v2
 547	x32	pwritev2		__x32_compat_sys_pwritev64v2
+548	x32	process_vm_mmap		__x32_compat_sys_process_vm_mmap
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e2870fe1be5b..7d8ae36589cf 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -997,6 +997,11 @@ asmlinkage long sys_fspick(int dfd, const char __user *path, unsigned int flags)
 asmlinkage long sys_pidfd_send_signal(int pidfd, int sig,
 				       siginfo_t __user *info,
 				       unsigned int flags);
+asmlinkage long sys_process_vm_mmap(pid_t pid,
+				    unsigned long src_addr,
+				    unsigned long len,
+				    unsigned long dst_addr,
+				    unsigned long flags);
 
 /*
  * Architecture-specific system calls
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index dee7292e1df6..1273d86bf546 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -832,9 +832,12 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
 __SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
 #define __NR_io_uring_register 427
 __SYSCALL(__NR_io_uring_register, sys_io_uring_register)
+#define __NR_process_vm_mmap 428
+__SC_COMP(__NR_process_vm_mmap, sys_process_vm_mmap, \
+	  compat_sys_process_vm_mmap)
 
 #undef __NR_syscalls
-#define __NR_syscalls 428
+#define __NR_syscalls 429
 
 /*
  * 32 bit systems traditionally used different
diff --git a/init/Kconfig b/init/Kconfig
index 8b9ffe236e4f..604db5f14718 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -320,13 +320,14 @@ config POSIX_MQUEUE_SYSCTL
 	default y
 
 config CROSS_MEMORY_ATTACH
-	bool "Enable process_vm_readv/writev syscalls"
+	bool "Enable process_vm_readv/writev/mmap syscalls"
 	depends on MMU
 	default y
 	help
-	  Enabling this option adds the system calls process_vm_readv and
-	  process_vm_writev which allow a process with the correct privileges
-	  to directly read from or write to another process' address space.
+	  Enabling this option adds the system calls process_vm_readv,
+	  process_vm_writev and process_vm_mmap, which allow a process
+	  with the correct privileges to directly read from or write to
+	  or mmap another process' address space.
 	  See the man page for more details.
 
 config USELIB
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 4d9ae5ea6caf..6f51634f4f7e 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -316,6 +316,8 @@ COND_SYSCALL(process_vm_readv);
 COND_SYSCALL_COMPAT(process_vm_readv);
 COND_SYSCALL(process_vm_writev);
 COND_SYSCALL_COMPAT(process_vm_writev);
+COND_SYSCALL(process_vm_mmap);
+COND_SYSCALL_COMPAT(process_vm_mmap);
 
 /* compare kernel pointers */
 COND_SYSCALL(kcmp);

From patchwork Wed May 15 15:11:27 2019
X-Patchwork-Submitter: Kirill Tkhai
X-Patchwork-Id: 10944909
Subject: [PATCH RFC 2/5] mm: Extend copy_vma()
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: akpm@linux-foundation.org, dan.j.williams@intel.com, ktkhai@virtuozzo.com,
    mhocko@suse.com, keith.busch@intel.com, kirill.shutemov@linux.intel.com,
    pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com,
    ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org,
    vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org,
    hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com,
    shakeelb@google.com, guro@fb.com, aarcange@redhat.com, hughd@google.com,
    jglisse@redhat.com, mgorman@techsingularity.net, daniel.m.jordan@oracle.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 15 May 2019 18:11:27 +0300
Message-ID: <155793308777.13922.13297821989540731131.stgit@localhost.localdomain>
In-Reply-To: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>

This prepares the function to copy a vma
between two processes. Two new arguments are introduced.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/mm.h |    4 ++--
 mm/mmap.c          |   33 ++++++++++++++++++++++++---------
 mm/mremap.c        |    4 ++--
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e8834ac32b7..afe07e4a76f8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2329,8 +2329,8 @@ extern void __vma_link_rb(struct mm_struct *, struct vm_area_struct *,
 	struct rb_node **, struct rb_node *);
 extern void unlink_file_vma(struct vm_area_struct *);
 extern struct vm_area_struct *copy_vma(struct vm_area_struct **,
-	unsigned long addr, unsigned long len, pgoff_t pgoff,
-	bool *need_rmap_locks);
+	struct mm_struct *, unsigned long addr, unsigned long len,
+	pgoff_t pgoff, bool *need_rmap_locks, bool clear_flags_ctx);
 extern void exit_mmap(struct mm_struct *);
 
 static inline int check_data_rlimit(unsigned long rlim,
diff --git a/mm/mmap.c b/mm/mmap.c
index 9cf52bdb22a8..46266f6825ae 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3194,19 +3194,21 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 }
 
 /*
- * Copy the vma structure to a new location in the same mm,
- * prior to moving page table entries, to effect an mremap move.
+ * Copy the vma structure to new location in the same vma
+ * prior to moving page table entries, to effect an mremap move;
 */
 struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
-	unsigned long addr, unsigned long len, pgoff_t pgoff,
-	bool *need_rmap_locks)
+	struct mm_struct *mm, unsigned long addr,
+	unsigned long len, pgoff_t pgoff,
+	bool *need_rmap_locks, bool clear_flags_ctx)
 {
 	struct vm_area_struct *vma = *vmap;
 	unsigned long vma_start = vma->vm_start;
-	struct mm_struct *mm = vma->vm_mm;
+	struct vm_userfaultfd_ctx uctx;
 	struct vm_area_struct *new_vma, *prev;
 	struct rb_node **rb_link, *rb_parent;
 	bool faulted_in_anon_vma = true;
+	unsigned long flags;
 
 	/*
 	 * If anonymous vma has not yet been faulted, update new pgoff
@@ -3219,15 +3221,25 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
 		return NULL;	/* should never get here */
-	new_vma = vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
-			    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
-			    vma->vm_userfaultfd_ctx);
+
+	uctx = vma->vm_userfaultfd_ctx;
+	flags = vma->vm_flags;
+	if (clear_flags_ctx) {
+		uctx = NULL_VM_UFFD_CTX;
+		flags &= ~(VM_UFFD_MISSING | VM_UFFD_WP | VM_MERGEABLE |
+			   VM_LOCKED | VM_LOCKONFAULT | VM_WIPEONFORK |
+			   VM_DONTCOPY);
+	}
+
+	new_vma = vma_merge(mm, prev, addr, addr + len, flags, vma->anon_vma,
+			    vma->vm_file, pgoff, vma_policy(vma), uctx);
 	if (new_vma) {
 		/*
 		 * Source vma may have been merged into new_vma
 		 */
 		if (unlikely(vma_start >= new_vma->vm_start &&
-			     vma_start < new_vma->vm_end)) {
+			     vma_start < new_vma->vm_end) &&
+			     vma->vm_mm == mm) {
 			/*
 			 * The only way we can get a vma_merge with
 			 * self during an mremap is if the vma hasn't
@@ -3248,6 +3260,9 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 		new_vma = vm_area_dup(vma);
 		if (!new_vma)
 			goto out;
+		new_vma->vm_mm = mm;
+		new_vma->vm_flags = flags;
+		new_vma->vm_userfaultfd_ctx = uctx;
 		new_vma->vm_start = addr;
 		new_vma->vm_end = addr + len;
 		new_vma->vm_pgoff = pgoff;
diff --git a/mm/mremap.c b/mm/mremap.c
index 37b5b2ad91be..9a96cfc28675 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -352,8 +352,8 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		return err;
 
 	new_pgoff = vma->vm_pgoff + ((old_addr - vma->vm_start) >> PAGE_SHIFT);
-	new_vma = copy_vma(&vma, new_addr, new_len, new_pgoff,
-			   &need_rmap_locks);
+	new_vma = copy_vma(&vma, mm, new_addr, new_len, new_pgoff,
+			   &need_rmap_locks, false);
 	if (!new_vma)
 		return -ENOMEM;

From patchwork Wed May 15 15:11:33 2019
X-Patchwork-Submitter: Kirill Tkhai
X-Patchwork-Id: 10944911
Subject: [PATCH RFC 3/5] mm: Extend copy_page_range()
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: akpm@linux-foundation.org, dan.j.williams@intel.com, ktkhai@virtuozzo.com,
    mhocko@suse.com, keith.busch@intel.com, kirill.shutemov@linux.intel.com,
    pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com,
    ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org,
    vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org,
    hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com,
    shakeelb@google.com, guro@fb.com, aarcange@redhat.com, hughd@google.com,
    jglisse@redhat.com, mgorman@techsingularity.net, daniel.m.jordan@oracle.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 15 May 2019 18:11:33 +0300
Message-ID: <155793309336.13922.12463187219189076695.stgit@localhost.localdomain>
In-Reply-To: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>

This allows copying pages not only to
the same addresses in another process, but also to a specified address.
Huge pages and unaligned address cases are handled by splitting.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/huge_mm.h |    6 +-
 include/linux/mm.h      |    3 +
 kernel/fork.c           |    5 +
 mm/huge_memory.c        |   30 ++++++---
 mm/memory.c             |  165 +++++++++++++++++++++++++++++++----------------
 5 files changed, 141 insertions(+), 68 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 7cd5c150c21d..1e6002ee7c44 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -9,11 +9,13 @@
 extern vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
 extern int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-			 pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
+			 pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long dst_addr,
+			 unsigned long src_addr, unsigned long len,
 			 struct vm_area_struct *vma);
 extern void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd);
 extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-			 pud_t *dst_pud, pud_t *src_pud, unsigned long addr,
+			 pud_t *dst_pud, pud_t *src_pud, unsigned long dst_addr,
+			 unsigned long src_addr, unsigned long len,
 			 struct vm_area_struct *vma);
 
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
diff --git a/include/linux/mm.h b/include/linux/mm.h
index afe07e4a76f8..54328d08dbdd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1485,7 +1485,8 @@ int walk_page_vma(struct vm_area_struct *vma, struct mm_walk *walk);
 void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
-		struct vm_area_struct *vma);
+		struct vm_area_struct *vma, unsigned long dst_addr,
+		unsigned long src_addr, unsigned long src_end);
 int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		struct mmu_notifier_range *range, pte_t **ptepp,
 		pmd_t **pmdpp, spinlock_t **ptlp);
diff --git a/kernel/fork.c b/kernel/fork.c
index a5d4b5227630..2cce9bb78c1d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -584,7 +584,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		mm->map_count++;
 		if (!(tmp->vm_flags & VM_WIPEONFORK))
-			retval = copy_page_range(mm, oldmm, mpnt);
+			retval = copy_page_range(mm, oldmm, mpnt,
+						 mpnt->vm_start,
+						 mpnt->vm_start,
+						 mpnt->vm_end);
 
 		if (tmp->vm_ops && tmp->vm_ops->open)
 			tmp->vm_ops->open(tmp);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9f8bce9a6b32..f338b06f42c6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -956,7 +956,8 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 }
 
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		  pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
+		  pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long dst_addr,
+		  unsigned long src_addr, unsigned long len,
 		  struct vm_area_struct *vma)
 {
 	spinlock_t *dst_ptl, *src_ptl;
@@ -969,6 +970,11 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	if (!vma_is_anonymous(vma))
 		return 0;
 
+	if (len != HPAGE_PMD_SIZE) {
+		split_huge_pmd(vma, src_pmd, src_addr);
+		return -EAGAIN;
+	}
+
 	pgtable = pte_alloc_one(dst_mm);
 	if (unlikely(!pgtable))
 		goto out;
@@ -990,12 +996,12 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			pmd = swp_entry_to_pmd(entry);
 			if (pmd_swp_soft_dirty(*src_pmd))
 				pmd = pmd_swp_mksoft_dirty(pmd);
-			set_pmd_at(src_mm, addr, src_pmd, pmd);
+			set_pmd_at(src_mm, src_addr, src_pmd, pmd);
 		}
 		add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
 		mm_inc_nr_ptes(dst_mm);
 		pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
-		set_pmd_at(dst_mm, addr, dst_pmd, pmd);
+		set_pmd_at(dst_mm, dst_addr, dst_pmd, pmd);
 		ret = 0;
 		goto out_unlock;
 	}
@@ -1018,7 +1024,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		 * reference.
 		 */
 		zero_page = mm_get_huge_zero_page(dst_mm);
-		set_huge_zero_page(pgtable, dst_mm, vma, addr, dst_pmd,
+		set_huge_zero_page(pgtable, dst_mm, vma, dst_addr, dst_pmd,
 				zero_page);
 		ret = 0;
 		goto out_unlock;
@@ -1032,9 +1038,9 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	mm_inc_nr_ptes(dst_mm);
 	pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
-	pmdp_set_wrprotect(src_mm, addr, src_pmd);
+	pmdp_set_wrprotect(src_mm, src_addr, src_pmd);
 	pmd = pmd_mkold(pmd_wrprotect(pmd));
-	set_pmd_at(dst_mm, addr, dst_pmd, pmd);
+	set_pmd_at(dst_mm, dst_addr, dst_pmd, pmd);
 
 	ret = 0;
 out_unlock:
@@ -1096,13 +1102,19 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
 }
 
 int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		  pud_t *dst_pud, pud_t *src_pud, unsigned long addr,
+		  pud_t *dst_pud, pud_t *src_pud, unsigned long dst_addr,
+		  unsigned long src_addr, unsigned long len,
 		  struct vm_area_struct *vma)
 {
 	spinlock_t *dst_ptl, *src_ptl;
 	pud_t pud;
 	int ret;
 
+	if (len != HPAGE_PUD_SIZE) {
+		split_huge_pud(vma, src_pud, src_addr);
+		return -EAGAIN;
+	}
+
 	dst_ptl = pud_lock(dst_mm, dst_pud);
 	src_ptl = pud_lockptr(src_mm, src_pud);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -1121,9 +1133,9 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		/* No huge zero pud yet */
 	}
 
-	pudp_set_wrprotect(src_mm, addr, src_pud);
+	pudp_set_wrprotect(src_mm, src_addr, src_pud);
 	pud = pud_mkold(pud_wrprotect(pud));
-	set_pud_at(dst_mm, addr, dst_pud, pud);
+	set_pud_at(dst_mm, dst_addr, dst_pud, pud);
 
 	ret = 0;
 out_unlock:
diff --git a/mm/memory.c b/mm/memory.c
index 0d0711a912de..9d0fe2aee5f2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -699,7 +699,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 static inline unsigned long
 copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-		unsigned long addr, int *rss)
+		unsigned long src_addr, int *rss, unsigned long dst_addr)
 {
 	unsigned long vm_flags = vma->vm_flags;
 	pte_t pte = *src_pte;
@@ -737,7 +737,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 				pte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(*src_pte))
 					pte = pte_swp_mksoft_dirty(pte);
-				set_pte_at(src_mm, addr, src_pte, pte);
+				set_pte_at(src_mm, src_addr, src_pte, pte);
 			}
 		} else if (is_device_private_entry(entry)) {
 			page = device_private_entry_to_page(entry);
@@ -766,7 +766,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			    is_cow_mapping(vm_flags)) {
 				make_device_private_entry_read(&entry);
 				pte = swp_entry_to_pte(entry);
-				set_pte_at(src_mm, addr, src_pte, pte);
+				set_pte_at(src_mm, src_addr, src_pte, pte);
 			}
 		}
 		goto out_set_pte;
@@ -777,7 +777,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags) && pte_write(pte)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(src_mm, src_addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 
@@ -789,7 +789,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte = pte_mkclean(pte);
 	pte = pte_mkold(pte);
 
-	page = vm_normal_page(vma, addr, pte);
+	page = vm_normal_page(vma, src_addr, pte);
 	if (page) {
 		get_page(page);
 		page_dup_rmap(page, false);
@@ -810,13 +810,14 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	}
 
 out_set_pte:
-	set_pte_at(dst_mm, addr, dst_pte, pte);
+	set_pte_at(dst_mm, dst_addr, dst_pte, pte);
 	return 0;
 }
 
 static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		   pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
-		   unsigned long addr, unsigned long end)
+		   unsigned long src_addr, unsigned long src_end,
+		   unsigned long dst_addr)
 {
 	pte_t *orig_src_pte, *orig_dst_pte;
 	pte_t *src_pte, *dst_pte;
@@ -828,10 +829,10 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 again:
init_rss_vec(rss); - dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl); + dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, dst_addr, &dst_ptl); if (!dst_pte) return -ENOMEM; - src_pte = pte_offset_map(src_pmd, addr); + src_pte = pte_offset_map(src_pmd, src_addr); src_ptl = pte_lockptr(src_mm, src_pmd); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); orig_src_pte = src_pte; @@ -854,11 +855,12 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, continue; } entry.val = copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, - vma, addr, rss); + vma, src_addr, rss, dst_addr); if (entry.val) break; progress += 8; - } while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end); + } while (dst_pte++, src_pte++, dst_addr += PAGE_SIZE, + src_addr += PAGE_SIZE, src_addr != src_end); arch_leave_lazy_mmu_mode(); spin_unlock(src_ptl); @@ -872,108 +874,147 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, return -ENOMEM; progress = 0; } - if (addr != end) + if (src_addr != src_end) goto again; return 0; } static inline int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, pud_t *dst_pud, pud_t *src_pud, struct vm_area_struct *vma, - unsigned long addr, unsigned long end) + unsigned long src_addr, unsigned long src_end, unsigned long dst_addr) { + unsigned long src_next, dst_next, src_len, dst_len, dst_end, len; pmd_t *src_pmd, *dst_pmd; - unsigned long next; - dst_pmd = pmd_alloc(dst_mm, dst_pud, addr); + dst_pmd = pmd_alloc(dst_mm, dst_pud, dst_addr); if (!dst_pmd) return -ENOMEM; - src_pmd = pmd_offset(src_pud, addr); + src_pmd = pmd_offset(src_pud, src_addr); + dst_end = dst_addr + (src_end - src_addr); do { - next = pmd_addr_end(addr, end); + src_next = pmd_addr_end(src_addr, src_end); + dst_next = pmd_addr_end(dst_addr, dst_end); + src_len = src_next - src_addr; + dst_len = dst_next - dst_addr; + + len = min(src_len, dst_len); + src_next = src_addr + len; + dst_next = dst_addr + len; if 
(is_swap_pmd(*src_pmd) || pmd_trans_huge(*src_pmd) || pmd_devmap(*src_pmd)) { int err; - VM_BUG_ON_VMA(next-addr != HPAGE_PMD_SIZE, vma); - err = copy_huge_pmd(dst_mm, src_mm, - dst_pmd, src_pmd, addr, vma); + err = copy_huge_pmd(dst_mm, src_mm, dst_pmd, src_pmd, + dst_addr, src_addr, len, vma); if (err == -ENOMEM) return -ENOMEM; if (!err) - continue; + goto next; /* fall through */ } if (pmd_none_or_clear_bad(src_pmd)) - continue; + goto next; if (copy_pte_range(dst_mm, src_mm, dst_pmd, src_pmd, - vma, addr, next)) + vma, src_addr, src_next, dst_addr)) return -ENOMEM; - } while (dst_pmd++, src_pmd++, addr = next, addr != end); +next: + if (src_len == len) + src_pmd++; + if (dst_len == len) + dst_pmd++; + } while (src_addr = src_next, dst_addr = dst_next, src_addr != src_end); return 0; } static inline int copy_pud_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, p4d_t *dst_p4d, p4d_t *src_p4d, struct vm_area_struct *vma, - unsigned long addr, unsigned long end) + unsigned long src_addr, unsigned long src_end, unsigned long dst_addr) { + unsigned long src_next, dst_next, src_len, dst_len, dst_end, len; pud_t *src_pud, *dst_pud; - unsigned long next; - dst_pud = pud_alloc(dst_mm, dst_p4d, addr); + dst_pud = pud_alloc(dst_mm, dst_p4d, dst_addr); if (!dst_pud) return -ENOMEM; - src_pud = pud_offset(src_p4d, addr); + src_pud = pud_offset(src_p4d, src_addr); + dst_end = dst_addr + (src_end - src_addr); do { - next = pud_addr_end(addr, end); + src_next = pud_addr_end(src_addr, src_end); + dst_next = pud_addr_end(dst_addr, dst_end); + src_len = src_next - src_addr; + dst_len = dst_next - dst_addr; + + len = min(src_len, dst_len); + src_next = src_addr + len; + dst_next = dst_addr + len; + if (pud_trans_huge(*src_pud) || pud_devmap(*src_pud)) { int err; - VM_BUG_ON_VMA(next-addr != HPAGE_PUD_SIZE, vma); - err = copy_huge_pud(dst_mm, src_mm, - dst_pud, src_pud, addr, vma); + err = copy_huge_pud(dst_mm, src_mm, dst_pud, src_pud, + dst_addr, src_addr, len, vma); if 
(err == -ENOMEM) return -ENOMEM; if (!err) - continue; + goto next; /* fall through */ } if (pud_none_or_clear_bad(src_pud)) - continue; + goto next; if (copy_pmd_range(dst_mm, src_mm, dst_pud, src_pud, - vma, addr, next)) + vma, src_addr, src_next, dst_addr)) return -ENOMEM; - } while (dst_pud++, src_pud++, addr = next, addr != end); +next: + if (src_len == len) + src_pud++; + if (dst_len == len) + dst_pud++; + } while (src_addr = src_next, dst_addr = dst_next, src_addr != src_end); return 0; } static inline int copy_p4d_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, pgd_t *dst_pgd, pgd_t *src_pgd, struct vm_area_struct *vma, - unsigned long addr, unsigned long end) + unsigned long src_addr, unsigned long src_end, unsigned long dst_addr) { + unsigned long src_next, dst_next, src_len, dst_len, dst_end, len; p4d_t *src_p4d, *dst_p4d; - unsigned long next; - dst_p4d = p4d_alloc(dst_mm, dst_pgd, addr); + dst_p4d = p4d_alloc(dst_mm, dst_pgd, dst_addr); if (!dst_p4d) return -ENOMEM; - src_p4d = p4d_offset(src_pgd, addr); + + src_p4d = p4d_offset(src_pgd, src_addr); + dst_end = dst_addr + (src_end - src_addr); do { - next = p4d_addr_end(addr, end); + src_next = p4d_addr_end(src_addr, src_end); + dst_next = p4d_addr_end(dst_addr, dst_end); + src_len = src_next - src_addr; + dst_len = dst_next - dst_addr; + + len = min(src_len, dst_len); + src_next = src_addr + len; + dst_next = dst_addr + len; + if (p4d_none_or_clear_bad(src_p4d)) - continue; + goto next; if (copy_pud_range(dst_mm, src_mm, dst_p4d, src_p4d, - vma, addr, next)) + vma, src_addr, src_next, dst_addr)) return -ENOMEM; - } while (dst_p4d++, src_p4d++, addr = next, addr != end); +next: + if (src_len == len) + src_p4d++; + if (dst_len == len) + dst_p4d++; + } while (src_addr = src_next, dst_addr = dst_next, src_addr != src_end); return 0; } int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, - struct vm_area_struct *vma) + struct vm_area_struct *vma, unsigned long dst_addr, + 
unsigned long src_addr, unsigned long src_end) { pgd_t *src_pgd, *dst_pgd; - unsigned long next; - unsigned long addr = vma->vm_start; - unsigned long end = vma->vm_end; + unsigned long src_next, dst_next, src_len, dst_len, dst_end, len; struct mmu_notifier_range range; bool is_cow; int ret; @@ -1011,23 +1052,37 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, if (is_cow) { mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE, - 0, vma, src_mm, addr, end); + 0, vma, src_mm, src_addr, src_end); mmu_notifier_invalidate_range_start(&range); } ret = 0; - dst_pgd = pgd_offset(dst_mm, addr); - src_pgd = pgd_offset(src_mm, addr); + dst_pgd = pgd_offset(dst_mm, dst_addr); + src_pgd = pgd_offset(src_mm, src_addr); + dst_end = dst_addr + (src_end - src_addr); do { - next = pgd_addr_end(addr, end); + src_next = pgd_addr_end(src_addr, src_end); + dst_next = pgd_addr_end(dst_addr, dst_end); + src_len = src_next - src_addr; + dst_len = dst_next - dst_addr; + + len = min(src_len, dst_len); + src_next = src_addr + len; + dst_next = dst_addr + len; + if (pgd_none_or_clear_bad(src_pgd)) - continue; + goto next; if (unlikely(copy_p4d_range(dst_mm, src_mm, dst_pgd, src_pgd, - vma, addr, next))) { + vma, src_addr, src_next, dst_addr))) { ret = -ENOMEM; break; } - } while (dst_pgd++, src_pgd++, addr = next, addr != end); +next: + if (src_len == len) + src_pgd++; + if (dst_len == len) + dst_pgd++; + } while (src_addr = src_next, dst_addr = dst_next, src_addr != src_end); if (is_cow) mmu_notifier_invalidate_range_end(&range); From patchwork Wed May 15 15:11:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kirill Tkhai X-Patchwork-Id: 10944913 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 15DE81398 for ; Wed, 15 May 2019 15:11:49 +0000 (UTC) Received: from 
Subject: [PATCH RFC 4/5] mm: Export round_hint_to_min()
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: akpm@linux-foundation.org, dan.j.williams@intel.com, ktkhai@virtuozzo.com, mhocko@suse.com, keith.busch@intel.com, kirill.shutemov@linux.intel.com, pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com, ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org, vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org, hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com, shakeelb@google.com, guro@fb.com, aarcange@redhat.com, hughd@google.com, jglisse@redhat.com, mgorman@techsingularity.net, daniel.m.jordan@oracle.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 15 May 2019 18:11:38 +0300
Message-ID: <155793309872.13922.9196517703774034670.stgit@localhost.localdomain>
In-Reply-To:
<155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
User-Agent: StGit/0.18
MIME-Version: 1.0

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/mman.h | 14 ++++++++++++++
 mm/mmap.c | 13 -------------
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index 4b08e9c9c538..69feb3144c12 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -4,6 +4,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -73,6 +74,19 @@ static inline void vm_unacct_memory(long pages)
 	vm_acct_memory(-pages);
 }
 
+/*
+ * If a hint addr is less than mmap_min_addr change hint to be as
+ * low as possible but still greater than mmap_min_addr
+ */
+static inline unsigned long round_hint_to_min(unsigned long hint)
+{
+	hint &= PAGE_MASK;
+	if (((void *)hint != NULL) &&
+	    (hint < mmap_min_addr))
+		return PAGE_ALIGN(mmap_min_addr);
+	return hint;
+}
+
 /*
  * Allow architectures to handle additional protection bits
  */
diff --git a/mm/mmap.c b/mm/mmap.c
index 46266f6825ae..b2a1f77643cd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1318,19 +1318,6 @@ struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma)
 	return NULL;
 }
 
-/*
- * If a hint addr is less than mmap_min_addr change hint to be as
- * low as possible but still greater than mmap_min_addr
- */
-static inline unsigned long round_hint_to_min(unsigned long hint)
-{
-	hint &= PAGE_MASK;
-	if (((void *)hint != NULL) &&
-	    (hint < mmap_min_addr))
-		return PAGE_ALIGN(mmap_min_addr);
-	return hint;
-}
-
 static inline int mlock_future_check(struct mm_struct *mm,
 				     unsigned long flags, unsigned long len)
Subject: [PATCH RFC 5/5] mm: Add process_vm_mmap()
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: akpm@linux-foundation.org, dan.j.williams@intel.com, ktkhai@virtuozzo.com, mhocko@suse.com, keith.busch@intel.com, kirill.shutemov@linux.intel.com, pasha.tatashin@oracle.com, alexander.h.duyck@linux.intel.com, ira.weiny@intel.com, andreyknvl@google.com, arunks@codeaurora.org, vbabka@suse.cz, cl@linux.com, riel@surriel.com, keescook@chromium.org, hannes@cmpxchg.org, npiggin@gmail.com, mathieu.desnoyers@efficios.com, shakeelb@google.com,
guro@fb.com, aarcange@redhat.com, hughd@google.com, jglisse@redhat.com, mgorman@techsingularity.net, daniel.m.jordan@oracle.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 15 May 2019 18:11:44 +0300
Message-ID: <155793310413.13922.4749810361688380807.stgit@localhost.localdomain>
In-Reply-To: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
References: <155793276388.13922.18064660723547377633.stgit@localhost.localdomain>
User-Agent: StGit/0.18
MIME-Version: 1.0

This adds a new syscall to map a VMA from or to another process. The flag
PVMMAP_FIXED may be specified; its meaning is similar to mmap()'s MAP_FIXED.

@pid > 0 means to map from the process with that pid into current, while
@pid < 0 means to map from current into the @pid process.

VMAs are merged on the destination side, i.e. if the source task has a VMA
covering [start; end], and we map it sequentially twice:

	process_vm_mmap(@pid, start, start + (end - start)/2, ...);
	process_vm_mmap(@pid, start + (end - start)/2, end, ...);

the destination task will end up with a single VMA [start, end].
Signed-off-by: Kirill Tkhai --- include/linux/mm.h | 4 + include/linux/mm_types.h | 2 + include/uapi/asm-generic/mman-common.h | 5 + mm/mmap.c | 108 ++++++++++++++++++++++++++++++++ mm/process_vm_access.c | 71 +++++++++++++++++++++ 5 files changed, 190 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 54328d08dbdd..c49bcfac593c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2382,6 +2382,10 @@ extern int __do_munmap(struct mm_struct *, unsigned long, size_t, struct list_head *uf, bool downgrade); extern int do_munmap(struct mm_struct *, unsigned long, size_t, struct list_head *uf); +extern unsigned long mmap_process_vm(struct mm_struct *, unsigned long, + struct mm_struct *, unsigned long, + unsigned long, unsigned long, + struct list_head *); static inline unsigned long do_mmap_pgoff(struct file *file, unsigned long addr, diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1815fbc40926..885f256f2fb7 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -261,11 +261,13 @@ struct vm_region { #ifdef CONFIG_USERFAULTFD #define NULL_VM_UFFD_CTX ((struct vm_userfaultfd_ctx) { NULL, }) +#define IS_NULL_VM_UFFD_CTX(uctx) ((uctx)->ctx == NULL) struct vm_userfaultfd_ctx { struct userfaultfd_ctx *ctx; }; #else /* CONFIG_USERFAULTFD */ #define NULL_VM_UFFD_CTX ((struct vm_userfaultfd_ctx) {}) +#define IS_NULL_VM_UFFD_CTX(uctx) (true) struct vm_userfaultfd_ctx {}; #endif /* CONFIG_USERFAULTFD */ diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index abd238d0f7a4..44cb6cf77e93 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -28,6 +28,11 @@ /* 0x0100 - 0x80000 flags are defined in asm-generic/mman.h */ #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ +/* + * Flags for process_vm_mmap + */ +#define PVMMAP_FIXED 0x01 + /* * Flags for mlock */ diff --git a/mm/mmap.c 
b/mm/mmap.c index b2a1f77643cd..3dbf280e9f8e 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3274,6 +3274,114 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, return NULL; } +static int do_mmap_process_vm(struct vm_area_struct *src_vma, + unsigned long src_addr, + struct mm_struct *dst_mm, + unsigned long dst_addr, + unsigned long len, + struct list_head *uf) +{ + struct vm_area_struct *dst_vma; + unsigned long pgoff, ret; + bool unused; + + if (do_munmap(dst_mm, dst_addr, len, uf)) + return -ENOMEM; + + if (src_vma->vm_flags & VM_ACCOUNT) { + if (security_vm_enough_memory_mm(dst_mm, len >> PAGE_SHIFT)) + return -ENOMEM; + } + + pgoff = src_vma->vm_pgoff + + ((src_addr - src_vma->vm_start) >> PAGE_SHIFT); + dst_vma = copy_vma(&src_vma, dst_mm, dst_addr, + len, pgoff, &unused, false); + if (!dst_vma) { + ret = -ENOMEM; + goto unacct; + } + + ret = copy_page_range(dst_mm, src_vma->vm_mm, src_vma, + dst_addr, src_addr, src_addr + len); + if (ret) { + do_munmap(dst_mm, dst_addr, len, uf); + return -ENOMEM; + } + + if (dst_vma->vm_file) + uprobe_mmap(dst_vma); + perf_event_mmap(dst_vma); + + dst_vma->vm_flags |= VM_SOFTDIRTY; + vma_set_page_prot(dst_vma); + + vm_stat_account(dst_mm, dst_vma->vm_flags, len >> PAGE_SHIFT); + return 0; + +unacct: + vm_unacct_memory(len >> PAGE_SHIFT); + return ret; +} + +unsigned long mmap_process_vm(struct mm_struct *src_mm, + unsigned long src_addr, + struct mm_struct *dst_mm, + unsigned long dst_addr, + unsigned long len, + unsigned long flags, + struct list_head *uf) +{ + struct vm_area_struct *src_vma = find_vma(src_mm, src_addr); + unsigned long gua_flags = 0; + unsigned long ret; + + if (!src_vma || src_vma->vm_start > src_addr) + return -EFAULT; + if (len > src_vma->vm_end - src_addr) + return -EFAULT; + if (src_vma->vm_flags & (VM_DONTEXPAND | VM_PFNMAP)) + return -EFAULT; + if (is_vm_hugetlb_page(src_vma) || (src_vma->vm_flags & VM_IO)) + return -EINVAL; + if (dst_mm->map_count + 2 > sysctl_max_map_count) + return 
-ENOMEM; + if (!IS_NULL_VM_UFFD_CTX(&src_vma->vm_userfaultfd_ctx)) + return -ENOTSUPP; + + if (src_vma->vm_flags & VM_SHARED) + gua_flags |= MAP_SHARED; + else + gua_flags |= MAP_PRIVATE; + if (vma_is_anonymous(src_vma) || vma_is_shmem(src_vma)) + gua_flags |= MAP_ANONYMOUS; + if (flags & PVMMAP_FIXED) + gua_flags |= MAP_FIXED; + ret = get_unmapped_area(src_vma->vm_file, dst_addr, len, + src_vma->vm_pgoff + + ((src_addr - src_vma->vm_start) >> PAGE_SHIFT), + gua_flags); + if (offset_in_page(ret)) + return ret; + dst_addr = ret; + + /* Check against address space limit. */ + if (!may_expand_vm(dst_mm, src_vma->vm_flags, len >> PAGE_SHIFT)) { + unsigned long nr_pages; + + nr_pages = count_vma_pages_range(dst_mm, dst_addr, dst_addr + len); + if (!may_expand_vm(dst_mm, src_vma->vm_flags, + (len >> PAGE_SHIFT) - nr_pages)) + return -ENOMEM; + } + + ret = do_mmap_process_vm(src_vma, src_addr, dst_mm, dst_addr, len, uf); + if (ret) + return ret; + + return dst_addr; +} + /* * Return true if the calling process may expand its vm space by the passed * number of pages diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c index a447092d4635..7fca2c5c7edd 100644 --- a/mm/process_vm_access.c +++ b/mm/process_vm_access.c @@ -17,6 +17,8 @@ #include #include #include +#include +#include #ifdef CONFIG_COMPAT #include @@ -295,6 +297,68 @@ static ssize_t process_vm_rw(pid_t pid, return rc; } +static unsigned long process_vm_mmap(pid_t pid, unsigned long src_addr, + unsigned long len, unsigned long dst_addr, + unsigned long flags) +{ + struct mm_struct *src_mm, *dst_mm; + struct task_struct *task; + unsigned long ret; + int depth = 0; + LIST_HEAD(uf); + + len = PAGE_ALIGN(len); + src_addr = round_down(src_addr, PAGE_SIZE); + if (flags & PVMMAP_FIXED) + dst_addr = round_down(dst_addr, PAGE_SIZE); + else + dst_addr = round_hint_to_min(dst_addr); + + if ((flags & ~PVMMAP_FIXED) || len == 0 || len > TASK_SIZE || + src_addr == 0 || dst_addr > TASK_SIZE - len) + return -EINVAL; + 
task = find_get_task_by_vpid(pid > 0 ? pid : -pid); + if (!task) + return -ESRCH; + if (unlikely(task->flags & PF_KTHREAD)) { + ret = -EINVAL; + goto out_put_task; + } + + src_mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS); + if (!src_mm || IS_ERR(src_mm)) { + ret = IS_ERR(src_mm) ? PTR_ERR(src_mm) : -ESRCH; + goto out_put_task; + } + dst_mm = current->mm; + mmget(dst_mm); + + if (pid < 0) + swap(src_mm, dst_mm); + + /* Double lock mm in address order: smallest is the first */ + if (src_mm < dst_mm) { + down_write(&src_mm->mmap_sem); + depth = SINGLE_DEPTH_NESTING; + } + down_write_nested(&dst_mm->mmap_sem, depth); + if (src_mm > dst_mm) + down_write_nested(&src_mm->mmap_sem, SINGLE_DEPTH_NESTING); + + ret = mmap_process_vm(src_mm, src_addr, dst_mm, dst_addr, len, flags, &uf); + + up_write(&dst_mm->mmap_sem); + if (dst_mm != src_mm) + up_write(&src_mm->mmap_sem); + + userfaultfd_unmap_complete(dst_mm, &uf); + mmput(src_mm); + mmput(dst_mm); +out_put_task: + put_task_struct(task); + return ret; +} + SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec, unsigned long, liovcnt, const struct iovec __user *, rvec, unsigned long, riovcnt, unsigned long, flags) @@ -310,6 +374,13 @@ SYSCALL_DEFINE6(process_vm_writev, pid_t, pid, return process_vm_rw(pid, lvec, liovcnt, rvec, riovcnt, flags, 1); } +SYSCALL_DEFINE5(process_vm_mmap, pid_t, pid, + unsigned long, src_addr, unsigned long, len, + unsigned long, dst_addr, unsigned long, flags) +{ + return process_vm_mmap(pid, src_addr, len, dst_addr, flags); +} + #ifdef CONFIG_COMPAT static ssize_t