From patchwork Fri Aug 9 22:58:15 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087911
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space
Date: Fri, 9 Aug 2019 15:58:15 -0700
Message-Id: <20190809225833.6657-2-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

In order to support an opt-in policy for users to allow long-term pins of
FS DAX pages, we need to export the LAYOUT lease to user space.

This is the first of two new lease flags which must be used to allow a
long-term pin to be made on a file.

After the complete series:

0) Registrations to Device DAX char devs are not affected.
1) The user has to opt in to allowing page pins on a file with an
   exclusive layout lease.  Both exclusive and layout lease flags are
   user visible now.
2) Page pins will fail if the lease is not active when the file-backed
   page is encountered.
3) Any truncate or hole-punch operation on a pinned DAX page will fail.
4) The user has the option of holding the lease or releasing it.  If
   they release it, no other pin calls will work on the file.
5) Closing the file is ok.
6) Unmapping the file is ok.
7) Pins against the files are tracked back to an owning file or an
   owning mm depending on the internal subsystem needs.  With RDMA there
   is an owning file which is related to the pinned file.
8) Only RDMA is currently supported.
9) Truncation of pages which are not actively pinned nor covered by a
   lease will succeed.
Signed-off-by: Ira Weiny
---
 fs/locks.c                       | 36 +++++++++++++++++++++++++++-----
 include/linux/fs.h               |  2 +-
 include/uapi/asm-generic/fcntl.h |  3 +++
 3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 24d1db632f6c..ad17c6ffca06 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -191,6 +191,8 @@ static int target_leasetype(struct file_lock *fl)
 		return F_UNLCK;
 	if (fl->fl_flags & FL_DOWNGRADE_PENDING)
 		return F_RDLCK;
+	if (fl->fl_flags & FL_LAYOUT)
+		return F_LAYOUT;
 	return fl->fl_type;
 }
 
@@ -611,7 +613,8 @@ static const struct lock_manager_operations lease_manager_ops = {
 /*
  * Initialize a lease, use the default lock manager operations
  */
-static int lease_init(struct file *filp, long type, struct file_lock *fl)
+static int lease_init(struct file *filp, long type, unsigned int flags,
+		      struct file_lock *fl)
 {
 	if (assign_type(fl, type) != 0)
 		return -EINVAL;
@@ -621,6 +624,8 @@ static int lease_init(struct file *filp, long type, struct file_lock *fl)
 	fl->fl_file = filp;
 	fl->fl_flags = FL_LEASE;
+	if (flags & FL_LAYOUT)
+		fl->fl_flags |= FL_LAYOUT;
 	fl->fl_start = 0;
 	fl->fl_end = OFFSET_MAX;
 	fl->fl_ops = NULL;
@@ -629,7 +634,8 @@ static int lease_init(struct file *filp, long type, struct file_lock *fl)
 }
 
 /* Allocate a file_lock initialised to this type of lease */
-static struct file_lock *lease_alloc(struct file *filp, long type)
+static struct file_lock *lease_alloc(struct file *filp, long type,
+				     unsigned int flags)
 {
 	struct file_lock *fl = locks_alloc_lock();
 	int error = -ENOMEM;
@@ -637,7 +643,7 @@ static struct file_lock *lease_alloc(struct file *filp, long type)
 	if (fl == NULL)
 		return ERR_PTR(error);
 
-	error = lease_init(filp, type, fl);
+	error = lease_init(filp, type, flags, fl);
 	if (error) {
 		locks_free_lock(fl);
 		return ERR_PTR(error);
 	}
@@ -1583,7 +1589,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	int want_write = (mode & O_ACCMODE) != O_RDONLY;
 	LIST_HEAD(dispose);
 
-	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
+	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK, 0);
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
 	new_fl->fl_flags = type;
@@ -1720,6 +1726,8 @@ EXPORT_SYMBOL(lease_get_mtime);
  *
  *  %F_UNLCK to indicate no lease is held.
  *
+ *  %F_LAYOUT to indicate a layout lease is held.
+ *
  *  (if a lease break is pending):
  *
  *  %F_RDLCK to indicate an exclusive lease needs to be
@@ -2022,8 +2030,26 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 	struct file_lock *fl;
 	struct fasync_struct *new;
 	int error;
+	unsigned int flags = 0;
+
+	/*
+	 * NOTE on F_LAYOUT lease
+	 *
+	 * LAYOUT lease types are taken on files which the user knows that
+	 * they will be pinning in memory for some indeterminate amount of
+	 * time.  Such as for use with RDMA.  While we don't know what user
+	 * space is going to do with the file we still use a F_RDLOCK level
+	 * of lease.  This ensures that there are no conflicts between
+	 * 2 users.  The conflict should only come from the File system
+	 * wanting to revoke the lease in break_layout().  And this is done
+	 * by using F_WRLCK in the break code.
+	 */
+	if (arg == F_LAYOUT) {
+		arg = F_RDLCK;
+		flags = FL_LAYOUT;
+	}
 
-	fl = lease_alloc(filp, arg);
+	fl = lease_alloc(filp, arg, flags);
 	if (IS_ERR(fl))
 		return PTR_ERR(fl);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 046108cd4ed9..dd60d5be9886 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1004,7 +1004,7 @@ static inline struct file *get_file(struct file *f)
 #define FL_DOWNGRADE_PENDING	256 /* Lease is being downgraded */
 #define FL_UNLOCK_PENDING	512 /* Lease is being broken */
 #define FL_OFDLCK	1024	/* lock is "owned" by struct file */
-#define FL_LAYOUT	2048	/* outstanding pNFS layout */
+#define FL_LAYOUT	2048	/* outstanding pNFS layout or user held pin */
 
 #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 9dc0bf0c5a6e..baddd54f3031 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -174,6 +174,9 @@ struct f_owner_ex {
 #define F_SHLCK		8	/* or 4 */
 #endif
 
+#define F_LAYOUT	16	/* layout lease to allow longterm pins such as
+				   RDMA */
+
 /* operations for bsd flock(), also used by the kernel implementation */
 #define LOCK_SH		1	/* shared lock */
 #define LOCK_EX		2	/* exclusive lock */
From patchwork Fri Aug 9 22:58:16 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087915
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 02/19] fs/locks: Add Exclusive flag to user Layout lease
Date: Fri, 9 Aug 2019 15:58:16 -0700
Message-Id: <20190809225833.6657-3-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

Add an exclusive lease flag
which indicates that the layout mechanism cannot be broken.

Exclusive layout leases allow the file system to know that pages may be
GUP-pinned and that attempts to change the layout, i.e. truncate, should
be failed.

A process which attempts to break its own exclusive lease gets an
EDEADLOCK return to help determine that this is likely a programming bug
vs someone else holding a resource.

Signed-off-by: Ira Weiny
---
 fs/locks.c                       | 23 +++++++++++++++++++++--
 include/linux/fs.h               |  1 +
 include/uapi/asm-generic/fcntl.h |  2 ++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index ad17c6ffca06..0c7359cdab92 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -626,6 +626,8 @@ static int lease_init(struct file *filp, long type, unsigned int flags,
 	fl->fl_flags = FL_LEASE;
 	if (flags & FL_LAYOUT)
 		fl->fl_flags |= FL_LAYOUT;
+	if (flags & FL_EXCLUSIVE)
+		fl->fl_flags |= FL_EXCLUSIVE;
 	fl->fl_start = 0;
 	fl->fl_end = OFFSET_MAX;
 	fl->fl_ops = NULL;
@@ -1619,6 +1621,14 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 	list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list) {
 		if (!leases_conflict(fl, new_fl))
 			continue;
+		if (fl->fl_flags & FL_EXCLUSIVE) {
+			error = -ETXTBSY;
+			if (new_fl->fl_pid == fl->fl_pid) {
+				error = -EDEADLOCK;
+				goto out;
+			}
+			continue;
+		}
 		if (want_write) {
 			if (fl->fl_flags & FL_UNLOCK_PENDING)
 				continue;
@@ -1634,6 +1644,13 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 		locks_delete_lock_ctx(fl, &dispose);
 	}
 
+	/* We differentiate between -EDEADLOCK and -ETXTBSY so the above loop
+	 * continues with -ETXTBSY looking for a potential deadlock instead.
+	 * If deadlock is not found go ahead and return -ETXTBSY.
+	 */
+	if (error == -ETXTBSY)
+		goto out;
+
 	if (list_empty(&ctx->flc_lease))
 		goto out;
 
@@ -2044,9 +2061,11 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 	 * to revoke the lease in break_layout().  And this is done by using
 	 * F_WRLCK in the break code.
 	 */
-	if (arg == F_LAYOUT) {
+	if ((arg & F_LAYOUT) == F_LAYOUT) {
+		if ((arg & F_EXCLUSIVE) == F_EXCLUSIVE)
+			flags |= FL_EXCLUSIVE;
 		arg = F_RDLCK;
-		flags = FL_LAYOUT;
+		flags |= FL_LAYOUT;
 	}
 
 	fl = lease_alloc(filp, arg, flags);

diff --git a/include/linux/fs.h b/include/linux/fs.h
index dd60d5be9886..2e41ce547913 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1005,6 +1005,7 @@ static inline struct file *get_file(struct file *f)
 #define FL_UNLOCK_PENDING	512 /* Lease is being broken */
 #define FL_OFDLCK	1024	/* lock is "owned" by struct file */
 #define FL_LAYOUT	2048	/* outstanding pNFS layout or user held pin */
+#define FL_EXCLUSIVE	4096	/* Layout lease is exclusive */
 
 #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index baddd54f3031..88b175ceccbc 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -176,6 +176,8 @@ struct f_owner_ex {
 #define F_LAYOUT	16	/* layout lease to allow longterm pins such as
 				   RDMA */
 
+#define F_EXCLUSIVE	32	/* layout lease is exclusive */
+				/* FIXME or shoudl this be F_EXLCK??? */
+
 /* operations for bsd flock(), also used by the kernel implementation */
 #define LOCK_SH		1	/* shared lock */
From patchwork Fri Aug 9 22:58:17 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087923
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 03/19] mm/gup: Pass flags down to __gup_device_huge* calls
Date: Fri, 9 Aug 2019 15:58:17 -0700
Message-Id: <20190809225833.6657-4-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

In order to support checking for a layout lease on an FS DAX inode, these
calls need to know if FOLL_LONGTERM was specified.

Signed-off-by: Ira Weiny
---
 mm/gup.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index b6a293bf1267..80423779a50a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1881,7 +1881,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
 static int __gup_device_huge(unsigned long pfn, unsigned long addr,
-		unsigned long end, struct page **pages, int *nr)
+		unsigned long end, struct page **pages, int *nr,
+		unsigned int flags)
 {
 	int nr_start = *nr;
 	struct dev_pagemap *pgmap = NULL;
@@ -1907,30 +1908,33 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
 }
 
 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, struct page **pages, int *nr)
+		unsigned long end, struct page **pages, int *nr,
+		unsigned int flags)
 {
 	unsigned long fault_pfn;
 	int nr_start = *nr;
 
 	fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr))
+	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags))
 		return 0;
 
 	if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
 		undo_dev_pagemap(nr, nr_start, pages);
 		return 0;
 	}
+
 	return 1;
 }
 
 static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
-		unsigned long end, struct page **pages, int *nr)
+		unsigned long end, struct page **pages, int *nr,
+		unsigned int flags)
 {
 	unsigned long fault_pfn;
 	int nr_start = *nr;
 
 	fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr))
+	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags))
 		return 0;
 
 	if (unlikely(pud_val(orig) != pud_val(*pudp))) {
@@ -1941,14 +1945,16 @@ static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 }
 #else
 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, struct page **pages, int *nr)
+		unsigned long end, struct page **pages, int *nr,
+		unsigned int flags)
 {
 	BUILD_BUG();
 	return 0;
 }
 
 static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
-		unsigned long end, struct page **pages, int *nr)
+		unsigned long end, struct page **pages, int *nr,
+		unsigned int flags)
 {
 	BUILD_BUG();
 	return 0;
@@ -2051,7 +2057,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	if (pmd_devmap(orig)) {
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;
-		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
+		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr,
+					     flags);
 	}
 
 	refs = 0;
@@ -2092,7 +2099,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 	if (pud_devmap(orig)) {
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;
-		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
+		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr,
+					     flags);
 	}
 
 	refs = 0;

From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara, "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner, linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 04/19] mm/gup: Ensure F_LAYOUT lease is held prior to GUP'ing pages
Date: Fri, 9 Aug 2019 15:58:18 -0700
Message-Id: <20190809225833.6657-5-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

On FS DAX files, users must inform the file system that they intend to
take long term GUP pins on the file pages.  Failure to do so should
result in an error.

Ensure that an F_LAYOUT lease exists at the time the GUP call is made.
If not, return -EPERM.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC v1:
	The old version had remnants of when GUP was going to take the
	lease for the user.  Remove this prototype code.
	Fix an issue in gup_device_huge which was setting the page
	reference prior to checking for a layout lease.
	Rebase to 5.3+
	Clean up htmldoc comments

 fs/locks.c         | 47 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mm.h |  2 ++
 mm/gup.c           | 23 +++++++++++++++++++++++
 mm/huge_memory.c   | 12 ++++++++++++
 4 files changed, 84 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index 0c7359cdab92..14892c84844b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2971,3 +2971,50 @@ static int __init filelock_init(void)
 	return 0;
 }
 core_initcall(filelock_init);
+
+/**
+ * mapping_inode_has_layout - ensure a file mapped page has a layout lease
+ *			      taken
+ * @page: page we are trying to GUP
+ *
+ * This should only be called on DAX pages.  DAX pages which are mapped through
+ * FS DAX do not use the page cache.  As a result they require the user to take
+ * a LAYOUT lease on them prior to being able to pin them for longterm use.
+ * This allows the user to opt in to the fact that truncation operations will
+ * fail for the duration of the pin.
+ *
+ * Return true if the page has a LAYOUT lease associated with its file.
+ */
+bool mapping_inode_has_layout(struct page *page)
+{
+	bool ret = false;
+	struct inode *inode;
+	struct file_lock *fl;
+
+	if (WARN_ON(PageAnon(page)) ||
+	    WARN_ON(!page) ||
+	    WARN_ON(!page->mapping) ||
+	    WARN_ON(!page->mapping->host))
+		return false;
+
+	inode = page->mapping->host;
+
+	smp_mb();
+	if (inode->i_flctx &&
+	    !list_empty_careful(&inode->i_flctx->flc_lease)) {
+		spin_lock(&inode->i_flctx->flc_lock);
+		ret = false;
+		list_for_each_entry(fl, &inode->i_flctx->flc_lease, fl_list) {
+			if (fl->fl_pid == current->tgid &&
+			    (fl->fl_flags & FL_LAYOUT) &&
+			    (fl->fl_flags & FL_EXCLUSIVE)) {
+				ret = true;
+				break;
+			}
+		}
+		spin_unlock(&inode->i_flctx->flc_lock);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(mapping_inode_has_layout);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad6766a08f9b..04f22722b374 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1583,6 +1583,8 @@ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc);
 int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 			struct task_struct *task, bool bypass_rlim);
 
+bool mapping_inode_has_layout(struct page *page);
+
 /* Container for pinned pfns / pages */
 struct frame_vector {
 	unsigned int nr_allocated;	/* Number of frames we have space for */
diff --git a/mm/gup.c b/mm/gup.c
index 80423779a50a..0b05e22ac05f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -221,6 +221,13 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			page = pte_page(pte);
 		else
 			goto no_page;
+
+		if (unlikely(flags & FOLL_LONGTERM) &&
+		    (*pgmap)->type == MEMORY_DEVICE_FS_DAX &&
+		    !mapping_inode_has_layout(page)) {
+			page = ERR_PTR(-EPERM);
+			goto out;
+		}
 	} else if (unlikely(!page)) {
 		if (flags & FOLL_DUMP) {
 			/* Avoid special (like zero) pages in core dumps */
@@ -1847,6 +1854,14 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
+		if (pte_devmap(pte) &&
+		    unlikely(flags & FOLL_LONGTERM) &&
+		    pgmap->type == MEMORY_DEVICE_FS_DAX &&
+		    !mapping_inode_has_layout(head)) {
+			put_user_page(head);
+			goto pte_unmap;
+		}
+
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
@@ -1895,6 +1910,14 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
 			undo_dev_pagemap(nr, nr_start, pages);
 			return 0;
 		}
+
+		if (unlikely(flags & FOLL_LONGTERM) &&
+		    pgmap->type == MEMORY_DEVICE_FS_DAX &&
+		    !mapping_inode_has_layout(page)) {
+			undo_dev_pagemap(nr, nr_start, pages);
+			return 0;
+		}
+
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		get_page(page);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1334ede667a8..bc1a07a55be1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -953,6 +953,12 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 	if (!*pgmap)
 		return ERR_PTR(-EFAULT);
 	page = pfn_to_page(pfn);
+
+	if (unlikely(flags & FOLL_LONGTERM) &&
+	    (*pgmap)->type == MEMORY_DEVICE_FS_DAX &&
+	    !mapping_inode_has_layout(page))
+		return ERR_PTR(-EPERM);
+
 	get_page(page);
 
 	return page;
@@ -1093,6 +1099,12 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
 	if (!*pgmap)
 		return ERR_PTR(-EFAULT);
 	page = pfn_to_page(pfn);
+
+	if (unlikely(flags & FOLL_LONGTERM) &&
+	    (*pgmap)->type == MEMORY_DEVICE_FS_DAX &&
+	    !mapping_inode_has_layout(page))
+		return ERR_PTR(-EPERM);
+
 	get_page(page);
 
 	return page;

From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara, "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner, linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 05/19] fs/ext4: Teach ext4 to break layout leases
Date: Fri, 9 Aug 2019 15:58:19 -0700
Message-Id: <20190809225833.6657-6-ira.weiny@intel.com>
In-Reply-To:
<20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

ext4 must attempt to break a layout lease if it is held, to know whether
the layout can be modified.

Split out the logic to determine if a mapping is DAX, export it, and
then break layout leases if a mapping is DAX.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC v1:
	Based on feedback from Dave Chinner, add support to fail all
	other layout breaks when a lease is held.

 fs/dax.c            | 23 ++++++++++++++++-------
 fs/ext4/inode.c     |  7 +++++++
 include/linux/dax.h |  6 ++++++
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index b64964ef44f6..a14ec32255d8 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -557,6 +557,21 @@ static void *grab_mapping_entry(struct xa_state *xas,
 	return xa_mk_internal(VM_FAULT_FALLBACK);
 }
 
+bool dax_mapping_is_dax(struct address_space *mapping)
+{
+	/*
+	 * In the 'limited' case get_user_pages() for dax is disabled.
+	 */
+	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
+		return false;
+
+	if (!dax_mapping(mapping) || !mapping_mapped(mapping))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(dax_mapping_is_dax);
+
 /**
  * dax_layout_busy_page - find first pinned page in @mapping
  * @mapping: address space to scan for a page with ref count > 1
@@ -579,13 +594,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
 	unsigned int scanned = 0;
 	struct page *page = NULL;
 
-	/*
-	 * In the 'limited' case get_user_pages() for dax is disabled.
-	 */
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return NULL;
-
-	if (!dax_mapping(mapping) || !mapping_mapped(mapping))
+	if (!dax_mapping_is_dax(mapping))
 		return NULL;
 
 	/*
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b2c8d09acf65..f08f48de52c5 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4271,6 +4271,13 @@ int ext4_break_layouts(struct inode *inode)
 	if (WARN_ON_ONCE(!rwsem_is_locked(&ei->i_mmap_sem)))
 		return -EINVAL;
 
+	/* Break layout leases if active */
+	if (dax_mapping_is_dax(inode->i_mapping)) {
+		error = break_layout(inode, true);
+		if (error)
+			return error;
+	}
+
 	do {
 		page = dax_layout_busy_page(inode->i_mapping);
 		if (!page)
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9bd8528bd305..da0768b34b48 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -143,6 +143,7 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
 int dax_writeback_mapping_range(struct address_space *mapping,
 		struct block_device *bdev, struct writeback_control *wbc);
 
+bool dax_mapping_is_dax(struct address_space *mapping);
 struct page *dax_layout_busy_page(struct address_space *mapping);
 dax_entry_t dax_lock_page(struct page *page);
 void dax_unlock_page(struct page *page, dax_entry_t cookie);
@@ -174,6 +175,11 @@ static inline struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev)
 	return NULL;
 }
 
+static inline bool dax_mapping_is_dax(struct address_space *mapping)
+{
+	return false;
+}
+
 static inline struct page *dax_layout_busy_page(struct address_space *mapping)
 {
 	return NULL;
 }

From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara, "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner, linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 06/19] fs/ext4: Teach dax_layout_busy_page() to operate on a sub-range
Date: Fri, 9 Aug 2019 15:58:20 -0700
Message-Id: <20190809225833.6657-7-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References:
<20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

Callers of dax_layout_busy_page() only rarely operate on the entire file
of concern.  Teach dax_layout_busy_page() to operate on a sub-range of
the address_space provided.  Specifying 0 - ULONG_MAX, however, will
continue to operate on the "entire file", and XFS is kept operating on
the entire file by this method.

This could potentially speed up dax_layout_busy_page() as well.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC v1:
	Fix 0-day build errors

 fs/dax.c            | 15 +++++++++++----
 fs/ext4/ext4.h      |  2 +-
 fs/ext4/extents.c   |  6 +++---
 fs/ext4/inode.c     | 19 ++++++++++++-------
 fs/xfs/xfs_file.c   |  3 ++-
 include/linux/dax.h |  6 ++++--
 6 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a14ec32255d8..3ad19c384454 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -573,8 +573,11 @@ bool dax_mapping_is_dax(struct address_space *mapping)
 EXPORT_SYMBOL_GPL(dax_mapping_is_dax);
 
 /**
- * dax_layout_busy_page - find first pinned page in @mapping
+ * dax_layout_busy_page - find first pinned page in @mapping within
+ *			  the range @off - @off + @len
  * @mapping: address space to scan for a page with ref count > 1
+ * @off: offset to start at
+ * @len: length to scan through
  *
  * DAX requires ZONE_DEVICE mapped pages. These pages are never
  * 'onlined' to the page allocator so they are considered idle when
@@ -587,9 +590,13 @@ EXPORT_SYMBOL_GPL(dax_mapping_is_dax);
  * to be able to run unmap_mapping_range() and subsequently not race
  * mapping_mapped() becoming true.
  */
-struct page *dax_layout_busy_page(struct address_space *mapping)
+struct page *dax_layout_busy_page(struct address_space *mapping,
+				  loff_t off, loff_t len)
 {
-	XA_STATE(xas, &mapping->i_pages, 0);
+	unsigned long start_idx = off >> PAGE_SHIFT;
+	unsigned long end_idx = (len == ULONG_MAX) ? ULONG_MAX
+				: start_idx + (len >> PAGE_SHIFT);
+	XA_STATE(xas, &mapping->i_pages, start_idx);
 	void *entry;
 	unsigned int scanned = 0;
 	struct page *page = NULL;
@@ -612,7 +619,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
 	unmap_mapping_range(mapping, 0, 0, 1);
 
 	xas_lock_irq(&xas);
-	xas_for_each(&xas, entry, ULONG_MAX) {
+	xas_for_each(&xas, entry, end_idx) {
 		if (WARN_ON_ONCE(!xa_is_value(entry)))
 			continue;
 		if (unlikely(dax_is_locked(entry)))
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9c7f4036021b..32738ccdac1d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2578,7 +2578,7 @@ extern int ext4_get_inode_loc(struct inode *, struct ext4_iloc *);
 extern int ext4_inode_attach_jinode(struct inode *inode);
 extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
-extern int ext4_break_layouts(struct inode *);
+extern int ext4_break_layouts(struct inode *inode, loff_t offset, loff_t len);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
 extern int ext4_truncate_restart_trans(handle_t *, struct inode *, int nblocks);
 extern void ext4_set_inode_flags(struct inode *);
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 92266a2da7d6..ded4b1d92299 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4736,7 +4736,7 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 	 */
 	down_write(&EXT4_I(inode)->i_mmap_sem);
 
-	ret = ext4_break_layouts(inode);
+	ret = ext4_break_layouts(inode, offset, len);
 	if (ret) {
 		up_write(&EXT4_I(inode)->i_mmap_sem);
 		goto out_mutex;
@@ -5419,7 +5419,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
 	 */
 	down_write(&EXT4_I(inode)->i_mmap_sem);
 
-	ret = ext4_break_layouts(inode);
+	ret = ext4_break_layouts(inode, offset, len);
 	if (ret)
 		goto out_mmap;
 
@@ -5572,7 +5572,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
 	 */
 	down_write(&EXT4_I(inode)->i_mmap_sem);
 
-	ret = ext4_break_layouts(inode);
+	ret = ext4_break_layouts(inode, offset, len);
 	if (ret)
 		goto out_mmap;
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f08f48de52c5..d3fc6035428c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4262,7 +4262,7 @@ static void ext4_wait_dax_page(struct ext4_inode_info *ei)
 	down_write(&ei->i_mmap_sem);
 }
 
-int ext4_break_layouts(struct inode *inode)
+int ext4_break_layouts(struct inode *inode, loff_t offset, loff_t len)
 {
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	struct page *page;
@@ -4279,7 +4279,7 @@ int ext4_break_layouts(struct inode *inode)
 	}
 
 	do {
-		page = dax_layout_busy_page(inode->i_mapping);
+		page = dax_layout_busy_page(inode->i_mapping, offset, len);
 		if (!page)
 			return 0;
 
@@ -4366,7 +4366,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
 	 */
 	down_write(&EXT4_I(inode)->i_mmap_sem);
 
-	ret = ext4_break_layouts(inode);
+	ret = ext4_break_layouts(inode, offset, length);
 	if (ret)
 		goto out_dio;
 
@@ -5657,10 +5657,15 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 
 	down_write(&EXT4_I(inode)->i_mmap_sem);
 
-	rc = ext4_break_layouts(inode);
-	if (rc) {
-		up_write(&EXT4_I(inode)->i_mmap_sem);
-		return rc;
+	if (shrink) {
+		loff_t off = attr->ia_size;
+		loff_t len = inode->i_size - attr->ia_size;
+
+		rc = ext4_break_layouts(inode, off, len);
+		if (rc) {
+			up_write(&EXT4_I(inode)->i_mmap_sem);
+			return rc;
+		}
 	}
 
 	if (attr->ia_size != inode->i_size) {
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 28101bbc0b78..8f8d478f9ec6 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -740,7 +740,8 @@ xfs_break_dax_layouts(
 
 	ASSERT(xfs_isilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL));
 
-	page = dax_layout_busy_page(inode->i_mapping);
+	/* We default to the "whole file" */
+	page = dax_layout_busy_page(inode->i_mapping, 0, ULONG_MAX);
 	if (!page)
 		return 0;
 
diff --git a/include/linux/dax.h b/include/linux/dax.h
index da0768b34b48..f34616979e45 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -144,7 +144,8 @@ int dax_writeback_mapping_range(struct address_space *mapping,
 		struct block_device *bdev, struct writeback_control *wbc);
 
 bool dax_mapping_is_dax(struct address_space *mapping);
-struct page *dax_layout_busy_page(struct address_space *mapping);
+struct page *dax_layout_busy_page(struct address_space *mapping,
+				  loff_t off, loff_t len);
 dax_entry_t dax_lock_page(struct page *page);
 void dax_unlock_page(struct page *page, dax_entry_t cookie);
 #else
@@ -180,7 +181,8 @@ static inline bool dax_mapping_is_dax(struct address_space *mapping)
 	return false;
 }
 
-static inline struct page *dax_layout_busy_page(struct address_space *mapping)
+static inline struct page *dax_layout_busy_page(struct address_space *mapping,
+						loff_t off, loff_t len)
 {
 	return NULL;
 }
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page()
Date: Fri, 9 Aug 2019 15:58:21 -0700
Message-Id: <20190809225833.6657-8-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

dax_layout_busy_page() can now
operate on a sub-range of the address_space provided. Have xfs specify
the sub-range to dax_layout_busy_page().

Signed-off-by: Ira Weiny
---
 fs/xfs/xfs_file.c  | 19 +++++++++++++------
 fs/xfs/xfs_inode.h |  5 +++--
 fs/xfs/xfs_ioctl.c | 15 ++++++++++++---
 fs/xfs/xfs_iops.c  | 14 ++++++++++----
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 8f8d478f9ec6..447571e3cb02 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -295,7 +295,11 @@ xfs_file_aio_write_checks(
 	if (error <= 0)
 		return error;

-	error = xfs_break_layouts(inode, iolock, BREAK_WRITE);
+	/*
+	 * BREAK_WRITE ignores the offset/len tuple, so just specify the
+	 * whole file (0 - ULONG_MAX) to be safe.
+	 */
+	error = xfs_break_layouts(inode, iolock, 0, ULONG_MAX, BREAK_WRITE);
 	if (error)
 		return error;
@@ -734,14 +738,15 @@ xfs_wait_dax_page(
 static int
 xfs_break_dax_layouts(
 	struct inode		*inode,
-	bool			*retry)
+	bool			*retry,
+	loff_t			off,
+	loff_t			len)
 {
 	struct page		*page;

 	ASSERT(xfs_isilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL));

-	/* We default to the "whole file" */
-	page = dax_layout_busy_page(inode->i_mapping, 0, ULONG_MAX);
+	page = dax_layout_busy_page(inode->i_mapping, off, len);
 	if (!page)
 		return 0;
@@ -755,6 +760,8 @@ int
 xfs_break_layouts(
 	struct inode		*inode,
 	uint			*iolock,
+	loff_t			off,
+	loff_t			len,
 	enum layout_break_reason reason)
 {
 	bool			retry;
@@ -766,7 +773,7 @@ xfs_break_layouts(
 		retry = false;
 		switch (reason) {
 		case BREAK_UNMAP:
-			error = xfs_break_dax_layouts(inode, &retry);
+			error = xfs_break_dax_layouts(inode, &retry, off, len);
 			if (error || retry)
 				break;
 			/* fall through */
@@ -808,7 +815,7 @@ xfs_file_fallocate(
 		return -EOPNOTSUPP;

 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
+	error = xfs_break_layouts(inode, &iolock, offset, len, BREAK_UNMAP);
 	if (error)
 		goto out_unlock;
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 558173f95a03..1b0948f5267c 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -475,8 +475,9 @@ enum xfs_prealloc_flags {

 int	xfs_update_prealloc_flags(struct xfs_inode *ip,
 			enum xfs_prealloc_flags flags);
-int	xfs_break_layouts(struct inode *inode, uint *iolock,
-		enum layout_break_reason reason);
+int	xfs_break_layouts(struct inode *inode, uint *iolock,
+		loff_t off, loff_t len,
+		enum layout_break_reason reason);

 /* from xfs_iops.c */
 extern void xfs_setup_inode(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6f7848cd5527..3897b88080bd 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -597,6 +597,7 @@ xfs_ioc_space(
 	enum xfs_prealloc_flags	flags = 0;
 	uint			iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
 	int			error;
+	loff_t			break_length;

 	if (inode->i_flags & (S_IMMUTABLE|S_APPEND))
 		return -EPERM;
@@ -617,9 +618,6 @@ xfs_ioc_space(
 		return error;

 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
-	if (error)
-		goto out_unlock;

 	switch (bf->l_whence) {
 	case 0: /*SEEK_SET*/
@@ -665,6 +663,17 @@ xfs_ioc_space(
 		goto out_unlock;
 	}

+	/* break layout for the whole file if len ends up 0 */
+	if (bf->l_len == 0)
+		break_length = ULONG_MAX;
+	else
+		break_length = bf->l_len;
+
+	error = xfs_break_layouts(inode, &iolock, bf->l_start, break_length,
+				  BREAK_UNMAP);
+	if (error)
+		goto out_unlock;
+
 	switch (cmd) {
 	case XFS_IOC_ZERO_RANGE:
 		flags |= XFS_PREALLOC_SET;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index ff3c1fae5357..f0de5486f6c1 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1042,10 +1042,16 @@ xfs_vn_setattr(
 		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
 		iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;

-		error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
-		if (error) {
-			xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
-			return error;
+		if (iattr->ia_size < inode->i_size) {
+			loff_t off = iattr->ia_size;
+			loff_t len = inode->i_size - iattr->ia_size;
+
+			error = xfs_break_layouts(inode, &iolock, off, len,
+						  BREAK_UNMAP);
+			if (error) {
+				xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
+				return error;
+			}
 		}

 		error = xfs_vn_setattr_size(dentry, iattr);

From patchwork Fri Aug 9 22:58:22 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087949
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 08/19] fs/xfs: Fail truncate if page lease can't be broken
Date: Fri, 9 Aug 2019 15:58:22 -0700
Message-Id: <20190809225833.6657-9-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

If pages are under a lease, fail the truncate operation. We change the
order of lease breaks to directly fail the operation if the lease
exists.

Select EXPORTFS_BLOCK_OPS for FS_DAX to ensure that
xfs_break_leased_layouts() is defined for FS_DAX as well as pNFS.

Signed-off-by: Ira Weiny
---
 fs/Kconfig        | 1 +
 fs/xfs/xfs_file.c | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 14cd4abdc143..c10b91f92528 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -48,6 +48,7 @@ config FS_DAX
 	select DEV_PAGEMAP_OPS if (ZONE_DEVICE && !FS_DAX_LIMITED)
 	select FS_IOMAP
 	select DAX
+	select EXPORTFS_BLOCK_OPS
 	help
 	  Direct Access (DAX) can be used on memory-backed block devices.
	  If the block device supports DAX and the filesystem supports DAX,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 447571e3cb02..850d0a0953a2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -773,10 +773,11 @@ xfs_break_layouts(
 		retry = false;
 		switch (reason) {
 		case BREAK_UNMAP:
-			error = xfs_break_dax_layouts(inode, &retry, off, len);
+			error = xfs_break_leased_layouts(inode, iolock, &retry);
 			if (error || retry)
 				break;
-			/* fall through */
+			error = xfs_break_dax_layouts(inode, &retry, off, len);
+			break;
 		case BREAK_WRITE:
 			error = xfs_break_leased_layouts(inode, iolock, &retry);
 			break;

From patchwork Fri Aug 9 22:58:23 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087957
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 09/19] mm/gup: Introduce vaddr_pin structure
Date: Fri, 9 Aug 2019 15:58:23 -0700
Message-Id: <20190809225833.6657-10-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

Some subsystems need to pass owning file
information to GUP calls to allow GUP to associate the "owning file"
with any files being pinned within the GUP call.

Introduce an object to specify this information and pass it down
through some of the GUP call stack.

Signed-off-by: Ira Weiny
---
 include/linux/mm.h |  9 +++++++++
 mm/gup.c           | 36 ++++++++++++++++++++++--------------
 2 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 04f22722b374..befe150d17be 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -971,6 +971,15 @@ static inline bool is_zone_device_page(const struct page *page)
 }
 #endif

+/**
+ * @f_owner The file who "owns this GUP"
+ * @mm The mm who "owns this GUP"
+ */
+struct vaddr_pin {
+	struct file *f_owner;
+	struct mm_struct *mm;
+};
+
 #ifdef CONFIG_DEV_PAGEMAP_OPS
 void __put_devmap_managed_page(struct page *page);
 DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
diff --git a/mm/gup.c b/mm/gup.c
index 0b05e22ac05f..7a449500f0a6 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1005,7 +1005,8 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 						struct page **pages,
 						struct vm_area_struct **vmas,
 						int *locked,
-						unsigned int flags)
+						unsigned int flags,
+						struct vaddr_pin *vaddr_pin)
 {
 	long ret, pages_done;
 	bool lock_dropped;
@@ -1165,7 +1166,8 @@ long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,

 	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
 				       locked,
-				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
+				       gup_flags | FOLL_TOUCH | FOLL_REMOTE,
+				       NULL);
 }
 EXPORT_SYMBOL(get_user_pages_remote);

@@ -1320,7 +1322,8 @@ static long __get_user_pages_locked(struct task_struct *tsk,
 		struct mm_struct *mm, unsigned long start,
 		unsigned long nr_pages, struct page **pages,
 		struct vm_area_struct **vmas, int *locked,
-		unsigned int foll_flags)
+		unsigned int foll_flags,
+		struct vaddr_pin *vaddr_pin)
 {
 	struct vm_area_struct *vma;
 	unsigned long vm_flags;
@@ -1504,7 +1507,7 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
 		 */
 		nr_pages = __get_user_pages_locked(tsk, mm, start, nr_pages,
 						   pages, vmas, NULL,
-						   gup_flags);
+						   gup_flags, NULL);

 		if ((nr_pages > 0) && migrate_allow) {
 			drain_allow = true;
@@ -1537,7 +1540,8 @@ static long __gup_longterm_locked(struct task_struct *tsk,
 				  unsigned long nr_pages,
 				  struct page **pages,
 				  struct vm_area_struct **vmas,
-				  unsigned int gup_flags)
+				  unsigned int gup_flags,
+				  struct vaddr_pin *vaddr_pin)
 {
 	struct vm_area_struct **vmas_tmp = vmas;
 	unsigned long flags = 0;
@@ -1558,7 +1562,7 @@ static long __gup_longterm_locked(struct task_struct *tsk,
 	}

 	rc = __get_user_pages_locked(tsk, mm, start, nr_pages, pages,
-				     vmas_tmp, NULL, gup_flags);
+				     vmas_tmp, NULL, gup_flags, vaddr_pin);

 	if (gup_flags & FOLL_LONGTERM) {
 		memalloc_nocma_restore(flags);
@@ -1588,10 +1592,11 @@ static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
 						  unsigned long nr_pages,
 						  struct page **pages,
 						  struct vm_area_struct **vmas,
-						  unsigned int flags)
+						  unsigned int flags,
+						  struct vaddr_pin *vaddr_pin)
 {
 	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
-				       NULL, flags);
+				       NULL, flags, vaddr_pin);
 }
 #endif /* CONFIG_FS_DAX || CONFIG_CMA */

@@ -1607,7 +1612,8 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
 		struct vm_area_struct **vmas)
 {
 	return __gup_longterm_locked(current, current->mm, start, nr_pages,
-				     pages, vmas, gup_flags | FOLL_TOUCH);
+				     pages, vmas, gup_flags | FOLL_TOUCH,
+				     NULL);
 }
 EXPORT_SYMBOL(get_user_pages);

@@ -1647,7 +1653,7 @@ long get_user_pages_locked(unsigned long start, unsigned long nr_pages,

 	return __get_user_pages_locked(current, current->mm, start, nr_pages,
 				       pages, NULL, locked,
-				       gup_flags | FOLL_TOUCH);
+				       gup_flags | FOLL_TOUCH, NULL);
 }
 EXPORT_SYMBOL(get_user_pages_locked);

@@ -1684,7 +1690,7 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,

 	down_read(&mm->mmap_sem);
 	ret = __get_user_pages_locked(current, mm, start, nr_pages, pages, NULL,
-				      &locked, gup_flags | FOLL_TOUCH);
+				      &locked, gup_flags | FOLL_TOUCH, NULL);
 	if (locked)
 		up_read(&mm->mmap_sem);
 	return ret;
@@ -2377,7 +2383,8 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 EXPORT_SYMBOL_GPL(__get_user_pages_fast);

 static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
-				   unsigned int gup_flags, struct page **pages)
+				   unsigned int gup_flags, struct page **pages,
+				   struct vaddr_pin *vaddr_pin)
 {
 	int ret;

@@ -2389,7 +2396,8 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
 		down_read(&current->mm->mmap_sem);
 		ret = __gup_longterm_locked(current, current->mm, start, nr_pages,
-					    pages, NULL, gup_flags);
+					    pages, NULL, gup_flags,
+					    vaddr_pin);
 		up_read(&current->mm->mmap_sem);
 	} else {
 		ret = get_user_pages_unlocked(start, nr_pages,
@@ -2448,7 +2456,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 		pages += nr;

 		ret = __gup_longterm_unlocked(start, nr_pages - nr,
-					      gup_flags, pages);
+					      gup_flags, pages, NULL);

 		/* Have to be a bit careful with return values */
 		if (nr > 0) {

From patchwork Fri Aug 9 22:58:24 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087965
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
	"Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
	linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 10/19] mm/gup: Pass a NULL vaddr_pin through GUP fast
Date: Fri, 9 Aug 2019 15:58:24 -0700
Message-Id: <20190809225833.6657-11-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

Internally GUP fast needs to know that fast
users will not support file pins.  Pass NULL for vaddr_pin through the
fast call stack so that the pin code can return an error if it
encounters file backed memory within the address range.

Signed-off-by: Ira Weiny
Reviewed-by: John Hubbard
---
 mm/gup.c | 65 ++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 25 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 7a449500f0a6..504af3e9a942 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1813,7 +1813,8 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)

 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-			 unsigned int flags, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr,
+			 struct vaddr_pin *vaddr_pin)
 {
 	struct dev_pagemap *pgmap = NULL;
 	int nr_start = *nr, ret = 0;
@@ -1894,7 +1895,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
  * useful to have gup_huge_pmd even if we can't operate on ptes.
  */
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-			 unsigned int flags, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr,
+			 struct vaddr_pin *vaddr_pin)
 {
 	return 0;
 }
@@ -1903,7 +1905,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,

 #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
 static int __gup_device_huge(unsigned long pfn, unsigned long addr,
 		unsigned long end, struct page **pages, int *nr,
-		unsigned int flags)
+		unsigned int flags, struct vaddr_pin *vaddr_pin)
 {
 	int nr_start = *nr;
 	struct dev_pagemap *pgmap = NULL;
@@ -1938,13 +1940,14 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,

 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		unsigned long end, struct page **pages, int *nr,
-		unsigned int flags)
+		unsigned int flags, struct vaddr_pin *vaddr_pin)
 {
 	unsigned long fault_pfn;
 	int nr_start = *nr;

 	fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags))
+	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags,
+			       vaddr_pin))
 		return 0;

 	if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
@@ -1957,13 +1960,14 @@ static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,

 static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		unsigned long end, struct page **pages, int *nr,
-		unsigned int flags)
+		unsigned int flags, struct vaddr_pin *vaddr_pin)
 {
 	unsigned long fault_pfn;
 	int nr_start = *nr;

 	fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags))
+	if (!__gup_device_huge(fault_pfn, addr, end, pages, nr, flags,
+			       vaddr_pin))
 		return 0;

 	if (unlikely(pud_val(orig) != pud_val(*pudp))) {
@@ -1975,7 +1979,7 @@ static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 #else
 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		unsigned long end, struct page **pages, int *nr,
-		unsigned int flags)
+		unsigned int flags, struct vaddr_pin *vaddr_pin)
 {
 	BUILD_BUG();
 	return 0;
@@ -1983,7 +1987,7 @@ static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
 		unsigned long end, struct page **pages, int *nr,
-		unsigned int flags)
+		unsigned int flags, struct vaddr_pin *vaddr_pin)
 {
 	BUILD_BUG();
 	return 0;
@@ -2075,7 +2079,8 @@ static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
 #endif /* CONFIG_ARCH_HAS_HUGEPD */

 static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages, int *nr)
+		unsigned long end, unsigned int flags, struct page **pages,
+		int *nr, struct vaddr_pin *vaddr_pin)
 {
 	struct page *head, *page;
 	int refs;
@@ -2087,7 +2092,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;
 		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr,
-					     flags);
+					     flags, vaddr_pin);
 	}

 	refs = 0;
@@ -2117,7 +2122,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 }

 static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages, int *nr)
+		unsigned long end, unsigned int flags, struct page **pages, int *nr,
+		struct vaddr_pin *vaddr_pin)
 {
 	struct page *head, *page;
 	int refs;
@@ -2129,7 +2135,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		if (unlikely(flags & FOLL_LONGTERM))
 			return 0;
 		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr,
-					     flags);
+					     flags, vaddr_pin);
 	}

 	refs = 0;
@@ -2196,7 +2202,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 }

 static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
-		unsigned int flags, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr,
+		struct vaddr_pin *vaddr_pin)
 {
 	unsigned long next;
 	pmd_t *pmdp;
@@ -2220,7 +2227,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 				return 0;

 			if (!gup_huge_pmd(pmd, pmdp, addr, next, flags,
-					  pages, nr))
+					  pages, nr, vaddr_pin))
 				return 0;

 		} else if (unlikely(is_hugepd(__hugepd(pmd_val(pmd))))) {
@@ -2231,7 +2238,8 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 			if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
 					 PMD_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pte_range(pmd, addr, next, flags, pages, nr))
+		} else if (!gup_pte_range(pmd, addr, next, flags, pages, nr,
+					  vaddr_pin))
 			return 0;
 	} while (pmdp++, addr = next, addr != end);

@@ -2239,7 +2247,8 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 }

 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
-		unsigned int flags, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr,
+		struct vaddr_pin *vaddr_pin)
 {
 	unsigned long next;
 	pud_t *pudp;
@@ -2253,13 +2262,14 @@ static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
 			return 0;
 		if (unlikely(pud_huge(pud))) {
 			if (!gup_huge_pud(pud, pudp, addr, next, flags,
-					  pages, nr))
+					  pages, nr, vaddr_pin))
 				return 0;
 		} else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) {
 			if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
 					 PUD_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pmd_range(pud, addr, next, flags, pages, nr))
+		} else if (!gup_pmd_range(pud, addr, next, flags, pages, nr,
+					  vaddr_pin))
 			return 0;
 	} while (pudp++, addr = next, addr != end);

@@ -2267,7 +2277,8 @@ static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
 }

 static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
-		unsigned int flags, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr,
+		struct vaddr_pin *vaddr_pin)
 {
 	unsigned long next;
 	p4d_t *p4dp;
@@ -2284,7 +2295,8 @@ static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
 			if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
 					 P4D_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pud_range(p4d, addr, next, flags, pages, nr))
+		} else if (!gup_pud_range(p4d, addr, next, flags, pages, nr,
+					  vaddr_pin))
 			return 0;
 	} while (p4dp++, addr = next, addr != end);

@@ -2292,7 +2304,8 @@ static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
 }

 static void gup_pgd_range(unsigned long addr, unsigned long end,
-		unsigned int flags, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr,
+		struct vaddr_pin *vaddr_pin)
 {
 	unsigned long next;
 	pgd_t *pgdp;
@@ -2312,7 +2325,8 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
 			if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
 					 PGDIR_SHIFT, next, flags, pages, nr))
 				return;
-		} else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr))
+		} else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr,
+					  vaddr_pin))
 			return;
 	} while (pgdp++, addr = next, addr != end);
 }
@@ -2374,7 +2388,8 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
 	    gup_fast_permitted(start, end)) {
 		local_irq_save(flags);
-		gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr);
+		gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr,
+			      NULL);
 		local_irq_restore(flags);
 	}

@@ -2445,7 +2460,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
 	    gup_fast_permitted(start, end)) {
 		local_irq_disable();
-		gup_pgd_range(addr, end, gup_flags, pages, &nr);
+		gup_pgd_range(addr, end, gup_flags, pages, &nr, NULL);
 		local_irq_enable();
 		ret = nr;
 	}

From patchwork Fri Aug 9 22:58:25 2019
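[Editorial aside, not part of the posted series.] Patch 10/19 above only threads a NULL `vaddr_pin` through every level of the gup-fast page-table walk; the check that actually rejects file-backed pages arrives later in the series. A self-contained userspace sketch (all names here, `fake_page`, `walk_pte`, `walk_pmd`, are invented) of how a leaf level can refuse work when no pin context was handed down:

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-ins for illustration only. */
struct vaddr_pin { int id; };
struct fake_page { int file_backed; };	/* 1 = file-backed memory */

/* Leaf level, shaped like gup_pte_range(): bails out on file-backed
 * memory when no vaddr_pin context was supplied (the fast path). */
static int walk_pte(struct fake_page *pg, struct vaddr_pin *vaddr_pin)
{
	if (pg->file_backed && !vaddr_pin)
		return 0;	/* fast path cannot take a file pin */
	return 1;		/* "pinned" */
}

/* Upper level, shaped like gup_pmd_range(): only forwards the pointer. */
static int walk_pmd(struct fake_page *pgs, int n, struct vaddr_pin *vaddr_pin)
{
	int nr = 0;

	for (int i = 0; i < n; i++) {
		if (!walk_pte(&pgs[i], vaddr_pin))
			break;
		nr++;
	}
	return nr;	/* pages "pinned" before the first failure */
}
```

The intermediate levels never inspect the pointer; they exist only to carry it, which is exactly why the patch is almost entirely signature churn.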
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
	"Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
	linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 11/19] mm/gup: Pass follow_page_context further down the call stack
Date: Fri, 9 Aug 2019 15:58:25 -0700
Message-Id: <20190809225833.6657-12-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

In preparation for passing
more information (vaddr_pin) into follow_page_pte(), follow_devmap_pud(),
and follow_devmap_pmd().

Signed-off-by: Ira Weiny
---
 include/linux/huge_mm.h | 17 -----------------
 mm/gup.c                | 31 +++++++++++++++----------------
 mm/huge_memory.c        |  6 ++++--
 mm/internal.h           | 28 ++++++++++++++++++++++++++++
 4 files changed, 47 insertions(+), 35 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 45ede62aa85b..b01a20ce0bb9 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -233,11 +233,6 @@ static inline int hpage_nr_pages(struct page *page)
 	return 1;
 }

-struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, int flags, struct dev_pagemap **pgmap);
-struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
-		pud_t *pud, int flags, struct dev_pagemap **pgmap);
-
 extern vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t orig_pmd);

 extern struct page *huge_zero_page;
@@ -375,18 +370,6 @@ static inline void mm_put_huge_zero_page(struct mm_struct *mm)
 	return;
 }

-static inline struct page *follow_devmap_pmd(struct vm_area_struct *vma,
-	unsigned long addr, pmd_t *pmd, int flags, struct dev_pagemap **pgmap)
-{
-	return NULL;
-}
-
-static inline struct page *follow_devmap_pud(struct vm_area_struct *vma,
-	unsigned long addr, pud_t *pud, int flags, struct dev_pagemap **pgmap)
-{
-	return NULL;
-}
-
 static inline bool thp_migration_supported(void)
 {
 	return false;
diff --git a/mm/gup.c b/mm/gup.c
index 504af3e9a942..a7a9d2f5278c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -24,11 +24,6 @@

 #include "internal.h"

-struct follow_page_context {
-	struct dev_pagemap *pgmap;
-	unsigned int page_mask;
-};
-
 /**
  * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
  * @pages: array of pages to be maybe marked dirty, and definitely released.
@@ -172,8 +167,9 @@ static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)

 static struct page *follow_page_pte(struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmd, unsigned int flags,
-		struct dev_pagemap **pgmap)
+		struct follow_page_context *ctx)
 {
+	struct dev_pagemap **pgmap = &ctx->pgmap;
 	struct mm_struct *mm = vma->vm_mm;
 	struct page *page;
 	spinlock_t *ptl;
@@ -363,13 +359,13 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 	}
 	if (pmd_devmap(pmdval)) {
 		ptl = pmd_lock(mm, pmd);
-		page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap);
+		page = follow_devmap_pmd(vma, address, pmd, flags, ctx);
 		spin_unlock(ptl);
 		if (page)
 			return page;
 	}
 	if (likely(!pmd_trans_huge(pmdval)))
-		return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
+		return follow_page_pte(vma, address, pmd, flags, ctx);

 	if ((flags & FOLL_NUMA) && pmd_protnone(pmdval))
 		return no_page_table(vma, flags);
@@ -389,7 +385,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 	}
 	if (unlikely(!pmd_trans_huge(*pmd))) {
 		spin_unlock(ptl);
-		return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
+		return follow_page_pte(vma, address, pmd, flags, ctx);
 	}
 	if (flags & (FOLL_SPLIT | FOLL_SPLIT_PMD)) {
 		int ret;
@@ -419,7 +415,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 	}

 	return ret ? ERR_PTR(ret) :
-		follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
+		follow_page_pte(vma, address, pmd, flags, ctx);
 	}
 	page = follow_trans_huge_pmd(vma, address, pmd, flags);
 	spin_unlock(ptl);
@@ -456,7 +452,7 @@ static struct page *follow_pud_mask(struct vm_area_struct *vma,
 	}
 	if (pud_devmap(*pud)) {
 		ptl = pud_lock(mm, pud);
-		page = follow_devmap_pud(vma, address, pud, flags, &ctx->pgmap);
+		page = follow_devmap_pud(vma, address, pud, flags, ctx);
 		spin_unlock(ptl);
 		if (page)
 			return page;
@@ -786,7 +782,8 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *nonblocking)
+		struct vm_area_struct **vmas, int *nonblocking,
+		struct vaddr_pin *vaddr_pin)
 {
 	long ret = 0, i = 0;
 	struct vm_area_struct *vma = NULL;
@@ -797,6 +794,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,

 	VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));

+	ctx.vaddr_pin = vaddr_pin;
+
 	/*
 	 * If FOLL_FORCE is set then do not force a full fault as the hinting
 	 * fault information is unrelated to the reference behaviour of a task
@@ -1025,7 +1024,7 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 	lock_dropped = false;
 	for (;;) {
 		ret = __get_user_pages(tsk, mm, start, nr_pages, flags, pages,
-				       vmas, locked);
+				       vmas, locked, vaddr_pin);
 		if (!locked)
 			/* VM_FAULT_RETRY couldn't trigger, bypass */
 			return ret;
@@ -1068,7 +1067,7 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 		lock_dropped = true;
 		down_read(&mm->mmap_sem);
 		ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
-				       pages, NULL, NULL);
+				       pages, NULL, NULL, vaddr_pin);
 		if (ret != 1) {
 			BUG_ON(ret > 1);
 			if (!pages_done)
@@ -1226,7 +1225,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
 	 * not result in a stack expansion that recurses back here.
 	 */
 	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
-				NULL, NULL, nonblocking);
+				NULL, NULL, nonblocking, NULL);
 }

 /*
@@ -1311,7 +1310,7 @@ struct page *get_dump_page(unsigned long addr)

 	if (__get_user_pages(current, current->mm, addr, 1,
 			     FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma,
-			     NULL) < 1)
+			     NULL, NULL) < 1)
 		return NULL;
 	flush_cache_page(vma, addr, page_to_pfn(page));
 	return page;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bc1a07a55be1..7e09f2f17ed8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -916,8 +916,9 @@ static void touch_pmd(struct vm_area_struct *vma, unsigned long addr,
 }

 struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, int flags, struct dev_pagemap **pgmap)
+		pmd_t *pmd, int flags, struct follow_page_context *ctx)
 {
+	struct dev_pagemap **pgmap = &ctx->pgmap;
 	unsigned long pfn = pmd_pfn(*pmd);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page *page;
@@ -1068,8 +1069,9 @@ static void touch_pud(struct vm_area_struct *vma, unsigned long addr,
 }

 struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
-		pud_t *pud, int flags, struct dev_pagemap **pgmap)
+		pud_t *pud, int flags, struct follow_page_context *ctx)
 {
+	struct dev_pagemap **pgmap = &ctx->pgmap;
 	unsigned long pfn = pud_pfn(*pud);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page *page;
diff --git a/mm/internal.h b/mm/internal.h
index 0d5f720c75ab..46ada5279856 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -12,6 +12,34 @@
 #include
 #include

+struct follow_page_context {
+	struct dev_pagemap *pgmap;
+	unsigned int page_mask;
+	struct vaddr_pin *vaddr_pin;
+};
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
+		pmd_t *pmd, int flags, struct follow_page_context *ctx);
+struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
+		pud_t *pud, int flags, struct follow_page_context *ctx);
+#else
+static inline struct page *follow_devmap_pmd(struct vm_area_struct *vma,
+		unsigned long addr, pmd_t *pmd, int flags,
+		struct follow_page_context *ctx)
+{
+	return NULL;
+}
+
+static inline struct page *follow_devmap_pud(struct vm_area_struct *vma,
+		unsigned long addr, pud_t *pud, int flags,
+		struct follow_page_context *ctx)
+{
+	return NULL;
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+
 /*
  * The set of flags that only affect watermark checking and reclaim
  * behaviour. This is used by the MM to obey the caller constraints

From patchwork Fri Aug 9 22:58:26 2019
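[Editorial aside, not part of the posted series.] The core move in patch 11/19 above is replacing a single out-parameter (`struct dev_pagemap **pgmap`) with the whole `follow_page_context`, so later patches can add fields (here `vaddr_pin`) without touching every signature again. A simplified userspace sketch with stand-in types and an invented `follow()` function:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, not the kernel's definitions. */
struct dev_pagemap { int users; };
struct vaddr_pin { int id; };

struct follow_page_context {
	struct dev_pagemap *pgmap;
	unsigned int page_mask;
	struct vaddr_pin *vaddr_pin;	/* new field, visible everywhere */
};

/* Before: the callee took "struct dev_pagemap **pgmap".
 * After: it takes the whole context and unpacks the same pointer,
 * so existing code is untouched while new fields become reachable. */
static int follow(struct follow_page_context *ctx)
{
	struct dev_pagemap **pgmap = &ctx->pgmap;	/* same access path */

	*pgmap = NULL;			/* old behaviour, unchanged */
	return ctx->vaddr_pin != NULL;	/* new information, now available */
}
```

Moving the struct definition into mm/internal.h (as the patch does) is what lets mm/huge_memory.c share it with mm/gup.c instead of each file seeing only the one `pgmap` field.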
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4B1376B026D; Fri, 9 Aug 2019 18:59:03 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pl1-f199.google.com (mail-pl1-f199.google.com [209.85.214.199]) by kanga.kvack.org (Postfix) with ESMTP id 0218D6B026B for ; Fri, 9 Aug 2019 18:59:03 -0400 (EDT) Received: by mail-pl1-f199.google.com with SMTP id k9so58271673pls.13 for ; Fri, 09 Aug 2019 15:59:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=UXaF8E75Fz8TNtd5YlkD6jNcYk7EPaE+qiAcVO8EQK8=; b=jXR3GLf5OpEVsnGZL0RjoFFtCiQ2Sk9EIBm3b3F+eNC1ojhNMlwUCdhAoOftH1MGXU YTDiqRdKNj1ltinVfhJSXx+pABicrO9UnHIQe/ipjRgw0GDoWFqs8sunDNTdhwNv8nDP kcyFo3rwBc7tUaZcQ12NLvqR5qpAL+jpChyYaQNSwFcPxod52gm2lewNBwYeYPkxGKB2 uOtxvxOva7sT0Ik6Jj0Wfz8DEXfl0XsDMbIVJpfIUPCgC/SLg7xRyDPhqv8cyLfLHREf NZ34l6niytTyZpOtr8aRUg4NM6kpE7iOefDZRvueLKifcrw7npIQxtae33BZ+wSxuv+x F3Tg== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of ira.weiny@intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=ira.weiny@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: APjAAAX7B6YaaG0N5fOeYHP+phx8UZU/O9A9qhQuhTbZCIoEx8Yn+ZSw +jdCr39FJfm6VkXXQPom0B9Ytgo7QkDKUUthUSgCHZ6jDMOaJ87ZwriwTaq4XTzO677t43fQbLQ STEGr8Ts7PJ25OuZ8SKYrOhLQYugexVjgOmBejj8hfANZhYHPpux1FOviE4SpxtLeBg== X-Received: by 2002:a63:8dc9:: with SMTP id z192mr19185781pgd.151.1565391542604; Fri, 09 Aug 2019 15:59:02 -0700 (PDT) X-Google-Smtp-Source: APXvYqwf/4zz2ggySgyosK3Y5V4HFdZr/Zg0pss+d2m9bTPDQrjuvJp1wYrf+LAZfhal9rTjqYjv X-Received: by 2002:a63:8dc9:: with SMTP id z192mr19185730pgd.151.1565391541548; Fri, 09 Aug 2019 15:59:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1565391541; 
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara, "Theodore Ts'o",
    John Hubbard, Michal Hocko, Dave Chinner, linux-xfs@vger.kernel.org,
    linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
    linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 12/19] mm/gup: Prep put_user_pages() to take an vaddr_pin struct
Date: Fri, 9 Aug 2019 15:58:26 -0700
Message-Id: <20190809225833.6657-13-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

Once callers start to
use vaddr_pin, the put_user_pages() calls will need access to this data
coming in.  Prep put_user_pages() to accept it.

Signed-off-by: Ira Weiny
---
 include/linux/mm.h |  20 +-------
 mm/gup.c           | 122 ++++++++++++++++++++++++++++++++-------------
 2 files changed, 88 insertions(+), 54 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index befe150d17be..9d37cafbef9a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1064,25 +1064,7 @@ static inline void put_page(struct page *page)
 		__put_page(page);
 }
 
-/**
- * put_user_page() - release a gup-pinned page
- * @page: pointer to page to be released
- *
- * Pages that were pinned via get_user_pages*() must be released via
- * either put_user_page(), or one of the put_user_pages*() routines
- * below. This is so that eventually, pages that are pinned via
- * get_user_pages*() can be separately tracked and uniquely handled. In
- * particular, interactions with RDMA and filesystems need special
- * handling.
- *
- * put_user_page() and put_page() are not interchangeable, despite this early
- * implementation that makes them look the same. put_user_page() calls must
- * be perfectly matched up with get_user_page() calls.
- */
-static inline void put_user_page(struct page *page)
-{
-	put_page(page);
-}
+void put_user_page(struct page *page);
 
 void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 			       bool make_dirty);

diff --git a/mm/gup.c b/mm/gup.c
index a7a9d2f5278c..10cfd30ff668 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -24,30 +24,41 @@
 
 #include "internal.h"
 
-/**
- * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
- * @pages: array of pages to be maybe marked dirty, and definitely released.
- * @npages: number of pages in the @pages array.
- * @make_dirty: whether to mark the pages dirty
- *
- * "gup-pinned page" refers to a page that has had one of the get_user_pages()
- * variants called on that page.
- *
- * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if @make_dirty is true, and if the page was previously
- * listed as clean. In any case, releases all pages using put_user_page(),
- * possibly via put_user_pages(), for the non-dirty case.
- *
- * Please see the put_user_page() documentation for details.
- *
- * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
- * required, then the caller should a) verify that this is really correct,
- * because _lock() is usually required, and b) hand code it:
- * set_page_dirty_lock(), put_user_page().
- *
- */
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
-			       bool make_dirty)
+static void __put_user_page(struct vaddr_pin *vaddr_pin, struct page *page)
+{
+	page = compound_head(page);
+
+	/*
+	 * For devmap managed pages we need to catch refcount transition from
+	 * GUP_PIN_COUNTING_BIAS to 1, when refcount reach one it means the
+	 * page is free and we need to inform the device driver through
+	 * callback. See include/linux/memremap.h and HMM for details.
+	 */
+	if (put_devmap_managed_page(page))
+		return;
+
+	if (put_page_testzero(page))
+		__put_page(page);
+}
+
+static void __put_user_pages(struct vaddr_pin *vaddr_pin, struct page **pages,
+			     unsigned long npages)
+{
+	unsigned long index;
+
+	/*
+	 * TODO: this can be optimized for huge pages: if a series of pages is
+	 * physically contiguous and part of the same compound page, then a
+	 * single operation to the head page should suffice.
+	 */
+	for (index = 0; index < npages; index++)
+		__put_user_page(vaddr_pin, pages[index]);
+}
+
+static void __put_user_pages_dirty_lock(struct vaddr_pin *vaddr_pin,
+					struct page **pages,
+					unsigned long npages,
+					bool make_dirty)
 {
 	unsigned long index;
 
@@ -58,7 +69,7 @@ void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 	 */
 	if (!make_dirty) {
-		put_user_pages(pages, npages);
+		__put_user_pages(vaddr_pin, pages, npages);
 		return;
 	}
 
@@ -86,9 +97,58 @@ void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 		 */
 		if (!PageDirty(page))
 			set_page_dirty_lock(page);
-		put_user_page(page);
+		__put_user_page(vaddr_pin, page);
 	}
 }
+
+/**
+ * put_user_page() - release a gup-pinned page
+ * @page: pointer to page to be released
+ *
+ * Pages that were pinned via get_user_pages*() must be released via
+ * either put_user_page(), or one of the put_user_pages*() routines
+ * below. This is so that eventually, pages that are pinned via
+ * get_user_pages*() can be separately tracked and uniquely handled. In
+ * particular, interactions with RDMA and filesystems need special
+ * handling.
+ *
+ * put_user_page() and put_page() are not interchangeable, despite this early
+ * implementation that makes them look the same. put_user_page() calls must
+ * be perfectly matched up with get_user_page() calls.
+ */
+void put_user_page(struct page *page)
+{
+	__put_user_page(NULL, page);
+}
+EXPORT_SYMBOL(put_user_page);
+
+/**
+ * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
+ * @pages: array of pages to be maybe marked dirty, and definitely released.
+ * @npages: number of pages in the @pages array.
+ * @make_dirty: whether to mark the pages dirty
+ *
+ * "gup-pinned page" refers to a page that has had one of the get_user_pages()
+ * variants called on that page.
+ *
+ * For each page in the @pages array, make that page (or its head page, if a
+ * compound page) dirty, if @make_dirty is true, and if the page was previously
+ * listed as clean. In any case, releases all pages using put_user_page(),
+ * possibly via put_user_pages(), for the non-dirty case.
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
+ * required, then the caller should a) verify that this is really correct,
+ * because _lock() is usually required, and b) hand code it:
+ * set_page_dirty_lock(), put_user_page().
+ *
+ */
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
+			       bool make_dirty)
+{
+	__put_user_pages_dirty_lock(NULL, pages, npages, make_dirty);
+}
 EXPORT_SYMBOL(put_user_pages_dirty_lock);
 
 /**
@@ -102,15 +162,7 @@ EXPORT_SYMBOL(put_user_pages_dirty_lock);
  */
 void put_user_pages(struct page **pages, unsigned long npages)
 {
-	unsigned long index;
-
-	/*
-	 * TODO: this can be optimized for huge pages: if a series of pages is
-	 * physically contiguous and part of the same compound page, then a
-	 * single operation to the head page should suffice.
-	 */
-	for (index = 0; index < npages; index++)
-		put_user_page(pages[index]);
+	__put_user_pages(NULL, pages, npages);
 }
 EXPORT_SYMBOL(put_user_pages);

From patchwork Fri Aug 9 22:58:27 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087987
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams,
    Matthew Wilcox, Jan Kara, "Theodore Ts'o", John Hubbard, Michal Hocko,
    Dave Chinner, linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 13/19] {mm,file}: Add file_pins objects
Date: Fri, 9 Aug 2019 15:58:27 -0700
Message-Id: <20190809225833.6657-14-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

User page pins (aka GUP) need to track information about the files being
pinned by those calls.  Depending on the needs of the caller, this
information is stored in one of two ways:

1) Some subsystems, like RDMA, associate GUP pins with file descriptors
   which can be passed around to other processes.  In this case a file
   being pinned must be associated with an owning file object (which can
   then be resolved back to any of the processes holding a file
   descriptor 'pointing' to that file object).

2) Other subsystems do not have an owning file and can therefore
   associate the file pin directly with the mm of the process which
   created them.

This patch introduces the new file pin structures and ensures struct
file and struct mm_struct are prepared to store them.  In subsequent
patches the required information will be passed into the new pin page
calls, and procfs is enhanced to show this information to the user.
Signed-off-by: Ira Weiny
---
 fs/file_table.c          |  4 ++++
 include/linux/file.h     | 49 ++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h       |  2 ++
 include/linux/mm_types.h |  2 ++
 kernel/fork.c            |  3 +++
 5 files changed, 60 insertions(+)

diff --git a/fs/file_table.c b/fs/file_table.c
index b07b53f24ff5..38947b9a4769 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -46,6 +46,7 @@ static void file_free_rcu(struct rcu_head *head)
 {
 	struct file *f = container_of(head, struct file, f_u.fu_rcuhead);
 
+	WARN_ON(!list_empty(&f->file_pins));
 	put_cred(f->f_cred);
 	kmem_cache_free(filp_cachep, f);
 }
@@ -118,6 +119,9 @@ static struct file *__alloc_file(int flags, const struct cred *cred)
 	f->f_mode = OPEN_FMODE(flags);
 	/* f->f_version: 0 */
 
+	INIT_LIST_HEAD(&f->file_pins);
+	spin_lock_init(&f->fp_lock);
+
 	return f;
 }

diff --git a/include/linux/file.h b/include/linux/file.h
index 3fcddff56bc4..cd79adad5b23 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 
 struct file;
 
@@ -91,4 +92,52 @@ extern void fd_install(unsigned int fd, struct file *file);
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
 
+/**
+ * struct file_file_pin
+ *
+ * Associate a pin'ed file with another file owner.
+ *
+ * Subsystems such as RDMA have the ability to pin memory which is associated
+ * with a file descriptor which can be passed to other processes without
+ * necessarily having that memory accessed in the remote processes address
+ * space.
+ *
+ * @file file backing memory which was pined by a GUP caller
+ * @f_owner the file representing the GUP owner
+ * @list of all file pins this owner has
+ *       (struct file *)->file_pins
+ * @ref number of times this pin was taken (roughly the number of pages pinned
+ *      in the file)
+ */
+struct file_file_pin {
+	struct file *file;
+	struct file *f_owner;
+	struct list_head list;
+	struct kref ref;
+};
+
+/*
+ * struct mm_file_pin
+ *
+ * Some GUP callers do not have an "owning" file.  Those pins are accounted for
+ * in the mm of the process that called GUP.
+ *
+ * The tuple {file, inode} is used to track this as a unique file pin and to
+ * track when this pin has been removed.
+ *
+ * @file file backing memory which was pined by a GUP caller
+ * @mm back point to owning mm
+ * @inode backing the file
+ * @list of all file pins this owner has
+ *       (struct mm_struct *)->file_pins
+ * @ref number of times this pin was taken
+ */
+struct mm_file_pin {
+	struct file *file;
+	struct mm_struct *mm;
+	struct inode *inode;
+	struct list_head list;
+	struct kref ref;
+};
+
 #endif /* __LINUX_FILE_H */

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2e41ce547913..d2e08feb9737 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -963,6 +963,8 @@ struct file {
 #endif /* #ifdef CONFIG_EPOLL */
 	struct address_space	*f_mapping;
 	errseq_t		f_wb_err;
+	struct list_head	file_pins;
+	spinlock_t		fp_lock;
 } __randomize_layout
   __attribute__((aligned(4)));	/* lest something weird decides that 2 is OK */

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6a7a1083b6fb..4f6ea4acddbd 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -516,6 +516,8 @@ struct mm_struct {
 		/* HMM needs to track a few things per mm */
 		struct hmm *hmm;
 #endif
+		struct list_head file_pins;
+		spinlock_t fp_lock;	/* lock file_pins */
 	} __randomize_layout;
 
 /*

diff --git a/kernel/fork.c b/kernel/fork.c
index 0e2f9a2c132c..093f2f2fce1a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -675,6 +675,7 @@ void __mmdrop(struct mm_struct *mm)
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
 	WARN_ON_ONCE(mm == current->active_mm);
+	WARN_ON(!list_empty(&mm->file_pins));
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	mmu_notifier_mm_destroy(mm);
@@ -1013,6 +1014,8 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm->pmd_huge_pte = NULL;
 #endif
 	mm_init_uprobes_state(mm);
+	INIT_LIST_HEAD(&mm->file_pins);
+	spin_lock_init(&mm->fp_lock);
 
 	if (current->mm) {
 		mm->flags = current->mm->flags & MMF_INIT_MASK;

From patchwork Fri Aug 9 22:58:28 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087991
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara, "Theodore Ts'o",
    John Hubbard, Michal Hocko, Dave Chinner, linux-xfs@vger.kernel.org,
    linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
    linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 14/19] fs/locks: Associate file pins while performing GUP
Date: Fri, 9 Aug 2019 15:58:28 -0700
Message-Id: <20190809225833.6657-15-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>

From: Ira Weiny

When a file backed area is being
pinned, add the appropriate file pin information to the owning file or
mm.  This information can then be used by admins to determine who is
causing a failure to change the layout of a file.

Signed-off-by: Ira Weiny
---
 fs/locks.c         | 195 ++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/mm.h |  35 +++++++-
 mm/gup.c           |   8 +-
 mm/huge_memory.c   |   4 +-
 4 files changed, 230 insertions(+), 12 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 14892c84844b..02c525446d25 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -168,6 +168,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include
@@ -2972,9 +2973,194 @@ static int __init filelock_init(void)
 }
 core_initcall(filelock_init);
 
+static struct file_file_pin *alloc_file_file_pin(struct inode *inode,
+						 struct file *file)
+{
+	struct file_file_pin *fp = kzalloc(sizeof(*fp), GFP_ATOMIC);
+
+	if (!fp)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&fp->list);
+	kref_init(&fp->ref);
+	return fp;
+}
+
+static int add_file_pin_to_f_owner(struct vaddr_pin *vaddr_pin,
+				   struct inode *inode,
+				   struct file *file)
+{
+	struct file_file_pin *fp;
+
+	list_for_each_entry(fp, &vaddr_pin->f_owner->file_pins, list) {
+		if (fp->file == file) {
+			kref_get(&fp->ref);
+			return 0;
+		}
+	}
+
+	fp = alloc_file_file_pin(inode, file);
+	if (IS_ERR(fp))
+		return PTR_ERR(fp);
+
+	fp->file = get_file(file);
+	/* NOTE no reference needed here.
+	 * It is expected that the caller holds a reference to the owner file
+	 * for the duration of this pin.
+	 */
+	fp->f_owner = vaddr_pin->f_owner;
+
+	spin_lock(&fp->f_owner->fp_lock);
+	list_add(&fp->list, &fp->f_owner->file_pins);
+	spin_unlock(&fp->f_owner->fp_lock);
+
+	return 0;
+}
+
+static void release_file_file_pin(struct kref *ref)
+{
+	struct file_file_pin *fp = container_of(ref, struct file_file_pin, ref);
+
+	spin_lock(&fp->f_owner->fp_lock);
+	list_del(&fp->list);
+	spin_unlock(&fp->f_owner->fp_lock);
+	fput(fp->file);
+	kfree(fp);
+}
+
+static struct mm_file_pin *alloc_mm_file_pin(struct inode *inode,
+					     struct file *file)
+{
+	struct mm_file_pin *fp = kzalloc(sizeof(*fp), GFP_ATOMIC);
+
+	if (!fp)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&fp->list);
+	kref_init(&fp->ref);
+	return fp;
+}
+
+/**
+ * This object bridges files and the mm struct for the purpose of tracking
+ * which files have GUP pins on them.
+ */
+static int add_file_pin_to_mm(struct vaddr_pin *vaddr_pin, struct inode *inode,
+			      struct file *file)
+{
+	struct mm_file_pin *fp;
+
+	list_for_each_entry(fp, &vaddr_pin->mm->file_pins, list) {
+		if (fp->inode == inode) {
+			kref_get(&fp->ref);
+			return 0;
+		}
+	}
+
+	fp = alloc_mm_file_pin(inode, file);
+	if (IS_ERR(fp))
+		return PTR_ERR(fp);
+
+	fp->inode = igrab(inode);
+	if (!fp->inode) {
+		kfree(fp);
+		return -EFAULT;
+	}
+
+	fp->file = get_file(file);
+	fp->mm = vaddr_pin->mm;
+	mmgrab(fp->mm);
+
+	spin_lock(&fp->mm->fp_lock);
+	list_add(&fp->list, &fp->mm->file_pins);
+	spin_unlock(&fp->mm->fp_lock);
+
+	return 0;
+}
+
+static void release_mm_file_pin(struct kref *ref)
+{
+	struct mm_file_pin *fp = container_of(ref, struct mm_file_pin, ref);
+
+	spin_lock(&fp->mm->fp_lock);
+	list_del(&fp->list);
+	spin_unlock(&fp->mm->fp_lock);
+
+	mmdrop(fp->mm);
+	fput(fp->file);
+	iput(fp->inode);
+	kfree(fp);
+}
+
+static void remove_file_file_pin(struct vaddr_pin *vaddr_pin)
+{
+	struct file_file_pin *fp;
+	struct file_file_pin *tmp;
+
+	list_for_each_entry_safe(fp, tmp, &vaddr_pin->f_owner->file_pins,
+				 list) {
+		kref_put(&fp->ref,
+			 release_file_file_pin);
+	}
+}
+
+static void remove_mm_file_pin(struct vaddr_pin *vaddr_pin,
+			       struct inode *inode)
+{
+	struct mm_file_pin *fp;
+	struct mm_file_pin *tmp;
+
+	list_for_each_entry_safe(fp, tmp, &vaddr_pin->mm->file_pins, list) {
+		if (fp->inode == inode)
+			kref_put(&fp->ref, release_mm_file_pin);
+	}
+}
+
+static bool add_file_pin(struct vaddr_pin *vaddr_pin, struct inode *inode,
+			 struct file *file)
+{
+	bool ret = true;
+
+	if (!vaddr_pin || (!vaddr_pin->f_owner && !vaddr_pin->mm))
+		return false;
+
+	if (vaddr_pin->f_owner) {
+		if (add_file_pin_to_f_owner(vaddr_pin, inode, file))
+			ret = false;
+	} else {
+		if (add_file_pin_to_mm(vaddr_pin, inode, file))
+			ret = false;
+	}
+
+	return ret;
+}
+
+void mapping_release_file(struct vaddr_pin *vaddr_pin, struct page *page)
+{
+	struct inode *inode;
+
+	if (WARN_ON(!page) || WARN_ON(!vaddr_pin) ||
+	    WARN_ON(!vaddr_pin->mm && !vaddr_pin->f_owner))
+		return;
+
+	if (PageAnon(page) ||
+	    !page->mapping ||
+	    !page->mapping->host)
+		return;
+
+	inode = page->mapping->host;
+
+	if (vaddr_pin->f_owner)
+		remove_file_file_pin(vaddr_pin);
+	else
+		remove_mm_file_pin(vaddr_pin, inode);
+}
+EXPORT_SYMBOL_GPL(mapping_release_file);
+
 /**
  * mapping_inode_has_layout - ensure a file mapped page has a layout lease
  * taken
+ * @vaddr_pin: pin owner information to store with this pin if a proper layout
+ * is lease is found.
  * @page: page we are trying to GUP
  *
  * This should only be called on DAX pages. DAX pages which are mapped through
@@ -2983,9 +3169,12 @@ core_initcall(filelock_init);
  * This allows the user to opt-into the fact that truncation operations will
  * fail for the duration of the pin.
  *
+ * Also if the proper layout leases are found we store pining information into
+ * the owner passed in via the vaddr_pin structure.
+ *
  * Return true if the page has a LAYOUT lease associated with it's file.
*/ -bool mapping_inode_has_layout(struct page *page) +bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page) { bool ret = false; struct inode *inode; @@ -3003,12 +3192,12 @@ bool mapping_inode_has_layout(struct page *page) if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease)) { spin_lock(&inode->i_flctx->flc_lock); - ret = false; list_for_each_entry(fl, &inode->i_flctx->flc_lease, fl_list) { if (fl->fl_pid == current->tgid && (fl->fl_flags & FL_LAYOUT) && (fl->fl_flags & FL_EXCLUSIVE)) { - ret = true; + ret = add_file_pin(vaddr_pin, inode, + fl->fl_file); break; } } diff --git a/include/linux/mm.h b/include/linux/mm.h index 9d37cafbef9a..657c947bda49 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -981,9 +981,11 @@ struct vaddr_pin { }; #ifdef CONFIG_DEV_PAGEMAP_OPS +void mapping_release_file(struct vaddr_pin *vaddr_pin, struct page *page); void __put_devmap_managed_page(struct page *page); DECLARE_STATIC_KEY_FALSE(devmap_managed_key); -static inline bool put_devmap_managed_page(struct page *page) + +static inline bool page_is_devmap_managed(struct page *page) { if (!static_branch_unlikely(&devmap_managed_key)) return false; @@ -992,7 +994,6 @@ static inline bool put_devmap_managed_page(struct page *page) switch (page->pgmap->type) { case MEMORY_DEVICE_PRIVATE: case MEMORY_DEVICE_FS_DAX: - __put_devmap_managed_page(page); return true; default: break; @@ -1000,11 +1001,39 @@ static inline bool put_devmap_managed_page(struct page *page) return false; } +static inline bool put_devmap_managed_page(struct page *page) +{ + bool is_devmap = page_is_devmap_managed(page); + if (is_devmap) + __put_devmap_managed_page(page); + return is_devmap; +} + +static inline bool put_devmap_managed_user_page(struct vaddr_pin *vaddr_pin, + struct page *page) +{ + bool is_devmap = page_is_devmap_managed(page); + + if (is_devmap) { + if (page->pgmap->type == MEMORY_DEVICE_FS_DAX) + mapping_release_file(vaddr_pin, page); + + 
__put_devmap_managed_page(page); + } + + return is_devmap; +} + #else /* CONFIG_DEV_PAGEMAP_OPS */ static inline bool put_devmap_managed_page(struct page *page) { return false; } +static inline bool put_devmap_managed_user_page(struct vaddr_pin *vaddr_pin, + struct page *page) +{ + return false; +} #endif /* CONFIG_DEV_PAGEMAP_OPS */ static inline bool is_device_private_page(const struct page *page) @@ -1574,7 +1603,7 @@ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc); int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc, struct task_struct *task, bool bypass_rlim); -bool mapping_inode_has_layout(struct page *page); +bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page); /* Container for pinned pfns / pages */ struct frame_vector { diff --git a/mm/gup.c b/mm/gup.c index 10cfd30ff668..eeaa0ddd08a6 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -34,7 +34,7 @@ static void __put_user_page(struct vaddr_pin *vaddr_pin, struct page *page) * page is free and we need to inform the device driver through * callback. See include/linux/memremap.h and HMM for details. 
*/ - if (put_devmap_managed_page(page)) + if (put_devmap_managed_user_page(vaddr_pin, page)) return; if (put_page_testzero(page)) @@ -272,7 +272,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, if (unlikely(flags & FOLL_LONGTERM) && (*pgmap)->type == MEMORY_DEVICE_FS_DAX && - !mapping_inode_has_layout(page)) { + !mapping_inode_has_layout(ctx->vaddr_pin, page)) { page = ERR_PTR(-EPERM); goto out; } @@ -1915,7 +1915,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, if (pte_devmap(pte) && unlikely(flags & FOLL_LONGTERM) && pgmap->type == MEMORY_DEVICE_FS_DAX && - !mapping_inode_has_layout(head)) { + !mapping_inode_has_layout(vaddr_pin, head)) { put_user_page(head); goto pte_unmap; } @@ -1972,7 +1972,7 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr, if (unlikely(flags & FOLL_LONGTERM) && pgmap->type == MEMORY_DEVICE_FS_DAX && - !mapping_inode_has_layout(page)) { + !mapping_inode_has_layout(vaddr_pin, page)) { undo_dev_pagemap(nr, nr_start, pages); return 0; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 7e09f2f17ed8..2d700e21d4af 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -957,7 +957,7 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr, if (unlikely(flags & FOLL_LONGTERM) && (*pgmap)->type == MEMORY_DEVICE_FS_DAX && - !mapping_inode_has_layout(page)) + !mapping_inode_has_layout(ctx->vaddr_pin, page)) return ERR_PTR(-EPERM); get_page(page); @@ -1104,7 +1104,7 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr, if (unlikely(flags & FOLL_LONGTERM) && (*pgmap)->type == MEMORY_DEVICE_FS_DAX && - !mapping_inode_has_layout(page)) + !mapping_inode_has_layout(ctx->vaddr_pin, page)) return ERR_PTR(-EPERM); get_page(page); From patchwork Fri Aug 9 22:58:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11087993 Return-Path: 
From: ira.weiny@intel.com To: Andrew Morton Cc: Jason Gunthorpe , Dan Williams , Matthew Wilcox , Jan Kara , "Theodore Ts'o" , John Hubbard , Michal Hocko , Dave Chinner , linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny Subject: [RFC 
PATCH v2 15/19] mm/gup: Introduce vaddr_pin_pages() Date: Fri, 9 Aug 2019 15:58:29 -0700 Message-Id: <20190809225833.6657-16-ira.weiny@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com> References: <20190809225833.6657-1-ira.weiny@intel.com> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Ira Weiny The addition of FOLL_LONGTERM has taken on additional meaning for CMA pages. In addition, subsystems such as RDMA require new information to be passed to the GUP interface to track file ownership information. As such, a simple FOLL_LONGTERM flag is no longer sufficient for these users to pin pages. Introduce a new GUP-like call which takes the newly introduced vaddr_pin information. Failure to pass the vaddr_pin object back to a vaddr_put* call will result in a failure if pins were created on files during the pin operation. 
Signed-off-by: Ira Weiny --- Changes from list: Change to vaddr_put_pages_dirty_lock Change to vaddr_unpin_pages_dirty_lock include/linux/mm.h | 5 ++++ mm/gup.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 657c947bda49..90c5802866df 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1603,6 +1603,11 @@ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc); int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc, struct task_struct *task, bool bypass_rlim); +long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages, + unsigned int gup_flags, struct page **pages, + struct vaddr_pin *vaddr_pin); +void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages, + struct vaddr_pin *vaddr_pin, bool make_dirty); bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page); /* Container for pinned pfns / pages */ diff --git a/mm/gup.c b/mm/gup.c index eeaa0ddd08a6..6d23f70d7847 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2536,3 +2536,62 @@ int get_user_pages_fast(unsigned long start, int nr_pages, return ret; } EXPORT_SYMBOL_GPL(get_user_pages_fast); + +/** + * vaddr_pin_pages() - pin pages by virtual address and return them to the + * user. + * + * @addr: start address + * @nr_pages: number of pages to pin + * @gup_flags: flags to use for the pin + * @pages: array of pages returned + * @vaddr_pin: initialized meta information this pin is to be associated + * with. + * + * NOTE regarding vaddr_pin: + * + * Some callers can share pins via file descriptors to other processes. + * Such callers should use the f_owner field of vaddr_pin to indicate + * the file the fd points to. All other callers should use the mm this pin is + * being made against. Usually "current->mm". + * + * Expects mmap_sem to be read locked. 
+ */ +long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages, + unsigned int gup_flags, struct page **pages, + struct vaddr_pin *vaddr_pin) +{ + long ret; + + gup_flags |= FOLL_LONGTERM; + + if (!vaddr_pin || (!vaddr_pin->mm && !vaddr_pin->f_owner)) + return -EINVAL; + + ret = __gup_longterm_locked(current, + vaddr_pin->mm, + addr, nr_pages, + pages, NULL, gup_flags, + vaddr_pin); + return ret; +} +EXPORT_SYMBOL(vaddr_pin_pages); + +/** + * vaddr_unpin_pages_dirty_lock - counterpart to vaddr_pin_pages + * + * @pages: array of pages returned + * @nr_pages: number of pages in @pages + * @vaddr_pin: same information passed to vaddr_pin_pages + * @make_dirty: whether to mark the pages dirty + * + * The semantics are similar to put_user_pages_dirty_lock but a vaddr_pin used + * in vaddr_pin_pages should be passed back into this call for proper + * tracking. + */ +void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages, + struct vaddr_pin *vaddr_pin, bool make_dirty) +{ + __put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, make_dirty); +} +EXPORT_SYMBOL(vaddr_unpin_pages_dirty_lock); From patchwork Fri Aug 9 22:58:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11087995
From: ira.weiny@intel.com To: Andrew Morton Cc: Jason Gunthorpe , Dan Williams , Matthew Wilcox , Jan Kara , "Theodore Ts'o" , John Hubbard , Michal Hocko , Dave Chinner , linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny Subject: [RFC PATCH v2 16/19] RDMA/uverbs: Add back pointer to system file object Date: Fri, 9 Aug 2019 15:58:30 -0700 Message-Id: <20190809225833.6657-17-ira.weiny@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com> References: <20190809225833.6657-1-ira.weiny@intel.com> MIME-Version: 1.0 From: Ira Weiny In order for MRs to be tracked 
against the open verbs context, the ufile needs to have a pointer to hand to the GUP code. No references need to be taken, as this should be valid for the lifetime of the context. Signed-off-by: Ira Weiny --- drivers/infiniband/core/uverbs.h | 1 + drivers/infiniband/core/uverbs_main.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h index 1e5aeb39f774..e802ba8c67d6 100644 --- a/drivers/infiniband/core/uverbs.h +++ b/drivers/infiniband/core/uverbs.h @@ -163,6 +163,7 @@ struct ib_uverbs_file { struct page *disassociate_page; struct xarray idr; + struct file *sys_file; /* backpointer to system file object */ }; struct ib_uverbs_event { diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 11c13c1381cf..002c24e0d4db 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -1092,6 +1092,7 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp) INIT_LIST_HEAD(&file->umaps); filp->private_data = file; + file->sys_file = filp; list_add_tail(&file->list, &dev->uverbs_file_list); mutex_unlock(&dev->lists_mutex); srcu_read_unlock(&dev->disassociate_srcu, srcu_key); From patchwork Fri Aug 9 22:58:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11087997
From: ira.weiny@intel.com To: Andrew Morton Cc: Jason Gunthorpe , Dan Williams , Matthew Wilcox , Jan Kara , "Theodore Ts'o" , John Hubbard , Michal Hocko , Dave Chinner , linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, Ira Weiny Subject: [RFC PATCH v2 17/19] RDMA/umem: Convert to vaddr_[pin|unpin]* operations. 
Date: Fri, 9 Aug 2019 15:58:31 -0700 Message-Id: <20190809225833.6657-18-ira.weiny@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com> References: <20190809225833.6657-1-ira.weiny@intel.com> MIME-Version: 1.0 From: Ira Weiny In order to properly track the pinning information, we need to keep a vaddr_pin object around. Store it within the umem object directly. The vaddr_pin object allows the GUP code to associate any files it pins with the RDMA file descriptor associated with this GUP. Furthermore, use the vaddr_pin object to store the owning mm while we are at it. No references need to be taken on the owning file, as the lifetime of that object is tied to all the umems being destroyed first. Signed-off-by: Ira Weiny --- drivers/infiniband/core/umem.c | 26 +++++++++++++++++--------- drivers/infiniband/core/umem_odp.c | 16 ++++++++-------- include/rdma/ib_umem.h | 2 +- 3 files changed, 26 insertions(+), 18 deletions(-) diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index 965cf9dea71a..a9ce3e3816ef 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -54,7 +54,8 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) { page = sg_page_iter_page(&sg_iter); - put_user_pages_dirty_lock(&page, 1, umem->writable && dirty); + vaddr_unpin_pages_dirty_lock(&page, 1, &umem->vaddr_pin, + umem->writable && dirty); } sg_free_table(&umem->sg_head); @@ -243,8 +244,15 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, umem->length = size; umem->address = addr; umem->writable = ib_access_writable(access); - umem->owning_mm = mm = current->mm; - mmgrab(mm); + 
umem->vaddr_pin.mm = mm = current->mm; + mmgrab(umem->vaddr_pin.mm); + + /* No need to get a reference to the core file object here. The key is + * that the sys_file reference is held by the ufile. Any duplication of + * sys_file by the core will keep references active until all those + * contexts are closed out, no matter which processes hold them open. + */ + umem->vaddr_pin.f_owner = context->ufile->sys_file; if (access & IB_ACCESS_ON_DEMAND) { if (WARN_ON_ONCE(!context->invalidate_range)) { @@ -292,11 +300,11 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, while (npages) { down_read(&mm->mmap_sem); - ret = get_user_pages(cur_base, + ret = vaddr_pin_pages(cur_base, min_t(unsigned long, npages, PAGE_SIZE / sizeof (struct page *)), - gup_flags | FOLL_LONGTERM, - page_list, NULL); + gup_flags, + page_list, &umem->vaddr_pin); if (ret < 0) { up_read(&mm->mmap_sem); goto umem_release; @@ -336,7 +344,7 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, free_page((unsigned long) page_list); umem_kfree: if (ret) { - mmdrop(umem->owning_mm); + mmdrop(umem->vaddr_pin.mm); kfree(umem); } return ret ? 
ERR_PTR(ret) : umem; @@ -345,7 +353,7 @@ EXPORT_SYMBOL(ib_umem_get); static void __ib_umem_release_tail(struct ib_umem *umem) { - mmdrop(umem->owning_mm); + mmdrop(umem->vaddr_pin.mm); if (umem->is_odp) kfree(to_ib_umem_odp(umem)); else @@ -369,7 +377,7 @@ void ib_umem_release(struct ib_umem *umem) __ib_umem_release(umem->context->device, umem, 1); - atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); + atomic64_sub(ib_umem_num_pages(umem), &umem->vaddr_pin.mm->pinned_vm); __ib_umem_release_tail(umem); } EXPORT_SYMBOL(ib_umem_release); diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 2a75c6f8d827..53085896d718 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -278,11 +278,11 @@ static int get_per_mm(struct ib_umem_odp *umem_odp) */ mutex_lock(&ctx->per_mm_list_lock); list_for_each_entry(per_mm, &ctx->per_mm_list, ucontext_list) { - if (per_mm->mm == umem_odp->umem.owning_mm) + if (per_mm->mm == umem_odp->umem.vaddr_pin.mm) goto found; } - per_mm = alloc_per_mm(ctx, umem_odp->umem.owning_mm); + per_mm = alloc_per_mm(ctx, umem_odp->umem.vaddr_pin.mm); if (IS_ERR(per_mm)) { mutex_unlock(&ctx->per_mm_list_lock); return PTR_ERR(per_mm); @@ -355,8 +355,8 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, umem->writable = root->umem.writable; umem->is_odp = 1; odp_data->per_mm = per_mm; - umem->owning_mm = per_mm->mm; - mmgrab(umem->owning_mm); + umem->vaddr_pin.mm = per_mm->mm; + mmgrab(umem->vaddr_pin.mm); mutex_init(&odp_data->umem_mutex); init_completion(&odp_data->notifier_completion); @@ -389,7 +389,7 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, out_page_list: vfree(odp_data->page_list); out_odp_data: - mmdrop(umem->owning_mm); + mmdrop(umem->vaddr_pin.mm); kfree(odp_data); return ERR_PTR(ret); } @@ -399,10 +399,10 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) { struct ib_umem *umem = &umem_odp->umem; /* - * 
NOTE: This must called in a process context where umem->owning_mm + * NOTE: This must called in a process context where umem->vaddr_pin.mm * == current->mm */ - struct mm_struct *mm = umem->owning_mm; + struct mm_struct *mm = umem->vaddr_pin.mm; int ret_val; umem_odp->page_shift = PAGE_SHIFT; @@ -581,7 +581,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt, unsigned long current_seq) { struct task_struct *owning_process = NULL; - struct mm_struct *owning_mm = umem_odp->umem.owning_mm; + struct mm_struct *owning_mm = umem_odp->umem.vaddr_pin.mm; struct page **local_page_list = NULL; u64 page_mask, off; int j, k, ret = 0, start_idx, npages = 0; diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h index 1052d0d62be7..ab677c799e29 100644 --- a/include/rdma/ib_umem.h +++ b/include/rdma/ib_umem.h @@ -43,7 +43,6 @@ struct ib_umem_odp; struct ib_umem { struct ib_ucontext *context; - struct mm_struct *owning_mm; size_t length; unsigned long address; u32 writable : 1; @@ -52,6 +51,7 @@ struct ib_umem { struct sg_table sg_head; int nmap; unsigned int sg_nents; + struct vaddr_pin vaddr_pin; }; /* Returns the offset of the umem start relative to the first page. 
 */

From patchwork Fri Aug 9 22:58:32 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11087999
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 18/19] {mm,procfs}: Add display file_pins proc
Date: Fri, 9 Aug 2019 15:58:32 -0700
Message-Id: <20190809225833.6657-19-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>
From: Ira Weiny

Now that we have the file pins information stored, add a new procfs
entry to display it to the user.

NOTE: the output depends on what the file pin is tied to.  Some
processes have the pin associated with a file descriptor, in which case
that file is reported as well.  Others are associated directly with the
process mm and are reported as such.
For example, with a file pinned to an RDMA open context (fd 4) and a
file pinned to the mm of that process:

4: /dev/infiniband/uverbs0
 /mnt/pmem/foo
/mnt/pmem/bar

Signed-off-by: Ira Weiny
---
 fs/proc/base.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 214 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ebea9501afb8..f4d219172235 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2995,6 +2995,7 @@ static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
  */
 static const struct file_operations proc_task_operations;
 static const struct inode_operations proc_task_inode_operations;
+static const struct file_operations proc_pid_file_pins_operations;
 
 static const struct pid_entry tgid_base_stuff[] = {
 	DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations),
@@ -3024,6 +3025,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	ONE("stat", S_IRUGO, proc_tgid_stat),
 	ONE("statm", S_IRUGO, proc_pid_statm),
 	REG("maps", S_IRUGO, proc_pid_maps_operations),
+	REG("file_pins", S_IRUGO, proc_pid_file_pins_operations),
 #ifdef CONFIG_NUMA
 	REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
 #endif
@@ -3422,6 +3424,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	ONE("stat", S_IRUGO, proc_tid_stat),
 	ONE("statm", S_IRUGO, proc_pid_statm),
 	REG("maps", S_IRUGO, proc_pid_maps_operations),
+	REG("file_pins", S_IRUGO, proc_pid_file_pins_operations),
 #ifdef CONFIG_PROC_CHILDREN
 	REG("children", S_IRUGO, proc_tid_children_operations),
 #endif
@@ -3718,3 +3721,214 @@ void __init set_proc_pid_nlink(void)
 	nlink_tid = pid_entry_nlink(tid_base_stuff, ARRAY_SIZE(tid_base_stuff));
 	nlink_tgid = pid_entry_nlink(tgid_base_stuff, ARRAY_SIZE(tgid_base_stuff));
 }
+
+/**
+ * file_pin information below.
+ */
+
+struct proc_file_pins_private {
+	struct inode *inode;
+	struct task_struct *task;
+	struct mm_struct *mm;
+	struct files_struct *files;
+	unsigned int nr_pins;
+	struct xarray fps;
+} __randomize_layout;
+
+static void release_fp(struct proc_file_pins_private *priv)
+{
+	up_read(&priv->mm->mmap_sem);
+	mmput(priv->mm);
+}
+
+static void print_fd_file_pin(struct seq_file *m, struct file *file,
+			      unsigned long i)
+{
+	struct file_file_pin *fp;
+	struct file_file_pin *tmp;
+
+	if (list_empty_careful(&file->file_pins))
+		return;
+
+	seq_printf(m, "%lu: ", i);
+	seq_file_path(m, file, "\n");
+	seq_putc(m, '\n');
+
+	list_for_each_entry_safe(fp, tmp, &file->file_pins, list) {
+		seq_puts(m, " ");
+		seq_file_path(m, fp->file, "\n");
+		seq_putc(m, '\n');
+	}
+}
+
+/* We are storing the indexes within the FD table for later retrieval */
+static int store_fd(const void *priv, struct file *file, unsigned i)
+{
+	struct proc_file_pins_private *fp_priv;
+
+	/* cast away const... */
+	fp_priv = (struct proc_file_pins_private *)priv;
+
+	if (list_empty_careful(&file->file_pins))
+		return 0;
+
+	/* can't sleep in the iterate of the fd table */
+	xa_store(&fp_priv->fps, fp_priv->nr_pins, xa_mk_value(i), GFP_ATOMIC);
+	fp_priv->nr_pins++;
+
+	return 0;
+}
+
+static void store_mm_pins(struct proc_file_pins_private *priv)
+{
+	struct mm_file_pin *fp;
+	struct mm_file_pin *tmp;
+
+	list_for_each_entry_safe(fp, tmp, &priv->mm->file_pins, list) {
+		xa_store(&priv->fps, priv->nr_pins, fp, GFP_KERNEL);
+		priv->nr_pins++;
+	}
+}
+
+static void *fp_start(struct seq_file *m, loff_t *ppos)
+{
+	struct proc_file_pins_private *priv = m->private;
+	unsigned int pos = *ppos;
+
+	priv->task = get_proc_task(priv->inode);
+	if (!priv->task)
+		return ERR_PTR(-ESRCH);
+
+	if (!priv->mm || !mmget_not_zero(priv->mm))
+		return NULL;
+
+	priv->files = get_files_struct(priv->task);
+	down_read(&priv->mm->mmap_sem);
+
+	xa_destroy(&priv->fps);
+	priv->nr_pins = 0;
+
+	/* grab fds of "files" which have pins and store as xa values */
+	if (priv->files)
+		iterate_fd(priv->files, 0, store_fd, priv);
+
+	/* store mm_file_pins as xa entries */
+	store_mm_pins(priv);
+
+	if (pos >= priv->nr_pins) {
+		release_fp(priv);
+		return NULL;
+	}
+
+	return xa_load(&priv->fps, pos);
+}
+
+static void *fp_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct proc_file_pins_private *priv = m->private;
+
+	(*pos)++;
+	if ((*pos) >= priv->nr_pins) {
+		release_fp(priv);
+		return NULL;
+	}
+
+	return xa_load(&priv->fps, *pos);
+}
+
+static void fp_stop(struct seq_file *m, void *v)
+{
+	struct proc_file_pins_private *priv = m->private;
+
+	if (v)
+		release_fp(priv);
+
+	if (priv->task) {
+		put_task_struct(priv->task);
+		priv->task = NULL;
+	}
+
+	if (priv->files) {
+		put_files_struct(priv->files);
+		priv->files = NULL;
+	}
+}
+
+static int show_fp(struct seq_file *m, void *v)
+{
+	struct proc_file_pins_private *priv = m->private;
+
+	if (xa_is_value(v)) {
+		struct file *file;
+		unsigned long fd = xa_to_value(v);
+
+		rcu_read_lock();
+		file = fcheck_files(priv->files, fd);
+		if (file)
+			print_fd_file_pin(m, file, fd);
+		rcu_read_unlock();
+	} else {
+		struct mm_file_pin *fp = v;
+
+		seq_puts(m, "mm: ");
+		seq_file_path(m, fp->file, "\n");
+	}
+
+	return 0;
+}
+
+static const struct seq_operations proc_pid_file_pins_op = {
+	.start	= fp_start,
+	.next	= fp_next,
+	.stop	= fp_stop,
+	.show	= show_fp
+};
+
+static int proc_file_pins_open(struct inode *inode, struct file *file)
+{
+	struct proc_file_pins_private *priv = __seq_open_private(file,
+						&proc_pid_file_pins_op,
+						sizeof(*priv));
+
+	if (!priv)
+		return -ENOMEM;
+
+	xa_init(&priv->fps);
+	priv->inode = inode;
+	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	priv->task = NULL;
+	if (IS_ERR(priv->mm)) {
+		int err = PTR_ERR(priv->mm);
+
+		seq_release_private(inode, file);
+		return err;
+	}
+
+	return 0;
+}
+
+static int proc_file_pins_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_file_pins_private *priv = seq->private;
+
+	/* This is for "protection"; not sure when these may end up not
+	 * being NULL here... */
+	WARN_ON(priv->files);
+	WARN_ON(priv->task);
+
+	if (priv->mm)
+		mmdrop(priv->mm);
+
+	xa_destroy(&priv->fps);
+
+	return seq_release_private(inode, file);
+}
+
+static const struct file_operations proc_pid_file_pins_operations = {
+	.open		= proc_file_pins_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= proc_file_pins_release,
+};

From patchwork Fri Aug 9 22:58:33 2019
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11088001
From: ira.weiny@intel.com
To: Andrew Morton
Cc: Jason Gunthorpe, Dan Williams, Matthew Wilcox, Jan Kara,
    "Theodore Ts'o", John Hubbard, Michal Hocko, Dave Chinner,
    linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
    linux-mm@kvack.org, Ira Weiny
Subject: [RFC PATCH v2 19/19] mm/gup: Remove FOLL_LONGTERM DAX exclusion
Date: Fri, 9 Aug 2019 15:58:33 -0700
Message-Id: <20190809225833.6657-20-ira.weiny@intel.com>
In-Reply-To: <20190809225833.6657-1-ira.weiny@intel.com>
References: <20190809225833.6657-1-ira.weiny@intel.com>
From: Ira Weiny

Now that there is a mechanism for
users to safely take LONGTERM pins on FS DAX pages, remove the FS DAX
exclusion from the GUP implementation.  Special processing remains in
effect for CONFIG_CMA.

NOTE: some callers still fail because the vaddr_pin information has not
been passed into the new interface.  As new users appear, they can
start to use the new interface to support FS DAX.

Signed-off-by: Ira Weiny
---
 mm/gup.c | 78 ++++++--------------------------------------------------
 1 file changed, 8 insertions(+), 70 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 6d23f70d7847..58f008a3c153 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1415,26 +1415,6 @@ static long __get_user_pages_locked(struct task_struct *tsk,
 }
 #endif /* !CONFIG_MMU */
 
-#if defined(CONFIG_FS_DAX) || defined (CONFIG_CMA)
-static bool check_dax_vmas(struct vm_area_struct **vmas, long nr_pages)
-{
-	long i;
-	struct vm_area_struct *vma_prev = NULL;
-
-	for (i = 0; i < nr_pages; i++) {
-		struct vm_area_struct *vma = vmas[i];
-
-		if (vma == vma_prev)
-			continue;
-
-		vma_prev = vma;
-
-		if (vma_is_fsdax(vma))
-			return true;
-	}
-	return false;
-}
-
 #ifdef CONFIG_CMA
 static struct page *new_non_cma_page(struct page *page, unsigned long private)
 {
@@ -1568,18 +1548,6 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
 	return nr_pages;
 }
-#else
-static long check_and_migrate_cma_pages(struct task_struct *tsk,
-					struct mm_struct *mm,
-					unsigned long start,
-					unsigned long nr_pages,
-					struct page **pages,
-					struct vm_area_struct **vmas,
-					unsigned int gup_flags)
-{
-	return nr_pages;
-}
-#endif /* CONFIG_CMA */
 
 /*
  * __gup_longterm_locked() is a wrapper for __get_user_pages_locked which
@@ -1594,49 +1562,28 @@ static long __gup_longterm_locked(struct task_struct *tsk,
 				  unsigned int gup_flags,
 				  struct vaddr_pin *vaddr_pin)
 {
-	struct vm_area_struct **vmas_tmp = vmas;
 	unsigned long flags = 0;
-	long rc, i;
+	long rc;
 
-	if (gup_flags & FOLL_LONGTERM) {
-		if (!pages)
-			return -EINVAL;
-
-		if (!vmas_tmp) {
-			vmas_tmp = kcalloc(nr_pages,
-					   sizeof(struct vm_area_struct *),
-					   GFP_KERNEL);
-			if (!vmas_tmp)
-				return -ENOMEM;
-		}
+	if (gup_flags & FOLL_LONGTERM)
 		flags = memalloc_nocma_save();
-	}
 
 	rc = __get_user_pages_locked(tsk, mm, start, nr_pages, pages,
-				     vmas_tmp, NULL, gup_flags, vaddr_pin);
+				     vmas, NULL, gup_flags, vaddr_pin);
 
 	if (gup_flags & FOLL_LONGTERM) {
 		memalloc_nocma_restore(flags);
 		if (rc < 0)
 			goto out;
 
-		if (check_dax_vmas(vmas_tmp, rc)) {
-			for (i = 0; i < rc; i++)
-				put_page(pages[i]);
-			rc = -EOPNOTSUPP;
-			goto out;
-		}
-
 		rc = check_and_migrate_cma_pages(tsk, mm, start, rc, pages,
-						 vmas_tmp, gup_flags);
+						 vmas, gup_flags);
 	}
 
 out:
-	if (vmas_tmp != vmas)
-		kfree(vmas_tmp);
 	return rc;
 }
-#else /* !CONFIG_FS_DAX && !CONFIG_CMA */
+#else /* !CONFIG_CMA */
 static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
 						  struct mm_struct *mm,
 						  unsigned long start,
@@ -1649,7 +1596,7 @@ static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
 	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages,
 				       vmas, NULL, flags, vaddr_pin);
 }
-#endif /* CONFIG_FS_DAX || CONFIG_CMA */
+#endif /* CONFIG_CMA */
 
 /*
  * This is the same as get_user_pages_remote(), just with a
@@ -1887,9 +1834,6 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 			goto pte_unmap;
 
 		if (pte_devmap(pte)) {
-			if (unlikely(flags & FOLL_LONGTERM))
-				goto pte_unmap;
-
 			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
 			if (unlikely(!pgmap)) {
 				undo_dev_pagemap(nr, nr_start, pages);
@@ -2139,12 +2083,9 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
-	if (pmd_devmap(orig)) {
-		if (unlikely(flags & FOLL_LONGTERM))
-			return 0;
+	if (pmd_devmap(orig))
 		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr,
 					     flags, vaddr_pin);
-	}
 
 	refs = 0;
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
@@ -2182,12 +2123,9 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
-	if (pud_devmap(orig)) {
-		if (unlikely(flags & FOLL_LONGTERM))
-			return 0;
+	if (pud_devmap(orig))
 		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr,
 					     flags, vaddr_pin);
-	}
 
 	refs = 0;
 	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);