From patchwork Tue Dec 4 23:37:04 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712719
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 01/26] fs: add an iopoll method to struct file_operations
Date: Tue, 4 Dec 2018 16:37:04 -0700
Message-Id: <20181204233729.26776-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Christoph Hellwig

This new method is used to explicitly poll for I/O completion for an
iocb.  It must be called for any iocb submitted asynchronously (that is,
with a non-null ki_complete) which has the IOCB_HIPRI flag set.

The method is assisted by a new ki_cookie field in struct kiocb to store
the polling cookie.

TODO: we can probably union ki_cookie with the existing hint and I/O
priority fields to avoid struct kiocb growth.

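Illustration only, not part of the patch: a minimal userspace sketch of the
contract described above, assuming a toy backend. The mock_* types and the
needs_iopoll() and reap_one() helpers are hypothetical stand-ins; only the
names IOCB_HIPRI, ki_complete, ki_cookie and iopoll come from the patch, and
the flag value used here is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; the real IOCB_HIPRI is defined in include/linux/fs.h. */
#define MOCK_IOCB_HIPRI (1 << 3)

struct mock_kiocb;

/* Mirrors the shape of the new method: int (*iopoll)(struct kiocb *, bool). */
struct mock_file_operations {
	int (*iopoll)(struct mock_kiocb *kiocb, bool spin);
};

struct mock_kiocb {
	const struct mock_file_operations *f_op;
	void (*ki_complete)(struct mock_kiocb *kiocb, long ret);
	int ki_flags;
	unsigned int ki_cookie;	/* set at submission, consumed by ->iopoll */
	bool done;
};

/* An iocb must be polled if it is async (non-NULL ki_complete) and HIPRI. */
static bool needs_iopoll(const struct mock_kiocb *kiocb)
{
	return kiocb->ki_complete != NULL &&
	       (kiocb->ki_flags & MOCK_IOCB_HIPRI) != 0;
}

/* Poll once; returns the number of completions found, 0 if polling does not apply. */
static int reap_one(struct mock_kiocb *kiocb, bool spin)
{
	if (!needs_iopoll(kiocb))
		return 0;
	assert(kiocb->f_op && kiocb->f_op->iopoll);
	return kiocb->f_op->iopoll(kiocb, spin);
}

/* Toy completion handler: just records that the request finished. */
static void mock_complete(struct mock_kiocb *kiocb, long ret)
{
	(void)ret;
	kiocb->done = true;
}

/* Toy backend: the request with cookie 42 completes on its first poll. */
static int mock_iopoll(struct mock_kiocb *kiocb, bool spin)
{
	(void)spin;
	if (kiocb->ki_cookie != 42 || kiocb->done)
		return 0;
	kiocb->ki_complete(kiocb, 0);
	return 1;
}

static const struct mock_file_operations mock_fops = { .iopoll = mock_iopoll };
```

A caller would initialise a mock_kiocb with mock_fops, mock_complete,
MOCK_IOCB_HIPRI and cookie 42, then call reap_one() until it reports a
completion. A sync iocb (NULL ki_complete) is never polled through this path.
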
Reviewed-by: Johannes Thumshirn
Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 Documentation/filesystems/vfs.txt | 3 +++
 include/linux/fs.h                | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 5f71a252e2e0..d9dc5e4d82b9 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -857,6 +857,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+	int (*iopoll)(struct kiocb *kiocb, bool spin);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);
@@ -902,6 +903,8 @@ otherwise noted.

   write_iter: possibly asynchronous write with iov_iter as source

+  iopoll: called when aio wants to poll for completions on HIPRI iocbs
+
   iterate: called when the VFS needs to read the directory contents

   iterate_shared: called when the VFS needs to read the directory contents
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a1ab233e6469..6a5f71f8ae06 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -310,6 +310,7 @@ struct kiocb {
 	int			ki_flags;
 	u16			ki_hint;
 	u16			ki_ioprio; /* See linux/ioprio.h */
+	unsigned int		ki_cookie; /* for ->iopoll */
 } __randomize_layout;

 static inline bool is_sync_kiocb(struct kiocb *kiocb)
@@ -1781,6 +1782,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+	int (*iopoll)(struct kiocb *kiocb, bool spin);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);

From patchwork Tue Dec 4 23:37:05 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712723
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 02/26] block: add REQ_HIPRI_ASYNC
Date: Tue, 4 Dec 2018 16:37:05 -0700
Message-Id: <20181204233729.26776-3-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

For the upcoming async polled IO, we can't sleep while allocating
requests.  If we do, then we introduce a deadlock where the submitter
already has async polled IO in-flight, but can't wait for it to complete
since polled requests must be actively found and reaped.

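Illustration only, not part of the patch: a small userspace model of why the
async polled case carries the no-wait bit. The MOCK_* bit positions, the
polled_opf() helper and the mock_alloc_request() allocator are assumptions
made for this sketch; only the composition REQ_HIPRI | REQ_NOWAIT is taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are arbitrary here; the kernel derives them from enum req_flag_bits. */
#define MOCK_REQ_NOWAIT		(1ULL << 0)
#define MOCK_REQ_HIPRI		(1ULL << 1)
/* As in the patch: async polled IO is HIPRI plus NOWAIT. */
#define MOCK_REQ_HIPRI_ASYNC	(MOCK_REQ_HIPRI | MOCK_REQ_NOWAIT)

/* The whole point of the new flag: async polled submissions must not sleep. */
static_assert((MOCK_REQ_HIPRI_ASYNC & MOCK_REQ_NOWAIT) != 0,
	      "async polled IO must carry NOWAIT");

/* Pick the bio flags for a submission; 0 means "not polled". */
static uint64_t polled_opf(bool hipri, bool is_sync)
{
	if (!hipri)
		return 0;
	return is_sync ? MOCK_REQ_HIPRI : MOCK_REQ_HIPRI_ASYNC;
}

enum alloc_result { ALLOC_OK, ALLOC_WOULD_SLEEP, ALLOC_FAIL_FAST };

/*
 * Hypothetical allocator model: when no request is free, a NOWAIT
 * submission fails immediately, while any other submission would sleep
 * until an in-flight request completes.
 */
static enum alloc_result mock_alloc_request(uint64_t opf, unsigned int free_requests)
{
	if (free_requests > 0)
		return ALLOC_OK;
	return (opf & MOCK_REQ_NOWAIT) ? ALLOC_FAIL_FAST : ALLOC_WOULD_SLEEP;
}
```

In this model, a submitter whose polled requests occupy every slot gets
ALLOC_FAIL_FAST back and can go reap completions, instead of sleeping on
requests that only it can complete by polling.
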
Signed-off-by: Jens Axboe
---
 include/linux/blk_types.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index c0ba1a038ff3..1d9c3da0f2a1 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -346,6 +346,7 @@ enum req_flag_bits {
 #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
 #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
+#define REQ_HIPRI_ASYNC		(REQ_HIPRI | REQ_NOWAIT)

 #define REQ_DRV			(1ULL << __REQ_DRV)
 #define REQ_SWAP		(1ULL << __REQ_SWAP)

From patchwork Tue Dec 4 23:37:06 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712727
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 03/26] block: wire up block device iopoll method
Date: Tue, 4 Dec 2018 16:37:06 -0700
Message-Id: <20181204233729.26776-4-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

From: Christoph Hellwig

Just call blk_poll on the iocb cookie, we can derive the block device
from the inode trivially.

Reviewed-by: Johannes Thumshirn
Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/block_dev.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index e1886cc7048f..6de8d35f6e41 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -281,6 +281,14 @@ struct blkdev_dio {

 static struct bio_set blkdev_dio_pool;

+static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
+{
+	struct block_device *bdev = I_BDEV(kiocb->ki_filp->f_mapping->host);
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
+}
+
 static void blkdev_bio_end_io(struct bio *bio)
 {
 	struct blkdev_dio *dio = bio->bi_private;
@@ -398,6 +406,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 				bio->bi_opf |= REQ_HIPRI;

 			qc = submit_bio(bio);
+			WRITE_ONCE(iocb->ki_cookie, qc);
 			break;
 		}

@@ -2070,6 +2079,7 @@ const struct file_operations def_blk_fops = {
 	.llseek		= block_llseek,
 	.read_iter	= blkdev_read_iter,
 	.write_iter	= blkdev_write_iter,
+	.iopoll		= blkdev_iopoll,
 	.mmap		= generic_file_mmap,
 	.fsync		= blkdev_fsync,
 	.unlocked_ioctl	= block_ioctl,

From patchwork Tue Dec 4 23:37:07 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712729
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 04/26] block: use REQ_HIPRI_ASYNC for non-sync polled IO
Date: Tue, 4 Dec 2018 16:37:07 -0700
Message-Id: <20181204233729.26776-5-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

Tell the block layer if it's a sync or async polled request, so it can
do the right thing.

Signed-off-by: Jens Axboe
---
 fs/block_dev.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6de8d35f6e41..b8f574615792 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -402,8 +402,12 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)

 		nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
 		if (!nr_pages) {
-			if (iocb->ki_flags & IOCB_HIPRI)
-				bio->bi_opf |= REQ_HIPRI;
+			if (iocb->ki_flags & IOCB_HIPRI) {
+				if (!is_sync)
+					bio->bi_opf |= REQ_HIPRI_ASYNC;
+				else
+					bio->bi_opf |= REQ_HIPRI;
+			}

 			qc = submit_bio(bio);
 			WRITE_ONCE(iocb->ki_cookie, qc);

From patchwork Tue Dec 4 23:37:08 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712735
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 05/26] iomap: wire up the iopoll method
Date: Tue, 4 Dec 2018 16:37:08 -0700
Message-Id: <20181204233729.26776-6-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

From: Christoph Hellwig

Store the request queue the last bio was submitted to in the iocb
private data in addition to the cookie so that we find the right block
device.  Also refactor the common direct I/O bio submission code into a
nice little helper.

Signed-off-by: Christoph Hellwig

Modified to use REQ_HIPRI_ASYNC for async polled IO.

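Illustration only, not part of the patch: a userspace sketch of the
"queue plus cookie" lookup described above, assuming a toy queue that knows
about a single pending cookie. All mock_* names are hypothetical; the guard
against a missing queue mirrors the one in iomap_dio_iopoll() in the diff
below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request queue: remembers one pending cookie and counts poll calls. */
struct mock_queue {
	unsigned int pending_cookie;
	int polls;
};

struct mock_kiocb {
	void *private;		/* last queue a bio was submitted to, or NULL */
	unsigned int ki_cookie;	/* cookie returned by that submission */
};

/* Stand-in for blk_poll(): reports 1 if the cookie is the pending one. */
static int mock_blk_poll(struct mock_queue *q, unsigned int cookie, bool spin)
{
	(void)spin;
	q->polls++;
	return q->pending_cookie == cookie ? 1 : 0;
}

/* Submission side: record where the last bio went and its cookie. */
static void mock_record_submission(struct mock_kiocb *kiocb,
				   struct mock_queue *q, unsigned int cookie)
{
	assert(q != NULL);
	kiocb->ki_cookie = cookie;
	kiocb->private = q;
}

/* Poll side: nothing to poll until a queue has been recorded. */
static int mock_dio_iopoll(struct mock_kiocb *kiocb, bool spin)
{
	struct mock_queue *q = kiocb->private;

	if (!q)
		return 0;
	return mock_blk_poll(q, kiocb->ki_cookie, spin);
}
```

Polling a mock_kiocb before mock_record_submission() has run returns 0
without touching any queue; afterwards the poll is routed to the recorded
queue with the recorded cookie.
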
Signed-off-by: Jens Axboe --- fs/gfs2/file.c | 2 ++ fs/iomap.c | 47 +++++++++++++++++++++++++++++-------------- fs/xfs/xfs_file.c | 1 + include/linux/iomap.h | 1 + 4 files changed, 36 insertions(+), 15 deletions(-) diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 45a17b770d97..358157efc5b7 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -1280,6 +1280,7 @@ const struct file_operations gfs2_file_fops = { .llseek = gfs2_llseek, .read_iter = gfs2_file_read_iter, .write_iter = gfs2_file_write_iter, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = gfs2_ioctl, .mmap = gfs2_mmap, .open = gfs2_open, @@ -1310,6 +1311,7 @@ const struct file_operations gfs2_file_fops_nolock = { .llseek = gfs2_llseek, .read_iter = gfs2_file_read_iter, .write_iter = gfs2_file_write_iter, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = gfs2_ioctl, .mmap = gfs2_mmap, .open = gfs2_open, diff --git a/fs/iomap.c b/fs/iomap.c index d094e5688bd3..bd483fcb7b5a 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1441,6 +1441,32 @@ struct iomap_dio { }; }; +int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) +{ + struct request_queue *q = READ_ONCE(kiocb->private); + + if (!q) + return 0; + return blk_poll(q, READ_ONCE(kiocb->ki_cookie), spin); +} +EXPORT_SYMBOL_GPL(iomap_dio_iopoll); + +static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, + struct bio *bio) +{ + atomic_inc(&dio->ref); + + if (dio->iocb->ki_flags & IOCB_HIPRI) { + if (!dio->wait_for_completion) + bio->bi_opf |= REQ_HIPRI_ASYNC; + else + bio->bi_opf |= REQ_HIPRI; + } + + dio->submit.last_queue = bdev_get_queue(iomap->bdev); + dio->submit.cookie = submit_bio(bio); +} + static ssize_t iomap_dio_complete(struct iomap_dio *dio) { struct kiocb *iocb = dio->iocb; @@ -1553,7 +1579,7 @@ static void iomap_dio_bio_end_io(struct bio *bio) } } -static blk_qc_t +static void iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, unsigned len) { @@ -1567,15 +1593,10 @@ iomap_dio_zero(struct iomap_dio *dio, struct 
iomap *iomap, loff_t pos, bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io; - if (dio->iocb->ki_flags & IOCB_HIPRI) - flags |= REQ_HIPRI; - get_page(page); __bio_add_page(bio, page, len, 0); bio_set_op_attrs(bio, REQ_OP_WRITE, flags); - - atomic_inc(&dio->ref); - return submit_bio(bio); + iomap_dio_submit_bio(dio, iomap, bio); } static loff_t @@ -1678,9 +1699,6 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, bio_set_pages_dirty(bio); } - if (dio->iocb->ki_flags & IOCB_HIPRI) - bio->bi_opf |= REQ_HIPRI; - iov_iter_advance(dio->submit.iter, n); dio->size += n; @@ -1688,11 +1706,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, copied += n; nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES); - - atomic_inc(&dio->ref); - - dio->submit.last_queue = bdev_get_queue(iomap->bdev); - dio->submit.cookie = submit_bio(bio); + iomap_dio_submit_bio(dio, iomap, bio); } while (nr_pages); /* @@ -1912,6 +1926,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, if (dio->flags & IOMAP_DIO_WRITE_FUA) dio->flags &= ~IOMAP_DIO_NEED_SYNC; + WRITE_ONCE(iocb->ki_cookie, dio->submit.cookie); + WRITE_ONCE(iocb->private, dio->submit.last_queue); + if (!atomic_dec_and_test(&dio->ref)) { if (!dio->wait_for_completion) return -EIOCBQUEUED; diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index e47425071e65..60c2da41f0fc 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1203,6 +1203,7 @@ const struct file_operations xfs_file_operations = { .write_iter = xfs_file_write_iter, .splice_read = generic_file_splice_read, .splice_write = iter_file_splice_write, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = xfs_file_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = xfs_file_compat_ioctl, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 9a4258154b25..0fefb5455bda 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -162,6 +162,7 @@ typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret, unsigned 
flags); ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, const struct iomap_ops *ops, iomap_dio_end_io_t end_io); +int iomap_dio_iopoll(struct kiocb *kiocb, bool spin); #ifdef CONFIG_SWAP struct file; From patchwork Tue Dec 4 23:37:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712739 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EBCE713AF for ; Tue, 4 Dec 2018 23:37:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DFEA929C06 for ; Tue, 4 Dec 2018 23:37:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D499829EB9; Tue, 4 Dec 2018 23:37:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 77D3429C06 for ; Tue, 4 Dec 2018 23:37:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726617AbeLDXhu (ORCPT ); Tue, 4 Dec 2018 18:37:50 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:46221 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726001AbeLDXhu (ORCPT ); Tue, 4 Dec 2018 18:37:50 -0500 Received: by mail-pl1-f196.google.com with SMTP id t13so9063291ply.13 for ; Tue, 04 Dec 2018 15:37:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
bh=er4XNTBrDyS5t2gvWJtJjpkVFpAPOC78sIXCy9gEMJ4=; b=PQ78I25EACF4QpvLy2c3RPPg3RGlYaTp356E3BtnPHRUhKnUnNAiREl35+DqWiL9MJ wruRA6lo9pj6KoexHWN9W5qh1LCvDsmNYVgHX9MbemBF2ud/QsiKJERpqlEK0sdG9ePp znlptHklXgtAk6qe8HEoEkZsewLgg4BK4BmO7JZOVpjt5DamG/jagvUY4mymOUwiAPNv CqVqpDOSXcxCP8FCm/yXVsypqtwyyK7wOuDZ4SkR59aONeBebGsxbipUka0udprNnj/l WWTeMxuq9TCLdrTK1ghSgkLozJyEQsNC1Cpoq79NQqQAWM65lY8osnVLbxukhaKsdPKg vFsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=er4XNTBrDyS5t2gvWJtJjpkVFpAPOC78sIXCy9gEMJ4=; b=LN2a+w356x26/002yK2XAIn+X1/z/zwYU3vK97AVimHFFXBV7IfZMcsEt98Nw0njc5 ruQ799DLuX99FTFSZiw2+M4TbWC8zf/A4GIvJNiPLQVk+Oe4p8m2EPmxFi6FOe5jwThz iKr4o36YeUIVwshMBtReamiNlxfnezSMDjaYws7OCw6kW+1ym4nEF9ipcPJnMb8+UEEX hIEDKn1x/HwoBph8JIUHcahBuR6104GxwQIcBc5MixbZuod4lTHO4qIqGGxhtrrsS4jy TtegXGZCeFv52rowKJoccfrZhR3nZD7pBXMCqN6bAe1ajdKUrXotzlIqB7UDE6M7fywH Wwig== X-Gm-Message-State: AA+aEWYKG7QTM9Klh9ViLhsMG3meGkVzQgbeGuxO7pEpQs/vHOYdiboE Le1C5OV2nf354AmAVi5EDhX3N0czAQ4= X-Google-Smtp-Source: AFSGD/U/wShxrZGjTzS5JrlcAERYXLvQHt9ixDXJZmJEejy80uMAUNmjIUmOTaxM3xsIRXl3pK6+qg== X-Received: by 2002:a17:902:bc3:: with SMTP id 61mr22135069plr.15.1543966669243; Tue, 04 Dec 2018 15:37:49 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 06/26] aio: use assigned completion handler
Date: Tue, 4 Dec 2018 16:37:09 -0700
Message-Id: <20181204233729.26776-7-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

We know this is a read/write request, but in preparation for having
different kinds of those, ensure that we call the assigned handler
instead of assuming it's aio_complete_rw().

Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/aio.c b/fs/aio.c
index 05647d352bf3..cf0de61743e8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1490,7 +1490,7 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 		ret = -EINTR;
 		/*FALLTHRU*/
 	default:
-		aio_complete_rw(req, ret, 0);
+		req->ki_complete(req, ret, 0);
 	}
 }
 

From patchwork Tue Dec 4 23:37:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712743
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 07/26] aio: separate out ring reservation from req allocation
Date: Tue, 4 Dec 2018 16:37:10 -0700
Message-Id: <20181204233729.26776-8-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

From: Christoph Hellwig

This is in preparation for certain types of IO not needing a ring
reservation.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index cf0de61743e8..eaceb40e6cf5 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -901,7 +901,7 @@ static void put_reqs_available(struct kioctx *ctx, unsigned nr)
 	local_irq_restore(flags);
 }
 
-static bool get_reqs_available(struct kioctx *ctx)
+static bool __get_reqs_available(struct kioctx *ctx)
 {
 	struct kioctx_cpu *kcpu;
 	bool ret = false;
@@ -993,6 +993,14 @@ static void user_refill_reqs_available(struct kioctx *ctx)
 	spin_unlock_irq(&ctx->completion_lock);
 }
 
+static bool get_reqs_available(struct kioctx *ctx)
+{
+	if (__get_reqs_available(ctx))
+		return true;
+	user_refill_reqs_available(ctx);
+	return __get_reqs_available(ctx);
+}
+
 /* aio_get_req
  *	Allocate a slot for an aio request.
  * Returns NULL if no requests are free.
@@ -1001,24 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct aio_kiocb *req;
 
-	if (!get_reqs_available(ctx)) {
-		user_refill_reqs_available(ctx);
-		if (!get_reqs_available(ctx))
-			return NULL;
-	}
-
 	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
 	if (unlikely(!req))
-		goto out_put;
+		return NULL;
 
 	percpu_ref_get(&ctx->reqs);
 	INIT_LIST_HEAD(&req->ki_list);
 	refcount_set(&req->ki_refcnt, 0);
 	req->ki_ctx = ctx;
 	return req;
-out_put:
-	put_reqs_available(ctx, 1);
-	return NULL;
 }
 
 static struct kioctx *lookup_ioctx(unsigned long ctx_id)
@@ -1805,9 +1804,13 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		return -EINVAL;
 	}
 
+	if (!get_reqs_available(ctx))
+		return -EAGAIN;
+
+	ret = -EAGAIN;
 	req = aio_get_req(ctx);
 	if (unlikely(!req))
-		return -EAGAIN;
+		goto out_put_reqs_available;
 
 	if (iocb.aio_flags & IOCB_FLAG_RESFD) {
 		/*
@@ -1870,11 +1873,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 	return 0;
 out_put_req:
-	put_reqs_available(ctx, 1);
 	percpu_ref_put(&ctx->reqs);
 	if (req->ki_eventfd)
 		eventfd_ctx_put(req->ki_eventfd);
 	kmem_cache_free(kiocb_cachep, req);
+out_put_reqs_available:
+	put_reqs_available(ctx, 1);
 	return ret;
 }
 

From patchwork Tue Dec 4 23:37:11 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712747
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 08/26] aio: don't zero entire aio_kiocb aio_get_req()
Date: Tue, 4 Dec 2018 16:37:11 -0700
Message-Id: <20181204233729.26776-9-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

It's 192 bytes, fairly substantial. Most items don't need to be cleared,
especially not upfront. Clear the ones we do need to clear, and leave
the other ones for setup when the iocb is prepared and submitted.

Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index eaceb40e6cf5..522c04864d82 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1009,14 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct aio_kiocb *req;
 
-	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
+	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
 	if (unlikely(!req))
 		return NULL;
 
 	percpu_ref_get(&ctx->reqs);
+	req->ki_ctx = ctx;
 	INIT_LIST_HEAD(&req->ki_list);
 	refcount_set(&req->ki_refcnt, 0);
-	req->ki_ctx = ctx;
+	req->ki_eventfd = NULL;
 	return req;
 }
 
@@ -1730,6 +1731,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
 	if (unlikely(!req->file))
 		return -EBADF;
 
+	req->head = NULL;
+	req->woken = false;
+	req->cancelled = false;
+
 	apt.pt._qproc = aio_poll_queue_proc;
 	apt.pt._key = req->events;
 	apt.iocb = aiocb;

From patchwork Tue Dec 4 23:37:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712759
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 09/26] aio: only use blk plugs for > 2 depth submissions
Date: Tue, 4 Dec 2018 16:37:12 -0700
Message-Id: <20181204233729.26776-10-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

Plugging is meant to optimize submission of a string of IOs. If we don't
have more than 2 being submitted, don't bother setting up a plug.

Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 522c04864d82..ed6c3914477a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -69,6 +69,12 @@ struct aio_ring {
 	struct io_event		io_events[0];
 }; /* 128 bytes + ring size */
 
+/*
+ * Plugging is meant to work with larger batches of IOs. If we don't
+ * have more than the below, then don't bother setting up a plug.
+ */
+#define AIO_PLUG_THRESHOLD	2
+
 #define AIO_RING_PAGES	8
 
 struct kioctx_table {
@@ -1919,7 +1925,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_start_plug(&plug);
 	for (i = 0; i < nr; i++) {
 		struct iocb __user *user_iocb;
 
@@ -1932,7 +1939,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		if (ret)
 			break;
 	}
-	blk_finish_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_finish_plug(&plug);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
@@ -1959,7 +1967,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_start_plug(&plug);
 	for (i = 0; i < nr; i++) {
 		compat_uptr_t user_iocb;
 
@@ -1972,7 +1981,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 		if (ret)
 			break;
 	}
-	blk_finish_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_finish_plug(&plug);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;

From patchwork Tue Dec 4 23:37:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712753
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 10/26] aio: use iocb_put() instead of open coding it
Date: Tue, 4 Dec 2018 16:37:13 -0700
Message-Id: <20181204233729.26776-11-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

Replace the percpu_ref_put() + kmem_cache_free() with a call to
iocb_put() instead.

Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index ed6c3914477a..cf93b92bfb1e 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1884,10 +1884,9 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 	return 0;
 out_put_req:
-	percpu_ref_put(&ctx->reqs);
 	if (req->ki_eventfd)
 		eventfd_ctx_put(req->ki_eventfd);
-	kmem_cache_free(kiocb_cachep, req);
+	iocb_put(req);
 out_put_reqs_available:
 	put_reqs_available(ctx, 1);
 	return ret;

From patchwork Tue Dec 4 23:37:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712757
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 11/26] aio: split out iocb copy from io_submit_one()
Date: Tue, 4 Dec 2018 16:37:14 -0700
Message-Id: <20181204233729.26776-12-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

In preparation for handing in iocbs in a different fashion as well. Also
make it clear that the iocb being passed in isn't modified, by marking
it const throughout.

Signed-off-by: Jens Axboe
---
 fs/aio.c | 68 +++++++++++++++++++++++++++++++-------------------------
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index cf93b92bfb1e..06c8bcc72496 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1420,7 +1420,7 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
 	aio_complete(iocb, res, res2);
 }
 
-static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
+static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 {
 	int ret;
 
@@ -1461,7 +1461,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 	return ret;
 }
 
-static int aio_setup_rw(int rw, struct iocb *iocb, struct iovec **iovec,
+static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
 		bool vectored, bool compat, struct iov_iter *iter)
 {
 	void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
@@ -1500,8 +1500,8 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 	}
 }
 
-static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
-		bool compat)
+static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
+			bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct iov_iter iter;
@@ -1533,8 +1533,8 @@ static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
 	return ret;
 }
 
-static ssize_t aio_write(struct kiocb *req, struct iocb *iocb, bool vectored,
-		bool compat)
+static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
+			 bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct iov_iter iter;
@@ -1589,7 +1589,8 @@ static void aio_fsync_work(struct work_struct *work)
 	aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
 }
 
-static int aio_fsync(struct fsync_iocb *req, struct iocb *iocb, bool datasync)
+static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
+		     bool datasync)
 {
 	if (unlikely(iocb->aio_buf || iocb->aio_offset || iocb->aio_nbytes ||
 			iocb->aio_rw_flags))
@@ -1717,7 +1718,7 @@ aio_poll_queue_proc(struct file *file, struct wait_queue_head *head,
 	add_wait_queue(head, &pt->iocb->poll.wait);
 }
 
-static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
+static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 {
 	struct kioctx *ctx = aiocb->ki_ctx;
 	struct poll_iocb *req = &aiocb->poll;
@@ -1789,27 +1790,23 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
 	return 0;
 }
 
-static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
-			 bool compat)
+static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
+			   struct iocb __user *user_iocb, bool compat)
 {
 	struct aio_kiocb *req;
-	struct iocb iocb;
 	ssize_t ret;
 
-	if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
-		return -EFAULT;
-
 	/* enforce forwards compatibility on users */
-	if (unlikely(iocb.aio_reserved2)) {
+	if (unlikely(iocb->aio_reserved2)) {
 		pr_debug("EINVAL: reserve field set\n");
 		return -EINVAL;
 	}
 
 	/* prevent overflows */
 	if (unlikely(
-	    (iocb.aio_buf != (unsigned long)iocb.aio_buf) ||
-	    (iocb.aio_nbytes != (size_t)iocb.aio_nbytes) ||
-	    ((ssize_t)iocb.aio_nbytes < 0)
+	    (iocb->aio_buf != (unsigned long)iocb->aio_buf) ||
+	    (iocb->aio_nbytes != (size_t)iocb->aio_nbytes) ||
+	    ((ssize_t)iocb->aio_nbytes < 0)
 	   )) {
 		pr_debug("EINVAL: overflow check\n");
 		return -EINVAL;
@@ -1823,14 +1820,14 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	if (unlikely(!req))
 		goto out_put_reqs_available;
 
-	if (iocb.aio_flags & IOCB_FLAG_RESFD) {
+	if (iocb->aio_flags & IOCB_FLAG_RESFD) {
 		/*
 		 * If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
 		 * instance of the file* now. The file descriptor must be
 		 * an eventfd() fd, and will be signaled for each completed
 		 * event using the eventfd_signal() function.
 		 */
-		req->ki_eventfd = eventfd_ctx_fdget((int) iocb.aio_resfd);
+		req->ki_eventfd = eventfd_ctx_fdget((int) iocb->aio_resfd);
 		if (IS_ERR(req->ki_eventfd)) {
 			ret = PTR_ERR(req->ki_eventfd);
 			req->ki_eventfd = NULL;
@@ -1845,32 +1842,32 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	}
 
 	req->ki_user_iocb = user_iocb;
-	req->ki_user_data = iocb.aio_data;
+	req->ki_user_data = iocb->aio_data;
 
-	switch (iocb.aio_lio_opcode) {
+	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(&req->rw, &iocb, false, compat);
+		ret = aio_read(&req->rw, iocb, false, compat);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(&req->rw, &iocb, false, compat);
+		ret = aio_write(&req->rw, iocb, false, compat);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(&req->rw, &iocb, true, compat);
+		ret = aio_read(&req->rw, iocb, true, compat);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(&req->rw, &iocb, true, compat);
+		ret = aio_write(&req->rw, iocb, true, compat);
 		break;
 	case IOCB_CMD_FSYNC:
-		ret = aio_fsync(&req->fsync, &iocb, false);
+		ret = aio_fsync(&req->fsync, iocb, false);
 		break;
 	case IOCB_CMD_FDSYNC:
-		ret = aio_fsync(&req->fsync, &iocb, true);
+		ret = aio_fsync(&req->fsync, iocb, true);
 		break;
 	case IOCB_CMD_POLL:
-		ret = aio_poll(req, &iocb);
+		ret = aio_poll(req, iocb);
 		break;
 	default:
-		pr_debug("invalid aio operation %d\n", iocb.aio_lio_opcode);
+		pr_debug("invalid aio operation %d\n", iocb->aio_lio_opcode);
 		ret = -EINVAL;
 		break;
 	}
@@ -1892,6 +1889,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	return ret;
 }
 
+static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
+			 bool compat)
+{
+	struct iocb iocb;
+
+	if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
+		return -EFAULT;
+
+	return __io_submit_one(ctx, &iocb, user_iocb, compat);
+}
+
 /* sys_io_submit:
  *	Queue the nr iocbs pointed to by iocbpp for processing.  Returns
  *	the number of iocbs queued.  May return -EINVAL if the aio_context

From patchwork Tue Dec 4 23:37:15 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712763
QyJ5hIkTzm8t4ZZ4q7p0aYpa+yjpSq54ovVx1/X0mUyVpUIlwPE7XpPfV17gavUEUAX2 wpuUO1RSmxyj30V0Idkzk7Osz4hvVoRXGOajE08JYmUYxRCX6RvqpOAIgfviMS8Wp5UZ 58QjIGEIOgpq9LkKn5iwLKfFl1G2hOiLxrT8J2k2iKlZNsZIH6bWwe574r292S0UJ/Gj 5P2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=L8yxXuFlKRVZGhU6vByhKAQcKxwaMq31dRwuK5kkt+A=; b=RRAkOuxOpdnYemhAiz4fywaM4GladzxHw0YQXyAOudGND9Skr4Gghqi0JdYb69SD9M J58V7A40Ec8dgMVp1z5HqNqgEQos3j+ky4CCL3XVPSmWodLijnwGQD1qSDr8uHmB6fgV AlxVhKLJSr5csS93wDD3KRlMrWCXeDTUcAdAQbc9kW65p1cfsWK8WGY8VcBzETui600N 1HnjSp6ihbkN3WQprKKR15s7+jB48VzSXmuT8Il0QCsFU4BxWPaXp2sD3f6NYa5m8gHw PPbv8U9qTw8FLtcQ5z5HMQ1o07pg4ZCIfhsGPG+SxWE4aQNaY4oV/n0PhaBOhNqR/qvO t2Ag== X-Gm-Message-State: AA+aEWYCHKr354p934hwOxS2CzcdnWqLHfHC+4YulLJkAwaBtEQJQn7Z sBVTjc6E2b1QFTVap10HdKPWU0m+8y0= X-Google-Smtp-Source: AFSGD/VnjMZuzmOf/CuJrjb3quTvbHG8WmRQRAYgkuLMvn4Hf7iAjytZlj5cSIt4aZIq1aXjBKGoXA== X-Received: by 2002:a17:902:a5c3:: with SMTP id t3mr12422871plq.117.1543966683426; Tue, 04 Dec 2018 15:38:03 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:02 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 12/26] aio: abstract out io_event filler helper Date: Tue, 4 Dec 2018 16:37:15 -0700 Message-Id: <20181204233729.26776-13-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Jens Axboe --- fs/aio.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 06c8bcc72496..173f1f79dc8f 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1063,6 +1063,15 @@ static inline void iocb_put(struct aio_kiocb *iocb) } } +static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, + long res, long res2) +{ + ev->obj = (u64)(unsigned long)iocb->ki_user_iocb; + ev->data = iocb->ki_user_data; + ev->res = res; + ev->res2 = res2; +} + /* aio_complete * Called when the io request on the given iocb is complete. 
*/ @@ -1090,10 +1099,7 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2) ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); event = ev_page + pos % AIO_EVENTS_PER_PAGE; - event->obj = (u64)(unsigned long)iocb->ki_user_iocb; - event->data = iocb->ki_user_data; - event->res = res; - event->res2 = res2; + aio_fill_event(event, iocb, res, res2); kunmap_atomic(ev_page); flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); From patchwork Tue Dec 4 23:37:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712767 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BF86613AF for ; Tue, 4 Dec 2018 23:38:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B2B8329C06 for ; Tue, 4 Dec 2018 23:38:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A6FF529EB9; Tue, 4 Dec 2018 23:38:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2464A29C06 for ; Tue, 4 Dec 2018 23:38:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726504AbeLDXiH (ORCPT ); Tue, 4 Dec 2018 18:38:07 -0500 Received: from mail-pg1-f194.google.com ([209.85.215.194]:41472 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbeLDXiH (ORCPT ); Tue, 4 Dec 2018 18:38:07 -0500 Received: by mail-pg1-f194.google.com 
with SMTP id 70so8091478pgh.8 for ; Tue, 04 Dec 2018 15:38:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fcaGUab+hmUcXDyEXpGB51iaDLsNQ4/e2XEygehySfs=; b=B37QG1JJxguLijbmLuYy1eXZoNS9MjgVYW1kxXNUhmc0DucDKTVIMt7Kiq42OYotsh eSJHWVxhtfXpFCUjRQSxAFjwEvxWaFjxk7GmG5/oCWC4MHRF/7Pk+Yu+JivmLjOulL2b XaOWrRkTy+f16YTmtKkFvDtmWSNgE3P9K8rn9dQIknT3BLsd4o8tbIYw5+pF2zSCjBTO dbjbWh5p4bxlqAq2EM5VeWHFQEy2mnkp9IOGsZuAuVIs7DpvQx8HsAqBvpwLsL7rhnG7 1K6lKLdQwEz9kYiETkmTVAsHGanInmWE53MTaHpXrAdlXp28fGdX7ZA/2zVTbVJWUZGO yr+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fcaGUab+hmUcXDyEXpGB51iaDLsNQ4/e2XEygehySfs=; b=X3MAXrppccT1CMeGlFZKh/x3cVeQudJ6bqaHwSv8P8N2KlPBOHzcQR69lVuSv5yB/x DdISCX3tnkytYdHlGout/kUd6W4Lad1mlfxTqfygVt3+A3frN0V9Z9MxXJb3ykbc/xvZ tGjTQdZ8kWaTeUKzihKh5TEfKklqY2yerUhZtzojVjxa2VEut5HG12mqUzK/s8Dv/ZLx 93W6yPTTf+nMruKDYOXtO7hPNvp8rVEqvmEZuxwNDebpmwcDpIV3mQ5GGUhyGcn2UuVM IkxuQ2eSUgtziixTpBHSzqiI0yBXznNpC+hA625GUI0NDfUlB1CKiq5WyOe5N7VacYMg CTLQ== X-Gm-Message-State: AA+aEWac6gK2V72f7KCyXDnwirqT+D//dT0UKD9VspJKzzxefym020k/ 9tDF8NdZv4RMLliNnm/BqkrnsQ0Te6k= X-Google-Smtp-Source: AFSGD/UYOFvePmGQ12L8dnS8k0HW3ZR5mjhCSMR7p57zj9UQuZpzAP2Eh4GXByOEWYzNjaTJuTQa1A== X-Received: by 2002:a63:b30f:: with SMTP id i15mr18594221pgf.240.1543966685431; Tue, 04 Dec 2018 15:38:05 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:04 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 13/26] aio: add io_setup2() system call Date: Tue, 4 Dec 2018 16:37:16 -0700 Message-Id: <20181204233729.26776-14-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is just like io_setup(), except add a flags argument to let the caller control/define some of the io_context behavior. Outside of the flags, we add an iocb array and two user pointers for future use. Signed-off-by: Jens Axboe --- arch/x86/entry/syscalls/syscall_64.tbl | 1 + fs/aio.c | 69 ++++++++++++++++---------- include/linux/syscalls.h | 3 ++ include/uapi/asm-generic/unistd.h | 4 +- kernel/sys_ni.c | 1 + 5 files changed, 52 insertions(+), 26 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index f0b1709a5ffb..67c357225fb0 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -343,6 +343,7 @@ 332 common statx __x64_sys_statx 333 common io_pgetevents __x64_sys_io_pgetevents 334 common rseq __x64_sys_rseq +335 common io_setup2 __x64_sys_io_setup2 # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/fs/aio.c b/fs/aio.c index 173f1f79dc8f..26631d6872d2 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -100,6 +100,8 @@ struct kioctx { unsigned long user_id; + unsigned int flags; + struct __percpu kioctx_cpu *cpu; /* @@ -686,10 +688,8 @@ static void aio_nr_sub(unsigned nr) 
spin_unlock(&aio_nr_lock); } -/* ioctx_alloc - * Allocates and initializes an ioctx. Returns an ERR_PTR if it failed. - */ -static struct kioctx *ioctx_alloc(unsigned nr_events) +static struct kioctx *io_setup_flags(unsigned long ctxid, + unsigned int nr_events, unsigned int flags) { struct mm_struct *mm = current->mm; struct kioctx *ctx; @@ -701,6 +701,12 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) */ unsigned int max_reqs = nr_events; + if (unlikely(ctxid || nr_events == 0)) { + pr_debug("EINVAL: ctx %lu nr_events %u\n", + ctxid, nr_events); + return ERR_PTR(-EINVAL); + } + /* * We keep track of the number of available ringbuffer slots, to prevent * overflow (reqs_available), and we also use percpu counters for this. @@ -726,6 +732,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) if (!ctx) return ERR_PTR(-ENOMEM); + ctx->flags = flags; ctx->max_reqs = max_reqs; spin_lock_init(&ctx->ctx_lock); @@ -1281,6 +1288,34 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, return ret; } +SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, + iocbs, void __user *, user1, void __user *, user2, + aio_context_t __user *, ctxp) +{ + struct kioctx *ioctx; + unsigned long ctx; + long ret; + + if (flags || user1 || user2) + return -EINVAL; + + ret = get_user(ctx, ctxp); + if (unlikely(ret)) + goto out; + + ioctx = io_setup_flags(ctx, nr_events, flags); + ret = PTR_ERR(ioctx); + if (IS_ERR(ioctx)) + goto out; + + ret = put_user(ioctx->user_id, ctxp); + if (ret) + kill_ioctx(current->mm, ioctx, NULL); + percpu_ref_put(&ioctx->users); +out: + return ret; +} + /* sys_io_setup: * Create an aio_context capable of receiving at least nr_events. 
* ctxp must not point to an aio_context that already exists, and @@ -1296,7 +1331,7 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, */ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1304,14 +1339,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { ret = put_user(ioctx->user_id, ctxp); @@ -1327,7 +1355,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1335,23 +1363,14 @@ COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { - /* truncating is ok because it's a user address */ - ret = put_user((u32)ioctx->user_id, ctx32p); + ret = put_user(ioctx->user_id, ctx32p); if (ret) kill_ioctx(current->mm, ioctx, NULL); percpu_ref_put(&ioctx->users); } - out: return ret; } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 2ac3d13a915b..a20a663d583f 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -287,6 +287,9 @@ static inline void addr_limit_user_check(void) */ #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER asmlinkage long sys_io_setup(unsigned nr_reqs, 
aio_context_t __user *ctx); +asmlinkage long sys_io_setup2(unsigned, unsigned, struct iocb __user *, + void __user *, void __user *, + aio_context_t __user *); asmlinkage long sys_io_destroy(aio_context_t ctx); asmlinkage long sys_io_submit(aio_context_t, long, struct iocb __user * __user *); diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 538546edbfbd..b4527ed373b0 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -738,9 +738,11 @@ __SYSCALL(__NR_statx, sys_statx) __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents) #define __NR_rseq 293 __SYSCALL(__NR_rseq, sys_rseq) +#define __NR_io_setup2 294 +__SYSCALL(__NR_io_setup2, sys_io_setup2) #undef __NR_syscalls -#define __NR_syscalls 294 +#define __NR_syscalls 295 /* * 32 bit systems traditionally used different diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index df556175be50..17c8b4393669 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -37,6 +37,7 @@ asmlinkage long sys_ni_syscall(void) */ COND_SYSCALL(io_setup); +COND_SYSCALL(io_setup2); COND_SYSCALL_COMPAT(io_setup); COND_SYSCALL(io_destroy); COND_SYSCALL(io_submit); From patchwork Tue Dec 4 23:37:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712771 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C97FB13AF for ; Tue, 4 Dec 2018 23:38:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BCFAE29C06 for ; Tue, 4 Dec 2018 23:38:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B145729EB9; Tue, 4 Dec 2018 23:38:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org 
X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2814F29C06 for ; Tue, 4 Dec 2018 23:38:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726663AbeLDXiJ (ORCPT ); Tue, 4 Dec 2018 18:38:09 -0500 Received: from mail-pf1-f193.google.com ([209.85.210.193]:40372 "EHLO mail-pf1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbeLDXiJ (ORCPT ); Tue, 4 Dec 2018 18:38:09 -0500 Received: by mail-pf1-f193.google.com with SMTP id i12so9002784pfo.7 for ; Tue, 04 Dec 2018 15:38:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=VIelYwIQhuJ5AQqTwtkFXKeN6WxDwL+cWQ8voRQ13S0=; b=d6miXTU9Z4A/cp8m95J4Xxd/U58htbpmyTwK8VyCNShZt47ee4m5i5Uj6P4G3tNM2p 51XvXtcYGnQEJhRSIqeO5qfG4zZ9m6ePg1SLrUySY5aCnbLpIxgGf1GBY3hQu7CNaZHL Z5yHZM1xBEaQymEjPT1BCkOJAtvpAACQchLLdxdw3uyBgHgi34gkO+N6HGX2RldXDdzY ro/7MLVDSRCcmFTV7UMQmsN2PUth7fDOdUzq/sOlU8aU9a5mJo+MPql0pHW5adAa0Cau 4PxTgufa5VU5vsZjA/yKJkqk7kkP82Q42de+BnsY7jWTq7zvpVNaURCzxTDmUMw/gkRF YhxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=VIelYwIQhuJ5AQqTwtkFXKeN6WxDwL+cWQ8voRQ13S0=; b=Fpv8fQzqM5670acGeogmBBpFZwfZO39ToHe+VjREAAaA8slFqXqatI6xEiIOh6GMcz /Sto6qx0JsHOEoQw2LAVZOB8NgB6oCr6zr/LgGMM/50T7hsFYgIJxCy2/XW88pYSmZcU 5gC4aiF5FVwjEr3ss1tt+SwHob48MmJzZWKPrHVYv5jLa+rsMH5h3tCj/X1NU8Jo68Hh AC4c42uPL8qHh1/uf7uNzVFyPoVGKM/gVmPFzPGoUfMSKt0gfhhzG1Lpvpk46fy29zCP vG9EZhSdUik3lPEWrTVE5TNEOZfikkjd3/aETPtzp4WIWGSQ9EtexBtilpdyAIf5hcjd XZqQ== X-Gm-Message-State: 
AA+aEWbzuGXTNP+xaZsLIwaS31gglCvan2rAd+eTMnwK/VbkVYg2A3Z4 RzPxzbazwADfhSdnxPBRT8FaSoyvHeo= X-Google-Smtp-Source: AFSGD/UBis7FX5ONJ6WpZJyGf5Gs4b305mrD7oG87jzrfFY2Bdkzrxavk3/BeFLpWzWMczuD5ndmYQ== X-Received: by 2002:a63:9a09:: with SMTP id o9mr17796984pge.94.1543966687233; Tue, 04 Dec 2018 15:38:07 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:06 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 14/26] aio: add support for having user mapped iocbs Date: Tue, 4 Dec 2018 16:37:17 -0700 Message-Id: <20181204233729.26776-15-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For io_submit(), we have to first copy each pointer to an iocb, then copy the iocb. The latter is 64 bytes in size, and that's a lot of copying for a single IO. Add support for setting IOCTX_FLAG_USERIOCB through the new io_setup2() system call, which allows the iocbs to reside in userspace. If this flag is used, then io_submit() doesn't take pointers to iocbs anymore, it takes an index value into the array of iocbs instead. Similarly, for io_getevents(), the iocb ->obj will be the index, not the pointer to the iocb. See the change made to fio to support this feature, it's pretty trivial to adapt to. For applications, like fio, that previously embedded the iocb inside an application private structure, some sort of lookup table/structure is needed to find the private IO structure from the index at io_getevents() time.
http://git.kernel.dk/cgit/fio/commit/?id=3c3168e91329c83880c91e5abc28b9d6b940fd95 Signed-off-by: Jens Axboe --- fs/aio.c | 126 +++++++++++++++++++++++++++++++---- include/uapi/linux/aio_abi.h | 2 + 2 files changed, 116 insertions(+), 12 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 26631d6872d2..bb6f07ca6940 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -92,6 +92,11 @@ struct ctx_rq_wait { atomic_t count; }; +struct aio_mapped_range { + struct page **pages; + long nr_pages; +}; + struct kioctx { struct percpu_ref users; atomic_t dead; @@ -127,6 +132,8 @@ struct kioctx { struct page **ring_pages; long nr_pages; + struct aio_mapped_range iocb_range; + struct rcu_work free_rwork; /* see free_ioctx() */ /* @@ -222,6 +229,11 @@ static struct vfsmount *aio_mnt; static const struct file_operations aio_ring_fops; static const struct address_space_operations aio_ctx_aops; +static const unsigned int iocb_page_shift = + ilog2(PAGE_SIZE / sizeof(struct iocb)); + +static void aio_useriocb_unmap(struct kioctx *); + static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { struct file *file; @@ -578,6 +590,7 @@ static void free_ioctx(struct work_struct *work) free_rwork); pr_debug("freeing %p\n", ctx); + aio_useriocb_unmap(ctx); aio_free_ring(ctx); free_percpu(ctx->cpu); percpu_ref_exit(&ctx->reqs); @@ -1288,6 +1301,70 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, return ret; } +static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index) +{ + struct iocb *iocb; + + iocb = page_address(ctx->iocb_range.pages[index >> iocb_page_shift]); + index &= ((1 << iocb_page_shift) - 1); + return iocb + index; +} + +static void aio_unmap_range(struct aio_mapped_range *range) +{ + int i; + + if (!range->nr_pages) + return; + + for (i = 0; i < range->nr_pages; i++) + put_page(range->pages[i]); + + kfree(range->pages); + range->pages = NULL; + range->nr_pages = 0; +} + +static int aio_map_range(struct aio_mapped_range *range, void __user *uaddr, 
+ size_t size, int gup_flags) +{ + int nr_pages, ret; + + if ((unsigned long) uaddr & ~PAGE_MASK) + return -EINVAL; + + nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT; + + range->pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL); + if (!range->pages) + return -ENOMEM; + + down_write(&current->mm->mmap_sem); + ret = get_user_pages((unsigned long) uaddr, nr_pages, gup_flags, + range->pages, NULL); + up_write(&current->mm->mmap_sem); + + if (ret < nr_pages) { + kfree(range->pages); + return -ENOMEM; + } + + range->nr_pages = nr_pages; + return 0; +} + +static void aio_useriocb_unmap(struct kioctx *ctx) +{ + aio_unmap_range(&ctx->iocb_range); +} + +static int aio_useriocb_map(struct kioctx *ctx, struct iocb __user *iocbs) +{ + size_t size = sizeof(struct iocb) * ctx->max_reqs; + + return aio_map_range(&ctx->iocb_range, iocbs, size, 0); +} + SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, iocbs, void __user *, user1, void __user *, user2, aio_context_t __user *, ctxp) @@ -1296,7 +1373,9 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, unsigned long ctx; long ret; - if (flags || user1 || user2) + if (user1 || user2) + return -EINVAL; + if (flags & ~IOCTX_FLAG_USERIOCB) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1308,9 +1387,17 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, if (IS_ERR(ioctx)) goto out; + if (flags & IOCTX_FLAG_USERIOCB) { + ret = aio_useriocb_map(ioctx, iocbs); + if (ret) + goto err; + } + ret = put_user(ioctx->user_id, ctxp); - if (ret) + if (ret) { +err: kill_ioctx(current->mm, ioctx, NULL); + } percpu_ref_put(&ioctx->users); out: return ret; @@ -1860,10 +1947,13 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } } - ret = put_user(KIOCB_KEY, &user_iocb->aio_key); - if (unlikely(ret)) { - pr_debug("EFAULT: aio_key\n"); - goto out_put_req; + /* Don't support cancel on user mapped iocbs */ + if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) { 
+ ret = put_user(KIOCB_KEY, &user_iocb->aio_key); + if (unlikely(ret)) { + pr_debug("EFAULT: aio_key\n"); + goto out_put_req; + } } req->ki_user_iocb = user_iocb; @@ -1917,12 +2007,22 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, bool compat) { - struct iocb iocb; + struct iocb iocb, *iocbp; - if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) - return -EFAULT; + if (ctx->flags & IOCTX_FLAG_USERIOCB) { + unsigned long iocb_index = (unsigned long) user_iocb; - return __io_submit_one(ctx, &iocb, user_iocb, compat); + if (iocb_index >= ctx->max_reqs) + return -EINVAL; + + iocbp = aio_iocb_from_index(ctx, iocb_index); + } else { + if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) + return -EFAULT; + iocbp = &iocb; + } + + return __io_submit_one(ctx, iocbp, user_iocb, compat); } /* sys_io_submit: @@ -2066,6 +2166,9 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, if (unlikely(!ctx)) return -EINVAL; + if (ctx->flags & IOCTX_FLAG_USERIOCB) + goto err; + spin_lock_irq(&ctx->ctx_lock); kiocb = lookup_kiocb(ctx, iocb); if (kiocb) { @@ -2082,9 +2185,8 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, */ ret = -EINPROGRESS; } - +err: percpu_ref_put(&ctx->users); - return ret; } diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index 8387e0af0f76..814e6606c413 100644 --- a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -106,6 +106,8 @@ struct iocb { __u32 aio_resfd; }; /* 64 bytes */ +#define IOCTX_FLAG_USERIOCB (1 << 0) /* iocbs are user mapped */ + #undef IFBIG #undef IFLITTLE From patchwork Tue Dec 4 23:37:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712775 Return-Path: Received: from mail.wl.linuxfoundation.org 
(pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B81F017DB for ; Tue, 4 Dec 2018 23:38:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA69329C06 for ; Tue, 4 Dec 2018 23:38:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9E8B329EB9; Tue, 4 Dec 2018 23:38:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7AADC29C3A for ; Tue, 4 Dec 2018 23:38:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726026AbeLDXiL (ORCPT ); Tue, 4 Dec 2018 18:38:11 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:37155 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726665AbeLDXiL (ORCPT ); Tue, 4 Dec 2018 18:38:11 -0500 Received: by mail-pl1-f196.google.com with SMTP id b5so9090550plr.4 for ; Tue, 04 Dec 2018 15:38:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/BCos7ns7d7ZmuQfj19Yb3toA6KneUTOLUbJ0sObFZU=; b=b/rvqTbgS/bwL1TUPUXq/SbX889Wc+dcS4qCDeoWU7IuMVl5eLXyw78Yr22UJPz1Py zhv577b6wX55vp6XH+rrGt3CXUgB1iH3c1HQf5z2jNqNmMirhuYHai0PvUSceN1SNx8e xc73dbmVtTJbOum+Eq1ODoeWihq6K8rvcfvX/en3Q/9frKV7l/ptl88S4WTNilzW+RK0 sdxTypGrghylfaDx6r+48DVB6V6aU5bs9c1a41PMWNYMTR98ClkBwtlN5UVNkDoDLXC3 FkUfHwdRJ/Zm0jJXiv1ZObegJsGsJGYfiVPYmvilOeJJLAnK3UiQzCd0lBOT/D5HUrA8 lIQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/BCos7ns7d7ZmuQfj19Yb3toA6KneUTOLUbJ0sObFZU=; b=OcXMJjqdZOOD/2lM0STQ3TVDuBCgxiyLRT9mKrLNEBiAk1yQpxznNeTL6UuBIMG691 9piypeR7YRRb2wv5swSH/Xmbqte5KEfRuHFe1szTe4Qir1o/pbXGTPwCRiXwbhiV5sPS NU1qRQrPSWp5/heag0Klt1vZTmFSYOEP/cdGSBjHS9FOW+cgZw9PNC2mH8X1dQWIt5Ct ko2u0sCtLrstn6N0IbPXKmVVYFYtAZqNEDAQyMFEzAQv4J3lggwGzB9wzM4ZYGS8FEHQ 6DeYAAL+6w5CJeJ+pSXuE9JsyZyiyJTiu/4lf2YCRblVBWr+JWAmra8f1uSAtcpdA6DT lXSA== X-Gm-Message-State: AA+aEWbscCofksWOiXZQnCyd5X1El215kszu5fjZ7LTdZLlILImFxnf9 FCOS64ll+4zybHdsCJonzlSstQxjLbY= X-Google-Smtp-Source: AFSGD/UpwK+d4eqiRm120ieMqxcK4pNc6IXfUV93hwpAvzj974RB7I0p5Kz/8MgGeQjM/spa4fRHfg== X-Received: by 2002:a17:902:aa8c:: with SMTP id d12mr22317905plr.25.1543966689871; Tue, 04 Dec 2018 15:38:09 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:08 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 15/26] aio: support for IO polling Date: Tue, 4 Dec 2018 16:37:18 -0700 Message-Id: <20181204233729.26776-16-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add polled variants of PREAD/PREADV and PWRITE/PWRITEV. These act like their non-polled counterparts, except we expect to poll for completion of them. The polling happens at io_getevents() time, and works just like non-polled IO.
To set up an io_context for polled IO, the application must call io_setup2() with IOCTX_FLAG_IOPOLL as one of the flags. It is illegal to mix and match polled and non-polled IO on an io_context. Polled IO doesn't support the user mapped completion ring. Events must be reaped through the io_getevents() system call. For non-irq driven poll devices, there's no way to support completion reaping from userspace by just looking at the ring. The application itself is the one that pulls completion entries. Signed-off-by: Jens Axboe --- fs/aio.c | 393 +++++++++++++++++++++++++++++++---- include/uapi/linux/aio_abi.h | 3 + 2 files changed, 360 insertions(+), 36 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index bb6f07ca6940..5d34317c2929 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -153,6 +153,18 @@ struct kioctx { atomic_t reqs_available; } ____cacheline_aligned_in_smp; + /* iopoll submission state */ + struct { + spinlock_t poll_lock; + struct list_head poll_submitted; + } ____cacheline_aligned_in_smp; + + /* iopoll completion state */ + struct { + struct list_head poll_completing; + struct mutex getevents_lock; + } ____cacheline_aligned_in_smp; + struct { spinlock_t ctx_lock; struct list_head active_reqs; /* used for cancellation */ @@ -205,14 +217,27 @@ struct aio_kiocb { __u64 ki_user_data; /* user's data for completion */ struct list_head ki_list; /* the aio core uses this - * for cancellation */ + * for cancellation, or for + * polled IO */ + + unsigned long ki_flags; +#define IOCB_POLL_COMPLETED 0 /* polled IO has completed */ +#define IOCB_POLL_EAGAIN 1 /* polled submission got EAGAIN */ + refcount_t ki_refcnt; - /* - * If the aio_resfd field of the userspace iocb is not zero, - * this is the underlying eventfd context to deliver events to. - */ - struct eventfd_ctx *ki_eventfd; + union { + /* + * If the aio_resfd field of the userspace iocb is not zero, + * this is the underlying eventfd context to deliver events to. 
+ */ + struct eventfd_ctx *ki_eventfd; + + /* + * For polled IO, stash completion info here + */ + struct io_event ki_ev; + }; }; /*------ sysctl variables----*/ @@ -233,6 +258,7 @@ static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); static void aio_useriocb_unmap(struct kioctx *); +static void aio_iopoll_reap_events(struct kioctx *); static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { @@ -471,11 +497,15 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) int i; struct file *file; - /* Compensate for the ring buffer's head/tail overlap entry */ - nr_events += 2; /* 1 is required, 2 for good luck */ - + /* + * Compensate for the ring buffer's head/tail overlap entry. + * IO polling doesn't require any io event entries + */ size = sizeof(struct aio_ring); - size += sizeof(struct io_event) * nr_events; + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) { + nr_events += 2; /* 1 is required, 2 for good luck */ + size += sizeof(struct io_event) * nr_events; + } nr_pages = PFN_UP(size); if (nr_pages < 0) @@ -758,6 +788,11 @@ static struct kioctx *io_setup_flags(unsigned long ctxid, INIT_LIST_HEAD(&ctx->active_reqs); + spin_lock_init(&ctx->poll_lock); + INIT_LIST_HEAD(&ctx->poll_submitted); + INIT_LIST_HEAD(&ctx->poll_completing); + mutex_init(&ctx->getevents_lock); + if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL)) goto err; @@ -829,11 +864,15 @@ static int kill_ioctx(struct mm_struct *mm, struct kioctx *ctx, { struct kioctx_table *table; + mutex_lock(&ctx->getevents_lock); spin_lock(&mm->ioctx_lock); if (atomic_xchg(&ctx->dead, 1)) { spin_unlock(&mm->ioctx_lock); + mutex_unlock(&ctx->getevents_lock); return -EINVAL; } + aio_iopoll_reap_events(ctx); + mutex_unlock(&ctx->getevents_lock); table = rcu_dereference_raw(mm->ioctx_table); WARN_ON(ctx != rcu_access_pointer(table->table[ctx->id])); @@ -1042,6 +1081,7 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) 
percpu_ref_get(&ctx->reqs); req->ki_ctx = ctx; INIT_LIST_HEAD(&req->ki_list); + req->ki_flags = 0; refcount_set(&req->ki_refcnt, 0); req->ki_eventfd = NULL; return req; @@ -1083,6 +1123,15 @@ static inline void iocb_put(struct aio_kiocb *iocb) } } +static void iocb_put_many(struct kioctx *ctx, void **iocbs, int *nr) +{ + if (*nr) { + percpu_ref_put_many(&ctx->reqs, *nr); + kmem_cache_free_bulk(kiocb_cachep, *nr, iocbs); + *nr = 0; + } +} + static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, long res, long res2) { @@ -1272,6 +1321,182 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr, return ret < 0 || *i >= min_nr; } +#define AIO_IOPOLL_BATCH 8 + +/* + * Process completed iocb iopoll entries, copying the result to userspace. + */ +static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, + unsigned int *nr_events, long max) +{ + void *iocbs[AIO_IOPOLL_BATCH]; + struct aio_kiocb *iocb, *n; + int to_free = 0, ret = 0; + + /* Shouldn't happen... 
*/ + if (*nr_events >= max) + return 0; + + list_for_each_entry_safe(iocb, n, &ctx->poll_completing, ki_list) { + if (*nr_events == max) + break; + if (!test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + continue; + if (to_free == AIO_IOPOLL_BATCH) + iocb_put_many(ctx, iocbs, &to_free); + + list_del(&iocb->ki_list); + iocbs[to_free++] = iocb; + + fput(iocb->rw.ki_filp); + + if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev, + sizeof(iocb->ki_ev))) { + ret = -EFAULT; + break; + } + (*nr_events)++; + } + + if (to_free) + iocb_put_many(ctx, iocbs, &to_free); + + return ret; +} + +static int aio_iopoll_getevents(struct kioctx *ctx, + struct io_event __user *event, + unsigned int *nr_events, long min, long max) +{ + struct aio_kiocb *iocb; + int to_poll, polled, ret; + + /* + * Check if we already have done events that satisfy what we need + */ + if (!list_empty(&ctx->poll_completing)) { + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + /* if min is zero, still go through a poll check */ + if (min && *nr_events >= min) + return 0; + } + + /* + * Take in a new working set from the submitted list, if possible. + */ + if (!list_empty_careful(&ctx->poll_submitted)) { + spin_lock(&ctx->poll_lock); + list_splice_init(&ctx->poll_submitted, &ctx->poll_completing); + spin_unlock(&ctx->poll_lock); + } + + if (list_empty(&ctx->poll_completing)) + return 0; + + /* + * Check again now that we have a new batch. + */ + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + /* if min is zero, still go through a poll check */ + if (min && *nr_events >= min) + return 0; + + /* + * Find up to 'max' worth of events to poll for, including the + * events we already successfully polled + */ + polled = to_poll = 0; + list_for_each_entry(iocb, &ctx->poll_completing, ki_list) { + /* + * Poll for needed events with spin == true, anything after + * that we just check if we have more, up to max. 
+ */ + bool spin = polled + *nr_events >= min; + struct kiocb *kiocb = &iocb->rw; + + if (test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + break; + if (++to_poll + *nr_events > max) + break; + + ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin); + if (ret < 0) + return ret; + + polled += ret; + if (polled + *nr_events >= max) + break; + } + + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + return to_poll; +} + +/* + * We can't just wait for polled events to come to us, we have to actively + * find and complete them. + */ +static void aio_iopoll_reap_events(struct kioctx *ctx) +{ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + return; + + while (!list_empty_careful(&ctx->poll_submitted) || + !list_empty(&ctx->poll_completing)) { + unsigned int nr_events = 0; + + aio_iopoll_getevents(ctx, NULL, &nr_events, 1, UINT_MAX); + } +} + +static int __aio_iopoll_check(struct kioctx *ctx, struct io_event __user *event, + unsigned int *nr_events, long min_nr, long max_nr) +{ + int ret = 0; + + while (!*nr_events || !need_resched()) { + int tmin = 0; + + if (*nr_events < min_nr) + tmin = min_nr - *nr_events; + + ret = aio_iopoll_getevents(ctx, event, nr_events, tmin, max_nr); + if (ret <= 0) + break; + ret = 0; + } + + return ret; +} + +static int aio_iopoll_check(struct kioctx *ctx, long min_nr, long nr, + struct io_event __user *event) +{ + unsigned int nr_events = 0; + int ret; + + /* Only allow one thread polling at a time */ + if (!mutex_trylock(&ctx->getevents_lock)) + return -EBUSY; + if (unlikely(atomic_read(&ctx->dead))) { + ret = -EINVAL; + goto err; + } + + ret = __aio_iopoll_check(ctx, event, &nr_events, min_nr, nr); +err: + mutex_unlock(&ctx->getevents_lock); + return nr_events ? 
nr_events : ret; +} + static long read_events(struct kioctx *ctx, long min_nr, long nr, struct io_event __user *event, ktime_t until) @@ -1375,7 +1600,7 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, if (user1 || user2) return -EINVAL; - if (flags & ~IOCTX_FLAG_USERIOCB) + if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL)) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1509,13 +1734,8 @@ static void aio_remove_iocb(struct aio_kiocb *iocb) spin_unlock_irqrestore(&ctx->ctx_lock, flags); } -static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +static void kiocb_end_write(struct kiocb *kiocb) { - struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); - - if (!list_empty_careful(&iocb->ki_list)) - aio_remove_iocb(iocb); - if (kiocb->ki_flags & IOCB_WRITE) { struct inode *inode = file_inode(kiocb->ki_filp); @@ -1527,19 +1747,48 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); file_end_write(kiocb->ki_filp); } +} + +static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +{ + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + + if (!list_empty_careful(&iocb->ki_list)) + aio_remove_iocb(iocb); + + kiocb_end_write(kiocb); fput(kiocb->ki_filp); aio_complete(iocb, res, res2); } -static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) +static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) { + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + + kiocb_end_write(kiocb); + + /* + * Handle EAGAIN from resource limits with polled IO inline, don't + * pass the event back to userspace. 
+ */ + if (unlikely(res == -EAGAIN)) + set_bit(IOCB_POLL_EAGAIN, &iocb->ki_flags); + else { + aio_fill_event(&iocb->ki_ev, iocb, res, res2); + set_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags); + } +} + +static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb) +{ + struct kioctx *ctx = kiocb->ki_ctx; + struct kiocb *req = &kiocb->rw; int ret; req->ki_filp = fget(iocb->aio_fildes); if (unlikely(!req->ki_filp)) return -EBADF; - req->ki_complete = aio_complete_rw; req->ki_pos = iocb->aio_offset; req->ki_flags = iocb_flags(req->ki_filp); if (iocb->aio_flags & IOCB_FLAG_RESFD) @@ -1565,9 +1814,35 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) if (unlikely(ret)) goto out_fput; - req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */ - return 0; + if (iocb->aio_flags & IOCB_FLAG_HIPRI) { + /* shares space in the union, and is rather pointless.. */ + ret = -EINVAL; + if (iocb->aio_flags & IOCB_FLAG_RESFD) + goto out_fput; + + /* can't submit polled IO to a non-polled ctx */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + goto out_fput; + + ret = -EOPNOTSUPP; + if (!(req->ki_flags & IOCB_DIRECT) || + !req->ki_filp->f_op->iopoll) + goto out_fput; + + req->ki_flags |= IOCB_HIPRI; + req->ki_complete = aio_complete_rw_poll; + } else { + /* can't submit non-polled IO to a polled ctx */ + ret = -EINVAL; + if (ctx->flags & IOCTX_FLAG_IOPOLL) + goto out_fput; + /* no one is going to poll for this I/O */ + req->ki_flags &= ~IOCB_HIPRI; + req->ki_complete = aio_complete_rw; + } + + return 0; out_fput: fput(req->ki_filp); return ret; @@ -1612,15 +1887,40 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } -static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb, +/* + * After the iocb has been issued, it's safe to be found on the poll list. + * Adding the kiocb to the list AFTER submission ensures that we don't + * find it from a io_getevents() thread before the issuer is done accessing + * the kiocb cookie. 
+ */ +static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) +{ + /* + * For fast devices, IO may have already completed. If it has, add + * it to the front so we find it first. We can't add to the poll_done + * list as that's unlocked from the completion side. + */ + const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags); + struct kioctx *ctx = kiocb->ki_ctx; + + spin_lock(&ctx->poll_lock); + if (front_add) + list_add(&kiocb->ki_list, &ctx->poll_submitted); + else + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); +} + +static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1645,15 +1945,16 @@ static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb, return ret; } -static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb, +static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1924,7 +2225,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, return -EINVAL; } - if (!get_reqs_available(ctx)) + /* Poll IO doesn't need ring reservations */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL) && !get_reqs_available(ctx)) return -EAGAIN; ret = -EAGAIN; @@ -1947,8 +2249,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } } - /* Don't support cancel on user mapped iocbs */ - if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) { + /* Don't support cancel on 
user mapped iocbs or polled context */ + if (!(ctx->flags & (IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))) { ret = put_user(KIOCB_KEY, &user_iocb->aio_key); if (unlikely(ret)) { pr_debug("EFAULT: aio_key\n"); @@ -1959,26 +2261,33 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, req->ki_user_iocb = user_iocb; req->ki_user_data = iocb->aio_data; + ret = -EINVAL; switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(&req->rw, iocb, false, compat); + ret = aio_read(req, iocb, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(&req->rw, iocb, false, compat); + ret = aio_write(req, iocb, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(&req->rw, iocb, true, compat); + ret = aio_read(req, iocb, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(&req->rw, iocb, true, compat); + ret = aio_write(req, iocb, true, compat); break; case IOCB_CMD_FSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_fsync(&req->fsync, iocb, false); break; case IOCB_CMD_FDSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_fsync(&req->fsync, iocb, true); break; case IOCB_CMD_POLL: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_poll(req, iocb); break; default: @@ -1994,13 +2303,21 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, */ if (ret) goto out_put_req; + if (ctx->flags & IOCTX_FLAG_IOPOLL) { + if (test_bit(IOCB_POLL_EAGAIN, &req->ki_flags)) { + ret = -EAGAIN; + goto out_put_req; + } + aio_iopoll_iocb_issued(req); + } return 0; out_put_req: if (req->ki_eventfd) eventfd_ctx_put(req->ki_eventfd); iocb_put(req); out_put_reqs_available: - put_reqs_available(ctx, 1); + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + put_reqs_available(ctx, 1); return ret; } @@ -2166,7 +2483,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, if (unlikely(!ctx)) return -EINVAL; - if (ctx->flags & IOCTX_FLAG_USERIOCB) + if (ctx->flags & (IOCTX_FLAG_USERIOCB | 
IOCTX_FLAG_IOPOLL)) goto err; spin_lock_irq(&ctx->ctx_lock); @@ -2201,8 +2518,12 @@ static long do_io_getevents(aio_context_t ctx_id, long ret = -EINVAL; if (likely(ioctx)) { - if (likely(min_nr <= nr && min_nr >= 0)) - ret = read_events(ioctx, min_nr, nr, events, until); + if (likely(min_nr <= nr && min_nr >= 0)) { + if (ioctx->flags & IOCTX_FLAG_IOPOLL) + ret = aio_iopoll_check(ioctx, min_nr, nr, events); + else + ret = read_events(ioctx, min_nr, nr, events, until); + } percpu_ref_put(&ioctx->users); } diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index 814e6606c413..ea0b9a19f4df 100644 --- a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -52,9 +52,11 @@ enum { * is valid. * IOCB_FLAG_IOPRIO - Set if the "aio_reqprio" member of the "struct iocb" * is valid. + * IOCB_FLAG_HIPRI - Use IO completion polling */ #define IOCB_FLAG_RESFD (1 << 0) #define IOCB_FLAG_IOPRIO (1 << 1) +#define IOCB_FLAG_HIPRI (1 << 2) /* read() from /dev/aio returns these structures. 
*/ struct io_event { @@ -107,6 +109,7 @@ struct iocb { }; /* 64 bytes */ #define IOCTX_FLAG_USERIOCB (1 << 0) /* iocbs are user mapped */ +#define IOCTX_FLAG_IOPOLL (1 << 1) /* io_context is polled */ #undef IFBIG #undef IFLITTLE

From patchwork Tue Dec 4 23:37:19 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712779
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 16/26] aio: add submission side request cache
Date: Tue, 4 Dec 2018 16:37:19 -0700
Message-Id: <20181204233729.26776-17-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

We have to add each submitted polled request to the io_context poll_submitted list, which means we have to grab the poll_lock. We already use the block plug to batch submissions if we're doing a batch of IO submissions; extend that to cover the poll requests internally as well.
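The batching idea can be illustrated with a rough userspace sketch. This is not the kernel code: the struct and function names below are hypothetical, a pthread mutex stands in for poll_lock, and the block plug callback that flushes on schedule is left out. Requests collect on a local list and are spliced onto the shared list once per batch, so the lock is taken once per BATCH requests instead of once per request.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BATCH 8	/* same value as AIO_IOPOLL_BATCH in the series */

struct req {
	struct req *next;
};

/* Hypothetical shared context: a lock-protected FIFO of submitted requests. */
struct shared_ctx {
	pthread_mutex_t lock;
	struct req *head, **tail;
	unsigned long lock_acquisitions;	/* for illustration only */
};

/* Per-submitter cache: requests not yet visible to the poller. */
struct submit_state {
	struct shared_ctx *ctx;
	struct req *head, **tail;
	unsigned int count;
};

static void ctx_init(struct shared_ctx *ctx)
{
	pthread_mutex_init(&ctx->lock, NULL);
	ctx->head = NULL;
	ctx->tail = &ctx->head;
	ctx->lock_acquisitions = 0;
}

static void state_start(struct submit_state *st, struct shared_ctx *ctx)
{
	st->ctx = ctx;
	st->head = NULL;
	st->tail = &st->head;
	st->count = 0;
}

/* Splice the whole local batch onto the shared list with one lock round trip. */
static void state_flush(struct submit_state *st)
{
	struct shared_ctx *ctx = st->ctx;

	assert(ctx != NULL);
	if (!st->head)
		return;
	pthread_mutex_lock(&ctx->lock);
	ctx->lock_acquisitions++;
	*ctx->tail = st->head;
	ctx->tail = st->tail;
	pthread_mutex_unlock(&ctx->lock);
	st->head = NULL;
	st->tail = &st->head;
	st->count = 0;
}

/* Queue one request locally; flush once a full batch has accumulated. */
static void state_add(struct submit_state *st, struct req *r)
{
	r->next = NULL;
	*st->tail = r;
	st->tail = &r->next;
	if (++st->count >= BATCH)
		state_flush(st);
}
```

A caller would pair state_start() with a final state_flush() around its submission loop, in the same way the patch pairs aio_submit_state_start() with aio_submit_state_end().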
Signed-off-by: Jens Axboe --- fs/aio.c | 136 +++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 113 insertions(+), 23 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 5d34317c2929..ae0805dc814c 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -240,6 +240,21 @@ struct aio_kiocb { }; }; +struct aio_submit_state { + struct kioctx *ctx; + + struct blk_plug plug; +#ifdef CONFIG_BLOCK + struct blk_plug_cb plug_cb; +#endif + + /* + * Polled iocbs that have been submitted, but not added to the ctx yet + */ + struct list_head req_list; + unsigned int req_count; +}; + /*------ sysctl variables----*/ static DEFINE_SPINLOCK(aio_nr_lock); unsigned long aio_nr; /* current system wide number of aio requests */ @@ -257,6 +272,15 @@ static const struct address_space_operations aio_ctx_aops; static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); +/* + * We rely on block level unplugs to flush pending requests, if we schedule + */ +#ifdef CONFIG_BLOCK +static const bool aio_use_state_req_list = true; +#else +static const bool aio_use_state_req_list = false; +#endif + static void aio_useriocb_unmap(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); @@ -1887,13 +1911,28 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } +/* + * Called either at the end of IO submission, or through a plug callback + * because we're going to schedule. Moves out local batch of requests to + * the ctx poll list, so they can be found for polling + reaping. + */ +static void aio_flush_state_reqs(struct kioctx *ctx, + struct aio_submit_state *state) +{ + spin_lock(&ctx->poll_lock); + list_splice_tail_init(&state->req_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + state->req_count = 0; +} + /* * After the iocb has been issued, it's safe to be found on the poll list. 
* Adding the kiocb to the list AFTER submission ensures that we don't * find it from a io_getevents() thread before the issuer is done accessing * the kiocb cookie. */ -static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) +static void aio_iopoll_iocb_issued(struct aio_submit_state *state, + struct aio_kiocb *kiocb) { /* * For fast devices, IO may have already completed. If it has, add @@ -1903,12 +1942,21 @@ static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags); struct kioctx *ctx = kiocb->ki_ctx; - spin_lock(&ctx->poll_lock); - if (front_add) - list_add(&kiocb->ki_list, &ctx->poll_submitted); - else - list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); - spin_unlock(&ctx->poll_lock); + if (!state || !aio_use_state_req_list) { + spin_lock(&ctx->poll_lock); + if (front_add) + list_add(&kiocb->ki_list, &ctx->poll_submitted); + else + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + } else { + if (front_add) + list_add(&kiocb->ki_list, &state->req_list); + else + list_add_tail(&kiocb->ki_list, &state->req_list); + if (++state->req_count >= AIO_IOPOLL_BATCH) + aio_flush_state_reqs(ctx, state); + } } static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, @@ -2204,7 +2252,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb) } static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, - struct iocb __user *user_iocb, bool compat) + struct iocb __user *user_iocb, + struct aio_submit_state *state, bool compat) { struct aio_kiocb *req; ssize_t ret; @@ -2308,7 +2357,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EAGAIN; goto out_put_req; } - aio_iopoll_iocb_issued(req); + aio_iopoll_iocb_issued(state, req); } return 0; out_put_req: @@ -2322,7 +2371,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } static int io_submit_one(struct 
kioctx *ctx, struct iocb __user *user_iocb, - bool compat) + struct aio_submit_state *state, bool compat) { struct iocb iocb, *iocbp; @@ -2339,7 +2388,44 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, iocbp = &iocb; } - return __io_submit_one(ctx, iocbp, user_iocb, compat); + return __io_submit_one(ctx, iocbp, user_iocb, state, compat); +} + +#ifdef CONFIG_BLOCK +static void aio_state_unplug(struct blk_plug_cb *cb, bool from_schedule) +{ + struct aio_submit_state *state; + + state = container_of(cb, struct aio_submit_state, plug_cb); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} +#endif + +/* + * Batched submission is done, ensure local IO is flushed out. + */ +static void aio_submit_state_end(struct aio_submit_state *state) +{ + blk_finish_plug(&state->plug); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} + +/* + * Start submission side cache. + */ +static void aio_submit_state_start(struct aio_submit_state *state, + struct kioctx *ctx) +{ + state->ctx = ctx; + INIT_LIST_HEAD(&state->req_list); + state->req_count = 0; +#ifdef CONFIG_BLOCK + state->plug_cb.callback = aio_state_unplug; + blk_start_plug(&state->plug); + list_add(&state->plug_cb.list, &state->plug.cb_list); +#endif } /* sys_io_submit: @@ -2357,10 +2443,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, struct iocb __user * __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2374,8 +2460,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { struct iocb 
__user *user_iocb; @@ -2384,12 +2472,12 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, break; } - ret = io_submit_one(ctx, user_iocb, false); + ret = io_submit_one(ctx, user_iocb, statep, false); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? i : ret; @@ -2399,10 +2487,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, int, nr, compat_uptr_t __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2416,8 +2504,10 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { compat_uptr_t user_iocb; @@ -2426,12 +2516,12 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, break; } - ret = io_submit_one(ctx, compat_ptr(user_iocb), true); + ret = io_submit_one(ctx, compat_ptr(user_iocb), statep, true); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? 
i : ret;

From patchwork Tue Dec 4 23:37:20 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10712783
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org
Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe
Subject: [PATCH 17/26] fs: add fget_many() and fput_many()
Date: Tue, 4 Dec 2018 16:37:20 -0700
Message-Id: <20181204233729.26776-18-axboe@kernel.dk>
In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk>
References: <20181204233729.26776-1-axboe@kernel.dk>

Some use cases repeatedly get and put references to the same file, but the only exposed interface does these one at a time. As each of these entails an atomic inc or dec on a shared structure, that cost can add up.

Add fget_many(), which works just like fget(), except it takes an argument for how many references to get on the file. Ditto fput_many(), which can drop an arbitrary number of references to a file.
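For illustration, the counting scheme can be modelled in userspace with C11 atomics. This is only a sketch under stated assumptions: struct ref_obj, ref_get_many() and ref_put_many() are hypothetical stand-ins for f_count, get_file_rcu_many() and the test inside fput_many(), not kernel interfaces. Taking or dropping N references costs one atomic operation rather than N.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the f_count field of struct file. */
struct ref_obj {
	atomic_long count;
};

/*
 * Take 'refs' references in one atomic step, unless the count has already
 * dropped to zero (the object is being torn down). This mirrors the
 * add-unless-zero behaviour the patch relies on.
 */
static bool ref_get_many(struct ref_obj *obj, long refs)
{
	long old = atomic_load(&obj->count);

	assert(refs > 0);
	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->count, &old, old + refs))
			return true;
	}
	return false;
}

/*
 * Drop 'refs' references at once. Returns true if this released the last
 * reference, which is the point where the object would be freed.
 */
static bool ref_put_many(struct ref_obj *obj, long refs)
{
	assert(refs > 0);
	return atomic_fetch_sub(&obj->count, refs) == refs;
}
```

With a count of 1, ref_get_many(obj, 8) raises it to 9 in a single operation; dropping 8 and then 1 brings it back to zero, and only the final drop reports that the object can be released.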
Signed-off-by: Jens Axboe --- fs/file.c | 15 ++++++++++----- fs/file_table.c | 10 ++++++++-- include/linux/file.h | 2 ++ include/linux/fs.h | 3 ++- 4 files changed, 22 insertions(+), 8 deletions(-) diff --git a/fs/file.c b/fs/file.c index 7ffd6e9d103d..ad9870edfd51 100644 --- a/fs/file.c +++ b/fs/file.c @@ -676,7 +676,7 @@ void do_close_on_exec(struct files_struct *files) spin_unlock(&files->file_lock); } -static struct file *__fget(unsigned int fd, fmode_t mask) +static struct file *__fget(unsigned int fd, fmode_t mask, unsigned int refs) { struct files_struct *files = current->files; struct file *file; @@ -691,7 +691,7 @@ static struct file *__fget(unsigned int fd, fmode_t mask) */ if (file->f_mode & mask) file = NULL; - else if (!get_file_rcu(file)) + else if (!get_file_rcu_many(file, refs)) goto loop; } rcu_read_unlock(); @@ -699,15 +699,20 @@ static struct file *__fget(unsigned int fd, fmode_t mask) return file; } +struct file *fget_many(unsigned int fd, unsigned int refs) +{ + return __fget(fd, FMODE_PATH, refs); +} + struct file *fget(unsigned int fd) { - return __fget(fd, FMODE_PATH); + return fget_many(fd, 1); } EXPORT_SYMBOL(fget); struct file *fget_raw(unsigned int fd) { - return __fget(fd, 0); + return __fget(fd, 0, 1); } EXPORT_SYMBOL(fget_raw); @@ -738,7 +743,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask) return 0; return (unsigned long)file; } else { - file = __fget(fd, mask); + file = __fget(fd, mask, 1); if (!file) return 0; return FDPUT_FPUT | (unsigned long)file; diff --git a/fs/file_table.c b/fs/file_table.c index e49af4caf15d..6a3964df33e4 100644 --- a/fs/file_table.c +++ b/fs/file_table.c @@ -326,9 +326,9 @@ void flush_delayed_fput(void) static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput); -void fput(struct file *file) +void fput_many(struct file *file, unsigned int refs) { - if (atomic_long_dec_and_test(&file->f_count)) { + if (atomic_long_sub_and_test(refs, &file->f_count)) { struct task_struct *task = 
current; if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { @@ -347,6 +347,12 @@ void fput(struct file *file) } } +void fput(struct file *file) +{ + fput_many(file, 1); +} + + /* * synchronous analog of fput(); for kernel threads that might be needed * in some umount() (and thus can't use flush_delayed_fput() without diff --git a/include/linux/file.h b/include/linux/file.h index 6b2fb032416c..3fcddff56bc4 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -13,6 +13,7 @@ struct file; extern void fput(struct file *); +extern void fput_many(struct file *, unsigned int); struct file_operations; struct vfsmount; @@ -44,6 +45,7 @@ static inline void fdput(struct fd fd) } extern struct file *fget(unsigned int fd); +extern struct file *fget_many(unsigned int fd, unsigned int refs); extern struct file *fget_raw(unsigned int fd); extern unsigned long __fdget(unsigned int fd); extern unsigned long __fdget_raw(unsigned int fd); diff --git a/include/linux/fs.h b/include/linux/fs.h index 6a5f71f8ae06..dc54a65c401a 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -952,7 +952,8 @@ static inline struct file *get_file(struct file *f) atomic_long_inc(&f->f_count); return f; } -#define get_file_rcu(x) atomic_long_inc_not_zero(&(x)->f_count) +#define get_file_rcu_many(x, cnt) atomic_long_add_unless(&(x)->f_count, (cnt), 0) +#define get_file_rcu(x) get_file_rcu_many((x), 1) #define fput_atomic(x) atomic_long_add_unless(&(x)->f_count, -1, 1) #define file_count(x) atomic_long_read(&(x)->f_count) From patchwork Tue Dec 4 23:37:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712787 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 90DC717D5 for ; Tue, 4 Dec 2018 23:38:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org 
(localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83A0529C06 for ; Tue, 4 Dec 2018 23:38:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 74E2B29C3A; Tue, 4 Dec 2018 23:38:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE59429ED2 for ; Tue, 4 Dec 2018 23:38:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726597AbeLDXiW (ORCPT ); Tue, 4 Dec 2018 18:38:22 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:44885 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726691AbeLDXiW (ORCPT ); Tue, 4 Dec 2018 18:38:22 -0500 Received: by mail-pl1-f196.google.com with SMTP id k8so9065680pls.11 for ; Tue, 04 Dec 2018 15:38:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=judl9Bd+gAX598NMF94uBUUR80aUoPJYBIu45By8XSI=; b=B3r3zdMGPoORa2HjO1m0dRG8Bk6Go31u2iz8zQOAfA2eTpmuRaPDS0Sdt50gfdEAGz UCTmBmvYaDqyCNJBTpADPA1gZyKD31G+NHfaXQ+xXiKU6S01FCfciY/8Tx83NgO67gjp YYZGRLIgr0mlQjXiofVjaZjuNPp8TFGrlcmULjgvYwdYz6d/hheZJPAiAAGrWOSdWKCT /ktFSIgLT5GOlcZctoAWUnolgH5gU7cQHt2jmwAE8B+IRQQvEpzkhlUwlZXC05Xg1TtC YL9rGmQ+8FZrn9qIAqLe1VtWW3BBtvS1MIf6+HX7+QO0Wi40EDGSXx1rXftXtyIXyVwy y0KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=judl9Bd+gAX598NMF94uBUUR80aUoPJYBIu45By8XSI=; b=fWB2+mJ5fOtD736XpNShe3nU/wmxax1Q5Wy+lRLCGepbXuKB1iUansk/OxLEWMckZP 
lGPikWDFJ3bo94LI7nWOn5F0Zyzx7d4fU2VeY2HHyyq1fj/R7PjmDfMzYwPI/b+pwGlU s1ZgLITw2zyWRSOO8JMQj9+TXvkGX5ks2521rL5dKwJQ0WEWM3YcExDtHj1T8flsNBOZ SXXO4NsVVoC0vxS++g9YNn8v0rs/2oMXcInmCrgCHaLqbCJ/tBzG6LIp7p029n2sAO8U vKPF9704oBWUXZvQBe6tvzDafgBFVyOWHYWhzHF37Ied18FlUjwjVAkGrEcuImiyEJS3 4z8w== X-Gm-Message-State: AA+aEWaIYxIOFZXIaVCyily/6qKlKdDOTBe1OseGImUZ1s10SaZmQKOR qyriiO0sRsOS1e9kFbgFMgDGnKEUNDY= X-Google-Smtp-Source: AFSGD/UTPeclHujyxdJK+xJZjRIiYzWNfOvHbvklg1dN8CWJPUNK4pJMQjRrAXXs6yA2MmBOU6FUwA== X-Received: by 2002:a17:902:7c0c:: with SMTP id x12mr22532702pll.265.1543966700587; Tue, 04 Dec 2018 15:38:20 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:18 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 18/26] aio: use fget/fput_many() for file references Date: Tue, 4 Dec 2018 16:37:21 -0700 Message-Id: <20181204233729.26776-19-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On the submission side, add file reference batching to the aio_submit_state. We get as many references as the number of iocbs we are submitting, and drop unused ones if we end up switching files. The assumption here is that we're usually only dealing with one fd, and if there are multiple, hopefully they are at least somewhat ordered. This could trivially be extended to cover multiple fds, if needed. On the completion side we do the same thing, except this is trivially done just locally in aio_iopoll_reap(). 
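The submission-side batching described above can be modelled in user space roughly as follows. This is only an illustrative sketch, not code from the patch: every demo_* name is hypothetical, the file is reduced to a bare counter, and the cache is keyed on the file pointer, whereas the patch itself keys on the fd number and uses fget_many()/fput_many().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file: only the reference count matters. */
struct demo_file {
	long f_count;
};

/* Hypothetical mirror of the file reference cache in aio_submit_state. */
struct demo_submit_state {
	struct demo_file *file;		/* file the cached references belong to */
	unsigned int has_refs;		/* references taken up front */
	unsigned int used_refs;		/* references handed out to requests */
	unsigned int ios_left;		/* requests still to come in this batch */
};

/* Take 'refs' references with a single counter update. */
static void demo_get_many(struct demo_file *f, unsigned int refs)
{
	f->f_count += refs;
}

/* Drop 'refs' references with a single counter update. */
static void demo_put_many(struct demo_file *f, unsigned int refs)
{
	f->f_count -= refs;
}

/* Return the references that were taken but never handed out. */
static void demo_state_put(struct demo_submit_state *s)
{
	if (s->file) {
		unsigned int diff = s->has_refs - s->used_refs;

		if (diff)
			demo_put_many(s->file, diff);
		s->file = NULL;
	}
}

/* Hand out one reference to 'f', refilling the cache when the file changes. */
static struct demo_file *demo_state_get(struct demo_submit_state *s,
					struct demo_file *f)
{
	if (s->file != f) {
		/* Assumption: drop leftovers for the old file, then batch anew. */
		demo_state_put(s);
		demo_get_many(f, s->ios_left);
		s->file = f;
		s->has_refs = s->ios_left;
		s->used_refs = 0;
	}
	s->used_refs++;
	s->ios_left--;
	return s->file;
}
```

With four requests in a batch, two against one file and two against another, each file's counter is touched once on the get side and the first file once more when the unused references are returned, instead of once per request.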
Signed-off-by: Jens Axboe --- fs/aio.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 91 insertions(+), 15 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index ae0805dc814c..634b540b0c92 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -253,6 +253,15 @@ struct aio_submit_state { */ struct list_head req_list; unsigned int req_count; + + /* + * File reference cache + */ + struct file *file; + unsigned int fd; + unsigned int has_refs; + unsigned int used_refs; + unsigned int ios_left; }; /*------ sysctl variables----*/ @@ -1355,7 +1364,8 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, { void *iocbs[AIO_IOPOLL_BATCH]; struct aio_kiocb *iocb, *n; - int to_free = 0, ret = 0; + int file_count, to_free = 0, ret = 0; + struct file *file = NULL; /* Shouldn't happen... */ if (*nr_events >= max) @@ -1372,7 +1382,20 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, list_del(&iocb->ki_list); iocbs[to_free++] = iocb; - fput(iocb->rw.ki_filp); + /* + * Batched puts of the same file, to avoid dirtying the + * file usage count multiple times, if avoidable. 
+ */ + if (!file) { + file = iocb->rw.ki_filp; + file_count = 1; + } else if (file == iocb->rw.ki_filp) { + file_count++; + } else { + fput_many(file, file_count); + file = iocb->rw.ki_filp; + file_count = 1; + } if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev, sizeof(iocb->ki_ev))) { @@ -1382,6 +1405,9 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, (*nr_events)++; } + if (file) + fput_many(file, file_count); + if (to_free) iocb_put_many(ctx, iocbs, &to_free); @@ -1804,13 +1830,58 @@ static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) } } -static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb) +static void aio_file_put(struct aio_submit_state *state) +{ + if (state->file) { + int diff = state->has_refs - state->used_refs; + + if (diff) + fput_many(state->file, diff); + state->file = NULL; + } +} + +/* + * Get as many references to a file as we have IOs left in this submission, + * assuming most submissions are for one file, or at least that each file + * has more than one submission. 
+ */ +static struct file *aio_file_get(struct aio_submit_state *state, int fd) +{ + if (!state) + return fget(fd); + + if (!state->file) { +get_file: + state->file = fget_many(fd, state->ios_left); + if (!state->file) + return NULL; + + state->fd = fd; + state->has_refs = state->ios_left; + state->used_refs = 1; + state->ios_left--; + return state->file; + } + + if (state->fd == fd) { + state->used_refs++; + state->ios_left--; + return state->file; + } + + aio_file_put(state); + goto get_file; +} + +static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb, + struct aio_submit_state *state) { struct kioctx *ctx = kiocb->ki_ctx; struct kiocb *req = &kiocb->rw; int ret; - req->ki_filp = fget(iocb->aio_fildes); + req->ki_filp = aio_file_get(state, iocb->aio_fildes); if (unlikely(!req->ki_filp)) return -EBADF; req->ki_pos = iocb->aio_offset; @@ -1960,7 +2031,8 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state, } static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, - bool vectored, bool compat) + struct aio_submit_state *state, bool vectored, + bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -1968,7 +2040,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, struct file *file; ssize_t ret; - ret = aio_prep_rw(kiocb, iocb); + ret = aio_prep_rw(kiocb, iocb, state); if (ret) return ret; file = req->ki_filp; @@ -1994,7 +2066,8 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, } static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, - bool vectored, bool compat) + struct aio_submit_state *state, bool vectored, + bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -2002,7 +2075,7 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, struct file *file; ssize_t ret; - ret = aio_prep_rw(kiocb, iocb); + ret = 
aio_prep_rw(kiocb, iocb, state); if (ret) return ret; file = req->ki_filp; @@ -2313,16 +2386,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EINVAL; switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(req, iocb, false, compat); + ret = aio_read(req, iocb, state, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(req, iocb, false, compat); + ret = aio_write(req, iocb, state, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(req, iocb, true, compat); + ret = aio_read(req, iocb, state, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(req, iocb, true, compat); + ret = aio_write(req, iocb, state, true, compat); break; case IOCB_CMD_FSYNC: if (ctx->flags & IOCTX_FLAG_IOPOLL) @@ -2410,17 +2483,20 @@ static void aio_submit_state_end(struct aio_submit_state *state) blk_finish_plug(&state->plug); if (!list_empty(&state->req_list)) aio_flush_state_reqs(state->ctx, state); + aio_file_put(state); } /* * Start submission side cache. 
*/ static void aio_submit_state_start(struct aio_submit_state *state, - struct kioctx *ctx) + struct kioctx *ctx, int max_ios) { state->ctx = ctx; INIT_LIST_HEAD(&state->req_list); state->req_count = 0; + state->file = NULL; + state->ios_left = max_ios; #ifdef CONFIG_BLOCK state->plug_cb.callback = aio_state_unplug; blk_start_plug(&state->plug); @@ -2461,7 +2537,7 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, nr = ctx->nr_events; if (nr > AIO_PLUG_THRESHOLD) { - aio_submit_state_start(&state, ctx); + aio_submit_state_start(&state, ctx, nr); statep = &state; } for (i = 0; i < nr; i++) { @@ -2505,7 +2581,7 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, nr = ctx->nr_events; if (nr > AIO_PLUG_THRESHOLD) { - aio_submit_state_start(&state, ctx); + aio_submit_state_start(&state, ctx, nr); statep = &state; } for (i = 0; i < nr; i++) { From patchwork Tue Dec 4 23:37:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712791 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 59DFE17D5 for ; Tue, 4 Dec 2018 23:38:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4CF7F29C06 for ; Tue, 4 Dec 2018 23:38:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4177A29EB9; Tue, 4 Dec 2018 23:38:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E551C29C06 for ; Tue, 4 Dec 2018 
23:38:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725886AbeLDXiY (ORCPT ); Tue, 4 Dec 2018 18:38:24 -0500 Received: from mail-pl1-f195.google.com ([209.85.214.195]:40344 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726669AbeLDXiX (ORCPT ); Tue, 4 Dec 2018 18:38:23 -0500 Received: by mail-pl1-f195.google.com with SMTP id u18so9075866plq.7 for ; Tue, 04 Dec 2018 15:38:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Ul+VE0yRWp7wiLWglojNjFDoqMVcfJibb/mrajeGb/U=; b=HDkYULpaTdjG0VndsOVCNomLAWFR4ZR48+pVNOT+806Kdx+3UuNvb4egZJc2ruX0xJ rdmtYtl4PBnZJFZPqJPL2e5wiRhpqy416tjkMNbJjcuzt0cFNSb5VKne8/gfEz6zkHXJ t4jcsYZeIHHm5k27h7+ML+vNw56rAAtgA8G8yUdCFaFEZInDns34Ojk9VTvRHoGXFyiv Jj8hRCtqHPp56Swv2cQKeIjVP2AoirEkAMn62Ylot7wBN7hWAO3Mc26gpTnWe6W/fwtX o5ohTOxbj3Y3PzgrMt8pcmH71Dcc53/mF70offPhYkggYbnFXXex1h9lQjdNK7ga8avE nfqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Ul+VE0yRWp7wiLWglojNjFDoqMVcfJibb/mrajeGb/U=; b=oR4cXdGyF4/evPA+dPb+ODTYnF7t2+GTeiNTP+OsnE9MnwxkxKM7ZkBxmpznSVS1/V z9yL75z8ZfFBf4yLrY+E3v0wNW1+yD+NzpX2jTO82mXBzAo8GoirBJK+8PWPI/b3hkJJ V3Y8dh3w+CTdt+blu4HREnvsw0gXef2XwYWOdcdQ1f/hZPfX1DaoSUKOm82s+dbkDpOu JEgulvg4KBWpInqzjhSA19k7tWD9L4ewpu+eYaMdEiBH0OTaGQvvpRlyaRkKb8UB2pzY dPysp7zHjuFzwsaXp+FWdWTKZ5ns0q4IC0IbU/RluHMGeKpheErQyY9qrDchBHK1Qpds gqTA== X-Gm-Message-State: AA+aEWY+sbyjqKV9rDl58nzEdIAOYXnM2tqQUU7dI/6FkmlXowLPw1lB 2p3+yiMtJfavtFpblnBabzhHaMkGAKs= X-Google-Smtp-Source: AFSGD/WBeY9LmV7DIXmFZwKfJD2sYa+bqdvhbeR06tOSq2igIBMpnjnFi+kiNZ1wh5eEeQYWStd4kw== X-Received: by 2002:a17:902:bc43:: with SMTP id t3mr20525665plz.124.1543966702841; Tue, 04 Dec 2018 15:38:22 -0800 (PST) Received: from x1.localdomain 
(66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:21 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 19/26] aio: split iocb init from allocation Date: Tue, 4 Dec 2018 16:37:22 -0700 Message-Id: <20181204233729.26776-20-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In preparation for having pre-allocated requests that we then just need to initialize before use. Signed-off-by: Jens Axboe --- fs/aio.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 634b540b0c92..416bb2e365e0 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1099,6 +1099,16 @@ static bool get_reqs_available(struct kioctx *ctx) return __get_reqs_available(ctx); } +static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req) +{ + percpu_ref_get(&ctx->reqs); + req->ki_ctx = ctx; + INIT_LIST_HEAD(&req->ki_list); + req->ki_flags = 0; + refcount_set(&req->ki_refcnt, 0); + req->ki_eventfd = NULL; +} + /* aio_get_req * Allocate a slot for an aio request. * Returns NULL if no requests are free. 
@@ -1111,12 +1121,7 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) if (unlikely(!req)) return NULL; - percpu_ref_get(&ctx->reqs); - req->ki_ctx = ctx; - INIT_LIST_HEAD(&req->ki_list); - req->ki_flags = 0; - refcount_set(&req->ki_refcnt, 0); - req->ki_eventfd = NULL; + aio_iocb_init(ctx, req); return req; } From patchwork Tue Dec 4 23:37:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712795 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E755E109C for ; Tue, 4 Dec 2018 23:38:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DAC4F29C06 for ; Tue, 4 Dec 2018 23:38:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF66729EB9; Tue, 4 Dec 2018 23:38:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 764DA29C06 for ; Tue, 4 Dec 2018 23:38:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726669AbeLDXi0 (ORCPT ); Tue, 4 Dec 2018 18:38:26 -0500 Received: from mail-pg1-f196.google.com ([209.85.215.196]:36867 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726665AbeLDXi0 (ORCPT ); Tue, 4 Dec 2018 18:38:26 -0500 Received: by mail-pg1-f196.google.com with SMTP id 80so8093313pge.4 for ; Tue, 04 Dec 2018 15:38:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=2+la47KDAQKGmZHMhrQqv7/JmZ/DmBCr3wt9bubXBUw=; b=NLvp1mIghPnkZPt4HnF+MITp2xKyPiKxKfKVvWgGOyITdVy6OuPu+1kFyvM1dlR3OU 3qZC8BjlyGqcUFo33JkytHBmYN4ZSvNRImN1loOjbtMeAmeXzMHwK7ghJbwOBYUJsf4M Y0CxeD26tysB5NV3VP/tPLrQ8lVt6I2x9dTvq+nbbwCBgmkbTl93jO2d1KajWiaWhsMd hCVVSPbmKc4t6UgXKjQ/wfq/amC/KRx54y3suXPKX6R1bo+Z0Dob0m+Z5a1Uy8nZr9hf IVfVN46xB9cqOQ2WI+Y1KYrfEDl6FfH49qQYUGnqGbqrBBVMFOOgiHHcHiMmwQbCgzJF Ky8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2+la47KDAQKGmZHMhrQqv7/JmZ/DmBCr3wt9bubXBUw=; b=fNutr8Ftt+Io6ryTA5qg20CZB+fHMYxvPkajwZ3AZHvNMLklH9CYdNDL3yWAPhU591 3QROPiaP/CqAbLBKATgyw3f82FX+lHBM/lgzv19CBXrs66JvKpugrcAZ8YwW+zL+uW3+ lENICqPK46VfcpWlZdv04aD97Olw9k/4bS0sujEBbFunQHWuINvAj5jKK7xI9lir8fza NkkbtRKK/Q3DhQrnnO4aqT+A3a/NfxCXpMge6KecFrNNBX43Wj+mHeRyypDuChM58vJb ZnKp8ho4B8zVl2k56YgGq6Lv6qRGyMuPhbiAHr9jRrchdDJW0PlQVv6oGTd97Vuw44dN PXzA== X-Gm-Message-State: AA+aEWafjhk2VnAUTS1NWZmV9CKKDnP3BK75+yiosr5DwNQfhyaNhwRR /6zhXi/GBDchx5G2LcRl6atZsUPOo7w= X-Google-Smtp-Source: AFSGD/UBTBuH7VXnjUcNx3kOh9ltHSVTgMCaGnlsITE34Qg0JI35OogHS10AMdtRpikm5fK1hGYHQg== X-Received: by 2002:a63:741:: with SMTP id 62mr17982025pgh.352.1543966704697; Tue, 04 Dec 2018 15:38:24 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.22 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:23 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 20/26] aio: batch aio_kiocb allocation Date: Tue, 4 Dec 2018 16:37:23 -0700 Message-Id: <20181204233729.26776-21-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Similarly to how we use the state->ios_left to know how many references to get to a file, we can use it to allocate the aio_kiocb's we need in bulk. Signed-off-by: Jens Axboe --- fs/aio.c | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 416bb2e365e0..1735ec556089 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -240,6 +240,8 @@ struct aio_kiocb { }; }; +#define AIO_IOPOLL_BATCH 8 + struct aio_submit_state { struct kioctx *ctx; @@ -254,6 +256,13 @@ struct aio_submit_state { struct list_head req_list; unsigned int req_count; + /* + * aio_kiocb alloc cache + */ + void *iocbs[AIO_IOPOLL_BATCH]; + unsigned int free_iocbs; + unsigned int cur_iocb; + /* * File reference cache */ @@ -1113,15 +1122,35 @@ static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req) * Allocate a slot for an aio request. * Returns NULL if no requests are free. 
*/ -static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) +static struct aio_kiocb *aio_get_req(struct kioctx *ctx, + struct aio_submit_state *state) { struct aio_kiocb *req; - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); - if (unlikely(!req)) - return NULL; + if (!state) + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); + else if (!state->free_iocbs) { + size_t size; + + size = min_t(size_t, state->ios_left, ARRAY_SIZE(state->iocbs)); + size = kmem_cache_alloc_bulk(kiocb_cachep, GFP_KERNEL, size, + state->iocbs); + if (size < 0) + return ERR_PTR(size); + else if (!size) + return ERR_PTR(-ENOMEM); + state->free_iocbs = size - 1; + state->cur_iocb = 1; + req = state->iocbs[0]; + } else { + req = state->iocbs[state->cur_iocb]; + state->free_iocbs--; + state->cur_iocb++; + } + + if (req) + aio_iocb_init(ctx, req); - aio_iocb_init(ctx, req); return req; } @@ -1359,8 +1388,6 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr, return ret < 0 || *i >= min_nr; } -#define AIO_IOPOLL_BATCH 8 - /* * Process completed iocb iopoll entries, copying the result to userspace. 
*/ @@ -2357,7 +2384,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, return -EAGAIN; ret = -EAGAIN; - req = aio_get_req(ctx); + req = aio_get_req(ctx, state); if (unlikely(!req)) goto out_put_reqs_available; @@ -2489,6 +2516,9 @@ static void aio_submit_state_end(struct aio_submit_state *state) if (!list_empty(&state->req_list)) aio_flush_state_reqs(state->ctx, state); aio_file_put(state); + if (state->free_iocbs) + kmem_cache_free_bulk(kiocb_cachep, state->free_iocbs, + &state->iocbs[state->cur_iocb]); } /* @@ -2500,6 +2530,7 @@ static void aio_submit_state_start(struct aio_submit_state *state, state->ctx = ctx; INIT_LIST_HEAD(&state->req_list); state->req_count = 0; + state->free_iocbs = 0; state->file = NULL; state->ios_left = max_ios; #ifdef CONFIG_BLOCK From patchwork Tue Dec 4 23:37:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712799 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6EB2117DB for ; Tue, 4 Dec 2018 23:38:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 61EBF29C06 for ; Tue, 4 Dec 2018 23:38:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 568F829EB9; Tue, 4 Dec 2018 23:38:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0F01F29C06 for ; Tue, 4 Dec 2018 23:38:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org 
via listexpand id S1726423AbeLDXi2 (ORCPT ); Tue, 4 Dec 2018 18:38:28 -0500 Received: from mail-pl1-f193.google.com ([209.85.214.193]:42729 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726701AbeLDXi2 (ORCPT ); Tue, 4 Dec 2018 18:38:28 -0500 Received: by mail-pl1-f193.google.com with SMTP id y1so4267293plp.9 for ; Tue, 04 Dec 2018 15:38:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7QkOJIdwfAgSYahhwfwYvkmzaNCDGUD8Z2PwUSvN4TU=; b=djIibfsyBElfuwMuiHGfyhE2Lrff+914xqBrZqye1bAvmb1N5MAPnOFQatrmNCqsDz bOf1DfLNNl6grdz5Pq8yyullje7fWdsvj1O09q756OU//oyxjG+g6PRg4+NjztOoVPdu yXUQ65s6wErml1ekLbaHUChf+/HB5Ev8qiGiM1oTvZpefXfyEFxH0pkV8vB+r+ZH+zpL OgdX0rKqSbzQ4wI/Fc66gWYcwZW4wxwcgCWYnhafsjTro+BE9q8UZtBalfbfe62RcC31 yOmsC6g0wvTagpcsuNtb27Mha5SdnkNFwzP1uIiBdbb7w2ogIxw9VyAJ/4lgCCQ0nNKl yuWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7QkOJIdwfAgSYahhwfwYvkmzaNCDGUD8Z2PwUSvN4TU=; b=ZaodNy+1otIKL/3ZW5MsGPj2VT0g5S3I+RE76lEqgJw5tfIdfqlxzL1qXR5tUFwfLU PvgVa6kSVB/M0Xw1jbljCYGVmcbVNSvlbP9ccTr/Ecbqnlju7JOz1EWwXgC+4pIEl45i moWijE56OpXR9+d18bGrBsufBVoR08IxWZ0NkS40B8V9yffjpYfrnuI0YBX4r7e8VVL2 //76yTsKOyUA/KZcx+j/autS3fV8jaFSGOy4DRlysBzZVMmVN734BBHEEZJbEiPFFgvx sYXbFT6t6B65Qle7XnEJesDZcux+KZx2SHy7KTA8Hsq1gCSO7SbAOubaYRK/41EShFQ3 uuqA== X-Gm-Message-State: AA+aEWb6ZC+PJA6MBhTdVA/177VuJ3DMDY49dY74S/m7DC+xTaM3Isks oC7npbTbmgj7b+EwgEvbdG7zyTYbyZ4= X-Google-Smtp-Source: AFSGD/Xyz58/Dp/UjaBS0nA0vu9Pdfudz4SYcLlBKlTHA3dX6m0/+xK1df7t4zfbFZJ6VOr66mr0Ow== X-Received: by 2002:a17:902:34a:: with SMTP id 68mr22536321pld.268.1543966706496; Tue, 04 Dec 2018 15:38:26 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.24 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:25 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 21/26] block: add BIO_HOLD_PAGES flag Date: Tue, 4 Dec 2018 16:37:24 -0700 Message-Id: <20181204233729.26776-22-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For user mapped IO, we do get_user_pages() upfront, and then do a put_page() on each page at end_io time to release the page reference. In preparation for having permanently mapped pages, add a BIO_HOLD_PAGES flag that tells us not to release the pages; the caller will do that. 
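The effect of the flag on completion can be pictured with a small user-space model. This is a sketch under stated assumptions, not kernel code: the demo_* names are hypothetical, and "releasing pages" is reduced to recording a count rather than calling put_page().

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_BIO_HOLD_PAGES 12	/* bit number, mirroring the new flag */

/* Hypothetical stand-in for struct bio. */
struct demo_bio {
	unsigned long flags;		/* bit field of DEMO_BIO_* flags */
	unsigned int pages_released;	/* pages dropped at completion */
	unsigned int nr_pages;		/* pages attached to this bio */
};

/* Test a flag bit, in the spirit of bio_flagged(). */
static bool demo_bio_flagged(const struct demo_bio *bio, unsigned int bit)
{
	return (bio->flags & (1UL << bit)) != 0;
}

/* Set a flag bit, in the spirit of bio_set_flag(). */
static void demo_bio_set_flag(struct demo_bio *bio, unsigned int bit)
{
	bio->flags |= 1UL << bit;
}

/* Completion: release the pages unless the submitter asked to keep them. */
static void demo_bio_complete(struct demo_bio *bio)
{
	if (!demo_bio_flagged(bio, DEMO_BIO_HOLD_PAGES))
		bio->pages_released = bio->nr_pages;
}
```

A bio without the flag gives up its page references at completion as before, while a flagged bio leaves them alone so that the owner of the long-lived mapping can drop them later.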
Signed-off-by: Jens Axboe --- block/bio.c | 6 ++++-- include/linux/blk_types.h | 1 + 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/block/bio.c b/block/bio.c index 03895cc0d74a..ab174bce5436 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1635,7 +1635,8 @@ static void bio_dirty_fn(struct work_struct *work) next = bio->bi_private; bio_set_pages_dirty(bio); - bio_release_pages(bio); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_release_pages(bio); bio_put(bio); } } @@ -1651,7 +1652,8 @@ void bio_check_pages_dirty(struct bio *bio) goto defer; } - bio_release_pages(bio); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_release_pages(bio); bio_put(bio); return; defer: diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 1d9c3da0f2a1..344e5b61aa4e 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -227,6 +227,7 @@ struct bio { #define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion * of this bio. */ #define BIO_QUEUE_ENTERED 11 /* can use blk_queue_enter_live() */ +#define BIO_HOLD_PAGES 12 /* don't put O_DIRECT pages */ /* See BVEC_POOL_OFFSET below before adding new flags */ From patchwork Tue Dec 4 23:37:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712803 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6057717DB for ; Tue, 4 Dec 2018 23:38:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 53BEC29C06 for ; Tue, 4 Dec 2018 23:38:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 47FBB29EB9; Tue, 4 Dec 2018 23:38:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: 
No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EF93529C06 for ; Tue, 4 Dec 2018 23:38:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726665AbeLDXia (ORCPT ); Tue, 4 Dec 2018 18:38:30 -0500 Received: from mail-pl1-f194.google.com ([209.85.214.194]:45984 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726701AbeLDXi3 (ORCPT ); Tue, 4 Dec 2018 18:38:29 -0500 Received: by mail-pl1-f194.google.com with SMTP id a14so9065354plm.12 for ; Tue, 04 Dec 2018 15:38:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=H/nxDHAB/OPHiKU4vV61gL4I+W3NMxEoNevWsSgLfuE=; b=hKAhSJ7rdP9szqo4HDJXZNOVgveXC19Y2qvki9NytTk/WIpw0QhoOHTrRW9hjJy3pX iERJ1HzXfx9NWRMn6erEE2ezRU23tGuj5FG8JmT2jHjnrsOi9/JbKg6ESla32RiN2XIQ INaEe30+cTyLV2vuKBwnXLHSPbMhm8uk/pPY5qkuk6lBOUrpnR8NUeaozfF2fj9jU26Q YT2KJlGKUkMq+WT0IWMcDg+rBynPfJYpvE3ubzvpWQxcELzJJvy3h3k7YQ/HzVhDjj4k ljWY6OOT44fsGB/QZSQFbP2UJfB34SW1N0m+EQe5juZ8d8IBhsRuxqDXC19gCx8W5B6z QNoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=H/nxDHAB/OPHiKU4vV61gL4I+W3NMxEoNevWsSgLfuE=; b=JR0zAKF/iEpL3S+UMCBKYxDPgtu/njwP4wILeDRM+v24r/pbgr4095LvpJWfAlDuzd da/wGNTjsC1Ha/AN8P0p3nPkx/ZJqNGr6O6B9n54qqM/ymhwfHg1E7zFgTBTzhT+cUJ6 p3QYy3tpwttY/ozDA0jZ/klewEKDXJYU41S3j6TeYSwKEIIFEmS12BiUcjY7f9wWGYMa vofFBEFAnkD8tGelRHYzdSxMMSxaLl6FT+KGU2CPyVfA2bFEVRoOxOwKQwr/lfGVVa6Z HjOvec5m3gBNztrFl19YRL3hLtJegF7CbsLvBjlKYQC7mKHsRQ93aT5ZmKtFC1O10q1I AP3g== X-Gm-Message-State: AA+aEWYqRb6Lj0rDq6wiR3SqgG0m53iGkeU50CxG77bfOTchN/xfS0Sv 
ns03IyZuupWQcH6ZSOyH6h3qMWVmf08= X-Google-Smtp-Source: AFSGD/Uytif9H5edEwA0CdsjuwdlEggcz/ayI7/UfSZNTPDUTEIgeIpihVZAgbsVT2+BV7TZOcmqDw== X-Received: by 2002:a17:902:848d:: with SMTP id c13mr22178536plo.257.1543966708395; Tue, 04 Dec 2018 15:38:28 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:27 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 22/26] block: implement bio helper to add iter bvec pages to bio Date: Tue, 4 Dec 2018 16:37:25 -0700 Message-Id: <20181204233729.26776-23-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For an ITER_BVEC, we can just iterate the iov and add the pages to the bio directly. Signed-off-by: Jens Axboe --- block/bio.c | 27 +++++++++++++++++++++++++++ include/linux/bio.h | 1 + 2 files changed, 28 insertions(+) diff --git a/block/bio.c b/block/bio.c index ab174bce5436..0e466d06f748 100644 --- a/block/bio.c +++ b/block/bio.c @@ -903,6 +903,33 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) } EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages); +/** + * bio_iov_bvec_add_pages - add pages from an ITER_BVEC to a bio + * @bio: bio to add pages to + * @iter: iov iterator describing the region to be added + * + * Iterate pages in the @iter and add them to the bio. We flag the + * @bio with BIO_HOLD_PAGES, telling IO completion not to free them. 
+ */ +int bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter) +{ + unsigned short orig_vcnt = bio->bi_vcnt; + const struct bio_vec *bv; + + do { + size_t size; + + bv = iter->bvec + iter->iov_offset; + size = bio_add_page(bio, bv->bv_page, bv->bv_len, bv->bv_offset); + if (size != bv->bv_len) + break; + iov_iter_advance(iter, size); + } while (iov_iter_count(iter) && !bio_full(bio)); + + bio_set_flag(bio, BIO_HOLD_PAGES); + return bio->bi_vcnt > orig_vcnt ? 0 : -EINVAL; +} + static void submit_bio_wait_endio(struct bio *bio) { complete(bio->bi_private); diff --git a/include/linux/bio.h b/include/linux/bio.h index 056fb627edb3..39ddb53e7e7f 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -434,6 +434,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, void __bio_add_page(struct bio *bio, struct page *page, unsigned int len, unsigned int off); int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter); +int bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter); struct rq_map_data; extern struct bio *bio_map_user_iov(struct request_queue *, struct iov_iter *, gfp_t); From patchwork Tue Dec 4 23:37:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712813 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E33FC17DB for ; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D87452A838 for ; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CB6842A55D; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 46E3728D96 for ; Tue, 4 Dec 2018 23:38:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726701AbeLDXic (ORCPT ); Tue, 4 Dec 2018 18:38:32 -0500 Received: from mail-pf1-f194.google.com ([209.85.210.194]:38364 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726712AbeLDXib (ORCPT ); Tue, 4 Dec 2018 18:38:31 -0500 Received: by mail-pf1-f194.google.com with SMTP id q1so9008959pfi.5 for ; Tue, 04 Dec 2018 15:38:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7Qnfg7ArnbxEsp5GZEYhER2UYK3J+cRrwNMf1dzAyQ0=; b=H2XVBgwRHjr6Ehm4bnnka/pSmHF+rIr+wILFasDdk55Ci0hqMMbvbWZcrM/Q+UuHcP gJUkq1fBec+MXVM6D88qG27wlQY3nL76o8ukrBBLe4nJ/hBLCpn/BxQOrrvaODwM+Q+z MB+NFbXt2+YMahU91/Ibf0cq8rJr1jYNRug7/ld2zDvKk+7ty+n4A3ZAhsSQSv/QJZIZ h44v9sy2L9pjLtnB+z749DVtxZumtRtoWeHLD/UpRLoqzE0+hc78ib6M4+ONWyZGyYqt VaOnIravjvCAhQ7Sva/dIeCOEp2yr5wagkKLaXgdonxy2M2CGhFVGVp4NotXPihP/bi+ AB2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7Qnfg7ArnbxEsp5GZEYhER2UYK3J+cRrwNMf1dzAyQ0=; b=Ui+3P3cyp+h1XI61Kh3NarxM33wS+Te+rf5Y3nIJAuzsTFVVcQfOqwlR+qoMFF9GEB v1/wiOHR/H3GDu5sPH3uOy73eHWbWIBVZ7X4sbFhBNHl8nkjZCJllfj8MOYqgylQjsJ7 CmJJTe6wLA6nsXntY/G3K9liiDABbf1mfbymbn9ipVv5d7tRlMdKIESYfAqzAnRjJTeo IW2xSWnZjIiB77nimbTvTod2dyezbHgCMij/5fHAvGWMVKpi35uxpMVkibBnvCbrijpT 8YsucY1vhum3ZziUjCeYLrTJfXeB6+TlFm5RmWkzXlwt31lKOx9iYKssT06wjlq1zLdL U6Dw== X-Gm-Message-State: AA+aEWbmfZnQ6CT2ECLKbB81l/K8+YRvQrhNMiXt+m2IGqZGunYGVfQ7 /vZzSUzmqwqA8v83KROhDK2BcDvWeXc= 
X-Google-Smtp-Source: AFSGD/XcIKOV2dNKYz9y9us37EgWU7MGz/qCYP/MLpKr3ukNe0a7xn2RhnvzOs+m1239O3l2F3WpKQ== X-Received: by 2002:a63:1f4e:: with SMTP id q14mr17751656pgm.88.1543966710125; Tue, 04 Dec 2018 15:38:30 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. [66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:29 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 23/26] fs: add support for mapping an ITER_BVEC for O_DIRECT Date: Tue, 4 Dec 2018 16:37:26 -0700 Message-Id: <20181204233729.26776-24-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This adds support for sync/async O_DIRECT to make a bvec type iter for bdev access, as well as iomap. 
Signed-off-by: Jens Axboe --- fs/block_dev.c | 16 ++++++++++++---- fs/iomap.c | 5 ++++- 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index b8f574615792..236c6abe649d 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -219,7 +219,10 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter, bio.bi_end_io = blkdev_bio_end_io_simple; bio.bi_ioprio = iocb->ki_ioprio; - ret = bio_iov_iter_get_pages(&bio, iter); + if (iov_iter_is_bvec(iter)) + ret = bio_iov_bvec_add_pages(&bio, iter); + else + ret = bio_iov_iter_get_pages(&bio, iter); if (unlikely(ret)) goto out; ret = bio.bi_iter.bi_size; @@ -326,8 +329,9 @@ static void blkdev_bio_end_io(struct bio *bio) struct bio_vec *bvec; int i; - bio_for_each_segment_all(bvec, bio, i) - put_page(bvec->bv_page); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_for_each_segment_all(bvec, bio, i) + put_page(bvec->bv_page); bio_put(bio); } } @@ -381,7 +385,11 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages) bio->bi_end_io = blkdev_bio_end_io; bio->bi_ioprio = iocb->ki_ioprio; - ret = bio_iov_iter_get_pages(bio, iter); + if (iov_iter_is_bvec(iter)) + ret = bio_iov_bvec_add_pages(bio, iter); + else + ret = bio_iov_iter_get_pages(bio, iter); + if (unlikely(ret)) { bio->bi_status = BLK_STS_IOERR; bio_endio(bio); diff --git a/fs/iomap.c b/fs/iomap.c index bd483fcb7b5a..f00e5e198c57 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1673,7 +1673,10 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io; - ret = bio_iov_iter_get_pages(bio, &iter); + if (iov_iter_is_bvec(&iter)) + ret = bio_iov_bvec_add_pages(bio, &iter); + else + ret = bio_iov_iter_get_pages(bio, &iter); if (unlikely(ret)) { /* * We have to stop part way through an IO. 
We must fall From patchwork Tue Dec 4 23:37:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712817 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8159717D5 for ; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 758DE28A5E for ; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 677FE2A88F; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 88535293BD for ; Tue, 4 Dec 2018 23:38:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726712AbeLDXie (ORCPT ); Tue, 4 Dec 2018 18:38:34 -0500 Received: from mail-pg1-f195.google.com ([209.85.215.195]:41499 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726717AbeLDXie (ORCPT ); Tue, 4 Dec 2018 18:38:34 -0500 Received: by mail-pg1-f195.google.com with SMTP id 70so8091939pgh.8 for ; Tue, 04 Dec 2018 15:38:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9MIabrCZv2wQyfgdQ8RjyDsWgH6dv0dOreMfkqGsOyU=; b=SYUpn0F6F2xIg6+oEOxYeimcMQpJrCuWHHqNB09/67z9XQfJLyrFDUZWJYKVMeIk9J ZK/r0pb32AvtT858O1DLXFz6aU7c+8T+QoLsV6PSUjRqVS/ol6J7dgH8dCTR07kxik6R 
9K8AD1eyysXWEoUAVntTdzxvczk+nZ/u+eUpIvh4FquV4NuHCeMYP9HNfPvrz+iUO4Bn E8Ii8BOF7zH5b9Fzblh0gMz1e7OAtZq167YVAxcpes45khck8ZNxgRhsz3OZtwuLyafW W/rtufdAGWRESfCcYJ6P0Rwl+4A+NJBevg3npAbDZilru58lrFqSRrsrih/2kDtht1Qw XIIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9MIabrCZv2wQyfgdQ8RjyDsWgH6dv0dOreMfkqGsOyU=; b=c/AXAyoI7ya1f9NYV/lzsBVJS/Li+KJYU+ctsgHyFIOZKIcc7dLQDeUG5QyoJNmt9H crVcKb698mjsJMVp72hh8vcSMiMzh9CCC/3/WFjLGvzmrTeh9E8Fj4i7Aepv5PGFePcR W6CNWo14Rxmo83/RUYQer+j4H3rsARvAuC6+yfP6+nRktWqZMJpqmdhCtIdt9/Gmher5 ne86Tpp+3l2m0HJC2lFcZFcDBe6uzxNPSSPDg9O2PIP8XECBeWLZLeNxmZTYaQr2wJOn iYsi8dH/tcmxuPX43AZFwREZKqw99hj4E3eh7sbn6KNHY0mqrWu/vw8MJK/0hjrngk7S n/MQ== X-Gm-Message-State: AA+aEWaUvIAMEoAkSBvZT+FV9vx7r4y4oU13m8cpvadOD9zP2+onOLrY MpuSAGYyOEF7sGPxEIoJsskKoPqGX+U= X-Google-Smtp-Source: AFSGD/VPN2sBwHhLQtqTzW3GRQvgKl9K5bnxX1tIc+70GAh0HMCK8xhsalnq6N7/Juc/eQLVQVwoig== X-Received: by 2002:a62:6385:: with SMTP id x127mr22379960pfb.15.1543966711993; Tue, 04 Dec 2018 15:38:31 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:31 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 24/26] aio: add support for pre-mapped user IO buffers Date: Tue, 4 Dec 2018 16:37:27 -0700 Message-Id: <20181204233729.26776-25-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If we have fixed user buffers, we can map them into the kernel when we set up the io_context. That avoids the need to do get_user_pages() for each and every IO. To utilize this feature, the application must set both IOCTX_FLAG_USERIOCB, to provide iocbs in userspace, and IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the mapped iocbs already contain valid destinations and sizes. These buffers can then be mapped into the kernel for the lifetime of the io_context, as opposed to just the duration of each single IO. Only works with non-vectored read/write commands for now, not with PREADV/PWRITEV. A limit of 4M is imposed as the largest buffer we currently support. There's nothing preventing us from going larger, but we need some cap, and 4M seemed like it would definitely be big enough. RLIMIT_MEMLOCK is used to cap the total amount of memory pinned.
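As a rough userspace illustration of the accounting described above (not kernel code), the sketch below computes how many pages a fixed buffer spans and how it splits into per-page segments, following the same arithmetic as aio_iocb_buffer_map() in this patch. The 4 KiB page size and the names fixed_buf_nr_pages() and fixed_buf_seg_len() are assumptions made for the example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel uses the architecture's PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_MAX_BUF    ((uint64_t)4 << 20)	/* the 4M cap from the text */

/*
 * Hypothetical helper: number of pages touched by [ubuf, ubuf + nbytes),
 * or -1 if the buffer address is zero or the length exceeds the cap.
 */
static long fixed_buf_nr_pages(uint64_t ubuf, uint64_t nbytes)
{
	uint64_t start, end;

	if (!ubuf || nbytes > SKETCH_MAX_BUF)
		return -1;

	start = ubuf >> SKETCH_PAGE_SHIFT;
	end = (ubuf + nbytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	return (long)(end - start);
}

/*
 * Hypothetical helper: length of segment 'idx' when the buffer is split
 * page by page. Only the first segment starts at a non-zero page offset.
 * Returns 0 if 'idx' is past the last segment.
 */
static size_t fixed_buf_seg_len(uint64_t ubuf, uint64_t nbytes, long idx)
{
	uint64_t off = ubuf & (SKETCH_PAGE_SIZE - 1);
	uint64_t left = nbytes;
	uint64_t len = 0;
	long j;

	for (j = 0; j <= idx && left; j++) {
		len = SKETCH_PAGE_SIZE - off;
		if (len > left)
			len = left;
		left -= len;
		off = 0;
	}
	return j > idx ? (size_t)len : 0;
}
```

For instance, a 4096-byte buffer that starts halfway into a page spans two pages and yields two 2048-byte segments. Summing the page counts of all buffers gives the figure that would be compared against the RLIMIT_MEMLOCK-derived page limit.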
See the fio change for how to utilize this feature: http://git.kernel.dk/cgit/fio/commit/?id=2041bd343da1c1e955253f62374588718c64f0f3 Signed-off-by: Jens Axboe --- fs/aio.c | 193 ++++++++++++++++++++++++++++++++--- include/uapi/linux/aio_abi.h | 1 + 2 files changed, 177 insertions(+), 17 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 1735ec556089..1c8a8bb631a9 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -97,6 +98,11 @@ struct aio_mapped_range { long nr_pages; }; +struct aio_mapped_ubuf { + struct bio_vec *bvec; + unsigned int nr_bvecs; +}; + struct kioctx { struct percpu_ref users; atomic_t dead; @@ -132,6 +138,8 @@ struct kioctx { struct page **ring_pages; long nr_pages; + struct aio_mapped_ubuf *user_bufs; + struct aio_mapped_range iocb_range; struct rcu_work free_rwork; /* see free_ioctx() */ @@ -301,6 +309,7 @@ static const bool aio_use_state_req_list = false; static void aio_useriocb_unmap(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); +static void aio_iocb_buffer_unmap(struct kioctx *); static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { @@ -662,6 +671,7 @@ static void free_ioctx(struct work_struct *work) free_rwork); pr_debug("freeing %p\n", ctx); + aio_iocb_buffer_unmap(ctx); aio_useriocb_unmap(ctx); aio_free_ring(ctx); free_percpu(ctx->cpu); @@ -1672,6 +1682,122 @@ static int aio_useriocb_map(struct kioctx *ctx, struct iocb __user *iocbs) return aio_map_range(&ctx->iocb_range, iocbs, size, 0); } +static void aio_iocb_buffer_unmap(struct kioctx *ctx) +{ + int i, j; + + if (!ctx->user_bufs) + return; + + for (i = 0; i < ctx->max_reqs; i++) { + struct aio_mapped_ubuf *amu = &ctx->user_bufs[i]; + + for (j = 0; j < amu->nr_bvecs; j++) + put_page(amu->bvec[j].bv_page); + + kfree(amu->bvec); + amu->nr_bvecs = 0; + } + + kfree(ctx->user_bufs); + ctx->user_bufs = NULL; +} + +static int aio_iocb_buffer_map(struct kioctx *ctx) +{ + unsigned long 
total_pages, page_limit; + struct page **pages = NULL; + int i, j, got_pages = 0; + struct iocb *iocb; + int ret = -EINVAL; + + ctx->user_bufs = kzalloc(ctx->max_reqs * sizeof(struct aio_mapped_ubuf), + GFP_KERNEL); + if (!ctx->user_bufs) + return -ENOMEM; + + /* Don't allow more pages than we can safely lock */ + total_pages = 0; + page_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + + for (i = 0; i < ctx->max_reqs; i++) { + struct aio_mapped_ubuf *amu = &ctx->user_bufs[i]; + unsigned long off, start, end, ubuf; + int pret, nr_pages; + size_t size; + + iocb = aio_iocb_from_index(ctx, i); + + /* + * Don't impose further limits on the size and buffer + * constraints here, we'll -EINVAL later when IO is + * submitted if they are wrong. + */ + ret = -EFAULT; + if (!iocb->aio_buf) + goto err; + + /* arbitrary limit, but we need something */ + if (iocb->aio_nbytes > SZ_4M) + goto err; + + ubuf = iocb->aio_buf; + end = (ubuf + iocb->aio_nbytes + PAGE_SIZE - 1) >> PAGE_SHIFT; + start = ubuf >> PAGE_SHIFT; + nr_pages = end - start; + + ret = -ENOMEM; + if (total_pages + nr_pages > page_limit) + goto err; + + if (!pages || nr_pages > got_pages) { + kfree(pages); + pages = kmalloc(nr_pages * sizeof(struct page *), + GFP_KERNEL); + if (!pages) + goto err; + got_pages = nr_pages; + } + + amu->bvec = kmalloc(nr_pages * sizeof(struct bio_vec), + GFP_KERNEL); + if (!amu->bvec) + goto err; + + down_write(&current->mm->mmap_sem); + pret = get_user_pages((unsigned long) iocb->aio_buf, nr_pages, + 1, pages, NULL); + up_write(&current->mm->mmap_sem); + + if (pret < nr_pages) { + if (pret < 0) + ret = pret; + goto err; + } + + off = ubuf & ~PAGE_MASK; + size = iocb->aio_nbytes; + for (j = 0; j < nr_pages; j++) { + size_t vec_len; + + vec_len = min_t(size_t, size, PAGE_SIZE - off); + amu->bvec[j].bv_page = pages[j]; + amu->bvec[j].bv_len = vec_len; + amu->bvec[j].bv_offset = off; + off = 0; + size -= vec_len; + } + amu->nr_bvecs = nr_pages; + total_pages += nr_pages; + } + kfree(pages); + return 0;
+err: + kfree(pages); + aio_iocb_buffer_unmap(ctx); + return ret; +} + SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, iocbs, void __user *, user1, void __user *, user2, aio_context_t __user *, ctxp) @@ -1682,7 +1808,8 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, if (user1 || user2) return -EINVAL; - if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL)) + if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL | + IOCTX_FLAG_FIXEDBUFS)) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1698,6 +1825,15 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *, ret = aio_useriocb_map(ioctx, iocbs); if (ret) goto err; + if (flags & IOCTX_FLAG_FIXEDBUFS) { + ret = aio_iocb_buffer_map(ioctx); + if (ret) + goto err; + } + } else if (flags & IOCTX_FLAG_FIXEDBUFS) { + /* can only support fixed bufs with user mapped iocbs */ + ret = -EINVAL; + goto err; } ret = put_user(ioctx->user_id, ctxp); @@ -1975,23 +2111,39 @@ static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb, return ret; } -static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec, - bool vectored, bool compat, struct iov_iter *iter) +static int aio_setup_rw(int rw, struct aio_kiocb *kiocb, + const struct iocb *iocb, struct iovec **iovec, bool vectored, + bool compat, bool kaddr, struct iov_iter *iter) { - void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf; + void __user *ubuf = (void __user *)(uintptr_t)iocb->aio_buf; size_t len = iocb->aio_nbytes; if (!vectored) { - ssize_t ret = import_single_range(rw, buf, len, *iovec, iter); + ssize_t ret; + + if (!kaddr) { + ret = import_single_range(rw, ubuf, len, *iovec, iter); + } else { + long index = (long) kiocb->ki_user_iocb; + struct aio_mapped_ubuf *amu; + + /* __io_submit_one() already validated the index */ + amu = &kiocb->ki_ctx->user_bufs[index]; + iov_iter_bvec(iter, rw, amu->bvec, amu->nr_bvecs, len); + ret = 0; + + } *iovec = NULL; 
return ret; } + if (kaddr) + return -EINVAL; #ifdef CONFIG_COMPAT if (compat) - return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec, + return compat_import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec, iter); #endif - return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter); + return import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec, iter); } static inline void aio_rw_done(struct kiocb *req, ssize_t ret) @@ -2064,7 +2216,7 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state, static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, struct aio_submit_state *state, bool vectored, - bool compat) + bool compat, bool kaddr) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -2084,9 +2236,11 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, if (unlikely(!file->f_op->read_iter)) goto out_fput; - ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter); + ret = aio_setup_rw(READ, kiocb, iocb, &iovec, vectored, compat, kaddr, + &iter); if (ret) goto out_fput; + ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter)); if (!ret) aio_rw_done(req, call_read_iter(file, req, &iter)); @@ -2099,7 +2253,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, struct aio_submit_state *state, bool vectored, - bool compat) + bool compat, bool kaddr) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -2119,7 +2273,8 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, if (unlikely(!file->f_op->write_iter)) goto out_fput; - ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter); + ret = aio_setup_rw(WRITE, kiocb, iocb, &iovec, vectored, compat, kaddr, + &iter); if (ret) goto out_fput; ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter)); @@ -2358,7 +2513,8 @@ static 
ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb) static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, struct iocb __user *user_iocb, - struct aio_submit_state *state, bool compat) + struct aio_submit_state *state, bool compat, + bool kaddr) { struct aio_kiocb *req; ssize_t ret; @@ -2418,16 +2574,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EINVAL; switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(req, iocb, state, false, compat); + ret = aio_read(req, iocb, state, false, compat, kaddr); break; case IOCB_CMD_PWRITE: - ret = aio_write(req, iocb, state, false, compat); + ret = aio_write(req, iocb, state, false, compat, kaddr); break; case IOCB_CMD_PREADV: - ret = aio_read(req, iocb, state, true, compat); + ret = aio_read(req, iocb, state, true, compat, kaddr); break; case IOCB_CMD_PWRITEV: - ret = aio_write(req, iocb, state, true, compat); + ret = aio_write(req, iocb, state, true, compat, kaddr); break; case IOCB_CMD_FSYNC: if (ctx->flags & IOCTX_FLAG_IOPOLL) @@ -2479,6 +2635,7 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, struct aio_submit_state *state, bool compat) { struct iocb iocb, *iocbp; + bool kaddr; if (ctx->flags & IOCTX_FLAG_USERIOCB) { unsigned long iocb_index = (unsigned long) user_iocb; @@ -2486,14 +2643,16 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, if (iocb_index >= ctx->max_reqs) return -EINVAL; + kaddr = (ctx->flags & IOCTX_FLAG_FIXEDBUFS) != 0; iocbp = aio_iocb_from_index(ctx, iocb_index); } else { if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) return -EFAULT; + kaddr = false; iocbp = &iocb; } - return __io_submit_one(ctx, iocbp, user_iocb, state, compat); + return __io_submit_one(ctx, iocbp, user_iocb, state, compat, kaddr); } #ifdef CONFIG_BLOCK diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index ea0b9a19f4df..05d72cf86bd3 100644 --- 
a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -110,6 +110,7 @@ struct iocb { #define IOCTX_FLAG_USERIOCB (1 << 0) /* iocbs are user mapped */ #define IOCTX_FLAG_IOPOLL (1 << 1) /* io_context is polled */ +#define IOCTX_FLAG_FIXEDBUFS (1 << 2) /* IO buffers are fixed */ #undef IFBIG #undef IFLITTLE From patchwork Tue Dec 4 23:37:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712809 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6CBF1109C for ; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5C27829D8E for ; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4F6C929D16; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA14229489 for ; Tue, 4 Dec 2018 23:38:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726728AbeLDXif (ORCPT ); Tue, 4 Dec 2018 18:38:35 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:41293 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726717AbeLDXif (ORCPT ); Tue, 4 Dec 2018 18:38:35 -0500 Received: by mail-pl1-f196.google.com with SMTP id u6so9065896plm.8 for ; Tue, 04 Dec 2018 15:38:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=wbgdkuX/L3fLyZ228OAfSCXQBLVO7eMAEh2tepSbBGU=; b=bBXaskkDSqkQosN6wAk5w9WEYOsU3jrcxjyNCFdyCv8VLVsw1ua0z7El8d/AxvMACx TJOn/uN82PHV+BPLmdhmj8e5c43c5cYCKX9ALTmHqKe0ROvL+e+K9KjgCL4JISUx5Qi5 Y4zOz47efWiMCDQ2qfwPmvtQoLbPNn5pD74t3t/K+B+tVdEUKSay5H4Y8VbU9vIf2YwB 8bl3w4d/xkSG4gC/F+gysHcf7MoZig950C+DBUe1bm9jG2f80Fw1Y7sIeFgtzx1hI831 a03Qx55MDKWSHb/8ivcvx15gVrcDaZxtYUqUTrfNqd8qC9Gz5fHxOcvO87sYxzrsM4Sy OJKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wbgdkuX/L3fLyZ228OAfSCXQBLVO7eMAEh2tepSbBGU=; b=MZjn2TUzvvzGAhDI9bksUFKLpLCfvBHlTSA4oDLybekTz92iU0l8QmcTsQDFQaDvYm uQdt9exhqyzYdrYR9KsqTqwNu62jLnsWme40D3m7p9SaDjp+bdj5icb+M/yHKPZp+tos jKTUFsW6bd1qOzLdOPKevcPQYA7fw5CPzxXeIDv0q8yiMvRPCHXnmV24apc+QA2krWoP 7bw2pP+JG3m+U1YRLo64wKP5IotmdMn23F8hYZT8jxocIUtdQk+/6xF2hdjWF5VmjW/k Up90L1ha3UigbKpY0zxHSSD9Mi+3zonm+vO7zubWDG9iWN+OiSdIySrZC1TMQuc5Ks1B nBDg== X-Gm-Message-State: AA+aEWbcMjQ/mXJpMusMQM2/bKdtEnm5nnNV+j4jIvI9tScR6idZgnMc 4z09Z97R+5ifsKp26YOsiGcWj7lUUoM= X-Google-Smtp-Source: AFSGD/UOifFDnvRfNi0SxdjVBi403RuKY1dM3CB2s5Z6LuBFqwwTeNQ6AkiKKoUINkT/fAEXojQ8dw== X-Received: by 2002:a17:902:541:: with SMTP id 59mr22450519plf.88.1543966714050; Tue, 04 Dec 2018 15:38:34 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.32 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:33 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 25/26] aio: split old ring complete out from aio_complete() Date: Tue, 4 Dec 2018 16:37:28 -0700 Message-Id: <20181204233729.26776-26-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Jens Axboe --- fs/aio.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 1c8a8bb631a9..39aaffd6d436 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1218,12 +1218,9 @@ static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, ev->res2 = res2; } -/* aio_complete - * Called when the io request on the given iocb is complete. - */ -static void aio_complete(struct aio_kiocb *iocb, long res, long res2) +static void aio_ring_complete(struct kioctx *ctx, struct aio_kiocb *iocb, + long res, long res2) { - struct kioctx *ctx = iocb->ki_ctx; struct aio_ring *ring; struct io_event *ev_page, *event; unsigned tail, pos, head; @@ -1273,6 +1270,16 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2) spin_unlock_irqrestore(&ctx->completion_lock, flags); pr_debug("added to ring %p at [%u]\n", iocb, tail); +} + +/* aio_complete + * Called when the io request on the given iocb is complete. 
+ */ +static void aio_complete(struct aio_kiocb *iocb, long res, long res2) +{ + struct kioctx *ctx = iocb->ki_ctx; + + aio_ring_complete(ctx, iocb, res, res2); /* * Check if the user asked us to deliver the result through an From patchwork Tue Dec 4 23:37:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10712819 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C358C109C for ; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B6E8928A5E for ; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2556293BD; Tue, 4 Dec 2018 23:38:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A9C712A1E5 for ; Tue, 4 Dec 2018 23:38:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726717AbeLDXii (ORCPT ); Tue, 4 Dec 2018 18:38:38 -0500 Received: from mail-pg1-f193.google.com ([209.85.215.193]:38736 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726720AbeLDXih (ORCPT ); Tue, 4 Dec 2018 18:38:37 -0500 Received: by mail-pg1-f193.google.com with SMTP id g189so8095340pgc.5 for ; Tue, 04 Dec 2018 15:38:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
bh=wT27u5aIuXpgF5eCmNuXfKGXBPGExXzVdUoLndFGu/Q=; b=Nxbz/3VguBRqt7vwugJEYYgycRFpY1LO3uf3/BW/7Q3EGa1tMtcRub8tKAxp0TlbuR LXsb64g9QyHsuXzr+/U5PfPzm7Y1kwIh1BuzB+Bdn3s0BsDCwQOP2D/NI2r3Gi60F84u KinkINZP0IZ4g/8NFguBkEol/+4lnRsPQLGNAgxt7bsXMS6gnmtnP6oqsZgyca2o6EhL sG4brFTLdnYBtktociWcGyUZWYzOpiNArwRyv7uY7XKKzb5PLOKoqkf/wr7rhr0L5Xbk 6z1iBuZBFdEz71IJyFhsPgbANvEilonITK9sxgazYrZkzuGR7ou6fAr/uNY8wLGaqbY1 SSsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wT27u5aIuXpgF5eCmNuXfKGXBPGExXzVdUoLndFGu/Q=; b=unH9FnMFKef6SK/X5EftpJ5HuSE+JOvc8pta88TQ/fWNaU5zvVImwdd/ne2zWgBR67 p2kQZ/qZu2BntMbIxxi2yByW1AxCmgWBAn8cQKEjmt1eWwKCEIaWfRovz3sjVDjkJyzu y/87GGnlfAfcEZbzWYsekUT5IBw/Hm4vdzSe/cKuTd3o6fLe7PcTL6zA2W0oLProHMOe DTeVv12PiN6kU/lxFAofpNHaI+5dK494VNR8U//d5Jk/ELHk1BsUZS4OOWMVLsHP/SgJ 1/y+e7/TT+2ZLuA9BA9ntDWuv47wYsEGT/6zYaPzqjCoFwHpYmM1mT7bFrOhje7B4Nvf +j0A== X-Gm-Message-State: AA+aEWZum8CziN4NWsppP1IHcY3fox2KJGV2jkiBMv6GShXJGCkXgK6l ygBwmC2DlC/en6CiuSdyGub0ObdWKWg= X-Google-Smtp-Source: AFSGD/VNyff7N//ntYx/471IgRWzaIZ+9qRAQtHGqQByKO2/h+QmwzCBqW8K1ccCtOReCqhnPVefpQ== X-Received: by 2002:a62:d148:: with SMTP id t8mr22637590pfl.52.1543966715906; Tue, 04 Dec 2018 15:38:35 -0800 (PST) Received: from x1.localdomain (66.29.188.166.static.utbb.net. 
[66.29.188.166]) by smtp.gmail.com with ESMTPSA id t13sm22527635pgr.42.2018.12.04.15.38.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Dec 2018 15:38:35 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, jmoyer@redhat.com, Jens Axboe Subject: [PATCH 26/26] aio: add support for submission/completion rings Date: Tue, 4 Dec 2018 16:37:29 -0700 Message-Id: <20181204233729.26776-27-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181204233729.26776-1-axboe@kernel.dk> References: <20181204233729.26776-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Experimental support for submitting and completing IO through rings shared between the application and kernel. The submission rings are struct iocb, like we would submit through io_submit(), and the completion rings are struct io_event, like we would pass in (and copy back) from io_getevents(). A new system call is added for this, io_ring_enter(). This system call submits IO that is queued in the SQ ring, and/or completes IO and stores the results in the CQ ring. This could be augmented with a kernel thread that does the submission and polling, then the application would never have to enter the kernel to do IO. 
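To make the head/tail bookkeeping of such a shared ring concrete, here is a simplified single-threaded sketch of the index handling. The struct sketch_ring layout and the sketch_ring_*() names are hypothetical and are not the ABI added by this patch. A ring that is really shared between the application and the kernel also needs memory barriers around the head and tail accesses, which the sketch leaves out.

```c
#include <stdint.h>

/* Hypothetical ring header for illustration; not the layout from this patch. */
struct sketch_ring {
	uint32_t head;		/* next entry the consumer will read */
	uint32_t tail;		/* next entry the producer will fill */
	uint32_t nr_events;	/* number of slots in the ring */
};

/*
 * Producer side: return the slot to fill and store the tail value to
 * publish afterwards in *next_tail. Returns -1 when the ring is full.
 * One slot is always left unused so that head == tail means "empty".
 */
static int sketch_ring_reserve(const struct sketch_ring *r, uint32_t *next_tail)
{
	uint32_t next = r->tail + 1;

	if (next == r->nr_events)
		next = 0;
	if (next == r->head)
		return -1;

	*next_tail = next;
	return (int)r->tail;
}

/* Producer side: publish the filled entry by moving the tail forward. */
static void sketch_ring_commit(struct sketch_ring *r, uint32_t next_tail)
{
	r->tail = next_tail;
}

/* Consumer side: return the next filled slot, or -1 when the ring is empty. */
static int sketch_ring_consume(struct sketch_ring *r)
{
	uint32_t slot = r->head;

	if (slot == r->tail)
		return -1;

	r->head = (slot + 1 == r->nr_events) ? 0 : slot + 1;
	return (int)slot;
}
```

With nr_events set to 4, three entries can be reserved and committed before sketch_ring_reserve() reports the ring as full; consuming one entry frees a slot again. For the SQ ring the application is the producer and the kernel the consumer, and for the CQ ring the roles are reversed.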
Sample application: http://brick.kernel.dk/snaps/aio-ring.c Signed-off-by: Jens Axboe --- arch/x86/entry/syscalls/syscall_64.tbl | 1 + fs/aio.c | 312 +++++++++++++++++++++++-- include/linux/syscalls.h | 4 +- include/uapi/linux/aio_abi.h | 26 +++ 4 files changed, 323 insertions(+), 20 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index 67c357225fb0..55a26700a637 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -344,6 +344,7 @@ 333 common io_pgetevents __x64_sys_io_pgetevents 334 common rseq __x64_sys_rseq 335 common io_setup2 __x64_sys_io_setup2 +336 common io_ring_enter __x64_sys_io_ring_enter # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/fs/aio.c b/fs/aio.c index 39aaffd6d436..6024c6943d7d 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -142,6 +142,10 @@ struct kioctx { struct aio_mapped_range iocb_range; + /* if used, completion and submission rings */ + struct aio_mapped_range sq_ring; + struct aio_mapped_range cq_ring; + struct rcu_work free_rwork; /* see free_ioctx() */ /* @@ -297,6 +301,8 @@ static const struct address_space_operations aio_ctx_aops; static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); +static const unsigned int event_page_shift = + ilog2(PAGE_SIZE / sizeof(struct io_event)); /* * We rely on block level unplugs to flush pending requests, if we schedule @@ -307,6 +313,7 @@ static const bool aio_use_state_req_list = true; static const bool aio_use_state_req_list = false; #endif +static void aio_scqring_unmap(struct kioctx *); static void aio_useriocb_unmap(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); static void aio_iocb_buffer_unmap(struct kioctx *); @@ -673,6 +680,7 @@ static void free_ioctx(struct work_struct *work) aio_iocb_buffer_unmap(ctx); aio_useriocb_unmap(ctx); + aio_scqring_unmap(ctx); aio_free_ring(ctx); free_percpu(ctx->cpu); 
 	percpu_ref_exit(&ctx->reqs);
@@ -1218,6 +1226,47 @@ static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
 	ev->res2 = res2;
 }
 
+static struct io_event *__aio_get_cqring_ev(struct aio_io_event_ring *ring,
+					    struct aio_mapped_range *range,
+					    unsigned *next_tail)
+{
+	struct io_event *ev;
+	unsigned tail;
+
+	smp_rmb();
+	tail = READ_ONCE(ring->tail);
+	*next_tail = tail + 1;
+	if (*next_tail == ring->nr_events)
+		*next_tail = 0;
+	if (*next_tail == READ_ONCE(ring->head))
+		return NULL;
+
+	/* io_event array starts offset one into the mapped range */
+	tail++;
+	ev = page_address(range->pages[tail >> event_page_shift]);
+	tail &= ((1 << event_page_shift) - 1);
+	return ev + tail;
+}
+
+static void aio_commit_cqring(struct kioctx *ctx, unsigned next_tail)
+{
+	struct aio_io_event_ring *ring;
+
+	ring = page_address(ctx->cq_ring.pages[0]);
+	if (next_tail != ring->tail) {
+		ring->tail = next_tail;
+		smp_wmb();
+	}
+}
+
+static struct io_event *aio_peek_cqring(struct kioctx *ctx, unsigned *ntail)
+{
+	struct aio_io_event_ring *ring;
+
+	ring = page_address(ctx->cq_ring.pages[0]);
+	return __aio_get_cqring_ev(ring, &ctx->cq_ring, ntail);
+}
+
 static void aio_ring_complete(struct kioctx *ctx, struct aio_kiocb *iocb,
 			      long res, long res2)
 {
@@ -1279,7 +1328,17 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
 {
 	struct kioctx *ctx = iocb->ki_ctx;
 
-	aio_ring_complete(ctx, iocb, res, res2);
+	if (ctx->flags & IOCTX_FLAG_SCQRING) {
+		struct io_event *ev;
+		unsigned int tail;
+
+		/* Can't fail, we have a ring reservation */
+		ev = aio_peek_cqring(ctx, &tail);
+		aio_fill_event(ev, iocb, res, res2);
+		aio_commit_cqring(ctx, tail);
+	} else {
+		aio_ring_complete(ctx, iocb, res, res2);
+	}
 
 	/*
 	 * Check if the user asked us to deliver the result through an
@@ -1421,6 +1480,9 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 		return 0;
 
 	list_for_each_entry_safe(iocb, n, &ctx->poll_completing, ki_list) {
+		struct io_event *ev = NULL;
+		unsigned int next_tail;
+
 		if (*nr_events == max)
 			break;
 		if (!test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags))
@@ -1428,6 +1490,14 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 		if (to_free == AIO_IOPOLL_BATCH)
 			iocb_put_many(ctx, iocbs, &to_free);
 
+		/* Will only happen if the application over-commits */
+		ret = -EAGAIN;
+		if (ctx->flags & IOCTX_FLAG_SCQRING) {
+			ev = aio_peek_cqring(ctx, &next_tail);
+			if (!ev)
+				break;
+		}
+
 		list_del(&iocb->ki_list);
 		iocbs[to_free++] = iocb;
 
@@ -1446,8 +1516,11 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 			file_count = 1;
 		}
 
-		if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev,
-		    sizeof(iocb->ki_ev))) {
+		if (ev) {
+			memcpy(ev, &iocb->ki_ev, sizeof(*ev));
+			aio_commit_cqring(ctx, next_tail);
+		} else if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev,
+				sizeof(iocb->ki_ev))) {
 			ret = -EFAULT;
 			break;
 		}
@@ -1625,15 +1698,42 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr,
 	return ret;
 }
 
-static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index)
+static struct iocb *__aio_sqring_from_index(struct aio_iocb_ring *ring,
+					    struct aio_mapped_range *range,
+					    int index)
 {
 	struct iocb *iocb;
 
-	iocb = page_address(ctx->iocb_range.pages[index >> iocb_page_shift]);
+	/* iocb array starts offset one into the mapped range */
+	index++;
+	iocb = page_address(range->pages[index >> iocb_page_shift]);
 	index &= ((1 << iocb_page_shift) - 1);
 	return iocb + index;
 }
 
+static struct iocb *aio_sqring_from_index(struct kioctx *ctx, int index)
+{
+	struct aio_iocb_ring *ring;
+
+	ring = page_address(ctx->sq_ring.pages[0]);
+	return __aio_sqring_from_index(ring, &ctx->sq_ring, index);
+}
+
+static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index)
+{
+	struct iocb *iocb;
+
+	if (ctx->flags & IOCTX_FLAG_SCQRING) {
+		iocb = aio_sqring_from_index(ctx, index);
+	} else {
+		iocb = page_address(ctx->iocb_range.pages[index >> iocb_page_shift]);
+		index &= ((1 << iocb_page_shift) - 1);
+		iocb += index;
+	}
+
+	return iocb;
+}
+
 static void aio_unmap_range(struct aio_mapped_range *range)
 {
 	int i;
@@ -1689,6 +1789,43 @@ static int aio_useriocb_map(struct kioctx *ctx, struct iocb __user *iocbs)
 	return aio_map_range(&ctx->iocb_range, iocbs, size, 0);
 }
 
+static void aio_scqring_unmap(struct kioctx *ctx)
+{
+	aio_unmap_range(&ctx->sq_ring);
+	aio_unmap_range(&ctx->cq_ring);
+}
+
+static int aio_scqring_map(struct kioctx *ctx,
+			   struct aio_iocb_ring __user *sq_ring,
+			   struct aio_io_event_ring __user *cq_ring)
+{
+	struct aio_iocb_ring *ksq_ring;
+	struct aio_io_event_ring *kcq_ring;
+	size_t size;
+	int ret;
+
+	size = (1 + ctx->max_reqs) * sizeof(struct iocb);
+	ret = aio_map_range(&ctx->sq_ring, sq_ring, size, 0);
+	if (ret)
+		return ret;
+
+	size = (1 + ctx->max_reqs) * sizeof(struct io_event);
+	ret = aio_map_range(&ctx->cq_ring, cq_ring, size, FOLL_WRITE);
+	if (ret) {
+		aio_unmap_range(&ctx->sq_ring);
+		return ret;
+	}
+
+	ksq_ring = page_address(ctx->sq_ring.pages[0]);
+	ksq_ring->nr_events = ctx->max_reqs;
+	ksq_ring->head = ksq_ring->tail = 0;
+
+	kcq_ring = page_address(ctx->cq_ring.pages[0]);
+	kcq_ring->nr_events = ctx->max_reqs;
+	kcq_ring->head = kcq_ring->tail = 0;
+	return 0;
+}
+
 static void aio_iocb_buffer_unmap(struct kioctx *ctx)
 {
 	int i, j;
@@ -1805,18 +1942,18 @@ static int aio_iocb_buffer_map(struct kioctx *ctx)
 	return ret;
 }
 
-SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *,
-		iocbs, void __user *, user1, void __user *, user2,
+SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags,
+		struct iocb __user *, iocbs,
+		struct aio_iocb_ring __user *, sq_ring,
+		struct aio_io_event_ring __user *, cq_ring,
 		aio_context_t __user *, ctxp)
 {
 	struct kioctx *ioctx;
 	unsigned long ctx;
 	long ret;
 
-	if (user1 || user2)
-		return -EINVAL;
 	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL |
-		      IOCTX_FLAG_FIXEDBUFS))
+		      IOCTX_FLAG_FIXEDBUFS | IOCTX_FLAG_SCQRING))
 		return -EINVAL;
 
 	ret = get_user(ctx, ctxp);
@@ -1829,18 +1966,26 @@ SYSCALL_DEFINE6(io_setup2, u32, nr_events, u32, flags, struct iocb __user *,
 		goto out;
 
 	if (flags & IOCTX_FLAG_USERIOCB) {
+		ret = -EINVAL;
+		if (flags & IOCTX_FLAG_SCQRING)
+			goto err;
+
 		ret = aio_useriocb_map(ioctx, iocbs);
 		if (ret)
 			goto err;
-		if (flags & IOCTX_FLAG_FIXEDBUFS) {
-			ret = aio_iocb_buffer_map(ioctx);
-			if (ret)
-				goto err;
-		}
-	} else if (flags & IOCTX_FLAG_FIXEDBUFS) {
-		/* can only support fixed bufs with user mapped iocbs */
+	}
+	if (flags & IOCTX_FLAG_SCQRING) {
+		ret = aio_scqring_map(ioctx, sq_ring, cq_ring);
+		if (ret)
+			goto err;
+	}
+	if (flags & IOCTX_FLAG_FIXEDBUFS) {
 		ret = -EINVAL;
-		goto err;
+		if (!(flags & (IOCTX_FLAG_USERIOCB | IOCTX_FLAG_SCQRING)))
+			goto err;
+		ret = aio_iocb_buffer_map(ioctx);
+		if (ret)
+			goto err;
 	}
 
 	ret = put_user(ioctx->user_id, ctxp);
@@ -2706,6 +2851,128 @@ static void aio_submit_state_start(struct aio_submit_state *state,
 #endif
 }
 
+static struct iocb *__aio_get_sqring(struct aio_iocb_ring *ring,
+				     struct aio_mapped_range *range,
+				     unsigned *next_head)
+{
+	unsigned head;
+
+	smp_rmb();
+	head = READ_ONCE(ring->head);
+	if (head == READ_ONCE(ring->tail))
+		return NULL;
+
+	*next_head = head + 1;
+	if (*next_head == ring->nr_events)
+		*next_head = 0;
+
+	return __aio_sqring_from_index(ring, range, head);
+}
+
+static void aio_commit_sqring(struct kioctx *ctx, unsigned next_head)
+{
+	struct aio_iocb_ring *ring;
+
+	ring = page_address(ctx->sq_ring.pages[0]);
+	if (ring->head != next_head) {
+		ring->head = next_head;
+		smp_wmb();
+	}
+}
+
+static const struct iocb *aio_peek_sqring(struct kioctx *ctx, unsigned *nhead)
+{
+	struct aio_iocb_ring *ring;
+
+	ring = page_address(ctx->sq_ring.pages[0]);
+	return __aio_get_sqring(ring, &ctx->sq_ring, nhead);
+}
+
+static int aio_ring_submit(struct kioctx *ctx, unsigned int to_submit)
+{
+	struct aio_submit_state state, *statep = NULL;
+	int i, ret = 0, submit = 0;
+
+	if (to_submit > AIO_PLUG_THRESHOLD) {
+		aio_submit_state_start(&state, ctx, to_submit);
+		statep = &state;
+	}
+
+	for (i = 0; i < to_submit; i++) {
+		const struct iocb *iocb;
+		unsigned int next_head;
+
+		iocb = aio_peek_sqring(ctx, &next_head);
+		if (!iocb)
+			break;
+
+		ret = __io_submit_one(ctx, iocb, NULL, NULL, false, true);
+		if (ret)
+			break;
+
+		submit++;
+		aio_commit_sqring(ctx, next_head);
+	}
+
+	if (statep)
+		aio_submit_state_end(statep);
+
+	return submit ? submit : ret;
+}
+
+static int __io_ring_enter(struct kioctx *ctx, unsigned int to_submit,
+			   unsigned int min_complete, unsigned int flags)
+{
+	int ret = 0;
+
+	if (flags & IORING_FLAG_SUBMIT) {
+		ret = aio_ring_submit(ctx, to_submit);
+		if (ret < 0)
+			return ret;
+	}
+	if (flags & IORING_FLAG_GETEVENTS) {
+		unsigned int nr_events = 0;
+		int get_ret;
+
+		get_ret = __aio_iopoll_check(ctx, NULL, &nr_events,
+						min_complete, -1U);
+		if (get_ret < 0 && !ret)
+			ret = get_ret;
+	}
+
+	return ret;
+}
+
+SYSCALL_DEFINE4(io_ring_enter, aio_context_t, ctx_id, u32, to_submit,
+		u32, min_complete, u32, flags)
+{
+	struct kioctx *ctx;
+	long ret;
+
+	BUILD_BUG_ON(sizeof(struct aio_iocb_ring) != sizeof(struct iocb));
+	BUILD_BUG_ON(sizeof(struct aio_io_event_ring) !=
+			sizeof(struct io_event));
+
+	ctx = lookup_ioctx(ctx_id);
+	if (!ctx) {
+		pr_debug("EINVAL: invalid context id\n");
+		return -EINVAL;
+	}
+
+	ret = -EBUSY;
+	if (!mutex_trylock(&ctx->getevents_lock))
+		goto err;
+
+	ret = -EINVAL;
+	if (ctx->flags & IOCTX_FLAG_SCQRING)
+		ret = __io_ring_enter(ctx, to_submit, min_complete, flags);
+
+	mutex_unlock(&ctx->getevents_lock);
+err:
+	percpu_ref_put(&ctx->users);
+	return ret;
+}
+
 /* sys_io_submit:
  *	Queue the nr iocbs pointed to by iocbpp for processing.  Returns
  *	the number of iocbs queued.  May return -EINVAL if the aio_context
@@ -2735,6 +3002,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		return -EINVAL;
 	}
 
+	/* SCQRING must use io_ring_enter() */
+	if (ctx->flags & IOCTX_FLAG_SCQRING)
+		return -EINVAL;
+
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
@@ -2886,7 +3157,10 @@ static long do_io_getevents(aio_context_t ctx_id,
 	long ret = -EINVAL;
 
 	if (likely(ioctx)) {
-		if (likely(min_nr <= nr && min_nr >= 0)) {
+		/* SCQRING must use io_ring_enter() */
+		if (ioctx->flags & IOCTX_FLAG_SCQRING)
+			ret = -EINVAL;
+		else if (min_nr <= nr && min_nr >= 0) {
 			if (ioctx->flags & IOCTX_FLAG_IOPOLL)
 				ret = aio_iopoll_check(ioctx, min_nr, nr, events);
 			else
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a20a663d583f..576725d00020 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -288,8 +288,10 @@ static inline void addr_limit_user_check(void)
 #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
 asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);
 asmlinkage long sys_io_setup2(unsigned, unsigned, struct iocb __user *,
-				void __user *, void __user *,
+				struct aio_iocb_ring __user *,
+				struct aio_io_event_ring __user *,
 				aio_context_t __user *);
+asmlinkage long sys_io_ring_enter(aio_context_t, unsigned, unsigned, unsigned);
 asmlinkage long sys_io_destroy(aio_context_t ctx);
 asmlinkage long sys_io_submit(aio_context_t, long,
 			struct iocb __user * __user *);
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index 05d72cf86bd3..9fb7d0ec868f 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -111,6 +111,32 @@ struct iocb {
 #define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
 #define IOCTX_FLAG_IOPOLL	(1 << 1)	/* io_context is polled */
 #define IOCTX_FLAG_FIXEDBUFS	(1 << 2)	/* IO buffers are fixed */
+#define IOCTX_FLAG_SCQRING	(1 << 3)	/* Use SQ/CQ rings */
+
+struct aio_iocb_ring {
+	union {
+		struct {
+			u32 head, tail;
+			u32 nr_events;
+		};
+		struct iocb pad_iocb;
+	};
+	struct iocb iocbs[0];
+};
+
+struct aio_io_event_ring {
+	union {
+		struct {
+			u32 head, tail;
+			u32 nr_events;
+		};
+		struct io_event pad_event;
+	};
+	struct io_event events[0];
+};
+
+#define IORING_FLAG_SUBMIT	(1 << 0)
+#define IORING_FLAG_GETEVENTS	(1 << 1)
 
 #undef IFBIG
 #undef IFLITTLE