From patchwork Tue Nov 20 17:19:46 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690907
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/8] fs: add file_operations ->iopoll() handler
Date: Tue, 20 Nov 2018 10:19:46 -0700
Message-Id: <20181120171953.1258-2-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

In preparation for adding async aio iopolling.
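
For illustration only, and not part of this patch: a minimal userspace
sketch of how such a hook might be consumed. It assumes the hook is
optional, that it reports how many completions it found, and that a
request with no private state yet has nothing to poll. Every name below
(mock_kiocb, mock_file_ops, mock_iopoll, poll_kiocb) is a hypothetical
stand-in, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kiocb; not the kernel type. */
struct mock_kiocb {
	int pending;	/* completions waiting to be reaped */
	void *private;	/* per-request state, NULL until submitted */
};

/* Hypothetical ops table with an optional iopoll hook. */
struct mock_file_ops {
	int (*iopoll)(struct mock_kiocb *kiocb, bool spin);
};

/* Reap whatever has completed; 'spin' is ignored in this mock. */
int mock_iopoll(struct mock_kiocb *kiocb, bool spin)
{
	int found;

	(void)spin;
	if (!kiocb->private)	/* nothing submitted yet */
		return 0;
	found = kiocb->pending;
	kiocb->pending = 0;
	return found;
}

/* Callers must tolerate files that do not provide the hook. */
int poll_kiocb(const struct mock_file_ops *ops,
	       struct mock_kiocb *kiocb, bool spin)
{
	if (!ops->iopoll)
		return 0;
	return ops->iopoll(kiocb, spin);
}
```

In this sketch a caller would invoke poll_kiocb() repeatedly until it
returns non-zero; whether the real callers spin or sleep between
attempts is not something this patch defines.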

Signed-off-by: Jens Axboe
---
 include/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index a1ab233e6469..9109c287437a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1781,6 +1781,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+	int (*iopoll)(struct kiocb *kiocb, bool spin);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);

From patchwork Tue Nov 20 17:19:47 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690911
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/8] block: wire up block device ->iopoll()
Date: Tue, 20 Nov 2018 10:19:47 -0700
Message-Id: <20181120171953.1258-3-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

Signed-off-by: Jens Axboe
---
 fs/block_dev.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index d233a59ea364..711cd5a3469e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -273,6 +273,7 @@ struct blkdev_dio {
 	};
 	size_t			size;
 	atomic_t		ref;
+	blk_qc_t		qc;
 	bool			multi_bio : 1;
 	bool			should_dirty : 1;
 	bool			is_sync : 1;
@@ -281,6 +282,21 @@ struct blkdev_dio {
 
 static struct bio_set blkdev_dio_pool;
 
+static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
+{
+	struct blkdev_dio *dio = READ_ONCE(kiocb->private);
+
+	/* dio can be NULL here, if the IO hasn't been submitted yet */
+	if (dio) {
+		struct block_device *bdev;
+
+		bdev = I_BDEV(kiocb->ki_filp->f_mapping->host);
+		return blk_poll(bdev_get_queue(bdev), READ_ONCE(dio->qc), wait);
+	}
+
+	return 0;
+}
+
 static void blkdev_bio_end_io(struct bio *bio)
 {
 	struct blkdev_dio *dio = bio->bi_private;
@@ -335,7 +351,6 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 	bool is_poll = (iocb->ki_flags & IOCB_HIPRI) != 0;
 	bool is_read = (iov_iter_rw(iter) == READ), is_sync;
 	loff_t pos = iocb->ki_pos;
-	blk_qc_t qc = BLK_QC_T_NONE;
 	int ret = 0;
 
 	if ((pos | iov_iter_alignment(iter)) &
@@ -355,6 +370,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 	dio->size = 0;
 	dio->multi_bio = false;
 	dio->should_dirty = is_read && iter_is_iovec(iter);
+	dio->qc = BLK_QC_T_NONE;
+
+	WRITE_ONCE(iocb->private, dio);
 
 	/*
 	 * Don't plug for HIPRI/polled IO, as those should go straight
@@ -395,7 +413,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 			if (iocb->ki_flags & IOCB_HIPRI)
 				bio->bi_opf |= REQ_HIPRI;
 
-			qc = submit_bio(bio);
+			dio->qc = submit_bio(bio);
 			break;
 		}
 
@@ -423,7 +441,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 			break;
 
 		if (!(iocb->ki_flags & IOCB_HIPRI) ||
-		    !blk_poll(bdev_get_queue(bdev), qc, true))
+		    !blk_poll(bdev_get_queue(bdev), dio->qc, true))
 			io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
@@ -2061,6 +2079,7 @@ const struct file_operations def_blk_fops = {
 	.llseek		= block_llseek,
 	.read_iter	= blkdev_read_iter,
 	.write_iter	= blkdev_write_iter,
+	.iopoll		= blkdev_iopoll,
 	.mmap		= generic_file_mmap,
 	.fsync		= blkdev_fsync,
 	.unlocked_ioctl	= block_ioctl,

From patchwork Tue Nov 20 17:19:48 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690921
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/8] iomap/xfs: wire up file_operations ->iopoll()
Date: Tue, 20 Nov 2018 10:19:48 -0700
Message-Id: <20181120171953.1258-4-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

Add an iomap implementation of fops->iopoll() and wire it up for
XFS.

Signed-off-by: Jens Axboe
---
 fs/iomap.c            | 50 +++++++++++++++++++++++++++++--------------
 fs/xfs/xfs_file.c     |  1 +
 include/linux/iomap.h |  1 +
 3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 74c1f37f0fd6..faf96198f99a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1419,14 +1419,14 @@ struct iomap_dio {
 	unsigned		flags;
 	int			error;
 	bool			wait_for_completion;
+	blk_qc_t		cookie;
+	struct request_queue	*last_queue;
 
 	union {
 		/* used during submission and for synchronous completion: */
 		struct {
 			struct iov_iter		*iter;
 			struct task_struct	*waiter;
-			struct request_queue	*last_queue;
-			blk_qc_t		cookie;
 		} submit;
 
 		/* used for aio completion: */
@@ -1436,6 +1436,30 @@ struct iomap_dio {
 	};
 };
 
+int iomap_dio_iopoll(struct kiocb *kiocb, bool spin)
+{
+	struct iomap_dio *dio = kiocb->private;
+	struct request_queue *q = READ_ONCE(dio->last_queue);
+
+	if (!q || dio->cookie == BLK_QC_T_NONE)
+		return 0;
+	return blk_poll(q, READ_ONCE(dio->cookie), spin);
+}
+EXPORT_SYMBOL_GPL(iomap_dio_iopoll);
+
+static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
+		struct bio *bio)
+{
+	atomic_inc(&dio->ref);
+
+	/*
+	 * iomap_dio_iopoll can race with us. A non-zero last_queue marks that
+	 * we are ready to poll.
+	 */
+	WRITE_ONCE(dio->cookie, submit_bio(bio));
+	WRITE_ONCE(dio->last_queue, bdev_get_queue(iomap->bdev));
+}
+
 static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 {
 	struct kiocb *iocb = dio->iocb;
@@ -1548,7 +1572,7 @@ static void iomap_dio_bio_end_io(struct bio *bio)
 	}
 }
 
-static blk_qc_t
+static void
 iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 		unsigned len)
 {
@@ -1568,9 +1592,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
-
-	atomic_inc(&dio->ref);
-	return submit_bio(bio);
+	iomap_dio_submit_bio(dio, iomap, bio);
 }
 
 static loff_t
@@ -1676,11 +1698,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		copied += n;
 
 		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
-
-		atomic_inc(&dio->ref);
-
-		dio->submit.last_queue = bdev_get_queue(iomap->bdev);
-		dio->submit.cookie = submit_bio(bio);
+		iomap_dio_submit_bio(dio, iomap, bio);
 	} while (nr_pages);
 
 	if (need_zeroout) {
@@ -1782,6 +1800,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
 	if (!dio)
 		return -ENOMEM;
+	iocb->private = dio;
 
 	dio->iocb = iocb;
 	atomic_set(&dio->ref, 1);
@@ -1791,11 +1810,11 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	dio->error = 0;
 	dio->flags = 0;
 	dio->wait_for_completion = is_sync_kiocb(iocb);
+	dio->cookie = BLK_QC_T_NONE;
+	dio->last_queue = NULL;
 
 	dio->submit.iter = iter;
 	dio->submit.waiter = current;
-	dio->submit.cookie = BLK_QC_T_NONE;
-	dio->submit.last_queue = NULL;
 
 	if (iov_iter_rw(iter) == READ) {
 		if (pos >= dio->i_size)
@@ -1894,9 +1913,8 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 				break;
 
 			if (!(iocb->ki_flags & IOCB_HIPRI) ||
-			    !dio->submit.last_queue ||
-			    !blk_poll(dio->submit.last_queue,
-					 dio->submit.cookie, true))
+			    !dio->last_queue ||
+			    !blk_poll(dio->last_queue, dio->cookie, true))
 				io_schedule();
 		}
 		__set_current_state(TASK_RUNNING);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 53c9ab8fb777..603e705781a4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1203,6 +1203,7 @@ const struct file_operations xfs_file_operations = {
 	.write_iter	= xfs_file_write_iter,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
+	.iopoll		= iomap_dio_iopoll,
 	.unlocked_ioctl	= xfs_file_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= xfs_file_compat_ioctl,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 9a4258154b25..0fefb5455bda 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -162,6 +162,7 @@ typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
 		unsigned flags);
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
+int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
 
 #ifdef CONFIG_SWAP
 struct file;

From patchwork Tue Nov 20 17:19:49 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690933
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/8] aio: use assigned completion handler
Date: Tue, 20 Nov 2018 10:19:49 -0700
Message-Id: <20181120171953.1258-5-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

We know this is a read/write request, but in preparation for having
different kinds of those, ensure that we call the assigned handler
instead of assuming it's aio_complete_rw().
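
For illustration only, and not part of this patch: a small userspace
sketch of the same idea, where the completion path calls whichever
handler was stored in the request at setup time rather than naming one
directly. The types and functions here (mock_req, complete_rw,
complete_polled, rw_done) are hypothetical and only mirror the shape of
kiocb->ki_complete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request type; only mirrors the idea of kiocb->ki_complete. */
struct mock_req {
	void (*complete)(struct mock_req *req, long ret);
	long result;
	int kind;	/* records which handler ran, for illustration */
};

/* Handler for ordinary read/write requests. */
void complete_rw(struct mock_req *req, long ret)
{
	req->result = ret;
	req->kind = 1;
}

/* Handler for a different (here: polled) kind of request. */
void complete_polled(struct mock_req *req, long ret)
{
	req->result = ret;
	req->kind = 2;
}

/* Finish a request through whatever handler was assigned at setup. */
void rw_done(struct mock_req *req, long ret)
{
	req->complete(req, ret);
}
```

With this shape, adding a second kind of request only requires
assigning a different handler when the request is prepared; rw_done()
itself does not change.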

Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/aio.c b/fs/aio.c
index b984918be4b7..574a14b43037 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1484,7 +1484,7 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 		ret = -EINTR;
 		/*FALLTHRU*/
 	default:
-		aio_complete_rw(req, ret, 0);
+		req->ki_complete(req, ret, 0);
 	}
 }
 

From patchwork Tue Nov 20 17:19:50 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690929
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 5/8] aio: fix failure to put the file pointer
Date: Tue, 20 Nov 2018 10:19:50 -0700
Message-Id: <20181120171953.1258-6-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

If the ioprio capability check fails, we return without putting
the file pointer.

Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
---
 fs/aio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/aio.c b/fs/aio.c
index 574a14b43037..611e9824279d 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 		ret = ioprio_check_cap(iocb->aio_reqprio);
 		if (ret) {
 			pr_debug("aio ioprio check cap error: %d\n", ret);
+			fput(req->ki_filp);
 			return ret;
 		}
 

From patchwork Tue Nov 20 17:19:51 2018
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10690915
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 6/8] aio: add io_setup2() system call
Date: Tue, 20 Nov 2018 10:19:51 -0700
Message-Id: <20181120171953.1258-7-axboe@kernel.dk>
In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk>
References: <20181120171953.1258-1-axboe@kernel.dk>

This is just like io_setup(), except we get to pass in flags for what
kind of behavior we want.

Signed-off-by: Jens Axboe
---
 arch/x86/entry/syscalls/syscall_64.tbl |  1 +
 fs/aio.c                               | 67 +++++++++++++++++---------
 include/linux/syscalls.h               |  2 +
 include/uapi/asm-generic/unistd.h      |  4 +-
 kernel/sys_ni.c                        |  1 +
 5 files changed, 51 insertions(+), 24 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f0b1709a5ffb..67c357225fb0 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -343,6 +343,7 @@
 332	common	statx			__x64_sys_statx
 333	common	io_pgetevents		__x64_sys_io_pgetevents
 334	common	rseq			__x64_sys_rseq
+335	common	io_setup2		__x64_sys_io_setup2
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/aio.c b/fs/aio.c
index 611e9824279d..8453bb849c32 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -683,7 +683,7 @@ static void aio_nr_sub(unsigned nr)
 /* ioctx_alloc
  *	Allocates and initializes an ioctx. Returns an ERR_PTR if it failed.
*/ -static struct kioctx *ioctx_alloc(unsigned nr_events) +static struct kioctx *ioctx_alloc(unsigned nr_events, unsigned int flags) { struct mm_struct *mm = current->mm; struct kioctx *ctx; @@ -1269,6 +1269,44 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, return ret; } +static struct kioctx *io_setup_flags(unsigned long ctx, unsigned int nr_events, + unsigned int flags) +{ + if (unlikely(ctx || nr_events == 0)) { + pr_debug("EINVAL: ctx %lu nr_events %u\n", + ctx, nr_events); + return ERR_PTR(-EINVAL); + } + + return ioctx_alloc(nr_events, flags); +} + +SYSCALL_DEFINE3(io_setup2, u32, nr_events, u32, flags, + aio_context_t __user *, ctxp) +{ + struct kioctx *ioctx; + unsigned long ctx; + long ret; + + if (flags) + return -EINVAL; + + ret = get_user(ctx, ctxp); + if (unlikely(ret)) + goto out; + + ioctx = io_setup_flags(ctx, nr_events, flags); + ret = PTR_ERR(ioctx); + if (!IS_ERR(ioctx)) { + ret = put_user(ioctx->user_id, ctxp); + if (ret) + kill_ioctx(current->mm, ioctx, NULL); + percpu_ref_put(&ioctx->users); + } +out: + return ret; +} + /* sys_io_setup: * Create an aio_context capable of receiving at least nr_events. 
* ctxp must not point to an aio_context that already exists, and @@ -1284,7 +1322,7 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, */ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1292,14 +1330,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { ret = put_user(ioctx->user_id, ctxp); @@ -1307,7 +1338,6 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) kill_ioctx(current->mm, ioctx, NULL); percpu_ref_put(&ioctx->users); } - out: return ret; } @@ -1315,7 +1345,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1323,23 +1353,14 @@ COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { - /* truncating is ok because it's a user address */ - ret = put_user((u32)ioctx->user_id, ctx32p); + ret = put_user(ioctx->user_id, ctx32p); if (ret) kill_ioctx(current->mm, ioctx, NULL); percpu_ref_put(&ioctx->users); } - out: return ret; } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 2ac3d13a915b..455fabc5b0ac 100644 --- a/include/linux/syscalls.h +++ 
b/include/linux/syscalls.h @@ -287,6 +287,8 @@ static inline void addr_limit_user_check(void) */ #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx); +asmlinkage long sys_io_setup2(unsigned nr_reqs, unsigned flags, + aio_context_t __user *ctx); asmlinkage long sys_io_destroy(aio_context_t ctx); asmlinkage long sys_io_submit(aio_context_t, long, struct iocb __user * __user *); diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 538546edbfbd..b4527ed373b0 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -738,9 +738,11 @@ __SYSCALL(__NR_statx, sys_statx) __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents) #define __NR_rseq 293 __SYSCALL(__NR_rseq, sys_rseq) +#define __NR_io_setup2 294 +__SYSCALL(__NR_io_setup2, sys_io_setup2) #undef __NR_syscalls -#define __NR_syscalls 294 +#define __NR_syscalls 295 /* * 32 bit systems traditionally used different diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index df556175be50..17c8b4393669 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -37,6 +37,7 @@ asmlinkage long sys_ni_syscall(void) */ COND_SYSCALL(io_setup); +COND_SYSCALL(io_setup2); COND_SYSCALL_COMPAT(io_setup); COND_SYSCALL(io_destroy); COND_SYSCALL(io_submit); From patchwork Tue Nov 20 17:19:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10690925 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A0541923 for ; Tue, 20 Nov 2018 17:20:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3FC6E2ACDE for ; Tue, 20 Nov 2018 17:20:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 
486) id 345372ACFA; Tue, 20 Nov 2018 17:20:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DD04A2ACF8 for ; Tue, 20 Nov 2018 17:20:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727551AbeKUDux (ORCPT ); Tue, 20 Nov 2018 22:50:53 -0500 Received: from mail-io1-f65.google.com ([209.85.166.65]:39199 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730496AbeKUDuW (ORCPT ); Tue, 20 Nov 2018 22:50:22 -0500 Received: by mail-io1-f65.google.com with SMTP id j18-v6so1946961iog.6 for ; Tue, 20 Nov 2018 09:20:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SzRyk7JKC2mFK9FAb48i6ycj+96baAFMRnokOMy9t/A=; b=hBM2lZ3JbKjtScoijUqwB7qYOwknjgLdksItdazOwqKCGS+PcoeW7YEl5JAdYU8z4D dOEK7KJemA5Ra3tgW3CiE3ZoSmEU9LOa8lwgGnaxTaB+nTOU+RIOnJelr9n5uRS95bWz CyYLZNZjU5cRHdtgGXJPYB3UoC8h+vkoEdaxFHNxS5MplYgwC3RDiEBiYCobkaZFkOaM /mX6QeFBdhTwe+ZtjsIwfDdkkKzYFF8h1WwS9n9Kuo3vQimBTqTKtzgwu1SSqo1T6G/B 2zUj9iylx3S/kTZJqTGCne17F2qW4qQqDunOvH2T0l8tyhAPOskUP8KdONG1DiA1AERq C13g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SzRyk7JKC2mFK9FAb48i6ycj+96baAFMRnokOMy9t/A=; b=MX1/F3aZhb9D4A4xUH6YDAbbJBpoY0Id+ik2blU/QNi1PAaiSS68HLPNFBFKmgNr2n tR85Ot+nTXrBUFMSw63vgx7bNRfZ/JzbkAeSnXIQmVxAOL4dfdMO7BMXl6HABQfju6cG gjFQ5OB6YtSipVpbN5eNp7S2iikH8jRUwFFhWdVB/EQ+F6g2gCdmtMXXtwhXgHZLj1i/ 
UxEwc3O/xv3EHtDoxbSHmJmfV/9ZqdffoBtW+VnCmq2tTOE9PS10Tze2LgQfqrXoTopX j0sjbAUZVj5eFfrW6hbBol8B3B2goeIgLIQwURx+DwaW90rchcoGocZCehy/S1XKt4QJ iT1Q== X-Gm-Message-State: AA+aEWZFjZJ8y8hH3l5xYBER9mw9KsPRUp95R39dTxT27zk5HetkhPJd jHXjlfZf45UPCEuKzlFyp7STe+ozaE4= X-Google-Smtp-Source: AFSGD/WMBIT4wEUJW9TOJqv+md8DeEb1MF6BFadnNBDdXnvfa8PsnqqALQcb46KYqLGhxWFCiAiSHA== X-Received: by 2002:a6b:7019:: with SMTP id l25mr2299637ioc.145.1542734408390; Tue, 20 Nov 2018 09:20:08 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id 186-v6sm15751530itf.11.2018.11.20.09.20.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 20 Nov 2018 09:20:07 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 7/8] aio: separate out ring reservation from req allocation Date: Tue, 20 Nov 2018 10:19:52 -0700 Message-Id: <20181120171953.1258-8-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk> References: <20181120171953.1258-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is in preparation for certain types of IO not needing a ring reservation.
Signed-off-by: Jens Axboe --- fs/aio.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 8453bb849c32..8bbb0b77d9c4 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -901,7 +901,7 @@ static void put_reqs_available(struct kioctx *ctx, unsigned nr) local_irq_restore(flags); } -static bool get_reqs_available(struct kioctx *ctx) +static bool __get_reqs_available(struct kioctx *ctx) { struct kioctx_cpu *kcpu; bool ret = false; @@ -993,6 +993,14 @@ static void user_refill_reqs_available(struct kioctx *ctx) spin_unlock_irq(&ctx->completion_lock); } +static bool get_reqs_available(struct kioctx *ctx) +{ + if (__get_reqs_available(ctx)) + return true; + user_refill_reqs_available(ctx); + return __get_reqs_available(ctx); +} + /* aio_get_req * Allocate a slot for an aio request. * Returns NULL if no requests are free. @@ -1001,24 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) { struct aio_kiocb *req; - if (!get_reqs_available(ctx)) { - user_refill_reqs_available(ctx); - if (!get_reqs_available(ctx)) - return NULL; - } - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); if (unlikely(!req)) - goto out_put; + return NULL; percpu_ref_get(&ctx->reqs); INIT_LIST_HEAD(&req->ki_list); refcount_set(&req->ki_refcnt, 0); req->ki_ctx = ctx; return req; -out_put: - put_reqs_available(ctx, 1); - return NULL; } static struct kioctx *lookup_ioctx(unsigned long ctx_id) @@ -1821,9 +1820,13 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, return -EINVAL; } + if (!get_reqs_available(ctx)) + return -EAGAIN; + + ret = -EAGAIN; req = aio_get_req(ctx); if (unlikely(!req)) - return -EAGAIN; + goto out_put_reqs_available; if (iocb.aio_flags & IOCB_FLAG_RESFD) { /* @@ -1886,11 +1889,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, goto out_put_req; return 0; out_put_req: - put_reqs_available(ctx, 1); percpu_ref_put(&ctx->reqs); if 
(req->ki_eventfd) eventfd_ctx_put(req->ki_eventfd); kmem_cache_free(kiocb_cachep, req); +out_put_reqs_available: + put_reqs_available(ctx, 1); return ret; } From patchwork Tue Nov 20 17:19:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10690917 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 643F75A4 for ; Tue, 20 Nov 2018 17:20:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 471172ACE3 for ; Tue, 20 Nov 2018 17:20:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 34F412ACF6; Tue, 20 Nov 2018 17:20:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A596B2ACEA for ; Tue, 20 Nov 2018 17:20:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727638AbeKUDuj (ORCPT ); Tue, 20 Nov 2018 22:50:39 -0500 Received: from mail-io1-f67.google.com ([209.85.166.67]:38194 "EHLO mail-io1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730531AbeKUDu1 (ORCPT ); Tue, 20 Nov 2018 22:50:27 -0500 Received: by mail-io1-f67.google.com with SMTP id l14so1798974ioj.5 for ; Tue, 20 Nov 2018 09:20:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=xeGvICAvQi+yhK9l/IdOwP5A2kpCzbDWPU3Dkf+7hUI=; 
b=phV2nVLCZv2Pd+VvGlywoB6qG+beop0d3KJNLrgBi1qjnqajTHYYp1qJt/V+ynhuG3 UpoVNv0MY3uyMrSnJDpOwz35kyP2iM6fqjsFCXc4QJpqF2Pusi+h1+igLyk1EpnTtNym Y6r2iiPRN1d7Bn23nl6HC3jLjKxO85g4Y91FDHUZoiKU0mQ0naTdU730UW4lYpQf2kfs sUBPjLxxWSCoUXslejI/fU29HjCXd52r8+lDpg2g0vMqLFin0Mr/E67Ip6Zw9T3XnHeT buVlrxfQTKXLJvP1kFlzR/qMI9AoRTKT6rEa4NTfld/xfSdFs2tcPiNW1eEKNBW9OjBy CffQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=xeGvICAvQi+yhK9l/IdOwP5A2kpCzbDWPU3Dkf+7hUI=; b=Gbt4+X9Ynn8BN8xFIpnZGGCXwm5l8xyiOENeKQ7U1ASFUtsua7S8z4gQz5N9I+I+e3 wgtHLZLLWxI43XefVJoQ8G2EepS+ZRwvPci1y3fCQKRgKMqWFxDVibzJWgmPRjU4N2i1 DTop/ZSnup3tKQDB6r5xL909iR1YjPCuv9YIPc3FPEQasRrEYdq6numKlVnfyBjmfZL1 3G1NAZR6tmIpzlcq+YPBQMh/lvVY3QF6FRRr7TfdiDXTPpjELA4fReBzqED/REIZ6VgM hoLNzBHuEacJn2ipB3amWLHbMSTGq5C8ERHowmku7ODhTZFImZqTnV+4y3qGDm7WzQRp 06mg== X-Gm-Message-State: AA+aEWbdYvmYzYSAzwB4YG6izacwA9rApHOZ1dnu6/xuNKVYzbjc7sPq c/BxNaiZROXtwq85t6o4OzbAzmBpaZo= X-Google-Smtp-Source: AFSGD/XbRBoj+8ByPG25E/XI9HZ4/uRMswHyQI76IsIaZ3882cD5C3Su84qP+voAnvN4A/vUwGQlsg== X-Received: by 2002:a6b:c0c6:: with SMTP id q189-v6mr2378594iof.31.1542734412238; Tue, 20 Nov 2018 09:20:12 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id 186-v6sm15751530itf.11.2018.11.20.09.20.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 20 Nov 2018 09:20:10 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 8/8] aio: support for IO polling Date: Tue, 20 Nov 2018 10:19:53 -0700 Message-Id: <20181120171953.1258-9-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181120171953.1258-1-axboe@kernel.dk> References: <20181120171953.1258-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org 
X-Virus-Scanned: ClamAV using ClamSMTP Add polled variants of PREAD/PREADV and PWRITE/PWRITEV. These act like their non-polled counterparts, except we expect to poll for completion of them. The polling happens at io_getevents() time, and works just like non-polled IO. To set up an io_context for polled IO, the application must call io_setup2() with IOCTX_FLAG_IOPOLL as one of the flags. It is illegal to mix and match polled and non-polled IO on an io_context. Polled IO doesn't support the user-mapped completion ring. Events must be reaped through the io_getevents() system call. For non-IRQ-driven poll devices, there's no way to support completion reaping from userspace by just looking at the ring. The application itself is the one that pulls completion entries. Signed-off-by: Jens Axboe --- fs/aio.c | 377 ++++++++++++++++++++++++++++++----- include/uapi/linux/aio_abi.h | 4 + 2 files changed, 336 insertions(+), 45 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 8bbb0b77d9c4..ea93847d25d1 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -94,6 +94,8 @@ struct kioctx { unsigned long user_id; + unsigned int flags; + struct __percpu kioctx_cpu *cpu; /* @@ -138,6 +140,19 @@ struct kioctx { atomic_t reqs_available; } ____cacheline_aligned_in_smp; + /* iopoll submission state */ + struct { + spinlock_t poll_lock; + struct list_head poll_submitted; + } ____cacheline_aligned_in_smp; + + /* iopoll completion state */ + struct { + struct list_head poll_completing; + unsigned long getevents_busy; + atomic_t poll_completed; + } ____cacheline_aligned_in_smp; + struct { spinlock_t ctx_lock; struct list_head active_reqs; /* used for cancellation */ @@ -191,13 +206,24 @@ struct aio_kiocb { struct list_head ki_list; /* the aio core uses this * for cancellation */ + + unsigned long ki_flags; +#define IOCB_POLL_COMPLETED 0 + refcount_t ki_refcnt; - /* - * If the aio_resfd field of the userspace iocb is not zero, - * this is the underlying eventfd context to deliver events to.
- */ - struct eventfd_ctx *ki_eventfd; + union { + /* + * If the aio_resfd field of the userspace iocb is not zero, + * this is the underlying eventfd context to deliver events to. + */ + struct eventfd_ctx *ki_eventfd; + + /* + * For polled IO, stash completion info here + */ + struct io_event ki_ev; + }; }; /*------ sysctl variables----*/ @@ -214,6 +240,8 @@ static struct vfsmount *aio_mnt; static const struct file_operations aio_ring_fops; static const struct address_space_operations aio_ctx_aops; +static void aio_iopoll_reap_events(struct kioctx *); + static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { struct file *file; @@ -451,11 +479,15 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) int i; struct file *file; - /* Compensate for the ring buffer's head/tail overlap entry */ - nr_events += 2; /* 1 is required, 2 for good luck */ - + /* + * Compensate for the ring buffer's head/tail overlap entry. + * IO polling doesn't require any io event entries + */ size = sizeof(struct aio_ring); - size += sizeof(struct io_event) * nr_events; + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) { + nr_events += 2; /* 1 is required, 2 for good luck */ + size += sizeof(struct io_event) * nr_events; + } nr_pages = PFN_UP(size); if (nr_pages < 0) @@ -720,6 +752,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events, unsigned int flags) if (!ctx) return ERR_PTR(-ENOMEM); + ctx->flags = flags; ctx->max_reqs = max_reqs; spin_lock_init(&ctx->ctx_lock); @@ -732,6 +765,11 @@ static struct kioctx *ioctx_alloc(unsigned nr_events, unsigned int flags) INIT_LIST_HEAD(&ctx->active_reqs); + spin_lock_init(&ctx->poll_lock); + INIT_LIST_HEAD(&ctx->poll_submitted); + INIT_LIST_HEAD(&ctx->poll_completing); + atomic_set(&ctx->poll_completed, 0); + if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL)) goto err; @@ -814,6 +852,8 @@ static int kill_ioctx(struct mm_struct *mm, struct kioctx *ctx, RCU_INIT_POINTER(table->table[ctx->id], NULL); 
spin_unlock(&mm->ioctx_lock); + aio_iopoll_reap_events(ctx); + /* free_ioctx_reqs() will do the necessary RCU synchronization */ wake_up_all(&ctx->wait); @@ -1056,6 +1096,24 @@ static inline void iocb_put(struct aio_kiocb *iocb) } } +static void iocb_put_many(struct kioctx *ctx, void **iocbs, int *nr) +{ + if (*nr) { + kmem_cache_free_bulk(kiocb_cachep, *nr, iocbs); + percpu_ref_put_many(&ctx->reqs, *nr); + *nr = 0; + } +} + +static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, + long res, long res2) +{ + ev->obj = (u64)(unsigned long)iocb->ki_user_iocb; + ev->data = iocb->ki_user_data; + ev->res = res; + ev->res2 = res2; +} + /* aio_complete * Called when the io request on the given iocb is complete. */ @@ -1083,10 +1141,7 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2) ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); event = ev_page + pos % AIO_EVENTS_PER_PAGE; - event->obj = (u64)(unsigned long)iocb->ki_user_iocb; - event->data = iocb->ki_user_data; - event->res = res; - event->res2 = res2; + aio_fill_event(event, iocb, res, res2); kunmap_atomic(ev_page); flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); @@ -1239,6 +1294,165 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr, return ret < 0 || *i >= min_nr; } +#define AIO_POLL_STACK 8 + +/* + * Process completed iocb iopoll entries, copying the result to userspace.
+ */ +static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, + unsigned int *nr_events, long max) +{ + void *iocbs[AIO_POLL_STACK]; + struct aio_kiocb *iocb, *n; + int to_free = 0, ret = 0; + + list_for_each_entry_safe(iocb, n, &ctx->poll_completing, ki_list) { + if (!test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + continue; + if (to_free == AIO_POLL_STACK) + iocb_put_many(ctx, iocbs, &to_free); + + list_del(&iocb->ki_list); + iocbs[to_free++] = iocb; + + fput(iocb->rw.ki_filp); + + if (!evs) { + (*nr_events)++; + continue; + } + + if (copy_to_user(evs + *nr_events, &iocb->ki_ev, + sizeof(iocb->ki_ev))) { + ret = -EFAULT; + break; + } + if (++(*nr_events) == max) + break; + } + + if (to_free) + iocb_put_many(ctx, iocbs, &to_free); + + return ret; +} + +static int __aio_iopoll_check(struct kioctx *ctx, struct io_event __user *event, + unsigned int *nr_events, long min, long max) +{ + struct aio_kiocb *iocb; + unsigned int poll_completed; + int to_poll, polled, ret; + + /* + * Check if we already have done events that satisfy what we need + */ + if (!list_empty(&ctx->poll_completing)) { + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + } + + /* + * Take in a new working set from the submitted list if possible. + */ + if (!list_empty_careful(&ctx->poll_submitted)) { + spin_lock(&ctx->poll_lock); + list_splice_init(&ctx->poll_submitted, &ctx->poll_completing); + spin_unlock(&ctx->poll_lock); + } + + if (list_empty(&ctx->poll_completing)) + return 0; + + /* + * Check again now that we have a new batch. 
+ */ + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + + /* + * Find up to 'max_nr' worth of events to poll for, including the + * events we already successfully polled + */ + polled = to_poll = 0; + poll_completed = atomic_read(&ctx->poll_completed); + list_for_each_entry(iocb, &ctx->poll_completing, ki_list) { + /* + * Poll for needed events with wait == true, anything after + * that we just check if we have more, up to max. + */ + bool wait = polled + *nr_events >= min; + struct kiocb *kiocb = &iocb->rw; + + if (test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + break; + if (++to_poll + *nr_events >= max) + break; + + polled += kiocb->ki_filp->f_op->iopoll(kiocb, wait); + if (polled + *nr_events >= max) + break; + if (poll_completed != atomic_read(&ctx->poll_completed)) + break; + } + + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + return to_poll; +} + +/* + * We can't just wait for polled events to come to us, we have to actively + * find and complete them. + */ +static void aio_iopoll_reap_events(struct kioctx *ctx) +{ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + return; + + while (!list_empty_careful(&ctx->poll_submitted) || + !list_empty(&ctx->poll_completing)) { + unsigned int nr_events = 0; + + __aio_iopoll_check(ctx, NULL, &nr_events, 1, UINT_MAX); + } +} + +static int aio_iopoll_check(struct kioctx *ctx, long min_nr, long nr, + struct io_event __user *event) +{ + unsigned int nr_events = 0; + int ret = 0; + + /* * Only allow one thread polling at a time */ + if (test_and_set_bit(0, &ctx->getevents_busy)) + return -EBUSY; + + while (!nr_events || !need_resched()) { + int tmin = 0; + + if (nr_events < min_nr) + tmin = min_nr - nr_events; + + ret = __aio_iopoll_check(ctx, event, &nr_events, tmin, nr); + if (ret <= 0) + break; + ret = 0; + } + + clear_bit(0, &ctx->getevents_busy); + return nr_events ? 
nr_events : ret; +} + static long read_events(struct kioctx *ctx, long min_nr, long nr, struct io_event __user *event, ktime_t until) @@ -1287,7 +1501,7 @@ SYSCALL_DEFINE3(io_setup2, u32, nr_events, u32, flags, unsigned long ctx; long ret; - if (flags) + if (flags & ~IOCTX_FLAG_IOPOLL) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1411,13 +1625,8 @@ static void aio_remove_iocb(struct aio_kiocb *iocb) spin_unlock_irqrestore(&ctx->ctx_lock, flags); } -static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +static void kiocb_end_write(struct kiocb *kiocb) { - struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); - - if (!list_empty_careful(&iocb->ki_list)) - aio_remove_iocb(iocb); - if (kiocb->ki_flags & IOCB_WRITE) { struct inode *inode = file_inode(kiocb->ki_filp); @@ -1429,19 +1638,42 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); file_end_write(kiocb->ki_filp); } +} + +static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +{ + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + + if (!list_empty_careful(&iocb->ki_list)) + aio_remove_iocb(iocb); + + kiocb_end_write(kiocb); fput(kiocb->ki_filp); aio_complete(iocb, res, res2); } -static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) +static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) { + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + struct kioctx *ctx = iocb->ki_ctx; + + kiocb_end_write(kiocb); + + aio_fill_event(&iocb->ki_ev, iocb, res, res2); + set_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags); + atomic_inc(&ctx->poll_completed); +} + +static int aio_prep_rw(struct aio_kiocb *kiocb, struct iocb *iocb) +{ + struct kioctx *ctx = kiocb->ki_ctx; + struct kiocb *req = &kiocb->rw; int ret; req->ki_filp = fget(iocb->aio_fildes); if (unlikely(!req->ki_filp)) return -EBADF; - req->ki_complete = aio_complete_rw; req->ki_pos = 
iocb->aio_offset; req->ki_flags = iocb_flags(req->ki_filp); if (iocb->aio_flags & IOCB_FLAG_RESFD) @@ -1456,8 +1688,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) ret = ioprio_check_cap(iocb->aio_reqprio); if (ret) { pr_debug("aio ioprio check cap error: %d\n", ret); - fput(req->ki_filp); - return ret; + goto out_fput; } req->ki_ioprio = iocb->aio_reqprio; @@ -1466,7 +1697,41 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags); if (unlikely(ret)) - fput(req->ki_filp); + goto out_fput; + + if (iocb->aio_flags & IOCB_FLAG_HIPRI) { + /* shares space in the union, and is rather pointless.. */ + ret = -EINVAL; + if (iocb->aio_flags & IOCB_FLAG_RESFD) + goto out_fput; + + /* can't submit polled IO to a non-polled ctx */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + goto out_fput; + + ret = -EOPNOTSUPP; + if (!(req->ki_flags & IOCB_DIRECT) || + !req->ki_filp->f_op->iopoll) + goto out_fput; + + req->ki_flags |= IOCB_HIPRI; + req->ki_complete = aio_complete_rw_poll; + + spin_lock(&ctx->poll_lock); + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + } else { + /* can't submit non-polled IO to a polled ctx */ + ret = -EINVAL; + if (ctx->flags & IOCTX_FLAG_IOPOLL) + goto out_fput; + + req->ki_complete = aio_complete_rw; + } + + return 0; +out_fput: + fput(req->ki_filp); return ret; } @@ -1509,15 +1774,16 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } -static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored, - bool compat) +static ssize_t aio_read(struct aio_kiocb *kiocb, struct iocb *iocb, + bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1542,15 +1808,16 @@ static ssize_t 
aio_read(struct kiocb *req, struct iocb *iocb, bool vectored, return ret; } -static ssize_t aio_write(struct kiocb *req, struct iocb *iocb, bool vectored, - bool compat) +static ssize_t aio_write(struct aio_kiocb *kiocb, struct iocb *iocb, + bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1820,7 +2087,8 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, return -EINVAL; } - if (!get_reqs_available(ctx)) + /* Poll IO doesn't need ring reservations */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL) && !get_reqs_available(ctx)) return -EAGAIN; ret = -EAGAIN; @@ -1843,35 +2111,45 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, } } - ret = put_user(KIOCB_KEY, &user_iocb->aio_key); - if (unlikely(ret)) { - pr_debug("EFAULT: aio_key\n"); - goto out_put_req; + /* polled IO isn't cancelable, don't bother copying the key */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) { + ret = put_user(KIOCB_KEY, &user_iocb->aio_key); + if (unlikely(ret)) { + pr_debug("EFAULT: aio_key\n"); + goto out_put_req; + } } req->ki_user_iocb = user_iocb; req->ki_user_data = iocb.aio_data; + ret = -EINVAL; switch (iocb.aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(&req->rw, &iocb, false, compat); + ret = aio_read(req, &iocb, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(&req->rw, &iocb, false, compat); + ret = aio_write(req, &iocb, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(&req->rw, &iocb, true, compat); + ret = aio_read(req, &iocb, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(&req->rw, &iocb, true, compat); + ret = aio_write(req, &iocb, true, compat); break; case IOCB_CMD_FSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = 
aio_fsync(&req->fsync, &iocb, false); break; case IOCB_CMD_FDSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_fsync(&req->fsync, &iocb, true); break; case IOCB_CMD_POLL: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_poll(req, &iocb); break; default: @@ -1894,7 +2172,8 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, eventfd_ctx_put(req->ki_eventfd); kmem_cache_free(kiocb_cachep, req); out_put_reqs_available: - put_reqs_available(ctx, 1); + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + put_reqs_available(ctx, 1); return ret; } @@ -1930,7 +2209,9 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (nr > ctx->nr_events) nr = ctx->nr_events; - blk_start_plug(&plug); + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + blk_start_plug(&plug); + for (i = 0; i < nr; i++) { struct iocb __user *user_iocb; @@ -1943,7 +2224,9 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (ret) break; } - blk_finish_plug(&plug); + + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + blk_finish_plug(&plug); percpu_ref_put(&ctx->users); return i ? i : ret; @@ -2068,8 +2351,12 @@ static long do_io_getevents(aio_context_t ctx_id, long ret = -EINVAL; if (likely(ioctx)) { - if (likely(min_nr <= nr && min_nr >= 0)) - ret = read_events(ioctx, min_nr, nr, events, until); + if (likely(min_nr <= nr && min_nr >= 0)) { + if (ioctx->flags & IOCTX_FLAG_IOPOLL) + ret = aio_iopoll_check(ioctx, min_nr, nr, events); + else + ret = read_events(ioctx, min_nr, nr, events, until); + } percpu_ref_put(&ioctx->users); } diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index 8387e0af0f76..3b98b5fbacde 100644 --- a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -52,9 +52,11 @@ enum { * is valid. * IOCB_FLAG_IOPRIO - Set if the "aio_reqprio" member of the "struct iocb" * is valid. 
+ * IOCB_FLAG_HIPRI - Use IO completion polling */ #define IOCB_FLAG_RESFD (1 << 0) #define IOCB_FLAG_IOPRIO (1 << 1) +#define IOCB_FLAG_HIPRI (1 << 2) /* read() from /dev/aio returns these structures. */ struct io_event { @@ -106,6 +108,8 @@ struct iocb { __u32 aio_resfd; }; /* 64 bytes */ +#define IOCTX_FLAG_IOPOLL (1 << 0) + #undef IFBIG #undef IFLITTLE
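
Note for readers of the series: the flag rules in the last patch (which combinations io_setup2() and aio_prep_rw() accept) can be modelled in plain userspace C. The sketch below is illustrative only and is not part of the patches. The MODEL_ names and both helper functions are hypothetical; the flag values are copied from the aio_abi.h hunk above; the `direct` and `has_iopoll` parameters are assumptions standing in for the IOCB_DIRECT test and the f_op->iopoll check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Values copied from the aio_abi.h hunk. The MODEL_ prefix is an
 * assumption, used only to avoid clashing with real uapi headers.
 */
#define MODEL_IOCB_FLAG_RESFD	(1u << 0)
#define MODEL_IOCB_FLAG_HIPRI	(1u << 2)
#define MODEL_IOCTX_FLAG_IOPOLL	(1u << 0)

/* Hypothetical model of the flag test at the top of io_setup2(). */
int model_setup2_check(unsigned int flags)
{
	/* Any bit other than IOCTX_FLAG_IOPOLL is rejected. */
	if (flags & ~MODEL_IOCTX_FLAG_IOPOLL)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical model of the polled/non-polled checks at the end of
 * aio_prep_rw(). 'direct' stands in for IOCB_DIRECT being set on the
 * kiocb, 'has_iopoll' for the file providing an ->iopoll() handler.
 */
int model_prep_rw_check(unsigned int ctx_flags, unsigned int aio_flags,
			bool direct, bool has_iopoll)
{
	if (aio_flags & MODEL_IOCB_FLAG_HIPRI) {
		/* ki_eventfd and ki_ev share a union, so RESFD is refused. */
		if (aio_flags & MODEL_IOCB_FLAG_RESFD)
			return -EINVAL;
		/* Polled IO needs a context created with IOCTX_FLAG_IOPOLL. */
		if (!(ctx_flags & MODEL_IOCTX_FLAG_IOPOLL))
			return -EINVAL;
		/* Only direct IO on files with ->iopoll() can be polled. */
		if (!direct || !has_iopoll)
			return -EOPNOTSUPP;
		return 0;
	}

	/* Non-polled IO is refused on a polled context. */
	if (ctx_flags & MODEL_IOCTX_FLAG_IOPOLL)
		return -EINVAL;
	return 0;
}
```

Calling these helpers from a small test program gives a quick table of which context/iocb flag combinations the series would reject with -EINVAL versus -EOPNOTSUPP, without needing a kernel that carries the patches.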