From patchwork Fri Nov 30 16:56:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706731 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0AC0A18B8 for ; Fri, 30 Nov 2018 16:56:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E713430415 for ; Fri, 30 Nov 2018 16:56:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D81AE30422; Fri, 30 Nov 2018 16:56:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 798C830415 for ; Fri, 30 Nov 2018 16:56:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726779AbeLAEGs (ORCPT ); Fri, 30 Nov 2018 23:06:48 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:40521 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726647AbeLAEGs (ORCPT ); Fri, 30 Nov 2018 23:06:48 -0500 Received: by mail-it1-f196.google.com with SMTP id h193so10214417ita.5 for ; Fri, 30 Nov 2018 08:56:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=2MxFtSFebSJ+ImD4ppk7jV+8mSV4CUk/N9/CLTpskeY=; b=EGWp2JrOvq3xeH5hMK13Fm1t7ubF2kGBUiNVxNT/x0448O/YjgUwl+AaRuOBTXrJ6E JqfXqgpZleYJQGnRIeX9G7rz0Pm6h50HDW/YrJNyaHysyJv1UScPc16MFHnwU1hRNGZa 
yD400S6xyNyzdoxCdwNkbozW3GAazSh1yE2Em08tfhHNFNAjG7MWQ27g1M67rTFt4dRa 6vTbmksUsF0VrOKAuezDbYtmFJo4AJurhgFIDInvo8F8+JF92tdAJTKXINtWocLmtzD0 vnWUYJYolvxaLpDNKdIOgIjVP3wZsWQK3NMInmDJTp/dNy1oIWfiqgW90Wdueh6MoHAy aUjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2MxFtSFebSJ+ImD4ppk7jV+8mSV4CUk/N9/CLTpskeY=; b=Ff9za3O7CTwyaGVa3sySrzQIs268ijRgo/E4rXdEdSDDVRsWBGTvR1lvZfHqTkFNB0 mbeqms7Zz9AcD5V0mDd+LUbVpEdeeSUuLPaGeIAYMPPV0+9OyMPgfO/OujrW3y+SbanQ S+nG4I5YwDoiWk/xQnw9M84F39KK1qzf0W4wNfxUvB52KWZFHGFwzFUh7BaM53uEiAZ8 qSsqlu2c8FR9NoPklx1iiWstB5uoeg3ENCFkL7FMRDxW9uZl7gK96qnUcD1OVH7gC0P0 byC+B5FLI3gJ2b9YzKJIQAAFXXcBPNjexThR7RTDCfNsVNX2vD2X23BVwbWuGGlACd5U ws/A== X-Gm-Message-State: AA+aEWZt559osdFxaEItkWT2RdRKqnPx8zZLHGi3AcBmoO46P3185/51 dppEu/jbkO3dfTwkmND9NaGtn3vy5Rg= X-Google-Smtp-Source: AFSGD/W404x7YYbrrVgUrwlP+xDC6yoBZYfStOm3eBsulbaFX2IWYNNBfG4HfBxvYpIFUZRBHyfxJw== X-Received: by 2002:a24:4a95:: with SMTP id k143mr5780360itb.77.1543597011910; Fri, 30 Nov 2018 08:56:51 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.50 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:56:51 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 01/27] aio: fix failure to put the file pointer Date: Fri, 30 Nov 2018 09:56:20 -0700 Message-Id: <20181130165646.27341-2-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If the ioprio capability check fails, we return without putting the file 
pointer. Fixes: d9a08a9e616b ("fs: Add aio iopriority support") Reviewed-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/aio.c b/fs/aio.c index b984918be4b7..205390c0c1bb 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) ret = ioprio_check_cap(iocb->aio_reqprio); if (ret) { pr_debug("aio ioprio check cap error: %d\n", ret); + fput(req->ki_filp); return ret; } From patchwork Fri Nov 30 16:56:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706735 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E4EB109C for ; Fri, 30 Nov 2018 16:56:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 79416303D2 for ; Fri, 30 Nov 2018 16:56:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6C3A530415; Fri, 30 Nov 2018 16:56:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 084EE303D2 for ; Fri, 30 Nov 2018 16:56:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726788AbeLAEGu (ORCPT ); Fri, 30 Nov 2018 23:06:50 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:38758 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726645AbeLAEGt (ORCPT ); Fri, 
30 Nov 2018 23:06:49 -0500 Received: by mail-it1-f196.google.com with SMTP id h65so10224987ith.3 for ; Fri, 30 Nov 2018 08:56:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=wp0ipsqt9//y3KPR0Mr4TZVsJyq1DbTfCdimy2bEq90=; b=X/Dk8qhh1Dbxk7/1nGYTf0bkALGSSBx3f8nLIzu3GKGfIwXRicVJWkg3rR1LrW2jax oMdlWti47RAdM3r5+Ez1aXYrYyWwh44ki/VVmkEeJINVeDUiyBYL7Ut+wbxf/q8RGRIh iesDk4yZ/+pwiRU7lWriqrvMU3x2CPWzSUK1vzBzmh+Yx9plFSkw/tfPX6qMd5IH4+9Y PNNxqbvSe4k6auQH8SS53FArnp88lNPfmNS/blaxSDduTLGbcsE+bFBElHEHESoSk3ZG sZfx5/GMJhDrR1eSS6/iY4Zhk1oFsovfxc15KM01bVf1L3KanMnz31TZs91tOqSW46// TgRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wp0ipsqt9//y3KPR0Mr4TZVsJyq1DbTfCdimy2bEq90=; b=BvaVKRM6zQMwsFmCyYbeL8qrUuhXKgp2Jolj7DasszcOf/EfISKUrw5+a+UOZvXZhH arP+Lrtkiqx+XKsWuz65933GmMKc6zQom85IJ6OcGkAenvkQyjUL1QbyNHP0vlqZQHQu 11+IztNmTRGg06ePBf24X/5cs6iDuonQeRq+zfCcXQJhlGAIPMUmOfwshzxUEWkr2jU5 o7dgQMlWxCVxCsL0+TU+gDL3dfNz8WlEk/W28MG6iMBITvGxRpbeatiPJTymEcouoIsK sbEagMxXTT2VkVyQKyN61ol3sXxyGMo6LX9JT6L1WeXicf8JSTiUmJ+TJaleglZ4R0jK 1gyA== X-Gm-Message-State: AA+aEWZDc3uRdVhG4KCSjQcDyHCGvd5N6UsqhqHQG5pfZ/T2w7diP2R0 0kPD9XrBLLWkMzLyH/cucik5D3NpaoE= X-Google-Smtp-Source: AFSGD/WOc2KTlSy1yXDqNsH7W7ePMUqavv70PNiYG28NJv83SHfK5wkGHTJ7g75cNUbWdm9sIwd+tA== X-Received: by 2002:a24:d08d:: with SMTP id m135mr5738482itg.89.1543597013565; Fri, 30 Nov 2018 08:56:53 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:56:52 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 02/27] aio: 
clear IOCB_HIPRI Date: Fri, 30 Nov 2018 09:56:21 -0700 Message-Id: <20181130165646.27341-3-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Christoph Hellwig No one is going to poll for aio (yet), so we must clear the HIPRI flag, as we would otherwise send it down the poll queues, where no one will be polling for completions. Signed-off-by: Christoph Hellwig IOCB_HIPRI, not RWF_HIPRI. Reviewed-by: Johannes Thumshirn Signed-off-by: Jens Axboe --- fs/aio.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 205390c0c1bb..05647d352bf3 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1436,8 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) ret = ioprio_check_cap(iocb->aio_reqprio); if (ret) { pr_debug("aio ioprio check cap error: %d\n", ret); - fput(req->ki_filp); - return ret; + goto out_fput; } req->ki_ioprio = iocb->aio_reqprio; @@ -1446,7 +1445,13 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags); if (unlikely(ret)) - fput(req->ki_filp); + goto out_fput; + + req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */ + return 0; + +out_fput: + fput(req->ki_filp); return ret; } From patchwork Fri Nov 30 16:56:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706739 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3E2A713A4 for ; Fri, 30 Nov 2018 16:56:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3098B303D2 for ; Fri, 30 Nov 2018 16:56:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 24AC530415; Fri, 30 Nov 2018 16:56:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C19C4303D2 for ; Fri, 30 Nov 2018 16:56:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726645AbeLAEGw (ORCPT ); Fri, 30 Nov 2018 23:06:52 -0500 Received: from mail-io1-f67.google.com ([209.85.166.67]:38332 "EHLO mail-io1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726830AbeLAEGv (ORCPT ); Fri, 30 Nov 2018 23:06:51 -0500 Received: by mail-io1-f67.google.com with SMTP id l14so5080001ioj.5 for ; Fri, 30 Nov 2018 08:56:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=uEeDBkOwPid70UEg8pUacK6i7pi0+OoFq4mtLStZsZw=; b=f5taqLSyd9amqf+/notpD8cM/uAHKrvFZ1+1t1oNMkXdH/u6ERvkkIhUfui86QDSVB RhlMR4NqYDIZyLECu/iRSKGPc0JTvyHkmrSR3afNRb5g2kJSadGl01Db/HrSJmy7UYnf +ovSEBBjmj9PUNDHtG/mrFTADwbXMvCtdN7vODRqi9/O7iz4LvaxTlNbUnvwmvZnlac2 pdauQFUFS70yBqHA84CfMKNhyNdThAaz2M0ZnW0kztaku0z1Nfv2OUsxX6j5VCOQU7t5 5wtJXdI8d/OwSLLdQ1mVMFtLURn6RS4ETyHuJLwZWCzYlzAsEb9z9QeXEIeYenvlp4K6 55Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=uEeDBkOwPid70UEg8pUacK6i7pi0+OoFq4mtLStZsZw=; b=L05LJqbq+Qgd7BvcMl+a42HPjZ9OogaLorzGOwldsybSuHOdJtmRbABocCYDHcXRrI 
qlB6IWrBE6oOg/u60Z4L8D3E5HfdT396afNGobqUNIOz8ZK4vAkJ75EkVk2jq0KfnxsZ ype+ysWjtelDP7domv7JmEKDXAZ2JITWhIkK23WS+AEVzVVtROPGVHexvl5B9LN/3go/ VQ65HRapg2fX+xIZTruyVwn73uRjRh6f60o1qmil0lTpaDi16uZpvzT8tyIzCPu3HoSU juXUJqEjlic23kOCjBKN9deCX61lMaDVCbc9MTfeS7uQFVaC+ZHN94SaxJ95HQJBkMvn gJ0Q== X-Gm-Message-State: AA+aEWaLspJ4+HHOj328NojJfkTtt94sxCqsIAg0fV4QrIBVmI66fsW4 964KK7DNIGz0jYLK6q9sxVbCQTP1a08= X-Google-Smtp-Source: AFSGD/VCaBbQ1iMBZWcXa71Z3wKiswhL1TGQNN8LHLMVoyyOS6f1Q69UGtAsdgT7gTPk8FWpQmuHzQ== X-Received: by 2002:a6b:9207:: with SMTP id u7mr5321601iod.286.1543597015179; Fri, 30 Nov 2018 08:56:55 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.53 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:56:54 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 03/27] fs: add an iopoll method to struct file_operations Date: Fri, 30 Nov 2018 09:56:22 -0700 Message-Id: <20181130165646.27341-4-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Christoph Hellwig This new method is used to explicitly poll for I/O completion for an iocb. It must be called for any iocb submitted asynchronously (that is, with a non-null ki_complete) which has the IOCB_HIPRI flag set. The method is assisted by a new ki_cookie field in struct kiocb to store the polling cookie. TODO: we can probably union ki_cookie with the existing hint and I/O priority fields to avoid struct kiocb growth. 
Reviewed-by: Johannes Thumshirn Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- Documentation/filesystems/vfs.txt | 3 +++ include/linux/fs.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 5f71a252e2e0..d9dc5e4d82b9 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -857,6 +857,7 @@ struct file_operations { ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); ssize_t (*read_iter) (struct kiocb *, struct iov_iter *); ssize_t (*write_iter) (struct kiocb *, struct iov_iter *); + int (*iopoll)(struct kiocb *kiocb, bool spin); int (*iterate) (struct file *, struct dir_context *); int (*iterate_shared) (struct file *, struct dir_context *); __poll_t (*poll) (struct file *, struct poll_table_struct *); @@ -902,6 +903,8 @@ otherwise noted. write_iter: possibly asynchronous write with iov_iter as source + iopoll: called when aio wants to poll for completions on HIPRI iocbs + iterate: called when the VFS needs to read the directory contents iterate_shared: called when the VFS needs to read the directory contents diff --git a/include/linux/fs.h b/include/linux/fs.h index a1ab233e6469..6a5f71f8ae06 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -310,6 +310,7 @@ struct kiocb { int ki_flags; u16 ki_hint; u16 ki_ioprio; /* See linux/ioprio.h */ + unsigned int ki_cookie; /* for ->iopoll */ } __randomize_layout; static inline bool is_sync_kiocb(struct kiocb *kiocb) @@ -1781,6 +1782,7 @@ struct file_operations { ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); ssize_t (*read_iter) (struct kiocb *, struct iov_iter *); ssize_t (*write_iter) (struct kiocb *, struct iov_iter *); + int (*iopoll)(struct kiocb *kiocb, bool spin); int (*iterate) (struct file *, struct dir_context *); int (*iterate_shared) (struct file *, struct dir_context *); __poll_t (*poll) (struct file *, struct poll_table_struct 
*); From patchwork Fri Nov 30 16:56:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706743 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2172F18B8 for ; Fri, 30 Nov 2018 16:56:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0FA09303EA for ; Fri, 30 Nov 2018 16:56:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 03AA530415; Fri, 30 Nov 2018 16:56:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AB5E13041D for ; Fri, 30 Nov 2018 16:56:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726881AbeLAEGw (ORCPT ); Fri, 30 Nov 2018 23:06:52 -0500 Received: from mail-it1-f194.google.com ([209.85.166.194]:36844 "EHLO mail-it1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726830AbeLAEGw (ORCPT ); Fri, 30 Nov 2018 23:06:52 -0500 Received: by mail-it1-f194.google.com with SMTP id c9so10242664itj.1 for ; Fri, 30 Nov 2018 08:56:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=eJ3aWzBT4gm4FwYHYUV1JBWUWsiFyJVaotzVIFD6bng=; b=nX/PmpLduSA9mqR4VBb8drcGkVO1AAo0v/hZLs4NUHO/7xChEotTLFjbVTUCVzTmIK twcSpUMQ3hmX/PJI3Xq4m4wnFazExi/erEvpBFviQ59XUFCa9xC8oV3TWJBPPUUPn578 
pqOuNF7nkRF/hz6eLoQiAYwOeLcvVjqfDEhzlf55LuwxRIW/FMspJPIAAjmNdYByYVEx mfNokArAJaK6hAMgYKmwjXoj2nMayHi9f3zSUXOfJ6OnLjq7dOchyevxx5ZK5SXuihm+ OMuZBL70rOIPxP04JPyd/7BLCssgmitbA4qq8LlbH2Pimv/azJ4TjiwQ1Ov0lB67nDYI kZeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=eJ3aWzBT4gm4FwYHYUV1JBWUWsiFyJVaotzVIFD6bng=; b=qWp7EqX+nVNPBgzZCU2pVSd01z9Zs9IfVdH4iCPjmibb9wWmLFM0kGorqfUEGrXXtO /1DvzrydrE6ck3Pdg0xSJ4J1CgVtpYrHIbvGScJy810wmR5rqEdTg8oa0lwWkKlXqzP8 YakBgx2KwGR3+oJyZG1SQIHpl6EKAhSrDQhCEVIrRfxddF9dXcnC2ZVsGolv8b+icX03 rpVDADBdELO+SFM0KVx4pW8SNWR9IDSMaM7+foW2HMVXY+udyeubjhRx0qyJulMx8QWT 1aEeyZYgMXXxIzry6KxCkzwV1lxgH1qE6T+uKO0QDC6DVIhZP0lrehTB2sgzVpHNfgJi Rb8A== X-Gm-Message-State: AA+aEWZ7oNk0PFuUOCD4+YKV2KdaOvfLe4yYGkTfaklcatBdDz9MPqB0 GHwGoe8f50iVtQou/2YZ6YydZxc2zY0= X-Google-Smtp-Source: AFSGD/UsLoqORr4GKEZsDyGUoXn5f2YvD82zm7qVTDqunQCW+JkUlxngTEpw9ivPa2HUUdQ7LhwZ/A== X-Received: by 2002:a02:1e88:: with SMTP id 8mr5670956jat.108.1543597016459; Fri, 30 Nov 2018 08:56:56 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.55 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:56:55 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 04/27] block: wire up block device iopoll method Date: Fri, 30 Nov 2018 09:56:23 -0700 Message-Id: <20181130165646.27341-5-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Christoph Hellwig Just call blk_poll on the iocb cookie, we can derive 
the block device from the inode trivially. Reviewed-by: Johannes Thumshirn Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/block_dev.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/fs/block_dev.c b/fs/block_dev.c index e1886cc7048f..6de8d35f6e41 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -281,6 +281,14 @@ struct blkdev_dio { static struct bio_set blkdev_dio_pool; +static int blkdev_iopoll(struct kiocb *kiocb, bool wait) +{ + struct block_device *bdev = I_BDEV(kiocb->ki_filp->f_mapping->host); + struct request_queue *q = bdev_get_queue(bdev); + + return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait); +} + static void blkdev_bio_end_io(struct bio *bio) { struct blkdev_dio *dio = bio->bi_private; @@ -398,6 +406,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages) bio->bi_opf |= REQ_HIPRI; qc = submit_bio(bio); + WRITE_ONCE(iocb->ki_cookie, qc); break; } @@ -2070,6 +2079,7 @@ const struct file_operations def_blk_fops = { .llseek = block_llseek, .read_iter = blkdev_read_iter, .write_iter = blkdev_write_iter, + .iopoll = blkdev_iopoll, .mmap = generic_file_mmap, .fsync = blkdev_fsync, .unlocked_ioctl = block_ioctl, From patchwork Fri Nov 30 16:56:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706747 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E28F113A4 for ; Fri, 30 Nov 2018 16:57:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D0F95303EA for ; Fri, 30 Nov 2018 16:57:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C52783041C; Fri, 30 Nov 2018 16:57:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on 
pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D482303EA for ; Fri, 30 Nov 2018 16:57:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726830AbeLAEGy (ORCPT ); Fri, 30 Nov 2018 23:06:54 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:39266 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727013AbeLAEGy (ORCPT ); Fri, 30 Nov 2018 23:06:54 -0500 Received: by mail-it1-f195.google.com with SMTP id a6so10217209itl.4 for ; Fri, 30 Nov 2018 08:56:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=0Qn86S9IysAhUjs5lPwgSM110FSlNFVglHplsagyxFA=; b=PnnqyejQfl3Y1Sx27UmSfYpY6XtRSsUkDvCBqF32TA7W8fVFAuIoTBWfJd6XPDavIR D7CEkQR8B35dmNB9vmJtSno1X0r/F9MkeHCazLVp9DGTES8W7VyqCSe5fGV+XcFGZqrH HZ4ieXxta9CZL/MgpkenNLTjzqJdAlxS2ologaO9S+/8shR+Gw6giCFN471CuoxhOIC/ 2Pkuf0a8cmF5Yy0sbrBvugVOKvBY0/anggkKLjU0C/Uehq+q15KWZMdc4QNpqDOwA82W PxLQhQ+hioHgoWbm3Si8URsnLMQnDyhVuME7SvIqv8a8oZD6LuPsjaQdOEKVhT6m4iMb 1tAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=0Qn86S9IysAhUjs5lPwgSM110FSlNFVglHplsagyxFA=; b=PWm7vN43CeuNJRPzX64vzS6eouP52YuV1pYrjA/8X/balBVEv9L0Y3Bo/w/i+DnfL+ beG3sA6RY0WZ/zLYaaDxP8d9ygkKAEEZ/RSEoCeA+vtB4nQ8VyZQbX9I8iS1HuROyDfw CVIlnyrieB+/YRaleyGG2SNTiZOexMCkZdqVkKAbAPELJu4VfihQXspjNPqsOsY9rWpc gUKJo2K7rJe9BaG12Qj0r/cJXUXb+ZJhI4lOum8PYIPk9ou3abua5no1HnuNO2/hz3Tj ROrko1X1e3jQ0MMWrPsIQyNuxHhaAQ5QIEZeQLwK9DACyao4aTnRs4vG947DkEXKXdE0 u2fA== X-Gm-Message-State: 
AA+aEWYEQ73SqstcPSE418sKkM3AHLk7T/p4SPy1R+hN2hlH/B2KU/2W kdx/MEX4fv0nfLc/8ZWBydDFDJ606M4= X-Google-Smtp-Source: AFSGD/VUg1tqk5zm1omAZOdspM+84VPskzax56AznGHgAnyPgdIrhavsyYAjV6LJaxwuP1fTkpaueg== X-Received: by 2002:a02:9281:: with SMTP id b1mr5893855jah.86.1543597018129; Fri, 30 Nov 2018 08:56:58 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:56:57 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT Date: Fri, 30 Nov 2018 09:56:24 -0700 Message-Id: <20181130165646.27341-6-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We can't wait for polled events to complete, as they may require active polling from whoever submitted them. If that is the same task that is submitting new IO, we could deadlock waiting for the completion of IO that this task is itself supposed to be reaping. 
Signed-off-by: Jens Axboe --- fs/block_dev.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 6de8d35f6e41..ebc3d5a0f424 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -402,8 +402,16 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages) nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES); if (!nr_pages) { - if (iocb->ki_flags & IOCB_HIPRI) + if (iocb->ki_flags & IOCB_HIPRI) { bio->bi_opf |= REQ_HIPRI; + /* + * For async polled IO, we can't wait for + * requests to complete, as they may also be + * polled and require active reaping. + */ + if (!is_sync) + bio->bi_opf |= REQ_NOWAIT; + } qc = submit_bio(bio); WRITE_ONCE(iocb->ki_cookie, qc); From patchwork Fri Nov 30 16:56:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706751 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 00653109C for ; Fri, 30 Nov 2018 16:57:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E67BC30415 for ; Fri, 30 Nov 2018 16:57:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DAFB43041D; Fri, 30 Nov 2018 16:57:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 70A4130415 for ; Fri, 30 Nov 2018 16:57:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727114AbeLAEG4 (ORCPT ); Fri, 30 
Nov 2018 23:06:56 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:38792 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727089AbeLAEG4 (ORCPT ); Fri, 30 Nov 2018 23:06:56 -0500 Received: by mail-it1-f196.google.com with SMTP id h65so10225552ith.3 for ; Fri, 30 Nov 2018 08:57:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=p6XwCf9lidTSvsA0jLObhvA2jqsQMgsk5MoVc4+K+Yo=; b=FdHvurCQhk2am2SF5E5y2guHUnsGXpATDdgMJ+E65Ykr/XEyKe4BhE7i6o0suCvLpp fjILJgLeilT+YKLCUVSCNVoT6yEE5+hS4ejkwDFqeB7p1BZrHirsDgnuZyOwDFIDodr7 BF+1+XYgtJRBoF8gDZt3STzTdtIhYViFqXbKiXUgVU8N9BE6aj7njdXz2ieBg1Vvcojp r4VbzKOrunNGXaeF0m8NK+pvDNsKxNu4KRSGQQf8CB7n0mSZny+7+NdO76XDp7a80SPK 81T6ldNdfgtiU1VtVCCuvoPaKtSH0chx17OgOxgXI9xdElLYT0WKOQVUGzWVghvYll+/ TZtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=p6XwCf9lidTSvsA0jLObhvA2jqsQMgsk5MoVc4+K+Yo=; b=J5ghKztJvciRcnhRy/IYxK+SP4cZVQRGU5nenXdpgSVVIW25cIgAwwl1ogZGk3njuz YDrKmmtx2c2HWRgOJdSvwywWAzOzcafCpQrUIW4Hov4TfC7FlgrIJu7RUi/rdx2dR4iY jq03MXt/89Nna5iZM+w+IW+RyPsKv7djgV518kogbk1xX5ehKZQpYjjlC4UVt3e5/eMM CpA551ln+js4zS6y8qEJ8sSvcldF2P+K4CjRZjhWhel4dbufdwx7MPDyTTaoQPL7Nw2L oeqhVyob2W4YDpII89tiBj3WHdm6W2TF4m+HBWc2r30nQj0mD4HiGESHu9Rr/+J0dyqe xd/A== X-Gm-Message-State: AA+aEWahA1d24Rp51p6IO1bxCB12FWpvD0PgbZdSdPwvrQCEcxKBhez8 VaPQcrD8qkSMW4QvJ61a5wlVvS8sd/A= X-Google-Smtp-Source: AFSGD/Wh23nkZS0Hk8KsX1HUFUZzg2Rf/ZvctX1YOHykgAalRjU0GYpbgRk+U+ywjRRUNuQxKTdw1Q== X-Received: by 2002:a02:9b26:: with SMTP id o35mr5828456jak.23.1543597019634; Fri, 30 Nov 2018 08:56:59 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 
bits=128/128); Fri, 30 Nov 2018 08:56:58 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 06/27] iomap: wire up the iopoll method Date: Fri, 30 Nov 2018 09:56:25 -0700 Message-Id: <20181130165646.27341-7-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Christoph Hellwig Store the request queue the last bio was submitted to in the iocb private data in addition to the cookie so that we find the right block device. Also refactor the common direct I/O bio submission code into a nice little helper. Signed-off-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/gfs2/file.c | 2 ++ fs/iomap.c | 43 ++++++++++++++++++++++++++++--------------- fs/xfs/xfs_file.c | 1 + include/linux/iomap.h | 1 + 4 files changed, 32 insertions(+), 15 deletions(-) diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 45a17b770d97..358157efc5b7 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -1280,6 +1280,7 @@ const struct file_operations gfs2_file_fops = { .llseek = gfs2_llseek, .read_iter = gfs2_file_read_iter, .write_iter = gfs2_file_write_iter, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = gfs2_ioctl, .mmap = gfs2_mmap, .open = gfs2_open, @@ -1310,6 +1311,7 @@ const struct file_operations gfs2_file_fops_nolock = { .llseek = gfs2_llseek, .read_iter = gfs2_file_read_iter, .write_iter = gfs2_file_write_iter, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = gfs2_ioctl, .mmap = gfs2_mmap, .open = gfs2_open, diff --git a/fs/iomap.c b/fs/iomap.c index 74c1f37f0fd6..16787b3b09fd 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1436,6 +1436,28 @@ struct iomap_dio { }; }; +int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) +{ + struct 
request_queue *q = READ_ONCE(kiocb->private); + + if (!q) + return 0; + return blk_poll(q, READ_ONCE(kiocb->ki_cookie), spin); +} +EXPORT_SYMBOL_GPL(iomap_dio_iopoll); + +static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, + struct bio *bio) +{ + atomic_inc(&dio->ref); + + if (dio->iocb->ki_flags & IOCB_HIPRI) + bio->bi_opf |= REQ_HIPRI; + + dio->submit.last_queue = bdev_get_queue(iomap->bdev); + dio->submit.cookie = submit_bio(bio); +} + static ssize_t iomap_dio_complete(struct iomap_dio *dio) { struct kiocb *iocb = dio->iocb; @@ -1548,7 +1570,7 @@ static void iomap_dio_bio_end_io(struct bio *bio) } } -static blk_qc_t +static void iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, unsigned len) { @@ -1562,15 +1584,10 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io; - if (dio->iocb->ki_flags & IOCB_HIPRI) - flags |= REQ_HIPRI; - get_page(page); __bio_add_page(bio, page, len, 0); bio_set_op_attrs(bio, REQ_OP_WRITE, flags); - - atomic_inc(&dio->ref); - return submit_bio(bio); + iomap_dio_submit_bio(dio, iomap, bio); } static loff_t @@ -1666,9 +1683,6 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, bio_set_pages_dirty(bio); } - if (dio->iocb->ki_flags & IOCB_HIPRI) - bio->bi_opf |= REQ_HIPRI; - iov_iter_advance(dio->submit.iter, n); dio->size += n; @@ -1676,11 +1690,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, copied += n; nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES); - - atomic_inc(&dio->ref); - - dio->submit.last_queue = bdev_get_queue(iomap->bdev); - dio->submit.cookie = submit_bio(bio); + iomap_dio_submit_bio(dio, iomap, bio); } while (nr_pages); if (need_zeroout) { @@ -1883,6 +1893,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, if (dio->flags & IOMAP_DIO_WRITE_FUA) dio->flags &= ~IOMAP_DIO_NEED_SYNC; + WRITE_ONCE(iocb->ki_cookie, dio->submit.cookie); + 
WRITE_ONCE(iocb->private, dio->submit.last_queue); + if (!atomic_dec_and_test(&dio->ref)) { if (!dio->wait_for_completion) return -EIOCBQUEUED; diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 53c9ab8fb777..603e705781a4 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1203,6 +1203,7 @@ const struct file_operations xfs_file_operations = { .write_iter = xfs_file_write_iter, .splice_read = generic_file_splice_read, .splice_write = iter_file_splice_write, + .iopoll = iomap_dio_iopoll, .unlocked_ioctl = xfs_file_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = xfs_file_compat_ioctl, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 9a4258154b25..0fefb5455bda 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -162,6 +162,7 @@ typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret, unsigned flags); ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, const struct iomap_ops *ops, iomap_dio_end_io_t end_io); +int iomap_dio_iopoll(struct kiocb *kiocb, bool spin); #ifdef CONFIG_SWAP struct file; From patchwork Fri Nov 30 16:56:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706755 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F7BA13A4 for ; Fri, 30 Nov 2018 16:57:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5106B30415 for ; Fri, 30 Nov 2018 16:57:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3BBCB3041D; Fri, 30 Nov 2018 16:57:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E1FBB30415 for ; Fri, 30 Nov 2018 16:57:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727013AbeLAEG6 (ORCPT ); Fri, 30 Nov 2018 23:06:58 -0500 Received: from mail-it1-f193.google.com ([209.85.166.193]:37421 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727089AbeLAEG5 (ORCPT ); Fri, 30 Nov 2018 23:06:57 -0500 Received: by mail-it1-f193.google.com with SMTP id b5so10240222iti.2 for ; Fri, 30 Nov 2018 08:57:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ZECYVF1gJ/UytKVuamY/WF200Edf1Xru8F859yvniEc=; b=VRVVbdIqGnM2zcUr1aE2Cv9Q2sEVyi2OL5PJAdTs4jhL1bFeNZn3BBR1WQsrfN/sDQ 1XR1QBJnqOCQH+01q+u8JOcAMK+Jdy27tiWhTVculFwtMqGL2UJTuYlqe5O1jALHR6nl 0BbZFiBKpiTgezKV+/iByAeT1rEwZf56cYUezlHLzlv74sXYzE78mbyahdQ8janIxM9R fAfvcbyx1ZX/5WL0C0VgJSb5xkXM2Wpti+52cyj7HpckCJrpOR/G5Hd0Uwkd2yuPDfGd bhkJ4J04hVX20vg8OxRS3AApDbQNg/ew6sRGJxes3mKsl69uwNwVAnX2Zf5CpNjsIuj1 0wWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ZECYVF1gJ/UytKVuamY/WF200Edf1Xru8F859yvniEc=; b=btLsWatsjv3rp/tF6c7gfGydipLhlxzeelkkIhz1Bt95eJsmXFA8RR010l3Lav0Sdd tHbxObGn9FrLnrA78kkIx3DjunsGSikFeb9Ssd7slAcb87d9z4CG/U96gVfY06Rh5ENZ /TaK6eF+4P5wCsdACNcEho4bzW/Bmyi2awxZpC/uHWgp8eJo5Ke1WpOOeqbjDZet9Hj0 NrjQAp/4JlUbAcGoPV6UmFyqr/OdPRFlIeWnm88Q9aPSmUL3Y66jqt7G6/Y1nnbL24RC pciP2vB8vJxuynAQFVYmTedBEUwpZAEuBAC7Z1U6qdmhW5arEYVEVg9raRu+BtJd04PU /Rxg== X-Gm-Message-State: AA+aEWY/OET4Hl9PCngcf9tRz8d3LT0L09hHyVUDByKevdMDQ9OAcBCV 2uU6QOJP2KmGrHWJMMBE4yb9LFrMTVk= X-Google-Smtp-Source: 
AFSGD/WgVG5dOPb2JGLFeoUpe/MN3aHF/ZaWKBSbKEsHSjuXk91FEFnS3290Ry/6CArh0+yToB9S5A== X-Received: by 2002:a02:97fd:: with SMTP id v58mr5997280jaj.115.1543597021243; Fri, 30 Nov 2018 08:57:01 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.56.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:00 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 07/27] iomap: ensure that async polled IO is marked REQ_NOWAIT Date: Fri, 30 Nov 2018 09:56:26 -0700 Message-Id: <20181130165646.27341-8-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We can't wait for polled events to complete, as they may require active polling from whoever submitted them. If that is the same task that is submitting new IO, we could deadlock waiting for IO to complete that this task is supposed to be completing itself. Signed-off-by: Jens Axboe --- fs/iomap.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/fs/iomap.c b/fs/iomap.c index 16787b3b09fd..96d60b9b2bea 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1451,8 +1451,16 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, { atomic_inc(&dio->ref); - if (dio->iocb->ki_flags & IOCB_HIPRI) + if (dio->iocb->ki_flags & IOCB_HIPRI) { bio->bi_opf |= REQ_HIPRI; + /* + * For async polled IO, we can't wait for requests to + * complete, as they may also be polled and require active + * reaping. 
+ */ + if (!dio->wait_for_completion) + bio->bi_opf |= REQ_NOWAIT; + } dio->submit.last_queue = bdev_get_queue(iomap->bdev); dio->submit.cookie = submit_bio(bio); From patchwork Fri Nov 30 16:56:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706759 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87676109C for ; Fri, 30 Nov 2018 16:57:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 795B130415 for ; Fri, 30 Nov 2018 16:57:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6E08B3041D; Fri, 30 Nov 2018 16:57:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2F93F30415 for ; Fri, 30 Nov 2018 16:57:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726796AbeLAEG7 (ORCPT ); Fri, 30 Nov 2018 23:06:59 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:34200 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727089AbeLAEG6 (ORCPT ); Fri, 30 Nov 2018 23:06:58 -0500 Received: by mail-it1-f196.google.com with SMTP id x124so2412355itd.1 for ; Fri, 30 Nov 2018 08:57:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=er4XNTBrDyS5t2gvWJtJjpkVFpAPOC78sIXCy9gEMJ4=; 
b=Av/YwMzJ8POcTBIy9D5dhiqu9KvU5xSu1GWMV1OEqxgo3hod0BtdP3pKz07UtcSEFL lwcZgK+eCg51Yww201FoD3EOh5mDi7VMzppExEHkL1NO41NOmvOmFVnCqjohfKe1Op/B OD8pG8Cjw36bZFjhSUNI43CDqgH3dz3DFHVN3ChDE8fIzOu3tFxUdt9xTKueBEUR/9ER H507XFCdRTozL0YF+LD2KY/XT/JPJGHmaVw007CXHNR+g8ePhlAzx43Ko9O6vVcfVks3 FuTi6qnMAKNkYKtd/1TeY05UTLVMgM+frVH7+BjDI32v0+MLTAUMeCoMim5zP+4ot4De vGAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=er4XNTBrDyS5t2gvWJtJjpkVFpAPOC78sIXCy9gEMJ4=; b=e0zEOz6XH7IEFFu7ZhE7QLn5pfMAPVv+NEusnAl7ksZaHyHNvfp5MExnCqwdTiNzoe CXLAMNqjSrgXT4iNQnB0aEP6MlVWrLNNK6p3/nkTm1doZhQHtf3x6RAyOnHixsqXotSa bJnkRHZpcWB18mr4ZfQcFB4OtVtb2j7py00t0ltr40FiQNitsMg3b7zUVsmHt2DFRiS7 OsUa4Skticjx1LNH25FT/kWo/qp3FyRKfGop2LAzWA2i4ECm1hEGpWdRNAM+FG6Pb/WX DlphqjIHyfU4j9mzLmvajeUzIk+7/8CPnxXbPY66De7osQx2DzCjQ6/H7qBvgF5iVO8V 5HeQ== X-Gm-Message-State: AA+aEWbC8zKOcRU2V2WmGaqy15Kyu+KZ9KRU+8NgrDUio4iNMcMp81WU /ZBMh3M56qx2qZerA7nlZIQgnXUlews= X-Google-Smtp-Source: AFSGD/V/mIaMAcaIo+4uRU3IxtsT8MLN47bqMhUgE0te9bDcqM3IUtKtOUw9pEncR4KrUxrR8es7bg== X-Received: by 2002:a02:8943:: with SMTP id u3mr5622148jaj.92.1543597022850; Fri, 30 Nov 2018 08:57:02 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:02 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 08/27] aio: use assigned completion handler Date: Fri, 30 Nov 2018 09:56:27 -0700 Message-Id: <20181130165646.27341-9-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We know this is a read/write request, but in preparation for having different kinds of those, ensure that we call the assigned handler instead of assuming it's aio_complete_rw(). Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- fs/aio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/aio.c b/fs/aio.c index 05647d352bf3..cf0de61743e8 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1490,7 +1490,7 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) ret = -EINTR; /*FALLTHRU*/ default: - aio_complete_rw(req, ret, 0); + req->ki_complete(req, ret, 0); } } From patchwork Fri Nov 30 16:56:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706763 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E59613A4 for ; Fri, 30 Nov 2018 16:57:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3FF5D30415 for ; Fri, 30 Nov 2018 16:57:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 33F9C3041C; Fri, 30 Nov 2018 16:57:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D881D30421 for ; Fri, 30 Nov 2018 16:57:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727259AbeLAEHB (ORCPT ); Fri, 30 Nov 2018 23:07:01 -0500 Received: from mail-it1-f193.google.com 
([209.85.166.193]:38802 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727203AbeLAEHA (ORCPT ); Fri, 30 Nov 2018 23:07:00 -0500 Received: by mail-it1-f193.google.com with SMTP id h65so10225963ith.3 for ; Fri, 30 Nov 2018 08:57:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=sgBueyD41WcD6YGB1PXXB7qHQZZJctD1Yvvmpvxoomc=; b=VUJyCyloFNtiuGHdaieLZBVq9oWFG3AjexunXFHjMJz4Z1kkKaTvjIFo/ZTws9HLnD ub7LpbkC5cjVRrNiBxkGFJRoHnZyO+4X1It9tePq3+TlXzjMaSabSPr11FUuwt4/Rurg YKhQXvedROn9CNsN0qzCVX6RskknmHL14OHtgSfic1FPXXRUnuo/uHhrGvacKMSHavB4 MTvINDFPH+tz358m8YECIYzotbx+1XGTAgxN8diGvcTinIyLK99DUTYvGml4ylGnfMRp QtpMHhLD/LmQcKEg1Y8pfaYceQbZRknQ8rgmKUhm+iWZug0xb3CBMwNrq+H+ai+RuUd/ yU+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=sgBueyD41WcD6YGB1PXXB7qHQZZJctD1Yvvmpvxoomc=; b=iE21xw73JSKKozx7hT2OJkm71UgQgIxgRZVhSrxl4vSfJqaA4rjmaENFwDA0xLLpv3 PF8oTf6yvK4AoxBe0pGKkrK/orPHXIR6ZvuIFX0nbBFx+LcBUd0QksyOMEGCsq4+UWhV BJaeMYGxoVZ7ILd742nM8nZKKJVqwq/L5i8NvAJJ2kSkrUgxfdP5V/4ttJ/R8B6EFumC IBW50mYrZc96d5pph6x23+DnR1J0lWGVUi1AKJp48fW/UzXtozy6LAekNSasjzWmebi3 klpSxTyLcfqoojd/ZbLHbu2ODKerXh2VoA5cHoy4oqcicEZK+fKxsbVy7GUU7+IIx3pp qxyQ== X-Gm-Message-State: AA+aEWabnIvVWKzMhlM83itIP1CB+O1numRnqqkRkNQnen5TyxWeiISN 03E9vgp3ktKGtHdT47imgOAsg3KR7t0= X-Google-Smtp-Source: AFSGD/U2j1I6gomwXxzKd2EnHHF2mRzBx7Zqkrh6GYhl02JkQ58WvawhzLfLEaQKCtnAMxsrq4fqlg== X-Received: by 2002:a24:7c58:: with SMTP id a85mr6013497itd.9.1543597024348; Fri, 30 Nov 2018 08:57:04 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.02 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:03 -0800 (PST) From: Jens Axboe 
To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 09/27] aio: separate out ring reservation from req allocation Date: Fri, 30 Nov 2018 09:56:28 -0700 Message-Id: <20181130165646.27341-10-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is in preparation for certain types of IO not needing a ring reservation. Signed-off-by: Jens Axboe --- fs/aio.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index cf0de61743e8..eaceb40e6cf5 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -901,7 +901,7 @@ static void put_reqs_available(struct kioctx *ctx, unsigned nr) local_irq_restore(flags); } -static bool get_reqs_available(struct kioctx *ctx) +static bool __get_reqs_available(struct kioctx *ctx) { struct kioctx_cpu *kcpu; bool ret = false; @@ -993,6 +993,14 @@ static void user_refill_reqs_available(struct kioctx *ctx) spin_unlock_irq(&ctx->completion_lock); } +static bool get_reqs_available(struct kioctx *ctx) +{ + if (__get_reqs_available(ctx)) + return true; + user_refill_reqs_available(ctx); + return __get_reqs_available(ctx); +} + /* aio_get_req * Allocate a slot for an aio request. * Returns NULL if no requests are free. 
@@ -1001,24 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) { struct aio_kiocb *req; - if (!get_reqs_available(ctx)) { - user_refill_reqs_available(ctx); - if (!get_reqs_available(ctx)) - return NULL; - } - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); if (unlikely(!req)) - goto out_put; + return NULL; percpu_ref_get(&ctx->reqs); INIT_LIST_HEAD(&req->ki_list); refcount_set(&req->ki_refcnt, 0); req->ki_ctx = ctx; return req; -out_put: - put_reqs_available(ctx, 1); - return NULL; } static struct kioctx *lookup_ioctx(unsigned long ctx_id) @@ -1805,9 +1804,13 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, return -EINVAL; } + if (!get_reqs_available(ctx)) + return -EAGAIN; + + ret = -EAGAIN; req = aio_get_req(ctx); if (unlikely(!req)) - return -EAGAIN; + goto out_put_reqs_available; if (iocb.aio_flags & IOCB_FLAG_RESFD) { /* @@ -1870,11 +1873,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, goto out_put_req; return 0; out_put_req: - put_reqs_available(ctx, 1); percpu_ref_put(&ctx->reqs); if (req->ki_eventfd) eventfd_ctx_put(req->ki_eventfd); kmem_cache_free(kiocb_cachep, req); +out_put_reqs_available: + put_reqs_available(ctx, 1); return ret; } From patchwork Fri Nov 30 16:56:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706767 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8BF8613A4 for ; Fri, 30 Nov 2018 16:57:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E79E3041C for ; Fri, 30 Nov 2018 16:57:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 72B2230415; Fri, 30 Nov 2018 16:57:08 +0000 (UTC) 
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2987730415 for ; Fri, 30 Nov 2018 16:57:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727089AbeLAEHC (ORCPT ); Fri, 30 Nov 2018 23:07:02 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:40599 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727203AbeLAEHC (ORCPT ); Fri, 30 Nov 2018 23:07:02 -0500 Received: by mail-it1-f195.google.com with SMTP id h193so10215771ita.5 for ; Fri, 30 Nov 2018 08:57:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mh+baagw6TRyG7ZRVJ1uWljPK7oJpHidng0AtT9sC8g=; b=No3gkT1cBff8WIB0JKRLELIEuIBXuiFMK/q55F8ok1ZOSJh7obe2m3gXpWIPL+/Ac7 KsLpKYPkkNv7cgIUYh9vTkZnGvEL1E9NA4lFiejLQTz85JT6FU086NZ3EbauqFxcXNru sJQQxQRLNXHkqGAjLhL9xWaUzthzrKZBLS4VeBHm5VV8PUraNEurPBrUd0b2x03r3H9I PkwTovmzFGYsaBbd8mboEIqd/AZzz2mVWnfk0hGug/j6zj/Hz2H7PExGes8LBssvO1ac t0a8x1OVFv5Ec69l4ipkpkQH1WACIKpC7zpY9IUzwmddFIDyxdfSrmk3SKx4NkfuOuqK MvGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mh+baagw6TRyG7ZRVJ1uWljPK7oJpHidng0AtT9sC8g=; b=rUnQ3UeaIxMis3EZPBNxmLDjyLtvL8QlwIV2GCZ51Ftt5xwxtt2HPfJz7jQV6QDctl jfRE4eGAWmXuyvXoA3laEXT94aDUpB2U6fWzA+b869759XlKls0IpzxIA3qSq7bbHJi2 yOgtufTCx9dkVlZHpQorgS5BpEiHGYtwc3SgUL/Jcn+mSDjudBAQOKT1teByHGRDMc23 DBYQrY1Jr18FNsQh+wWt2XDW9+B0w/0UlUjGEeP93TivdSKAvw/2uAL9+fS2C5g65S61 
eDGxNWd6fveMDG4Q5YUT2mQnQ8dBrVXz6b0fFblJlr0ffpWvAgnk7d81wIJuLXetTpDV NfhQ== X-Gm-Message-State: AA+aEWaMbUaNHr1DQEvb2MX+Iu37mqsMF4E+RrNnY6KB8Xjc7NPgXAuZ lt/VhbYFILEqyko2S/ORtBZYcxRUj3E= X-Google-Smtp-Source: AFSGD/WcLQtsHHmAH2xItbMRnpRj0s2p04D04qlaY3Z7eEiybl1mtRHO/kG9CxSevAdJwGKbwj49PA== X-Received: by 2002:a02:12c5:: with SMTP id 66mr5946928jap.54.1543597025649; Fri, 30 Nov 2018 08:57:05 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:04 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req() Date: Fri, 30 Nov 2018 09:56:29 -0700 Message-Id: <20181130165646.27341-11-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It's 192 bytes, fairly substantial. Most items don't need to be cleared, especially not upfront. Clear the ones we do need to clear, and leave the other ones for setup when the iocb is prepared and submitted. 
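For readers following along outside the kernel tree, the idea can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_req, demo_get_req() and demo_prep_poll() are hypothetical stand-ins (not the real struct aio_kiocb or any fs/aio.c function), and malloc() stands in for kmem_cache_alloc() without __GFP_ZERO. The allocator sets only the fields that every path reads, and the opcode-specific prepare step clears the fields that only it uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request object; field names are illustrative. */
struct demo_req {
	void *ctx;           /* owning context, always set at allocation */
	void *eventfd;       /* must start as NULL: the error path tests it */
	unsigned int refcnt; /* must start at 0 */
	/* Poll-only state, initialized later by demo_prep_poll(). */
	void *head;
	bool woken;
	bool cancelled;
	char payload[128];   /* bulk of the object; never needs pre-clearing */
};

/* Allocate without zeroing; initialize only what every path relies on. */
static struct demo_req *demo_get_req(void *ctx)
{
	struct demo_req *req = malloc(sizeof(*req)); /* deliberately not calloc() */

	if (req) {
		req->ctx = ctx;
		req->eventfd = NULL;
		req->refcnt = 0;
	}
	return req;
}

/* Opcode-specific setup clears the fields that only this opcode reads. */
static void demo_prep_poll(struct demo_req *req)
{
	assert(req != NULL);
	req->head = NULL;
	req->woken = false;
	req->cancelled = false;
}
```

The trade-off in this sketch is that every field must now be written before it is read on each path, which is why the poll-only fields are cleared in the prepare step rather than relying on a zeroed allocation.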
Signed-off-by: Jens Axboe Reviewed-by: Christoph Hellwig --- fs/aio.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index eaceb40e6cf5..681f2072f81b 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1009,14 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) { struct aio_kiocb *req; - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); - if (unlikely(!req)) - return NULL; + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); + if (req) { + percpu_ref_get(&ctx->reqs); + req->ki_ctx = ctx; + INIT_LIST_HEAD(&req->ki_list); + refcount_set(&req->ki_refcnt, 0); + req->ki_eventfd = NULL; + } - percpu_ref_get(&ctx->reqs); - INIT_LIST_HEAD(&req->ki_list); - refcount_set(&req->ki_refcnt, 0); - req->ki_ctx = ctx; return req; } @@ -1730,6 +1731,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb) if (unlikely(!req->file)) return -EBADF; + req->head = NULL; + req->woken = false; + req->cancelled = false; + apt.pt._qproc = aio_poll_queue_proc; apt.pt._key = req->events; apt.iocb = aiocb; From patchwork Fri Nov 30 16:56:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706771 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1387318B8 for ; Fri, 30 Nov 2018 16:57:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0520330415 for ; Fri, 30 Nov 2018 16:57:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ED8743041D; Fri, 30 Nov 2018 16:57:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A1BED30415 for ; Fri, 30 Nov 2018 16:57:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727203AbeLAEHE (ORCPT ); Fri, 30 Nov 2018 23:07:04 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:56290 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727350AbeLAEHD (ORCPT ); Fri, 30 Nov 2018 23:07:03 -0500 Received: by mail-it1-f195.google.com with SMTP id o19so10088451itg.5 for ; Fri, 30 Nov 2018 08:57:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=8/1Nol3a3uaTr8sM9gFlQE0kinP2KU2cED85d/G/25k=; b=sq6DFq35bVoDp8yMVd4wxPxeB8WZxYMT8MrxJpIMsjZTcUC+cuzrHaib4c2h+cUWHG SPFt6/IM5SlxytyM7VaRwZJGxCVuWiTt+KVUDZ/adXoMwgVp5OERMgsbUDg2T084vID7 Y3koQryj0o11LyctjsTQhcjhUtC+udH6LMa+T47rVrF7ylZvpS+BgDB0MTNyxUf8hRwE UjS5PIpt1+bkk6fWt1vfk2t3CoMcj5+SIQua01wzJW1RvlU6ZGDbAKiF3dtC43BHoEGE BVXbxpc7KHuKh0MSsaoLwKPac57p3ojDSW3Kcxyd2OsyDoFB+CYlQfP+WdJ7GT9FyXAV paPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=8/1Nol3a3uaTr8sM9gFlQE0kinP2KU2cED85d/G/25k=; b=cahWSoVOQoCwnI5cQgoe/ni4ec77mfogRbckG/+VE30Aly9eb34gVawZV9Yp0NFcKe JhmspdmmkNkjyW7E6P471b6S/UtXMq5TCp45Wk9rFt10sqF9yDVWc4Nx2n173ZfgWhsn 0qVDFrdnzZHkV8gtOw8CHGJMoBlVsgPOK7DcZMepbVQjJVqMDE/zopZRSKy1u1uVAKNq OWua6HvSgXjd6BLYtAiJTN4pBjlVSlt9ZaxhGpL1JABnjeijZQVW1c1bsuZyNr0zgo4Z Tf19slAWyEGGt/YmTC7Pa5OQhvb0fHrgxY4oK+fNwvsdJbYqVYZ5M0PrqKjCACeZpsOi rQ7A== X-Gm-Message-State: AA+aEWZDabzd1tnQkBKjkg1dnu/nJY++QZu99Z3w6Ot3UB5crkQqmohO N/50ks10LNM8GDni0fi17Pvmh5dMETI= X-Google-Smtp-Source: 
AFSGD/UgQEX8ECP7m2g6PGBOyb8hL2FU1DTaFUDFfNacktD29Gyaf8QZxq/WPkzeZurYJbhxzTNmdQ== X-Received: by 2002:a02:242b:: with SMTP id f43mr5640161jaa.144.1543597027331; Fri, 30 Nov 2018 08:57:07 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:06 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions Date: Fri, 30 Nov 2018 09:56:30 -0700 Message-Id: <20181130165646.27341-12-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Plugging is meant to optimize submission of a string of IOs. If we don't have more than 2 being submitted, don't bother setting up a plug. Signed-off-by: Jens Axboe Reviewed-by: Christoph Hellwig --- fs/aio.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 681f2072f81b..533cb7b1112f 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1887,6 +1887,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, return ret; } +/* + * Plugging is meant to work with larger batches of IOs. If we don't + * have more than the below, then don't bother setting up a plug. + */ +#define AIO_PLUG_THRESHOLD 2 + /* sys_io_submit: * Queue the nr iocbs pointed to by iocbpp for processing. Returns * the number of iocbs queued. 
May return -EINVAL if the aio_context @@ -1919,7 +1925,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (nr > ctx->nr_events) nr = ctx->nr_events; - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) + blk_start_plug(&plug); for (i = 0; i < nr; i++) { struct iocb __user *user_iocb; @@ -1932,7 +1939,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (ret) break; } - blk_finish_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) + blk_finish_plug(&plug); percpu_ref_put(&ctx->users); return i ? i : ret; @@ -1959,7 +1967,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, if (nr > ctx->nr_events) nr = ctx->nr_events; - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) + blk_start_plug(&plug); for (i = 0; i < nr; i++) { compat_uptr_t user_iocb; @@ -1972,7 +1981,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, if (ret) break; } - blk_finish_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) + blk_finish_plug(&plug); percpu_ref_put(&ctx->users); return i ? 
i : ret; From patchwork Fri Nov 30 16:56:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706775 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9345418B8 for ; Fri, 30 Nov 2018 16:57:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 85C7F3041C for ; Fri, 30 Nov 2018 16:57:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7A4EB30421; Fri, 30 Nov 2018 16:57:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2B59E3041C for ; Fri, 30 Nov 2018 16:57:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727324AbeLAEHF (ORCPT ); Fri, 30 Nov 2018 23:07:05 -0500 Received: from mail-it1-f194.google.com ([209.85.166.194]:39314 "EHLO mail-it1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727350AbeLAEHF (ORCPT ); Fri, 30 Nov 2018 23:07:05 -0500 Received: by mail-it1-f194.google.com with SMTP id a6so10218242itl.4 for ; Fri, 30 Nov 2018 08:57:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6hPqsRSp/t/J4Haujgv9+B43f7+1jV9cXsozT1ugKjo=; b=Sz6VUAyvjCgyg/BgD9Rh8Dp3owrIvg6sysqFF7ip2j3Ssexon4xfM/7+iY6MUDmzh7 Ml7shDAlm4wGOUjRXhtwzF7Tpu0Z2QlwMSazKgfbPoag08AClX2aRokriLYyHZWaH3jl 
rOJtIKJOlQhYmJYYzQ7VZx4fgm0rwSQyBtvrwqfBm7DtbbfIi9jCxc4CZqcgDxDSciTi OPVZyF2GJs1TlTQRbAtlSR4Lw13sORgIdnhtZtYNW6rWJzbP8GPtpBQayu6Q6T0ulu29 zecSXkGIS328ENcoWi1VF1LXZCbzcMs9TWBPOA3RgZeRlNoXmHbDn0kOrT6wUZnwm8jj 6y9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6hPqsRSp/t/J4Haujgv9+B43f7+1jV9cXsozT1ugKjo=; b=PYcsqZ+7RaqSMrx9zM3jL6J3f7h/TCXPic2VjxUqgEwNFKzErymyUjKvbMABgoJfx1 k89xoXq7HQmpzhQNaEbZb1eOpzHOVWyX5BCz2W2H+Km9mllJmkgnqzCk3yhSqnmGAWg4 d5Ys0ENlZqNnUN9t5+3TF+/rGjoPmw+C0Msxn7BA5Ig66wvMADXAD0pLewG7rbWszT6m U9A7nkRBUe9H4NIQY8UHqXQvcLBl/fWBkj4kCgIa8h269Gu5/fSwRQssXThCsUBWyVS+ Z55Bnal7laiAGLNLfJNmOBKRcfr/ZxXfusQ0ufo7OD6KUUs2p0dTTJ/J8koQuIGcFnE9 2LMQ== X-Gm-Message-State: AA+aEWY8UEiz1MLemQsgGNfYd8c1+SVAvIGPYQOoTJlUWwmOpzEs2Wch g2twFJw2EOJ+nKNYY01QEf10A45p/Lc= X-Google-Smtp-Source: AFSGD/X4FBxX9pIBYP27LYasJrY9Clxt4rqZX5Wzptp2+bfMExF5sW53WDLQGfvkFQe+MErJXm+7Dw== X-Received: by 2002:a02:b719:: with SMTP id g25mr5496992jam.46.1543597028934; Fri, 30 Nov 2018 08:57:08 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:08 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 12/27] aio: use iocb_put() instead of open coding it Date: Fri, 30 Nov 2018 09:56:31 -0700 Message-Id: <20181130165646.27341-13-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Replace the percpu_ref_put() + kmem_cache_free() with a call to 
iocb_put() instead. Signed-off-by: Jens Axboe Reviewed-by: Christoph Hellwig --- fs/aio.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 533cb7b1112f..e8457f9486e3 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1878,10 +1878,9 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, goto out_put_req; return 0; out_put_req: - percpu_ref_put(&ctx->reqs); if (req->ki_eventfd) eventfd_ctx_put(req->ki_eventfd); - kmem_cache_free(kiocb_cachep, req); + iocb_put(req); out_put_reqs_available: put_reqs_available(ctx, 1); return ret; From patchwork Fri Nov 30 16:56:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706779 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 931B313A4 for ; Fri, 30 Nov 2018 16:57:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 811FD3041C for ; Fri, 30 Nov 2018 16:57:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 70F7230425; Fri, 30 Nov 2018 16:57:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E2FCF3041C for ; Fri, 30 Nov 2018 16:57:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727521AbeLAEHH (ORCPT ); Fri, 30 Nov 2018 23:07:07 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:39320 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S1727510AbeLAEHH (ORCPT ); Fri, 30 Nov 2018 23:07:07 -0500 Received: by mail-it1-f195.google.com with SMTP id a6so10218366itl.4 for ; Fri, 30 Nov 2018 08:57:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AZTxn1zH/8CPF8nHBpvAZQPHd/ePx+TIdDehIsUuIw4=; b=PF6DjIKPcdZe1CBhbS6eeoI6rDzt1xIY1lMVAhkjZc2kK6cIgRUv+ge1JZkHY5iLHQ 0aTFMKYBtuFWjAwlouTgK2XuF4PIMBHBQB4EgTM5sYP5yUY/mvyMZOspVLA5dvSpN2EX qzvEj3XhXRaxpTsrfUqKSkMBcIgsuYdikHuuTaImMl4V9RRdyVuOB5Jf6ssAg+6e4ND+ zFc6y1PsRr+lLU3gOcXL5lKhkEJlrj6CcAWmzRHFaKxBG9H2wfr8O+edPp2j4jWCA/Pa 7CLa+5WSiAYWjUzo6tTriwB4r6PCdNTGPXD0gEitpHm46cDSUb75QNRJ3+2EyJ1eoenh OrrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AZTxn1zH/8CPF8nHBpvAZQPHd/ePx+TIdDehIsUuIw4=; b=IJPAtrIRC0KExdwS/aquTRaz0CP0xjksveZUv8m+oxfc+Ii91krqgXN9Q9pejt1co4 7rIvrLAqhVc0GG0z0A1Wq5zHQj5aJ33R3TGX6LbhbyEEEZJyaiWXdi0NJYONukGujQB6 cByIM6NwmYNuiiFYdSeAi440Qyi2EE9yvLdzb6i6JRKxILQM2zEPJ8bkPW0Tp7ZWE0BR yXbSvTvyb0HNFMOk0HwdJseoYPh+td0Np4FjQnKO259cQg9pKHBQx3jD0C+Ut1iYmC7M lRWNiZSijg1dopOgn4dt5IwD51m0SIaEMSl5w4/ZFpUx6yhiT7EcecZUIT68SXyWZuih QV+A== X-Gm-Message-State: AA+aEWb0U+Np5UgeFuPlZT6JrUkuWjPrVl6nAQG0rkm0ViCcPPJhm3dW o05yF03jUyRnvZ+XV4I2rgkglHGKh9c= X-Google-Smtp-Source: AFSGD/XQJvlN3HNzAc9gngyQfLrMfsCydzu0/WcespMZxacupaztQVkZDsYMOi+LAED00YfgM5Uh2Q== X-Received: by 2002:a02:284:: with SMTP id 126mr5931001jau.47.1543597030497; Fri, 30 Nov 2018 08:57:10 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:09 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org 
Cc: hch@lst.de, Jens Axboe Subject: [PATCH 13/27] aio: split out iocb copy from io_submit_one() Date: Fri, 30 Nov 2018 09:56:32 -0700 Message-Id: <20181130165646.27341-14-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In preparation of handing in iocbs in a different fashion as well. Also make it clear that the iocb being passed in isn't modified, by marking it const throughout. Signed-off-by: Jens Axboe --- fs/aio.c | 68 +++++++++++++++++++++++++++++++------------------------- 1 file changed, 38 insertions(+), 30 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index e8457f9486e3..ba5758c854e8 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1414,7 +1414,7 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) aio_complete(iocb, res, res2); } -static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) +static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) { int ret; @@ -1455,7 +1455,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb) return ret; } -static int aio_setup_rw(int rw, struct iocb *iocb, struct iovec **iovec, +static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec, bool vectored, bool compat, struct iov_iter *iter) { void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf; @@ -1494,8 +1494,8 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } -static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored, - bool compat) +static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb, + bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct iov_iter iter; @@ -1527,8 +1527,8 @@ static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored, return 
ret; } -static ssize_t aio_write(struct kiocb *req, struct iocb *iocb, bool vectored, - bool compat) +static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb, + bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct iov_iter iter; @@ -1583,7 +1583,8 @@ static void aio_fsync_work(struct work_struct *work) aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0); } -static int aio_fsync(struct fsync_iocb *req, struct iocb *iocb, bool datasync) +static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb, + bool datasync) { if (unlikely(iocb->aio_buf || iocb->aio_offset || iocb->aio_nbytes || iocb->aio_rw_flags)) @@ -1711,7 +1712,7 @@ aio_poll_queue_proc(struct file *file, struct wait_queue_head *head, add_wait_queue(head, &pt->iocb->poll.wait); } -static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb) +static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb) { struct kioctx *ctx = aiocb->ki_ctx; struct poll_iocb *req = &aiocb->poll; @@ -1783,27 +1784,23 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb) return 0; } -static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, - bool compat) +static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, + struct iocb __user *user_iocb, bool compat) { struct aio_kiocb *req; - struct iocb iocb; ssize_t ret; - if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) - return -EFAULT; - /* enforce forwards compatibility on users */ - if (unlikely(iocb.aio_reserved2)) { + if (unlikely(iocb->aio_reserved2)) { pr_debug("EINVAL: reserve field set\n"); return -EINVAL; } /* prevent overflows */ if (unlikely( - (iocb.aio_buf != (unsigned long)iocb.aio_buf) || - (iocb.aio_nbytes != (size_t)iocb.aio_nbytes) || - ((ssize_t)iocb.aio_nbytes < 0) + (iocb->aio_buf != (unsigned long)iocb->aio_buf) || + (iocb->aio_nbytes != (size_t)iocb->aio_nbytes) || + ((ssize_t)iocb->aio_nbytes < 0) )) 
{ pr_debug("EINVAL: overflow check\n"); return -EINVAL; @@ -1817,14 +1814,14 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, if (unlikely(!req)) goto out_put_reqs_available; - if (iocb.aio_flags & IOCB_FLAG_RESFD) { + if (iocb->aio_flags & IOCB_FLAG_RESFD) { /* * If the IOCB_FLAG_RESFD flag of aio_flags is set, get an * instance of the file* now. The file descriptor must be * an eventfd() fd, and will be signaled for each completed * event using the eventfd_signal() function. */ - req->ki_eventfd = eventfd_ctx_fdget((int) iocb.aio_resfd); + req->ki_eventfd = eventfd_ctx_fdget((int) iocb->aio_resfd); if (IS_ERR(req->ki_eventfd)) { ret = PTR_ERR(req->ki_eventfd); req->ki_eventfd = NULL; @@ -1839,32 +1836,32 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, } req->ki_user_iocb = user_iocb; - req->ki_user_data = iocb.aio_data; + req->ki_user_data = iocb->aio_data; - switch (iocb.aio_lio_opcode) { + switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(&req->rw, &iocb, false, compat); + ret = aio_read(&req->rw, iocb, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(&req->rw, &iocb, false, compat); + ret = aio_write(&req->rw, iocb, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(&req->rw, &iocb, true, compat); + ret = aio_read(&req->rw, iocb, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(&req->rw, &iocb, true, compat); + ret = aio_write(&req->rw, iocb, true, compat); break; case IOCB_CMD_FSYNC: - ret = aio_fsync(&req->fsync, &iocb, false); + ret = aio_fsync(&req->fsync, iocb, false); break; case IOCB_CMD_FDSYNC: - ret = aio_fsync(&req->fsync, &iocb, true); + ret = aio_fsync(&req->fsync, iocb, true); break; case IOCB_CMD_POLL: - ret = aio_poll(req, &iocb); + ret = aio_poll(req, iocb); break; default: - pr_debug("invalid aio operation %d\n", iocb.aio_lio_opcode); + pr_debug("invalid aio operation %d\n", iocb->aio_lio_opcode); ret = -EINVAL; break; } @@ 
-1886,6 +1883,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, return ret; } +static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, + bool compat) +{ + struct iocb iocb; + + if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) + return -EFAULT; + + return __io_submit_one(ctx, &iocb, user_iocb, compat); +} + /* * Plugging is meant to work with larger batches of IOs. If we don't * have more than the below, then don't bother setting up a plug. From patchwork Fri Nov 30 16:56:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706783 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8946118B8 for ; Fri, 30 Nov 2018 16:57:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C04830421 for ; Fri, 30 Nov 2018 16:57:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7097830425; Fri, 30 Nov 2018 16:57:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D9B030421 for ; Fri, 30 Nov 2018 16:57:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727350AbeLAEHI (ORCPT ); Fri, 30 Nov 2018 23:07:08 -0500 Received: from mail-it1-f193.google.com ([209.85.166.193]:40623 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727510AbeLAEHH (ORCPT ); Fri, 30 Nov 2018 23:07:07 
-0500 Received: by mail-it1-f193.google.com with SMTP id h193so10216297ita.5 for ; Fri, 30 Nov 2018 08:57:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=RA6SARsil7zOqlPTyhrfo176dQaQeNwiqXCQQq9Z3o8=; b=cXKb3aOYD6f2kYsDTK5bOdZz8yeyWYXjvNbeRg2EVolRnF/oVZGdpMZrt5tgBb9+DU ncEsoOaeILYqgqJaJhjQadqUli/mEoWF0TeoDHa0/KB9Qce0OEybVDkqlUkn2pUiw+n6 ZR7sbZVPECT9MbjodsIHF9Sxa59OuzoW1mh+2SyLvmUOv6cGg7rcy7p7uVpVpvvQ2QY/ YKzFECV4d+1L8hjoExG6FzQpZwOrf80R+NSHVhunjzCewK0hj7neuflrb6OVDClJCs36 qQvumqz899pTdrgyCUcicQJpHKi7C4nzqjrZTwgIJzX3AzysEcrBuqEEbtbdGt6i7qUV sDhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RA6SARsil7zOqlPTyhrfo176dQaQeNwiqXCQQq9Z3o8=; b=ozPRG/rs8wNHOx9MdVBLnAwlmp96AKK+2KLHe+SacjMPzMfPZIunPWHvH74BEz2SkA Lc9d8FRNE13rD0Ijmdzioy8nN3WGOUwThcDXyJnaL+8PTSw2nzXuk0bQ8u+jZ01xWZSK wMm/TzNTYXlvONAvhMO/dbU+fO3CunXQv+eXiTqbOpKZ55KVp+3xLfyxkQBMcMnFIk/V l3bagyoqmFcmmq2gmvss0dtgZb0Ka+o4MaXgpb9UzAztUYBUElNhroz5QCwccPsTSTct gXK6hZZVyb0p+mHfyDaG7ZeEbvnrdU6jpPV/LDVVzwKRIzSt4LpAVR1/BlVBOh6Mudux 054Q== X-Gm-Message-State: AA+aEWZZftcWILrv1dlXTySR6PnI15nuiMDYK/bcIqaTgzaRd+7xuBPS GeBHANsLr8bkB5RLbZDmM3NHbAQhRGo= X-Google-Smtp-Source: AFSGD/Wn4z2tODm+l3fGGPykhpju4YxFoTGQtbJIDaIbhlFRtJoaoR51V27WrZ4R8mE9x4HCNPF38Q== X-Received: by 2002:a24:9a01:: with SMTP id l1mr5788132ite.113.1543597031767; Fri, 30 Nov 2018 08:57:11 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:10 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 14/27] aio: abstract out io_event 
filler helper Date: Fri, 30 Nov 2018 09:56:33 -0700 Message-Id: <20181130165646.27341-15-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Jens Axboe --- fs/aio.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index ba5758c854e8..12859ea1cb64 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1057,6 +1057,15 @@ static inline void iocb_put(struct aio_kiocb *iocb) } } +static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, + long res, long res2) +{ + ev->obj = (u64)(unsigned long)iocb->ki_user_iocb; + ev->data = iocb->ki_user_data; + ev->res = res; + ev->res2 = res2; +} + /* aio_complete * Called when the io request on the given iocb is complete. */ @@ -1084,10 +1093,7 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2) ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); event = ev_page + pos % AIO_EVENTS_PER_PAGE; - event->obj = (u64)(unsigned long)iocb->ki_user_iocb; - event->data = iocb->ki_user_data; - event->res = res; - event->res2 = res2; + aio_fill_event(event, iocb, res, res2); kunmap_atomic(ev_page); flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); From patchwork Fri Nov 30 16:56:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706785 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7E5C6109C for ; Fri, 30 Nov 2018 16:57:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 
6E5DE30421 for ; Fri, 30 Nov 2018 16:57:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 62F5130425; Fri, 30 Nov 2018 16:57:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4D0D30421 for ; Fri, 30 Nov 2018 16:57:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727583AbeLAEHK (ORCPT ); Fri, 30 Nov 2018 23:07:10 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:35904 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727565AbeLAEHK (ORCPT ); Fri, 30 Nov 2018 23:07:10 -0500 Received: by mail-it1-f196.google.com with SMTP id c9so10244315itj.1 for ; Fri, 30 Nov 2018 08:57:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=zVtp4FzAAmfv0AO0psddh6PYN7Ojhn91p4dzu/5A1xE=; b=S2oNWkzPDKnuNaZUPjU3sbAzO/K4UdKC4/U0DcGWuLxUJqzMyTeBYyPK8wzhbcsdk2 Dbg4nFh3VSBoPvu1odXDYrbx9zOS4OTn+JZzzg0jXN6mcrC1a+GjEtodE01KBOlU+AY7 DytMsJwIXO6KetnAbKU3lATpKSFVZ6K8foX27T2E2LhP/ETSO0ZF2lZ6QMXhqWfMaHPE wzt5yMoM3OP9FSMF4+3oSlGiE/4fCRg3KHhpCMjM/iesiGF/x81DrXDc8FxBXE9/DYIg uvvLikivffbWjwvL9+/BP6zefK2S/8ANANOKs/c0qjKqaw/2klZmOrY7hP9JFEUS56+y tKcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=zVtp4FzAAmfv0AO0psddh6PYN7Ojhn91p4dzu/5A1xE=; b=KWmZdskUY3RKT2j0kuZb4gl43sHfVeFMCcN47zqKpDHpEptc/Oa+4jlGGiZfS1SydS pkNXndi3n/jhzVTrh1+oNBQxgi+3YnIUIiWtPrjLJ6MaOpfMzQZBZfCwHvvm+9+O2Rgs 
WYccAxPv6EniM5ONKG2/IWc0IyurmCxW8I3cxQPXUqpIR6lAx7BkIcwsgGffffuTm3gU dQ0Aw981gxIhb74qs5nlVLSg4k4ERaiO7HXWxAAYTtLaZdVyMyl6zKaSFbfi5B1rbLPB ucONmy+/oTw6XbTGd3k0F+5qwHRdpaeM2fgyMrMgReOwc01bDcxKRCHXmBLRF0cXI35C mIKA== X-Gm-Message-State: AA+aEWZOf2V6NSD5uj7yeF6itrjsiWUKBelzWg8pTTmnoYZQ1GjXsBHP vrhBjc4PZft7B9sJHdSSISIj89vPD3w= X-Google-Smtp-Source: AFSGD/WhTR42dwUiMy//ZaNIPVvKeFXCWy0tj+PCGB5580uELcjDMRaUVFfHxUuEFfq9jIEgPcRwzQ== X-Received: by 2002:a02:57c4:: with SMTP id b65mr5868732jad.79.1543597033663; Fri, 30 Nov 2018 08:57:13 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:12 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 15/27] aio: add io_setup2() system call Date: Fri, 30 Nov 2018 09:56:34 -0700 Message-Id: <20181130165646.27341-16-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is just like io_setup(), except add a flags argument to let the caller control/define some of the io_context behavior. Outside of that, we pass in an iocb array for future use. 
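As a reading aid, the argument checks that this patch applies can be modeled in plain user-space C. This is only a sketch, not kernel code: io_setup2_check_args() is a hypothetical name, and it mirrors nothing more than the -EINVAL conditions visible in the diff for this patch (any non-zero flag, a non-zero value already stored at ctxp, or nr_events equal to zero). The iocbs argument is accepted but unused at this point in the series, so it does not appear in the model.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the checks done by io_setup2() and
 * io_setup_flags() in this patch. Returns 0 if the arguments would get
 * past the early validation, -EINVAL otherwise.
 */
static int io_setup2_check_args(uint32_t nr_events, uint32_t flags,
				unsigned long ctx_in)
{
	/* No flags are defined yet, so any set bit is rejected. */
	if (flags)
		return -EINVAL;

	/* *ctxp must be zero on entry and at least one event is required. */
	if (ctx_in || nr_events == 0)
		return -EINVAL;

	return 0;
}
```

A later patch in the series relaxes the flags check to allow IOCTX_FLAG_USERIOCB; the model above reflects only the state after this patch.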
Signed-off-by: Jens Axboe --- arch/x86/entry/syscalls/syscall_64.tbl | 1 + fs/aio.c | 70 ++++++++++++++++---------- include/linux/syscalls.h | 2 + include/uapi/asm-generic/unistd.h | 4 +- kernel/sys_ni.c | 1 + 5 files changed, 50 insertions(+), 28 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index f0b1709a5ffb..67c357225fb0 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -343,6 +343,7 @@ 332 common statx __x64_sys_statx 333 common io_pgetevents __x64_sys_io_pgetevents 334 common rseq __x64_sys_rseq +335 common io_setup2 __x64_sys_io_setup2 # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/fs/aio.c b/fs/aio.c index 12859ea1cb64..74831ce2185e 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -94,6 +94,8 @@ struct kioctx { unsigned long user_id; + unsigned int flags; + struct __percpu kioctx_cpu *cpu; /* @@ -680,21 +682,24 @@ static void aio_nr_sub(unsigned nr) spin_unlock(&aio_nr_lock); } -/* ioctx_alloc - * Allocates and initializes an ioctx. Returns an ERR_PTR if it failed. - */ -static struct kioctx *ioctx_alloc(unsigned nr_events) +static struct kioctx *io_setup_flags(unsigned long ctxid, + unsigned int nr_events, unsigned int flags) { struct mm_struct *mm = current->mm; struct kioctx *ctx; int err = -ENOMEM; - /* * Store the original nr_events -- what userspace passed to io_setup(), * for counting against the global limit -- before it changes. */ unsigned int max_reqs = nr_events; + if (unlikely(ctxid || nr_events == 0)) { + pr_debug("EINVAL: ctx %lu nr_events %u\n", + ctxid, nr_events); + return ERR_PTR(-EINVAL); + } + /* * We keep track of the number of available ringbuffer slots, to prevent * overflow (reqs_available), and we also use percpu counters for this. 
@@ -720,6 +725,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) if (!ctx) return ERR_PTR(-ENOMEM); + ctx->flags = flags; ctx->max_reqs = max_reqs; spin_lock_init(&ctx->ctx_lock); @@ -1275,6 +1281,33 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, return ret; } +SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, + iocbs, aio_context_t __user *, ctxp) +{ + struct kioctx *ioctx; + unsigned long ctx; + long ret; + + if (flags) + return -EINVAL; + + ret = get_user(ctx, ctxp); + if (unlikely(ret)) + goto out; + + ioctx = io_setup_flags(ctx, nr_events, flags); + ret = PTR_ERR(ioctx); + if (IS_ERR(ioctx)) + goto out; + + ret = put_user(ioctx->user_id, ctxp); + if (ret) + kill_ioctx(current->mm, ioctx, NULL); + percpu_ref_put(&ioctx->users); +out: + return ret; +} + /* sys_io_setup: * Create an aio_context capable of receiving at least nr_events. * ctxp must not point to an aio_context that already exists, and @@ -1290,7 +1323,7 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, */ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1298,14 +1331,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { ret = put_user(ioctx->user_id, ctxp); @@ -1313,7 +1339,6 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) kill_ioctx(current->mm, ioctx, NULL); percpu_ref_put(&ioctx->users); } - out: return ret; } @@ -1321,7 +1346,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp) #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(io_setup, 
unsigned, nr_events, u32 __user *, ctx32p) { - struct kioctx *ioctx = NULL; + struct kioctx *ioctx; unsigned long ctx; long ret; @@ -1329,23 +1354,14 @@ COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p) if (unlikely(ret)) goto out; - ret = -EINVAL; - if (unlikely(ctx || nr_events == 0)) { - pr_debug("EINVAL: ctx %lu nr_events %u\n", - ctx, nr_events); - goto out; - } - - ioctx = ioctx_alloc(nr_events); + ioctx = io_setup_flags(ctx, nr_events, 0); ret = PTR_ERR(ioctx); if (!IS_ERR(ioctx)) { - /* truncating is ok because it's a user address */ - ret = put_user((u32)ioctx->user_id, ctx32p); + ret = put_user(ioctx->user_id, ctx32p); if (ret) kill_ioctx(current->mm, ioctx, NULL); percpu_ref_put(&ioctx->users); } - out: return ret; } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 2ac3d13a915b..b661e78717e6 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -287,6 +287,8 @@ static inline void addr_limit_user_check(void) */ #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx); +asmlinkage long sys_io_setup2(unsigned, unsigned, struct iocb __user *, + aio_context_t __user *); asmlinkage long sys_io_destroy(aio_context_t ctx); asmlinkage long sys_io_submit(aio_context_t, long, struct iocb __user * __user *); diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 538546edbfbd..b4527ed373b0 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -738,9 +738,11 @@ __SYSCALL(__NR_statx, sys_statx) __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents) #define __NR_rseq 293 __SYSCALL(__NR_rseq, sys_rseq) +#define __NR_io_setup2 294 +__SYSCALL(__NR_io_setup2, sys_io_setup2) #undef __NR_syscalls -#define __NR_syscalls 294 +#define __NR_syscalls 295 /* * 32 bit systems traditionally used different diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 
df556175be50..17c8b4393669 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -37,6 +37,7 @@ asmlinkage long sys_ni_syscall(void) */ COND_SYSCALL(io_setup); +COND_SYSCALL(io_setup2); COND_SYSCALL_COMPAT(io_setup); COND_SYSCALL(io_destroy); COND_SYSCALL(io_submit); From patchwork Fri Nov 30 16:56:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706789 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 459AC109C for ; Fri, 30 Nov 2018 16:57:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3713E30421 for ; Fri, 30 Nov 2018 16:57:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2BBC130427; Fri, 30 Nov 2018 16:57:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82D2830421 for ; Fri, 30 Nov 2018 16:57:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727588AbeLAEHM (ORCPT ); Fri, 30 Nov 2018 23:07:12 -0500 Received: from mail-it1-f193.google.com ([209.85.166.193]:56310 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727565AbeLAEHL (ORCPT ); Fri, 30 Nov 2018 23:07:11 -0500 Received: by mail-it1-f193.google.com with SMTP id o19so10088973itg.5 for ; Fri, 30 Nov 2018 08:57:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=oKKQgmcCq/9Ah7syrea645E2X6XDYJJKTHIbZiVUtQA=; b=d0tJD0Wy9ZqANIAHGeCLauIvT1VK7FS8IK8QoCAOQXsuEJ7/RYSep1DMCbjukV2xit 8IN81K9MwjMNC0trD1xv50CEJzyhORoc0CDg4G0MV1WgDMJvxbL3H03rNtcpEq65NqVA lKELC/VGnU7DasbLyGho8vxDQ48aXQbL/LkZ8nmQQBrIfA7dp+PgD5/bXph7HVLP12E3 exI4DhxNeeIZQigdrz7YQGEpI22R2M3jwOT9UfUv833rWUfDudVVDr45BIK94UzDpKtd cp7dFFYc+ty1caW3bf01/jmmL7fjKcRK1Ji5WFehV0obKLWPk5KIQjq/3X/5l0XENa3m 6TOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=oKKQgmcCq/9Ah7syrea645E2X6XDYJJKTHIbZiVUtQA=; b=nvsKhz606QMcL7J1bGYBhKVzUB+2vfMbLEWjTbXiYu85x0KTIha1uMLGoAiGQYjIBL 912aMZNXxtQ8301Q9NZp1d55+ZLGzEJiSN+lgmxiRyGZCrIFsh+sG4o6G4aa9r3pwaWG 7uFxEBCAwsKXDZ6oY8lAIs052OE2quUk5s9XEL658DP4wsiOc1jydmnsUXmHPNGMbGOG 3hy4RFPp7W/AYfDrUqVm3hERQ6fjqy822zq61ba/zHYg0cYA+DgsWOiOQI/YGx7KPx85 cqVSukhBJIdJJh3MfQukhOoN+K3UG/HN5R1oFulaTnxiFQYMXUVv89t8j6xY3GNyiX/R e2pQ== X-Gm-Message-State: AA+aEWa3SQayR2U7o6k6RZnT2QF8cU1d+qajrrQN9IlAXSb7cFQURS+y 2i4i2QGe0wstr7f4sAgTIonc8f2FMUw= X-Google-Smtp-Source: AFSGD/VKuFulIgf3VgS6jKp58AG2QjXP9zPiQs34oa5nM39AyaEQN/SUENA7iC1X/xIfVgA60BlOIA== X-Received: by 2002:a24:750f:: with SMTP id y15mr5507249itc.177.1543597035467; Fri, 30 Nov 2018 08:57:15 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.13 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:14 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 16/27] aio: add support for having user mapped iocbs Date: Fri, 30 Nov 2018 09:56:35 -0700 Message-Id: <20181130165646.27341-17-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: 
<20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For io_submit(), we have to first copy each pointer to an iocb, then copy the iocb. The latter is 64 bytes in size, and that's a lot of copying for a single IO. Add support for setting IOCTX_FLAG_USERIOCB through the new io_setup2() system call, which allows the iocbs to reside in userspace. If this flag is used, then io_submit() doesn't take pointers to iocbs anymore; it takes an index value into the array of iocbs instead. Similarly, for io_getevents(), the io_event ->obj will be the index, not the pointer to the iocb. See the change made to fio to support this feature; it's pretty trivial to adapt to. For applications, like fio, that previously embedded the iocb inside an application private structure, some sort of lookup table/structure is needed to find the private IO structure from the index at io_getevents() time. 
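The lookup table mentioned above could be as simple as an array of pointers indexed by the iocb slot. The following is a minimal sketch under stated assumptions, not the fio implementation: struct app_io, struct io_table, NR_IOCBS, io_table_set() and io_table_lookup() are all hypothetical names introduced for illustration. The application records its private structure under the slot index before submitting, and uses the ->obj value from the completion event to find it again.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_IOCBS 8	/* hypothetical: number of iocb slots given to io_setup2() */

/* Hypothetical application-private per-IO state. */
struct app_io {
	int tag;
	size_t len;
};

/* One slot per user-mapped iocb; the slot number is the iocb index. */
struct io_table {
	struct app_io *slots[NR_IOCBS];
};

/* Remember which private structure owns iocb slot 'index'. */
static int io_table_set(struct io_table *t, uint64_t index, struct app_io *io)
{
	if (index >= NR_IOCBS)
		return -1;
	t->slots[index] = io;
	return 0;
}

/* Map a completion event's ->obj (an index in this mode) back to its owner. */
static struct app_io *io_table_lookup(const struct io_table *t, uint64_t obj)
{
	if (obj >= NR_IOCBS)
		return NULL;
	return t->slots[obj];
}
```

A flat array keeps the lookup at constant cost and needs no locking as long as a single thread both submits and reaps; other designs (for example, storing the index inside the private structure and keeping a parallel array) would work equally well.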
http://git.kernel.dk/cgit/fio/commit/?id=3c3168e91329c83880c91e5abc28b9d6b940fd95 Signed-off-by: Jens Axboe --- fs/aio.c | 111 +++++++++++++++++++++++++++++++---- include/uapi/linux/aio_abi.h | 2 + 2 files changed, 101 insertions(+), 12 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 74831ce2185e..380e6fe8c429 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -121,6 +121,9 @@ struct kioctx { struct page **ring_pages; long nr_pages; + struct page **iocb_pages; + long iocb_nr_pages; + struct rcu_work free_rwork; /* see free_ioctx() */ /* @@ -216,6 +219,11 @@ static struct vfsmount *aio_mnt; static const struct file_operations aio_ring_fops; static const struct address_space_operations aio_ctx_aops; +static const unsigned int iocb_page_shift = + ilog2(PAGE_SIZE / sizeof(struct iocb)); + +static void aio_useriocb_free(struct kioctx *); + static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { struct file *file; @@ -572,6 +580,7 @@ static void free_ioctx(struct work_struct *work) free_rwork); pr_debug("freeing %p\n", ctx); + aio_useriocb_free(ctx); aio_free_ring(ctx); free_percpu(ctx->cpu); percpu_ref_exit(&ctx->reqs); @@ -1281,6 +1290,61 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr, return ret; } +static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index) +{ + unsigned int page_index; + struct iocb *iocb; + + page_index = index >> iocb_page_shift; + index &= ((1 << iocb_page_shift) - 1); + iocb = page_address(ctx->iocb_pages[page_index]); + + return iocb + index; +} + +static void aio_useriocb_free(struct kioctx *ctx) +{ + int i; + + if (!ctx->iocb_nr_pages) + return; + + for (i = 0; i < ctx->iocb_nr_pages; i++) + put_page(ctx->iocb_pages[i]); + + kfree(ctx->iocb_pages); + ctx->iocb_pages = NULL; + ctx->iocb_nr_pages = 0; +} + +static int aio_useriocb_map(struct kioctx *ctx, struct iocb __user *iocbs) +{ + int nr_pages, ret; + + if ((unsigned long) iocbs & ~PAGE_MASK) + return -EINVAL; + + nr_pages = sizeof(struct iocb) 
* ctx->max_reqs; + nr_pages = (nr_pages + PAGE_SIZE - 1) >> PAGE_SHIFT; + + ctx->iocb_pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL); + if (!ctx->iocb_pages) + return -ENOMEM; + + down_write(&current->mm->mmap_sem); + ret = get_user_pages((unsigned long) iocbs, nr_pages, 0, + ctx->iocb_pages, NULL); + up_write(&current->mm->mmap_sem); + + if (ret < nr_pages) { + kfree(ctx->iocb_pages); + return -ENOMEM; + } + + ctx->iocb_nr_pages = nr_pages; + return 0; +} + SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, iocbs, aio_context_t __user *, ctxp) { @@ -1288,7 +1352,7 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, unsigned long ctx; long ret; - if (flags) + if (flags & ~IOCTX_FLAG_USERIOCB) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1300,9 +1364,17 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, if (IS_ERR(ioctx)) goto out; + if (flags & IOCTX_FLAG_USERIOCB) { + ret = aio_useriocb_map(ioctx, iocbs); + if (ret) + goto err; + } + ret = put_user(ioctx->user_id, ctxp); - if (ret) + if (ret) { +err: kill_ioctx(current->mm, ioctx, NULL); + } percpu_ref_put(&ioctx->users); out: return ret; @@ -1851,10 +1923,13 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } } - ret = put_user(KIOCB_KEY, &user_iocb->aio_key); - if (unlikely(ret)) { - pr_debug("EFAULT: aio_key\n"); - goto out_put_req; + /* Don't support cancel on user mapped iocbs */ + if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) { + ret = put_user(KIOCB_KEY, &user_iocb->aio_key); + if (unlikely(ret)) { + pr_debug("EFAULT: aio_key\n"); + goto out_put_req; + } } req->ki_user_iocb = user_iocb; @@ -1908,12 +1983,22 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, bool compat) { - struct iocb iocb; + struct iocb iocb, *iocbp; - if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) - return -EFAULT; + 
if (ctx->flags & IOCTX_FLAG_USERIOCB) { + unsigned long iocb_index = (unsigned long) user_iocb; + + if (iocb_index >= ctx->max_reqs) + return -EINVAL; + + iocbp = aio_iocb_from_index(ctx, iocb_index); + } else { + if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb)))) + return -EFAULT; + iocbp = &iocb; + } - return __io_submit_one(ctx, &iocb, user_iocb, compat); + return __io_submit_one(ctx, iocbp, user_iocb, compat); } /* @@ -2063,6 +2148,9 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, if (unlikely(!ctx)) return -EINVAL; + if (ctx->flags & IOCTX_FLAG_USERIOCB) + goto err; + spin_lock_irq(&ctx->ctx_lock); kiocb = lookup_kiocb(ctx, iocb); if (kiocb) { @@ -2079,9 +2167,8 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, */ ret = -EINPROGRESS; } - +err: percpu_ref_put(&ctx->users); - return ret; } diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index 8387e0af0f76..814e6606c413 100644 --- a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -106,6 +106,8 @@ struct iocb { __u32 aio_resfd; }; /* 64 bytes */ +#define IOCTX_FLAG_USERIOCB (1 << 0) /* iocbs are user mapped */ + #undef IFBIG #undef IFLITTLE From patchwork Fri Nov 30 16:56:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706795 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA67318B8 for ; Fri, 30 Nov 2018 16:57:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AB79430421 for ; Fri, 30 Nov 2018 16:57:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A000930425; Fri, 30 Nov 2018 16:57:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) 
on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8ED2330421 for ; Fri, 30 Nov 2018 16:57:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727612AbeLAEHO (ORCPT ); Fri, 30 Nov 2018 23:07:14 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:40646 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727592AbeLAEHO (ORCPT ); Fri, 30 Nov 2018 23:07:14 -0500 Received: by mail-it1-f196.google.com with SMTP id h193so10216773ita.5 for ; Fri, 30 Nov 2018 08:57:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=sQLMf2lCjXlFfggxx2kifv1G3GLA0iPim2lyNjDBjHI=; b=w4eaNS/CIgCC+HE0z4k10W1lLwAry1H1WqPFbXS0WS6aNggXrLqmK4OjU+adAX/Brk /YccX3wiifAdeaRmH388EXo16M8KIWl8Psv6IrGc0RHJd6dsw88CDls+RFIS9V/AgAIB HVe3PYq/AzgRd3Y5Hp0D5uobcxIjeYnoqdxjAZeyhjueaUFzDfmHqJzh4MJlGoY9Ob0h Ah2nbUK5KJK/t9W4WdI7uKEm3iQWZFKSHohfb7XplzuI3vixgMKQgm96XbAuwS6iMC+l aE0bYY/ACYtW2mf4zKh23F6bxappW4NfVPh+Oq1VD2mShffrNGs1cxgzswXe+MHi0Fca xYVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=sQLMf2lCjXlFfggxx2kifv1G3GLA0iPim2lyNjDBjHI=; b=CNUiMDOblld9EUo74GpovDVWK4UBe9faRDvlJY8r4ShlyIjjJwK7dn/rgHubXAVohF sVhz9SZNqwcM+zTn6A+xXsttsj2kbtUyOlDEAKAap/LfytgfjdUaGgZFIFSlSM0w8MwL FB0lH2bk1xoDmKdF8M9zPCLPw9NkcAaVjSyjnzUMQgTg+lxNQ4bfvKjqetEiEGFFCQzI w6ZQxWUYuFYmwcgLD2KmUZ9ZF0Mc8x9gQjQHhNNsKV8FGQBvy+QR7HlEvae6VIbCGs5Z z7sNZ987JFm//BATO++/pi+GaWXYbCgm4swdWsN9cyEyvF9PBvCHI3QuR3dPCfDKYxKg nr+g== X-Gm-Message-State: 
AA+aEWZ+odSE6gJW1YCF3KkGAgZH4o8D8AvJ3YFYFE+YttD3oJ1BUJq0 OYLWriF5BUHTSr0ZmVT9BsbcQBEVUts= X-Google-Smtp-Source: AFSGD/VvKwEwa+TBFAQzvIZ4gqvBNtjpve/2djpwLo+PxUqXG9REa4I6GPkFzxEnofm5cxHUoG4FCA== X-Received: by 2002:a02:9549:: with SMTP id y67mr5840339jah.4.1543597037056; Fri, 30 Nov 2018 08:57:17 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:16 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 17/27] aio: support for IO polling Date: Fri, 30 Nov 2018 09:56:36 -0700 Message-Id: <20181130165646.27341-18-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add polled variants of PREAD/PREADV and PWRITE/PWRITEV. These act like their non-polled counterparts, except we expect to poll for completion of them. The polling happens at io_getevents() time, and works just like non-polled IO. To set up an io_context for polled IO, the application must call io_setup2() with IOCTX_FLAG_IOPOLL as one of the flags. It is illegal to mix and match polled and non-polled IO on an io_context. Polled IO doesn't support the user mapped completion ring. Events must be reaped through the io_getevents() system call. For non-irq driven poll devices, there's no way to support completion reaping from userspace by just looking at the ring. The application itself is the one that pulls completion entries. 
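As an illustration only (not part of the patch), the mix-and-match rule can be sketched as a standalone userspace model. The flag values are the ones this series adds to aio_abi.h; check_poll_flags() is a hypothetical helper that loosely mirrors the flag checks done in aio_prep_rw(), and it leaves out the O_DIRECT and ->iopoll checks that return -EOPNOTSUPP.

```c
#include <errno.h>

/* Flag values as defined by this series in include/uapi/linux/aio_abi.h */
#define IOCB_FLAG_RESFD   (1 << 0)
#define IOCB_FLAG_HIPRI   (1 << 2)
#define IOCTX_FLAG_IOPOLL (1 << 1)

/*
 * Hypothetical helper, not part of the patch: models which combinations
 * of io_context flags and iocb flags are accepted at submission time.
 * Returns 0 if the combination is allowed, -EINVAL otherwise.
 */
static int check_poll_flags(unsigned int ctx_flags, unsigned int aio_flags)
{
	if (aio_flags & IOCB_FLAG_HIPRI) {
		/* ki_eventfd shares a union with ki_ev, so RESFD is rejected */
		if (aio_flags & IOCB_FLAG_RESFD)
			return -EINVAL;
		/* can't submit polled IO to a non-polled ctx */
		if (!(ctx_flags & IOCTX_FLAG_IOPOLL))
			return -EINVAL;
		return 0;
	}

	/* can't submit non-polled IO to a polled ctx */
	if (ctx_flags & IOCTX_FLAG_IOPOLL)
		return -EINVAL;
	return 0;
}
```

In this model, a context created with IOCTX_FLAG_IOPOLL only accepts iocbs that set IOCB_FLAG_HIPRI and do not set IOCB_FLAG_RESFD, and a context without the flag only accepts iocbs that leave IOCB_FLAG_HIPRI clear.
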
Signed-off-by: Jens Axboe --- fs/aio.c | 381 +++++++++++++++++++++++++++++++---- include/uapi/linux/aio_abi.h | 3 + 2 files changed, 348 insertions(+), 36 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 380e6fe8c429..f7a49abc7694 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -143,6 +143,18 @@ struct kioctx { atomic_t reqs_available; } ____cacheline_aligned_in_smp; + /* iopoll submission state */ + struct { + spinlock_t poll_lock; + struct list_head poll_submitted; + } ____cacheline_aligned_in_smp; + + /* iopoll completion state */ + struct { + struct list_head poll_completing; + struct mutex getevents_lock; + } ____cacheline_aligned_in_smp; + struct { spinlock_t ctx_lock; struct list_head active_reqs; /* used for cancellation */ @@ -195,14 +207,27 @@ struct aio_kiocb { __u64 ki_user_data; /* user's data for completion */ struct list_head ki_list; /* the aio core uses this - * for cancellation */ + * for cancellation, or for + * polled IO */ + + unsigned long ki_flags; +#define IOCB_POLL_COMPLETED 0 +#define IOCB_POLL_BUSY 1 + refcount_t ki_refcnt; - /* - * If the aio_resfd field of the userspace iocb is not zero, - * this is the underlying eventfd context to deliver events to. - */ - struct eventfd_ctx *ki_eventfd; + union { + /* + * If the aio_resfd field of the userspace iocb is not zero, + * this is the underlying eventfd context to deliver events to. 
+ */ + struct eventfd_ctx *ki_eventfd; + + /* + * For polled IO, stash completion info here + */ + struct io_event ki_ev; + }; }; /*------ sysctl variables----*/ @@ -223,6 +248,7 @@ static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); static void aio_useriocb_free(struct kioctx *); +static void aio_iopoll_reap_events(struct kioctx *); static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) { @@ -461,11 +487,15 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) int i; struct file *file; - /* Compensate for the ring buffer's head/tail overlap entry */ - nr_events += 2; /* 1 is required, 2 for good luck */ - + /* + * Compensate for the ring buffer's head/tail overlap entry. + * IO polling doesn't require any io event entries + */ size = sizeof(struct aio_ring); - size += sizeof(struct io_event) * nr_events; + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) { + nr_events += 2; /* 1 is required, 2 for good luck */ + size += sizeof(struct io_event) * nr_events; + } nr_pages = PFN_UP(size); if (nr_pages < 0) @@ -747,6 +777,11 @@ static struct kioctx *io_setup_flags(unsigned long ctxid, INIT_LIST_HEAD(&ctx->active_reqs); + spin_lock_init(&ctx->poll_lock); + INIT_LIST_HEAD(&ctx->poll_submitted); + INIT_LIST_HEAD(&ctx->poll_completing); + mutex_init(&ctx->getevents_lock); + if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL)) goto err; @@ -818,11 +853,15 @@ static int kill_ioctx(struct mm_struct *mm, struct kioctx *ctx, { struct kioctx_table *table; + mutex_lock(&ctx->getevents_lock); spin_lock(&mm->ioctx_lock); if (atomic_xchg(&ctx->dead, 1)) { spin_unlock(&mm->ioctx_lock); + mutex_unlock(&ctx->getevents_lock); return -EINVAL; } + aio_iopoll_reap_events(ctx); + mutex_unlock(&ctx->getevents_lock); table = rcu_dereference_raw(mm->ioctx_table); WARN_ON(ctx != rcu_access_pointer(table->table[ctx->id])); @@ -1029,6 +1068,7 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) 
percpu_ref_get(&ctx->reqs); req->ki_ctx = ctx; INIT_LIST_HEAD(&req->ki_list); + req->ki_flags = 0; refcount_set(&req->ki_refcnt, 0); req->ki_eventfd = NULL; } @@ -1072,6 +1112,15 @@ static inline void iocb_put(struct aio_kiocb *iocb) } } +static void iocb_put_many(struct kioctx *ctx, void **iocbs, int *nr) +{ + if (*nr) { + percpu_ref_put_many(&ctx->reqs, *nr); + kmem_cache_free_bulk(kiocb_cachep, *nr, iocbs); + *nr = 0; + } +} + static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb, long res, long res2) { @@ -1261,6 +1310,170 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr, return ret < 0 || *i >= min_nr; } +#define AIO_IOPOLL_BATCH 8 + +/* + * Process completed iocb iopoll entries, copying the result to userspace. + */ +static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, + unsigned int *nr_events, long max) +{ + void *iocbs[AIO_IOPOLL_BATCH]; + struct aio_kiocb *iocb, *n; + int to_free = 0, ret = 0; + + /* Shouldn't happen... 
*/ + if (*nr_events >= max) + return 0; + + list_for_each_entry_safe(iocb, n, &ctx->poll_completing, ki_list) { + if (*nr_events == max) + break; + if (!test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + continue; + if (to_free == AIO_IOPOLL_BATCH) + iocb_put_many(ctx, iocbs, &to_free); + + list_del(&iocb->ki_list); + iocbs[to_free++] = iocb; + + fput(iocb->rw.ki_filp); + + if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev, + sizeof(iocb->ki_ev))) { + ret = -EFAULT; + break; + } + (*nr_events)++; + } + + if (to_free) + iocb_put_many(ctx, iocbs, &to_free); + + return ret; +} + +static int __aio_iopoll_check(struct kioctx *ctx, struct io_event __user *event, + unsigned int *nr_events, long min, long max) +{ + struct aio_kiocb *iocb; + int to_poll, polled, ret; + + /* + * Check if we already have done events that satisfy what we need + */ + if (!list_empty(&ctx->poll_completing)) { + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + } + + /* + * Take in a new working set from the submitted list, if possible. + */ + if (!list_empty_careful(&ctx->poll_submitted)) { + spin_lock(&ctx->poll_lock); + list_splice_init(&ctx->poll_submitted, &ctx->poll_completing); + spin_unlock(&ctx->poll_lock); + } + + if (list_empty(&ctx->poll_completing)) + return 0; + + /* + * Check again now that we have a new batch. + */ + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + + /* + * Find up to 'max' worth of events to poll for, including the + * events we already successfully polled + */ + polled = to_poll = 0; + list_for_each_entry(iocb, &ctx->poll_completing, ki_list) { + /* + * Poll for needed events with spin == true, anything after + * that we just check if we have more, up to max. 
+ */ + bool spin = polled + *nr_events >= min; + struct kiocb *kiocb = &iocb->rw; + + if (test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags)) + break; + if (++to_poll + *nr_events > max) + break; + + ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin); + if (ret < 0) + return ret; + + polled += ret; + if (polled + *nr_events >= max) + break; + } + + ret = aio_iopoll_reap(ctx, event, nr_events, max); + if (ret < 0) + return ret; + if (*nr_events >= min) + return 0; + return to_poll; +} + +/* + * We can't just wait for polled events to come to us, we have to actively + * find and complete them. + */ +static void aio_iopoll_reap_events(struct kioctx *ctx) +{ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + return; + + while (!list_empty_careful(&ctx->poll_submitted) || + !list_empty(&ctx->poll_completing)) { + unsigned int nr_events = 0; + + __aio_iopoll_check(ctx, NULL, &nr_events, 1, UINT_MAX); + } +} + +static int aio_iopoll_check(struct kioctx *ctx, long min_nr, long nr, + struct io_event __user *event) +{ + unsigned int nr_events = 0; + int ret = 0; + + /* Only allow one thread polling at a time */ + if (!mutex_trylock(&ctx->getevents_lock)) + return -EBUSY; + if (unlikely(atomic_read(&ctx->dead))) { + ret = -EINVAL; + goto err; + } + + while (!nr_events || !need_resched()) { + int tmin = 0; + + if (nr_events < min_nr) + tmin = min_nr - nr_events; + + ret = __aio_iopoll_check(ctx, event, &nr_events, tmin, nr); + if (ret <= 0) + break; + ret = 0; + } + +err: + mutex_unlock(&ctx->getevents_lock); + return nr_events ? 
nr_events : ret; +} + static long read_events(struct kioctx *ctx, long min_nr, long nr, struct io_event __user *event, ktime_t until) @@ -1352,7 +1565,7 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, unsigned long ctx; long ret; - if (flags & ~IOCTX_FLAG_USERIOCB) + if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL)) return -EINVAL; ret = get_user(ctx, ctxp); @@ -1485,13 +1698,8 @@ static void aio_remove_iocb(struct aio_kiocb *iocb) spin_unlock_irqrestore(&ctx->ctx_lock, flags); } -static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +static void kiocb_end_write(struct kiocb *kiocb) { - struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); - - if (!list_empty_careful(&iocb->ki_list)) - aio_remove_iocb(iocb); - if (kiocb->ki_flags & IOCB_WRITE) { struct inode *inode = file_inode(kiocb->ki_filp); @@ -1503,19 +1711,48 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); file_end_write(kiocb->ki_filp); } +} + +static void aio_complete_rw(struct kiocb *kiocb, long res, long res2) +{ + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + + if (!list_empty_careful(&iocb->ki_list)) + aio_remove_iocb(iocb); + + kiocb_end_write(kiocb); fput(kiocb->ki_filp); aio_complete(iocb, res, res2); } -static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) +static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) { + struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw); + + kiocb_end_write(kiocb); + + /* + * Handle EAGAIN from resource limits with polled IO inline, don't + * pass the event back to userspace. 
+ */ + if (unlikely(res == -EAGAIN)) + set_bit(IOCB_POLL_BUSY, &iocb->ki_flags); + else { + aio_fill_event(&iocb->ki_ev, iocb, res, res2); + set_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags); + } +} + +static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb) +{ + struct kioctx *ctx = kiocb->ki_ctx; + struct kiocb *req = &kiocb->rw; int ret; req->ki_filp = fget(iocb->aio_fildes); if (unlikely(!req->ki_filp)) return -EBADF; - req->ki_complete = aio_complete_rw; req->ki_pos = iocb->aio_offset; req->ki_flags = iocb_flags(req->ki_filp); if (iocb->aio_flags & IOCB_FLAG_RESFD) @@ -1541,9 +1778,35 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) if (unlikely(ret)) goto out_fput; - req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */ - return 0; + if (iocb->aio_flags & IOCB_FLAG_HIPRI) { + /* shares space in the union, and is rather pointless.. */ + ret = -EINVAL; + if (iocb->aio_flags & IOCB_FLAG_RESFD) + goto out_fput; + + /* can't submit polled IO to a non-polled ctx */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + goto out_fput; + + ret = -EOPNOTSUPP; + if (!(req->ki_flags & IOCB_DIRECT) || + !req->ki_filp->f_op->iopoll) + goto out_fput; + + req->ki_flags |= IOCB_HIPRI; + req->ki_complete = aio_complete_rw_poll; + } else { + /* can't submit non-polled IO to a polled ctx */ + ret = -EINVAL; + if (ctx->flags & IOCTX_FLAG_IOPOLL) + goto out_fput; + + /* no one is going to poll for this I/O */ + req->ki_flags &= ~IOCB_HIPRI; + req->ki_complete = aio_complete_rw; + } + return 0; out_fput: fput(req->ki_filp); return ret; @@ -1588,15 +1851,40 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } -static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb, +/* + * After the iocb has been issued, it's safe to be found on the poll list. + * Adding the kiocb to the list AFTER submission ensures that we don't + * find it from a io_getevents() thread before the issuer is done accessing + * the kiocb cookie. 
+ */ +static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) +{ + /* + * For fast devices, IO may have already completed. If it has, add + * it to the front so we find it first. We can't add to the poll_done + * list as that's unlocked from the completion side. + */ + const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags); + struct kioctx *ctx = kiocb->ki_ctx; + + spin_lock(&ctx->poll_lock); + if (front_add) + list_add(&kiocb->ki_list, &ctx->poll_submitted); + else + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); +} + +static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1621,15 +1909,16 @@ static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb, return ret; } -static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb, +static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, bool vectored, bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; + struct kiocb *req = &kiocb->rw; struct iov_iter iter; struct file *file; ssize_t ret; - ret = aio_prep_rw(req, iocb); + ret = aio_prep_rw(kiocb, iocb); if (ret) return ret; file = req->ki_filp; @@ -1900,7 +2189,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, return -EINVAL; } - if (!get_reqs_available(ctx)) + /* Poll IO doesn't need ring reservations */ + if (!(ctx->flags & IOCTX_FLAG_IOPOLL) && !get_reqs_available(ctx)) return -EAGAIN; ret = -EAGAIN; @@ -1923,8 +2213,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } } - /* Don't support cancel on user mapped iocbs */ - if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) { + /* Don't support cancel on 
user mapped iocbs or polled context */ + if (!(ctx->flags & (IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))) { ret = put_user(KIOCB_KEY, &user_iocb->aio_key); if (unlikely(ret)) { pr_debug("EFAULT: aio_key\n"); @@ -1935,26 +2225,33 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, req->ki_user_iocb = user_iocb; req->ki_user_data = iocb->aio_data; + ret = -EINVAL; switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(&req->rw, iocb, false, compat); + ret = aio_read(req, iocb, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(&req->rw, iocb, false, compat); + ret = aio_write(req, iocb, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(&req->rw, iocb, true, compat); + ret = aio_read(req, iocb, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(&req->rw, iocb, true, compat); + ret = aio_write(req, iocb, true, compat); break; case IOCB_CMD_FSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_fsync(&req->fsync, iocb, false); break; case IOCB_CMD_FDSYNC: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_fsync(&req->fsync, iocb, true); break; case IOCB_CMD_POLL: + if (ctx->flags & IOCTX_FLAG_IOPOLL) + break; ret = aio_poll(req, iocb); break; default: @@ -1970,13 +2267,21 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, */ if (ret) goto out_put_req; + if (ctx->flags & IOCTX_FLAG_IOPOLL) { + if (test_bit(IOCB_POLL_BUSY, &req->ki_flags)) { + ret = -EAGAIN; + goto out_put_req; + } + aio_iopoll_iocb_issued(req); + } return 0; out_put_req: if (req->ki_eventfd) eventfd_ctx_put(req->ki_eventfd); iocb_put(req); out_put_reqs_available: - put_reqs_available(ctx, 1); + if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) + put_reqs_available(ctx, 1); return ret; } @@ -2148,7 +2453,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, if (unlikely(!ctx)) return -EINVAL; - if (ctx->flags & IOCTX_FLAG_USERIOCB) + if (ctx->flags & (IOCTX_FLAG_USERIOCB | 
IOCTX_FLAG_IOPOLL)) goto err; spin_lock_irq(&ctx->ctx_lock); @@ -2183,8 +2488,12 @@ static long do_io_getevents(aio_context_t ctx_id, long ret = -EINVAL; if (likely(ioctx)) { - if (likely(min_nr <= nr && min_nr >= 0)) - ret = read_events(ioctx, min_nr, nr, events, until); + if (likely(min_nr <= nr && min_nr >= 0)) { + if (ioctx->flags & IOCTX_FLAG_IOPOLL) + ret = aio_iopoll_check(ioctx, min_nr, nr, events); + else + ret = read_events(ioctx, min_nr, nr, events, until); + } percpu_ref_put(&ioctx->users); } diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h index 814e6606c413..ea0b9a19f4df 100644 --- a/include/uapi/linux/aio_abi.h +++ b/include/uapi/linux/aio_abi.h @@ -52,9 +52,11 @@ enum { * is valid. * IOCB_FLAG_IOPRIO - Set if the "aio_reqprio" member of the "struct iocb" * is valid. + * IOCB_FLAG_HIPRI - Use IO completion polling */ #define IOCB_FLAG_RESFD (1 << 0) #define IOCB_FLAG_IOPRIO (1 << 1) +#define IOCB_FLAG_HIPRI (1 << 2) /* read() from /dev/aio returns these structures. 
*/ struct io_event { @@ -107,6 +109,7 @@ struct iocb { }; /* 64 bytes */ #define IOCTX_FLAG_USERIOCB (1 << 0) /* iocbs are user mapped */ +#define IOCTX_FLAG_IOPOLL (1 << 1) /* io_context is polled */ #undef IFBIG #undef IFLITTLE From patchwork Fri Nov 30 16:56:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706799 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 730A613A4 for ; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 653B63041C for ; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5959D30421; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B8B0730427 for ; Fri, 30 Nov 2018 16:57:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727618AbeLAEHQ (ORCPT ); Fri, 30 Nov 2018 23:07:16 -0500 Received: from mail-it1-f194.google.com ([209.85.166.194]:38872 "EHLO mail-it1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727606AbeLAEHP (ORCPT ); Fri, 30 Nov 2018 23:07:15 -0500 Received: by mail-it1-f194.google.com with SMTP id h65so10227275ith.3 for ; Fri, 30 Nov 2018 08:57:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=0+NngYCEwZmyaHCfliqIZzKXf8tREdBkDh7BHlIPjoY=; b=iRhG9Bxi7d/+z3JHkw2TCh8ZTUuFNTxCilzedQ0/Vaq83CVsYJeAnHvZoBNXgyzI0h wHCttmqLsVKFfpGYWwBwr6MAkPdninwktf+B7SzuPaDPOlOC9vn/XJmK5mo1+V8p0gF3 kY8UDLJH7cFyAEId8SFuo4TtAw6LIe4ueARgbHmTRJomZv9u4aOl9PIo3VcPx7pDb2oi w/R4FhNcCqOYhrECk9rbKwCyD/oXvP1zO1UGNEdVqtfvtjRLaJK2To7w8xJtOAW9XGKg fUGjo3vAUvXy5cLDR1yJ34+OCiW51BlTI0azLOBy2gFw1VHx3LUT9aQGjgSPQu2t6R/Y C/Tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=0+NngYCEwZmyaHCfliqIZzKXf8tREdBkDh7BHlIPjoY=; b=KhbId4cuJX0BENbH99hrquOSb+SxSPqP34WLlHrIwJqratbvLNFey9lgDtP3D1aHcI dnZGcqaFDw0K3lcoEpV43NFHYhl3maUd08pWG2l1hXyTelCLuW+uimr51YJbWvHF7XCj NKI2DGnZRHsQPuskZ+4MxyY8rMxpgbEQR4GhOuFDup+tUbRy1NeuyfyUS+eVlqqDfjZ1 Xwf6G9sYHt1JYTtQGUhmYHgy6I33dkrsoqyCrR+Y2tNjzRWgAlccS+dqO3EPH+V/BLAR nxIeo0vc5ht7NP9PbVprJp8kq/g4DPRTuCGE0SUCOppREAbna1hb+APavj4n0AxuclFu sqpA== X-Gm-Message-State: AA+aEWb8gOcNffiVCK6330jh+su/t5xVLuu6Kz37QLtxjkqtDB4tkZIA V0FDREvCgUvnNmGdfc+ktcCPmu4LZ0c= X-Google-Smtp-Source: AFSGD/Ua1hyabcOerfXkHkPm/gFjbPS8SrzDWZlZ9S2gEdEmDKfvEyAf2K5HRUmf/oz0K/9YOJB41g== X-Received: by 2002:a02:1a0a:: with SMTP id 10mr5649788jai.50.1543597038740; Fri, 30 Nov 2018 08:57:18 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:17 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 18/27] aio: add submission side request cache Date: Fri, 30 Nov 2018 09:56:37 -0700 Message-Id: <20181130165646.27341-19-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: 
<20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We have to add each submitted polled request to the io_context poll_submitted list, which means we have to grab the poll_lock. We already use the block plug to batch submissions if we're doing a batch of IO submissions, extend that to cover the poll requests internally as well. Signed-off-by: Jens Axboe --- fs/aio.c | 136 +++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 113 insertions(+), 23 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index f7a49abc7694..182e2fc6ec82 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -230,6 +230,21 @@ struct aio_kiocb { }; }; +struct aio_submit_state { + struct kioctx *ctx; + + struct blk_plug plug; +#ifdef CONFIG_BLOCK + struct blk_plug_cb plug_cb; +#endif + + /* + * Polled iocbs that have been submitted, but not added to the ctx yet + */ + struct list_head req_list; + unsigned int req_count; +}; + /*------ sysctl variables----*/ static DEFINE_SPINLOCK(aio_nr_lock); unsigned long aio_nr; /* current system wide number of aio requests */ @@ -247,6 +262,15 @@ static const struct address_space_operations aio_ctx_aops; static const unsigned int iocb_page_shift = ilog2(PAGE_SIZE / sizeof(struct iocb)); +/* + * We rely on block level unplugs to flush pending requests, if we schedule + */ +#ifdef CONFIG_BLOCK +static const bool aio_use_state_req_list = true; +#else +static const bool aio_use_state_req_list = false; +#endif + static void aio_useriocb_free(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); @@ -1851,13 +1875,28 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret) } } +/* + * Called either at the end of IO submission, or through a plug callback + * because we're going to schedule. Moves out local batch of requests to + * the ctx poll list, so they can be found for polling + reaping. 
+ */ +static void aio_flush_state_reqs(struct kioctx *ctx, + struct aio_submit_state *state) +{ + spin_lock(&ctx->poll_lock); + list_splice_tail_init(&state->req_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + state->req_count = 0; +} + /* * After the iocb has been issued, it's safe to be found on the poll list. * Adding the kiocb to the list AFTER submission ensures that we don't * find it from a io_getevents() thread before the issuer is done accessing * the kiocb cookie. */ -static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) +static void aio_iopoll_iocb_issued(struct aio_submit_state *state, + struct aio_kiocb *kiocb) { /* * For fast devices, IO may have already completed. If it has, add @@ -1867,12 +1906,21 @@ static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb) const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags); struct kioctx *ctx = kiocb->ki_ctx; - spin_lock(&ctx->poll_lock); - if (front_add) - list_add(&kiocb->ki_list, &ctx->poll_submitted); - else - list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); - spin_unlock(&ctx->poll_lock); + if (!state || !aio_use_state_req_list) { + spin_lock(&ctx->poll_lock); + if (front_add) + list_add(&kiocb->ki_list, &ctx->poll_submitted); + else + list_add_tail(&kiocb->ki_list, &ctx->poll_submitted); + spin_unlock(&ctx->poll_lock); + } else { + if (front_add) + list_add(&kiocb->ki_list, &state->req_list); + else + list_add_tail(&kiocb->ki_list, &state->req_list); + if (++state->req_count >= AIO_IOPOLL_BATCH) + aio_flush_state_reqs(ctx, state); + } } static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, @@ -2168,7 +2216,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb) } static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, - struct iocb __user *user_iocb, bool compat) + struct iocb __user *user_iocb, + struct aio_submit_state *state, bool compat) { struct aio_kiocb *req; ssize_t ret; @@ -2272,7 +2321,7 @@ 
static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EAGAIN; goto out_put_req; } - aio_iopoll_iocb_issued(req); + aio_iopoll_iocb_issued(state, req); } return 0; out_put_req: @@ -2286,7 +2335,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, } static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, - bool compat) + struct aio_submit_state *state, bool compat) { struct iocb iocb, *iocbp; @@ -2303,7 +2352,44 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, iocbp = &iocb; } - return __io_submit_one(ctx, iocbp, user_iocb, compat); + return __io_submit_one(ctx, iocbp, user_iocb, state, compat); +} + +#ifdef CONFIG_BLOCK +static void aio_state_unplug(struct blk_plug_cb *cb, bool from_schedule) +{ + struct aio_submit_state *state; + + state = container_of(cb, struct aio_submit_state, plug_cb); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} +#endif + +/* + * Batched submission is done, ensure local IO is flushed out. + */ +static void aio_submit_state_end(struct aio_submit_state *state) +{ + blk_finish_plug(&state->plug); + if (!list_empty(&state->req_list)) + aio_flush_state_reqs(state->ctx, state); +} + +/* + * Start submission side cache. 
+ */ +static void aio_submit_state_start(struct aio_submit_state *state, + struct kioctx *ctx) +{ + state->ctx = ctx; + INIT_LIST_HEAD(&state->req_list); + state->req_count = 0; +#ifdef CONFIG_BLOCK + state->plug_cb.callback = aio_state_unplug; + blk_start_plug(&state->plug); + list_add(&state->plug_cb.list, &state->plug.cb_list); +#endif } /* @@ -2327,10 +2413,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb, SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, struct iocb __user * __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2344,8 +2430,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { struct iocb __user *user_iocb; @@ -2354,12 +2442,12 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, break; } - ret = io_submit_one(ctx, user_iocb, false); + ret = io_submit_one(ctx, user_iocb, statep, false); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? 
i : ret; @@ -2369,10 +2457,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, int, nr, compat_uptr_t __user *, iocbpp) { + struct aio_submit_state state, *statep = NULL; struct kioctx *ctx; long ret = 0; int i = 0; - struct blk_plug plug; if (unlikely(nr < 0)) return -EINVAL; @@ -2386,8 +2474,10 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, if (nr > ctx->nr_events) nr = ctx->nr_events; - if (nr > AIO_PLUG_THRESHOLD) - blk_start_plug(&plug); + if (nr > AIO_PLUG_THRESHOLD) { + aio_submit_state_start(&state, ctx); + statep = &state; + } for (i = 0; i < nr; i++) { compat_uptr_t user_iocb; @@ -2396,12 +2486,12 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, break; } - ret = io_submit_one(ctx, compat_ptr(user_iocb), true); + ret = io_submit_one(ctx, compat_ptr(user_iocb), statep, true); if (ret) break; } - if (nr > AIO_PLUG_THRESHOLD) - blk_finish_plug(&plug); + if (statep) + aio_submit_state_end(statep); percpu_ref_put(&ctx->users); return i ? 
i : ret; From patchwork Fri Nov 30 16:56:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706801 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E3DB5109C for ; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D42723041C for ; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C889830421; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5981830422 for ; Fri, 30 Nov 2018 16:57:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727619AbeLAEHQ (ORCPT ); Fri, 30 Nov 2018 23:07:16 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:38881 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727510AbeLAEHQ (ORCPT ); Fri, 30 Nov 2018 23:07:16 -0500 Received: by mail-it1-f195.google.com with SMTP id h65so10227413ith.3 for ; Fri, 30 Nov 2018 08:57:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=14kCfLiACV93AenWhL1xKvUDPabDTnEKBX7DXRr/kpI=; b=TiqPrLnOkfC+Zz5Ju3GWBXlcq1tYFO2h82OmJ4SGdl3VfZ9GzV0OTQ/TQt1uf8la1x CpVciynTtXqo+HhAMdm1imVoG6es+G2xqmy0nsjkONh8SkD2nbDDGkCgbSKSU8d66F3y 
19sK++Ypfi7CpKiq69jkBQVzypv0+K6Tosuzw2EKqAyUyrO3E+nQvBRNGPX5lcwh6zXS LdfBYCKBOt1nBESVtMdXlcP+bmedgXCigUIoLeYkv8GWh/GyBruY7xkxwkoXpD9Akcg1 hIxFsxGqV0JqLgQRfHLohN1wl+T36KSsyjpUYW6m8ispf6W0Qp2JHpgd8q1HZ1dm9Y8v A8yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=14kCfLiACV93AenWhL1xKvUDPabDTnEKBX7DXRr/kpI=; b=lkOBqQwVQt8sUMZJ4H7u2VY6jHiO+HskOpBoj1GXlqOXz3JmU26n81f8ouRtq5O7bv 505XV7L0ca+36PA6AdqPV1h3X2pbmVhmG2CyBdrcINJ+hHlicDqn5A9J69F2PxfRN+ZJ VqU2F0hQ69xMne1jsD9OzEoE/peCb57UAnwgwizE2DhrBZZ3/fOCMxnCA4RcfZpVKj5o edSLTnn/jMQ7bLmgfSj/Ee7WnTtscrB9eQ8ORwKWIS0FTu2EWive80UqCZfOzorNzwqY 9kld//y2s4dWx95L0YfZIF8evIy1M5GrLiLomYWRfquXBbViGY9RV9B1XPkdtXJR6QAW 1MNQ== X-Gm-Message-State: AA+aEWZahcQqgRtw6BFiOa8Bp1IhN3dMo2mUkHtRbMruXSm6m3z8OT/H A6noIq5vX2J+0/DFgtdBEopqX8yQ3vA= X-Google-Smtp-Source: AFSGD/UCWQhcPAOi9IxYkGGlOZkwbmyrVMJWWnfjL/JMdacEe+udH3jNy64/LUMyfmy7JJuLVa7p6Q== X-Received: by 2002:a02:a990:: with SMTP id q16mr5882865jam.132.1543597040111; Fri, 30 Nov 2018 08:57:20 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:19 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 19/27] fs: add fget_many() and fput_many() Date: Fri, 30 Nov 2018 09:56:38 -0700 Message-Id: <20181130165646.27341-20-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Some use cases repeatedly get and put references to the same file, but the only 
exposed interface is doing these one at a time. As each of these entails an atomic inc or dec on a shared structure, that cost can add up. Add fget_many(), which works just like fget(), except it takes an argument for how many references to get on the file. Ditto fput_many(), which can drop an arbitrary number of references to a file. Signed-off-by: Jens Axboe --- fs/file.c | 15 ++++++++++----- fs/file_table.c | 10 ++++++++-- include/linux/file.h | 2 ++ include/linux/fs.h | 3 ++- 4 files changed, 22 insertions(+), 8 deletions(-) diff --git a/fs/file.c b/fs/file.c index 7ffd6e9d103d..ad9870edfd51 100644 --- a/fs/file.c +++ b/fs/file.c @@ -676,7 +676,7 @@ void do_close_on_exec(struct files_struct *files) spin_unlock(&files->file_lock); } -static struct file *__fget(unsigned int fd, fmode_t mask) +static struct file *__fget(unsigned int fd, fmode_t mask, unsigned int refs) { struct files_struct *files = current->files; struct file *file; @@ -691,7 +691,7 @@ static struct file *__fget(unsigned int fd, fmode_t mask) */ if (file->f_mode & mask) file = NULL; - else if (!get_file_rcu(file)) + else if (!get_file_rcu_many(file, refs)) goto loop; } rcu_read_unlock(); @@ -699,15 +699,20 @@ static struct file *__fget(unsigned int fd, fmode_t mask) return file; } +struct file *fget_many(unsigned int fd, unsigned int refs) +{ + return __fget(fd, FMODE_PATH, refs); +} + struct file *fget(unsigned int fd) { - return __fget(fd, FMODE_PATH); + return fget_many(fd, 1); } EXPORT_SYMBOL(fget); struct file *fget_raw(unsigned int fd) { - return __fget(fd, 0); + return __fget(fd, 0, 1); } EXPORT_SYMBOL(fget_raw); @@ -738,7 +743,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask) return 0; return (unsigned long)file; } else { - file = __fget(fd, mask); + file = __fget(fd, mask, 1); if (!file) return 0; return FDPUT_FPUT | (unsigned long)file; diff --git a/fs/file_table.c b/fs/file_table.c index e49af4caf15d..6a3964df33e4 100644 --- a/fs/file_table.c +++ b/fs/file_table.c 
@@ -326,9 +326,9 @@ void flush_delayed_fput(void) static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput); -void fput(struct file *file) +void fput_many(struct file *file, unsigned int refs) { - if (atomic_long_dec_and_test(&file->f_count)) { + if (atomic_long_sub_and_test(refs, &file->f_count)) { struct task_struct *task = current; if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { @@ -347,6 +347,12 @@ void fput(struct file *file) } } +void fput(struct file *file) +{ + fput_many(file, 1); +} + + /* * synchronous analog of fput(); for kernel threads that might be needed * in some umount() (and thus can't use flush_delayed_fput() without diff --git a/include/linux/file.h b/include/linux/file.h index 6b2fb032416c..3fcddff56bc4 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -13,6 +13,7 @@ struct file; extern void fput(struct file *); +extern void fput_many(struct file *, unsigned int); struct file_operations; struct vfsmount; @@ -44,6 +45,7 @@ static inline void fdput(struct fd fd) } extern struct file *fget(unsigned int fd); +extern struct file *fget_many(unsigned int fd, unsigned int refs); extern struct file *fget_raw(unsigned int fd); extern unsigned long __fdget(unsigned int fd); extern unsigned long __fdget_raw(unsigned int fd); diff --git a/include/linux/fs.h b/include/linux/fs.h index 6a5f71f8ae06..dc54a65c401a 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -952,7 +952,8 @@ static inline struct file *get_file(struct file *f) atomic_long_inc(&f->f_count); return f; } -#define get_file_rcu(x) atomic_long_inc_not_zero(&(x)->f_count) +#define get_file_rcu_many(x, cnt) atomic_long_add_unless(&(x)->f_count, (cnt), 0) +#define get_file_rcu(x) get_file_rcu_many((x), 1) #define fput_atomic(x) atomic_long_add_unless(&(x)->f_count, -1, 1) #define file_count(x) atomic_long_read(&(x)->f_count) From patchwork Fri Nov 30 16:56:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706811 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8BE55109C for ; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E02C3041C for ; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7196C30422; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CCA3730421 for ; Fri, 30 Nov 2018 16:57:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727638AbeLAEHT (ORCPT ); Fri, 30 Nov 2018 23:07:19 -0500 Received: from mail-it1-f196.google.com ([209.85.166.196]:38887 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727617AbeLAEHS (ORCPT ); Fri, 30 Nov 2018 23:07:18 -0500 Received: by mail-it1-f196.google.com with SMTP id h65so10227557ith.3 for ; Fri, 30 Nov 2018 08:57:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=87XpvuDDlcLhO+RTk32hugbhh+JSJwc5rY+aocDNyxM=; b=n+cGsW4u1yCdB2HLqeIT/PbtOXG7gX6P5jfFcGsss3mXjvtmPcGMl/Y6ezEZqX17hy kXwqAJenkDRNGFmwT+mmGOHGFCDN2byioJm379u2WRMmu1FuiSMpyEGIxk0hlZ3fh5XF ghdzm6WdvjLo1oFedW4SmV/jQe50wuGKi9aXtLv38jj76XZdewFQ5mn7MOPiSJ4vxfJ1 OkIrLPRVDudqdvA7QkbnrRxgYcKVDLVPpp+Olg3KDLY7Gg7M3qxGCuCi/2n41a70ldKp 
+V8NK3pkUCVAdJ+mdnWZJAYtEpmM48tGGdPeBGSlnANrsrJmgEpf8L+a8YJq/VMws109 3E3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=87XpvuDDlcLhO+RTk32hugbhh+JSJwc5rY+aocDNyxM=; b=H663lvmHIFw7JbWB3bZnALuh1PIw/ECwdQQGBs+9wlU9gMO9dSB49WTgmzxli6hIxv p5dd6ll5N7dtKNaru+Q4IrpNt2Ur9YvHPxazUNMAtx5eG8mtH0054sW5Mg5u46vzQk7C iy4bqexDjXZdRZLiCX1Zj5nyHDSCR04tvKhpOfPPSNbUjvHaRXqf7X9NwTMcDzezyYfn iWuByQQbULLjcYu9IaYi8Ej51dd+2jSDdke5pZTTN2WBmNpDaIrNCW2YhIqmKAwDG27O GOvqmE41w7X5T0mDzHvn3k//BFOJaQVf85ii3azUMsxkIkbj4rOas73LXj2v/2hbT7QT ItCA== X-Gm-Message-State: AA+aEWYx867oCWMSdHmTic4p0EqCqNVKNbXnCivcfZBJBshMT6Ap1LSL 5mFkixRx9vWersHSNKbGHOmcluIX5LA= X-Google-Smtp-Source: AFSGD/X2ivh+dObCO+W2p8nGQmOA1hJhuYwxkylBqK5n/y6HogmCoMql4C2STfH5jZ79+NBP2UitLA== X-Received: by 2002:a24:f2c1:: with SMTP id j184mr5719431ith.35.1543597041885; Fri, 30 Nov 2018 08:57:21 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:21 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 20/27] aio: use fget/fput_many() for file references Date: Fri, 30 Nov 2018 09:56:39 -0700 Message-Id: <20181130165646.27341-21-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On the submission side, add file reference batching to the aio_submit_state. We get as many references as the number of iocbs we are submitting, and drop unused ones if we end up switching files. 
The assumption here is that we're usually only dealing with one fd, and if there are multiple, hopefully they are at least somewhat ordered. Could trivially be extended to cover multiple fds, if needed. On the completion side we do the same thing, except this is trivially done just locally in aio_iopoll_reap(). Signed-off-by: Jens Axboe --- fs/aio.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 91 insertions(+), 15 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 182e2fc6ec82..291bbc62b2a8 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -243,6 +243,15 @@ struct aio_submit_state { */ struct list_head req_list; unsigned int req_count; + + /* + * File reference cache + */ + struct file *file; + unsigned int fd; + unsigned int has_refs; + unsigned int used_refs; + unsigned int ios_left; }; /*------ sysctl variables----*/ @@ -1344,7 +1353,8 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, { void *iocbs[AIO_IOPOLL_BATCH]; struct aio_kiocb *iocb, *n; - int to_free = 0, ret = 0; + int file_count, to_free = 0, ret = 0; + struct file *file = NULL; /* Shouldn't happen... */ if (*nr_events >= max) @@ -1361,7 +1371,20 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, list_del(&iocb->ki_list); iocbs[to_free++] = iocb; - fput(iocb->rw.ki_filp); + /* + * Batched puts of the same file, to avoid dirtying the + * file usage count multiple times, if avoidable. 
+ */ + if (!file) { + file = iocb->rw.ki_filp; + file_count = 1; + } else if (file == iocb->rw.ki_filp) { + file_count++; + } else { + fput_many(file, file_count); + file = iocb->rw.ki_filp; + file_count = 1; + } if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev, sizeof(iocb->ki_ev))) { @@ -1371,6 +1394,9 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs, (*nr_events)++; } + if (file) + fput_many(file, file_count); + if (to_free) iocb_put_many(ctx, iocbs, &to_free); @@ -1768,13 +1794,58 @@ static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2) } } -static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb) +static void aio_file_put(struct aio_submit_state *state) +{ + if (state->file) { + int diff = state->has_refs - state->used_refs; + + if (diff) + fput_many(state->file, diff); + state->file = NULL; + } +} + +/* + * Get as many references to a file as we have IOs left in this submission, + * assuming most submissions are for one file, or at least that each file + * has more than one submission. 
+ */ +static struct file *aio_file_get(struct aio_submit_state *state, int fd) +{ + if (!state) + return fget(fd); + + if (!state->file) { +get_file: + state->file = fget_many(fd, state->ios_left); + if (!state->file) + return NULL; + + state->fd = fd; + state->has_refs = state->ios_left; + state->used_refs = 1; + state->ios_left--; + return state->file; + } + + if (state->fd == fd) { + state->used_refs++; + state->ios_left--; + return state->file; + } + + aio_file_put(state); + goto get_file; +} + +static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb, + struct aio_submit_state *state) { struct kioctx *ctx = kiocb->ki_ctx; struct kiocb *req = &kiocb->rw; int ret; - req->ki_filp = fget(iocb->aio_fildes); + req->ki_filp = aio_file_get(state, iocb->aio_fildes); if (unlikely(!req->ki_filp)) return -EBADF; req->ki_pos = iocb->aio_offset; @@ -1924,7 +1995,8 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state, } static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, - bool vectored, bool compat) + struct aio_submit_state *state, bool vectored, + bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -1932,7 +2004,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, struct file *file; ssize_t ret; - ret = aio_prep_rw(kiocb, iocb); + ret = aio_prep_rw(kiocb, iocb, state); if (ret) return ret; file = req->ki_filp; @@ -1958,7 +2030,8 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb, } static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, - bool vectored, bool compat) + struct aio_submit_state *state, bool vectored, + bool compat) { struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs; struct kiocb *req = &kiocb->rw; @@ -1966,7 +2039,7 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb, struct file *file; ssize_t ret; - ret = aio_prep_rw(kiocb, iocb); + ret = 
aio_prep_rw(kiocb, iocb, state); if (ret) return ret; file = req->ki_filp; @@ -2277,16 +2350,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, ret = -EINVAL; switch (iocb->aio_lio_opcode) { case IOCB_CMD_PREAD: - ret = aio_read(req, iocb, false, compat); + ret = aio_read(req, iocb, state, false, compat); break; case IOCB_CMD_PWRITE: - ret = aio_write(req, iocb, false, compat); + ret = aio_write(req, iocb, state, false, compat); break; case IOCB_CMD_PREADV: - ret = aio_read(req, iocb, true, compat); + ret = aio_read(req, iocb, state, true, compat); break; case IOCB_CMD_PWRITEV: - ret = aio_write(req, iocb, true, compat); + ret = aio_write(req, iocb, state, true, compat); break; case IOCB_CMD_FSYNC: if (ctx->flags & IOCTX_FLAG_IOPOLL) @@ -2374,17 +2447,20 @@ static void aio_submit_state_end(struct aio_submit_state *state) blk_finish_plug(&state->plug); if (!list_empty(&state->req_list)) aio_flush_state_reqs(state->ctx, state); + aio_file_put(state); } /* * Start submission side cache. 
*/ static void aio_submit_state_start(struct aio_submit_state *state, - struct kioctx *ctx) + struct kioctx *ctx, int max_ios) { state->ctx = ctx; INIT_LIST_HEAD(&state->req_list); state->req_count = 0; + state->file = NULL; + state->ios_left = max_ios; #ifdef CONFIG_BLOCK state->plug_cb.callback = aio_state_unplug; blk_start_plug(&state->plug); @@ -2431,7 +2507,7 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr, nr = ctx->nr_events; if (nr > AIO_PLUG_THRESHOLD) { - aio_submit_state_start(&state, ctx); + aio_submit_state_start(&state, ctx, nr); statep = &state; } for (i = 0; i < nr; i++) { @@ -2475,7 +2551,7 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, nr = ctx->nr_events; if (nr > AIO_PLUG_THRESHOLD) { - aio_submit_state_start(&state, ctx); + aio_submit_state_start(&state, ctx, nr); statep = &state; } for (i = 0; i < nr; i++) { From patchwork Fri Nov 30 16:56:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706809 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 68AE113A4 for ; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A7393041C for ; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4F1A330425; Fri, 30 Nov 2018 16:57:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0555C3041C for ; Fri, 30 Nov 
2018 16:57:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727617AbeLAEHU (ORCPT ); Fri, 30 Nov 2018 23:07:20 -0500 Received: from mail-it1-f195.google.com ([209.85.166.195]:35943 "EHLO mail-it1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727510AbeLAEHT (ORCPT ); Fri, 30 Nov 2018 23:07:19 -0500 Received: by mail-it1-f195.google.com with SMTP id c9so10245163itj.1 for ; Fri, 30 Nov 2018 08:57:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=QMzfrApNgSsegtiyNm54nDrw8Z5lsx9xH88UNkAmxlk=; b=1PjsMo2k/rQM43AQfyKScK5BXnVtVvpXb1LlF3gELn8KexP33dB3x7LFs0drxK7yqU TzKQ/vdBnbHRnAXC6/j4KlIgwbdM6X4P6ZTz9ifSOiyR0yI9drQQOYC84YIQ5QHwXEME 8E1xOYc5LV/4lz+6RkyFTa9mR69lYNZHRggqubflttTMZdtEsJdID30cQnAi6Rar/xKI VL+cy2upK0AtiMN2tlWrDAlqwhDI7XzN7DrzIHae2jmGb9cKKeP7IXU4WzDyqyVxXoZM XqzjTOdhtEapoYTmnTsSLSEnsfWAUol5Qaac1+HgNYcfZ2NNb6KMwWfO4xEX3KwZI1qq v5SQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QMzfrApNgSsegtiyNm54nDrw8Z5lsx9xH88UNkAmxlk=; b=iZRxprZwOkCc+0yCY90iF6U0dKRnOkjWEXScg3I2BfEzjZEFfn4o3VahoIobu7jKZX gePt28fZVO1L3/NdkUonypadmJcsKoKwr4R7U4dEMv67UjGi15Rus3kgZUQ64BISq49K q88MVD4KIsdTKpaAMns2DfJG9Bl/2bcDV5djtCj1mlNlarfqlPVw02pMpt0QE5DHdlYK Opw9Kt2+pbT/0MHKaAxf2hJAayp93JaLpDMQ1E1DJjfZmRZVtU2KdHAETAhNpMr22Huc Algx4EJrJqUEAUO+qmJJ93ytqr+xdBMFKHDHAxGRYhbyzbAsuY0mLktMZXHobiVZl3WS Uqsg== X-Gm-Message-State: AA+aEWa3Qd5ahaZlUwWXXmSYwSeULS4MnSWC0/zRzA0in9nXXVPEkRXJ NMd1mgwAjNmYlp3VTD/JTUyaFYstWO0= X-Google-Smtp-Source: AFSGD/WfVDvHOK1f3lWSyhg9zT73G1kmZ/RJiyHIjtEAanTVcHjuWMOIt6Hd55ywnKROuqocGmP2hA== X-Received: by 2002:a05:660c:12c7:: with SMTP id k7mr5563963itd.148.1543597043189; Fri, 30 Nov 2018 08:57:23 -0800 (PST) Received: from localhost.localdomain 
([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:22 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 21/27] aio: split iocb init from allocation Date: Fri, 30 Nov 2018 09:56:40 -0700 Message-Id: <20181130165646.27341-22-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Jens Axboe --- fs/aio.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 291bbc62b2a8..341eb1b19319 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1088,6 +1088,16 @@ static bool get_reqs_available(struct kioctx *ctx) return __get_reqs_available(ctx); } +static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req) +{ + percpu_ref_get(&ctx->reqs); + req->ki_ctx = ctx; + INIT_LIST_HEAD(&req->ki_list); + req->ki_flags = 0; + refcount_set(&req->ki_refcnt, 0); + req->ki_eventfd = NULL; +} + /* aio_get_req * Allocate a slot for an aio request. * Returns NULL if no requests are free. 
@@ -1097,14 +1107,8 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) struct aio_kiocb *req; req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); - if (req) { - percpu_ref_get(&ctx->reqs); - req->ki_ctx = ctx; - INIT_LIST_HEAD(&req->ki_list); - req->ki_flags = 0; - refcount_set(&req->ki_refcnt, 0); - req->ki_eventfd = NULL; - } + if (req) + aio_iocb_init(ctx, req); return req; } From patchwork Fri Nov 30 16:56:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706815 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D3DC313A4 for ; Fri, 30 Nov 2018 16:57:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C38F73041C for ; Fri, 30 Nov 2018 16:57:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B810730422; Fri, 30 Nov 2018 16:57:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 60EC73041C for ; Fri, 30 Nov 2018 16:57:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727645AbeLAEHV (ORCPT ); Fri, 30 Nov 2018 23:07:21 -0500 Received: from mail-it1-f193.google.com ([209.85.166.193]:34234 "EHLO mail-it1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727510AbeLAEHV (ORCPT ); Fri, 30 Nov 2018 23:07:21 -0500 Received: by mail-it1-f193.google.com with SMTP id x124so2413199itd.1 for ; Fri, 30 Nov 2018 08:57:25 
-0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=78SuIHvJEGe4HNzKE6mphC3LsuN6oQSJbZlJ28AUFt4=; b=CybVR32+jarjJ76TPU/bO9CZh6/NEIOw504HgVXYL9vCvsOON6+ZEirHjPBmf6QLpA LlDMypoqqsx7t66NpCwcNR6Gs++5nv/R9tITUIT6sY9vEqUi66ulxuMNlbrc0nLDeN06 0islDq9NWRGbIbyKGRKzqOrkZFGWmjkOhA8kUnMjmXT91tts0+2Rss1WuCJ5pa3WmdwT afoqaAFQpyfksFrkEx7ebXP/UctwGkl4hrNoGadaRNjcHJuqNryzUv53jfB9P+e7zhK+ 1l7Itv23/zbquTD9ukbpxF+Vbx9akkAN02BfFJFncgzo4iETOuTxWjHit/xSgiCp5duU f0bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=78SuIHvJEGe4HNzKE6mphC3LsuN6oQSJbZlJ28AUFt4=; b=HDmWUX58Jjyhd9HLNqMKXB/gEusLqZuA0Kq6Mnp2VIGsyq0GN33wPCPavUdtzdHWs2 O/ce/IDME1BTmpsZkS1Vvv0XaaBHiVSLOH1Cg365xI8TRGBKreJm+Wfgjk5Pg2d+pKnG 4fVthcNlpBSPdepfp6oPDYfrm3YW9EVC+8DsJJwu+yZ0jlEu7DA1dYZ3gaqRWkpXT3OL eaclPgJzl7BHJkiGCG0otPE2gBHz7/wFDVEezdWr6g0YdoAEMvFYIlyx4yUBBAQTnhKz c4XUe6ftnj81OuxJJCKSKaOJlm5Am2nQZRI/CpiHtEuA+y6mcb1uAYU9eqCfsX/qXaG8 e+1w== X-Gm-Message-State: AA+aEWYOa5v05/OK2gu0N8z1WOSdxqETtNCa86pF+fd+N9S4MpGEhQLP 4OqPCC14u0I/AX4g2M8QEmQOToR8J10= X-Google-Smtp-Source: AFSGD/V5dWKOWJxvrGG9zXnr/xPm16NRoYiBRF3PbvhJS+lzC5P5Gb9ZAYpneogFNlSRcOEMATZzRQ== X-Received: by 2002:a24:570a:: with SMTP id u10mr6010544ita.11.1543597044784; Fri, 30 Nov 2018 08:57:24 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:23 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 22/27] aio: batch aio_kiocb allocation Date: Fri, 30 Nov 2018 09:56:41 -0700 Message-Id: <20181130165646.27341-23-axboe@kernel.dk> X-Mailer: 
git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Similarly to how we use the state->ios_left to know how many references to get to a file, we can use it to allocate the aio_kiocb's we need in bulk. Signed-off-by: Jens Axboe --- fs/aio.c | 42 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 37 insertions(+), 5 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 341eb1b19319..426939f1dae9 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -230,6 +230,8 @@ struct aio_kiocb { }; }; +#define AIO_IOPOLL_BATCH 8 + struct aio_submit_state { struct kioctx *ctx; @@ -244,6 +246,13 @@ struct aio_submit_state { struct list_head req_list; unsigned int req_count; + /* + * aio_kiocb alloc cache + */ + void *iocbs[AIO_IOPOLL_BATCH]; + unsigned int free_iocbs; + unsigned int cur_iocb; + /* * File reference cache */ @@ -1102,11 +1111,32 @@ static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req) * Allocate a slot for an aio request. * Returns NULL if no requests are free. 
*/ -static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) +static struct aio_kiocb *aio_get_req(struct kioctx *ctx, + struct aio_submit_state *state) { struct aio_kiocb *req; - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); + if (!state) + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); + else if (!state->free_iocbs) { + size_t size; + + size = min_t(size_t, state->ios_left, ARRAY_SIZE(state->iocbs)); + size = kmem_cache_alloc_bulk(kiocb_cachep, GFP_KERNEL, size, + state->iocbs); + if (size < 0) + return ERR_PTR(size); + else if (!size) + return ERR_PTR(-ENOMEM); + state->free_iocbs = size - 1; + state->cur_iocb = 1; + req = state->iocbs[0]; + } else { + req = state->iocbs[state->cur_iocb]; + state->free_iocbs--; + state->cur_iocb++; + } + if (req) aio_iocb_init(ctx, req); @@ -1347,8 +1377,6 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr, return ret < 0 || *i >= min_nr; } -#define AIO_IOPOLL_BATCH 8 - /* * Process completed iocb iopoll entries, copying the result to userspace. 
*/ @@ -2320,7 +2348,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb, return -EAGAIN; ret = -EAGAIN; - req = aio_get_req(ctx); + req = aio_get_req(ctx, state); if (unlikely(!req)) goto out_put_reqs_available; @@ -2452,6 +2480,9 @@ static void aio_submit_state_end(struct aio_submit_state *state) if (!list_empty(&state->req_list)) aio_flush_state_reqs(state->ctx, state); aio_file_put(state); + if (state->free_iocbs) + kmem_cache_free_bulk(kiocb_cachep, state->free_iocbs, + &state->iocbs[state->cur_iocb]); } /* @@ -2463,6 +2494,7 @@ static void aio_submit_state_start(struct aio_submit_state *state, state->ctx = ctx; INIT_LIST_HEAD(&state->req_list); state->req_count = 0; + state->free_iocbs = 0; state->file = NULL; state->ios_left = max_ios; #ifdef CONFIG_BLOCK From patchwork Fri Nov 30 16:56:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706819 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 563DB109C for ; Fri, 30 Nov 2018 16:57:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 459093041C for ; Fri, 30 Nov 2018 16:57:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 39DD930422; Fri, 30 Nov 2018 16:57:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E3DD23041C for ; Fri, 30 Nov 2018 16:57:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1727648AbeLAEHX (ORCPT ); Fri, 30 Nov 2018 23:07:23 -0500 Received: from mail-io1-f66.google.com ([209.85.166.66]:46929 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727510AbeLAEHW (ORCPT ); Fri, 30 Nov 2018 23:07:22 -0500 Received: by mail-io1-f66.google.com with SMTP id v10so5044942ios.13 for ; Fri, 30 Nov 2018 08:57:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=3J/GNZhTTiMRCQpvsWNuMxThlxd8tYG73UBMrgizEig=; b=VR0kTYq8AB4J8Vn1toh385cB7/dnykpeeOFuWJlgk2cWwC2J1lkJUm9AdJEUUGFtYs jS3UIJO0NYz9T49YSm/R78I+D1VBHmH/Rw8d5Ui7agg09qH9coxpNEj8Mfk9vf4W5XU3 siUhQpq/UmrFCP7rqGx93aBCw8sWrMGessPkYn5HwpLPUFHl58+KQ8sUFrZlH0QLgLW9 kAREjTLcDBlqeDPFSDLCJng3YiFEeafoIL8nmuVTN5INCMjYCaAUwOTGngNyOB3EQigb rgxdWUAuaG6V8pJR7IcQkWvNoJaxJMZ2f0NuzncNptm9Ku4Ewh9T6xeQcdFQQ1vimJKV enzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=3J/GNZhTTiMRCQpvsWNuMxThlxd8tYG73UBMrgizEig=; b=PbSNpksrTNueoGzyesMT/R/p0+v8WDkWeONao+rNR0JntToQ88K6QJqXfZAiKFdFN+ IWnG5aoR+2OLdDPzR47aAXE5HUhDmLx6A/y+yA5DIxyhggpiXmOBYO1S8tT6vrOlawgd leD7wcMPpIEGFJ1IV8aCaZ333NY4JaN1RX167+7qfzEq2TGhPMINlQgDckx0PDCE/rXJ vWmzkEUDAFJW4nNDLXZcLFmJ980cj1avZ6EVNR0HVe+rijfSDTcnSl3Tig7vDNI3kaqV I7tqsKLnJ+gCIEUB+CO9YSj8pZUaWaX2zEhqSUxiHi1o/EfVfh4hxHE5Ye8lGZgbfP+f lkng== X-Gm-Message-State: AA+aEWb0W6Q/ko1SiytyMLRoQDQ1GppprSFJn2qB2B5VFJmA+hlcSFKL PRCd1oF31U7/WnPicxiX5vyYhcp3I14= X-Google-Smtp-Source: AFSGD/WLH/s6hywjLCCKYbDrK7uQ++Y1OerH8wZc5uolx/bKLmJFpmLBVami+8jdnIJzzhFP+ILF4A== X-Received: by 2002:a6b:e802:: with SMTP id f2mr5209829ioh.296.1543597046133; Fri, 30 Nov 2018 08:57:26 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id 
j133sm2979447itj.16.2018.11.30.08.57.24 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:25 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 23/27] block: add BIO_HOLD_PAGES flag Date: Fri, 30 Nov 2018 09:56:42 -0700 Message-Id: <20181130165646.27341-24-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For user mapped IO, we do get_user_pages() upfront, and then do a put_page() on each page at end_io time to release the page reference. In preparation for having permanently mapped pages, add a BIO_HOLD_PAGES flag that tells us not to release the pages, the caller will do that. Signed-off-by: Jens Axboe --- block/bio.c | 6 ++++-- include/linux/blk_types.h | 1 + 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/block/bio.c b/block/bio.c index 03895cc0d74a..ab174bce5436 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1635,7 +1635,8 @@ static void bio_dirty_fn(struct work_struct *work) next = bio->bi_private; bio_set_pages_dirty(bio); - bio_release_pages(bio); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_release_pages(bio); bio_put(bio); } } @@ -1651,7 +1652,8 @@ void bio_check_pages_dirty(struct bio *bio) goto defer; } - bio_release_pages(bio); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_release_pages(bio); bio_put(bio); return; defer: diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index c0ba1a038ff3..78aaf7442688 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -227,6 +227,7 @@ struct bio { #define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion * of this bio. 
*/ #define BIO_QUEUE_ENTERED 11 /* can use blk_queue_enter_live() */ +#define BIO_HOLD_PAGES 12 /* don't put O_DIRECT pages */ /* See BVEC_POOL_OFFSET below before adding new flags */ From patchwork Fri Nov 30 16:56:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706823 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 73A7418B8 for ; Fri, 30 Nov 2018 16:57:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 64CEC3041C for ; Fri, 30 Nov 2018 16:57:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5976530422; Fri, 30 Nov 2018 16:57:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0A62D3041C for ; Fri, 30 Nov 2018 16:57:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727510AbeLAEHY (ORCPT ); Fri, 30 Nov 2018 23:07:24 -0500 Received: from mail-io1-f65.google.com ([209.85.166.65]:42605 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726736AbeLAEHY (ORCPT ); Fri, 30 Nov 2018 23:07:24 -0500 Received: by mail-io1-f65.google.com with SMTP id x6so5060611ioa.9 for ; Fri, 30 Nov 2018 08:57:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
bh=KTzMVdjW5NFmc+ni1z0AYs6SzISV9Szv/UKJuYhDmgM=; b=cywvOdps9izJS4wyu2a/6Vi+ryIpdgR6h2qtZGAAW5B0OBsEerT6be7yKT7QQhh8uQ 9jGKVa4iVqKwcm7kdLrIo1wgmsSmD9FK9rgIdfGQ1RwN9fleMhIiXPo8DX/WdJZghy0l zzBdwjgalTExk714jbBtbg0VsU0fleKxJVQI+LpskhRLdTUS0mcE5p5Xe6YRc5fsO45L ahkaKeWSVMY5h/kKK5GH0m5zMyhBOXmQlCXKm/jOVXOCPdYPD3tZIKRobQKqCnYi/fj3 DCqzKneZOW28xyeKxghUc3u0R4b0sLspyUe2fhXl4UPGCn2Yo/VsdCrygfcvMZkeTVb2 85Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=KTzMVdjW5NFmc+ni1z0AYs6SzISV9Szv/UKJuYhDmgM=; b=cvLcneDK1fcnQpDnan6bk/BtXqhOGxubZh6iP+G/VgBx2VjrTn+GmQNBAp5eFXN5+5 3cjk/VJenS6BTbRuj0+fhaGu5/Ca98YiRLP2pgvO4e9Wj0CcKbMmZH5dTXAyYDA8OZKD atesiYAR8O4ADwhhJMBS40U3Or2CR+jvm3A4u+FdQeLoJtx/dQ+n2IcYcSYg7EhseZCK 0KUfT7QuKUe2JITDaeX5oJeHvvpTmwvqUmNUSK0wLgenLcgPjwNk+MCfhVRAN4rFFUUD uQ0pYosI9Qgr8jzUyyvo4+9GpkN8T8uotHUHMheEwP6ByqTL8NeGIcqDebA0eItvJ0IT MTRQ== X-Gm-Message-State: AA+aEWaJtJskTUZ0yiJmapBBtfDLa4O8CCukWRoXKkfeN9iCwNcrU0Bg lZVOsFNG2wm8cD6wXMJ5jJjWTYiZiVM= X-Google-Smtp-Source: AFSGD/Xln4DhlHKpJ13gaxoheMc6RZwjWAd+PVZ6X1VhdC3IrOxkzslwk+z094NVyYcV1gcWAxy5bQ== X-Received: by 2002:a6b:f70a:: with SMTP id k10mr4834429iog.270.1543597048182; Fri, 30 Nov 2018 08:57:28 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:27 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio Date: Fri, 30 Nov 2018 09:56:43 -0700 Message-Id: <20181130165646.27341-25-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: 
linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For an ITER_KVEC, we can just iterate the iov and add the pages to the bio directly. Signed-off-by: Jens Axboe --- block/bio.c | 30 ++++++++++++++++++++++++++++++ include/linux/bio.h | 1 + 2 files changed, 31 insertions(+) diff --git a/block/bio.c b/block/bio.c index ab174bce5436..7e59ef547ed4 100644 --- a/block/bio.c +++ b/block/bio.c @@ -903,6 +903,36 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) } EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages); +/** + * bio_iov_kvec_add_pages - add pages from an ITER_KVEC to a bio + * @bio: bio to add pages to + * @iter: iov iterator describing the region to be added + * + * Iterate pages in the @iter and add them to the bio. We flag the + * @bio with BIO_HOLD_PAGES, telling IO completion not to free them. + */ +int bio_iov_kvec_add_pages(struct bio *bio, struct iov_iter *iter) +{ + unsigned short orig_vcnt = bio->bi_vcnt; + const struct kvec *kv; + + do { + struct page *page; + size_t size; + + kv = iter->kvec + iter->iov_offset; + page = virt_to_page(kv->iov_base); + size = bio_add_page(bio, page, kv->iov_len, + offset_in_page(kv->iov_base)); + if (size != kv->iov_len) + break; + iov_iter_advance(iter, size); + } while (iov_iter_count(iter) && !bio_full(bio)); + + bio_set_flag(bio, BIO_HOLD_PAGES); + return bio->bi_vcnt > orig_vcnt ? 
0 : -EINVAL; +} + static void submit_bio_wait_endio(struct bio *bio) { complete(bio->bi_private); diff --git a/include/linux/bio.h b/include/linux/bio.h index 056fb627edb3..23ae8fb66b1e 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -434,6 +434,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, void __bio_add_page(struct bio *bio, struct page *page, unsigned int len, unsigned int off); int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter); +int bio_iov_kvec_add_pages(struct bio *bio, struct iov_iter *iter); struct rq_map_data; extern struct bio *bio_map_user_iov(struct request_queue *, struct iov_iter *, gfp_t); From patchwork Fri Nov 30 16:56:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706827 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8721118B8 for ; Fri, 30 Nov 2018 16:57:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 789BA3041C for ; Fri, 30 Nov 2018 16:57:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6CA6830422; Fri, 30 Nov 2018 16:57:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1F4483041C for ; Fri, 30 Nov 2018 16:57:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726779AbeLAEH0 (ORCPT ); Fri, 30 Nov 2018 23:07:26 -0500 Received: from mail-it1-f194.google.com 
([209.85.166.194]:35513 "EHLO mail-it1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727653AbeLAEH0 (ORCPT ); Fri, 30 Nov 2018 23:07:26 -0500 Received: by mail-it1-f194.google.com with SMTP id p197so10253627itp.0 for ; Fri, 30 Nov 2018 08:57:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=g46PYTbAjyIcxDqCKCRd2T5v8MCWa71J81SsLDN3oOo=; b=EfcQOIoULyEeMdNtFqUGblTKvD4RYYzQLQUU34/06yRNt+Xahpv12wPlX0X2xgkFXh OTbzHOE5HiWZgeJiAuAQD+6e8bB1UwKSe4mHSmIVHFWkm5VkjLf4KBuiqdkt9DQTE1Hn e36Y/xgvavR8b2FeAjgUADEWnD7Q1lZ0ziPnX5goup4285SAkeb3U6CRfLZPN2UGRrQC U27gi+HNd23Usf/6GvXoe8nAALoDLXtGx46lElqUB8i/ahApkhrWd7Q9dzRw7rahUeQ0 W2Ix9Rw3bCXYWc1ZheeR70ZG1c+j4bidU7hBbqLlPfZDGLbNGx2Mw1SXksORTuFHDlH/ IGlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=g46PYTbAjyIcxDqCKCRd2T5v8MCWa71J81SsLDN3oOo=; b=fgKLHN4Uu44l++j+VuaYWPBjuXw8Gbvrg4n52gQl0p+Q9Z1UoxoDuFthHblUmow0wz H+zQrHTOcGmeQh21pyO2EzFIGkGXidercvHx7R0aH45vJKCY7cCMBktOEoY0l9MjeuwO O3ReCcSMVpY7Sw4PMuGFuxsQJDABJqzAgaKJShrV1elzvD2i57NPMXyzJSwEc7gHSQj2 ki1IwJWvgSCXpXrnL1Z7NUvWyRYPlaJbShMuW7bOQnWSdGklOWvs4df96GMxSfHeC3r7 eMyCS5cCE8JSqL2yodrV59WAoc/hXMgSwZ8+QiWKa1QzK46HVTBReliPREdUcoqFkTgs JhAg== X-Gm-Message-State: AA+aEWbULbEHChXcqHmTUHDwRDyBr8Tny3mpwVBHxt+4Xm6rcOD5t9ko VBX27kTdduZeu2MCZQqx5DptlGeHovs= X-Google-Smtp-Source: AFSGD/VustQw5HVZWLmhppIjvJibhAVyzP6F1VYZU8kliZqR1wrNqFEz2nrCQ6WiVVRTUoKpcbvLGA== X-Received: by 2002:a24:a08a:: with SMTP id o132mr6004259ite.1.1543597049680; Fri, 30 Nov 2018 08:57:29 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:28 -0800 (PST) From: Jens 
Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 25/27] fs: add support for mapping an ITER_KVEC for O_DIRECT Date: Fri, 30 Nov 2018 09:56:44 -0700 Message-Id: <20181130165646.27341-26-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This adds support for sync/async O_DIRECT to make a kvec type iter for bdev access, as well as iomap. Signed-off-by: Jens Axboe --- fs/block_dev.c | 16 ++++++++++++---- fs/iomap.c | 5 ++++- 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index ebc3d5a0f424..b926f03de55e 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -219,7 +219,10 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter, bio.bi_end_io = blkdev_bio_end_io_simple; bio.bi_ioprio = iocb->ki_ioprio; - ret = bio_iov_iter_get_pages(&bio, iter); + if (iov_iter_is_kvec(iter)) + ret = bio_iov_kvec_add_pages(&bio, iter); + else + ret = bio_iov_iter_get_pages(&bio, iter); if (unlikely(ret)) goto out; ret = bio.bi_iter.bi_size; @@ -326,8 +329,9 @@ static void blkdev_bio_end_io(struct bio *bio) struct bio_vec *bvec; int i; - bio_for_each_segment_all(bvec, bio, i) - put_page(bvec->bv_page); + if (!bio_flagged(bio, BIO_HOLD_PAGES)) + bio_for_each_segment_all(bvec, bio, i) + put_page(bvec->bv_page); bio_put(bio); } } @@ -381,7 +385,11 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages) bio->bi_end_io = blkdev_bio_end_io; bio->bi_ioprio = iocb->ki_ioprio; - ret = bio_iov_iter_get_pages(bio, iter); + if (iov_iter_is_kvec(iter)) + ret = bio_iov_kvec_add_pages(bio, iter); + else + ret = bio_iov_iter_get_pages(bio, iter); + if (unlikely(ret)) { bio->bi_status = 
BLK_STS_IOERR; bio_endio(bio); diff --git a/fs/iomap.c b/fs/iomap.c index 96d60b9b2bea..72f58d604fab 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1671,7 +1671,10 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, bio->bi_private = dio; bio->bi_end_io = iomap_dio_bio_end_io; - ret = bio_iov_iter_get_pages(bio, &iter); + if (iov_iter_is_kvec(&iter)) + ret = bio_iov_kvec_add_pages(bio, &iter); + else + ret = bio_iov_iter_get_pages(bio, &iter); if (unlikely(ret)) { bio_put(bio); return copied ? copied : ret; From patchwork Fri Nov 30 16:56:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706831 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1439518B8 for ; Fri, 30 Nov 2018 16:57:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05E013041C for ; Fri, 30 Nov 2018 16:57:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EE58430422; Fri, 30 Nov 2018 16:57:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98C863041C for ; Fri, 30 Nov 2018 16:57:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726736AbeLAEH2 (ORCPT ); Fri, 30 Nov 2018 23:07:28 -0500 Received: from mail-io1-f66.google.com ([209.85.166.66]:43654 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727653AbeLAEH1 (ORCPT ); Fri, 30 
Nov 2018 23:07:27 -0500 Received: by mail-io1-f66.google.com with SMTP id g8so5053605iop.10 for ; Fri, 30 Nov 2018 08:57:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=t5Ipy9cJT+UymoTkcHrGWgGflPH0UWisZzkkCO9C0qI=; b=SpjWkyYZItYDW+aLLrAYQSIkYTHzXKUa07G+A7smAY9qZ9aOqX7lnGMpjXxMIDPBEC 15XbsteKPaOdvTjSIye1ZmaN8/neTzAGWPGfXaRB8GT56nJlxI9OteOD6VRQBAGqPdQ8 b3x7qaNblQo1RGJ+WIvs27W3d7TJ/VgelhJYlxXqFm3tfIaOf0Bs7CObWMC7jGvlbcHa 2IKOgg80aX6SArLrN6G+EwDigtnNzWdM2iRLv3xjDV0PGzb+O6ScXvam0MZLwzaaPJyQ bC3+rBHHwnIMohGq3BFHifk/TpXv1In7ckR8oqgUwbPvG/BKhij14/fw+KxtpOLtjISj AKHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=t5Ipy9cJT+UymoTkcHrGWgGflPH0UWisZzkkCO9C0qI=; b=UUdSxiJUFTvZANS5zTeeQ/toRyITNvitrL40lWGSKlC9X70usNK/QW1+4tvqq0LDQr 11Eu2iV8FeDYkAp5hHAwNbPxyr7XMR2bJAFtLKBev6sHFWuMoqJNTP0X68JIZ7Pm+AoD /WRk0mOFCI2GdWf5u07/nSwpauH2zY55h95aHnxp3hMoomJ81b0RTrGG4IvUfR7zAINY fw6brNWG7I3nO7qKLMVDLUx2OmSP4dodWX41SOicP5pmGt5E5yXDxjJGZ0soONg3Zclb LfMtxyfMDC0aDPCDJiQapzyLEk0rACwB/8G07roYbohYmaVjr2dbzTOVbTX2N7BBmzxS cSOA== X-Gm-Message-State: AA+aEWbRzPiq4dFQf0i3EWKOVz+N7ieSR48AHLiv/9m8M795bC8WgRjO C7oPVOwE6W8+BmxBvr0TzfYUf5q22+g= X-Google-Smtp-Source: AFSGD/XWHdoKjMOdVnyilerhhmLIemiGEc8b5V5SqbnHw2OlGhkbNKO664axLyd6pyc32pWmUPBFYw== X-Received: by 2002:a6b:7946:: with SMTP id j6mr5106927iop.29.1543597051153; Fri, 30 Nov 2018 08:57:31 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:30 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 26/27] iov_iter: add 
import_kvec() Date: Fri, 30 Nov 2018 09:56:45 -0700 Message-Id: <20181130165646.27341-27-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This explicitly sets up an ITER_KVEC from an iovec with kernel ranges mapped. Signed-off-by: Jens Axboe --- include/linux/uio.h | 3 +++ lib/iov_iter.c | 35 ++++++++++++++++++++++++++--------- 2 files changed, 29 insertions(+), 9 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 55ce99ddb912..bbefdb421f6d 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -284,6 +284,9 @@ int compat_import_iovec(int type, const struct compat_iovec __user * uvector, int import_single_range(int type, void __user *buf, size_t len, struct iovec *iov, struct iov_iter *i); +int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs, + size_t bytes, struct iov_iter *iter); + int iov_iter_for_each_range(struct iov_iter *i, size_t bytes, int (*f)(struct kvec *vec, void *context), void *context); diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 7ebccb5c1637..a5b6dd691f37 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -431,25 +431,33 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes) } EXPORT_SYMBOL(iov_iter_fault_in_readable); -void iov_iter_init(struct iov_iter *i, unsigned int direction, - const struct iovec *iov, unsigned long nr_segs, - size_t count) +static void iov_iter_init_type(struct iov_iter *i, int type, + unsigned int direction, const struct iovec *iov, + unsigned long nr_segs, size_t count) { WARN_ON(direction & ~(READ | WRITE)); direction &= READ | WRITE; - /* It will get better. Eventually... 
*/ - if (uaccess_kernel()) { - i->type = ITER_KVEC | direction; + i->type = type | direction; + if (i->type == ITER_KVEC) i->kvec = (struct kvec *)iov; - } else { - i->type = ITER_IOVEC | direction; + else i->iov = iov; - } + i->nr_segs = nr_segs; i->iov_offset = 0; i->count = count; } + +void iov_iter_init(struct iov_iter *i, unsigned int direction, + const struct iovec *iov, unsigned long nr_segs, + size_t count) +{ + /* It will get better. Eventually... */ + int type = uaccess_kernel() ? ITER_KVEC : ITER_IOVEC; + + iov_iter_init_type(i, type, direction, iov, nr_segs, count); +} EXPORT_SYMBOL(iov_iter_init); static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len) @@ -1582,6 +1590,15 @@ int import_iovec(int type, const struct iovec __user * uvector, } EXPORT_SYMBOL(import_iovec); +int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs, + size_t bytes, struct iov_iter *iter) +{ + const struct iovec *p = (const struct iovec *) kvecs; + + iov_iter_init_type(iter, ITER_KVEC, type, p, nr_segs, bytes); + return 0; +} + #ifdef CONFIG_COMPAT #include From patchwork Fri Nov 30 16:56:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 10706835 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 991D418B8 for ; Fri, 30 Nov 2018 16:57:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A5A43041C for ; Fri, 30 Nov 2018 16:57:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7EACE30422; Fri, 30 Nov 2018 16:57:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA17E3041C for ; Fri, 30 Nov 2018 16:57:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727653AbeLAEH3 (ORCPT ); Fri, 30 Nov 2018 23:07:29 -0500 Received: from mail-io1-f65.google.com ([209.85.166.65]:40794 "EHLO mail-io1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727660AbeLAEH3 (ORCPT ); Fri, 30 Nov 2018 23:07:29 -0500 Received: by mail-io1-f65.google.com with SMTP id n9so5070639ioh.7 for ; Fri, 30 Nov 2018 08:57:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aki/HEwtmsNooA8ZOaa31kgTTJLrCRRouznljeQ9dMg=; b=bsrjDYFIKU2SWAVc2l2m3UuvsJCuSEwR2UFYC1qI0DgsNh8PzYEYTjDCj8rE+ZUjzU o1XssgeB0YCt30agEtU6konJP/BK9g4FY+/UCRiauQcv0r9sgIwrdNZHFgCz2OzAHpI7 2C/bzUWRhY4+Pu390nYlppjbFFp5Ik4eDjxsaZtfwJS/fmbgBPXFHnvC8RESJ6htKXK/ 6tXlWxFOmx53ZXOPtmJVn+j+GNr1+IuVK0268L2ewlIDGDoeN5nX8gq5DY3mNBkzPjsr PAIebloCaY3TEsWLyikGrLPJnGHOwkG72+rTLQf9PI0coZJvTlHnz/5JfVj92CafnnvU 2udA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aki/HEwtmsNooA8ZOaa31kgTTJLrCRRouznljeQ9dMg=; b=ECRhCwnYey4omo7jxhKIspekln9MFblsTAr9oXjjjwjyW3iYOyaTws7EKxxvNv6k5u 7/HG3vvXapIoz/B+kxdqtGZKVA8Y9qqY2HQ8U8vnl/J60WYsYS7Ian1HncnFmEPv1mUj WWrwjZyo1T35vvR+vd8cih+xXLnMpDQLpVhlQRnqFrX4cYxix/fIj9W/XmuBvKaHdQQ+ v1fFWnOaVxO7R+mk6TmQniJggUNLH2NBY5OdXSFZo4jJILw2qzQcCL3C4IAUJYmN6i+5 C0x16uxdnqxo6x2pB8TITJJwbd8onUTZN3GMCy+qvy1b7HujA4FroQi0xRrJc3olb5R0 Grsw== X-Gm-Message-State: AA+aEWZkZXuR/MpsAC97wcJSei5ma6EDzY6SMCMeXY8shnsfr4yElICo tE0fgWGoJFYM4YJO6K/Tt6OZ0vjedYs= X-Google-Smtp-Source: 
AFSGD/VH7pwateZNZ9rs9Hbt56WICSICYtAcNRGraKhq8NTLYJjr4DYvXMXt5t2qAc8i9H8ZvaPDKw== X-Received: by 2002:a6b:3e45:: with SMTP id l66mr4938182ioa.269.1543597052800; Fri, 30 Nov 2018 08:57:32 -0800 (PST) Received: from localhost.localdomain ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id j133sm2979447itj.16.2018.11.30.08.57.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 08:57:31 -0800 (PST) From: Jens Axboe To: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org Cc: hch@lst.de, Jens Axboe Subject: [PATCH 27/27] aio: add support for pre-mapped user IO buffers Date: Fri, 30 Nov 2018 09:56:46 -0700 Message-Id: <20181130165646.27341-28-axboe@kernel.dk> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181130165646.27341-1-axboe@kernel.dk> References: <20181130165646.27341-1-axboe@kernel.dk> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If we have fixed user buffers, we can map them into the kernel when we set up the io_context. That avoids the need to do get_user_pages() for each and every IO. To utilize this feature, the application must set both IOCTX_FLAG_USERIOCB, to provide iocbs in userspace, and IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the iocbs that are mapped already contain valid destinations and sizes. These buffers can then be mapped into the kernel for the lifetime of the io_context, as opposed to just the duration of each single IO. This only works with non-vectored read/write commands for now, not with PREADV/PWRITEV. A limit of 4M is imposed as the largest buffer we currently support. There's nothing preventing us from going larger, but we need some cap, and 4M seemed like it would definitely be big enough.
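As a rough illustration of what the 4M cap means in pages, here is a minimal userspace sketch of the page-span arithmetic that aio_iocb_buffer_map() in this patch uses (end page rounded up, minus start page). The 4K page size and all sketch_* names are assumptions made for illustration; they are not part of the patch:

```c
#include <stdint.h>

/* Assumption: 4K pages. The kernel uses PAGE_SHIFT/PAGE_SIZE instead. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_SZ_4M      (4ULL << 20)

/*
 * Number of pages touched by the byte range [ubuf, ubuf + nbytes).
 * Mirrors the start/end computation in aio_iocb_buffer_map():
 * round the end address up to a page boundary, then subtract the
 * index of the first page.
 */
static unsigned long sketch_nr_pages(uint64_t ubuf, uint64_t nbytes)
{
	uint64_t end = (ubuf + nbytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	uint64_t start = ubuf >> SKETCH_PAGE_SHIFT;

	return (unsigned long)(end - start);
}
```

Under the 4K assumption, a page-aligned 4M buffer spans 1024 pages and an unaligned one spans 1025, so that would be the most a single iocb pins with this cap.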
See the fio change for how to utilize this feature: http://git.kernel.dk/cgit/fio/commit/?id=2041bd343da1c1e955253f62374588718c64f0f3 Signed-off-by: Jens Axboe --- fs/aio.c | 185 +++++++++++++++++++++++++++++++---- include/uapi/linux/aio_abi.h | 1 + 2 files changed, 169 insertions(+), 17 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index 426939f1dae9..f735967488a5 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -86,6 +87,11 @@ struct ctx_rq_wait { atomic_t count; }; +struct aio_mapped_ubuf { + struct kvec *kvec; + unsigned int nr_kvecs; +}; + struct kioctx { struct percpu_ref users; atomic_t dead; @@ -124,6 +130,8 @@ struct kioctx { struct page **iocb_pages; long iocb_nr_pages; + struct aio_mapped_ubuf *user_bufs; + struct rcu_work free_rwork; /* see free_ioctx() */ /* @@ -290,6 +298,7 @@ static const bool aio_use_state_req_list = false; #endif static void aio_useriocb_free(struct kioctx *); +static void aio_iocb_buffer_unmap(struct kioctx *); static void aio_iopoll_reap_events(struct kioctx *); static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages) @@ -652,6 +661,7 @@ static void free_ioctx(struct work_struct *work) free_rwork); pr_debug("freeing %p\n", ctx); + aio_iocb_buffer_unmap(ctx); aio_useriocb_free(ctx); aio_free_ring(ctx); free_percpu(ctx->cpu); @@ -1597,6 +1607,115 @@ static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index) return iocb + index; } +static void aio_iocb_buffer_unmap(struct kioctx *ctx) +{ + int i, j; + + if (!ctx->user_bufs) + return; + + for (i = 0; i < ctx->max_reqs; i++) { + struct aio_mapped_ubuf *amu = &ctx->user_bufs[i]; + + for (j = 0; j < amu->nr_kvecs; j++) { + struct page *page; + + page = virt_to_page(amu->kvec[j].iov_base); + put_page(page); + } + kfree(amu->kvec); + amu->nr_kvecs = 0; + } + + kfree(ctx->user_bufs); + ctx->user_bufs = NULL; +} + +static int aio_iocb_buffer_map(struct kioctx *ctx) +{ + struct page **pages = NULL; + 
int i, j, got_pages = 0; + struct iocb *iocb; + int ret = -EINVAL; + + ctx->user_bufs = kzalloc(ctx->max_reqs * sizeof(struct aio_mapped_ubuf), + GFP_KERNEL); + if (!ctx->user_bufs) + return -ENOMEM; + + for (i = 0; i < ctx->max_reqs; i++) { + struct aio_mapped_ubuf *amu = &ctx->user_bufs[i]; + unsigned long off, start, end, ubuf; + int pret, nr_pages; + size_t size; + + iocb = aio_iocb_from_index(ctx, i); + + /* + * Don't impose further limits on the size and buffer + * constraints here, we'll -EINVAL later when IO is + * submitted if they are wrong. + */ + ret = -EFAULT; + if (!iocb->aio_buf) + goto err; + + /* arbitrary limit, but we need something */ + if (iocb->aio_nbytes > SZ_4M) + goto err; + + ubuf = iocb->aio_buf; + end = (ubuf + iocb->aio_nbytes + PAGE_SIZE - 1) >> PAGE_SHIFT; + start = ubuf >> PAGE_SHIFT; + nr_pages = end - start; + + if (!pages || nr_pages > got_pages) { + kfree(pages); + pages = kmalloc(nr_pages * sizeof(struct page *), + GFP_KERNEL); + if (!pages) { + ret = -ENOMEM; + goto err; + } + got_pages = nr_pages; + } + + amu->kvec = kmalloc(nr_pages * sizeof(struct kvec), GFP_KERNEL); + if (!amu->kvec) + goto err; + + down_write(&current->mm->mmap_sem); + pret = get_user_pages((unsigned long) iocb->aio_buf, nr_pages, + 1, pages, NULL); + up_write(&current->mm->mmap_sem); + + if (pret < nr_pages) { + if (pret < 0) + ret = pret; + goto err; + } + + off = ubuf & ~PAGE_MASK; + size = iocb->aio_nbytes; + for (j = 0; j < nr_pages; j++) { + size_t vec_len; + + vec_len = min_t(size_t, size, PAGE_SIZE - off); + amu->kvec[j].iov_base = page_address(pages[j]) + off; + amu->kvec[j].iov_len = vec_len; + off = 0; + size -= vec_len; + } + amu->nr_kvecs = nr_pages; + } + kfree(pages); + return 0; +err: + kfree(pages); + aio_iocb_buffer_unmap(ctx); + return ret; +} + static void aio_useriocb_free(struct kioctx *ctx) { int i; @@ -1647,7 +1766,8 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user, unsigned long ctx; long ret; - if (flags &
~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))
+	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL |
+		      IOCTX_FLAG_FIXEDBUFS))
 		return -EINVAL;
 
 	ret = get_user(ctx, ctxp);
@@ -1663,6 +1783,15 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 		ret = aio_useriocb_map(ioctx, iocbs);
 		if (ret)
 			goto err;
+		if (flags & IOCTX_FLAG_FIXEDBUFS) {
+			ret = aio_iocb_buffer_map(ioctx);
+			if (ret)
+				goto err;
+		}
+	} else if (flags & IOCTX_FLAG_FIXEDBUFS) {
+		/* can only support fixed bufs with user mapped iocbs */
+		ret = -EINVAL;
+		goto err;
 	}
 
 	ret = put_user(ioctx->user_id, ctxp);
@@ -1939,23 +2068,38 @@ static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	return ret;
 }
 
-static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
-		bool vectored, bool compat, struct iov_iter *iter)
+static int aio_setup_rw(int rw, struct aio_kiocb *kiocb,
+		const struct iocb *iocb, struct iovec **iovec, bool vectored,
+		bool compat, bool kvecs, struct iov_iter *iter)
 {
-	void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
+	void __user *ubuf = (void __user *)(uintptr_t)iocb->aio_buf;
 	size_t len = iocb->aio_nbytes;
 
 	if (!vectored) {
-		ssize_t ret = import_single_range(rw, buf, len, *iovec, iter);
+		ssize_t ret;
+
+		if (!kvecs) {
+			ret = import_single_range(rw, ubuf, len, *iovec, iter);
+		} else {
+			long index = (long) kiocb->ki_user_iocb;
+			struct aio_mapped_ubuf *amu;
+
+			/* __io_submit_one() already validated the index */
+			amu = &kiocb->ki_ctx->user_bufs[index];
+			ret = import_kvec(rw, amu->kvec, amu->nr_kvecs,
+						len, iter);
+		}
 		*iovec = NULL;
 		return ret;
 	}
+	if (kvecs)
+		return -EINVAL;
 #ifdef CONFIG_COMPAT
 	if (compat)
-		return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec,
+		return compat_import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec,
 				iter);
 #endif
-	return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter);
+	return import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec, iter);
 }
 
 static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
@@ -2028,7 +2172,7 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
 
 static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			struct aio_submit_state *state, bool vectored,
-			bool compat)
+			bool compat, bool kvecs)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -2048,9 +2192,11 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	if (unlikely(!file->f_op->read_iter))
 		goto out_fput;
 
-	ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter);
+	ret = aio_setup_rw(READ, kiocb, iocb, &iovec, vectored, compat, kvecs,
+				&iter);
 	if (ret)
 		goto out_fput;
+
 	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret)
 		aio_rw_done(req, call_read_iter(file, req, &iter));
@@ -2063,7 +2209,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 
 static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			 struct aio_submit_state *state, bool vectored,
-			 bool compat)
+			 bool compat, bool kvecs)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -2083,7 +2229,8 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	if (unlikely(!file->f_op->write_iter))
 		goto out_fput;
 
-	ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter);
+	ret = aio_setup_rw(WRITE, kiocb, iocb, &iovec, vectored, compat, kvecs,
+				&iter);
 	if (ret)
 		goto out_fput;
 	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
@@ -2322,7 +2469,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 
 static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 			   struct iocb __user *user_iocb,
-			   struct aio_submit_state *state, bool compat)
+			   struct aio_submit_state *state, bool compat,
+			   bool kvecs)
 {
 	struct aio_kiocb *req;
 	ssize_t ret;
@@ -2382,16 +2530,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	ret = -EINVAL;
 	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(req, iocb, state, false, compat);
+		ret = aio_read(req, iocb, state, false, compat, kvecs);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(req, iocb, state, false, compat);
+		ret = aio_write(req, iocb, state, false, compat, kvecs);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(req, iocb, state, true, compat);
+		ret = aio_read(req, iocb, state, true, compat, kvecs);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(req, iocb, state, true, compat);
+		ret = aio_write(req, iocb, state, true, compat, kvecs);
 		break;
 	case IOCB_CMD_FSYNC:
 		if (ctx->flags & IOCTX_FLAG_IOPOLL)
@@ -2443,6 +2591,7 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 			 struct aio_submit_state *state, bool compat)
 {
 	struct iocb iocb, *iocbp;
+	bool kvecs;
 
 	if (ctx->flags & IOCTX_FLAG_USERIOCB) {
 		unsigned long iocb_index = (unsigned long) user_iocb;
@@ -2450,14 +2599,16 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		if (iocb_index >= ctx->max_reqs)
 			return -EINVAL;
 
+		kvecs = (ctx->flags & IOCTX_FLAG_FIXEDBUFS) != 0;
 		iocbp = aio_iocb_from_index(ctx, iocb_index);
 	} else {
 		if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
 			return -EFAULT;
+		kvecs = false;
 		iocbp = &iocb;
 	}
 
-	return __io_submit_one(ctx, iocbp, user_iocb, state, compat);
+	return __io_submit_one(ctx, iocbp, user_iocb, state, compat, kvecs);
 }
 
 #ifdef CONFIG_BLOCK
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index ea0b9a19f4df..05d72cf86bd3 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -110,6 +110,7 @@ struct iocb {
 
 #define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
 #define IOCTX_FLAG_IOPOLL	(1 << 1)	/* io_context is polled */
+#define IOCTX_FLAG_FIXEDBUFS	(1 << 2)	/* IO buffers are fixed */
 
 #undef IFBIG
 #undef IFLITTLE