From patchwork Mon Nov 26 16:45:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 10698717
From: Jens Axboe
To: linux-block@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 18/20] aio: add submission side request cache
Date: Mon, 26 Nov 2018 09:45:42 -0700
Message-Id: <20181126164544.5699-19-axboe@kernel.dk>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181126164544.5699-1-axboe@kernel.dk>
References: <20181126164544.5699-1-axboe@kernel.dk>

We have to add each submitted polled request to the io_context
poll_submitted list, which means
we have to grab the poll_lock. We already use the block plug to batch
submissions if we're doing a batch of IO submissions, extend that to
cover the poll requests internally as well.

Signed-off-by: Jens Axboe
---
 fs/aio.c | 136 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 113 insertions(+), 23 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index db73c8af1a0a..74afd1ff7fc9 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -247,6 +247,15 @@ static const struct address_space_operations aio_ctx_aops;
 static const unsigned int iocb_page_shift =
 				ilog2(PAGE_SIZE / sizeof(struct iocb));
 
+/*
+ * We rely on block level unplugs to flush pending requests, if we schedule
+ */
+#ifdef CONFIG_BLOCK
+static const bool aio_use_state_req_list = true;
+#else
+static const bool aio_use_state_req_list = false;
+#endif
+
 static void aio_useriocb_free(struct kioctx *);
 static void aio_iopoll_reap_events(struct kioctx *);
 
@@ -1724,6 +1733,21 @@ static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2)
 	}
 }
 
+struct aio_submit_state {
+	struct kioctx *ctx;
+
+	struct blk_plug plug;
+#ifdef CONFIG_BLOCK
+	struct blk_plug_cb plug_cb;
+#endif
+
+	/*
+	 * Polled iocbs that have been submitted, but not added to the ctx yet
+	 */
+	struct list_head req_list;
+	unsigned int req_count;
+};
+
 static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb)
 {
 	struct kioctx *ctx = kiocb->ki_ctx;
@@ -1832,13 +1856,28 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 }
 
+/*
+ * Called either at the end of IO submission, or through a plug callback
+ * because we're going to schedule. Moves out local batch of requests to
+ * the ctx poll list, so they can be found for polling + reaping.
+ */
+static void aio_flush_state_reqs(struct kioctx *ctx,
+				 struct aio_submit_state *state)
+{
+	spin_lock(&ctx->poll_lock);
+	list_splice_tail_init(&state->req_list, &ctx->poll_submitted);
+	spin_unlock(&ctx->poll_lock);
+	state->req_count = 0;
+}
+
 /*
  * After the iocb has been issued, it's safe to be found on the poll list.
  * Adding the kiocb to the list AFTER submission ensures that we don't
  * find it from a io_getevents() thread before the issuer is done accessing
  * the kiocb cookie.
  */
-static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb)
+static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
+				   struct aio_kiocb *kiocb)
 {
 	/*
 	 * For fast devices, IO may have already completed. If it has, add
@@ -1848,12 +1887,21 @@ static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb)
 	const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags);
 	struct kioctx *ctx = kiocb->ki_ctx;
 
-	spin_lock(&ctx->poll_lock);
-	if (front_add)
-		list_add(&kiocb->ki_list, &ctx->poll_submitted);
-	else
-		list_add_tail(&kiocb->ki_list, &ctx->poll_submitted);
-	spin_unlock(&ctx->poll_lock);
+	if (!state || !aio_use_state_req_list) {
+		spin_lock(&ctx->poll_lock);
+		if (front_add)
+			list_add(&kiocb->ki_list, &ctx->poll_submitted);
+		else
+			list_add_tail(&kiocb->ki_list, &ctx->poll_submitted);
+		spin_unlock(&ctx->poll_lock);
+	} else {
+		if (front_add)
+			list_add(&kiocb->ki_list, &state->req_list);
+		else
+			list_add_tail(&kiocb->ki_list, &state->req_list);
+		if (++state->req_count >= AIO_IOPOLL_BATCH)
+			aio_flush_state_reqs(ctx, state);
+	}
 }
 
 static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
@@ -2149,7 +2197,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 }
 
 static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
-			   struct iocb __user *user_iocb, bool compat)
+			   struct iocb __user *user_iocb,
+			   struct aio_submit_state *state, bool compat)
 {
 	struct aio_kiocb *req;
 	ssize_t ret;
@@ -2253,7 +2302,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 			ret = -EAGAIN;
 			goto out_put_req;
 		}
-		aio_iopoll_iocb_issued(req);
+		aio_iopoll_iocb_issued(state, req);
 	}
 	return 0;
 out_put_req:
@@ -2267,7 +2316,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 }
 
 static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
-			 bool compat)
+			 struct aio_submit_state *state, bool compat)
 {
 	struct iocb iocb, *iocbp;
 
@@ -2288,7 +2337,44 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		iocbp = &iocb;
 	}
 
-	return __io_submit_one(ctx, iocbp, user_iocb, compat);
+	return __io_submit_one(ctx, iocbp, user_iocb, state, compat);
+}
+
+#ifdef CONFIG_BLOCK
+static void aio_state_unplug(struct blk_plug_cb *cb, bool from_schedule)
+{
+	struct aio_submit_state *state;
+
+	state = container_of(cb, struct aio_submit_state, plug_cb);
+	if (!list_empty(&state->req_list))
+		aio_flush_state_reqs(state->ctx, state);
+}
+#endif
+
+/*
+ * Batched submission is done, ensure local IO is flushed out.
+ */
+static void aio_submit_state_end(struct aio_submit_state *state)
+{
+	blk_finish_plug(&state->plug);
+	if (!list_empty(&state->req_list))
+		aio_flush_state_reqs(state->ctx, state);
+}
+
+/*
+ * Start submission side cache.
+ */
+static void aio_submit_state_start(struct aio_submit_state *state,
+				   struct kioctx *ctx)
+{
+	state->ctx = ctx;
+	INIT_LIST_HEAD(&state->req_list);
+	state->req_count = 0;
+#ifdef CONFIG_BLOCK
+	state->plug_cb.callback = aio_state_unplug;
+	blk_start_plug(&state->plug);
+	list_add(&state->plug_cb.list, &state->plug.cb_list);
+#endif
 }
 
 /*
@@ -2312,10 +2398,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		struct iocb __user * __user *, iocbpp)
 {
+	struct aio_submit_state state, *statep = NULL;
 	struct kioctx *ctx;
 	long ret = 0;
 	int i = 0;
-	struct blk_plug plug;
 
 	if (unlikely(nr < 0))
 		return -EINVAL;
@@ -2329,8 +2415,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD) {
+		aio_submit_state_start(&state, ctx);
+		statep = &state;
+	}
 	for (i = 0; i < nr; i++) {
 		struct iocb __user *user_iocb;
 
@@ -2339,12 +2427,12 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 			break;
 		}
 
-		ret = io_submit_one(ctx, user_iocb, false);
+		ret = io_submit_one(ctx, user_iocb, statep, false);
 		if (ret)
 			break;
 	}
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_finish_plug(&plug);
+	if (statep)
+		aio_submit_state_end(statep);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
@@ -2354,10 +2442,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 		       int, nr, compat_uptr_t __user *, iocbpp)
 {
+	struct aio_submit_state state, *statep = NULL;
 	struct kioctx *ctx;
 	long ret = 0;
 	int i = 0;
-	struct blk_plug plug;
 
 	if (unlikely(nr < 0))
 		return -EINVAL;
@@ -2371,8 +2459,10 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD) {
+		aio_submit_state_start(&state, ctx);
+		statep = &state;
+	}
 	for (i = 0; i < nr; i++) {
 		compat_uptr_t user_iocb;
 
@@ -2381,12 +2471,12 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 			break;
 		}
 
-		ret = io_submit_one(ctx, compat_ptr(user_iocb), true);
+		ret = io_submit_one(ctx, compat_ptr(user_iocb), statep, true);
 		if (ret)
 			break;
 	}
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_finish_plug(&plug);
+	if (statep)
+		aio_submit_state_end(statep);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
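
Illustration only, not part of the patch: a minimal user-space sketch of
the batching idea from the commit message. It assumes a pthread mutex in
place of ctx->poll_lock and a singly linked list in place of list_head.
The names shared_ctx, submit_state, state_add, state_flush and BATCH_MAX
are hypothetical stand-ins; BATCH_MAX plays the role of AIO_IOPOLL_BATCH,
whose value is not shown in this patch. The front_add case and the plug
callback are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BATCH_MAX 8	/* hypothetical stand-in for AIO_IOPOLL_BATCH */

struct req {
	int id;
	struct req *next;
};

/* Shared list protected by a lock (stand-in for ctx->poll_submitted). */
struct shared_ctx {
	pthread_mutex_t lock;
	struct req *head, *tail;
	unsigned int lock_acquisitions;	/* instrumentation for the sketch */
};

/* Per-submitter batch, only touched by the submitting thread. */
struct submit_state {
	struct shared_ctx *ctx;
	struct req *head, *tail;
	unsigned int count;
};

static void ctx_init(struct shared_ctx *ctx)
{
	pthread_mutex_init(&ctx->lock, NULL);
	ctx->head = ctx->tail = NULL;
	ctx->lock_acquisitions = 0;
}

static void state_start(struct submit_state *st, struct shared_ctx *ctx)
{
	st->ctx = ctx;
	st->head = st->tail = NULL;
	st->count = 0;
}

/* Splice the whole local batch onto the shared list under one lock hold. */
static void state_flush(struct submit_state *st)
{
	struct shared_ctx *ctx = st->ctx;

	if (!st->head)
		return;

	pthread_mutex_lock(&ctx->lock);
	ctx->lock_acquisitions++;
	if (ctx->tail)
		ctx->tail->next = st->head;
	else
		ctx->head = st->head;
	ctx->tail = st->tail;
	pthread_mutex_unlock(&ctx->lock);

	st->head = st->tail = NULL;
	st->count = 0;
}

/* Queue locally without locking; flush once the batch is full. */
static void state_add(struct submit_state *st, struct req *r)
{
	assert(st->ctx != NULL);

	r->next = NULL;
	if (st->tail)
		st->tail->next = r;
	else
		st->head = r;
	st->tail = r;

	if (++st->count >= BATCH_MAX)
		state_flush(st);
}
```

With BATCH_MAX set to 8, adding 20 requests through state_add() and then
calling state_flush() once takes the lock three times instead of 20, and
the shared list keeps submission order.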