From patchwork Mon Dec 14 19:13:21 2020
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11972803
From: Jens Axboe
To: linux-fsdevel@vger.kernel.org
Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, Jens Axboe
Subject: [PATCH 1/4] fs: make unlazy_walk() error handling consistent
Date: Mon, 14 Dec 2020 12:13:21 -0700
Message-Id: <20201214191323.173773-2-axboe@kernel.dk>
In-Reply-To: <20201214191323.173773-1-axboe@kernel.dk>
References: <20201214191323.173773-1-axboe@kernel.dk>

Most callers check for non-zero
return, and assume it's -ECHILD (which it always will be). One caller uses
the actual error return. Clean this up and make it fully consistent by
having unlazy_walk() return a bool instead. Rename it to try_to_unlazy(),
and return true on success and false on failure. That's easier to read.

No functional changes in this patch.

Cc: Al Viro
Signed-off-by: Jens Axboe
---
 fs/namei.c | 43 +++++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 26 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 03d0e11e4f36..7eb7830da298 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -669,17 +669,17 @@ static bool legitimize_root(struct nameidata *nd)
  */
 
 /**
- * unlazy_walk - try to switch to ref-walk mode.
+ * try_to_unlazy - try to switch to ref-walk mode.
  * @nd: nameidata pathwalk data
- * Returns: 0 on success, -ECHILD on failure
+ * Returns: true on success, false on failure
  *
- * unlazy_walk attempts to legitimize the current nd->path and nd->root
+ * try_to_unlazy attempts to legitimize the current nd->path and nd->root
  * for ref-walk mode.
  * Must be called from rcu-walk context.
- * Nothing should touch nameidata between unlazy_walk() failure and
+ * Nothing should touch nameidata between try_to_unlazy() failure and
  * terminate_walk().
  */
-static int unlazy_walk(struct nameidata *nd)
+static bool try_to_unlazy(struct nameidata *nd)
 {
 	struct dentry *parent = nd->path.dentry;
 
@@ -694,14 +694,14 @@ static int unlazy_walk(struct nameidata *nd)
 		goto out;
 	rcu_read_unlock();
 	BUG_ON(nd->inode != parent->d_inode);
-	return 0;
+	return true;
 
 out1:
 	nd->path.mnt = NULL;
 	nd->path.dentry = NULL;
 out:
 	rcu_read_unlock();
-	return -ECHILD;
+	return false;
 }
 
 /**
@@ -792,7 +792,7 @@ static int complete_walk(struct nameidata *nd)
 		 */
 		if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
 			nd->root.mnt = NULL;
-		if (unlikely(unlazy_walk(nd)))
+		if (!try_to_unlazy(nd))
 			return -ECHILD;
 	}
 
@@ -1466,7 +1466,7 @@ static struct dentry *lookup_fast(struct nameidata *nd,
 		unsigned seq;
 		dentry = __d_lookup_rcu(parent, &nd->last, &seq);
 		if (unlikely(!dentry)) {
-			if (unlazy_walk(nd))
+			if (!try_to_unlazy(nd))
 				return ERR_PTR(-ECHILD);
 			return NULL;
 		}
@@ -1567,10 +1567,8 @@ static inline int may_lookup(struct nameidata *nd)
 {
 	if (nd->flags & LOOKUP_RCU) {
 		int err = inode_permission(nd->inode, MAY_EXEC|MAY_NOT_BLOCK);
-		if (err != -ECHILD)
+		if (err != -ECHILD || !try_to_unlazy(nd))
 			return err;
-		if (unlazy_walk(nd))
-			return -ECHILD;
 	}
 	return inode_permission(nd->inode, MAY_EXEC);
 }
@@ -1592,7 +1590,7 @@ static int reserve_stack(struct nameidata *nd, struct path *link, unsigned seq)
 		// unlazy even if we fail to grab the link - cleanup needs it
 		bool grabbed_link = legitimize_path(nd, link, seq);
 
-		if (unlazy_walk(nd) != 0 || !grabbed_link)
+		if (!try_to_unlazy(nd) || !grabbed_link)
 			return -ECHILD;
 
 		if (nd_alloc_stack(nd))
@@ -1634,7 +1632,7 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 		touch_atime(&last->link);
 		cond_resched();
 	} else if (atime_needs_update(&last->link, inode)) {
-		if (unlikely(unlazy_walk(nd)))
+		if (!try_to_unlazy(nd))
 			return ERR_PTR(-ECHILD);
 		touch_atime(&last->link);
 	}
@@ -1651,11 +1649,8 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 		get = inode->i_op->get_link;
 		if (nd->flags & LOOKUP_RCU) {
 			res = get(NULL, inode, &last->done);
-			if (res == ERR_PTR(-ECHILD)) {
-				if (unlikely(unlazy_walk(nd)))
-					return ERR_PTR(-ECHILD);
+			if (res == ERR_PTR(-ECHILD) && try_to_unlazy(nd))
 				res = get(link->dentry, inode, &last->done);
-			}
 		} else {
 			res = get(link->dentry, inode, &last->done);
 		}
@@ -2193,7 +2188,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		}
 		if (unlikely(!d_can_lookup(nd->path.dentry))) {
 			if (nd->flags & LOOKUP_RCU) {
-				if (unlazy_walk(nd))
+				if (!try_to_unlazy(nd))
 					return -ECHILD;
 			}
 			return -ENOTDIR;
@@ -3127,7 +3122,6 @@ static const char *open_last_lookups(struct nameidata *nd,
 	struct inode *inode;
 	struct dentry *dentry;
 	const char *res;
-	int error;
 
 	nd->flags |= op->intent;
 
@@ -3151,9 +3145,8 @@ static const char *open_last_lookups(struct nameidata *nd,
 	} else {
 		/* create side of things */
 		if (nd->flags & LOOKUP_RCU) {
-			error = unlazy_walk(nd);
-			if (unlikely(error))
-				return ERR_PTR(error);
+			if (!try_to_unlazy(nd))
+				return ERR_PTR(-ECHILD);
 		}
 		audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
 		/* trailing slashes? */
@@ -3162,9 +3155,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 	}
 
 	if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
-		error = mnt_want_write(nd->path.mnt);
-		if (!error)
-			got_write = true;
+		got_write = !mnt_want_write(nd->path.mnt);
 		/*
 		 * do _not_ fail yet - we might not need that or fail with
 		 * a different error; let lookup_open() decide; we'll be

From patchwork Mon Dec 14 19:13:22 2020
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11972807
From: Jens Axboe
To: linux-fsdevel@vger.kernel.org
Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, Jens Axboe
Subject: [PATCH 2/4] fs: add support for LOOKUP_NONBLOCK
Date: Mon, 14 Dec 2020 12:13:22 -0700
Message-Id: <20201214191323.173773-3-axboe@kernel.dk>
In-Reply-To: <20201214191323.173773-1-axboe@kernel.dk>
References: <20201214191323.173773-1-axboe@kernel.dk>

io_uring always punts opens to async context, since there's no control
over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support
just doing the fast RCU based lookups, which we know will not block. If
we can do a cached path resolution of the filename, then we don't have
to always punt lookups to a worker.

We explicitly disallow O_CREAT | O_TRUNC opens, as those will require
blocking, and O_TMPFILE, as that requires filesystem interactions and
there's currently no way to pass down an attempt to do nonblocking
operations there. This basically boils down to whether we can do the
fast path of open or not. If we can't, then return -EAGAIN and let the
caller retry from an appropriate context that can handle blocking.

During path resolution, we always do LOOKUP_RCU first. If that fails and
we terminate LOOKUP_RCU, then fail a LOOKUP_NONBLOCK attempt as well.
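
As a rough illustration of the rules above, here is a userspace model of
the three flag checks this patch adds. It is only a sketch: the model_*
function names are hypothetical, the numeric value of LOOKUP_RCU is an
assumption made for the example (LOOKUP_NONBLOCK uses the value defined
in this patch), and none of the real dentry/mount legitimization is
modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* LOOKUP_NONBLOCK matches this patch; LOOKUP_RCU's value is assumed. */
#define LOOKUP_RCU	0x0040
#define LOOKUP_NONBLOCK	0x200000

/* Hypothetical model of the path_init() check: NONBLOCK without RCU
 * cannot be honoured, so the caller is asked to retry. */
static int model_path_init(unsigned int flags)
{
	if ((flags & (LOOKUP_RCU | LOOKUP_NONBLOCK)) == LOOKUP_NONBLOCK)
		return -EAGAIN;
	return 0;
}

/* Hypothetical model of try_to_unlazy(): dropping out of RCU mode
 * always fails while LOOKUP_NONBLOCK is still set. */
static bool model_try_to_unlazy(unsigned int *flags)
{
	*flags &= ~LOOKUP_RCU;
	if (*flags & LOOKUP_NONBLOCK)
		return false;
	/* The real function would legitimize nd->path and nd->root here. */
	return true;
}

/* Hypothetical model of complete_walk(): once the walk has finished in
 * RCU mode, NONBLOCK is cleared so the final unlazy may proceed. */
static int model_complete_walk(unsigned int *flags)
{
	if (*flags & LOOKUP_RCU) {
		*flags &= ~LOOKUP_NONBLOCK;
		if (!model_try_to_unlazy(flags))
			return -ECHILD;
	}
	return 0;
}
```

In this model a walk that has to leave RCU mode midway fails, while a
walk that reaches complete_walk() still in RCU mode succeeds, which
matches the intent that only fully cached resolutions complete.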

Cc: Al Viro
Signed-off-by: Jens Axboe
---
 fs/namei.c            | 27 ++++++++++++++++++++++++++-
 include/linux/namei.h |  1 +
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7eb7830da298..83a7f7866232 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -686,6 +686,8 @@ static bool try_to_unlazy(struct nameidata *nd)
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
 	nd->flags &= ~LOOKUP_RCU;
+	if (nd->flags & LOOKUP_NONBLOCK)
+		goto out1;
 	if (unlikely(!legitimize_links(nd)))
 		goto out1;
 	if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
@@ -722,6 +724,8 @@ static int unlazy_child(struct nameidata *nd, struct dentry *dentry, unsigned se
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
 	nd->flags &= ~LOOKUP_RCU;
+	if (nd->flags & LOOKUP_NONBLOCK)
+		goto out2;
 	if (unlikely(!legitimize_links(nd)))
 		goto out2;
 	if (unlikely(!legitimize_mnt(nd->path.mnt, nd->m_seq)))
@@ -792,6 +796,7 @@ static int complete_walk(struct nameidata *nd)
 		 */
 		if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
 			nd->root.mnt = NULL;
+		nd->flags &= ~LOOKUP_NONBLOCK;
 		if (!try_to_unlazy(nd))
 			return -ECHILD;
 	}
@@ -2202,6 +2207,10 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 	int error;
 	const char *s = nd->name->name;
 
+	/* LOOKUP_NONBLOCK requires RCU, ask caller to retry */
+	if ((flags & (LOOKUP_RCU | LOOKUP_NONBLOCK)) == LOOKUP_NONBLOCK)
+		return ERR_PTR(-EAGAIN);
+
 	if (!*s)
 		flags &= ~LOOKUP_RCU;
 	if (flags & LOOKUP_RCU)
@@ -3140,6 +3149,12 @@ static const char *open_last_lookups(struct nameidata *nd,
 			return ERR_CAST(dentry);
 		if (likely(dentry))
 			goto finish_lookup;
+		/*
+		 * We can't guarantee nonblocking semantics beyond this, if
+		 * the fast lookup fails.
+		 */
+		if (nd->flags & LOOKUP_NONBLOCK)
+			return ERR_PTR(-EAGAIN);
 
 		BUG_ON(nd->flags & LOOKUP_RCU);
 	} else {
@@ -3233,6 +3248,7 @@ static int do_open(struct nameidata *nd,
 		open_flag &= ~O_TRUNC;
 		acc_mode = 0;
 	} else if (d_is_reg(nd->path.dentry) && open_flag & O_TRUNC) {
+		WARN_ON_ONCE(nd->flags & LOOKUP_NONBLOCK);
 		error = mnt_want_write(nd->path.mnt);
 		if (error)
 			return error;
@@ -3299,7 +3315,16 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
 {
 	struct dentry *child;
 	struct path path;
-	int error = path_lookupat(nd, flags | LOOKUP_DIRECTORY, &path);
+	int error;
+
+	/*
+	 * We can't guarantee that the fs doesn't block further down, so
+	 * just disallow nonblock attempts at O_TMPFILE for now.
+	 */
+	if (flags & LOOKUP_NONBLOCK)
+		return -EAGAIN;
+
+	error = path_lookupat(nd, flags | LOOKUP_DIRECTORY, &path);
 	if (unlikely(error))
 		return error;
 	error = mnt_want_write(path.mnt);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index a4bb992623c4..c36c4e0805fc 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -46,6 +46,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT};
 #define LOOKUP_NO_XDEV		0x040000 /* No mountpoint crossing. */
 #define LOOKUP_BENEATH		0x080000 /* No escaping from starting point. */
 #define LOOKUP_IN_ROOT		0x100000 /* Treat dirfd as fs root. */
+#define LOOKUP_NONBLOCK		0x200000 /* don't block for lookup */
 /* LOOKUP_* flags which do scope-related checks based on the dirfd. */
 #define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT)
 

From patchwork Mon Dec 14 19:13:23 2020
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11972809
From: Jens Axboe
To: linux-fsdevel@vger.kernel.org
Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, Jens Axboe
Subject: [PATCH 3/4] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK
Date: Mon, 14 Dec 2020 12:13:23 -0700
Message-Id: <20201214191323.173773-4-axboe@kernel.dk>
In-Reply-To: <20201214191323.173773-1-axboe@kernel.dk>
References: <20201214191323.173773-1-axboe@kernel.dk>

Now that we support
non-blocking path resolution internally, expose it via openat2() in the
struct open_how ->resolve flags. This allows applications using openat2()
to limit path resolution to the extent that it is already cached.

If the lookup cannot be satisfied in a non-blocking manner, openat2(2)
fails with -1 and errno set to EAGAIN.

Cc: Al Viro
Signed-off-by: Jens Axboe
---
 fs/open.c                    | 6 ++++++
 include/linux/fcntl.h        | 2 +-
 include/uapi/linux/openat2.h | 4 ++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/open.c b/fs/open.c
index 9af548fb841b..a83434cfe01c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1087,6 +1087,12 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op)
 		lookup_flags |= LOOKUP_BENEATH;
 	if (how->resolve & RESOLVE_IN_ROOT)
 		lookup_flags |= LOOKUP_IN_ROOT;
+	if (how->resolve & RESOLVE_NONBLOCK) {
+		/* Don't bother even trying for create/truncate open */
+		if (flags & (O_TRUNC | O_CREAT))
+			return -EAGAIN;
+		lookup_flags |= LOOKUP_NONBLOCK;
+	}
 
 	op->lookup_flags = lookup_flags;
 	return 0;
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index 921e750843e6..919a13c9317c 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -19,7 +19,7 @@
 /* List of all valid flags for the how->resolve argument: */
 #define VALID_RESOLVE_FLAGS \
 	(RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \
-	 RESOLVE_BENEATH | RESOLVE_IN_ROOT)
+	 RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_NONBLOCK)
 
 /* List of all open_how "versions". */
 #define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h
index 58b1eb711360..7bc1d0c35108 100644
--- a/include/uapi/linux/openat2.h
+++ b/include/uapi/linux/openat2.h
@@ -35,5 +35,9 @@ struct open_how {
 #define RESOLVE_IN_ROOT		0x10 /* Make all jumps to "/" and ".."
 					be scoped inside the dirfd
 					(similar to chroot(2)). */
+#define RESOLVE_NONBLOCK	0x20 /* Only complete if resolution can be
+					completed through cached lookup. May
+					return -EAGAIN if that's not
+					possible. */
 
 #endif /* _UAPI_LINUX_OPENAT2_H */

From patchwork Mon Dec 14 19:13:24 2020
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11972805
From: Jens Axboe
To: linux-fsdevel@vger.kernel.org
Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, Jens Axboe
Subject: [PATCH 4/4] io_uring: enable LOOKUP_NONBLOCK path resolution for filename lookups
Date: Mon, 14 Dec 2020 12:13:24 -0700
Message-Id: <20201214191323.173773-5-axboe@kernel.dk>
In-Reply-To: <20201214191323.173773-1-axboe@kernel.dk>
References: <20201214191323.173773-1-axboe@kernel.dk>

Instead of
being pessimistic and assuming that path lookup will block, use
LOOKUP_NONBLOCK to attempt just a cached lookup. This ensures that the
fast path is always done inline, and we only punt to async context if
IO is needed to satisfy the lookup.

For forced nonblock open attempts, mark the file O_NONBLOCK over the
actual ->open() call as well. We can safely clear this again before
doing fd_install(), so it'll never be user visible that we fiddled with
it.

Signed-off-by: Jens Axboe
---
 fs/io_uring.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9d7baf8ba77a..6734a2616990 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -487,7 +487,6 @@ struct io_sr_msg {
 struct io_open {
 	struct file			*file;
 	int				dfd;
-	bool				ignore_nonblock;
 	struct filename			*filename;
 	struct open_how			how;
 	unsigned long			nofile;
@@ -3998,7 +3997,6 @@ static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
 		return ret;
 	}
 	req->open.nofile = rlimit(RLIMIT_NOFILE);
-	req->open.ignore_nonblock = false;
 	req->flags |= REQ_F_NEED_CLEANUP;
 	return 0;
 }
@@ -4040,39 +4038,45 @@ static int io_openat2(struct io_kiocb *req, bool force_nonblock)
 {
 	struct open_flags op;
 	struct file *file;
+	bool nonblock_set;
 	int ret;
 
-	if (force_nonblock && !req->open.ignore_nonblock)
-		return -EAGAIN;
-
 	ret = build_open_flags(&req->open.how, &op);
 	if (ret)
 		goto err;
+	nonblock_set = op.open_flag & O_NONBLOCK;
+	if (force_nonblock) {
+		/*
+		 * Don't bother trying for O_TRUNC or O_CREAT open, it'll
+		 * always -EAGAIN
+		 */
+		if (req->open.how.flags & (O_TRUNC | O_CREAT))
+			return -EAGAIN;
+		op.lookup_flags |= LOOKUP_NONBLOCK;
+		op.open_flag |= O_NONBLOCK;
+	}
 
 	ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
 	if (ret < 0)
 		goto err;
 
 	file = do_filp_open(req->open.dfd, req->open.filename, &op);
+	if (force_nonblock && file == ERR_PTR(-EAGAIN)) {
+		/*
+		 * We could hang on to this 'fd', but seems like marginal
+		 * gain for something that is now known to be a slower path.
+		 * So just put it, and we'll get a new one when we retry.
+		 */
+		put_unused_fd(ret);
+		return -EAGAIN;
+	}
+
 	if (IS_ERR(file)) {
 		put_unused_fd(ret);
 		ret = PTR_ERR(file);
-		/*
-		 * A work-around to ensure that /proc/self works that way
-		 * that it should - if we get -EOPNOTSUPP back, then assume
-		 * that proc_self_get_link() failed us because we're in async
-		 * context. We should be safe to retry this from the task
-		 * itself with force_nonblock == false set, as it should not
-		 * block on lookup. Would be nice to know this upfront and
-		 * avoid the async dance, but doesn't seem feasible.
-		 */
-		if (ret == -EOPNOTSUPP && io_wq_current_is_worker()) {
-			req->open.ignore_nonblock = true;
-			refcount_inc(&req->refs);
-			io_req_task_queue(req);
-			return 0;
-		}
 	} else {
+		if (force_nonblock && !nonblock_set)
+			file->f_flags &= ~O_NONBLOCK;
 		fsnotify_open(file);
 		fd_install(ret, file);
 	}
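
The O_NONBLOCK handling in io_openat2() above can be modeled in
userspace roughly as follows. This is a sketch under stated assumptions,
not the kernel code: struct open_attempt and the two helper names are
hypothetical, and plain integers stand in for op.open_flag and
file->f_flags.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state io_openat2() keeps across the open. */
struct open_attempt {
	int open_flag;		/* flags actually used for the open */
	bool nonblock_set;	/* did the caller ask for O_NONBLOCK itself? */
};

/* Before the open: remember the caller's O_NONBLOCK choice, refuse
 * create/truncate opens for an inline attempt, and force O_NONBLOCK
 * on for the duration of the inline attempt. */
static int prepare_inline_open(int user_flags, bool force_nonblock,
			       struct open_attempt *a)
{
	a->nonblock_set = (user_flags & O_NONBLOCK) != 0;
	a->open_flag = user_flags;
	if (force_nonblock) {
		if (user_flags & (O_TRUNC | O_CREAT))
			return -EAGAIN;
		a->open_flag |= O_NONBLOCK;
	}
	return 0;
}

/* After a successful open: drop O_NONBLOCK again unless the caller
 * requested it, so the forced flag never becomes user visible. */
static int finish_inline_open(const struct open_attempt *a,
			      bool force_nonblock, int f_flags)
{
	if (force_nonblock && !a->nonblock_set)
		f_flags &= ~O_NONBLOCK;
	return f_flags;
}
```

The point of recording nonblock_set up front is that the flag can be
restored without re-reading the request: the final file flags depend
only on what the caller originally asked for.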