From patchwork Wed Aug 9 04:30:59 2023
From: Hugh Dickins
Date: Tue, 8 Aug 2023 21:30:59 -0700 (PDT)
To: Christian Brauner
Cc: Andrew Morton, Oleksandr Tymoshenko, Carlos Maiolino, Jeff Layton,
    Chuck Lever, Jan Kara, Miklos Szeredi, Daniel Xu, Chris Down, Tejun Heo,
    Greg Kroah-Hartman, Matthew Wilcox, Christoph Hellwig, Pete Zaitcev,
    Helge Deller, Topi Miettinen, Yu Kuai, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH vfs.tmpfs 1/5] xattr: simple_xattr_set() return old_xattr to be freed
Message-ID: <158c6585-2aa7-d4aa-90ff-f7c3f8fe407c@google.com>

tmpfs wants to support limited user extended attributes; but kernfs (or
cgroupfs, the only kernfs with KERNFS_ROOT_SUPPORT_USER_XATTR) already
supports user extended attributes through simple xattrs, limited by a
policy (128KiB per inode) too liberal to be used on tmpfs.

To allow a different limiting policy for tmpfs, without affecting the
policy for kernfs, change simple_xattr_set() to return the replaced or
removed xattr (if any), leaving the caller to update its accounting and
then free the xattr (by simple_xattr_free(), renamed from the static
free_simple_xattr()).

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
Reviewed-by: Christian Brauner
Reviewed-by: Carlos Maiolino
---
 fs/kernfs/inode.c     | 46 +++++++++++++++++++++++++---------------
 fs/xattr.c            | 51 +++++++++++++++++++--------------------
 include/linux/xattr.h |  7 ++++---
 mm/shmem.c            | 10 +++++----
 4 files changed, 61 insertions(+), 53 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index b22b74d1a115..fec5d5f78f07 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -306,11 +306,17 @@ int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
 int kernfs_xattr_set(struct kernfs_node *kn, const char *name,
                     const void *value, size_t size, int flags)
 {
+       struct simple_xattr *old_xattr;
        struct kernfs_iattrs *attrs = kernfs_iattrs(kn);
 
        if (!attrs)
                return -ENOMEM;
 
-       return simple_xattr_set(&attrs->xattrs, name, value, size, flags, NULL);
+       old_xattr = simple_xattr_set(&attrs->xattrs, name, value, size, flags);
+       if (IS_ERR(old_xattr))
+               return PTR_ERR(old_xattr);
+
+       simple_xattr_free(old_xattr);
+       return 0;
 }
 
 static int kernfs_vfs_xattr_get(const struct xattr_handler *handler,
@@ -342,7 +348,7 @@ static int kernfs_vfs_user_xattr_add(struct kernfs_node *kn,
 {
        atomic_t *sz = &kn->iattr->user_xattr_size;
        atomic_t *nr = &kn->iattr->nr_user_xattrs;
-       ssize_t removed_size;
+       struct simple_xattr *old_xattr;
        int ret;
 
        if (atomic_inc_return(nr) > KERNFS_MAX_USER_XATTRS) {
@@ -355,13 +361,18 @@ static int kernfs_vfs_user_xattr_add(struct kernfs_node *kn,
                goto dec_size_out;
        }
 
-       ret = simple_xattr_set(xattrs, full_name, value, size, flags,
-                              &removed_size);
-
-       if (!ret && removed_size >= 0)
-               size = removed_size;
-       else if (!ret)
+       old_xattr = simple_xattr_set(xattrs, full_name, value, size, flags);
+       if (!old_xattr)
                return 0;
+
+       if (IS_ERR(old_xattr)) {
+               ret = PTR_ERR(old_xattr);
+               goto dec_size_out;
+       }
+
+       ret = 0;
+       size = old_xattr->size;
+       simple_xattr_free(old_xattr);
 
 dec_size_out:
        atomic_sub(size, sz);
 dec_count_out:
@@ -376,18 +387,19 @@ static int kernfs_vfs_user_xattr_rm(struct kernfs_node *kn,
 {
        atomic_t *sz = &kn->iattr->user_xattr_size;
        atomic_t *nr = &kn->iattr->nr_user_xattrs;
-       ssize_t removed_size;
-       int ret;
+       struct simple_xattr *old_xattr;
 
-       ret = simple_xattr_set(xattrs, full_name, value, size, flags,
-                              &removed_size);
+       old_xattr = simple_xattr_set(xattrs, full_name, value, size, flags);
+       if (!old_xattr)
+               return 0;
 
-       if (removed_size >= 0) {
-               atomic_sub(removed_size, sz);
-               atomic_dec(nr);
-       }
+       if (IS_ERR(old_xattr))
+               return PTR_ERR(old_xattr);
 
-       return ret;
+       atomic_sub(old_xattr->size, sz);
+       atomic_dec(nr);
+       simple_xattr_free(old_xattr);
+       return 0;
 }
 
 static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
diff --git a/fs/xattr.c b/fs/xattr.c
index e7bbb7f57557..ba37a8f5cfd1 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -1040,12 +1040,12 @@ const char *xattr_full_name(const struct xattr_handler *handler,
 EXPORT_SYMBOL(xattr_full_name);
 
 /**
- * free_simple_xattr - free an xattr object
+ * simple_xattr_free - free an xattr object
  * @xattr: the xattr object
  *
  * Free the xattr object. Can handle @xattr being NULL.
  */
-static inline void free_simple_xattr(struct simple_xattr *xattr)
+void simple_xattr_free(struct simple_xattr *xattr)
 {
        if (xattr)
                kfree(xattr->name);
@@ -1164,7 +1164,6 @@ int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
  * @value: the value to store along the xattr
  * @size: the size of @value
  * @flags: the flags determining how to set the xattr
- * @removed_size: the size of the removed xattr
  *
  * Set a new xattr object.
  * If @value is passed a new xattr object will be allocated. If XATTR_REPLACE
@@ -1181,29 +1180,27 @@ int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
  * nothing if XATTR_CREATE is specified in @flags or @flags is zero. For
  * XATTR_REPLACE we fail as mentioned above.
  *
- * Return: On success zero and on error a negative error code is returned.
+ * Return: On success, the removed or replaced xattr is returned, to be freed
+ * by the caller; or NULL if none. On failure a negative error code is returned.
  */
-int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
-                    const void *value, size_t size, int flags,
-                    ssize_t *removed_size)
+struct simple_xattr *simple_xattr_set(struct simple_xattrs *xattrs,
+                                     const char *name, const void *value,
+                                     size_t size, int flags)
 {
-       struct simple_xattr *xattr = NULL, *new_xattr = NULL;
+       struct simple_xattr *old_xattr = NULL, *new_xattr = NULL;
        struct rb_node *parent = NULL, **rbp;
        int err = 0, ret;
 
-       if (removed_size)
-               *removed_size = -1;
-
        /* value == NULL means remove */
        if (value) {
                new_xattr = simple_xattr_alloc(value, size);
                if (!new_xattr)
-                       return -ENOMEM;
+                       return ERR_PTR(-ENOMEM);
 
                new_xattr->name = kstrdup(name, GFP_KERNEL);
                if (!new_xattr->name) {
-                       free_simple_xattr(new_xattr);
-                       return -ENOMEM;
+                       simple_xattr_free(new_xattr);
+                       return ERR_PTR(-ENOMEM);
                }
        }
 
@@ -1217,12 +1214,12 @@ int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
                else if (ret > 0)
                        rbp = &(*rbp)->rb_right;
                else
-                       xattr = rb_entry(*rbp, struct simple_xattr, rb_node);
-               if (xattr)
+                       old_xattr = rb_entry(*rbp, struct simple_xattr, rb_node);
+               if (old_xattr)
                        break;
        }
 
-       if (xattr) {
+       if (old_xattr) {
                /* Fail if XATTR_CREATE is requested and the xattr exists. */
                if (flags & XATTR_CREATE) {
                        err = -EEXIST;
@@ -1230,12 +1227,10 @@ int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
                }
 
                if (new_xattr)
-                       rb_replace_node(&xattr->rb_node, &new_xattr->rb_node,
-                                       &xattrs->rb_root);
+                       rb_replace_node(&old_xattr->rb_node,
+                                       &new_xattr->rb_node, &xattrs->rb_root);
                else
-                       rb_erase(&xattr->rb_node, &xattrs->rb_root);
-               if (!err && removed_size)
-                       *removed_size = xattr->size;
+                       rb_erase(&old_xattr->rb_node, &xattrs->rb_root);
        } else {
                /* Fail if XATTR_REPLACE is requested but no xattr is found. */
                if (flags & XATTR_REPLACE) {
@@ -1260,12 +1255,10 @@ int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
 
 out_unlock:
        write_unlock(&xattrs->lock);
-       if (err)
-               free_simple_xattr(new_xattr);
-       else
-               free_simple_xattr(xattr);
-       return err;
-
+       if (!err)
+               return old_xattr;
+       simple_xattr_free(new_xattr);
+       return ERR_PTR(err);
 }
 
 static bool xattr_is_trusted(const char *name)
@@ -1386,7 +1379,7 @@ void simple_xattrs_free(struct simple_xattrs *xattrs)
                rbp_next = rb_next(rbp);
                xattr = rb_entry(rbp, struct simple_xattr, rb_node);
                rb_erase(&xattr->rb_node, &xattrs->rb_root);
-               free_simple_xattr(xattr);
+               simple_xattr_free(xattr);
                rbp = rbp_next;
        }
 }
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index d591ef59aa98..e37fe667ae04 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -116,11 +116,12 @@ struct simple_xattr {
 void simple_xattrs_init(struct simple_xattrs *xattrs);
 void simple_xattrs_free(struct simple_xattrs *xattrs);
 struct simple_xattr *simple_xattr_alloc(const void *value, size_t size);
+void simple_xattr_free(struct simple_xattr *xattr);
 int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
                     void *buffer, size_t size);
-int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
-                    const void *value, size_t size, int flags,
-                    ssize_t *removed_size);
+struct simple_xattr *simple_xattr_set(struct simple_xattrs *xattrs,
+                                     const char *name, const void *value,
+                                     size_t size, int flags);
 ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs,
                          char *buffer, size_t size);
 void simple_xattr_add(struct simple_xattrs *xattrs,
diff --git a/mm/shmem.c b/mm/shmem.c
index 0f83d86fd8b4..df3cabf54206 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3595,15 +3595,17 @@ static int shmem_xattr_handler_set(const struct xattr_handler *handler,
                                   size_t size, int flags)
 {
        struct shmem_inode_info *info = SHMEM_I(inode);
-       int err;
+       struct simple_xattr *old_xattr;
 
        name = xattr_full_name(handler, name);
-       err = simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
-       if (!err) {
+       old_xattr = simple_xattr_set(&info->xattrs, name, value, size, flags);
+       if (!IS_ERR(old_xattr)) {
+               simple_xattr_free(old_xattr);
+               old_xattr = NULL;
                inode->i_ctime = current_time(inode);
                inode_inc_iversion(inode);
        }
-       return err;
+       return PTR_ERR(old_xattr);
 }
 
 static const struct xattr_handler shmem_security_xattr_handler = {
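
For illustration, the calling convention that the reworked simple_xattr_set()
asks of its callers can be modelled in a self-contained userspace sketch. The
ERR_PTR()/IS_ERR()/PTR_ERR() macros below are simplified stand-ins for the
kernel's, and toy_xattr_set() is only a model of the return-old-or-error
contract, not the kernel function:

#include <errno.h>
#include <stdio.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers */
#define ERR_PTR(err)  ((void *)(long)(err))
#define IS_ERR(ptr)   ((unsigned long)(ptr) >= (unsigned long)-4095)
#define PTR_ERR(ptr)  ((long)(ptr))

struct toy_xattr { size_t size; };

/* Returns the displaced xattr (caller frees it), NULL if none, or ERR_PTR */
static struct toy_xattr *toy_xattr_set(struct toy_xattr **slot,
                                       struct toy_xattr *new_x)
{
        struct toy_xattr *old = *slot;

        if (!new_x)
                return ERR_PTR(-ENOMEM);        /* pretend allocation failed */
        *slot = new_x;
        return old;
}

int main(void)
{
        struct toy_xattr a = { .size = 64 }, b = { .size = 128 };
        struct toy_xattr *slot = &a;
        struct toy_xattr *old = toy_xattr_set(&slot, &b);

        if (IS_ERR(old))
                return (int)-PTR_ERR(old);
        if (old)        /* update accounting, then free: here just report */
                printf("replaced xattr of size %zu\n", old->size);
        return 0;
}

The point of the contract: only the caller knows what accounting to do with
the displaced xattr's size, so only the caller should free it.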

From patchwork Wed Aug 9 04:32:21 2023
From: Hugh Dickins
Date: Tue, 8 Aug 2023 21:32:21 -0700 (PDT)
To: Christian Brauner
Subject: [PATCH vfs.tmpfs 2/5] tmpfs: track free_ispace instead of free_inodes
Message-ID: <4fe1739-d9e7-8dfd-5bce-12e7339711da@google.com>

In preparation for assigning some inode space to extended attributes, keep
track of free_ispace instead of the number of free_inodes: as if one tmpfs
inode (and its accompanying dentry) occupies very approximately 1KiB.
Unsigned long is large enough for free_ispace, on 64-bit and on 32-bit; but
take care to enforce the maximum. And fix the nr_blocks maximum on 32-bit:
S64_MAX would be too big for it there, so say LONG_MAX instead.

Delete the incorrect limited<->unlimited blocks/inodes comment above
shmem_reconfigure(): leave it to the error messages below to describe.

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
Reviewed-by: Carlos Maiolino
---
 include/linux/shmem_fs.h |  2 +-
 mm/shmem.c               | 33 +++++++++++++++++----------------
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 9b2d2faff1d0..6b0c626620f5 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -54,7 +54,7 @@ struct shmem_sb_info {
        unsigned long max_blocks;   /* How many blocks are allowed */
        struct percpu_counter used_blocks;  /* How many are allocated */
        unsigned long max_inodes;   /* How many inodes are allowed */
-       unsigned long free_inodes;  /* How many are left for allocation */
+       unsigned long free_ispace;  /* How much ispace left for allocation */
        raw_spinlock_t stat_lock;   /* Serialize shmem_sb_info changes */
        umode_t mode;               /* Mount mode for root directory */
        unsigned char huge;         /* Whether to try for hugepages */
diff --git a/mm/shmem.c b/mm/shmem.c
index df3cabf54206..c39471384168 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -90,6 +90,9 @@ static struct vfsmount *shm_mnt;
 /* Pretend that each entry is of this size in directory's i_size */
 #define BOGO_DIRENT_SIZE 20
 
+/* Pretend that one inode + its dentry occupy this much memory */
+#define BOGO_INODE_SIZE 1024
+
 /* Symlink up to this size is kmalloc'ed instead of using a swappable page */
 #define SHORT_SYMLINK_LEN 128
 
@@ -137,7 +140,8 @@ static unsigned long shmem_default_max_inodes(void)
 {
        unsigned long nr_pages = totalram_pages();
 
-       return min(nr_pages - totalhigh_pages(), nr_pages / 2);
+       return min3(nr_pages - totalhigh_pages(), nr_pages / 2,
+                   ULONG_MAX / BOGO_INODE_SIZE);
 }
 #endif
 
@@ -331,11 +335,11 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
        if (!(sb->s_flags & SB_KERNMOUNT)) {
                raw_spin_lock(&sbinfo->stat_lock);
                if (sbinfo->max_inodes) {
-                       if (!sbinfo->free_inodes) {
+                       if (sbinfo->free_ispace < BOGO_INODE_SIZE) {
                                raw_spin_unlock(&sbinfo->stat_lock);
                                return -ENOSPC;
                        }
-                       sbinfo->free_inodes--;
+                       sbinfo->free_ispace -= BOGO_INODE_SIZE;
                }
                if (inop) {
                        ino = sbinfo->next_ino++;
@@ -394,7 +398,7 @@ static void shmem_free_inode(struct super_block *sb)
        struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
        if (sbinfo->max_inodes) {
                raw_spin_lock(&sbinfo->stat_lock);
-               sbinfo->free_inodes++;
+               sbinfo->free_ispace += BOGO_INODE_SIZE;
                raw_spin_unlock(&sbinfo->stat_lock);
        }
 }
@@ -3155,7 +3159,7 @@ static int shmem_statfs(struct dentry *dentry, struct kstatfs *buf)
        }
        if (sbinfo->max_inodes) {
                buf->f_files = sbinfo->max_inodes;
-               buf->f_ffree = sbinfo->free_inodes;
+               buf->f_ffree = sbinfo->free_ispace / BOGO_INODE_SIZE;
        }
        /* else leave those fields 0 like simple_statfs */
 
@@ -3815,13 +3819,13 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
                break;
        case Opt_nr_blocks:
                ctx->blocks = memparse(param->string, &rest);
-               if (*rest || ctx->blocks > S64_MAX)
+               if (*rest || ctx->blocks > LONG_MAX)
                        goto bad_value;
                ctx->seen |= SHMEM_SEEN_BLOCKS;
                break;
        case Opt_nr_inodes:
                ctx->inodes = memparse(param->string, &rest);
-               if (*rest)
+               if (*rest || ctx->inodes > ULONG_MAX / BOGO_INODE_SIZE)
                        goto bad_value;
                ctx->seen |= SHMEM_SEEN_INODES;
                break;
@@ -4002,21 +4006,17 @@ static int shmem_parse_options(struct fs_context *fc, void *data)
 
 /*
  * Reconfigure a shmem filesystem.
- *
- * Note that we disallow change from limited->unlimited blocks/inodes while any
- * are in use; but we must separately disallow unlimited->limited, because in
- * that case we have no record of how much is already in use.
  */
 static int shmem_reconfigure(struct fs_context *fc)
 {
        struct shmem_options *ctx = fc->fs_private;
        struct shmem_sb_info *sbinfo = SHMEM_SB(fc->root->d_sb);
-       unsigned long inodes;
+       unsigned long used_isp;
        struct mempolicy *mpol = NULL;
        const char *err;
 
        raw_spin_lock(&sbinfo->stat_lock);
-       inodes = sbinfo->max_inodes - sbinfo->free_inodes;
+       used_isp = sbinfo->max_inodes * BOGO_INODE_SIZE - sbinfo->free_ispace;
 
        if ((ctx->seen & SHMEM_SEEN_BLOCKS) && ctx->blocks) {
                if (!sbinfo->max_blocks) {
@@ -4034,7 +4034,7 @@ static int shmem_reconfigure(struct fs_context *fc)
                        err = "Cannot retroactively limit inodes";
                        goto out;
                }
-               if (ctx->inodes < inodes) {
+               if (ctx->inodes * BOGO_INODE_SIZE < used_isp) {
                        err = "Too few inodes for current use";
                        goto out;
                }
@@ -4080,7 +4080,7 @@ static int shmem_reconfigure(struct fs_context *fc)
                sbinfo->max_blocks = ctx->blocks;
        if (ctx->seen & SHMEM_SEEN_INODES) {
                sbinfo->max_inodes = ctx->inodes;
-               sbinfo->free_inodes = ctx->inodes - inodes;
+               sbinfo->free_ispace = ctx->inodes * BOGO_INODE_SIZE - used_isp;
        }
 
        /*
@@ -4211,7 +4211,8 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
        sb->s_flags |= SB_NOUSER;
 #endif
        sbinfo->max_blocks = ctx->blocks;
-       sbinfo->free_inodes = sbinfo->max_inodes = ctx->inodes;
+       sbinfo->max_inodes = ctx->inodes;
+       sbinfo->free_ispace = sbinfo->max_inodes * BOGO_INODE_SIZE;
        if (sb->s_flags & SB_KERNMOUNT) {
                sbinfo->ino_batch = alloc_percpu(ino_t);
                if (!sbinfo->ino_batch)
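
For a feel of the new accounting, here is a small sketch of the arithmetic.
BOGO_INODE_SIZE comes from the patch; the nr_inodes=1000 mount and the
program itself are illustrative, not kernel code:

#include <stdio.h>

#define BOGO_INODE_SIZE 1024    /* from the patch: one inode + dentry ~ 1KiB */

int main(void)
{
        unsigned long max_inodes = 1000;        /* e.g. mount -o nr_inodes=1000 */
        unsigned long free_ispace = max_inodes * BOGO_INODE_SIZE;

        /* Reserving an inode now subtracts BOGO_INODE_SIZE from the pool... */
        free_ispace -= BOGO_INODE_SIZE;

        /* ...and statfs derives f_ffree back from the remaining ispace */
        printf("f_files = %lu, f_ffree = %lu\n",
               max_inodes, free_ispace / BOGO_INODE_SIZE);     /* 1000, 999 */
        return 0;
}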

From patchwork Wed Aug 9 04:33:56 2023
From: Hugh Dickins
Date: Tue, 8 Aug 2023 21:33:56 -0700 (PDT)
To: Christian Brauner
Subject: [PATCH vfs.tmpfs 3/5] tmpfs,xattr: enable limited user extended attributes
Message-ID: <2e63b26e-df46-5baa-c7d6-f9a8dd3282c5@google.com>

Enable "user." extended attributes on tmpfs, limiting them by tracking the
space they occupy, and deducting that space from the limited ispace (unless
tmpfs is mounted with nr_inodes=0 to leave that ispace unlimited).

tmpfs inodes and simple xattrs are both unswappable, and have to be in
lowmem on a 32-bit highmem kernel: so the ispace limit is appropriate for
xattrs, without any need for a further mount option.

Add simple_xattr_space() to give an approximate but deterministic estimate
of the space taken up by each xattr: with simple_xattrs_free() outputting
the space freed if required (but kernfs and even some tmpfs usages do not
require that, so don't waste time on strlen'ing if not needed).

Security and trusted xattrs were already supported: for consistency and
simplicity, account them from the same pool; though there's a small risk
that a tmpfs with enough space before would now be considered too small.

When extended attributes are used, "df -i" does show more IUsed and less
IFree than can be explained by the inodes: document that (manpage later).

xfstests tests/generic which were not run on tmpfs before but now pass:
020 037 062 070 077 097 103 117 337 377 454 486 523 533 611 618 728
with no new failures.
Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
Reviewed-by: Carlos Maiolino
---
 Documentation/filesystems/tmpfs.rst |  7 ++-
 fs/Kconfig                          |  4 +-
 fs/kernfs/dir.c                     |  2 +-
 fs/xattr.c                          | 28 ++++++++++-
 include/linux/xattr.h               |  3 +-
 mm/shmem.c                          | 78 +++++++++++++++++++++++++++----
 6 files changed, 106 insertions(+), 16 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index 67422ee10e03..56a26c843dbe 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -21,8 +21,8 @@ explained further below, some of which can be reconfigured dynamically on the
 fly using a remount ('mount -o remount ...') of the filesystem. A tmpfs
 filesystem can be resized but it cannot be resized to a size below its current
 usage. tmpfs also supports POSIX ACLs, and extended attributes for the
-trusted.* and security.* namespaces. ramfs does not use swap and you cannot
-modify any parameter for a ramfs filesystem. The size limit of a ramfs
+trusted.*, security.* and user.* namespaces. ramfs does not use swap and you
+cannot modify any parameter for a ramfs filesystem. The size limit of a ramfs
 filesystem is how much memory you have available, and so care must be taken if
 used so to not run out of memory.
 
@@ -97,6 +97,9 @@ mount with such options, since it allows any user with write access to use up
 all the memory on the machine; but enhances the scalability of that instance
 in a system with many CPUs making intensive use of it.
 
+If nr_inodes is not 0, that limited space for inodes is also used up by
+extended attributes: "df -i"'s IUsed and IUse% increase, IFree decreases.
+
 tmpfs blocks may be swapped out, when there is a shortage of memory.
 tmpfs has a mount option to disable its use of swap:
 
diff --git a/fs/Kconfig b/fs/Kconfig
index 8218a71933f9..7da21f563192 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -205,8 +205,8 @@ config TMPFS_XATTR
          Extended attributes are name:value pairs associated with inodes by
          the kernel or by users (see the attr(5) manual page for details).
 
-         Currently this enables support for the trusted.* and
-         security.* namespaces.
+         This enables support for the trusted.*, security.* and user.*
+         namespaces.
 
          You need this for POSIX ACL support on tmpfs.
 
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 5a1a4af9d3d2..660995856a04 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -556,7 +556,7 @@ void kernfs_put(struct kernfs_node *kn)
 
        kfree_const(kn->name);
        if (kn->iattr) {
-               simple_xattrs_free(&kn->iattr->xattrs);
+               simple_xattrs_free(&kn->iattr->xattrs, NULL);
                kmem_cache_free(kernfs_iattrs_cache, kn->iattr);
        }
        spin_lock(&kernfs_idr_lock);
diff --git a/fs/xattr.c b/fs/xattr.c
index ba37a8f5cfd1..2d607542281b 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -1039,6 +1039,26 @@ const char *xattr_full_name(const struct xattr_handler *handler,
 }
 EXPORT_SYMBOL(xattr_full_name);
 
+/**
+ * simple_xattr_space - estimate the memory used by a simple xattr
+ * @name: the full name of the xattr
+ * @size: the size of its value
+ *
+ * This takes no account of how much larger the two slab objects actually are:
+ * that would depend on the slab implementation, when what is required is a
+ * deterministic number, which grows with name length and size and quantity.
+ *
+ * Return: The approximate number of bytes of memory used by such an xattr.
+ */
+size_t simple_xattr_space(const char *name, size_t size)
+{
+       /*
+        * Use "40" instead of sizeof(struct simple_xattr), to return the
+        * same result on 32-bit and 64-bit, and even if simple_xattr grows.
+        */
+       return 40 + size + strlen(name);
+}
+
 /**
  * simple_xattr_free - free an xattr object
  * @xattr: the xattr object
@@ -1363,14 +1383,17 @@ void simple_xattrs_init(struct simple_xattrs *xattrs)
 /**
  * simple_xattrs_free - free xattrs
  * @xattrs: xattr header whose xattrs to destroy
+ * @freed_space: approximate number of bytes of memory freed from @xattrs
  *
  * Destroy all xattrs in @xattr. When this is called no one can hold a
  * reference to any of the xattrs anymore.
  */
-void simple_xattrs_free(struct simple_xattrs *xattrs)
+void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space)
 {
        struct rb_node *rbp;
 
+       if (freed_space)
+               *freed_space = 0;
        rbp = rb_first(&xattrs->rb_root);
        while (rbp) {
                struct simple_xattr *xattr;
@@ -1379,6 +1402,9 @@ void simple_xattrs_free(struct simple_xattrs *xattrs)
                rbp_next = rb_next(rbp);
                xattr = rb_entry(rbp, struct simple_xattr, rb_node);
                rb_erase(&xattr->rb_node, &xattrs->rb_root);
+               if (freed_space)
+                       *freed_space += simple_xattr_space(xattr->name,
+                                                          xattr->size);
                simple_xattr_free(xattr);
                rbp = rbp_next;
        }
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index e37fe667ae04..d20051865800 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -114,7 +114,8 @@ struct simple_xattr {
 };
 
 void simple_xattrs_init(struct simple_xattrs *xattrs);
-void simple_xattrs_free(struct simple_xattrs *xattrs);
+void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space);
+size_t simple_xattr_space(const char *name, size_t size);
 struct simple_xattr *simple_xattr_alloc(const void *value, size_t size);
 void simple_xattr_free(struct simple_xattr *xattr);
 int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
diff --git a/mm/shmem.c b/mm/shmem.c
index c39471384168..7420b510a9f3 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -393,12 +393,12 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
        return 0;
 }
 
-static void shmem_free_inode(struct super_block *sb)
+static void shmem_free_inode(struct super_block *sb, size_t freed_ispace)
 {
        struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
        if (sbinfo->max_inodes) {
                raw_spin_lock(&sbinfo->stat_lock);
-               sbinfo->free_ispace += BOGO_INODE_SIZE;
+               sbinfo->free_ispace += BOGO_INODE_SIZE + freed_ispace;
                raw_spin_unlock(&sbinfo->stat_lock);
        }
 }
@@ -1232,6 +1232,7 @@ static void shmem_evict_inode(struct inode *inode)
 {
        struct shmem_inode_info *info = SHMEM_I(inode);
        struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+       size_t freed;
 
        if (shmem_mapping(inode->i_mapping)) {
                shmem_unacct_size(info->flags, inode->i_size);
@@ -1258,9 +1259,9 @@ static void shmem_evict_inode(struct inode *inode)
                }
        }
 
-       simple_xattrs_free(&info->xattrs);
+       simple_xattrs_free(&info->xattrs, sbinfo->max_inodes ? &freed : NULL);
+       shmem_free_inode(inode->i_sb, freed);
        WARN_ON(inode->i_blocks);
-       shmem_free_inode(inode->i_sb);
        clear_inode(inode);
 #ifdef CONFIG_TMPFS_QUOTA
        dquot_free_inode(inode);
@@ -2440,7 +2441,7 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 
        inode = new_inode(sb);
        if (!inode) {
-               shmem_free_inode(sb);
+               shmem_free_inode(sb, 0);
                return ERR_PTR(-ENOSPC);
        }
 
@@ -3281,7 +3282,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
        ret = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
        if (ret) {
                if (inode->i_nlink)
-                       shmem_free_inode(inode->i_sb);
+                       shmem_free_inode(inode->i_sb, 0);
                goto out;
        }
 
@@ -3301,7 +3302,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
        struct inode *inode = d_inode(dentry);
 
        if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
-               shmem_free_inode(inode->i_sb);
+               shmem_free_inode(inode->i_sb, 0);
 
        simple_offset_remove(shmem_get_offset_ctx(dir), dentry);
 
@@ -3554,21 +3555,40 @@ static int shmem_initxattrs(struct inode *inode,
                            void *fs_info)
 {
        struct shmem_inode_info *info = SHMEM_I(inode);
+       struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
        const struct xattr *xattr;
        struct simple_xattr *new_xattr;
+       size_t ispace = 0;
        size_t len;
 
+       if (sbinfo->max_inodes) {
+               for (xattr = xattr_array; xattr->name != NULL; xattr++) {
+                       ispace += simple_xattr_space(xattr->name,
+                               xattr->value_len + XATTR_SECURITY_PREFIX_LEN);
+               }
+               if (ispace) {
+                       raw_spin_lock(&sbinfo->stat_lock);
+                       if (sbinfo->free_ispace < ispace)
+                               ispace = 0;
+                       else
+                               sbinfo->free_ispace -= ispace;
+                       raw_spin_unlock(&sbinfo->stat_lock);
+                       if (!ispace)
+                               return -ENOSPC;
+               }
+       }
+
        for (xattr = xattr_array; xattr->name != NULL; xattr++) {
                new_xattr = simple_xattr_alloc(xattr->value, xattr->value_len);
                if (!new_xattr)
-                       return -ENOMEM;
+                       break;
 
                len = strlen(xattr->name) + 1;
                new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
                                          GFP_KERNEL);
                if (!new_xattr->name) {
                        kvfree(new_xattr);
-                       return -ENOMEM;
+                       break;
                }
 
                memcpy(new_xattr->name, XATTR_SECURITY_PREFIX,
@@ -3579,6 +3599,16 @@ static int shmem_initxattrs(struct inode *inode,
                simple_xattr_add(&info->xattrs, new_xattr);
        }
 
+       if (xattr->name != NULL) {
+               if (ispace) {
+                       raw_spin_lock(&sbinfo->stat_lock);
+                       sbinfo->free_ispace += ispace;
+                       raw_spin_unlock(&sbinfo->stat_lock);
+               }
+               simple_xattrs_free(&info->xattrs, NULL);
+               return -ENOMEM;
+       }
+
        return 0;
 }
 
@@ -3599,16 +3629,39 @@ static int shmem_xattr_handler_set(const struct xattr_handler *handler,
                                   size_t size, int flags)
 {
        struct shmem_inode_info *info = SHMEM_I(inode);
+       struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
        struct simple_xattr *old_xattr;
+       size_t ispace = 0;
 
        name = xattr_full_name(handler, name);
+       if (value && sbinfo->max_inodes) {
+               ispace = simple_xattr_space(name, size);
+               raw_spin_lock(&sbinfo->stat_lock);
+               if (sbinfo->free_ispace < ispace)
+                       ispace = 0;
+               else
+                       sbinfo->free_ispace -= ispace;
+               raw_spin_unlock(&sbinfo->stat_lock);
+               if (!ispace)
+                       return -ENOSPC;
+       }
+
        old_xattr = simple_xattr_set(&info->xattrs, name, value, size, flags);
        if (!IS_ERR(old_xattr)) {
+               ispace = 0;
+               if (old_xattr && sbinfo->max_inodes)
+                       ispace = simple_xattr_space(old_xattr->name,
+                                                   old_xattr->size);
                simple_xattr_free(old_xattr);
                old_xattr = NULL;
                inode->i_ctime = current_time(inode);
                inode_inc_iversion(inode);
        }
+       if (ispace) {
+               raw_spin_lock(&sbinfo->stat_lock);
+               sbinfo->free_ispace += ispace;
+               raw_spin_unlock(&sbinfo->stat_lock);
+       }
        return PTR_ERR(old_xattr);
 }
 
@@ -3624,9 +3677,16 @@ static const struct xattr_handler shmem_trusted_xattr_handler = {
        .set = shmem_xattr_handler_set,
 };
 
+static const struct xattr_handler shmem_user_xattr_handler = {
+       .prefix = XATTR_USER_PREFIX,
+       .get = shmem_xattr_handler_get,
+       .set = shmem_xattr_handler_set,
+};
+
 static const struct xattr_handler *shmem_xattr_handlers[] = {
        &shmem_security_xattr_handler,
        &shmem_trusted_xattr_handler,
+       &shmem_user_xattr_handler,
        NULL
 };
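
To see what a user xattr now costs against ispace, a sketch using the
standard setxattr(2) call: the tmpfs path and attribute name are
illustrative, and the charge follows the patch's deterministic
simple_xattr_space() formula of 40 + value size + name length:

#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
        /* Assumes a tmpfs mounted at /tmp with nr_inodes != 0 */
        const char *path = "/tmp/afile";        /* illustrative path */
        const char *name = "user.comment";
        const char value[] = "hello";

        if (setxattr(path, name, value, sizeof(value) - 1, 0) != 0) {
                perror("setxattr");     /* ENOSPC once ispace runs out */
                return 1;
        }

        /* Estimate per the patch: 40 + size + strlen(name) */
        printf("ispace charged: %zu bytes\n",
               40 + (sizeof(value) - 1) + strlen(name));       /* 57 */
        return 0;
}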

From patchwork Wed Aug 9 04:34:54 2023
From: Hugh Dickins
Date: Tue, 8 Aug 2023 21:34:54 -0700 (PDT)
To: Christian Brauner
Subject: [PATCH vfs.tmpfs 4/5] tmpfs: trivial support for direct IO
Message-ID: <7c12819-9b94-d56-ff88-35623aa34180@google.com>

Depending upon your philosophical viewpoint, either tmpfs always does
direct IO, or it cannot ever do direct IO; but whichever, if tmpfs is to
stand in for a more sophisticated filesystem, it can be helpful for tmpfs
to support O_DIRECT.

So, give tmpfs a shmem_direct_IO() method, of the simplest kind: by just
returning 0 bytes done, it leaves all the work to the buffered fallback
(and everything else just happens to work out okay - in particular, its
dirty pages don't get lost to invalidation).

xfstests auto generic which were not run on tmpfs before but now pass:
036 091 113 125 130 133 135 198 207 208 209 210 211 212 214 226 239 263
323 355 391 406 412 422 427 446 451 465 551 586 591 609 615 647 708 729
with no new failures.

LTP dio tests which were not run on tmpfs before but now pass: dio01
through dio30, except for dio04 and dio10, which fail because tmpfs dio
read and write allow an odd count: tmpfs could be made stricter, but
would that be an improvement?

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
---
 mm/shmem.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index 7420b510a9f3..4d5599e566df 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2720,6 +2720,16 @@ shmem_write_end(struct file *file, struct address_space *mapping,
        return copied;
 }
 
+static ssize_t shmem_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+       /*
+        * Just leave all the work to the buffered fallback.
+        * Some LTP tests may expect us to enforce alignment restrictions,
+        * but the fallback works just fine with any alignment, so allow it.
+        */
+       return 0;
+}
+
 static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
        struct file *file = iocb->ki_filp;
@@ -4421,6 +4431,7 @@ const struct address_space_operations shmem_aops = {
 #ifdef CONFIG_TMPFS
        .write_begin = shmem_write_begin,
        .write_end = shmem_write_end,
+       .direct_IO = shmem_direct_IO,
 #endif
 #ifdef CONFIG_MIGRATION
        .migrate_folio = migrate_folio,
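
A minimal demonstration of the new behaviour from userspace: the /dev/shm
path is illustrative; before a .direct_IO method existed, the open itself
failed with EINVAL on tmpfs, and since the buffered fallback does the
work, no buffer alignment is actually required:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        /* Assumes /dev/shm (or any tmpfs mount) is available */
        int fd = open("/dev/shm/dio-test", O_CREAT | O_RDWR | O_DIRECT, 0600);
        if (fd < 0) {
                perror("open(O_DIRECT)");       /* EINVAL before this patch */
                return 1;
        }

        /* The write takes the buffered path once shmem_direct_IO() says 0 done */
        if (write(fd, "hello", 5) != 5)
                perror("write");

        close(fd);
        unlink("/dev/shm/dio-test");
        return 0;
}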

From patchwork Wed Aug 9 04:36:12 2023
From: Hugh Dickins
Date: Tue, 8 Aug 2023 21:36:12 -0700 (PDT)
To: Christian Brauner
Subject: [PATCH vfs.tmpfs 5/5] mm: invalidation check mapping before folio_contains

Enabling tmpfs "direct IO" exposes it to invalidate_inode_pages2_range(),
which when swapping can hit the VM_BUG_ON_FOLIO(!folio_contains()): the
folio has been moved from page cache to swap cache (with folio->mapping
reset to NULL), but the folio_index() embedded in folio_contains() sees
swapcache, and so returns the swapcache_index(); whereas folio->index
would be the right one to check against the index from mapping's xarray.

There are different ways to fix this, but my preference is just to order
the checks in invalidate_inode_pages2_range() the same way that they are
in __filemap_get_folio() and find_lock_entries() and filemap_fault():
check folio->mapping before folio_contains().

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
---
 mm/truncate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index 95d1291d269b..c3320e66d6ea 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -657,11 +657,11 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
                }
 
                folio_lock(folio);
-               VM_BUG_ON_FOLIO(!folio_contains(folio, indices[i]), folio);
-               if (folio->mapping != mapping) {
+               if (unlikely(folio->mapping != mapping)) {
                        folio_unlock(folio);
                        continue;
                }
+               VM_BUG_ON_FOLIO(!folio_contains(folio, indices[i]), folio);
                folio_wait_writeback(folio);
 
                if (folio_mapped(folio))
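
The reordering amounts to: establish that the folio still belongs to this
mapping before asserting anything about its index. A self-contained
userspace analogue, where the types and names are illustrative models of
the kernel's, not the real structures:

#include <assert.h>
#include <stddef.h>

struct mapping { int id; };
struct folio { struct mapping *mapping; unsigned long index; };

/* Check ownership first; only then is the index assertion meaningful */
static int still_in_mapping(struct folio *folio, struct mapping *mapping,
                            unsigned long index)
{
        if (folio->mapping != mapping)  /* e.g. moved to swap cache: NULL */
                return 0;               /* skip it, do not assert */
        assert(folio->index == index);  /* analogue of VM_BUG_ON_FOLIO */
        return 1;
}

int main(void)
{
        struct mapping m = { .id = 1 };
        struct folio f = { .mapping = NULL, .index = 3 };  /* migrated folio */

        /* The old order asserted before noticing the folio had moved on */
        return still_in_mapping(&f, &m, 3);     /* 0: safely skipped */
}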