From patchwork Fri Aug 23 17:33:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 13775613
From: André Almeida
To: Hugh Dickins, Andrew Morton, Alexander Viro, Christian Brauner, Jan Kara
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, kernel-dev@igalia.com, krisman@kernel.org, Daniel Rosenberg, smcv@collabora.com, André Almeida
Subject: [PATCH 1/5] tmpfs: Add casefold lookup support
Date: Fri, 23 Aug 2024 14:33:28 -0300
Message-ID: <20240823173332.281211-2-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240823173332.281211-1-andrealmeid@igalia.com>
References: <20240823173332.281211-1-andrealmeid@igalia.com>

Enable casefold lookup in tmpfs, based on the encoding defined by
userspace. That means that instead of comparing file names byte by byte,
tmpfs compares them against a case-insensitive equivalent of the Unicode
string.

* dcache handling

Case-insensitive dentries need special handling. First of all, we
currently invalidate every negative casefold dentry. That happens because
the VFS code currently has no proper support for them: it could
incorrectly reuse a previous filename for a new file that has a casefold
match. For instance, this could happen:

$ mkdir DIR
$ rm -r DIR
$ mkdir dir
$ ls DIR/

This would be perceived as an inconsistency from the userspace point of
view because, even though we match files in a case-insensitive manner, we
still honor whatever the initial filename is.

Along with that, tmpfs stores only the first equivalent name dentry used
in the dcache, preventing duplication of dentries in the dcache. The
d_compare() version for casefold files stores a normalized string, and
before every lookup the filename is normalized as well, achieving a
casefolded lookup.

Signed-off-by: André Almeida
---
 include/linux/shmem_fs.h |  1 +
 mm/shmem.c               | 63 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 1d06b1e5408a..1a1196b077a6 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -73,6 +73,7 @@ struct shmem_sb_info {
 	struct list_head shrinklist;     /* List of shinkable inodes */
 	unsigned long shrinklist_len; /* Length of shrinklist */
 	struct shmem_quota_limits qlimits; /* Default quota limits */
+	bool casefold;
 };
 
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5a77acf6ac6a..aa272c62f811 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -40,6 +40,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "swap.h"
 
 static struct vfsmount *shm_mnt __ro_after_init;
@@ -123,6 +125,8 @@ struct shmem_options {
 	bool noswap;
 	unsigned short quota_types;
 	struct shmem_quota_limits qlimits;
+	struct unicode_map *encoding;
+	bool strict_encoding;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
@@ -3427,6 +3431,12 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
 
+#if IS_ENABLED(CONFIG_UNICODE)
+	if (sb_has_strict_encoding(dir->i_sb) && IS_CASEFOLDED(dir) &&
+	    dir->i_sb->s_encoding && utf8_validate(dir->i_sb->s_encoding, &dentry->d_name))
+		return -EINVAL;
+#endif
+
 	error = simple_acl_create(dir, inode);
 	if (error)
 		goto out_iput;
@@ -3435,6 +3445,9 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error && error != -EOPNOTSUPP)
 		goto out_iput;
 
+	if (IS_CASEFOLDED(dir))
+		d_add(dentry, NULL);
+
 	error = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
 	if (error)
 		goto out_iput;
@@ -3526,6 +3539,9 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir,
 		goto out;
 	}
 
+	if (IS_CASEFOLDED(dir))
+		d_add(dentry, NULL);
+
 	dir->i_size += BOGO_DIRENT_SIZE;
 	inode_set_mtime_to_ts(dir,
 			      inode_set_ctime_to_ts(dir, inode_set_ctime_current(inode)));
@@ -3553,6 +3569,14 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 	inode_inc_iversion(dir);
 	drop_nlink(inode);
 	dput(dentry);	/* Undo the count from "create" - does all the work */
+
+	/*
+	 * For now, VFS can't deal with case-insensitive negative dentries, so
+	 * we destroy them
+	 */
+	if (IS_CASEFOLDED(dir))
+		d_invalidate(dentry);
+
 	return 0;
 }
 
@@ -3697,6 +3721,8 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	dir->i_size += BOGO_DIRENT_SIZE;
 	inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
 	inode_inc_iversion(dir);
+	if (IS_CASEFOLDED(dir))
+		d_add(dentry, NULL);
 	d_instantiate(dentry, inode);
 	dget(dentry);
 	return 0;
@@ -4471,6 +4497,11 @@ static void shmem_put_super(struct super_block *sb)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 
+#if IS_ENABLED(CONFIG_UNICODE)
+	if (sbinfo->casefold)
+		utf8_unload(sb->s_encoding);
+#endif
+
 #ifdef CONFIG_TMPFS_QUOTA
 	shmem_disable_quotas(sb);
 #endif
@@ -4515,6 +4546,17 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 	}
 	sb->s_export_op = &shmem_export_ops;
 	sb->s_flags |= SB_NOSEC | SB_I_VERSION;
+
+#if IS_ENABLED(CONFIG_UNICODE)
+	if (ctx->encoding) {
+		sb->s_encoding = ctx->encoding;
+		generic_set_sb_d_ops(sb);
+		if (ctx->strict_encoding)
+			sb->s_encoding_flags = SB_ENC_STRICT_MODE_FL;
+		sbinfo->casefold = true;
+	}
+#endif
+
 #else
 	sb->s_flags |= SB_NOUSER;
 #endif
@@ -4704,11 +4746,28 @@ static const struct inode_operations shmem_inode_operations = {
 #endif
 };
 
+static struct dentry *shmem_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
+{
+	if (dentry->d_name.len > NAME_MAX)
+		return ERR_PTR(-ENAMETOOLONG);
+
+	/*
+	 * For now, VFS can't deal with case-insensitive negative dentries, so
+	 * we prevent them from being created
+	 */
+	if (IS_CASEFOLDED(dir))
+		return NULL;
+
+	d_add(dentry, NULL);
+
+	return NULL;
+}
+
 static const struct inode_operations shmem_dir_inode_operations = {
 #ifdef CONFIG_TMPFS
 	.getattr	= shmem_getattr,
 	.create		= shmem_create,
-	.lookup		= simple_lookup,
+	.lookup		= shmem_lookup,
 	.link		= shmem_link,
 	.unlink		= shmem_unlink,
 	.symlink	= shmem_symlink,
@@ -4791,6 +4850,8 @@ int shmem_init_fs_context(struct fs_context *fc)
 	ctx->uid = current_fsuid();
 	ctx->gid = current_fsgid();
 
+	ctx->encoding = NULL;
+
 	fc->fs_private = ctx;
 	fc->ops = &shmem_fs_context_ops;
 	return 0;
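
Illustration only, not part of the patch: the userspace sketch below models
the behaviour the commit message describes, namely that names match
case-insensitively, that the first name used for an entry is the one that is
kept, and that a failed lookup leaves nothing cached behind. Everything in it
(struct toy_dir, toy_lookup(), toy_create(), toy_unlink()) is hypothetical
and exists only for this example. It also assumes ASCII-only folding through
tolower(), whereas the kernel folds names through the Unicode map stored in
sb->s_encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16
#define TOY_NAME_MAX 255

/* Hypothetical userspace stand-in for a casefolded directory. */
struct toy_dir {
	char names[TOY_MAX_ENTRIES][TOY_NAME_MAX + 1];
	int used[TOY_MAX_ENTRIES];
};

/* ASCII-only case-insensitive equality (assumption: no Unicode folding). */
int toy_name_equal(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	/* Equal only if both strings ended at the same position. */
	return *a == *b;
}

/*
 * Return the name as it was first stored, or NULL on a miss.
 * A miss records nothing, so there is no "negative" entry to go stale.
 */
const char *toy_lookup(const struct toy_dir *dir, const char *name)
{
	assert(dir && name);
	for (int i = 0; i < TOY_MAX_ENTRIES; i++)
		if (dir->used[i] && toy_name_equal(dir->names[i], name))
			return dir->names[i];
	return NULL;
}

/* Returns 0 on success, -1 if a casefold match exists, the name is too
 * long, or the table is full. */
int toy_create(struct toy_dir *dir, const char *name)
{
	if (strlen(name) > TOY_NAME_MAX || toy_lookup(dir, name))
		return -1;
	for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
		if (!dir->used[i]) {
			strcpy(dir->names[i], name);
			dir->used[i] = 1;
			return 0;
		}
	}
	return -1;
}

/* Remove the entry matching 'name' case-insensitively; 0 on success. */
int toy_unlink(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
		if (dir->used[i] && toy_name_equal(dir->names[i], name)) {
			dir->used[i] = 0;
			return 0;
		}
	}
	return -1;
}
```

Replaying the sequence from the commit message against a zeroed struct
toy_dir (create "DIR", unlink it, create "dir", look up "DIR") returns the
stored name "dir", because the earlier miss left no stale entry under the old
spelling.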