From patchwork Wed Nov 27 15:28:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13887147
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever
Subject: [RFC PATCH v3 5/5] libfs: Refactor offset_iterate_dir()
Date: Wed, 27 Nov 2024 10:28:15 -0500
Message-ID: <20241127152815.151781-6-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241127152815.151781-1-cel@kernel.org>
References: <20241127152815.151781-1-cel@kernel.org>

From: Chuck Lever

This line in offset_iterate_dir():

	ctx->pos = dentry2offset(dentry) + 1;

assumes that the next child entry has an offset value that is
greater than the current child entry. Since directory offsets are
actually cookies, this heuristic is not always correct.

We have tested the current code with a limited offset range to see
if this is an operational problem. It doesn't seem to be, but doing
a "+ 1" on what is supposed to be an opaque cookie is very likely
wrong and brittle.

Instead of using the mtree to emit entries in the order of their
offset values, use it only to map the initial ctx->pos to a starting
entry. Then use the directory's d_children list, which is already
maintained by the dcache, to find the next child to emit, as the
simple cursor-based implementation still does.

Signed-off-by: Chuck Lever
---
 fs/libfs.c | 95 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 74 insertions(+), 21 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 0deff5390abb..2616421bbe0e 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -241,9 +241,9 @@ const struct inode_operations simple_dir_inode_operations = {
 };
 EXPORT_SYMBOL(simple_dir_inode_operations);
 
-/* 0 is '.', 1 is '..', so always start with offset 2 or more */
 enum {
-	DIR_OFFSET_MIN		= 2,
+	DIR_OFFSET_FIRST	= 2,		/* seek to the first real entry */
+	DIR_OFFSET_MIN		= 3,		/* minimum allocated offset value */
 };
 
 static void offset_set(struct dentry *dentry, long offset)
@@ -507,19 +507,53 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 	return vfs_setpos(file, offset, LONG_MAX);
 }
 
-static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
+/* Cf. find_next_child() */
+static struct dentry *find_next_sibling_locked(struct dentry *dentry)
 {
-	MA_STATE(mas, &octx->mt, offset, offset);
+	struct dentry *found = NULL;
+
+	hlist_for_each_entry_from(dentry, d_sib) {
+		if (!simple_positive(dentry))
+			continue;
+		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
+		if (simple_positive(dentry))
+			found = dget_dlock(dentry);
+		spin_unlock(&dentry->d_lock);
+		if (likely(found))
+			break;
+	}
+	return found;
+}
+
+static noinline_for_stack struct dentry *offset_dir_first(struct file *file)
+{
+	struct dentry *parent = file->f_path.dentry;
+	struct dentry *found;
+
+	spin_lock(&parent->d_lock);
+	found = find_next_sibling_locked(d_first_child(parent));
+	spin_unlock(&parent->d_lock);
+	return found;
+}
+
+static noinline_for_stack struct dentry *
+offset_dir_lookup(struct file *file, loff_t offset)
+{
+	struct dentry *parent = file->f_path.dentry;
 	struct dentry *child, *found = NULL;
+	struct inode *inode = d_inode(parent);
+	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
+
+	MA_STATE(mas, &octx->mt, offset, offset);
 
 	rcu_read_lock();
 	child = mas_find(&mas, LONG_MAX);
 	if (!child)
 		goto out;
-	spin_lock(&child->d_lock);
-	if (simple_positive(child))
-		found = dget_dlock(child);
-	spin_unlock(&child->d_lock);
+
+	spin_lock(&parent->d_lock);
+	found = find_next_sibling_locked(child);
+	spin_unlock(&parent->d_lock);
 out:
 	rcu_read_unlock();
 	return found;
@@ -534,29 +568,48 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
+static struct dentry *offset_dir_next(struct dentry *child)
+{
+	struct dentry *parent = child->d_parent;
+	struct dentry *found;
+
+	spin_lock(&parent->d_lock);
+	found = find_next_sibling_locked(d_next_sibling(child));
+	spin_unlock(&parent->d_lock);
+	return found;
+}
+
 static void offset_iterate_dir(struct file *file, struct dir_context *ctx)
 {
-	struct dentry *dir = file->f_path.dentry;
-	struct inode *inode = d_inode(dir);
-	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
-	struct dentry *dentry;
+	struct dentry *dentry, *next = NULL;
+
+	if (ctx->pos == DIR_OFFSET_FIRST)
+		dentry = offset_dir_first(file);
+	else
+		dentry = offset_dir_lookup(file, ctx->pos);
+	if (!dentry) {
+		/* ->private_data is protected by f_pos_lock */
+		offset_set_eod(file);
+		return;
+	}
 
 	while (true) {
-		dentry = offset_find_next(octx, ctx->pos);
-		if (!dentry) {
-			/* ->private_data is protected by f_pos_lock */
-			offset_set_eod(file);
-			return;
-		}
-
 		if (!offset_dir_emit(ctx, dentry)) {
-			dput(dentry);
+			ctx->pos = dentry2offset(dentry);
+			break;
+		}
+
+		next = offset_dir_next(dentry);
+		if (!next) {
+			offset_set_eod(file);
 			break;
 		}
 
-		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
+		dentry = next;
 	}
+
+	dput(dentry);
 }
 
 /**
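
Illustration only, not part of the patch: the iteration scheme described in
the commit message can be modelled in userspace as follows. This is a rough
sketch under stated assumptions. struct entry, lookup_start(), iterate_dir(),
run_demo(), FIRST_POS and END_OF_DIR are hypothetical names invented for the
example; a linear scan stands in for the mtree lookup, a plain singly linked
list stands in for the d_sib linkage, and locking, reference counting and
negative dentries are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIRST_POS	2	/* mirrors DIR_OFFSET_FIRST: start at the first entry */
#define END_OF_DIR	(-1L)	/* hypothetical end-of-directory marker */

/* Hypothetical stand-in for a dentry. */
struct entry {
	const char *name;
	long cookie;		/* opaque: no ordering relative to siblings */
	struct entry *next;	/* stand-in for the d_sib linkage */
};

/*
 * Stand-in for the mtree lookup: return the entry with the smallest
 * cookie >= pos, or NULL if there is none.
 */
static struct entry *lookup_start(struct entry *head, long pos)
{
	struct entry *best = NULL;

	for (struct entry *e = head; e; e = e->next)
		if (e->cookie >= pos && (!best || e->cookie < best->cookie))
			best = e;
	return best;
}

/*
 * Append up to 'budget' names to 'out'. The cookie is used only to find
 * the starting entry; every later step follows the sibling list.
 * Returns the cookie to resume from, or END_OF_DIR.
 */
static long iterate_dir(struct entry *head, long pos, int budget,
			char *out, size_t outlen)
{
	struct entry *e = (pos == FIRST_POS) ? head : lookup_start(head, pos);

	while (e) {
		if (budget-- == 0)
			return e->cookie;	/* resume here, no "+ 1" */
		strncat(out, e->name, outlen - strlen(out) - 1);
		e = e->next;
	}
	return END_OF_DIR;
}

/* Two short reads over four entries whose cookies are not in list order. */
static void run_demo(char *out, size_t outlen)
{
	struct entry d = { "d", 5, NULL };
	struct entry c = { "c", 9, &d };
	struct entry b = { "b", 3, &c };
	struct entry a = { "a", 7, &b };
	long pos;

	out[0] = '\0';
	pos = iterate_dir(&a, FIRST_POS, 2, out, outlen);	/* emits a, b */
	if (pos != END_OF_DIR)
		iterate_dir(&a, pos, 10, out, outlen);		/* resumes at c */
}
```

Calling run_demo() with a 16-byte buffer leaves "abcd" in it: each entry is
emitted once, in list order, even though the cookies (7, 3, 9, 5) are not
monotonic. The resume position is the cookie of the first entry that was not
emitted, so no arithmetic is ever performed on a cookie.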