From patchwork Tue Nov 26 15:54:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13886127
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7EA8E1D0DEC;
	Tue, 26 Nov 2024 15:54:54 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1732636494; cv=none;
	b=SipUpyGtzosK1QBLCvm9xfFXcU4jIE/ND/OTvniwb/vrlDVXmXQXShCc0W4fBtgLUbCBR7C9kXrasMImFbZDtZ/V9jWakYevZDdnDgGQFpy10ZemVHBov9BQGlmn8RQyVXT+ch8SJP4QYzm4BG598Kq0lkgmf5c3jzcAPGCDVYw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1732636494; c=relaxed/simple;
	bh=uhgh93plhiFV1pHFjFAu6nt5SYYT8N+YCq9mpcqsLXk=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=FuACKR/E1Hjm/mHhx/bX1tYf9Iv/VEMv3WZs0B2VFL4+zSv8BCOveH2TLiyKQ1DDoCrX/J701whnKLGiCdZ0zA/SjEyrWwLBDVqEdgeVb+wgrOU39P+LfFnPFzrxkasqMdm3fz6fKTBu/gf7HvtgAbGwsDQEGoBDuD/2jt3J8II=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fIAh4dxF;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fIAh4dxF"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 345EFC4CED0;
	Tue, 26 Nov 2024 15:54:53 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1732636494;
	bh=uhgh93plhiFV1pHFjFAu6nt5SYYT8N+YCq9mpcqsLXk=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=fIAh4dxF+U4xexzovtB/7kPFGF35L5IKVEm3YrWxmpi09Xhg+s3Q6xBHA+BEOKxM6
	 01c8Ahzm2uaKRidP1kBlNetkAHbGcVjfgSZQcMQsSVcnkzT0DMkz1+VBNapdeMsbVW
	 ggHJdbDQPhH5XARuL9TLITKoR450QibdFfJ4W/Xn+OhcLvlvPGY8eNqgc66lmwXcem
	 RlTQItF+ZF8mUqqa7lJxN0dnHSjampVwpgtBVUukrEZkV10plMr9XHU9RyWafi/DGu
	 Mmqs1R8Ie5cshvo6L+ztqqUnOPCJWlkOhaXYEp1Aaci3D1uCUKchh4g1N9J1OOlxuO
	 KmFZcPfEyEVEg==
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever , stable@vger.kernel.org
Subject: [RFC PATCH v2 1/5] libfs: Return ENOSPC when the directory offset range is exhausted
Date: Tue, 26 Nov 2024 10:54:40 -0500
Message-ID: <20241126155444.2556-2-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241126155444.2556-1-cel@kernel.org>
References: <20241126155444.2556-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Chuck Lever

Testing shows that the EBUSY error return from mtree_alloc_cyclic()
leaks into user space. The ERRORS section of "man creat(2)" says:

>	EBUSY  O_EXCL was specified in flags and pathname refers
>	       to a block device that is in use by the system
>	       (e.g., it is mounted).

ENOSPC is closer to what applications expect in this situation.

Note that the normal range of simple directory offset values is
2..2^63, so hitting this error is going to be rare to impossible.
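
The error translation described above can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel
code: toy_alloc_offset() and toy_offset_add() are hypothetical names,
and the allocator imitates just one property of mtree_alloc_cyclic(),
namely that it reports -EBUSY once the range is used up. It does not
model the cyclic wrap or the maple tree.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical stand-in for mtree_alloc_cyclic(): hands out offsets in
 * [min, max] in increasing order and returns -EBUSY when none are left.
 */
static int toy_alloc_offset(unsigned long *next, unsigned long min,
			    unsigned long max, unsigned long *out)
{
	assert(max < ULONG_MAX);	/* keeps (*next)++ from wrapping */

	if (*next < min)
		*next = min;
	if (*next > max)
		return -EBUSY;
	*out = (*next)++;
	return 0;
}

/* Same translation the patch adds to simple_offset_add(). */
static int toy_offset_add(unsigned long *next, unsigned long min,
			  unsigned long max, unsigned long *out)
{
	int ret = toy_alloc_offset(next, min, max, out);

	if (ret == -EBUSY)
		return -ENOSPC;	/* range exhausted: report "no space" */
	return ret;
}
```

With a deliberately tiny range such as 2..3, the third call to
toy_offset_add() is the one that would surface ENOSPC to the caller
instead of EBUSY.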

Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets")
Cc: # v6.9+
Signed-off-by: Chuck Lever
Reviewed-by: Yang Erkun
---
 fs/libfs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 46966fd8bcf9..bf67954b525b 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -288,7 +288,9 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 
 	ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
 				 LONG_MAX, &octx->next_offset, GFP_KERNEL);
-	if (ret < 0)
+	if (unlikely(ret == -EBUSY))
+		return -ENOSPC;
+	if (unlikely(ret < 0))
 		return ret;
 
 	offset_set(dentry, offset);

From patchwork Tue Nov 26 15:54:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13886128
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74F4A1DA60D
	for ; Tue, 26 Nov 2024 15:54:55 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1732636495; cv=none;
	b=sGnBPlRxcgYidWV0Ridz3zzLuV/sD20M5dJIQP0BAb+rz3piPSKuo6IIPdscb9iog5BQkToHIbXeHVmd8eHobuUM5tWCEWgd4ETxxNHDxQoVijYZvk0r6XnrRMNF+BBX/HVp6Ff51TynJnrqrhFgYbE4gu2y1SrTuQXsS2bLp8E=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1732636495; c=relaxed/simple;
	bh=LzHZSIB3uuUqqSEGYbIshH/HG/EMysiZv3qwp3h8skw=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=B+dVpFSyj7rRTSfbNvJDt7mHZu8gbj7q0Cetc+eqJlR/f8QE6HA73Kr1sZjaaWLC8z/V5/aaCsON2c3PnX6S7gBC5vl/zG9RaSfaNvlOQj/FdeQc6EPKOAK3auiobr0zmKR0mixn/Vgmv+LK8k12P+qY3mMcBuLycjlAdW83bV8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit
	key) header.d=kernel.org header.i=@kernel.org header.b=oOKjw2+F;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oOKjw2+F"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 436F8C4CECF;
	Tue, 26 Nov 2024 15:54:54 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1732636495;
	bh=LzHZSIB3uuUqqSEGYbIshH/HG/EMysiZv3qwp3h8skw=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=oOKjw2+FzS1xUbmtZewdYyxDYIfYCcte0hAkepPYuNfbsifvo+xMGVOkNw7OhlSQ9
	 OJYlnQN3efHG+cxx1dSKQYCKF6kjnNGnR537DsU3brjQo+rJYRAjHQLSdAcUnAOzGS
	 bnkOLvMiElXJSCCz26M2/WPO25rUPQgH/ZJkUvb97X2pKwG/IdIfcnIYdmfFZiPoR4
	 b9dLVWzC8LUIkMRsQrBOEsBw4MO1u7TQQ/AEOL8M8GVoxPKQYM+tLCM1/WvHxEQMIH
	 ZW85T3EPgmXUfDGOkoa8P7VGRY9J2eqiHsFTKcAhFs4jnu8rgzFmu2l4C/r6GmRQ4a
	 SBBYdKTfKqUKQ==
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever
Subject: [RFC PATCH v2 2/5] libfs: Check dentry before locking in simple_offset_empty()
Date: Tue, 26 Nov 2024 10:54:41 -0500
Message-ID: <20241126155444.2556-3-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241126155444.2556-1-cel@kernel.org>
References: <20241126155444.2556-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Chuck Lever

Defensive change: Don't try to lock a dentry unless it is positive.
Trying to lock a negative dentry will generate a refcount underflow.

The underflow has been seen only while testing.
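
The "check, lock, re-check" ordering can be sketched in userspace C.
This is a simplified model under stated assumptions: struct toy_dentry,
toy_lock(), toy_unlock() and toy_offset_empty() are hypothetical names,
the "lock" is only a counter so the example can show which children
were ever locked, and nothing here models refcounts or concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dentry {
	bool positive;			/* stands in for simple_positive() */
	unsigned int lock_acquisitions;	/* how often toy_lock() was called */
};

static void toy_lock(struct toy_dentry *d)
{
	d->lock_acquisitions++;
}

static void toy_unlock(struct toy_dentry *d)
{
	(void)d;
}

/* Returns 1 if no child is positive ("empty"), 0 otherwise. */
static int toy_offset_empty(struct toy_dentry *children, size_t n)
{
	int ret = 1;

	assert(children != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct toy_dentry *child = &children[i];

		if (!child->positive)
			continue;	/* unlocked pre-check: skip negatives */

		toy_lock(child);
		if (child->positive)	/* re-check with the lock held */
			ret = 0;
		toy_unlock(child);
		if (!ret)
			break;
	}
	return ret;
}
```

In this model a negative child never has its lock touched, and the
scan stops at the first child that is still positive once locked.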

Fixes: ecba88a3b32d ("libfs: Add simple_offset_empty()")
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index bf67954b525b..c88ed15437c7 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -347,13 +347,14 @@ int simple_offset_empty(struct dentry *dentry)
 	index = DIR_OFFSET_MIN;
 	octx = inode->i_op->get_offset_ctx(inode);
 	mt_for_each(&octx->mt, child, index, LONG_MAX) {
-		spin_lock(&child->d_lock);
 		if (simple_positive(child)) {
+			spin_lock(&child->d_lock);
+			if (simple_positive(child))
+				ret = 0;
 			spin_unlock(&child->d_lock);
-			ret = 0;
-			break;
+			if (!ret)
+				break;
 		}
-		spin_unlock(&child->d_lock);
 	}
 
 	return ret;

From patchwork Tue Nov 26 15:54:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13886129
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 682431DA631
	for ; Tue, 26 Nov 2024 15:54:56 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1732636496; cv=none;
	b=rkqalK/n06KfWCthxBStUeRJrlXAyJCiKuJ04t5WQi0WaL8iTe1f2mjDwEqg1QICgFvuyyFnOI8FyI35qocZJv31sJZrvfx2eQfYGUk1AuOC1KxR8LEMWOBBPnuu345Co2FUz01Iz3b4aeIiuKMZYbOtnuyMfteHkncfRgCqB/A=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1732636496; c=relaxed/simple;
	bh=G5Isx5LWQiXPXi2iyTyoRug884HtfgCAxYAVKEyJLvo=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=WfGsaQGytluyuA0Nmr1FxlvwQSLmPRpiKpQKV02NSh4zyfE83XLJTYoI7qxrjxqju5+ffgEbgD71YT10sIqUNIlNvB1hZAaO9da0XHRSQbBTu8cTDOkHIc3CZZeRUULpgeU7zNu8NDjZNSwqee6O+zbF+TSHBYi4LkTjeZbNiPw=
ARC-Authentication-Results:
	i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=By05q8Qa;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="By05q8Qa"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3B2F4C4CED2;
	Tue, 26 Nov 2024 15:54:55 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1732636496;
	bh=G5Isx5LWQiXPXi2iyTyoRug884HtfgCAxYAVKEyJLvo=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=By05q8QaTXtyBsRr33oV31LREOAX2SwvLMusYu3wtZsr58fcPnb2O6q6OVj5NXrg5
	 K1HCHoPJ9iV1kyOupcUv9D6Ev255H5pDTnKE365/Boa7xD8q261E8HpblyqRdUFGXN
	 6S1Eq4+P6nDCAoOIubvwlhe7PAn+gCCxgqdSx96oVZgn4WfXaNavV4Y5a6wZ+Ef2mg
	 vQCS0Sqr0O1LnmdolPkaIKmcFv9f/BtAUfjFLLQkVEtni3j0/8iy2pkb12EV9Fdy9h
	 2y+NhHSiEobBm0DiCwuxEhfElV5uaHugC2fdXausDp9XXrpmj9Ob5/PLxCqr1E2ofB
	 FlWAyDTdmQJtA==
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever
Subject: [RFC PATCH v2 3/5] Revert "libfs: fix infinite directory reads for offset dir"
Date: Tue, 26 Nov 2024 10:54:42 -0500
Message-ID: <20241126155444.2556-4-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241126155444.2556-1-cel@kernel.org>
References: <20241126155444.2556-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Chuck Lever

Using octx->next_offset to determine the newest entries works only
because the offset value range is 63 bits wide. If an offset were to
wrap, existing entries would no longer be visible to readdir, because
offset_readdir() stops listing entries once an entry's offset is
larger than octx->next_offset.

Revert this fix for the infinite readdir loop bug to make room for a
better fix.
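
The visibility problem after a wrap can be illustrated with a small
userspace model. It is a sketch under stated assumptions, not the
reverted kernel code: toy_readdir_count() is a hypothetical helper that
keeps only the stopping rule (stop at the first offset that is not
below the next_offset value captured when the directory was opened),
and the directory is reduced to a sorted array of offsets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts how many entries the reverted logic would list: iteration
 * stops at the first entry whose offset is >= last_index.
 */
static size_t toy_readdir_count(const long *sorted_offsets, size_t n,
				long last_index)
{
	size_t emitted = 0;

	assert(sorted_offsets != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (sorted_offsets[i] >= last_index)
			break;		/* treated as "newer than the open" */
		emitted++;
	}
	return emitted;
}
```

For a directory holding offsets 2, 5 and 9, a captured next_offset of
10 lists all three entries. If the allocator has wrapped and the
captured value is 4, only the entry at offset 2 is listed, even though
the other two have existed all along.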

Signed-off-by: Chuck Lever
---
 fs/libfs.c | 35 +++++++++++------------------------
 1 file changed, 11 insertions(+), 24 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index c88ed15437c7..e6c46b13fc71 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -453,14 +453,6 @@ void simple_offset_destroy(struct offset_ctx *octx)
 	mtree_destroy(&octx->mt);
 }
 
-static int offset_dir_open(struct inode *inode, struct file *file)
-{
-	struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode);
-
-	file->private_data = (void *)ctx->next_offset;
-	return 0;
-}
-
 /**
  * offset_dir_llseek - Advance the read position of a directory descriptor
  * @file: an open directory whose position is to be updated
@@ -474,9 +466,6 @@ static int offset_dir_open(struct inode *inode, struct file *file)
  */
 static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 {
-	struct inode *inode = file->f_inode;
-	struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode);
-
 	switch (whence) {
 	case SEEK_CUR:
 		offset += file->f_pos;
@@ -490,8 +479,7 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 	}
 
 	/* In this case, ->private_data is protected by f_pos_lock */
-	if (!offset)
-		file->private_data = (void *)ctx->next_offset;
+	file->private_data = NULL;
 	return vfs_setpos(file, offset, LONG_MAX);
 }
 
@@ -522,7 +510,7 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
-static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx, long last_index)
+static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 {
 	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
 	struct dentry *dentry;
@@ -530,21 +518,17 @@ static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx, lon
 	while (true) {
 		dentry = offset_find_next(octx, ctx->pos);
 		if (!dentry)
-			return;
-
-		if (dentry2offset(dentry) >= last_index) {
-			dput(dentry);
-			return;
-		}
+			return ERR_PTR(-ENOENT);
 
 		if (!offset_dir_emit(ctx, dentry)) {
 			dput(dentry);
-			return;
+			break;
 		}
 
 		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
 	}
+	return NULL;
 }
 
 /**
@@ -571,19 +555,22 @@ static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx, lon
 static int offset_readdir(struct file *file, struct dir_context *ctx)
 {
 	struct dentry *dir = file->f_path.dentry;
-	long last_index = (long)file->private_data;
 
 	lockdep_assert_held(&d_inode(dir)->i_rwsem);
 
 	if (!dir_emit_dots(file, ctx))
 		return 0;
 
-	offset_iterate_dir(d_inode(dir), ctx, last_index);
+	/* In this case, ->private_data is protected by f_pos_lock */
+	if (ctx->pos == DIR_OFFSET_MIN)
+		file->private_data = NULL;
+	else if (file->private_data == ERR_PTR(-ENOENT))
+		return 0;
+	file->private_data = offset_iterate_dir(d_inode(dir), ctx);
 	return 0;
 }
 
 const struct file_operations simple_offset_dir_operations = {
-	.open		= offset_dir_open,
 	.llseek		= offset_dir_llseek,
 	.iterate_shared	= offset_readdir,
 	.read		= generic_read_dir,

From patchwork Tue Nov 26 15:54:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13886130
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81F7D1DAC90
	for ; Tue, 26 Nov 2024 15:54:57 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1732636497; cv=none;
	b=Ib+bHzlRxst1aHYdMg61XvTav9HlLRbNequ5zhOiQr0p1Y2kNQy7F/sgNt3VDYKJW2ChhgFYa/sfQZWBxBALl74c5f/I0mqpOjvH1qUise7ZABBiw5KUg/9GYnjPiepoV+Yh6CI4Onn/vZqGYfTVxot/7Q07SbU20/UqwD3kuOI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1732636497; c=relaxed/simple;
	bh=tMz/47mxpsuDcyrl6vBlEBAasdrsXvapQPHriWB0vcU=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=MKz2+praieyHTE+uy13S6ydoAy1uX/iH7WsslJPfxBkZro48yPhWl0lNd8N/8gRMzPhOEPY1vso83MxDrhTfIHrpSz2SAz6MoRKO2AZY05+4sbXd8VpWWGiyB9UuaPsLF1Lmw2nU/oru9tHGILJNlXtJ4IUpthxzb62Tnt7aqO8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SsK6NV8j;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SsK6NV8j"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3AA74C4CED3;
	Tue, 26 Nov 2024 15:54:56 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1732636497;
	bh=tMz/47mxpsuDcyrl6vBlEBAasdrsXvapQPHriWB0vcU=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=SsK6NV8jA66kTudkpuY9LMmdM8pQatvG2B2mhSyAac4M/AEOF25/vatqmM9Falwci
	 yj7mYfpwIf2l+hqSnEoF+aP5ks6Sa9qzjw4ccEmVHYRy4GL0VFDkDcsJNEqWotFeud
	 presvLu/aV0ui65cWMC7eg09NrL7cNa2Nq2ua9M6mxyMZYUPfoiJYim0q/GAjg3Md5
	 VO5R3CDpPk9qijalnRWIXExd8XetVGsblXaetFQ8LqpSTHEuyY/f27mhUzno/q2yQp
	 FjMxtZUy+R4esvZFh/CugOAt1+qwdk6nNVWCiKHQe0RHfsWh95kIV/HLyZBzAJto/H
	 HCyI48l739NTg==
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever
Subject: [RFC PATCH v2 4/5] libfs: Refactor end-of-directory detection for simple_offset directories
Date: Tue, 26 Nov 2024 10:54:43 -0500
Message-ID: <20241126155444.2556-5-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241126155444.2556-1-cel@kernel.org>
References: <20241126155444.2556-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Chuck Lever

This mechanism seems to have been misunderstood more than once. Make
the code more self-documenting.
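
The end-of-directory mechanism can be sketched in userspace C as a
sentinel stored in a per-open-file pointer. This is a model under
stated assumptions, not the kernel implementation: struct toy_file,
toy_err_ptr() and the toy_* helpers are hypothetical, toy_err_ptr()
only imitates the kernel's ERR_PTR() encoding, and the directory is
reduced to a count of entries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_file {
	void *private_data;	/* NULL, or the end-of-directory sentinel */
};

/* Userspace imitation of ERR_PTR(): encode an errno value in a pointer. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static void toy_set_eod(struct toy_file *f)
{
	f->private_data = toy_err_ptr(-ENOENT);
}

static void toy_clear_eod(struct toy_file *f)
{
	f->private_data = NULL;
}

static bool toy_at_eod(const struct toy_file *f)
{
	return f->private_data == toy_err_ptr(-ENOENT);
}

/* Emits up to @batch entries starting at *pos; returns how many. */
static size_t toy_readdir(struct toy_file *f, size_t *pos,
			  size_t nentries, size_t batch)
{
	size_t emitted = 0;

	assert(f != NULL && pos != NULL);

	if (toy_at_eod(f))
		return 0;	/* stay at EOD until the position is reset */

	while (emitted < batch) {
		if (*pos >= nentries) {
			toy_set_eod(f);
			break;
		}
		(*pos)++;
		emitted++;
	}
	return emitted;
}
```

Once the sentinel is set, later reads return nothing even if entries
are created afterwards, which is what keeps a reader from looping
forever on a directory that keeps growing. Clearing the sentinel, as
the open and llseek paths do in the patch, lets iteration resume.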

Signed-off-by: Chuck Lever
---
 fs/libfs.c | 54 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index e6c46b13fc71..be641a84047a 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -453,6 +453,34 @@ void simple_offset_destroy(struct offset_ctx *octx)
 	mtree_destroy(&octx->mt);
 }
 
+static void offset_set_eod(struct file *file)
+{
+	file->private_data = ERR_PTR(-ENOENT);
+}
+
+static void offset_clear_eod(struct file *file)
+{
+	file->private_data = NULL;
+}
+
+static bool offset_at_eod(struct file *file)
+{
+	return file->private_data == ERR_PTR(-ENOENT);
+}
+
+/**
+ * offset_dir_open - Open a directory descriptor
+ * @inode: directory to be opened
+ * @file: struct file to instantiate
+ *
+ * Returns zero on success, or a negative errno value.
+ */
+static int offset_dir_open(struct inode *inode, struct file *file)
+{
+	offset_clear_eod(file);
+	return 0;
+}
+
 /**
  * offset_dir_llseek - Advance the read position of a directory descriptor
  * @file: an open directory whose position is to be updated
@@ -478,8 +506,8 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 		return -EINVAL;
 	}
 
-	/* In this case, ->private_data is protected by f_pos_lock */
-	file->private_data = NULL;
+	/* ->private_data is protected by f_pos_lock */
+	offset_clear_eod(file);
 	return vfs_setpos(file, offset, LONG_MAX);
 }
 
@@ -510,15 +538,20 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
-static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
+static void offset_iterate_dir(struct file *file, struct dir_context *ctx)
 {
+	struct dentry *dir = file->f_path.dentry;
+	struct inode *inode = d_inode(dir);
 	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
 	struct dentry *dentry;
 
 	while (true) {
 		dentry = offset_find_next(octx, ctx->pos);
-		if (!dentry)
-			return ERR_PTR(-ENOENT);
+		if (!dentry) {
+			/* ->private_data is protected by f_pos_lock */
+			offset_set_eod(file);
+			return;
+		}
 
 		if (!offset_dir_emit(ctx, dentry)) {
 			dput(dentry);
@@ -528,7 +561,6 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
 	}
-	return NULL;
 }
 
 /**
@@ -561,16 +593,14 @@ static int offset_readdir(struct file *file, struct dir_context *ctx)
 	if (!dir_emit_dots(file, ctx))
 		return 0;
 
-	/* In this case, ->private_data is protected by f_pos_lock */
-	if (ctx->pos == DIR_OFFSET_MIN)
-		file->private_data = NULL;
-	else if (file->private_data == ERR_PTR(-ENOENT))
-		return 0;
-	file->private_data = offset_iterate_dir(d_inode(dir), ctx);
+	/* ->private_data is protected by f_pos_lock */
+	if (!offset_at_eod(file))
+		offset_iterate_dir(file, ctx);
 	return 0;
 }
 
 const struct file_operations simple_offset_dir_operations = {
+	.open		= offset_dir_open,
 	.llseek		= offset_dir_llseek,
 	.iterate_shared	= offset_readdir,
 	.read		= generic_read_dir,

From patchwork Tue Nov 26 15:54:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13886131
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C87C1DA61B
	for ; Tue, 26 Nov 2024 15:54:58 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1732636498; cv=none;
	b=rnMhQB9mrPvFFcbZ69bS5BmdyMk3ciS7nwVraREjiEDVx5XC/8hvdlwuYh+lDgg/OFWjiYDw0T5lND6cDgPNS3+bOWOQSIvvnTtz3ToSKlLOSjigV50JBUo9LqZTQ+ZRXkmQYst72cha5KqmiS3wH4YKTqn1W3y89TuA3IOnnbY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1732636498; c=relaxed/simple;
	bh=l4XCZVlwO/32TPnzhT69mYkOXGa3rUk3rjWN4U/BZdE=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=WsZmWM/ZfA9482I7NUmoW/ZjFKjO2xZYl6FyqwqybLUetC5wCn3ris4SbPsg6ttUhQArBe9xXkakXe1Rk3hMkEv4sDTJCgpaLOl2kyI3g0dZBxsW1181nNZWLYUs+NAHRRhinCX5u9H3vsKcsOWFxI51ZJtoFWazofpoxUxqJ3c=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aukl0fCD;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aukl0fCD"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3870BC4CED7;
	Tue, 26 Nov 2024 15:54:57 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1732636498;
	bh=l4XCZVlwO/32TPnzhT69mYkOXGa3rUk3rjWN4U/BZdE=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=aukl0fCDiCZH1mltSMfE078b+QiSx1oYrncyG7BPLLeBrJpI/mYBPvRwqP1VOvqOE
	 4aTKsTb5LJePJEXAnF7uz+znj+PvK2gAXJF6PcrL4fsAqfdVwHAR3Ps38LP1lCeO/v
	 AaMJZKZaQJoXpdnpHVK0hvWY0RNqxnymlHPEqP8d9JLHUsZvdUWdBioiU/veJtU/hw
	 kc7H44lmJRBVOsJm3BZWhBscorvatt1RzkelvzMeMSfZteV/eE8AphXPhC/j8ShCkn
	 wonsfhF5alWIOvuXOnRfGI0/r5d7g/iEDUc2lstZe411KI0GGiaX1bfHbZTcQYo5jQ
	 uun6vWQx5mkwg==
From: cel@kernel.org
To: Hugh Dickens , Christian Brauner , Al Viro
Cc: , , yukuai3@huawei.com, yangerkun@huaweicloud.com, Chuck Lever
Subject: [RFC PATCH v2 5/5] libfs: Refactor offset_iterate_dir()
Date: Tue, 26 Nov 2024 10:54:44 -0500
Message-ID: <20241126155444.2556-6-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241126155444.2556-1-cel@kernel.org>
References: <20241126155444.2556-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Chuck Lever

This line in offset_iterate_dir():

	ctx->pos = dentry2offset(dentry) + 1;

assumes that the next child entry has an offset value that is
greater than that of the current child entry. Since directory
offsets are actually cookies, this heuristic is not always correct.

We have tested the current code with a limited offset range to see
if this is an operational problem. It doesn't seem to be, but doing
a "+ 1" on what is supposed to be an opaque cookie is very likely
wrong and brittle.

Instead of using the mtree to emit entries in the order of their
offset values, use it only to map the initial ctx->pos to a starting
entry. Then use the directory's d_children list, which is already
maintained by the dcache, to find the next child to emit, as the
simple cursor-based implementation still does.

Signed-off-by: Chuck Lever
---
 fs/libfs.c | 89 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 71 insertions(+), 18 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index be641a84047a..862b4203d389 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -241,9 +241,9 @@ const struct inode_operations simple_dir_inode_operations = {
 };
 EXPORT_SYMBOL(simple_dir_inode_operations);
 
-/* 0 is '.', 1 is '..', so always start with offset 2 or more */
 enum {
-	DIR_OFFSET_MIN	= 2,
+	DIR_OFFSET_FIRST	= 2,	/* seek to the first real entry */
+	DIR_OFFSET_MIN		= 3,	/* lowest real offset value */
 };
 
 static void offset_set(struct dentry *dentry, long offset)
@@ -267,7 +267,7 @@ void simple_offset_init(struct offset_ctx *octx)
 {
 	mt_init_flags(&octx->mt, MT_FLAGS_ALLOC_RANGE);
 	lockdep_set_class(&octx->mt.ma_lock, &simple_offset_lock_class);
-	octx->next_offset = DIR_OFFSET_MIN;
+	octx->next_offset = 0;
 }
 
 /**
@@ -511,10 +511,30 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 	return vfs_setpos(file, offset, LONG_MAX);
 }
 
-static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
+static noinline_for_stack struct dentry *offset_dir_first(struct file *file)
 {
+	struct dentry *child, *found = NULL, *dir = file->f_path.dentry;
+
+	spin_lock(&dir->d_lock);
+	child = d_first_child(dir);
+	if (child && simple_positive(child)) {
+		spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
+		if (simple_positive(child))
+			found = dget_dlock(child);
+		spin_unlock(&child->d_lock);
+	}
+	spin_unlock(&dir->d_lock);
+	return found;
+}
+
+static noinline_for_stack struct dentry *
+offset_dir_lookup(struct file *file, loff_t offset)
+{
+	struct dentry *child, *found = NULL, *dir = file->f_path.dentry;
+	struct inode *inode = d_inode(dir);
+	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
+
 	MA_STATE(mas, &octx->mt, offset, offset);
-	struct dentry *child, *found = NULL;
 
 	rcu_read_lock();
 	child = mas_find(&mas, LONG_MAX);
@@ -538,29 +558,62 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
+/*
+ * This is find_next_child() without the dput() tail. We might
+ * combine offset_dir_next() and find_next_child().
+ */
+static struct dentry *offset_dir_next(struct dentry *dentry)
+{
+	struct dentry *parent = dentry->d_parent;
+	struct dentry *d, *found = NULL;
+
+	spin_lock(&parent->d_lock);
+	d = d_next_sibling(dentry);
+	hlist_for_each_entry_from(d, d_sib) {
+		if (simple_positive(d)) {
+			spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
+			if (simple_positive(d))
+				found = dget_dlock(d);
+			spin_unlock(&d->d_lock);
+			if (likely(found))
+				break;
+		}
+	}
+	spin_unlock(&parent->d_lock);
+	return found;
+}
+
 static void offset_iterate_dir(struct file *file, struct dir_context *ctx)
 {
-	struct dentry *dir = file->f_path.dentry;
-	struct inode *inode = d_inode(dir);
-	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
-	struct dentry *dentry;
+	struct dentry *dentry, *next = NULL;
+
+	if (ctx->pos == DIR_OFFSET_FIRST)
+		dentry = offset_dir_first(file);
+	else
+		dentry = offset_dir_lookup(file, ctx->pos);
+	if (!dentry) {
+		/* ->private_data is protected by f_pos_lock */
+		offset_set_eod(file);
+		return;
+	}
 
 	while (true) {
-		dentry = offset_find_next(octx, ctx->pos);
-		if (!dentry) {
-			/* ->private_data is protected by f_pos_lock */
-			offset_set_eod(file);
-			return;
-		}
-
 		if (!offset_dir_emit(ctx, dentry)) {
-			dput(dentry);
+			ctx->pos = dentry2offset(dentry);
+			break;
+		}
+
+		next = offset_dir_next(dentry);
+		if (!next) {
+			offset_set_eod(file);
 			break;
 		}
 
-		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
+		dentry = next;
 	}
+
+	dput(dentry);
 }
 
 /**
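
The iteration scheme from patch 5/5 (use the cookie only to find a
starting entry, then follow the sibling list) can be sketched in
userspace C. This is a simplified model under stated assumptions, not
the kernel code: struct toy_entry, struct toy_dir, toy_lookup() and
toy_iterate() are hypothetical names, a singly linked list stands in
for the dcache sibling list, and an exact-match linear search stands in
for the maple tree lookup. Locking and refcounting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_OFFSET_FIRST = 2 };	/* "start from the first real entry" */

struct toy_entry {
	long cookie;			/* opaque offset value, >= 3 here */
	struct toy_entry *next;		/* sibling order, not cookie order */
};

struct toy_dir {
	struct toy_entry *first;
};

/* Maps a cookie back to its entry (exact match only in this model). */
static struct toy_entry *toy_lookup(struct toy_dir *dir, long cookie)
{
	for (struct toy_entry *e = dir->first; e; e = e->next)
		if (e->cookie == cookie)
			return e;
	return NULL;
}

/*
 * Emits cookies into @out (capacity @cap), starting at *pos. When the
 * buffer fills up, *pos is set to the cookie of the first entry that
 * was not emitted. *eod is set once the sibling list runs out.
 */
static size_t toy_iterate(struct toy_dir *dir, long *pos, long *out,
			  size_t cap, bool *eod)
{
	struct toy_entry *e;
	size_t n = 0;

	assert(dir && pos && out && eod);

	e = (*pos == TOY_OFFSET_FIRST) ? dir->first : toy_lookup(dir, *pos);
	if (!e) {
		*eod = true;
		return 0;
	}

	for (;;) {
		if (n == cap) {
			*pos = e->cookie;	/* resume here, no "+ 1" */
			break;
		}
		out[n++] = e->cookie;
		e = e->next;
		if (!e) {
			*eod = true;
			break;
		}
	}
	return n;
}
```

In this model the sibling order 7, 3, 5 is emitted as-is, and a resumed
read picks up at cookie 5 without assuming that later entries carry
larger cookies.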