From patchwork Fri Aug 2 21:45:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 13752039
From: Jeff Layton
Date: Fri, 02 Aug 2024 17:45:02 -0400
Subject: [PATCH RFC 1/4] fs: remove comment about d_rcu_to_refcount
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20240802-openfast-v1-1-a1cff2a33063@kernel.org>
References: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
In-Reply-To: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
To: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton
Cc: Josef Bacik, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, Jeff Layton

This function no longer exists.

Signed-off-by: Jeff Layton
---
 fs/dcache.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 4c144519aa70..474c5205a3db 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2160,9 +2160,6 @@ static noinline struct dentry *__d_lookup_rcu_op_compare(
  * without taking d_lock and checking d_seq sequence count against @seq
  * returned here.
  *
- * A refcount may be taken on the found dentry with the d_rcu_to_refcount
- * function.
- *
  * Alternatively, __d_lookup_rcu may be called again to look up the child of
  * the returned dentry, so long as its parent's seqlock is checked after the
  * child is looked up. Thus, an interlocking stepping of sequence lock checks

From patchwork Fri Aug 2 21:45:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 13752040
From: Jeff Layton
Date: Fri, 02 Aug 2024 17:45:03 -0400
Subject: [PATCH RFC 2/4] fs: add a kerneldoc header over lookup_fast
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20240802-openfast-v1-2-a1cff2a33063@kernel.org>
References: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
In-Reply-To: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
To: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton
Cc: Josef Bacik, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, Jeff Layton

The lookup_fast helper in fs/namei.c has some subtlety in how dentries
are returned. Document it.

Signed-off-by: Jeff Layton
---
 fs/namei.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 1e05a0f3f04d..b9bdb8e6214a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1613,6 +1613,20 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 }
 EXPORT_SYMBOL(lookup_one_qstr_excl);
 
+/**
+ * lookup_fast - do fast lockless (but racy) lookup of a dentry
+ * @nd: current nameidata
+ *
+ * Do a fast, but racy lookup in the dcache for the given dentry, and
+ * revalidate it. Returns a valid dentry pointer or NULL if one wasn't
+ * found. On error, an ERR_PTR will be returned.
+ *
+ * If this function returns a valid dentry and the walk is no longer
+ * lazy, the dentry will carry a reference that must later be put. If
+ * RCU mode is still in force, then this is not the case and the dentry
+ * must be legitimized before use. If this returns NULL, then the walk
+ * will no longer be in RCU mode.
+ */
 static struct dentry *lookup_fast(struct nameidata *nd)
 {
 	struct dentry *dentry, *parent = nd->path.dentry;

From patchwork Fri Aug 2 21:45:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 13752041
From: Jeff Layton
Date: Fri, 02 Aug 2024 17:45:04 -0400
Subject: [PATCH RFC 3/4] lockref: rework CMPXCHG_LOOP to handle contention better
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20240802-openfast-v1-3-a1cff2a33063@kernel.org>
References: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
In-Reply-To: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
To: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton
Cc: Josef Bacik, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, Jeff Layton

In a later patch, we want to change the open(..., O_CREAT) codepath to
avoid taking the inode->i_rwsem for write when the dentry already exists.
When we tested that initially, performance degraded significantly due to
contention for the parent's d_lockref spinlock.

There are two problems with lockrefs today: First, once any concurrent
task takes the spinlock, they all end up taking the spinlock, which is
much more costly than a single cmpxchg operation. The second problem is
that once any task fails to cmpxchg 100 times, it falls back to the
spinlock. The upshot there is that even moderate contention can cause a
fallback to serialized spinlocking, which worsens performance.

This patch changes CMPXCHG_LOOP in two ways:

First, change the loop to spin instead of falling back to a locked
codepath when the spinlock is held. Once the lock is released, allow the
task to continue trying its cmpxchg loop as before instead of taking the
lock. Second, don't allow the cmpxchg loop to give up after 100 retries.
Just continue indefinitely.

This greatly reduces contention on the lockref when there are large
numbers of concurrent increments and decrements occurring.

Signed-off-by: Jeff Layton
---
 lib/lockref.c | 85 ++++++++++++++++++++++-------------------------------------
 1 file changed, 32 insertions(+), 53 deletions(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 2afe4c5d8919..b76941043fe9 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -8,22 +8,25 @@
  * Note that the "cmpxchg()" reloads the "old" value for the
  * failure case.
  */
-#define CMPXCHG_LOOP(CODE, SUCCESS) do {					\
-	int retry = 100;							\
-	struct lockref old;							\
-	BUILD_BUG_ON(sizeof(old) != 8);						\
-	old.lock_count = READ_ONCE(lockref->lock_count);			\
-	while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) {	\
-		struct lockref new = old;					\
-		CODE								\
-		if (likely(try_cmpxchg64_relaxed(&lockref->lock_count,		\
-						 &old.lock_count,		\
-						 new.lock_count))) {		\
-			SUCCESS;						\
-		}								\
-		if (!--retry)							\
-			break;							\
-	}									\
+#define CMPXCHG_LOOP(CODE, SUCCESS) do {						\
+	struct lockref old;								\
+	BUILD_BUG_ON(sizeof(old) != 8);							\
+	old.lock_count = READ_ONCE(lockref->lock_count);				\
+	for (;;) {									\
+		struct lockref new = old;						\
+											\
+		if (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) {	\
+			CODE								\
+			if (likely(try_cmpxchg64_relaxed(&lockref->lock_count,		\
+							 &old.lock_count,		\
+							 new.lock_count))) {		\
+				SUCCESS;						\
+			}								\
+		} else {								\
+			cpu_relax();							\
+			old.lock_count = READ_ONCE(lockref->lock_count);		\
+		}									\
+	}										\
 } while (0)
 
 #else
@@ -46,10 +49,8 @@ void lockref_get(struct lockref *lockref)
 	,
 		return;
 	);
-
-	spin_lock(&lockref->lock);
-	lockref->count++;
-	spin_unlock(&lockref->lock);
+	/* should never get here */
+	WARN_ON_ONCE(1);
 }
 EXPORT_SYMBOL(lockref_get);
 
@@ -60,8 +61,6 @@ EXPORT_SYMBOL(lockref_get);
  */
 int lockref_get_not_zero(struct lockref *lockref)
 {
-	int retval;
-
 	CMPXCHG_LOOP(
 		new.count++;
 		if (old.count <= 0)
@@ -69,15 +68,9 @@ int lockref_get_not_zero(struct lockref *lockref)
 	,
 		return 1;
 	);
-
-	spin_lock(&lockref->lock);
-	retval = 0;
-	if (lockref->count > 0) {
-		lockref->count++;
-		retval = 1;
-	}
-	spin_unlock(&lockref->lock);
-	return retval;
+	/* should never get here */
+	WARN_ON_ONCE(1);
+	return -1;
 }
 EXPORT_SYMBOL(lockref_get_not_zero);
 
@@ -88,8 +81,6 @@ EXPORT_SYMBOL(lockref_get_not_zero);
  */
 int lockref_put_not_zero(struct lockref *lockref)
 {
-	int retval;
-
 	CMPXCHG_LOOP(
 		new.count--;
 		if (old.count <= 1)
@@ -97,15 +88,9 @@ int lockref_put_not_zero(struct lockref *lockref)
 	,
 		return 1;
 	);
-
-	spin_lock(&lockref->lock);
-	retval = 0;
-	if (lockref->count > 1) {
-		lockref->count--;
-		retval = 1;
-	}
-	spin_unlock(&lockref->lock);
-	return retval;
+	/* should never get here */
+	WARN_ON_ONCE(1);
+	return -1;
 }
 EXPORT_SYMBOL(lockref_put_not_zero);
 
@@ -125,6 +110,8 @@ int lockref_put_return(struct lockref *lockref)
 	,
 		return new.count;
 	);
+	/* should never get here */
+	WARN_ON_ONCE(1);
 	return -1;
 }
 EXPORT_SYMBOL(lockref_put_return);
@@ -171,8 +158,6 @@ EXPORT_SYMBOL(lockref_mark_dead);
  */
 int lockref_get_not_dead(struct lockref *lockref)
 {
-	int retval;
-
 	CMPXCHG_LOOP(
 		new.count++;
 		if (old.count < 0)
@@ -180,14 +165,8 @@ int lockref_get_not_dead(struct lockref *lockref)
 	,
 		return 1;
 	);
-
-	spin_lock(&lockref->lock);
-	retval = 0;
-	if (lockref->count >= 0) {
-		lockref->count++;
-		retval = 1;
-	}
-	spin_unlock(&lockref->lock);
-	return retval;
+	/* should never get here */
+	WARN_ON_ONCE(1);
+	return -1;
 }
 EXPORT_SYMBOL(lockref_get_not_dead);

From patchwork Fri Aug 2 21:45:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 13752042
From: Jeff Layton
Date: Fri, 02 Aug 2024 17:45:05 -0400
Subject: [PATCH RFC 4/4] fs: try an opportunistic lookup for O_CREAT opens too
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20240802-openfast-v1-4-a1cff2a33063@kernel.org>
References: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
In-Reply-To: <20240802-openfast-v1-0-a1cff2a33063@kernel.org>
To: Alexander Viro, Christian Brauner, Jan Kara, Andrew Morton
Cc: Josef Bacik, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, Jeff Layton

Today, when opening a file we'll typically do a fast lookup, but if
O_CREAT is set, the kernel always takes the exclusive inode lock. I'm
sure this was done with the expectation that O_CREAT being set means that
we expect to do the create, but that's often not the case. Many programs
set O_CREAT even in scenarios where the file already exists.

This patch rearranges the pathwalk-for-open code to also attempt a
lookup_fast() in the O_CREAT case. Have the code always do a
lookup_fast() (unless O_EXCL is set), and return that without taking the
inode_lock when a positive dentry is found in the O_CREAT codepath.

Signed-off-by: Jeff Layton
---
 fs/namei.c | 43 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index b9bdb8e6214a..1793ed090314 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3538,7 +3538,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 	struct dentry *dir = nd->path.dentry;
 	int open_flag = op->open_flag;
 	bool got_write = false;
-	struct dentry *dentry;
+	struct dentry *dentry = NULL;
 	const char *res;
 
 	nd->flags |= op->intent;
@@ -3549,28 +3549,57 @@ static const char *open_last_lookups(struct nameidata *nd,
 		return handle_dots(nd, nd->last_type);
 	}
 
-	if (!(open_flag & O_CREAT)) {
-		if (nd->last.name[nd->last.len])
+	/*
+	 * We _can_ be in RCU mode here. For everything but O_EXCL case, do a
+	 * fast lookup for the dentry first. For O_CREAT case, we are only
+	 * interested in positive dentries. If nothing suitable is found,
+	 * fall back to locked codepath.
+	 */
+	if ((open_flag & (O_CREAT | O_EXCL)) != (O_CREAT | O_EXCL)) {
+		/* Trailing slashes? */
+		if (unlikely(nd->last.name[nd->last.len]))
 			nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
-		/* we _can_ be in RCU mode here */
+
 		dentry = lookup_fast(nd);
 		if (IS_ERR(dentry))
 			return ERR_CAST(dentry);
+	}
+
+	if (!(open_flag & O_CREAT)) {
 		if (likely(dentry))
 			goto finish_lookup;
 
 		if (WARN_ON_ONCE(nd->flags & LOOKUP_RCU))
 			return ERR_PTR(-ECHILD);
 	} else {
-		/* create side of things */
+		/* If negative dentry was found earlier,
+		 * discard it as we'll need to use the slow path anyway.
+		 */
 		if (nd->flags & LOOKUP_RCU) {
-			if (!try_to_unlazy(nd))
+			bool unlazied;
+
+			/* discard negative dentry if one was found */
+			if (dentry && !dentry->d_inode)
+				dentry = NULL;
+
+			unlazied = dentry ? try_to_unlazy_next(nd, dentry) :
+					    try_to_unlazy(nd);
+			if (!unlazied)
 				return ERR_PTR(-ECHILD);
+		} else if (dentry && !dentry->d_inode) {
+			/* discard negative dentry if one was found */
+			dput(dentry);
+			dentry = NULL;
 		}
 		audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
+
 		/* trailing slashes? */
-		if (unlikely(nd->last.name[nd->last.len]))
+		if (unlikely(nd->last.name[nd->last.len])) {
+			dput(dentry);
 			return ERR_PTR(-EISDIR);
+		}
+		if (dentry)
+			goto finish_lookup;
 	}
 
 	if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {