From: David Howells
To: Christian Brauner, Steve French, Matthew Wilcox
Cc: David Howells, Jeff Layton, Gao Xiang, Dominique Martinet,
    Marc Dionne, Paulo Alcantara, Shyam Prasad N, Tom Talpey,
    Eric Van Hensbergen, Ilya Dryomov, netfs@lists.linux.dev,
    linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
    linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org,
    v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 13/33] afs: Don't use mutex for I/O operation lock
Date: Fri, 8 Nov 2024 17:32:14 +0000
Message-ID: <20241108173236.1382366-14-dhowells@redhat.com>
In-Reply-To: <20241108173236.1382366-1-dhowells@redhat.com>
References: <20241108173236.1382366-1-dhowells@redhat.com>
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13868571

Don't use the standard mutex for the I/O operation lock, but rather
implement our own as the standard mutex must be released in the same
thread as locked it.
This is a problem when it comes to doing async FetchData where the lock
will be dropped from the workqueue that processed the incoming data and
not from the issuing thread.

Signed-off-by: David Howells
cc: Marc Dionne
cc: linux-afs@lists.infradead.org
---
 fs/afs/fs_operation.c | 111 +++++++++++++++++++++++++++++++++++++++---
 fs/afs/internal.h     |   3 +-
 fs/afs/super.c        |   2 +-
 3 files changed, 108 insertions(+), 8 deletions(-)

diff --git a/fs/afs/fs_operation.c b/fs/afs/fs_operation.c
index 428721bbe4f6..8488ff8183fa 100644
--- a/fs/afs/fs_operation.c
+++ b/fs/afs/fs_operation.c
@@ -49,6 +49,105 @@ struct afs_operation *afs_alloc_operation(struct key *key, struct afs_volume *vo
 	return op;
 }
 
+struct afs_io_locker {
+	struct list_head	link;
+	struct task_struct	*task;
+	unsigned long		have_lock;
+};
+
+/*
+ * Unlock the I/O lock on a vnode.
+ */
+static void afs_unlock_for_io(struct afs_vnode *vnode)
+{
+	struct afs_io_locker *locker;
+
+	spin_lock(&vnode->lock);
+	locker = list_first_entry_or_null(&vnode->io_lock_waiters,
+					  struct afs_io_locker, link);
+	if (locker) {
+		list_del(&locker->link);
+		smp_store_release(&locker->have_lock, 1);
+		smp_mb__after_atomic(); /* Store have_lock before task state */
+		wake_up_process(locker->task);
+	} else {
+		clear_bit(AFS_VNODE_IO_LOCK, &vnode->flags);
+	}
+	spin_unlock(&vnode->lock);
+}
+
+/*
+ * Lock the I/O lock on a vnode uninterruptibly. We can't use an ordinary
+ * mutex as lockdep will complain if we unlock it in the wrong thread.
+ */
+static void afs_lock_for_io(struct afs_vnode *vnode)
+{
+	struct afs_io_locker myself = { .task = current, };
+
+	spin_lock(&vnode->lock);
+
+	if (!test_and_set_bit(AFS_VNODE_IO_LOCK, &vnode->flags)) {
+		spin_unlock(&vnode->lock);
+		return;
+	}
+
+	list_add_tail(&myself.link, &vnode->io_lock_waiters);
+	spin_unlock(&vnode->lock);
+
+	for (;;) {
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		if (smp_load_acquire(&myself.have_lock))
+			break;
+		schedule();
+	}
+	__set_current_state(TASK_RUNNING);
+}
+
+/*
+ * Lock the I/O lock on a vnode interruptibly. We can't use an ordinary mutex
+ * as lockdep will complain if we unlock it in the wrong thread.
+ */
+static int afs_lock_for_io_interruptible(struct afs_vnode *vnode)
+{
+	struct afs_io_locker myself = { .task = current, };
+	int ret = 0;
+
+	spin_lock(&vnode->lock);
+
+	if (!test_and_set_bit(AFS_VNODE_IO_LOCK, &vnode->flags)) {
+		spin_unlock(&vnode->lock);
+		return 0;
+	}
+
+	list_add_tail(&myself.link, &vnode->io_lock_waiters);
+	spin_unlock(&vnode->lock);
+
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		if (smp_load_acquire(&myself.have_lock) ||
+		    signal_pending(current))
+			break;
+		schedule();
+	}
+	__set_current_state(TASK_RUNNING);
+
+	/* If we got a signal, try to transfer the lock onto the next
+	 * waiter.
+	 */
+	if (unlikely(signal_pending(current))) {
+		spin_lock(&vnode->lock);
+		if (myself.have_lock) {
+			spin_unlock(&vnode->lock);
+			afs_unlock_for_io(vnode);
+		} else {
+			list_del(&myself.link);
+			spin_unlock(&vnode->lock);
+		}
+		ret = -ERESTARTSYS;
+	}
+	return ret;
+}
+
 /*
  * Lock the vnode(s) being operated upon.
  */
@@ -60,7 +159,7 @@ static bool afs_get_io_locks(struct afs_operation *op)
 	_enter("");
 
 	if (op->flags & AFS_OPERATION_UNINTR) {
-		mutex_lock(&vnode->io_lock);
+		afs_lock_for_io(vnode);
 		op->flags |= AFS_OPERATION_LOCK_0;
 		_leave(" = t [1]");
 		return true;
@@ -72,7 +171,7 @@ static bool afs_get_io_locks(struct afs_operation *op)
 	if (vnode2 > vnode)
 		swap(vnode, vnode2);
 
-	if (mutex_lock_interruptible(&vnode->io_lock) < 0) {
+	if (afs_lock_for_io_interruptible(vnode) < 0) {
 		afs_op_set_error(op, -ERESTARTSYS);
 		op->flags |= AFS_OPERATION_STOP;
 		_leave(" = f [I 0]");
@@ -81,10 +180,10 @@ static bool afs_get_io_locks(struct afs_operation *op)
 	op->flags |= AFS_OPERATION_LOCK_0;
 
 	if (vnode2) {
-		if (mutex_lock_interruptible_nested(&vnode2->io_lock, 1) < 0) {
+		if (afs_lock_for_io_interruptible(vnode2) < 0) {
 			afs_op_set_error(op, -ERESTARTSYS);
 			op->flags |= AFS_OPERATION_STOP;
-			mutex_unlock(&vnode->io_lock);
+			afs_unlock_for_io(vnode);
 			op->flags &= ~AFS_OPERATION_LOCK_0;
 			_leave(" = f [I 1]");
 			return false;
@@ -104,9 +203,9 @@ static void afs_drop_io_locks(struct afs_operation *op)
 	_enter("");
 
 	if (op->flags & AFS_OPERATION_LOCK_1)
-		mutex_unlock(&vnode2->io_lock);
+		afs_unlock_for_io(vnode2);
 	if (op->flags & AFS_OPERATION_LOCK_0)
-		mutex_unlock(&vnode->io_lock);
+		afs_unlock_for_io(vnode);
 }
 
 static void afs_prepare_vnode(struct afs_operation *op, struct afs_vnode_param *vp,
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index c9d620175e80..07b8f7083e73 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -702,13 +702,14 @@ struct afs_vnode {
 	struct afs_file_status	status;		/* AFS status info for this file */
 	afs_dataversion_t	invalid_before;	/* Child dentries are invalid before this */
 	struct afs_permits __rcu *permit_cache;	/* cache of permits so far obtained */
-	struct mutex		io_lock;	/* Lock for serialising I/O on this mutex */
+	struct list_head	io_lock_waiters; /* Threads waiting for the I/O lock */
 	struct rw_semaphore	validate_lock;	/* lock for validating this vnode */
 	struct rw_semaphore	rmdir_lock;	/* Lock for rmdir vs sillyrename */
 	struct key		*silly_key;	/* Silly rename key */
 	spinlock_t		wb_lock;	/* lock for wb_keys */
 	spinlock_t		lock;		/* waitqueue/flags lock */
 	unsigned long		flags;
+#define AFS_VNODE_IO_LOCK	0		/* Set if the I/O serialisation lock is held */
 #define AFS_VNODE_UNSET		1		/* set if vnode attributes not yet set */
 #define AFS_VNODE_DIR_VALID	2		/* Set if dir contents are valid */
 #define AFS_VNODE_ZAP_DATA	3		/* set if vnode's data should be invalidated */
diff --git a/fs/afs/super.c b/fs/afs/super.c
index f3ba1c3e72f5..7631302c1984 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -663,7 +663,7 @@ static void afs_i_init_once(void *_vnode)
 
 	memset(vnode, 0, sizeof(*vnode));
 	inode_init_once(&vnode->netfs.inode);
-	mutex_init(&vnode->io_lock);
+	INIT_LIST_HEAD(&vnode->io_lock_waiters);
 	init_rwsem(&vnode->validate_lock);
 	spin_lock_init(&vnode->wb_lock);
 	spin_lock_init(&vnode->lock);
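
For readers who want to experiment with the hand-off idea outside the kernel, the following is a rough userspace sketch, not part of the patch. It assumes POSIX threads, and every name in it (io_lock, io_waiter, io_lock_acquire, io_lock_release, io_lock_demo) is hypothetical. A pthread mutex stands in for vnode->lock, a bool for the AFS_VNODE_IO_LOCK bit, and a per-waiter condition variable for wake_up_process(). It only models the uninterruptible path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical analogue of struct afs_io_locker: one per blocked thread. */
struct io_waiter {
	struct io_waiter *next;
	pthread_cond_t wake;
	bool have_lock;
};

struct io_lock {
	pthread_mutex_t guard;		/* stands in for vnode->lock */
	bool held;			/* stands in for AFS_VNODE_IO_LOCK */
	struct io_waiter *head, *tail;	/* FIFO, like io_lock_waiters */
};

static void io_lock_init(struct io_lock *l)
{
	pthread_mutex_init(&l->guard, NULL);
	l->held = false;
	l->head = l->tail = NULL;
}

static void io_lock_acquire(struct io_lock *l)
{
	struct io_waiter me = { .next = NULL, .have_lock = false };

	pthread_mutex_lock(&l->guard);
	if (!l->held) {			/* uncontended: just take it */
		l->held = true;
		pthread_mutex_unlock(&l->guard);
		return;
	}
	/* Contended: queue ourselves and sleep until ownership is handed over. */
	pthread_cond_init(&me.wake, NULL);
	if (l->tail)
		l->tail->next = &me;
	else
		l->head = &me;
	l->tail = &me;
	while (!me.have_lock)
		pthread_cond_wait(&me.wake, &l->guard);
	pthread_mutex_unlock(&l->guard);
	pthread_cond_destroy(&me.wake);
}

/* May be called from any thread, not only the one that acquired the lock. */
static void io_lock_release(struct io_lock *l)
{
	struct io_waiter *w;

	pthread_mutex_lock(&l->guard);
	w = l->head;
	if (w) {			/* hand the lock on; 'held' stays set */
		l->head = w->next;
		if (!l->head)
			l->tail = NULL;
		w->have_lock = true;
		pthread_cond_signal(&w->wake);
	} else {
		l->held = false;
	}
	pthread_mutex_unlock(&l->guard);
}

static void *release_from_other_thread(void *arg)
{
	io_lock_release(arg);
	return NULL;
}

/* Returns 0 if a lock taken in one thread can be dropped by another. */
static int io_lock_demo(void)
{
	struct io_lock l;
	pthread_t worker;

	io_lock_init(&l);
	io_lock_acquire(&l);		/* the "issuing thread" takes the lock */
	assert(l.held);
	if (pthread_create(&worker, NULL, release_from_other_thread, &l))
		return -1;
	pthread_join(worker, NULL);	/* the "workqueue" has dropped it */
	io_lock_acquire(&l);		/* would block forever if still held */
	io_lock_release(&l);
	return l.held ? -1 : 0;
}
```

Calling io_lock_demo() takes the lock in one thread and releases it from another, which is the pattern the commit message describes for async FetchData. Build with -pthread where the toolchain requires it.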