From patchwork Mon Feb 24 23:41:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989084 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BC65B20C479 for ; Mon, 24 Feb 2025 23:42:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440533; cv=none; b=stgoypnA/PfoIzkpNyzsVVnNvVZJkLlrkji1vnvlgtSrJST0KTwxdRNxktHymRLcCCfutVpC8vUouACaa+Fcn1butcYI1ZLtQ4GukWp2yKJRbnwC/Zg5ISj87vWqcon30ubKx7MF5xTna1o6GZF+g1YaxB04Rkjy8r0NaygHwAY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440533; c=relaxed/simple; bh=LdM6YWJorqkUh0RrJpizFWheSHgWKS3j4SIexQeKkT8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LYD3z+VnqESbO6/I3xxrJl0ph/zAMdO6wFjhw808zzWNBUNBQiNHzUysfvHpGHQdi48Um2wFplNWXrLy3mUZIpalwnAdnSFkMdYFT8BdPy/HL5UHTWKUhOGiVGXGaBP8rH1G8/Q872t1lnYPVakZDmYAUajm4YQSYYttTt4fqew= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Xw25v1sL; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Xw25v1sL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440530; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TPX5X0FLbu8i/pSR/GZDg1fvDLeMt1ooWaCQcD/mFIo=; b=Xw25v1sL0KgEVjF8AdM5sJwt3INc7RZgAF8mhryYP88k/con3EzoCAoz/PvXMb2nEynoOk vX9n5d98D/Ro5tIGp8oUsiKaYWTsz3oE7D5mLsxTkc1d2BkO1MPk6KFMVT2Do1wCmKM3fc d5rA2kMTNwnBnZMk8urm25KEXpGYF5E= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-670-t6NQHXqbPRK9KrwrBhFdPw-1; Mon, 24 Feb 2025 18:42:06 -0500 X-MC-Unique: t6NQHXqbPRK9KrwrBhFdPw-1 X-Mimecast-MFC-AGG-ID: t6NQHXqbPRK9KrwrBhFdPw_1740440525 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id F02DE180036F; Mon, 24 Feb 2025 23:42:04 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DFA91180035E; Mon, 24 Feb 2025 23:42:01 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 01/15] rxrpc: rxperf: Fix missing decoding of terminal magic cookie Date: Mon, 24 Feb 2025 23:41:38 +0000 Message-ID: <20250224234154.2014840-2-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Patchwork-Delegate: kuba@kernel.org The rxperf RPCs seem to have a magic cookie at the end of the request that was failing to be taken account of by the unmarshalling of the request. Fix the rxperf code to expect this. Fixes: 75bfdbf2fca3 ("rxrpc: Implement an in-kernel rxperf server for testing purposes") Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- net/rxrpc/rxperf.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/net/rxrpc/rxperf.c b/net/rxrpc/rxperf.c index 7ef93407be83..e848a4777b8c 100644 --- a/net/rxrpc/rxperf.c +++ b/net/rxrpc/rxperf.c @@ -478,6 +478,18 @@ static int rxperf_deliver_request(struct rxperf_call *call) call->unmarshal++; fallthrough; case 2: + ret = rxperf_extract_data(call, true); + if (ret < 0) + return ret; + + /* Deal with the terminal magic cookie. */ + call->iov_len = 4; + call->kvec[0].iov_len = call->iov_len; + call->kvec[0].iov_base = call->tmp; + iov_iter_kvec(&call->iter, READ, call->kvec, 1, call->iov_len); + call->unmarshal++; + fallthrough; + case 3: ret = rxperf_extract_data(call, false); if (ret < 0) return ret; From patchwork Mon Feb 24 23:41:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989085 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29189207DEA for ; Mon, 24 Feb 2025 23:42:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440537; cv=none; b=HhmpDYYAhUTIlcesxG2khHVuHzd6SEzSF7vRsdBOASIBrAqWWhhI13i6e/uUKQ3Zue+rFqP6D3naGtHY8NxE1SGJjgM9N04b4EMDHE+upg27EK71DinkxPnIL5aw8K673HWb0P1PCGgQjykdfl198Et7VUhTJY64oQC1zeX1MjQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440537; c=relaxed/simple; bh=GAF4rTI1DsGws7r7GRla1/im4tYifATcIPq/nactC60=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Rt35uK51VY1uQNhJbTXPgm5wr7634c52H4fP2euiitqzrQQ/YmdfhEbTet/8k/v2glXUhn+MjWrOTfo0oBMSEIB3oQwuSBMJ3gV5IsDIv7KuYcokvt9i09eTNFbo5zG8AUk+Y7I1vqSXL1urijcWBfenu8cE4/iJG71julUeZoE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Li0nk5Qm; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Li0nk5Qm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440533; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rJysTY3l6efKg+rxbSyGDPWajwbgdCVI4+yZliqv/jI=; b=Li0nk5QmcHjIjMzsy/JkC2sgWhnNxG3qKjBsze8E+AkTmHjv1y249WNT4DR1Xx1m7e7E0j aKSvyaxJnBiRYC+/V3p+oDjswiEOUS5Ndx11laSII7eOBmoXspJD14Igpz1VQTQpa/y5lx vdrLMQnBqWuXlDGaxsP0TrHF9HSrLsQ= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-197-lKJIW5ZiOr-4stc6ovGtfA-1; Mon, 24 Feb 2025 18:42:10 -0500 X-MC-Unique: lKJIW5ZiOr-4stc6ovGtfA-1 X-Mimecast-MFC-AGG-ID: lKJIW5ZiOr-4stc6ovGtfA_1740440529 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7ED791800872; Mon, 24 Feb 2025 23:42:09 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 756B8180035F; Mon, 24 Feb 2025 23:42:06 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 02/15] rxrpc: peer->mtu_lock is redundant Date: Mon, 24 Feb 2025 23:41:39 +0000 Message-ID: <20250224234154.2014840-3-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 X-Patchwork-Delegate: kuba@kernel.org The peer->mtu_lock is only used to lock around writes to peer->max_data - and nothing else; further, all such writes take place in the I/O thread and the lock is only ever write-locked and never read-locked. In a couple of places, the write_seqcount_begin() is wrapped in preempt_disable/enable(), but not in all places. This can cause lockdep to throw a complaint: WARNING: CPU: 0 PID: 1549 at include/linux/seqlock.h:221 rxrpc_input_ack_trailer+0x305/0x430 ... RIP: 0010:rxrpc_input_ack_trailer+0x305/0x430 Fix this by just getting rid of the lock. Fixes: eeaedc5449d9 ("rxrpc: Implement path-MTU probing using padded PING ACKs (RFC8899)") Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- net/rxrpc/ar-internal.h | 1 - net/rxrpc/input.c | 2 -- net/rxrpc/peer_event.c | 9 +-------- net/rxrpc/peer_object.c | 1 - 4 files changed, 1 insertion(+), 12 deletions(-) diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 5e740c486203..a64a0cab1bf7 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -360,7 +360,6 @@ struct rxrpc_peer { u8 pmtud_jumbo; /* Max jumbo packets for the MTU */ bool ackr_adv_pmtud; /* T if the peer advertises path-MTU */ unsigned int ackr_max_data; /* Maximum data advertised by peer */ - seqcount_t mtu_lock; /* Lockless MTU access management */ unsigned int if_mtu; /* Local interface MTU (- hdrsize) for this peer */ unsigned int max_data; /* Maximum packet data capacity for this peer */ unsigned short hdrsize; /* header size (IP + UDP + RxRPC) */ diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 9047ba13bd31..24aceb183c2c 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -810,9 +810,7 @@ static void rxrpc_input_ack_trailer(struct rxrpc_call *call, struct sk_buff *skb if (max_mtu < peer->max_data) { trace_rxrpc_pmtud_reduce(peer, sp->hdr.serial, max_mtu, rxrpc_pmtud_reduce_ack); - write_seqcount_begin(&peer->mtu_lock); peer->max_data = max_mtu; - write_seqcount_end(&peer->mtu_lock); } max_data = umin(max_mtu, peer->max_data); diff --git a/net/rxrpc/peer_event.c b/net/rxrpc/peer_event.c index bc283da9ee40..7f4729234957 100644 --- a/net/rxrpc/peer_event.c +++ b/net/rxrpc/peer_event.c @@ -130,9 +130,7 @@ static void rxrpc_adjust_mtu(struct rxrpc_peer *peer, unsigned int mtu) peer->pmtud_bad = max_data + 1; trace_rxrpc_pmtud_reduce(peer, 0, max_data, rxrpc_pmtud_reduce_icmp); - write_seqcount_begin(&peer->mtu_lock); peer->max_data = max_data; - write_seqcount_end(&peer->mtu_lock); } } @@ -408,13 +406,8 @@ void rxrpc_input_probe_for_pmtud(struct rxrpc_connection *conn, rxrpc_serial_t a } max_data = umin(max_data, peer->ackr_max_data); - if (max_data != peer->max_data) { - preempt_disable(); - write_seqcount_begin(&peer->mtu_lock); + if (max_data != peer->max_data) peer->max_data = max_data; - write_seqcount_end(&peer->mtu_lock); - preempt_enable(); - } jumbo = max_data + sizeof(struct rxrpc_jumbo_header); jumbo /= RXRPC_JUMBO_SUBPKTLEN; diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 0fcc87f0409f..2ddc8ed68742 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -235,7 +235,6 @@ struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp, peer->service_conns = RB_ROOT; seqlock_init(&peer->service_conn_lock); spin_lock_init(&peer->lock); - seqcount_init(&peer->mtu_lock); peer->debug_id = atomic_inc_return(&rxrpc_debug_id); peer->recent_srtt_us = UINT_MAX; peer->cong_ssthresh = RXRPC_TX_MAX_WINDOW; From patchwork Mon Feb 24 23:41:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989086 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D81D20E024 for ; Mon, 24 Feb 2025 23:42:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440541; cv=none; b=r505GIfSHUCEA3WWL1SXIUQ9ctgzYRtfRFHzQMtkJVqpTZXiH4ebANIt1ElE1fWWTPHGvcxM82HmmP+FP1v7G8G4+UDWpLuPbw9NCW2MemFYc0hejrsNym0UW+cWA+oM1TNiL4wC2K88JmnFZKjmJJluv6J67+jro4mOI05nHTo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440541; c=relaxed/simple; bh=eZXHQ2G41uUOKVDh8j9cpZv9XKLuoY45JwNn5kAktpA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rx0/Vxnt4OTnQm+cnDOGNQ8n/xe/KdtL7FDEqhhNbyreXTiLFyv3r/x4Mwj3RX3OuRLCA3ZDb+XWoTLHnh7uQRDbsJFynGIKBcA6e+TqWjoFywiqhj3x5+uG0R3WbHeUvsOki5BlyoQBtvN4YRblZ1YC36mhBC96xwqJE42eGCw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Q7i4e7cS; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Q7i4e7cS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440539; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WaiYoCGsFpAWLKJ3d+qpfykW2KxLfKQHY0MVDwtgSTg=; b=Q7i4e7cSc573W5JSqG2CSF7v8knGM8+avZQu1gVFP5xMBneCKVvh5o3MTrQFj+3FCEHjAt qaLEuC0ure7yjHAxvlgSttDpl31+YrdCgYLwsAN/tjnoZFnk06wYpSGBO8/4Yf9hnOd+Nk 7OnHp4EN21hb7BFi1ZfRK7FpHAwkEvk= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-101-DIjDmn3aNqWTIY8mq3OH-Q-1; Mon, 24 Feb 2025 18:42:15 -0500 X-MC-Unique: DIjDmn3aNqWTIY8mq3OH-Q-1 X-Mimecast-MFC-AGG-ID: DIjDmn3aNqWTIY8mq3OH-Q_1740440534 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1BE7618EB2C3; Mon, 24 Feb 2025 23:42:14 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 005AD19560A3; Mon, 24 Feb 2025 23:42:10 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 03/15] rxrpc: Fix locking issues with the peer record hash Date: Mon, 24 Feb 2025 23:41:40 +0000 Message-ID: <20250224234154.2014840-4-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 X-Patchwork-Delegate: kuba@kernel.org rxrpc_new_incoming_peer() can't use spin_lock_bh() whilst its caller has interrupts disabled. WARNING: CPU: 0 PID: 1550 at kernel/softirq.c:369 __local_bh_enable_ip+0x46/0xd0 ... Call Trace: rxrpc_alloc_incoming_call+0x1b0/0x400 rxrpc_new_incoming_call+0x1dd/0x5e0 rxrpc_input_packet+0x84a/0x920 rxrpc_io_thread+0x40d/0xb40 kthread+0x2ec/0x300 ret_from_fork+0x24/0x40 ret_from_fork_asm+0x1a/0x30 irq event stamp: 1811 hardirqs last enabled at (1809): _raw_spin_unlock_irq+0x24/0x50 hardirqs last disabled at (1810): _raw_read_lock_irq+0x17/0x70 softirqs last enabled at (1182): handle_softirqs+0x3ee/0x430 softirqs last disabled at (1811): rxrpc_new_incoming_peer+0x56/0x120 Fix this by using a plain spin_lock() instead. IRQs are held, so softirqs can't happen. Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread") Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- net/rxrpc/peer_object.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 2ddc8ed68742..56e09d161a97 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -324,10 +324,10 @@ void rxrpc_new_incoming_peer(struct rxrpc_local *local, struct rxrpc_peer *peer) hash_key = rxrpc_peer_hash_key(local, &peer->srx); rxrpc_init_peer(local, peer, hash_key); - spin_lock_bh(&rxnet->peer_hash_lock); + spin_lock(&rxnet->peer_hash_lock); hash_add_rcu(rxnet->peer_hash, &peer->hash_link, hash_key); list_add_tail(&peer->keepalive_link, &rxnet->peer_keepalive_new); - spin_unlock_bh(&rxnet->peer_hash_lock); + spin_unlock(&rxnet->peer_hash_lock); } /* From patchwork Mon Feb 24 23:41:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989087 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B10C20E6E8 for ; Mon, 24 Feb 2025 23:42:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440547; cv=none; b=BnnBuZeeRQOYJQfK2bxKsQd4KzU48jZ7TnlCpaKXxjhGYb/TIODC6m0HHaa4I+F5ZqsPvFUEf9r0wIl7YdSAmEIohlZICTdemdl4KVtPFdDp+LYbBP4FoOZjMLqRh8um7ENPxZ2trkCngUtQ+sxspWuHaIlH2gcIq6ftnNFz8cc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440547; c=relaxed/simple; bh=wtEOdGs2YSi6iCFHbcNdsyGHm5cwYWHaRiJS1CB7fE4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sydjx1nGxOPpMb6dBFTGiYXWDmKC4wHxTewrJBqkwUvVhNszYuorkPs28uJrX48bMl3oV4yUBtnGPPjUfPNmDr7850dqfezlm6NNbz90GyIBg5VPxn4SP+OCA56OY6utj19cKhvjZ6HoSDjgNr0KQwHHr8l+Ka0nj9nnFQtkof4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=BllaTQef; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="BllaTQef" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440545; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qtAxVvogmmtKwzUjDbJVTd9fTHTMk8w5vttE1KliMWs=; b=BllaTQefLEk91OnBag8JBHonml7klJ0/qVWrtpGKzBIJnpTbhGx1Dt4Tw5/CcTTyYzlycW nFycmT4R7rq6xtNeAaNABlNOJEiWnbCeapEAVZoaAu5BEFfVPB3/+OFkgOutOwrb80ZwpZ xqhTX3oko7LKK13F9rZz+IXhz6RAryk= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-410-wWc2qRQ2PcmcnhPUDd99_w-1; Mon, 24 Feb 2025 18:42:20 -0500 X-MC-Unique: wWc2qRQ2PcmcnhPUDd99_w-1 X-Mimecast-MFC-AGG-ID: wWc2qRQ2PcmcnhPUDd99_w_1740440539 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4E1B3180087D; Mon, 24 Feb 2025 23:42:19 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 941941800944; Mon, 24 Feb 2025 23:42:15 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 04/15] afs: Fix the server_list to unuse a displaced server rather than putting it Date: Mon, 24 Feb 2025 23:41:41 +0000 Message-ID: <20250224234154.2014840-5-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Patchwork-Delegate: kuba@kernel.org When allocating and building an afs_server_list struct object from a VLDB record, we look up each server address to get the server record for it - but a server may have more than one entry in the record and we discard the duplicate pointers. Currently, however, when we discard, we only put a server record, not unuse it - but the lookup got as an active-user count. The active-user count on an afs_server_list object determines its lifetime whereas the refcount keeps the memory backing it around. Failing to reduce the active-user counter prevents the record from being cleaned up and can lead to multiple copied being seen - and pointing to deleted afs_cell objects and other such things. Fix this by switching the incorrect 'put' to an 'unuse' instead. Without this, occasionally, a dead server record can be seen in /proc/net/afs/servers and list corruption may be observed: list_del corruption. prev->next should be ffff888102423e40, but was 0000000000000000. (prev=ffff88810140cd38) Fixes: 977e5f8ed0ab ("afs: Split the usage count on struct afs_server") Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/server_list.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c index 7e7e567a7f8a..d20cd902ef94 100644 --- a/fs/afs/server_list.c +++ b/fs/afs/server_list.c @@ -97,8 +97,8 @@ struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, break; if (j < slist->nr_servers) { if (slist->servers[j].server == server) { - afs_put_server(volume->cell->net, server, - afs_server_trace_put_slist_isort); + afs_unuse_server(volume->cell->net, server, + afs_server_trace_put_slist_isort); continue; } From patchwork Mon Feb 24 23:41:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989088 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C7F1D20CCD3 for ; Mon, 24 Feb 2025 23:42:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440551; cv=none; b=JBPQ4Ihf28RuovvKeRN+smrJHyIGZTvemUuBEPF6LfSYepKKUk5ldZEm2IhXW1GyBcWlmiCA20kO3VYSpJK40qjbxF0UES67UKc5Pt2iiLE4wNXifLzZ/QbX8T7VQ+FRlm1dccCBd4Lq2RT8PygxwOdPZt/JJYyQ0Vohn5hOZ8w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440551; c=relaxed/simple; bh=U5l0j4YtFyzUAgf2XpWLt95eq0hDVZnNbYuPslcwWvI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WOegku35bondXFjllljgo2pVb5IytHVGJ8rDYaB7i4DCi80/NKVvKWAzB39v0oduXj75C1JqRf08yxey19JOfFJHeouyJ4YNT1QexvwMGeHoQU+uZWnFwXLWwGJhRCNKU4j3cGda4lM1xsr6mGtKupDh/qaECHWvWVUNRJqtMtU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=PSR0VC/A; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="PSR0VC/A" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440549; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2sOyOEyWoaoUiEdZIH8OlV8/GrITSd01ztjRDINYAAk=; b=PSR0VC/ASXNEnoYbMS17ImEfi6Fo4GRrUEYQWgF5o7yrfph/0ZU8KlYsiPtUIwhbIRG9xt 3o6zWQGhuyM9HP3vDU0Ng9HHyUkcIV0LsSEeM255Sk2gbLvifh1+LkXh0nhe3b82UaxQ2U qIOxLxu0NaJ2LH5B7PtGT726TJnZAys= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-683-ZPhTii-dMiCqLtsQbcNkQA-1; Mon, 24 Feb 2025 18:42:25 -0500 X-MC-Unique: ZPhTii-dMiCqLtsQbcNkQA-1 X-Mimecast-MFC-AGG-ID: ZPhTii-dMiCqLtsQbcNkQA_1740440544 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 18C1C19783B4; Mon, 24 Feb 2025 23:42:24 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 3613C19560AA; Mon, 24 Feb 2025 23:42:21 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 05/15] afs: Give an afs_server object a ref on the afs_cell object it points to Date: Mon, 24 Feb 2025 23:41:42 +0000 Message-ID: <20250224234154.2014840-6-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 X-Patchwork-Delegate: kuba@kernel.org Give an afs_server object a ref on the afs_cell object it points to so that the cell doesn't get deleted before the server record. Whilst this is circular (cell -> vol -> server_list -> server -> cell), the ref only pins the memory, not the lifetime as that's controlled by the activity counter. When the volume's activity counter reaches 0, it detaches from the cell and discards its server list; when a cell's activity counter reaches 0, it discards its root volume. At that point, the circularity is broken. Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/server.c | 3 +++ include/trace/events/afs.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/fs/afs/server.c b/fs/afs/server.c index 038f9d0ae3af..4504e16b458c 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -163,6 +163,8 @@ static struct afs_server *afs_install_server(struct afs_cell *cell, rb_insert_color(&server->uuid_rb, &net->fs_servers); hlist_add_head_rcu(&server->proc_link, &net->fs_proc); + afs_get_cell(cell, afs_cell_trace_get_server); + added_dup: write_seqlock(&net->fs_addr_lock); estate = rcu_dereference_protected(server->endpoint_state, @@ -442,6 +444,7 @@ static void afs_server_rcu(struct rcu_head *rcu) atomic_read(&server->active), afs_server_trace_free); afs_put_endpoint_state(rcu_access_pointer(server->endpoint_state), afs_estate_trace_put_server); + afs_put_cell(server->cell, afs_cell_trace_put_server); kfree(server); } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index b0db89058c91..958a2460330c 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -174,6 +174,7 @@ enum yfs_cm_operation { EM(afs_cell_trace_get_queue_dns, "GET q-dns ") \ EM(afs_cell_trace_get_queue_manage, "GET q-mng ") \ EM(afs_cell_trace_get_queue_new, "GET q-new ") \ + EM(afs_cell_trace_get_server, "GET server") \ EM(afs_cell_trace_get_vol, "GET vol ") \ EM(afs_cell_trace_insert, "INSERT ") \ EM(afs_cell_trace_manage, "MANAGE ") \ @@ -182,6 +183,7 @@ enum yfs_cm_operation { EM(afs_cell_trace_put_destroy, "PUT destry") \ EM(afs_cell_trace_put_queue_work, "PUT q-work") \ EM(afs_cell_trace_put_queue_fail, "PUT q-fail") \ + EM(afs_cell_trace_put_server, "PUT server") \ EM(afs_cell_trace_put_vol, "PUT vol ") \ EM(afs_cell_trace_see_source, "SEE source") \ EM(afs_cell_trace_see_ws, "SEE ws ") \ From patchwork Mon Feb 24 23:41:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989089 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2EBDB20F079 for ; Mon, 24 Feb 2025 23:42:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440555; cv=none; b=MuN+seAzHKcesm9vAhPd7X9ebAmPbhGNfTb8Y7/JxkO6UMrsSZn+srkO29v4enPf+0D9PJyHN+p9W0ZaND8wQqK/Xz0aEJ4NLdQMm7OUpcWGy8AHfuQaC5IObC5jCZD1c/J8qvYt9ED9GXBgBqTOB4oIFdfJ/JoPlgB6ljpXtN8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440555; c=relaxed/simple; bh=GdbljZFNj7POPRFhKVaSUpADd+7BpiTUNmm49Xp9Ki4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AoqP0MTkD3HZKfSuqJNdJObyC1eEhcBbPHBBB20amEaRP33U0GX4VFbqOcZCVhnLhbPkNsRtvXFYaOEFRBmYLfZhEOvn5Lgqpr5NqxBL8n6FnOwMDmUx27QInN0niJFjRJri3WJXF5rjen2OP77RA9P0ZZessD0mUxItdIaq6mo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=PylwmiKG; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="PylwmiKG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440553; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Gv/5i8PhIxgYjeghwWkBeRPJ7ZuuG8uq4ln7l4W9+kw=; b=PylwmiKGSBUA/1PnUp67lbI9jAw8mKo/cRpBUsHJdcgurzTai+HBM/LNudnz3bKWnjCIKo 0V8urV02rUb9jT++55rc3uV0DEQgKcj6Sbk9y73/iYMM0AF5zPgT2G67wO/gNZ+mgCwDlT kxnrY5dS6N9CKBKFmk25Yv9ypjBekKQ= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-518-oGPOxP_TMyy4TZCd_9iL-w-1; Mon, 24 Feb 2025 18:42:29 -0500 X-MC-Unique: oGPOxP_TMyy4TZCd_9iL-w-1 X-Mimecast-MFC-AGG-ID: oGPOxP_TMyy4TZCd_9iL-w_1740440548 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 503BF1979057; Mon, 24 Feb 2025 23:42:28 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8DBAC300018D; Mon, 24 Feb 2025 23:42:25 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 06/15] afs: Remove the "autocell" mount option Date: Mon, 24 Feb 2025 23:41:43 +0000 Message-ID: <20250224234154.2014840-7-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 X-Patchwork-Delegate: kuba@kernel.org Remove the "autocell" mount option. It was an attempt to do automounting of arbitrary cells based on what the user looked up but within the root directory of a mounted volume. This isn't really the right thing to do, and using the "dyn" mount option to get the dynamic root is the right way to do it. The kafs-client package uses "-o dyn" when mounting /afs, so it should be safe to drop "-o autocell". Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/dir.c | 5 ++--- fs/afs/dynroot.c | 5 +---- fs/afs/internal.h | 2 -- fs/afs/super.c | 5 ----- 4 files changed, 3 insertions(+), 14 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 02cbf38e1a77..9f62b8938350 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -1004,9 +1004,8 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry, afs_stat_v(dvnode, n_lookup); inode = afs_do_lookup(dir, dentry); if (inode == ERR_PTR(-ENOENT)) - inode = afs_try_auto_mntpt(dentry, dir); - - if (!IS_ERR_OR_NULL(inode)) + inode = NULL; + else if (!IS_ERR_OR_NULL(inode)) fid = AFS_FS_I(inode)->fid; _debug("splice %p", dentry->d_inode); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index d8bf52f77d93..bdc86dedd964 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -155,7 +155,7 @@ static int afs_probe_cell_name(struct dentry *dentry) * Try to auto mount the mountpoint with pseudo directory, if the autocell * operation is setted. */ -struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) +static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) { struct afs_vnode *vnode = AFS_FS_I(dir); struct inode *inode; @@ -164,9 +164,6 @@ struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) _enter("%p{%pd}, {%llx:%llu}", dentry, dentry, vnode->fid.vid, vnode->fid.vnode); - if (!test_bit(AFS_VNODE_AUTOCELL, &vnode->flags)) - goto out; - ret = afs_probe_cell_name(dentry); if (ret < 0) goto out; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 90f407774a9a..be9d5b9cc7f6 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -700,7 +700,6 @@ struct afs_vnode { #define AFS_VNODE_ZAP_DATA 3 /* set if vnode's data should be invalidated */ #define AFS_VNODE_DELETED 4 /* set if vnode deleted on server */ #define AFS_VNODE_MOUNTPOINT 5 /* set if vnode is a mountpoint symlink */ -#define AFS_VNODE_AUTOCELL 6 /* set if Vnode is an auto mount point */ #define AFS_VNODE_PSEUDODIR 7 /* set if Vnode is a pseudo directory */ #define AFS_VNODE_NEW_CONTENT 8 /* Set if file has new content (create/trunc-0) */ #define AFS_VNODE_SILLY_DELETED 9 /* Set if file has been silly-deleted */ @@ -1111,7 +1110,6 @@ extern int afs_silly_iput(struct dentry *, struct inode *); extern const struct inode_operations afs_dynroot_inode_operations; extern const struct dentry_operations afs_dynroot_dentry_operations; -extern struct inode *afs_try_auto_mntpt(struct dentry *, struct inode *); extern int afs_dynroot_mkdir(struct afs_net *, struct afs_cell *); extern void afs_dynroot_rmdir(struct afs_net *, struct afs_cell *); extern int afs_dynroot_populate(struct super_block *); diff --git a/fs/afs/super.c b/fs/afs/super.c index a9bee610674e..2f18aa8e2806 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -194,8 +194,6 @@ static int afs_show_options(struct seq_file *m, struct dentry *root) if (as->dyn_root) seq_puts(m, ",dyn"); - if (test_bit(AFS_VNODE_AUTOCELL, &AFS_FS_I(d_inode(root))->flags)) - seq_puts(m, ",autocell"); switch (as->flock_mode) { case afs_flock_mode_unset: break; case afs_flock_mode_local: p = "local"; break; @@ -478,9 +476,6 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) if (IS_ERR(inode)) return PTR_ERR(inode); - if (ctx->autocell || as->dyn_root) - set_bit(AFS_VNODE_AUTOCELL, &AFS_FS_I(inode)->flags); - ret = -ENOMEM; sb->s_root = d_make_root(inode); if (!sb->s_root) From patchwork Mon Feb 24 23:41:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989090 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE7BA20FA86 for ; Mon, 24 Feb 2025 23:42:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440561; cv=none; b=aCU1Av843gLl8SnVLvHfYRJwTHBhKZPlFPlZL346tBn0Zmjz0mayaCbb9PjyoWz8ZZRUC3kYwt06ApVCghWRkxUnoyZyKppiaSza/us5tJMpFPy/CbeuMS7MWFy9QEseRoV8F/eWzY0UIcJXzOPvbRfXm83b0ftESEV7+RupDOM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440561; c=relaxed/simple; bh=6a8wp4d4KRGG7WY9wUY7Cli8tOb3pJ8L34hcKYrkZAc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=oshw/UXsSHNd71CAiAsHS+oeUDa7RM2tvpeNZNzZgQB2fsNPJvnTeywpgZt8oQnqtLoUrAQ7SKxs+to+EQ1AAU2Lc1tNqo+Kf6CjVAqmT9l4upjsm15MoYDpamiEDUAmbapAvO0mygfD05FpNoiyNPXyEZCmU4IQNRZ/PlHdyyU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=OZ97mGzS; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="OZ97mGzS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440557; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nXPK2jh4p59Y/xP8AO8Ys4XqSf+hPP/RT+ietmj233E=; b=OZ97mGzSj0egBHREAZPAZwIRLzV9jNFK8gT2REBC6I5/h9XvXZGXPG0DYx6Xiv5AcmYu1k hH4FtukTljfRt6eiBjMCamfdFWQ5JDha7DDQjMcmStn3Ok3Khm2ZPHv94VFarzmH4NYaI6 a5IJ2A6dZyK/ovxk9xV/XNJOvESKw8I= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-265-5R7r-Uv_PZWzxLAd3K6Ttw-1; Mon, 24 Feb 2025 18:42:34 -0500 X-MC-Unique: 5R7r-Uv_PZWzxLAd3K6Ttw-1 X-Mimecast-MFC-AGG-ID: 5R7r-Uv_PZWzxLAd3K6Ttw_1740440553 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E4445190FF83; Mon, 24 Feb 2025 23:42:32 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id CCC941800366; Mon, 24 Feb 2025 23:42:29 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 07/15] afs: Change dynroot to create contents on demand Date: Mon, 24 Feb 2025 23:41:44 +0000 Message-ID: <20250224234154.2014840-8-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 X-Patchwork-Delegate: kuba@kernel.org Change the AFS dynamic root to do things differently: (1) Rather than having the creation of cell records create inodes and dentries for cell mountpoints, create them on demand during lookup. This simplifies cell management and locking as we no longer have to create these objects in advance *and* on speculative lookup by the user for a cell that isn't precreated. (2) Rather than using the libfs dentry-based readdir (the dentries now no longer exist until accessed from (1)), have readdir generate the contents by reading the list of cells. The @cell symlinks get pushed in positions 2 and 3 if rootcell has been configured. (3) Make the @cell symlink dentries persist for the life of the superblock or until reclaimed, but make cell mountpoints disappear immediately if unused. It's not perfect as someone doing an "ls -l /afs" may create a whole bunch of dentries which will be garbage collected immediately. But any dentry that gets automounted will be pinned by the mount, so it shouldn't be too bad. (4) Allocate the inode numbers for the cell mountpoints from an IDR to prevent duplicates appearing in the event it cycles round. The number allocated from the IDR is doubled to provide two inode numbers - one for the normal cell name (RO) and one for the dotted cell name (RW). Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/cell.c | 8 +- fs/afs/dynroot.c | 480 +++++++++++++++---------------------- fs/afs/internal.h | 8 +- fs/afs/main.c | 3 + fs/afs/super.c | 8 +- include/trace/events/afs.h | 2 + 6 files changed, 210 insertions(+), 299 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index cee42646736c..9f5d8bc2bc5f 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -203,7 +203,12 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, cell->dns_status = vllist->status; smp_store_release(&cell->dns_lookup_count, 1); /* vs source/status */ atomic_inc(&net->cells_outstanding); + cell->dynroot_ino = idr_alloc_cyclic(&net->cells_dyn_ino, cell, + 2, INT_MAX / 2, GFP_KERNEL); + if (cell->dynroot_ino < 0) + goto error; cell->debug_id = atomic_inc_return(&cell_debug_id); + trace_afs_cell(cell->debug_id, 1, 0, afs_cell_trace_alloc); _leave(" = %p", cell); @@ -512,6 +517,7 @@ static void afs_cell_destroy(struct rcu_head *rcu) afs_put_vlserverlist(net, rcu_access_pointer(cell->vl_servers)); afs_unuse_cell(net, cell->alias_of, afs_cell_trace_unuse_alias); key_put(cell->anonymous_key); + idr_remove(&net->cells_dyn_ino, cell->dynroot_ino); kfree(cell->name - 1); kfree(cell); @@ -705,7 +711,6 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell) if (cell->proc_link.next) cell->proc_link.next->pprev = &cell->proc_link.next; - afs_dynroot_mkdir(net, cell); mutex_unlock(&net->proc_cells_lock); return 0; } @@ -722,7 +727,6 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell) mutex_lock(&net->proc_cells_lock); if (!hlist_unhashed(&cell->proc_link)) hlist_del_rcu(&cell->proc_link); - afs_dynroot_rmdir(net, cell); mutex_unlock(&net->proc_cells_lock); _leave(""); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index bdc86dedd964..0cf5a196d4d7 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -10,16 +10,19 @@ #include #include "internal.h" -static atomic_t afs_autocell_ino; +#define AFS_MIN_DYNROOT_CELL_INO 4 /* Allow for ., .., @cell, .@cell */ +#define AFS_MAX_DYNROOT_CELL_INO ((unsigned int)INT_MAX) + +static struct dentry *afs_lookup_atcell(struct inode *dir, struct dentry *dentry, ino_t ino); /* * iget5() comparator for inode created by autocell operations - * - * These pseudo inodes don't match anything. */ static int afs_iget5_pseudo_test(struct inode *inode, void *opaque) { - return 0; + struct afs_fid *fid = opaque; + + return inode->i_ino == fid->vnode; } /* @@ -39,28 +42,16 @@ static int afs_iget5_pseudo_set(struct inode *inode, void *opaque) } /* - * Create an inode for a dynamic root directory or an autocell dynamic - * automount dir. + * Create an inode for an autocell dynamic automount dir. */ -struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root) +static struct inode *afs_iget_pseudo_dir(struct super_block *sb, ino_t ino) { - struct afs_super_info *as = AFS_FS_S(sb); struct afs_vnode *vnode; struct inode *inode; - struct afs_fid fid = {}; + struct afs_fid fid = { .vnode = ino, .unique = 1, }; _enter(""); - if (as->volume) - fid.vid = as->volume->vid; - if (root) { - fid.vnode = 1; - fid.unique = 1; - } else { - fid.vnode = atomic_inc_return(&afs_autocell_ino); - fid.unique = 0; - } - inode = iget5_locked(sb, fid.vnode, afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); if (!inode) { @@ -73,112 +64,70 @@ struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root) vnode = AFS_FS_I(inode); - /* there shouldn't be an existing inode */ - BUG_ON(!(inode->i_state & I_NEW)); - - netfs_inode_init(&vnode->netfs, NULL, false); - inode->i_size = 0; - inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO; - if (root) { - inode->i_op = &afs_dynroot_inode_operations; - inode->i_fop = &simple_dir_operations; - } else { - inode->i_op = &afs_autocell_inode_operations; - } - set_nlink(inode, 2); - inode->i_uid = GLOBAL_ROOT_UID; - inode->i_gid = GLOBAL_ROOT_GID; - simple_inode_init_ts(inode); - inode->i_blocks = 0; - inode->i_generation = 0; - - set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); - if (!root) { + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 2); + inode->i_size = 0; + inode->i_mode = S_IFDIR | 0555; + inode->i_op = &afs_autocell_inode_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_AUTOMOUNT | S_NOATIME; + + set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); set_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags); - inode->i_flags |= S_AUTOMOUNT; - } - inode->i_flags |= S_NOATIME; - unlock_new_inode(inode); + unlock_new_inode(inode); + } _leave(" = %p", inode); return inode; } /* - * Probe to see if a cell may exist. This prevents positive dentries from - * being created unnecessarily. + * Try to automount the mountpoint with pseudo directory, if the autocell + * option is set. */ -static int afs_probe_cell_name(struct dentry *dentry) +static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry *dentry, + unsigned int flags) { - struct afs_cell *cell; + struct afs_cell *cell = NULL; struct afs_net *net = afs_d2net(dentry); + struct inode *inode = NULL; const char *name = dentry->d_name.name; size_t len = dentry->d_name.len; - char *result = NULL; - int ret; + bool dotted = false; + int ret = -ENOENT; /* Names prefixed with a dot are R/W mounts. */ if (name[0] == '.') { - if (len == 1) - return -EINVAL; name++; len--; + dotted = true; } - cell = afs_find_cell(net, name, len, afs_cell_trace_use_probe); - if (!IS_ERR(cell)) { - afs_unuse_cell(net, cell, afs_cell_trace_unuse_probe); - return 0; - } - - ret = dns_query(net->net, "afsdb", name, len, "srv=1", - &result, NULL, false); - if (ret == -ENODATA || ret == -ENOKEY || ret == 0) - ret = -ENOENT; - if (ret > 0 && ret >= sizeof(struct dns_server_list_v1_header)) { - struct dns_server_list_v1_header *v1 = (void *)result; - - if (v1->hdr.zero == 0 && - v1->hdr.content == DNS_PAYLOAD_IS_SERVER_LIST && - v1->hdr.version == 1 && - (v1->status != DNS_LOOKUP_GOOD && - v1->status != DNS_LOOKUP_GOOD_WITH_BAD)) - return -ENOENT; - + cell = afs_lookup_cell(net, name, len, NULL, false); + if (IS_ERR(cell)) { + ret = PTR_ERR(cell); + goto out_no_cell; } - kfree(result); - return ret; -} - -/* - * Try to auto mount the mountpoint with pseudo directory, if the autocell - * operation is setted. - */ -static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) -{ - struct afs_vnode *vnode = AFS_FS_I(dir); - struct inode *inode; - int ret = -ENOENT; - - _enter("%p{%pd}, {%llx:%llu}", - dentry, dentry, vnode->fid.vid, vnode->fid.vnode); - - ret = afs_probe_cell_name(dentry); - if (ret < 0) - goto out; - - inode = afs_iget_pseudo_dir(dir->i_sb, false); + inode = afs_iget_pseudo_dir(dir->i_sb, cell->dynroot_ino * 2 + dotted); if (IS_ERR(inode)) { ret = PTR_ERR(inode); goto out; } - _leave("= %p", inode); - return inode; + dentry->d_fsdata = cell; + return d_splice_alias(inode, dentry); out: - _leave("= %d", ret); + afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_lookup_dynroot); +out_no_cell: + if (!inode) + return d_splice_alias(inode, dentry); return ret == -ENOENT ? NULL : ERR_PTR(ret); } @@ -190,8 +139,6 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr { _enter("%pd", dentry); - ASSERTCMP(d_inode(dentry), ==, NULL); - if (flags & LOOKUP_CREATE) return ERR_PTR(-EOPNOTSUPP); @@ -200,98 +147,49 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr return ERR_PTR(-ENAMETOOLONG); } - return d_splice_alias(afs_try_auto_mntpt(dentry, dir), dentry); + if (dentry->d_name.len == 5 && + memcmp(dentry->d_name.name, "@cell", 5) == 0) + return afs_lookup_atcell(dir, dentry, 2); + + if (dentry->d_name.len == 6 && + memcmp(dentry->d_name.name, ".@cell", 6) == 0) + return afs_lookup_atcell(dir, dentry, 3); + + return afs_dynroot_lookup_cell(dir, dentry, flags); } const struct inode_operations afs_dynroot_inode_operations = { .lookup = afs_dynroot_lookup, }; -const struct dentry_operations afs_dynroot_dentry_operations = { - .d_delete = always_delete_dentry, - .d_release = afs_d_release, - .d_automount = afs_d_automount, -}; - -/* - * Create a manually added cell mount directory. - * - The caller must hold net->proc_cells_lock - */ -int afs_dynroot_mkdir(struct afs_net *net, struct afs_cell *cell) -{ - struct super_block *sb = net->dynroot_sb; - struct dentry *root, *subdir, *dsubdir; - char *dotname = cell->name - 1; - int ret; - - if (!sb || atomic_read(&sb->s_active) == 0) - return 0; - - /* Let the ->lookup op do the creation */ - root = sb->s_root; - inode_lock(root->d_inode); - subdir = lookup_one_len(cell->name, root, cell->name_len); - if (IS_ERR(subdir)) { - ret = PTR_ERR(subdir); - goto unlock; - } - - dsubdir = lookup_one_len(dotname, root, cell->name_len + 1); - if (IS_ERR(dsubdir)) { - ret = PTR_ERR(dsubdir); - dput(subdir); - goto unlock; - } - - /* Note that we're retaining extra refs on the dentries. */ - subdir->d_fsdata = (void *)1UL; - dsubdir->d_fsdata = (void *)1UL; - ret = 0; -unlock: - inode_unlock(root->d_inode); - return ret; -} - -static void afs_dynroot_rm_one_dir(struct dentry *root, const char *name, size_t name_len) +static void afs_dynroot_d_release(struct dentry *dentry) { - struct dentry *subdir; - - /* Don't want to trigger a lookup call, which will re-add the cell */ - subdir = try_lookup_one_len(name, root, name_len); - if (IS_ERR_OR_NULL(subdir)) { - _debug("lookup %ld", PTR_ERR(subdir)); - return; - } - - _debug("rmdir %pd %u", subdir, d_count(subdir)); + struct afs_cell *cell = dentry->d_fsdata; - if (subdir->d_fsdata) { - _debug("unpin %u", d_count(subdir)); - subdir->d_fsdata = NULL; - dput(subdir); - } - dput(subdir); + afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_dynroot_mntpt); } /* - * Remove a manually added cell mount directory. - * - The caller must hold net->proc_cells_lock + * Keep @cell symlink dentries around, but only keep cell autodirs when they're + * being used. */ -void afs_dynroot_rmdir(struct afs_net *net, struct afs_cell *cell) +static int afs_dynroot_delete_dentry(const struct dentry *dentry) { - struct super_block *sb = net->dynroot_sb; - char *dotname = cell->name - 1; - - if (!sb || atomic_read(&sb->s_active) == 0) - return; + const struct qstr *name = &dentry->d_name; - inode_lock(sb->s_root->d_inode); - afs_dynroot_rm_one_dir(sb->s_root, cell->name, cell->name_len); - afs_dynroot_rm_one_dir(sb->s_root, dotname, cell->name_len + 1); - inode_unlock(sb->s_root->d_inode); - _leave(""); + if (name->len == 5 && memcmp(name->name, "@cell", 5) == 0) + return 0; + if (name->len == 6 && memcmp(name->name, ".@cell", 6) == 0) + return 0; + return 1; } +const struct dentry_operations afs_dynroot_dentry_operations = { + .d_delete = afs_dynroot_delete_dentry, + .d_release = afs_dynroot_d_release, + .d_automount = afs_d_automount, +}; + static void afs_atcell_delayed_put_cell(void *arg) { struct afs_cell *cell = arg; @@ -333,149 +231,161 @@ static const struct inode_operations afs_atcell_inode_operations = { }; /* - * Look up @cell or .@cell in a dynroot directory. This is a substitution for - * the local cell name for the net namespace. + * Create an inode for the @cell or .@cell symlinks. */ -static struct dentry *afs_dynroot_create_symlink(struct dentry *root, const char *name) +static struct dentry *afs_lookup_atcell(struct inode *dir, struct dentry *dentry, ino_t ino) { struct afs_vnode *vnode; - struct afs_fid fid = { .vnode = 2, .unique = 1, }; - struct dentry *dentry; struct inode *inode; + struct afs_fid fid = { .vnode = ino, .unique = 1, }; - if (name[0] == '.') - fid.vnode = 3; - - dentry = d_alloc_name(root, name); - if (!dentry) - return ERR_PTR(-ENOMEM); - - inode = iget5_locked(dentry->d_sb, fid.vnode, + inode = iget5_locked(dir->i_sb, fid.vnode, afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); - if (!inode) { - dput(dentry); + if (!inode) return ERR_PTR(-ENOMEM); - } vnode = AFS_FS_I(inode); - /* there shouldn't be an existing inode */ - if (WARN_ON_ONCE(!(inode->i_state & I_NEW))) { - iput(inode); - dput(dentry); - return ERR_PTR(-EIO); + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 1); + inode->i_size = 0; + inode->i_mode = S_IFLNK | 0555; + inode->i_op = &afs_atcell_inode_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_NOATIME; + + unlock_new_inode(inode); } - - netfs_inode_init(&vnode->netfs, NULL, false); - simple_inode_init_ts(inode); - set_nlink(inode, 1); - inode->i_size = 0; - inode->i_mode = S_IFLNK | 0555; - inode->i_op = &afs_atcell_inode_operations; - inode->i_uid = GLOBAL_ROOT_UID; - inode->i_gid = GLOBAL_ROOT_GID; - inode->i_blocks = 0; - inode->i_generation = 0; - inode->i_flags |= S_NOATIME; - - unlock_new_inode(inode); - d_splice_alias(inode, dentry); - return dentry; + return d_splice_alias(inode, dentry); } /* - * Create @cell and .@cell symlinks. + * Transcribe the cell database into readdir content under the RCU read lock. + * Each cell produces two entries, one prefixed with a dot and one not. */ -static int afs_dynroot_symlink(struct afs_net *net) +static int afs_dynroot_readdir_cells(struct afs_net *net, struct dir_context *ctx) { - struct super_block *sb = net->dynroot_sb; - struct dentry *root, *symlink, *dsymlink; - int ret; - - /* Let the ->lookup op do the creation */ - root = sb->s_root; - inode_lock(root->d_inode); - symlink = afs_dynroot_create_symlink(root, "@cell"); - if (IS_ERR(symlink)) { - ret = PTR_ERR(symlink); - goto unlock; - } + const struct afs_cell *cell; + loff_t newpos; + + _enter("%llu", ctx->pos); + + for (;;) { + unsigned int ix = ctx->pos >> 1; + + cell = idr_get_next(&net->cells_dyn_ino, &ix); + if (!cell) + return 0; + if (READ_ONCE(cell->state) == AFS_CELL_FAILED || + READ_ONCE(cell->state) == AFS_CELL_REMOVED) { + ctx->pos += 2; + ctx->pos &= ~1; + continue; + } - dsymlink = afs_dynroot_create_symlink(root, ".@cell"); - if (IS_ERR(dsymlink)) { - ret = PTR_ERR(dsymlink); - dput(symlink); - goto unlock; - } + newpos = ix << 1; + if (newpos > ctx->pos) + ctx->pos = newpos; - /* Note that we're retaining extra refs on the dentries. */ - symlink->d_fsdata = (void *)1UL; - dsymlink->d_fsdata = (void *)1UL; - ret = 0; -unlock: - inode_unlock(root->d_inode); - return ret; + _debug("pos %llu -> cell %u", ctx->pos, cell->dynroot_ino); + + if ((ctx->pos & 1) == 0) { + if (!dir_emit(ctx, cell->name, cell->name_len, + cell->dynroot_ino, DT_DIR)) + return 0; + ctx->pos++; + } + if ((ctx->pos & 1) == 1) { + if (!dir_emit(ctx, cell->name - 1, cell->name_len + 1, + cell->dynroot_ino + 1, DT_DIR)) + return 0; + ctx->pos++; + } + } + return 0; } /* - * Populate a newly created dynamic root with cell names. + * Read the AFS dynamic root directory. This produces a list of cellnames, + * dotted and undotted, along with @cell and .@cell links if configured. */ -int afs_dynroot_populate(struct super_block *sb) +static int afs_dynroot_readdir(struct file *file, struct dir_context *ctx) { - struct afs_cell *cell; - struct afs_net *net = afs_sb2net(sb); - int ret; + struct afs_net *net = afs_d2net(file->f_path.dentry); + int ret = 0; - mutex_lock(&net->proc_cells_lock); - - net->dynroot_sb = sb; - ret = afs_dynroot_symlink(net); - if (ret < 0) - goto error; + if (!dir_emit_dots(file, ctx)) + return 0; - hlist_for_each_entry(cell, &net->proc_cells, proc_link) { - ret = afs_dynroot_mkdir(net, cell); - if (ret < 0) - goto error; + if (ctx->pos == 2) { + if (net->ws_cell && !dir_emit(ctx, "@cell", 5, 2, DT_LNK)) + return 0; + ctx->pos = 3; + } + if (ctx->pos == 3) { + if (net->ws_cell && !dir_emit(ctx, ".@cell", 6, 3, DT_LNK)) + return 0; + ctx->pos = 4; } - ret = 0; -out: - mutex_unlock(&net->proc_cells_lock); + if ((unsigned long long)ctx->pos <= AFS_MAX_DYNROOT_CELL_INO) { + rcu_read_lock(); + ret = afs_dynroot_readdir_cells(net, ctx); + rcu_read_unlock(); + } return ret; - -error: - net->dynroot_sb = NULL; - goto out; } +static const struct file_operations afs_dynroot_file_operations = { + .llseek = generic_file_llseek, + .read = generic_read_dir, + .iterate_shared = afs_dynroot_readdir, + .fsync = noop_fsync, +}; + /* - * When a dynamic root that's in the process of being destroyed, depopulate it - * of pinned directories. + * Create an inode for a dynamic root directory. */ -void afs_dynroot_depopulate(struct super_block *sb) +struct inode *afs_dynroot_iget_root(struct super_block *sb) { - struct afs_net *net = afs_sb2net(sb); - struct dentry *root = sb->s_root, *subdir; - - /* Prevent more subdirs from being created */ - mutex_lock(&net->proc_cells_lock); - if (net->dynroot_sb == sb) - net->dynroot_sb = NULL; - mutex_unlock(&net->proc_cells_lock); - - if (root) { - struct hlist_node *n; - inode_lock(root->d_inode); - - /* Remove all the pins for dirs created for manually added cells */ - hlist_for_each_entry_safe(subdir, n, &root->d_children, d_sib) { - if (subdir->d_fsdata) { - subdir->d_fsdata = NULL; - dput(subdir); - } - } + struct afs_super_info *as = AFS_FS_S(sb); + struct afs_vnode *vnode; + struct inode *inode; + struct afs_fid fid = { .vid = 0, .vnode = 1, .unique = 1,}; + + if (as->volume) + fid.vid = as->volume->vid; - inode_unlock(root->d_inode); + inode = iget5_locked(sb, fid.vnode, + afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); + if (!inode) + return ERR_PTR(-ENOMEM); + + vnode = AFS_FS_I(inode); + + /* there shouldn't be an existing inode */ + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 2); + inode->i_size = 0; + inode->i_mode = S_IFDIR | 0555; + inode->i_op = &afs_dynroot_inode_operations; + inode->i_fop = &afs_dynroot_file_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_NOATIME; + + set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); + unlock_new_inode(inode); } + _leave(" = %p", inode); + return inode; } diff --git a/fs/afs/internal.h b/fs/afs/internal.h index be9d5b9cc7f6..89be1014968d 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -287,6 +287,7 @@ struct afs_net { /* Cell database */ struct rb_root cells; + struct idr cells_dyn_ino; /* cell->dynroot_ino mapping */ struct afs_cell *ws_cell; struct work_struct cells_manager; struct timer_list cells_timer; @@ -398,6 +399,7 @@ struct afs_cell { enum dns_lookup_status dns_status:8; /* Latest status of data from lookup */ unsigned int dns_lookup_count; /* Counter of DNS lookups */ unsigned int debug_id; + unsigned int dynroot_ino; /* Inode numbers for dynroot (a pair) */ /* The volumes belonging to this cell */ struct rw_semaphore vs_lock; /* Lock for server->volumes */ @@ -1110,10 +1112,7 @@ extern int afs_silly_iput(struct dentry *, struct inode *); extern const struct inode_operations afs_dynroot_inode_operations; extern const struct dentry_operations afs_dynroot_dentry_operations; -extern int afs_dynroot_mkdir(struct afs_net *, struct afs_cell *); -extern void afs_dynroot_rmdir(struct afs_net *, struct afs_cell *); -extern int afs_dynroot_populate(struct super_block *); -extern void afs_dynroot_depopulate(struct super_block *); +struct inode *afs_dynroot_iget_root(struct super_block *sb); /* * file.c @@ -1226,7 +1225,6 @@ int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); extern void afs_vnode_commit_status(struct afs_operation *, struct afs_vnode_param *); extern int afs_fetch_status(struct afs_vnode *, struct key *, bool, afs_access_t *); extern int afs_ilookup5_test_by_fid(struct inode *, void *); -extern struct inode *afs_iget_pseudo_dir(struct super_block *, bool); extern struct inode *afs_iget(struct afs_operation *, struct afs_vnode_param *); extern struct inode *afs_root_iget(struct super_block *, struct key *); extern int afs_getattr(struct mnt_idmap *idmap, const struct path *, diff --git a/fs/afs/main.c b/fs/afs/main.c index 1ae0067f772d..a7c7dc268302 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -76,6 +76,7 @@ static int __net_init afs_net_init(struct net *net_ns) mutex_init(&net->socket_mutex); net->cells = RB_ROOT; + idr_init(&net->cells_dyn_ino); init_rwsem(&net->cells_lock); INIT_WORK(&net->cells_manager, afs_manage_cells); timer_setup(&net->cells_timer, afs_cells_timer, 0); @@ -137,6 +138,7 @@ static int __net_init afs_net_init(struct net *net_ns) error_proc: afs_put_sysnames(net->sysnames); error_sysnames: + idr_destroy(&net->cells_dyn_ino); net->live = false; return ret; } @@ -155,6 +157,7 @@ static void __net_exit afs_net_exit(struct net *net_ns) afs_close_socket(net); afs_proc_cleanup(net); afs_put_sysnames(net->sysnames); + idr_destroy(&net->cells_dyn_ino); kfree_rcu(rcu_access_pointer(net->address_prefs), rcu); } diff --git a/fs/afs/super.c b/fs/afs/super.c index 2f18aa8e2806..dfc109f48ad5 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -466,7 +466,7 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) /* allocate the root inode and dentry */ if (as->dyn_root) { - inode = afs_iget_pseudo_dir(sb, true); + inode = afs_dynroot_iget_root(sb); } else { sprintf(sb->s_id, "%llu", as->volume->vid); afs_activate_volume(as->volume); @@ -483,9 +483,6 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) if (as->dyn_root) { sb->s_d_op = &afs_dynroot_dentry_operations; - ret = afs_dynroot_populate(sb); - if (ret < 0) - goto error; } else { sb->s_d_op = &afs_fs_dentry_operations; rcu_assign_pointer(as->volume->sb, sb); @@ -534,9 +531,6 @@ static void afs_kill_super(struct super_block *sb) { struct afs_super_info *as = AFS_FS_S(sb); - if (as->dyn_root) - afs_dynroot_depopulate(sb); - /* Clear the callback interests (which will do ilookup5) before * deactivating the superblock. */ diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 958a2460330c..c19132605f41 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -190,8 +190,10 @@ enum yfs_cm_operation { EM(afs_cell_trace_unuse_alias, "UNU alias ") \ EM(afs_cell_trace_unuse_check_alias, "UNU chk-al") \ EM(afs_cell_trace_unuse_delete, "UNU delete") \ + EM(afs_cell_trace_unuse_dynroot_mntpt, "UNU dyn-mp") \ EM(afs_cell_trace_unuse_fc, "UNU fc ") \ EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ + EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ From patchwork Mon Feb 24 23:41:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989091 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32078211476 for ; Mon, 24 Feb 2025 23:42:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440566; cv=none; b=piawB+ssEBoJx+k602xBwWCKKS3hYSqNaKSmZGLUKL+XKar8qX69HhOYjWlu5gHQMT/Fv8rEOa6TfJLgGo/BHcn1RCTaQx031fb5lJ3wqeP6AtsE/jn/yBHE+YCsetHra12nu8gbyQpc7Hm50PXi0lLHvGQ5Bhw8+GCV2KQHoxQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440566; c=relaxed/simple; bh=uWeiT+1xVVDMPmHbQYp+hU7kEdI797i6pjlm/t6kV4s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sGaUaWhj3TjAiMmAeRqcpbC6tFk+MG59/RmV7m0/0usgAyIA+RZzvtB3iMxdRlU9u3VuIwed1Y79cxb6Sm1D5cJRxI0MGsFWfowqDff1iObMtZ60HQEGErJKDtadNidkOwtEqKs4jAEKMoGLQ3wkoD7Ps/od3uOTRmKhJrsx67Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=cPLSNk2i; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="cPLSNk2i" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440564; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pEjw7fxonTaV44564ll+XhxsCuf1s5tZ1MB5XWAbMQQ=; b=cPLSNk2i6qUYeR4rifkPD6qISGtRjxNs7+LNL4uzZrXmjrihP6G+6leNO6nAKAzpoIwsD2 l1lZ67cHFmkK+zfiXLTj7BOvkAyjHzfQ6cE1yicZq41h39K2raTF+exvEa6osvS4LHryr6 QUQXMu2SbyWBPCkeI3yXZ15YIc4PctY= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-501-o3JMLdP-PBqvP9jONwGEvg-1; Mon, 24 Feb 2025 18:42:38 -0500 X-MC-Unique: o3JMLdP-PBqvP9jONwGEvg-1 X-Mimecast-MFC-AGG-ID: o3JMLdP-PBqvP9jONwGEvg_1740440557 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2A25519783B2; Mon, 24 Feb 2025 23:42:37 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6642A180035E; Mon, 24 Feb 2025 23:42:34 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 08/15] afs: Improve afs_volume tracing to display a debug ID Date: Mon, 24 Feb 2025 23:41:45 +0000 Message-ID: <20250224234154.2014840-9-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Patchwork-Delegate: kuba@kernel.org Improve the tracing of afs_volume objects to include displaying a debug ID so that different instances of volumes with the same "vid" can be distinguished. Also be consistent about displaying the volume's refcount (and not the cell's). Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/internal.h | 1 + fs/afs/volume.c | 15 +++++++++------ include/trace/events/afs.h | 18 +++++++++++------- 3 files changed, 21 insertions(+), 13 deletions(-) diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 89be1014968d..bbd550d496a7 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -623,6 +623,7 @@ struct afs_volume { afs_volid_t vid; /* The volume ID of this volume */ afs_volid_t vids[AFS_MAXTYPES]; /* All associated volume IDs */ refcount_t ref; + unsigned int debug_id; /* Debugging ID for traces */ time64_t update_at; /* Time at which to next update */ struct afs_cell *cell; /* Cell to which belongs (pins ref) */ struct rb_node cell_node; /* Link in cell->volumes */ diff --git a/fs/afs/volume.c b/fs/afs/volume.c index af3a3f57c1b3..0efff3d25133 100644 --- a/fs/afs/volume.c +++ b/fs/afs/volume.c @@ -10,6 +10,7 @@ #include "internal.h" static unsigned __read_mostly afs_volume_record_life = 60 * 60; +static atomic_t afs_volume_debug_id; static void afs_destroy_volume(struct work_struct *work); @@ -59,7 +60,7 @@ static void afs_remove_volume_from_cell(struct afs_volume *volume) struct afs_cell *cell = volume->cell; if (!hlist_unhashed(&volume->proc_link)) { - trace_afs_volume(volume->vid, refcount_read(&cell->ref), + trace_afs_volume(volume->debug_id, volume->vid, refcount_read(&volume->ref), afs_volume_trace_remove); write_seqlock(&cell->volume_lock); hlist_del_rcu(&volume->proc_link); @@ -84,6 +85,7 @@ static struct afs_volume *afs_alloc_volume(struct afs_fs_context *params, if (!volume) goto error_0; + volume->debug_id = atomic_inc_return(&afs_volume_debug_id); volume->vid = vldb->vid[params->type]; volume->update_at = ktime_get_real_seconds() + afs_volume_record_life; volume->cell = afs_get_cell(params->cell, afs_cell_trace_get_vol); @@ -115,7 +117,7 @@ static struct afs_volume *afs_alloc_volume(struct afs_fs_context *params, *_slist = slist; rcu_assign_pointer(volume->servers, slist); - trace_afs_volume(volume->vid, 1, afs_volume_trace_alloc); + trace_afs_volume(volume->debug_id, volume->vid, 1, afs_volume_trace_alloc); return volume; error_1: @@ -247,7 +249,7 @@ static void afs_destroy_volume(struct work_struct *work) afs_remove_volume_from_cell(volume); afs_put_serverlist(volume->cell->net, slist); afs_put_cell(volume->cell, afs_cell_trace_put_vol); - trace_afs_volume(volume->vid, refcount_read(&volume->ref), + trace_afs_volume(volume->debug_id, volume->vid, refcount_read(&volume->ref), afs_volume_trace_free); kfree_rcu(volume, rcu); @@ -262,7 +264,7 @@ bool afs_try_get_volume(struct afs_volume *volume, enum afs_volume_trace reason) int r; if (__refcount_inc_not_zero(&volume->ref, &r)) { - trace_afs_volume(volume->vid, r + 1, reason); + trace_afs_volume(volume->debug_id, volume->vid, r + 1, reason); return true; } return false; @@ -278,7 +280,7 @@ struct afs_volume *afs_get_volume(struct afs_volume *volume, int r; __refcount_inc(&volume->ref, &r); - trace_afs_volume(volume->vid, r + 1, reason); + trace_afs_volume(volume->debug_id, volume->vid, r + 1, reason); } return volume; } @@ -290,12 +292,13 @@ struct afs_volume *afs_get_volume(struct afs_volume *volume, void afs_put_volume(struct afs_volume *volume, enum afs_volume_trace reason) { if (volume) { + unsigned int debug_id = volume->debug_id; afs_volid_t vid = volume->vid; bool zero; int r; zero = __refcount_dec_and_test(&volume->ref, &r); - trace_afs_volume(vid, r - 1, reason); + trace_afs_volume(debug_id, vid, r - 1, reason); if (zero) schedule_work(&volume->destructor); } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index c19132605f41..cf94bf1e8286 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -1539,25 +1539,29 @@ TRACE_EVENT(afs_server, ); TRACE_EVENT(afs_volume, - TP_PROTO(afs_volid_t vid, int ref, enum afs_volume_trace reason), + TP_PROTO(unsigned int debug_id, afs_volid_t vid, int ref, + enum afs_volume_trace reason), - TP_ARGS(vid, ref, reason), + TP_ARGS(debug_id, vid, ref, reason), TP_STRUCT__entry( + __field(unsigned int, debug_id) __field(afs_volid_t, vid) __field(int, ref) __field(enum afs_volume_trace, reason) ), TP_fast_assign( - __entry->vid = vid; - __entry->ref = ref; - __entry->reason = reason; + __entry->debug_id = debug_id; + __entry->vid = vid; + __entry->ref = ref; + __entry->reason = reason; ), - TP_printk("V=%llx %s ur=%d", - __entry->vid, + TP_printk("V=%08x %s vid=%llx r=%d", + __entry->debug_id, __print_symbolic(__entry->reason, afs_volume_traces), + __entry->vid, __entry->ref) ); From patchwork Mon Feb 24 23:41:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989092 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B0D7F20DD5A for ; Mon, 24 Feb 2025 23:42:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440571; cv=none; b=sSspYErrsUj5cTOMmRuigF3nFeIdhphNu5aFkVsuJN9GtOPpbEKuU6vLLa+K1LsJtO5ToxeJ+C4BK/S8OhIJd94nctMFkn1WIzPD3h9hCw0xuYYl+nNN2IlClFFkrbG+yc66def58Ql3tWY+aLJd3GZeoLTkb2cVxkhgkHmSKKw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440571; c=relaxed/simple; bh=bpjjkkgI8qN2Jz1CB8CZPSqqM/uWhxzK31+nJdUFGD4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DBUImW/lYIazIni3dkAwZhZyAR08N62r/wUcP7F0ybUwbvst6jcXWOP6FRcp6J1/b1EShx8jHFnZNRR/qqjm9AI//Mu4JC+gxFNQv38WDd/KpXSdMarE3BkMzKy1W3TnpiaSE48czgOb3XjYkYArvnWcIKnYERePtmt2aOUOt+Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=eG3osvWp; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="eG3osvWp" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440568; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qt3MUZYI6xKRH9u2P0Md1Im0HF5ENEnctvns3u/UBs0=; b=eG3osvWpz454y9uiX/pMMyEpTEnCNgLdw5ed5XQh+ee2VKWsFZFMCMVhK9xIp5HEeVKJBh Jb4xiO2eDFvTyjbC8iEKaJBkXAKreQeeXBwPfhg+S4mDf2iekmrrXhCYwr5EQvDDzPgbWo ljPmgMOhbvlYv3Zq9qb2/v/tIlNVYtg= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-235-LBVwjxxZM-KZdG8v_nl5Eg-1; Mon, 24 Feb 2025 18:42:43 -0500 X-MC-Unique: LBVwjxxZM-KZdG8v_nl5Eg-1 X-Mimecast-MFC-AGG-ID: LBVwjxxZM-KZdG8v_nl5Eg_1740440561 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4CB6C1800878; Mon, 24 Feb 2025 23:42:41 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8880C1955BD4; Mon, 24 Feb 2025 23:42:38 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 09/15] afs: Improve server refcount/active count tracing Date: Mon, 24 Feb 2025 23:41:46 +0000 Message-ID: <20250224234154.2014840-10-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Patchwork-Delegate: kuba@kernel.org Improve server refcount/active count tracing to distinguish between simply getting/putting a ref and using/unusing the server record (which changes the activity count as well as the refcount). This makes it a bit easier to work out what's going on. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/fsclient.c | 4 ++-- fs/afs/rxrpc.c | 2 +- fs/afs/server.c | 11 ++++++----- fs/afs/server_list.c | 4 ++-- include/trace/events/afs.h | 27 +++++++++++++++------------ 5 files changed, 26 insertions(+), 22 deletions(-) diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c index 1d9ecd5418d8..9f46d9aebc33 100644 --- a/fs/afs/fsclient.c +++ b/fs/afs/fsclient.c @@ -1653,7 +1653,7 @@ int afs_fs_give_up_all_callbacks(struct afs_net *net, struct afs_server *server, bp = call->request; *bp++ = htonl(FSGIVEUPALLCALLBACKS); - call->server = afs_use_server(server, afs_server_trace_give_up_cb); + call->server = afs_use_server(server, afs_server_trace_use_give_up_cb); afs_make_call(call, GFP_NOFS); afs_wait_for_call_to_complete(call); ret = call->error; @@ -1760,7 +1760,7 @@ bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server, return false; call->key = key; - call->server = afs_use_server(server, afs_server_trace_get_caps); + call->server = afs_use_server(server, afs_server_trace_use_get_caps); call->peer = rxrpc_kernel_get_peer(estate->addresses->addrs[addr_index].peer); call->probe = afs_get_endpoint_state(estate, afs_estate_trace_get_getcaps); call->probe_index = addr_index; diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index 886416ea1d96..de9e10575bdd 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -179,7 +179,7 @@ static void afs_free_call(struct afs_call *call) if (call->type->destructor) call->type->destructor(call); - afs_unuse_server_notime(call->net, call->server, afs_server_trace_put_call); + afs_unuse_server_notime(call->net, call->server, afs_server_trace_unuse_call); kfree(call->request); o = atomic_read(&net->nr_outstanding_calls); diff --git a/fs/afs/server.c b/fs/afs/server.c index 4504e16b458c..923e07c37032 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -33,7 +33,7 @@ struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer do { if (server) - afs_unuse_server_notime(net, server, afs_server_trace_put_find_rsq); + afs_unuse_server_notime(net, server, afs_server_trace_unuse_find_rsq); server = NULL; seq++; /* 2 on the 1st/lockless path, otherwise odd */ read_seqbegin_or_lock(&net->fs_addr_lock, &seq); @@ -49,7 +49,7 @@ struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer server = NULL; continue; found: - server = afs_maybe_use_server(server, afs_server_trace_get_by_addr); + server = afs_maybe_use_server(server, afs_server_trace_use_by_addr); } while (need_seqretry(&net->fs_addr_lock, seq)); @@ -76,7 +76,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu * changes. */ if (server) - afs_unuse_server(net, server, afs_server_trace_put_uuid_rsq); + afs_unuse_server(net, server, afs_server_trace_unuse_uuid_rsq); server = NULL; seq++; /* 2 on the 1st/lockless path, otherwise odd */ read_seqbegin_or_lock(&net->fs_lock, &seq); @@ -91,7 +91,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu } else if (diff > 0) { p = p->rb_right; } else { - afs_use_server(server, afs_server_trace_get_by_uuid); + afs_use_server(server, afs_server_trace_use_by_uuid); break; } @@ -273,7 +273,8 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell, } /* - * Get or create a fileserver record. + * Get or create a fileserver record and return it with an active-use count on + * it. */ struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key, const uuid_t *uuid, u32 addr_version) diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c index d20cd902ef94..784236b9b2a9 100644 --- a/fs/afs/server_list.c +++ b/fs/afs/server_list.c @@ -16,7 +16,7 @@ void afs_put_serverlist(struct afs_net *net, struct afs_server_list *slist) if (slist && refcount_dec_and_test(&slist->usage)) { for (i = 0; i < slist->nr_servers; i++) afs_unuse_server(net, slist->servers[i].server, - afs_server_trace_put_slist); + afs_server_trace_unuse_slist); kfree_rcu(slist, rcu); } } @@ -98,7 +98,7 @@ struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, if (j < slist->nr_servers) { if (slist->servers[j].server == server) { afs_unuse_server(volume->cell->net, server, - afs_server_trace_put_slist_isort); + afs_server_trace_unuse_slist_isort); continue; } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index cf94bf1e8286..24d99fbc298f 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -132,22 +132,25 @@ enum yfs_cm_operation { EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ EM(afs_server_trace_gc, "GC ") \ - EM(afs_server_trace_get_by_addr, "GET addr ") \ - EM(afs_server_trace_get_by_uuid, "GET uuid ") \ - EM(afs_server_trace_get_caps, "GET caps ") \ EM(afs_server_trace_get_install, "GET inst ") \ - EM(afs_server_trace_get_new_cbi, "GET cbi ") \ EM(afs_server_trace_get_probe, "GET probe") \ - EM(afs_server_trace_give_up_cb, "giveup-cb") \ EM(afs_server_trace_purging, "PURGE ") \ - EM(afs_server_trace_put_call, "PUT call ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ - EM(afs_server_trace_put_find_rsq, "PUT f-rsq") \ EM(afs_server_trace_put_probe, "PUT probe") \ - EM(afs_server_trace_put_slist, "PUT slist") \ - EM(afs_server_trace_put_slist_isort, "PUT isort") \ - EM(afs_server_trace_put_uuid_rsq, "PUT u-req") \ - E_(afs_server_trace_update, "UPDATE") + EM(afs_server_trace_see_expired, "SEE expd ") \ + EM(afs_server_trace_unuse_call, "UNU call ") \ + EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ + EM(afs_server_trace_unuse_find_rsq, "UNU f-rsq") \ + EM(afs_server_trace_unuse_slist, "UNU slist") \ + EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ + EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ + EM(afs_server_trace_update, "UPDATE ") \ + EM(afs_server_trace_use_by_addr, "USE addr ") \ + EM(afs_server_trace_use_by_uuid, "USE uuid ") \ + EM(afs_server_trace_use_cm_call, "USE cm-cl") \ + EM(afs_server_trace_use_get_caps, "USE gcaps") \ + EM(afs_server_trace_use_give_up_cb, "USE gvupc") \ + E_(afs_server_trace_wait_create, "WAIT crt ") #define afs_volume_traces \ EM(afs_volume_trace_alloc, "ALLOC ") \ @@ -1531,7 +1534,7 @@ TRACE_EVENT(afs_server, __entry->reason = reason; ), - TP_printk("s=%08x %s u=%d a=%d", + TP_printk("s=%08x %s r=%d a=%d", __entry->server, __print_symbolic(__entry->reason, afs_server_traces), __entry->ref, From patchwork Mon Feb 24 23:41:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989093 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C59DE212B0D for ; Mon, 24 Feb 2025 23:42:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440574; cv=none; b=jmP+3t2FH1+PDk323WRUDUJSPr378JKjTAzHHBu2MNv1Gf5JRR3qqVhoqlohtPHx2ktENEOA6XmVOnjmpPwwcTnUHG+i8AQnNj76FOkELVWp4fmhRrgTky2xIe6Vsm415mxljPR4r/cy6hiID3opJd385RB5GgjoCvCHOPbLyUA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440574; c=relaxed/simple; bh=wti+nj9OPTh/bErQ5E4oKhRSKmC0KR1tHufxQXAG6jo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QM9zzcCgd4MRi2cChFqrYf0ghx2OZFtlYaSOydR8BHm73vyOGeD9A41A9LiOaSCTpSwXMAfUh7casXK1KU3gUWvVmIUtX0XnweIQ9UD98wOR3hO4x9sw3U47uSgX+VSPSvlYlJsz/p7VfU8WgvJLe7SYkcxBtVjNIrbum74NEbs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=GbF12Qe5; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GbF12Qe5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440572; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xhNgi6WOoRTfLf6jGt4MeG/7EO7nsIITUX1Wg/NI4TM=; b=GbF12Qe5zJJaGMsTzdWSDfDS1LVq+a0FYT/Pc4XWoET8IiPIjWfBjxbVcoqHbb0cdu65jV /Z15yA1Q0xvRnlpXcf8FFQBwxuJC2/Dd2Tc5uq776VsW7BBFG0W9fXl/QfQos0A6CtMwXU dXFOm6IuRQeUnyw9QCyS5KUFD7ORFKs= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-463-BudGJgQyN6uppa6OnIc8Ow-1; Mon, 24 Feb 2025 18:42:47 -0500 X-MC-Unique: BudGJgQyN6uppa6OnIc8Ow-1 X-Mimecast-MFC-AGG-ID: BudGJgQyN6uppa6OnIc8Ow_1740440565 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CAA7A1800878; Mon, 24 Feb 2025 23:42:45 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C71B719560AE; Mon, 24 Feb 2025 23:42:42 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 10/15] afs: Make afs_lookup_cell() take a trace note Date: Mon, 24 Feb 2025 23:41:47 +0000 Message-ID: <20250224234154.2014840-11-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Patchwork-Delegate: kuba@kernel.org Pass a note to be added to the afs_cell tracepoint to afs_lookup_cell() so that different callers can be distinguished. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/cell.c | 13 ++++++++----- fs/afs/dynroot.c | 3 ++- fs/afs/internal.h | 6 ++++-- fs/afs/mntpt.c | 3 ++- fs/afs/proc.c | 3 ++- fs/afs/super.c | 3 ++- fs/afs/vl_alias.c | 3 ++- include/trace/events/afs.h | 7 ++++++- 8 files changed, 28 insertions(+), 13 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 9f5d8bc2bc5f..acdcd955f644 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -231,6 +231,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, * @namesz: The strlen of the cell name. * @vllist: A colon/comma separated list of numeric IP addresses or NULL. * @excl: T if an error should be given if the cell name already exists. + * @trace: The reason to be logged if the lookup is successful. * * Look up a cell record by name and query the DNS for VL server addresses if * needed. Note that that actual DNS query is punted off to the manager thread @@ -239,7 +240,8 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, */ struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *name, unsigned int namesz, - const char *vllist, bool excl) + const char *vllist, bool excl, + enum afs_cell_trace trace) { struct afs_cell *cell, *candidate, *cursor; struct rb_node *parent, **pp; @@ -249,7 +251,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, _enter("%s,%s", name, vllist); if (!excl) { - cell = afs_find_cell(net, name, namesz, afs_cell_trace_use_lookup); + cell = afs_find_cell(net, name, namesz, trace); if (!IS_ERR(cell)) goto wait_for_cell; } @@ -325,7 +327,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, if (excl) { ret = -EEXIST; } else { - afs_use_cell(cursor, afs_cell_trace_use_lookup); + afs_use_cell(cursor, trace); ret = 0; } up_write(&net->cells_lock); @@ -380,8 +382,9 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) if (cp && cp < rootcell + len) return -EINVAL; - /* allocate a cell record for the root cell */ - new_root = afs_lookup_cell(net, rootcell, len, vllist, false); + /* allocate a cell record for the root/workstation cell */ + new_root = afs_lookup_cell(net, rootcell, len, vllist, false, + afs_cell_trace_use_lookup_ws); if (IS_ERR(new_root)) { _leave(" = %ld", PTR_ERR(new_root)); return PTR_ERR(new_root); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 0cf5a196d4d7..d5bb5668e788 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -108,7 +108,8 @@ static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry * dotted = true; } - cell = afs_lookup_cell(net, name, len, NULL, false); + cell = afs_lookup_cell(net, name, len, NULL, false, + afs_cell_trace_use_lookup_dynroot); if (IS_ERR(cell)) { ret = PTR_ERR(cell); goto out_no_cell; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index bbd550d496a7..fcef2f7cb8ad 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1046,8 +1046,10 @@ static inline bool afs_cb_is_broken(unsigned int cb_break, extern int afs_cell_init(struct afs_net *, const char *); extern struct afs_cell *afs_find_cell(struct afs_net *, const char *, unsigned, enum afs_cell_trace); -extern struct afs_cell *afs_lookup_cell(struct afs_net *, const char *, unsigned, - const char *, bool); +struct afs_cell *afs_lookup_cell(struct afs_net *net, + const char *name, unsigned int namesz, + const char *vllist, bool excl, + enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_unuse_cell(struct afs_net *, struct afs_cell *, enum afs_cell_trace); extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 507c25a5b2cb..4a3edb9990b0 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -107,7 +107,8 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) if (size > AFS_MAXCELLNAME) return -ENAMETOOLONG; - cell = afs_lookup_cell(ctx->net, p, size, NULL, false); + cell = afs_lookup_cell(ctx->net, p, size, NULL, false, + afs_cell_trace_use_lookup_mntpt); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%pd'\n", mntpt); return PTR_ERR(cell); diff --git a/fs/afs/proc.c b/fs/afs/proc.c index e7614f4f30c2..8e71720a86bb 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -122,7 +122,8 @@ static int afs_proc_cells_write(struct file *file, char *buf, size_t size) if (strcmp(buf, "add") == 0) { struct afs_cell *cell; - cell = afs_lookup_cell(net, name, strlen(name), args, true); + cell = afs_lookup_cell(net, name, strlen(name), args, true, + afs_cell_trace_use_lookup_add); if (IS_ERR(cell)) { ret = PTR_ERR(cell); goto done; diff --git a/fs/afs/super.c b/fs/afs/super.c index dfc109f48ad5..aa6a3ccf39b5 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -290,7 +290,8 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) /* lookup the cell record */ if (cellname) { cell = afs_lookup_cell(ctx->net, cellname, cellnamesz, - NULL, false); + NULL, false, + afs_cell_trace_use_lookup_mount); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%*.*s'\n", cellnamesz, cellnamesz, cellname ?: ""); diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index f9e76b604f31..ffcfba1725e6 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -269,7 +269,8 @@ static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) if (!name_len || name_len > AFS_MAXCELLNAME) master = ERR_PTR(-EOPNOTSUPP); else - master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false); + master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false, + afs_cell_trace_use_lookup_canonical); kfree(cell_name); if (IS_ERR(master)) return PTR_ERR(master); diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 24d99fbc298f..42c3a51db72b 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -208,7 +208,12 @@ enum yfs_cm_operation { EM(afs_cell_trace_use_check_alias, "USE chk-al") \ EM(afs_cell_trace_use_fc, "USE fc ") \ EM(afs_cell_trace_use_fc_alias, "USE fc-al ") \ - EM(afs_cell_trace_use_lookup, "USE lookup") \ + EM(afs_cell_trace_use_lookup_add, "USE lu-add") \ + EM(afs_cell_trace_use_lookup_canonical, "USE lu-can") \ + EM(afs_cell_trace_use_lookup_dynroot, "USE lu-dyn") \ + EM(afs_cell_trace_use_lookup_mntpt, "USE lu-mpt") \ + EM(afs_cell_trace_use_lookup_mount, "USE lu-mnt") \ + EM(afs_cell_trace_use_lookup_ws, "USE lu-ws ") \ EM(afs_cell_trace_use_mntpt, "USE mntpt ") \ EM(afs_cell_trace_use_pin, "USE pin ") \ EM(afs_cell_trace_use_probe, "USE probe ") \ From patchwork Mon Feb 24 23:41:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989094 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22D4320E019 for ; Mon, 24 Feb 2025 23:42:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440577; cv=none; b=TxewHpWmu9K9LApljg61iag+4I1XjuVOIgr0c/V6iCAEMlWDHPRwT7/otttx8vgd7ePvglGQjfw4aXoYZxtVUrNKYLe3+9uA+Q4fWsUhuC2lCp/H4NXTXeFI6iLggbcCG40aMjNKRj4ehRasDOiP8ElcU81fd4rE28z4pvEkW74= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440577; c=relaxed/simple; bh=TW4yi+8iqfKJGKIDwzFQqVPD1AJxvhKR7TpZaTeqlKs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iKFjz49b7llG8GhIeiMRk3w2Tkjso8EDly4LlC3ZIJ8qAJrdt8TKdbWrXJgPR7g5SR4/o2ZOgzzB3vqcrtfNR73u546vN7fj1XXAuFWcpUdDqVV7hj/AfnMKSHz17XC+d9Pnmv8zvBVzxktPXKxCHzXNFcJs4/bkvZoUB1Eu7AI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=D4ow4Ar5; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="D4ow4Ar5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440575; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hHXvOcmSIrpvi0gaZOmenyVSRyZA3sGbnx+mn80L5Q4=; b=D4ow4Ar5jmw0+T/Qm9Lo7w+JKESLhHaSEivr5v5pCYNGF+Uf1ACEwjiNpSlCVULFnJIlO7 i10y8wGAF1gZ6BFBkB46Tws7D5FgwxPdr++MQbq5+a+fp4ILFIVEf0evN5S4IaFWJo6/w4 7k3ae3PbFjVPn2fJRZVgZ9/p1UUV8ug= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-167-VcUDDBobNouTdZ0MZ0Dkag-1; Mon, 24 Feb 2025 18:42:52 -0500 X-MC-Unique: VcUDDBobNouTdZ0MZ0Dkag-1 X-Mimecast-MFC-AGG-ID: VcUDDBobNouTdZ0MZ0Dkag_1740440570 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 9B4441800875; Mon, 24 Feb 2025 23:42:50 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 14E02180035E; Mon, 24 Feb 2025 23:42:46 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 11/15] afs: Drop the net parameter from afs_unuse_cell() Date: Mon, 24 Feb 2025 23:41:48 +0000 Message-ID: <20250224234154.2014840-12-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 X-Patchwork-Delegate: kuba@kernel.org Remove the redundant net parameter to afs_unuse_cell() as cell->net can be used instead. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/cell.c | 12 ++++++------ fs/afs/dynroot.c | 4 ++-- fs/afs/internal.h | 2 +- fs/afs/mntpt.c | 2 +- fs/afs/proc.c | 2 +- fs/afs/super.c | 9 ++++----- fs/afs/vl_alias.c | 4 ++-- include/trace/events/afs.h | 1 + 8 files changed, 18 insertions(+), 18 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index acdcd955f644..09f9613a46dc 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -337,7 +337,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, goto wait_for_cell; goto error_noput; error: - afs_unuse_cell(net, cell, afs_cell_trace_unuse_lookup); + afs_unuse_cell(cell, afs_cell_trace_unuse_lookup_error); error_noput: _leave(" = %d [error]", ret); return ERR_PTR(ret); @@ -400,7 +400,7 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) net->ws_cell = new_root; up_write(&net->cells_lock); - afs_unuse_cell(net, old_root, afs_cell_trace_unuse_ws); + afs_unuse_cell(old_root, afs_cell_trace_unuse_ws); _leave(" = 0"); return 0; } @@ -518,7 +518,7 @@ static void afs_cell_destroy(struct rcu_head *rcu) trace_afs_cell(cell->debug_id, r, atomic_read(&cell->active), afs_cell_trace_free); afs_put_vlserverlist(net, rcu_access_pointer(cell->vl_servers)); - afs_unuse_cell(net, cell->alias_of, afs_cell_trace_unuse_alias); + afs_unuse_cell(cell->alias_of, afs_cell_trace_unuse_alias); key_put(cell->anonymous_key); idr_remove(&net->cells_dyn_ino, cell->dynroot_ino); kfree(cell->name - 1); @@ -606,7 +606,7 @@ struct afs_cell *afs_use_cell(struct afs_cell *cell, enum afs_cell_trace reason) * Record a cell becoming less active. When the active counter reaches 1, it * is scheduled for destruction, but may get reactivated. */ -void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_trace reason) +void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) { unsigned int debug_id; time64_t now, expire_delay; @@ -630,7 +630,7 @@ void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_tr WARN_ON(a == 0); if (a == 1) /* 'cell' may now be garbage collected. */ - afs_set_cell_timer(net, expire_delay); + afs_set_cell_timer(cell->net, expire_delay); } /* @@ -955,7 +955,7 @@ void afs_cell_purge(struct afs_net *net) ws = net->ws_cell; net->ws_cell = NULL; up_write(&net->cells_lock); - afs_unuse_cell(net, ws, afs_cell_trace_unuse_ws); + afs_unuse_cell(ws, afs_cell_trace_unuse_ws); _debug("del timer"); if (del_timer_sync(&net->cells_timer)) diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index d5bb5668e788..a34b45f97d30 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -125,7 +125,7 @@ static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry * return d_splice_alias(inode, dentry); out: - afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_lookup_dynroot); + afs_unuse_cell(cell, afs_cell_trace_unuse_lookup_dynroot); out_no_cell: if (!inode) return d_splice_alias(inode, dentry); @@ -167,7 +167,7 @@ static void afs_dynroot_d_release(struct dentry *dentry) { struct afs_cell *cell = dentry->d_fsdata; - afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_dynroot_mntpt); + afs_unuse_cell(cell, afs_cell_trace_unuse_dynroot_mntpt); } /* diff --git a/fs/afs/internal.h b/fs/afs/internal.h index fcef2f7cb8ad..388f60b5c995 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1051,7 +1051,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *vllist, bool excl, enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); -extern void afs_unuse_cell(struct afs_net *, struct afs_cell *, enum afs_cell_trace); +void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason); extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_see_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_put_cell(struct afs_cell *, enum afs_cell_trace); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 4a3edb9990b0..45cee6534122 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -87,7 +87,7 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) ctx->force = true; } if (ctx->cell) { - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_mntpt); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_mntpt); ctx->cell = NULL; } if (test_bit(AFS_VNODE_PSEUDODIR, &vnode->flags)) { diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 8e71720a86bb..f07c1c52ef5d 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -130,7 +130,7 @@ static int afs_proc_cells_write(struct file *file, char *buf, size_t size) } if (test_and_set_bit(AFS_CELL_FL_NO_GC, &cell->flags)) - afs_unuse_cell(net, cell, afs_cell_trace_unuse_no_pin); + afs_unuse_cell(cell, afs_cell_trace_unuse_no_pin); } else { goto inval; } diff --git a/fs/afs/super.c b/fs/afs/super.c index aa6a3ccf39b5..25b306db6992 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -297,7 +297,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) cellnamesz, cellnamesz, cellname ?: ""); return PTR_ERR(cell); } - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_parse); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_parse); afs_see_cell(cell, afs_cell_trace_see_source); ctx->cell = cell; } @@ -394,7 +394,7 @@ static int afs_validate_fc(struct fs_context *fc) ctx->key = NULL; cell = afs_use_cell(ctx->cell->alias_of, afs_cell_trace_use_fc_alias); - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_fc); ctx->cell = cell; goto reget_key; } @@ -520,9 +520,8 @@ static struct afs_super_info *afs_alloc_sbi(struct fs_context *fc) static void afs_destroy_sbi(struct afs_super_info *as) { if (as) { - struct afs_net *net = afs_net(as->net_ns); afs_put_volume(as->volume, afs_volume_trace_put_destroy_sbi); - afs_unuse_cell(net, as->cell, afs_cell_trace_unuse_sbi); + afs_unuse_cell(as->cell, afs_cell_trace_unuse_sbi); put_net(as->net_ns); kfree(as); } @@ -605,7 +604,7 @@ static void afs_free_fc(struct fs_context *fc) afs_destroy_sbi(fc->s_fs_info); afs_put_volume(ctx->volume, afs_volume_trace_put_free_fc); - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_fc); key_put(ctx->key); kfree(ctx); } diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index ffcfba1725e6..709b4cdb723e 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -205,11 +205,11 @@ static int afs_query_for_alias(struct afs_cell *cell, struct key *key) goto is_alias; if (mutex_lock_interruptible(&cell->net->proc_cells_lock) < 0) { - afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); + afs_unuse_cell(p, afs_cell_trace_unuse_check_alias); return -ERESTARTSYS; } - afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); + afs_unuse_cell(p, afs_cell_trace_unuse_check_alias); } mutex_unlock(&cell->net->proc_cells_lock); diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 42c3a51db72b..82d20c28dc0d 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -197,6 +197,7 @@ enum yfs_cm_operation { EM(afs_cell_trace_unuse_fc, "UNU fc ") \ EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ + EM(afs_cell_trace_unuse_lookup_error, "UNU lu-err") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ From patchwork Mon Feb 24 23:41:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989095 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4FBB62135CA for ; Mon, 24 Feb 2025 23:43:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440583; cv=none; b=KCGjyLADyvKMb6R4mx0gl22Po71mt48L+jbcD+UnNXGVUUiS8ZGpomEttIPEThC48FrA6m7nZ95PT8D8kwGr0IAB7VmR1tuNhAQjHsoJCHLYnDaw/ggwP/C00tHrA16eQGCEeD2d17EmkOlMZENCo1iC/sJTmPno9iDaz0gob4Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440583; c=relaxed/simple; bh=7GSX7JKuB/vKM5jaru61sD+LYgBzM0cIZr1NEMjQElc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dFOrVnyLsoFNqwA7YpAkRXTDkCjVq/nRZE6H8MQd5g6aV8ELzejkVVLRyquTaIoVq7EBmY52rhLRW4qSKKy4Q99lwzOMCsgVeI1EkJsueRbq8TZ1td9pWcq+6ozIQDxoHkHDGvqVjIUD9aTXWKN7cX5uVx6R6Y4mE8dY6EVQMmA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Ff0diw0K; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ff0diw0K" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440581; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q5ues3zY+9vJROA+24t/iPCUg8VWVFjV9Q9yVGRjWjU=; b=Ff0diw0K15yFG179nptsxzZiN3N3G1ig0pCVwN42KG3jT42becMnQqdxZKnPlCmdh8aBpS hlmsl2o4KTxrJm4sqNb4IkolcMUe3drFbZ4p5fNk4HHHNWjxUDLQTjJOjO9JmEK4VOObFm 7ZI/fSCzTz5SiPv3B46SnsgTWL/TVjs= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-517-Q7eUhoPUMHK8NKDGFirnfg-1; Mon, 24 Feb 2025 18:42:56 -0500 X-MC-Unique: Q7eUhoPUMHK8NKDGFirnfg-1 X-Mimecast-MFC-AGG-ID: Q7eUhoPUMHK8NKDGFirnfg_1740440574 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7703A19039CA; Mon, 24 Feb 2025 23:42:54 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B33C31955BD4; Mon, 24 Feb 2025 23:42:51 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 12/15] rxrpc: Allow the app to store private data on peer structs Date: Mon, 24 Feb 2025 23:41:49 +0000 Message-ID: <20250224234154.2014840-13-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Patchwork-Delegate: kuba@kernel.org Provide a way for the application (e.g. the afs filesystem) to store private data on the rxrpc_peer structs for later retrieval via the call object. This will allow afs to store a pointer to the afs_server object on the rxrpc_peer struct, thereby obviating the need for afs to keep lookup tables by which it can associate an incoming call with server that transmitted it. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- include/net/af_rxrpc.h | 2 ++ net/rxrpc/ar-internal.h | 1 + net/rxrpc/peer_object.c | 26 ++++++++++++++++++++++++++ 3 files changed, 29 insertions(+) diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h index 0754c463224a..cf793d18e5df 100644 --- a/include/net/af_rxrpc.h +++ b/include/net/af_rxrpc.h @@ -69,6 +69,8 @@ struct rxrpc_peer *rxrpc_kernel_get_peer(struct rxrpc_peer *peer); struct rxrpc_peer *rxrpc_kernel_get_call_peer(struct socket *sock, struct rxrpc_call *call); const struct sockaddr_rxrpc *rxrpc_kernel_remote_srx(const struct rxrpc_peer *peer); const struct sockaddr *rxrpc_kernel_remote_addr(const struct rxrpc_peer *peer); +unsigned long rxrpc_kernel_set_peer_data(struct rxrpc_peer *peer, unsigned long app_data); +unsigned long rxrpc_kernel_get_peer_data(const struct rxrpc_peer *peer); unsigned int rxrpc_kernel_get_srtt(const struct rxrpc_peer *); int rxrpc_kernel_charge_accept(struct socket *, rxrpc_notify_rx_t, rxrpc_user_attach_call_t, unsigned long, gfp_t, diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index a64a0cab1bf7..3cc3af15086f 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -344,6 +344,7 @@ struct rxrpc_peer { struct hlist_head error_targets; /* targets for net error distribution */ struct rb_root service_conns; /* Service connections */ struct list_head keepalive_link; /* Link in net->peer_keepalive[] */ + unsigned long app_data; /* Application data (e.g. afs_server) */ time64_t last_tx_at; /* Last time packet sent here */ seqlock_t service_conn_lock; spinlock_t lock; /* access lock */ diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 56e09d161a97..a0c0e4d590f5 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -520,3 +520,29 @@ const struct sockaddr *rxrpc_kernel_remote_addr(const struct rxrpc_peer *peer) (peer ? &peer->srx.transport : &rxrpc_null_addr.transport); } EXPORT_SYMBOL(rxrpc_kernel_remote_addr); + +/** + * rxrpc_kernel_set_peer_data - Set app-specific data on a peer. + * @peer: The peer to alter + * @app_data: The data to set + * + * Set the app-specific data on a peer. AF_RXRPC makes no effort to retain + * anything the data might refer to. The previous app_data is returned. + */ +unsigned long rxrpc_kernel_set_peer_data(struct rxrpc_peer *peer, unsigned long app_data) +{ + return xchg(&peer->app_data, app_data); +} +EXPORT_SYMBOL(rxrpc_kernel_set_peer_data); + +/** + * rxrpc_kernel_get_peer_data - Get app-specific data from a peer. + * @peer: The peer to query + * + * Retrieve the app-specific data from a peer. + */ +unsigned long rxrpc_kernel_get_peer_data(const struct rxrpc_peer *peer) +{ + return peer->app_data; +} +EXPORT_SYMBOL(rxrpc_kernel_get_peer_data); From patchwork Mon Feb 24 23:41:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989096 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B7D52135D7 for ; Mon, 24 Feb 2025 23:43:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440587; cv=none; b=tkpKHTBhMzr6gFZ2pMCJiDB+bSAaN2cyah5lS2mMaPgrMOKeD8oNS8JyseX8kbv+VOtZ2Q1vux2BV2JiyWt/9Mnd1cZULNXyP6E004FPf/+HonktM/W51JCZ5W/FvVGPHBlRydMdN1nTRyDnxHHQnfNUibhtEA4TIINMenSnPi0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440587; c=relaxed/simple; bh=Q0BA5gjvk977xbCrLjpmhWqjx+hEkNWlFcia2JuuILc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ffTlJwY+uETy0RYymu+8AD/Pmph0XAwbyU/8PiQeOfSi8A5WgyaTR/g4+YQweIRu3fkNhqYZVTpnF0OfmR72WjCOzugxMIqwgDUSzoMOzkB1cB6tLy07vDD+iMIbvRQh29KjAJtu8cWPmzDuUu0vDf8f3LnB1XZOVwzNIhQDCk8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=fAT3R5Dk; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="fAT3R5Dk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440584; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JqRP6+B4vUKaQBcH12f3hD8auaak33wCem06F1VD/2Q=; b=fAT3R5DkNhJDMptAjsdSYOKuow3K0qZK5hW6KhJDjXfBh1/VxWzZgjOhH2wetXqws9YNPS wObI8zhXEHgU+dqq8dkFrQq0ugwTAKs+/YyxBhHTtW2ER6e9GnDe5JHCC8liQIywyHP15k A7Yuug5+/io6dAm+Cvj31JX/GpuIexA= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-196-75w83OFMP-Gka926ufNZQQ-1; Mon, 24 Feb 2025 18:43:00 -0500 X-MC-Unique: 75w83OFMP-Gka926ufNZQQ-1 X-Mimecast-MFC-AGG-ID: 75w83OFMP-Gka926ufNZQQ_1740440579 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id AFDED18EB2C3; Mon, 24 Feb 2025 23:42:58 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ED7661954B00; Mon, 24 Feb 2025 23:42:55 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 13/15] afs: Use the per-peer app data provided by rxrpc Date: Mon, 24 Feb 2025 23:41:50 +0000 Message-ID: <20250224234154.2014840-14-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Patchwork-Delegate: kuba@kernel.org Make use of the per-peer application data that rxrpc now allows the application to store on the rxrpc_peer struct to hold a back pointer to the afs_server record that peer represents an endpoint for. Then, when a call comes in to the AFS cache manager, this can be used to map it to the correct server record rather than having to use a UUID-to-server mapping table and having to do an additional lookup. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/addr_list.c | 50 +++++++++++++++++++++++ fs/afs/cmservice.c | 82 ++++++-------------------------------- fs/afs/fs_probe.c | 32 ++++++++++----- fs/afs/internal.h | 9 +++-- fs/afs/proc.c | 10 ++++- fs/afs/rxrpc.c | 6 +++ fs/afs/server.c | 46 ++++++--------------- include/trace/events/afs.h | 4 +- net/rxrpc/peer_object.c | 4 +- 9 files changed, 120 insertions(+), 123 deletions(-) diff --git a/fs/afs/addr_list.c b/fs/afs/addr_list.c index 6d42f85c6be5..e941da5b6dd9 100644 --- a/fs/afs/addr_list.c +++ b/fs/afs/addr_list.c @@ -362,3 +362,53 @@ int afs_merge_fs_addr6(struct afs_net *net, struct afs_addr_list *alist, alist->nr_addrs++; return 0; } + +/* + * Set the app data on the rxrpc peers an address list points to + */ +void afs_set_peer_appdata(struct afs_server *server, + struct afs_addr_list *old_alist, + struct afs_addr_list *new_alist) +{ + unsigned long data = (unsigned long)server; + int n = 0, o = 0; + + if (!old_alist) { + /* New server. Just set all. */ + for (; n < new_alist->nr_addrs; n++) + rxrpc_kernel_set_peer_data(new_alist->addrs[n].peer, data); + return; + } + if (!new_alist) { + /* Dead server. Just remove all. */ + for (; o < old_alist->nr_addrs; o++) + rxrpc_kernel_set_peer_data(old_alist->addrs[o].peer, 0); + return; + } + + /* Walk through the two lists simultaneously, setting new peers and + * clearing old ones. The two lists are ordered by pointer to peer + * record. + */ + while (n < new_alist->nr_addrs && o < old_alist->nr_addrs) { + struct rxrpc_peer *pn = new_alist->addrs[n].peer; + struct rxrpc_peer *po = old_alist->addrs[o].peer; + + if (pn == po) + continue; + if (pn < po) { + rxrpc_kernel_set_peer_data(pn, data); + n++; + } else { + rxrpc_kernel_set_peer_data(po, 0); + o++; + } + } + + if (n < new_alist->nr_addrs) + for (; n < new_alist->nr_addrs; n++) + rxrpc_kernel_set_peer_data(new_alist->addrs[n].peer, data); + if (o < old_alist->nr_addrs) + for (; o < old_alist->nr_addrs; o++) + rxrpc_kernel_set_peer_data(old_alist->addrs[o].peer, 0); +} diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c index 99a3f20bc786..1a906805a9e3 100644 --- a/fs/afs/cmservice.c +++ b/fs/afs/cmservice.c @@ -138,49 +138,6 @@ bool afs_cm_incoming_call(struct afs_call *call) } } -/* - * Find the server record by peer address and record a probe to the cache - * manager from a server. - */ -static int afs_find_cm_server_by_peer(struct afs_call *call) -{ - struct sockaddr_rxrpc srx; - struct afs_server *server; - struct rxrpc_peer *peer; - - peer = rxrpc_kernel_get_call_peer(call->net->socket, call->rxcall); - - server = afs_find_server(call->net, peer); - if (!server) { - trace_afs_cm_no_server(call, &srx); - return 0; - } - - call->server = server; - return 0; -} - -/* - * Find the server record by server UUID and record a probe to the cache - * manager from a server. - */ -static int afs_find_cm_server_by_uuid(struct afs_call *call, - struct afs_uuid *uuid) -{ - struct afs_server *server; - - rcu_read_lock(); - server = afs_find_server_by_uuid(call->net, call->request); - rcu_read_unlock(); - if (!server) { - trace_afs_cm_no_server_u(call, call->request); - return 0; - } - - call->server = server; - return 0; -} - /* * Clean up a cache manager call. */ @@ -322,10 +279,7 @@ static int afs_deliver_cb_callback(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -349,18 +303,10 @@ static void SRXAFSCB_InitCallBackState(struct work_struct *work) */ static int afs_deliver_cb_init_call_back_state(struct afs_call *call) { - int ret; - _enter(""); afs_extract_discard(call, 0); - ret = afs_extract_data(call, false); - if (ret < 0) - return ret; - - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_peer(call); + return afs_extract_data(call, false); } /* @@ -373,8 +319,6 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call) __be32 *b; int ret; - _enter(""); - _enter("{%u}", call->unmarshall); switch (call->unmarshall) { @@ -421,9 +365,13 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_uuid(call, call->request); + if (memcmp(call->request, &call->server->_uuid, sizeof(call->server->_uuid)) != 0) { + pr_notice("Callback UUID does not match fileserver UUID\n"); + trace_afs_cm_no_server_u(call, call->request); + return 0; + } + + return 0; } /* @@ -455,7 +403,7 @@ static int afs_deliver_cb_probe(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -533,7 +481,7 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -593,7 +541,7 @@ static int afs_deliver_cb_tell_me_about_yourself(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -667,9 +615,5 @@ static int afs_deliver_yfs_cb_callback(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - - /* We'll need the file server record as that tells us which set of - * vnodes to operate upon. - */ - return afs_find_cm_server_by_peer(call); + return 0; } diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c index b516d05b0fef..07a8bfbdd9b9 100644 --- a/fs/afs/fs_probe.c +++ b/fs/afs/fs_probe.c @@ -235,20 +235,20 @@ void afs_fileserver_probe_result(struct afs_call *call) * Probe all of a fileserver's addresses to find out the best route and to * query its capabilities. */ -void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, - struct afs_addr_list *new_alist, struct key *key) +int afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, + struct afs_addr_list *new_alist, struct key *key) { struct afs_endpoint_state *estate, *old; - struct afs_addr_list *alist; + struct afs_addr_list *old_alist = NULL, *alist; unsigned long unprobed; _enter("%pU", &server->uuid); estate = kzalloc(sizeof(*estate), GFP_KERNEL); if (!estate) - return; + return -ENOMEM; - refcount_set(&estate->ref, 1); + refcount_set(&estate->ref, 2); estate->server_id = server->debug_id; estate->rtt = UINT_MAX; @@ -256,21 +256,31 @@ void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, old = rcu_dereference_protected(server->endpoint_state, lockdep_is_held(&server->fs_lock)); - estate->responsive_set = old->responsive_set; - estate->addresses = afs_get_addrlist(new_alist ?: old->addresses, - afs_alist_trace_get_estate); + if (old) { + estate->responsive_set = old->responsive_set; + if (!new_alist) + new_alist = old->addresses; + } + + if (old_alist != new_alist) + afs_set_peer_appdata(server, old_alist, new_alist); + + estate->addresses = afs_get_addrlist(new_alist, afs_alist_trace_get_estate); alist = estate->addresses; estate->probe_seq = ++server->probe_counter; atomic_set(&estate->nr_probing, alist->nr_addrs); + if (new_alist) + server->addr_version = new_alist->version; rcu_assign_pointer(server->endpoint_state, estate); - set_bit(AFS_ESTATE_SUPERSEDED, &old->flags); write_unlock(&server->fs_lock); + if (old) + set_bit(AFS_ESTATE_SUPERSEDED, &old->flags); trace_afs_estate(estate->server_id, estate->probe_seq, refcount_read(&estate->ref), afs_estate_trace_alloc_probe); - afs_get_address_preferences(net, alist); + afs_get_address_preferences(net, new_alist); server->probed_at = jiffies; unprobed = (1UL << alist->nr_addrs) - 1; @@ -293,6 +303,8 @@ void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, } afs_put_endpoint_state(old, afs_estate_trace_put_probe); + afs_put_endpoint_state(estate, afs_estate_trace_put_probe); + return 0; } /* diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 388f60b5c995..4925d9ad1fb2 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1010,6 +1010,9 @@ extern int afs_merge_fs_addr4(struct afs_net *net, struct afs_addr_list *addr, __be32 xdr, u16 port); extern int afs_merge_fs_addr6(struct afs_net *net, struct afs_addr_list *addr, __be32 *xdr, u16 port); +void afs_set_peer_appdata(struct afs_server *server, + struct afs_addr_list *old_alist, + struct afs_addr_list *new_alist); /* * addr_prefs.c @@ -1207,8 +1210,8 @@ struct afs_endpoint_state *afs_get_endpoint_state(struct afs_endpoint_state *est enum afs_estate_trace where); void afs_put_endpoint_state(struct afs_endpoint_state *estate, enum afs_estate_trace where); extern void afs_fileserver_probe_result(struct afs_call *); -void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, - struct afs_addr_list *new_addrs, struct key *key); +int afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, + struct afs_addr_list *new_alist, struct key *key); int afs_wait_for_fs_probes(struct afs_operation *op, struct afs_server_state *states, bool intr); extern void afs_probe_fileserver(struct afs_net *, struct afs_server *); extern void afs_fs_probe_dispatcher(struct work_struct *); @@ -1509,7 +1512,7 @@ extern void __exit afs_clean_up_permit_cache(void); */ extern spinlock_t afs_server_peer_lock; -extern struct afs_server *afs_find_server(struct afs_net *, const struct rxrpc_peer *); +struct afs_server *afs_find_server(const struct rxrpc_peer *peer); extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *); extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32); extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace); diff --git a/fs/afs/proc.c b/fs/afs/proc.c index f07c1c52ef5d..0af94c846504 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -444,8 +444,6 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) } server = list_entry(v, struct afs_server, proc_link); - estate = rcu_dereference(server->endpoint_state); - alist = estate->addresses; seq_printf(m, "%pU %3d %3d %s\n", &server->uuid, refcount_read(&server->ref), @@ -455,10 +453,16 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) server->flags, server->rtt); seq_printf(m, " - probe: last=%d\n", (int)(jiffies - server->probed_at) / HZ); + + estate = rcu_dereference(server->endpoint_state); + if (!estate) + goto out; failed = estate->failed_set; seq_printf(m, " - ESTATE pq=%x np=%u rsp=%lx f=%lx\n", estate->probe_seq, atomic_read(&estate->nr_probing), estate->responsive_set, estate->failed_set); + + alist = estate->addresses; seq_printf(m, " - ALIST v=%u ap=%u\n", alist->version, alist->addr_pref_version); for (i = 0; i < alist->nr_addrs; i++) { @@ -471,6 +475,8 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) rxrpc_kernel_get_srtt(addr->peer), addr->last_error, addr->prio); } + +out: return 0; } diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index de9e10575bdd..d5e480a33859 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -766,8 +766,14 @@ static void afs_rx_discard_new_call(struct rxrpc_call *rxcall, static void afs_rx_new_call(struct sock *sk, struct rxrpc_call *rxcall, unsigned long user_call_ID) { + struct afs_call *call = (struct afs_call *)user_call_ID; struct afs_net *net = afs_sock2net(sk); + call->peer = rxrpc_kernel_get_call_peer(sk->sk_socket, call->rxcall); + call->server = afs_find_server(call->peer); + if (!call->server) + trace_afs_cm_no_server(call, rxrpc_kernel_remote_srx(call->peer)); + queue_work(afs_wq, &net->charge_preallocation_work); } diff --git a/fs/afs/server.c b/fs/afs/server.c index 923e07c37032..1140773f7aed 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -21,42 +21,13 @@ static void __afs_put_server(struct afs_net *, struct afs_server *); /* * Find a server by one of its addresses. */ -struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer *peer) +struct afs_server *afs_find_server(const struct rxrpc_peer *peer) { - const struct afs_endpoint_state *estate; - const struct afs_addr_list *alist; - struct afs_server *server = NULL; - unsigned int i; - int seq = 1; - - rcu_read_lock(); - - do { - if (server) - afs_unuse_server_notime(net, server, afs_server_trace_unuse_find_rsq); - server = NULL; - seq++; /* 2 on the 1st/lockless path, otherwise odd */ - read_seqbegin_or_lock(&net->fs_addr_lock, &seq); - - hlist_for_each_entry_rcu(server, &net->fs_addresses, addr_link) { - estate = rcu_dereference(server->endpoint_state); - alist = estate->addresses; - for (i = 0; i < alist->nr_addrs; i++) - if (alist->addrs[i].peer == peer) - goto found; - } + struct afs_server *server = (struct afs_server *)rxrpc_kernel_get_peer_data(peer); - server = NULL; - continue; - found: - server = afs_maybe_use_server(server, afs_server_trace_use_by_addr); - - } while (need_seqretry(&net->fs_addr_lock, seq)); - - done_seqretry(&net->fs_addr_lock, seq); - - rcu_read_unlock(); - return server; + if (!server) + return NULL; + return afs_maybe_use_server(server, afs_server_trace_use_cm_call); } /* @@ -468,9 +439,16 @@ static void afs_give_up_callbacks(struct afs_net *net, struct afs_server *server */ static void afs_destroy_server(struct afs_net *net, struct afs_server *server) { + struct afs_endpoint_state *estate; + if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) afs_give_up_callbacks(net, server); + /* Unbind the rxrpc_peer records from the server. */ + estate = rcu_access_pointer(server->endpoint_state); + if (estate) + afs_set_peer_appdata(server, estate->addresses, NULL); + afs_put_server(net, server, afs_server_trace_destroy); } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 82d20c28dc0d..4d798b9e43bf 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -140,12 +140,10 @@ enum yfs_cm_operation { EM(afs_server_trace_see_expired, "SEE expd ") \ EM(afs_server_trace_unuse_call, "UNU call ") \ EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ - EM(afs_server_trace_unuse_find_rsq, "UNU f-rsq") \ EM(afs_server_trace_unuse_slist, "UNU slist") \ EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ EM(afs_server_trace_update, "UPDATE ") \ - EM(afs_server_trace_use_by_addr, "USE addr ") \ EM(afs_server_trace_use_by_uuid, "USE uuid ") \ EM(afs_server_trace_use_cm_call, "USE cm-cl") \ EM(afs_server_trace_use_get_caps, "USE gcaps") \ @@ -1281,7 +1279,7 @@ TRACE_EVENT(afs_bulkstat_error, ); TRACE_EVENT(afs_cm_no_server, - TP_PROTO(struct afs_call *call, struct sockaddr_rxrpc *srx), + TP_PROTO(struct afs_call *call, const struct sockaddr_rxrpc *srx), TP_ARGS(call, srx), diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index a0c0e4d590f5..71b6e07bf161 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -461,7 +461,7 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *rxnet) continue; hlist_for_each_entry(peer, &rxnet->peer_hash[i], hash_link) { - pr_err("Leaked peer %u {%u} %pISp\n", + pr_err("Leaked peer %x {%u} %pISp\n", peer->debug_id, refcount_read(&peer->ref), &peer->srx.transport); @@ -478,7 +478,7 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *rxnet) */ struct rxrpc_peer *rxrpc_kernel_get_call_peer(struct socket *sock, struct rxrpc_call *call) { - return call->peer; + return rxrpc_get_peer(call->peer, rxrpc_peer_get_application); } EXPORT_SYMBOL(rxrpc_kernel_get_call_peer); From patchwork Mon Feb 24 23:41:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989097 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 70830213E8A for ; Mon, 24 Feb 2025 23:43:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440594; cv=none; b=AcLjocvrfwdgahoVWeJVSrUjIPk/Q/8CwyE3DeDA0spWxjKaqIsGs+N9DAvXWpamrBHF8/ohO6mXt7cLv4qGIQ9017nAvm/6jwYUbdYKzD1qv4xteynFb0va15IXGAUOXMFFzB+YMLTv4egDWqBeVfHikNuz8kmgUF02Mypeo+E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440594; c=relaxed/simple; bh=CO7sx0ZbJXMqr0N6f4IJ5qOK4fZlcVIhPa3j6EZN3BI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gO8tbmyp6MniDhr8CnqT3lWHx2247/PPB6tJpZt4aKiZuUlQcYvgdsWTBPF7voEL3+3F/SmXj3oWG2teS1Ev5LlZodGJharC29/c/89WhXLx67gsMKUTptuXQSzKEYlRHAez+r1IsSh33V+R1YHc8RZ3fbZLDxq5UMkeXET45V4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=NsM2+Cbt; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NsM2+Cbt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440588; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vnvc/pqfsFrr8Rfv5I3NDPvGkdLqN2GNzbmWcG8/e6k=; b=NsM2+CbtrnnoC31l8ZPTnisLdATZgl5n8WiIbZezaeUlDEGZM3AnZTpFYvx591u/ALwado kdqIaZ85Q3c1ISB6tFCSRao/49WeR+412OgYv5A+u1hZNT0e61tCXvosMdeGva3whahi/c gL36d0+8/9Bb93JFK1MzuRO9wO8ES6M= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-637-pD91HJi_Ng2HkZ6_Vfx5hA-1; Mon, 24 Feb 2025 18:43:04 -0500 X-MC-Unique: pD91HJi_Ng2HkZ6_Vfx5hA-1 X-Mimecast-MFC-AGG-ID: pD91HJi_Ng2HkZ6_Vfx5hA_1740440583 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 835A419560B9; Mon, 24 Feb 2025 23:43:03 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 35F9519560AE; Mon, 24 Feb 2025 23:43:00 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 14/15] afs: Fix afs_server ref accounting Date: Mon, 24 Feb 2025 23:41:51 +0000 Message-ID: <20250224234154.2014840-15-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Patchwork-Delegate: kuba@kernel.org The current way that afs_server refs are accounted and cleaned up sometimes cause rmmod to hang when it is waiting for cell records to be removed. The problem is that the cell cleanup might occasionally happen before the server cleanup and then there's nothing that causes the cell to garbage-collect the remaining servers as they become inactive. Partially fix this by: (1) Give each afs_server record its own management timer that rather than relying on the cell manager's central timer to drive each individual cell's maintenance work item to garbage collect servers. This timer is set when afs_unuse_server() reduces a server's activity count to zero and will schedule the server's destroyer work item upon firing. (2) Give each afs_server record its own destroyer work item that removes the record from the cell's database, shuts down the timer, cancels any pending work for itself, sends an RPC to the server to cancel outstanding callbacks. This change, in combination with the timer, obviates the need to try and coordinate so closely between the cell record and a bunch of other server records to try and tear everything down in a coordinated fashion. With this, the cell record is pinned until the server RCU is complete and namespace/module removal will wait until all the cell records are removed. (3) Now that incoming calls are mapped to servers (and thus cells) using data attached to an rxrpc_peer, the UUID-to-server mapping tree is moved from the namespace to the cell (cell->fs_servers). This means there can no longer be duplicates therein - and that allows the mapping tree to be simpler as there doesn't need to be a chain of same-UUID servers that are in different cells. (4) The lock protecting the UUID mapping tree is switched to an rw_semaphore on the cell rather than a seqlock on the namespace as it's now only used during mounting in contexts in which we're allowed to sleep. (5) When it comes time for a cell that is being removed to purge its set of servers, it just needs to iterate over them and wake them up. Once a server becomes inactive, its destroyer work item will observe the state of the cell and immediately remove that record. (6) When a server record is removed, it is marked AFS_SERVER_FL_EXPIRED to prevent reattempts at removal. The record will be dispatched to RCU for destruction once its refcount reaches 0. (7) The AFS_SERVER_FL_UNCREATED/CREATING flags are used to synchronise simultaneous creation attempts. If one attempt fails, it will abandon the attempt and allow another to try again. Note that the record can't just be abandoned when dead as it's bound into a server list attached to a volume and only subject to replacement if the server list obtained for the volume from the VLDB changes. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/cell.c | 3 +- fs/afs/fsclient.c | 4 +- fs/afs/internal.h | 54 ++-- fs/afs/main.c | 10 +- fs/afs/server.c | 564 ++++++++++++++++--------------------- fs/afs/server_list.c | 4 +- include/trace/events/afs.h | 7 +- 7 files changed, 289 insertions(+), 357 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 09f9613a46dc..67632205edfd 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -168,7 +168,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, INIT_HLIST_HEAD(&cell->proc_volumes); seqlock_init(&cell->volume_lock); cell->fs_servers = RB_ROOT; - seqlock_init(&cell->fs_lock); + init_rwsem(&cell->fs_lock); rwlock_init(&cell->vl_servers_lock); cell->flags = (1 << AFS_CELL_FL_CHECK_ALIAS); @@ -836,6 +836,7 @@ static void afs_manage_cell(struct afs_cell *cell) /* The root volume is pinning the cell */ afs_put_volume(cell->root_volume, afs_volume_trace_put_cell_root); cell->root_volume = NULL; + afs_purge_servers(cell); afs_put_cell(cell, afs_cell_trace_put_destroy); } diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c index 9f46d9aebc33..bc9556991d7c 100644 --- a/fs/afs/fsclient.c +++ b/fs/afs/fsclient.c @@ -1653,7 +1653,7 @@ int afs_fs_give_up_all_callbacks(struct afs_net *net, struct afs_server *server, bp = call->request; *bp++ = htonl(FSGIVEUPALLCALLBACKS); - call->server = afs_use_server(server, afs_server_trace_use_give_up_cb); + call->server = afs_use_server(server, false, afs_server_trace_use_give_up_cb); afs_make_call(call, GFP_NOFS); afs_wait_for_call_to_complete(call); ret = call->error; @@ -1760,7 +1760,7 @@ bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server, return false; call->key = key; - call->server = afs_use_server(server, afs_server_trace_use_get_caps); + call->server = afs_use_server(server, false, afs_server_trace_use_get_caps); call->peer = rxrpc_kernel_get_peer(estate->addresses->addrs[addr_index].peer); call->probe = afs_get_endpoint_state(estate, afs_estate_trace_get_getcaps); call->probe_index = addr_index; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 4925d9ad1fb2..bbc272caf31b 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -302,18 +302,11 @@ struct afs_net { * cell, but in practice, people create aliases and subsets and there's * no easy way to distinguish them. */ - seqlock_t fs_lock; /* For fs_servers, fs_probe_*, fs_proc */ - struct rb_root fs_servers; /* afs_server (by server UUID or address) */ + seqlock_t fs_lock; /* For fs_probe_*, fs_proc */ struct list_head fs_probe_fast; /* List of afs_server to probe at 30s intervals */ struct list_head fs_probe_slow; /* List of afs_server to probe at 5m intervals */ struct hlist_head fs_proc; /* procfs servers list */ - struct hlist_head fs_addresses; /* afs_server (by lowest IPv6 addr) */ - seqlock_t fs_addr_lock; /* For fs_addresses[46] */ - - struct work_struct fs_manager; - struct timer_list fs_timer; - struct work_struct fs_prober; struct timer_list fs_probe_timer; atomic_t servers_outstanding; @@ -409,7 +402,7 @@ struct afs_cell { /* Active fileserver interaction state. */ struct rb_root fs_servers; /* afs_server (by server UUID) */ - seqlock_t fs_lock; /* For fs_servers */ + struct rw_semaphore fs_lock; /* For fs_servers */ /* VL server list. */ rwlock_t vl_servers_lock; /* Lock on vl_servers */ @@ -544,22 +537,22 @@ struct afs_server { }; struct afs_cell *cell; /* Cell to which belongs (pins ref) */ - struct rb_node uuid_rb; /* Link in net->fs_servers */ - struct afs_server __rcu *uuid_next; /* Next server with same UUID */ - struct afs_server *uuid_prev; /* Previous server with same UUID */ - struct list_head probe_link; /* Link in net->fs_probe_list */ - struct hlist_node addr_link; /* Link in net->fs_addresses6 */ + struct rb_node uuid_rb; /* Link in cell->fs_servers */ + struct list_head probe_link; /* Link in net->fs_probe_* */ struct hlist_node proc_link; /* Link in net->fs_proc */ struct list_head volumes; /* RCU list of afs_server_entry objects */ - struct afs_server *gc_next; /* Next server in manager's list */ + struct work_struct destroyer; /* Work item to try and destroy a server */ + struct timer_list timer; /* Management timer */ time64_t unuse_time; /* Time at which last unused */ unsigned long flags; #define AFS_SERVER_FL_RESPONDING 0 /* The server is responding */ #define AFS_SERVER_FL_UPDATING 1 #define AFS_SERVER_FL_NEEDS_UPDATE 2 /* Fileserver address list is out of date */ -#define AFS_SERVER_FL_NOT_READY 4 /* The record is not ready for use */ -#define AFS_SERVER_FL_NOT_FOUND 5 /* VL server says no such server */ -#define AFS_SERVER_FL_VL_FAIL 6 /* Failed to access VL server */ +#define AFS_SERVER_FL_UNCREATED 3 /* The record needs creating */ +#define AFS_SERVER_FL_CREATING 4 /* The record is being created */ +#define AFS_SERVER_FL_EXPIRED 5 /* The record has expired */ +#define AFS_SERVER_FL_NOT_FOUND 6 /* VL server says no such server */ +#define AFS_SERVER_FL_VL_FAIL 7 /* Failed to access VL server */ #define AFS_SERVER_FL_MAY_HAVE_CB 8 /* May have callbacks on this fileserver */ #define AFS_SERVER_FL_IS_YFS 16 /* Server is YFS not AFS */ #define AFS_SERVER_FL_NO_IBULK 17 /* Fileserver doesn't support FS.InlineBulkStatus */ @@ -569,6 +562,7 @@ struct afs_server { atomic_t active; /* Active user count */ u32 addr_version; /* Address list version */ u16 service_id; /* Service ID we're using. */ + short create_error; /* Creation error */ unsigned int rtt; /* Server's current RTT in uS */ unsigned int debug_id; /* Debugging ID for traces */ @@ -1513,19 +1507,29 @@ extern void __exit afs_clean_up_permit_cache(void); extern spinlock_t afs_server_peer_lock; struct afs_server *afs_find_server(const struct rxrpc_peer *peer); -extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *); extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32); extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace); -extern struct afs_server *afs_use_server(struct afs_server *, enum afs_server_trace); -extern void afs_unuse_server(struct afs_net *, struct afs_server *, enum afs_server_trace); -extern void afs_unuse_server_notime(struct afs_net *, struct afs_server *, enum afs_server_trace); +struct afs_server *afs_use_server(struct afs_server *server, bool activate, + enum afs_server_trace reason); +void afs_unuse_server(struct afs_net *net, struct afs_server *server, + enum afs_server_trace reason); +void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, + enum afs_server_trace reason); extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace); -extern void afs_manage_servers(struct work_struct *); -extern void afs_servers_timer(struct timer_list *); +void afs_purge_servers(struct afs_cell *cell); extern void afs_fs_probe_timer(struct timer_list *); -extern void __net_exit afs_purge_servers(struct afs_net *); +void __net_exit afs_wait_for_servers(struct afs_net *net); bool afs_check_server_record(struct afs_operation *op, struct afs_server *server, struct key *key); +static inline void afs_see_server(struct afs_server *server, enum afs_server_trace trace) +{ + int r = refcount_read(&server->ref); + int a = atomic_read(&server->active); + + trace_afs_server(server->debug_id, r, a, trace); + +} + static inline void afs_inc_servers_outstanding(struct afs_net *net) { atomic_inc(&net->servers_outstanding); diff --git a/fs/afs/main.c b/fs/afs/main.c index a7c7dc268302..bff0363286b0 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -86,16 +86,10 @@ static int __net_init afs_net_init(struct net *net_ns) INIT_HLIST_HEAD(&net->proc_cells); seqlock_init(&net->fs_lock); - net->fs_servers = RB_ROOT; INIT_LIST_HEAD(&net->fs_probe_fast); INIT_LIST_HEAD(&net->fs_probe_slow); INIT_HLIST_HEAD(&net->fs_proc); - INIT_HLIST_HEAD(&net->fs_addresses); - seqlock_init(&net->fs_addr_lock); - - INIT_WORK(&net->fs_manager, afs_manage_servers); - timer_setup(&net->fs_timer, afs_servers_timer, 0); INIT_WORK(&net->fs_prober, afs_fs_probe_dispatcher); timer_setup(&net->fs_probe_timer, afs_fs_probe_timer, 0); atomic_set(&net->servers_outstanding, 1); @@ -131,7 +125,7 @@ static int __net_init afs_net_init(struct net *net_ns) net->live = false; afs_fs_probe_cleanup(net); afs_cell_purge(net); - afs_purge_servers(net); + afs_wait_for_servers(net); error_cell_init: net->live = false; afs_proc_cleanup(net); @@ -153,7 +147,7 @@ static void __net_exit afs_net_exit(struct net *net_ns) net->live = false; afs_fs_probe_cleanup(net); afs_cell_purge(net); - afs_purge_servers(net); + afs_wait_for_servers(net); afs_close_socket(net); afs_proc_cleanup(net); afs_put_sysnames(net->sysnames); diff --git a/fs/afs/server.c b/fs/afs/server.c index 1140773f7aed..487e2134aea4 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -14,9 +14,9 @@ static unsigned afs_server_gc_delay = 10; /* Server record timeout in seconds */ static atomic_t afs_server_debug_id; -static struct afs_server *afs_maybe_use_server(struct afs_server *, - enum afs_server_trace); static void __afs_put_server(struct afs_net *, struct afs_server *); +static void afs_server_timer(struct timer_list *timer); +static void afs_server_destroyer(struct work_struct *work); /* * Find a server by one of its addresses. @@ -27,148 +27,91 @@ struct afs_server *afs_find_server(const struct rxrpc_peer *peer) if (!server) return NULL; - return afs_maybe_use_server(server, afs_server_trace_use_cm_call); + return afs_use_server(server, false, afs_server_trace_use_cm_call); } /* - * Look up a server by its UUID and mark it active. + * Look up a server by its UUID and mark it active. The caller must hold + * cell->fs_lock. */ -struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uuid) +static struct afs_server *afs_find_server_by_uuid(struct afs_cell *cell, const uuid_t *uuid) { - struct afs_server *server = NULL; + struct afs_server *server; struct rb_node *p; - int diff, seq = 1; + int diff; _enter("%pU", uuid); - do { - /* Unfortunately, rbtree walking doesn't give reliable results - * under just the RCU read lock, so we have to check for - * changes. - */ - if (server) - afs_unuse_server(net, server, afs_server_trace_unuse_uuid_rsq); - server = NULL; - seq++; /* 2 on the 1st/lockless path, otherwise odd */ - read_seqbegin_or_lock(&net->fs_lock, &seq); - - p = net->fs_servers.rb_node; - while (p) { - server = rb_entry(p, struct afs_server, uuid_rb); - - diff = memcmp(uuid, &server->uuid, sizeof(*uuid)); - if (diff < 0) { - p = p->rb_left; - } else if (diff > 0) { - p = p->rb_right; - } else { - afs_use_server(server, afs_server_trace_use_by_uuid); - break; - } - - server = NULL; - } - } while (need_seqretry(&net->fs_lock, seq)); + p = cell->fs_servers.rb_node; + while (p) { + server = rb_entry(p, struct afs_server, uuid_rb); - done_seqretry(&net->fs_lock, seq); + diff = memcmp(uuid, &server->uuid, sizeof(*uuid)); + if (diff < 0) { + p = p->rb_left; + } else if (diff > 0) { + p = p->rb_right; + } else { + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) + return NULL; /* Need a write lock */ + afs_use_server(server, true, afs_server_trace_use_by_uuid); + return server; + } + } - _leave(" = %p", server); - return server; + return NULL; } /* - * Install a server record in the namespace tree. If there's a clash, we stick - * it into a list anchored on whichever afs_server struct is actually in the - * tree. + * Install a server record in the cell tree. The caller must hold an exclusive + * lock on cell->fs_lock. */ static struct afs_server *afs_install_server(struct afs_cell *cell, - struct afs_server *candidate) + struct afs_server **candidate) { - const struct afs_endpoint_state *estate; - const struct afs_addr_list *alist; - struct afs_server *server, *next; + struct afs_server *server; struct afs_net *net = cell->net; struct rb_node **pp, *p; int diff; _enter("%p", candidate); - write_seqlock(&net->fs_lock); - /* Firstly install the server in the UUID lookup tree */ - pp = &net->fs_servers.rb_node; + pp = &cell->fs_servers.rb_node; p = NULL; while (*pp) { p = *pp; _debug("- consider %p", p); server = rb_entry(p, struct afs_server, uuid_rb); - diff = memcmp(&candidate->uuid, &server->uuid, sizeof(uuid_t)); - if (diff < 0) { + diff = memcmp(&(*candidate)->uuid, &server->uuid, sizeof(uuid_t)); + if (diff < 0) pp = &(*pp)->rb_left; - } else if (diff > 0) { + else if (diff > 0) pp = &(*pp)->rb_right; - } else { - if (server->cell == cell) - goto exists; - - /* We have the same UUID representing servers in - * different cells. Append the new server to the list. - */ - for (;;) { - next = rcu_dereference_protected( - server->uuid_next, - lockdep_is_held(&net->fs_lock.lock)); - if (!next) - break; - server = next; - } - rcu_assign_pointer(server->uuid_next, candidate); - candidate->uuid_prev = server; - server = candidate; - goto added_dup; - } + else + goto exists; } - server = candidate; + server = *candidate; + *candidate = NULL; rb_link_node(&server->uuid_rb, p, pp); - rb_insert_color(&server->uuid_rb, &net->fs_servers); + rb_insert_color(&server->uuid_rb, &cell->fs_servers); + write_seqlock(&net->fs_lock); hlist_add_head_rcu(&server->proc_link, &net->fs_proc); + write_sequnlock(&net->fs_lock); afs_get_cell(cell, afs_cell_trace_get_server); -added_dup: - write_seqlock(&net->fs_addr_lock); - estate = rcu_dereference_protected(server->endpoint_state, - lockdep_is_held(&net->fs_addr_lock.lock)); - alist = estate->addresses; - - /* Secondly, if the server has any IPv4 and/or IPv6 addresses, install - * it in the IPv4 and/or IPv6 reverse-map lists. - * - * TODO: For speed we want to use something other than a flat list - * here; even sorting the list in terms of lowest address would help a - * bit, but anything we might want to do gets messy and memory - * intensive. - */ - if (alist->nr_addrs > 0) - hlist_add_head_rcu(&server->addr_link, &net->fs_addresses); - - write_sequnlock(&net->fs_addr_lock); - exists: - afs_get_server(server, afs_server_trace_get_install); - write_sequnlock(&net->fs_lock); + afs_use_server(server, true, afs_server_trace_get_install); return server; } /* - * Allocate a new server record and mark it active. + * Allocate a new server record and mark it as active but uncreated. */ -static struct afs_server *afs_alloc_server(struct afs_cell *cell, - const uuid_t *uuid, - struct afs_addr_list *alist) +static struct afs_server *afs_alloc_server(struct afs_cell *cell, const uuid_t *uuid) { - struct afs_endpoint_state *estate; struct afs_server *server; struct afs_net *net = cell->net; @@ -176,65 +119,49 @@ static struct afs_server *afs_alloc_server(struct afs_cell *cell, server = kzalloc(sizeof(struct afs_server), GFP_KERNEL); if (!server) - goto enomem; - - estate = kzalloc(sizeof(struct afs_endpoint_state), GFP_KERNEL); - if (!estate) - goto enomem_server; + return NULL; refcount_set(&server->ref, 1); - atomic_set(&server->active, 1); + atomic_set(&server->active, 0); + __set_bit(AFS_SERVER_FL_UNCREATED, &server->flags); server->debug_id = atomic_inc_return(&afs_server_debug_id); - server->addr_version = alist->version; server->uuid = *uuid; rwlock_init(&server->fs_lock); + INIT_WORK(&server->destroyer, &afs_server_destroyer); + timer_setup(&server->timer, afs_server_timer, 0); INIT_LIST_HEAD(&server->volumes); init_waitqueue_head(&server->probe_wq); INIT_LIST_HEAD(&server->probe_link); + INIT_HLIST_NODE(&server->proc_link); spin_lock_init(&server->probe_lock); server->cell = cell; server->rtt = UINT_MAX; server->service_id = FS_SERVICE; - server->probe_counter = 1; server->probed_at = jiffies - LONG_MAX / 2; - refcount_set(&estate->ref, 1); - estate->addresses = alist; - estate->server_id = server->debug_id; - estate->probe_seq = 1; - rcu_assign_pointer(server->endpoint_state, estate); afs_inc_servers_outstanding(net); - trace_afs_server(server->debug_id, 1, 1, afs_server_trace_alloc); - trace_afs_estate(estate->server_id, estate->probe_seq, refcount_read(&estate->ref), - afs_estate_trace_alloc_server); _leave(" = %p", server); return server; - -enomem_server: - kfree(server); -enomem: - _leave(" = NULL [nomem]"); - return NULL; } /* * Look up an address record for a server */ -static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell, - struct key *key, const uuid_t *uuid) +static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_server *server, + struct key *key) { struct afs_vl_cursor vc; struct afs_addr_list *alist = NULL; int ret; ret = -ERESTARTSYS; - if (afs_begin_vlserver_operation(&vc, cell, key)) { + if (afs_begin_vlserver_operation(&vc, server->cell, key)) { while (afs_select_vlserver(&vc)) { if (test_bit(AFS_VLSERVER_FL_IS_YFS, &vc.server->flags)) - alist = afs_yfsvl_get_endpoints(&vc, uuid); + alist = afs_yfsvl_get_endpoints(&vc, &server->uuid); else - alist = afs_vl_get_addrs_u(&vc, uuid); + alist = afs_vl_get_addrs_u(&vc, &server->uuid); } ret = afs_end_vlserver_operation(&vc); @@ -250,67 +177,116 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell, struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key, const uuid_t *uuid, u32 addr_version) { - struct afs_addr_list *alist; - struct afs_server *server, *candidate; + struct afs_addr_list *alist = NULL; + struct afs_server *server, *candidate = NULL; + bool creating = false; + int ret; _enter("%p,%pU", cell->net, uuid); - server = afs_find_server_by_uuid(cell->net, uuid); + down_read(&cell->fs_lock); + server = afs_find_server_by_uuid(cell, uuid); + /* Won't see servers marked uncreated. */ + up_read(&cell->fs_lock); + if (server) { + timer_delete_sync(&server->timer); + if (test_bit(AFS_SERVER_FL_CREATING, &server->flags)) + goto wait_for_creation; if (server->addr_version != addr_version) set_bit(AFS_SERVER_FL_NEEDS_UPDATE, &server->flags); return server; } - alist = afs_vl_lookup_addrs(cell, key, uuid); - if (IS_ERR(alist)) - return ERR_CAST(alist); - - candidate = afs_alloc_server(cell, uuid, alist); + candidate = afs_alloc_server(cell, uuid); if (!candidate) { afs_put_addrlist(alist, afs_alist_trace_put_server_oom); return ERR_PTR(-ENOMEM); } - server = afs_install_server(cell, candidate); - if (server != candidate) { - afs_put_addrlist(alist, afs_alist_trace_put_server_dup); + down_write(&cell->fs_lock); + server = afs_install_server(cell, &candidate); + if (test_bit(AFS_SERVER_FL_CREATING, &server->flags)) { + /* We need to wait for creation to complete. */ + up_write(&cell->fs_lock); + goto wait_for_creation; + } + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) { + set_bit(AFS_SERVER_FL_CREATING, &server->flags); + clear_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + creating = true; + } + up_write(&cell->fs_lock); + timer_delete_sync(&server->timer); + + /* If we get to create the server, we look up the addresses and then + * immediately dispatch an asynchronous probe to each interface on the + * fileserver. This will make sure the repeat-probing service is + * started. + */ + if (creating) { + alist = afs_vl_lookup_addrs(server, key); + if (IS_ERR(alist)) { + ret = PTR_ERR(alist); + goto create_failed; + } + + ret = afs_fs_probe_fileserver(cell->net, server, alist, key); + if (ret) + goto create_failed; + + clear_and_wake_up_bit(AFS_SERVER_FL_CREATING, &server->flags); + } + +out: + afs_put_addrlist(alist, afs_alist_trace_put_server_create); + if (candidate) { + kfree(rcu_access_pointer(server->endpoint_state)); kfree(candidate); - } else { - /* Immediately dispatch an asynchronous probe to each interface - * on the fileserver. This will make sure the repeat-probing - * service is started. - */ - afs_fs_probe_fileserver(cell->net, server, alist, key); + afs_dec_servers_outstanding(cell->net); + } + return server ?: ERR_PTR(ret); + +wait_for_creation: + afs_see_server(server, afs_server_trace_wait_create); + wait_on_bit(&server->flags, AFS_SERVER_FL_CREATING, TASK_UNINTERRUPTIBLE); + if (test_bit_acquire(AFS_SERVER_FL_UNCREATED, &server->flags)) { + /* Barrier: read flag before error */ + ret = READ_ONCE(server->create_error); + afs_put_server(cell->net, server, afs_server_trace_unuse_create_fail); + server = NULL; + goto out; } - return server; -} + ret = 0; + goto out; -/* - * Set the server timer to fire after a given delay, assuming it's not already - * set for an earlier time. - */ -static void afs_set_server_timer(struct afs_net *net, time64_t delay) -{ - if (net->live) { - afs_inc_servers_outstanding(net); - if (timer_reduce(&net->fs_timer, jiffies + delay * HZ)) - afs_dec_servers_outstanding(net); +create_failed: + down_write(&cell->fs_lock); + + WRITE_ONCE(server->create_error, ret); + smp_wmb(); /* Barrier: set error before flag. */ + set_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + + clear_and_wake_up_bit(AFS_SERVER_FL_CREATING, &server->flags); + + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) { + clear_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + creating = true; } + afs_unuse_server(cell->net, server, afs_server_trace_unuse_create_fail); + server = NULL; + + up_write(&cell->fs_lock); + goto out; } /* - * Server management timer. We have an increment on fs_outstanding that we - * need to pass along to the work item. + * Set/reduce a server's timer. */ -void afs_servers_timer(struct timer_list *timer) +static void afs_set_server_timer(struct afs_server *server, unsigned int delay_secs) { - struct afs_net *net = container_of(timer, struct afs_net, fs_timer); - - _enter(""); - if (!queue_work(afs_wq, &net->fs_manager)) - afs_dec_servers_outstanding(net); + mod_timer(&server->timer, jiffies + delay_secs * HZ); } /* @@ -329,32 +305,20 @@ struct afs_server *afs_get_server(struct afs_server *server, } /* - * Try to get a reference on a server object. - */ -static struct afs_server *afs_maybe_use_server(struct afs_server *server, - enum afs_server_trace reason) -{ - unsigned int a; - int r; - - if (!__refcount_inc_not_zero(&server->ref, &r)) - return NULL; - - a = atomic_inc_return(&server->active); - trace_afs_server(server->debug_id, r + 1, a, reason); - return server; -} - -/* - * Get an active count on a server object. + * Get an active count on a server object and maybe remove from the inactive + * list. */ -struct afs_server *afs_use_server(struct afs_server *server, enum afs_server_trace reason) +struct afs_server *afs_use_server(struct afs_server *server, bool activate, + enum afs_server_trace reason) { unsigned int a; int r; __refcount_inc(&server->ref, &r); a = atomic_inc_return(&server->active); + if (a == 1 && activate && + !test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + del_timer(&server->timer); trace_afs_server(server->debug_id, r + 1, a, reason); return server; @@ -387,13 +351,16 @@ void afs_put_server(struct afs_net *net, struct afs_server *server, void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason) { - if (server) { - unsigned int active = atomic_dec_return(&server->active); + if (!server) + return; - if (active == 0) - afs_set_server_timer(net, afs_server_gc_delay); - afs_put_server(net, server, reason); + if (atomic_dec_and_test(&server->active)) { + if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) || + READ_ONCE(server->cell->state) >= AFS_CELL_FAILED) + schedule_work(&server->destroyer); } + + afs_put_server(net, server, reason); } /* @@ -402,10 +369,22 @@ void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, void afs_unuse_server(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason) { - if (server) { - server->unuse_time = ktime_get_real_seconds(); - afs_unuse_server_notime(net, server, reason); + if (!server) + return; + + if (atomic_dec_and_test(&server->active)) { + if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) && + READ_ONCE(server->cell->state) < AFS_CELL_FAILED) { + time64_t unuse_time = ktime_get_real_seconds(); + + server->unuse_time = unuse_time; + afs_set_server_timer(server, afs_server_gc_delay); + } else { + schedule_work(&server->destroyer); + } } + + afs_put_server(net, server, reason); } static void afs_server_rcu(struct rcu_head *rcu) @@ -435,166 +414,119 @@ static void afs_give_up_callbacks(struct afs_net *net, struct afs_server *server } /* - * destroy a dead server + * Check to see if the server record has expired. */ -static void afs_destroy_server(struct afs_net *net, struct afs_server *server) +static bool afs_has_server_expired(const struct afs_server *server) { - struct afs_endpoint_state *estate; + time64_t expires_at; - if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) - afs_give_up_callbacks(net, server); + if (atomic_read(&server->active)) + return false; - /* Unbind the rxrpc_peer records from the server. */ - estate = rcu_access_pointer(server->endpoint_state); - if (estate) - afs_set_peer_appdata(server, estate->addresses, NULL); + if (server->cell->net->live || + server->cell->state >= AFS_CELL_FAILED) { + trace_afs_server(server->debug_id, refcount_read(&server->ref), + 0, afs_server_trace_purging); + return true; + } - afs_put_server(net, server, afs_server_trace_destroy); + expires_at = server->unuse_time; + if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) && + !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags)) + expires_at += afs_server_gc_delay; + + return ktime_get_real_seconds() > expires_at; } /* - * Garbage collect any expired servers. + * Remove a server record from it's parent cell's database. */ -static void afs_gc_servers(struct afs_net *net, struct afs_server *gc_list) +static bool afs_remove_server_from_cell(struct afs_server *server) { - struct afs_server *server, *next, *prev; - int active; - - while ((server = gc_list)) { - gc_list = server->gc_next; - - write_seqlock(&net->fs_lock); - - active = atomic_read(&server->active); - if (active == 0) { - trace_afs_server(server->debug_id, refcount_read(&server->ref), - active, afs_server_trace_gc); - next = rcu_dereference_protected( - server->uuid_next, lockdep_is_held(&net->fs_lock.lock)); - prev = server->uuid_prev; - if (!prev) { - /* The one at the front is in the tree */ - if (!next) { - rb_erase(&server->uuid_rb, &net->fs_servers); - } else { - rb_replace_node_rcu(&server->uuid_rb, - &next->uuid_rb, - &net->fs_servers); - next->uuid_prev = NULL; - } - } else { - /* This server is not at the front */ - rcu_assign_pointer(prev->uuid_next, next); - if (next) - next->uuid_prev = prev; - } - - list_del(&server->probe_link); - hlist_del_rcu(&server->proc_link); - if (!hlist_unhashed(&server->addr_link)) - hlist_del_rcu(&server->addr_link); - } - write_sequnlock(&net->fs_lock); + struct afs_cell *cell = server->cell; - if (active == 0) - afs_destroy_server(net, server); + down_write(&cell->fs_lock); + + if (!afs_has_server_expired(server)) { + up_write(&cell->fs_lock); + return false; } + + set_bit(AFS_SERVER_FL_EXPIRED, &server->flags); + _debug("expire %pU %u", &server->uuid, atomic_read(&server->active)); + afs_see_server(server, afs_server_trace_see_expired); + rb_erase(&server->uuid_rb, &cell->fs_servers); + up_write(&cell->fs_lock); + return true; } -/* - * Manage the records of servers known to be within a network namespace. This - * includes garbage collecting unused servers. - * - * Note also that we were given an increment on net->servers_outstanding by - * whoever queued us that we need to deal with before returning. - */ -void afs_manage_servers(struct work_struct *work) +static void afs_server_destroyer(struct work_struct *work) { - struct afs_net *net = container_of(work, struct afs_net, fs_manager); - struct afs_server *gc_list = NULL; - struct rb_node *cursor; - time64_t now = ktime_get_real_seconds(), next_manage = TIME64_MAX; - bool purging = !net->live; - - _enter(""); + struct afs_endpoint_state *estate; + struct afs_server *server = container_of(work, struct afs_server, destroyer); + struct afs_net *net = server->cell->net; - /* Trawl the server list looking for servers that have expired from - * lack of use. - */ - read_seqlock_excl(&net->fs_lock); + afs_see_server(server, afs_server_trace_see_destroyer); - for (cursor = rb_first(&net->fs_servers); cursor; cursor = rb_next(cursor)) { - struct afs_server *server = - rb_entry(cursor, struct afs_server, uuid_rb); - int active = atomic_read(&server->active); + if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + return; - _debug("manage %pU %u", &server->uuid, active); + if (!afs_remove_server_from_cell(server)) + return; - if (purging) { - trace_afs_server(server->debug_id, refcount_read(&server->ref), - active, afs_server_trace_purging); - if (active != 0) - pr_notice("Can't purge s=%08x\n", server->debug_id); - } + timer_shutdown_sync(&server->timer); + cancel_work(&server->destroyer); - if (active == 0) { - time64_t expire_at = server->unuse_time; - - if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) && - !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags)) - expire_at += afs_server_gc_delay; - if (purging || expire_at <= now) { - server->gc_next = gc_list; - gc_list = server; - } else if (expire_at < next_manage) { - next_manage = expire_at; - } - } - } + if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) + afs_give_up_callbacks(net, server); - read_sequnlock_excl(&net->fs_lock); + /* Unbind the rxrpc_peer records from the server. */ + estate = rcu_access_pointer(server->endpoint_state); + if (estate) + afs_set_peer_appdata(server, estate->addresses, NULL); - /* Update the timer on the way out. We have to pass an increment on - * servers_outstanding in the namespace that we are in to the timer or - * the work scheduler. - */ - if (!purging && next_manage < TIME64_MAX) { - now = ktime_get_real_seconds(); + write_seqlock(&net->fs_lock); + list_del_init(&server->probe_link); + if (!hlist_unhashed(&server->proc_link)) + hlist_del_rcu(&server->proc_link); + write_sequnlock(&net->fs_lock); - if (next_manage - now <= 0) { - if (queue_work(afs_wq, &net->fs_manager)) - afs_inc_servers_outstanding(net); - } else { - afs_set_server_timer(net, next_manage - now); - } - } + afs_put_server(net, server, afs_server_trace_destroy); +} - afs_gc_servers(net, gc_list); +static void afs_server_timer(struct timer_list *timer) +{ + struct afs_server *server = container_of(timer, struct afs_server, timer); - afs_dec_servers_outstanding(net); - _leave(" [%d]", atomic_read(&net->servers_outstanding)); + afs_see_server(server, afs_server_trace_see_timer); + if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + schedule_work(&server->destroyer); } -static void afs_queue_server_manager(struct afs_net *net) +/* + * Wake up all the servers in a cell so that they can purge themselves. + */ +void afs_purge_servers(struct afs_cell *cell) { - afs_inc_servers_outstanding(net); - if (!queue_work(afs_wq, &net->fs_manager)) - afs_dec_servers_outstanding(net); + struct afs_server *server; + struct rb_node *rb; + + down_read(&cell->fs_lock); + for (rb = rb_first(&cell->fs_servers); rb; rb = rb_next(rb)) { + server = rb_entry(rb, struct afs_server, uuid_rb); + afs_see_server(server, afs_server_trace_see_purge); + schedule_work(&server->destroyer); + } + up_read(&cell->fs_lock); } /* - * Purge list of servers. + * Wait for outstanding servers. */ -void afs_purge_servers(struct afs_net *net) +void afs_wait_for_servers(struct afs_net *net) { _enter(""); - if (del_timer_sync(&net->fs_timer)) - afs_dec_servers_outstanding(net); - - afs_queue_server_manager(net); - - _debug("wait"); atomic_dec(&net->servers_outstanding); wait_var_event(&net->servers_outstanding, !atomic_read(&net->servers_outstanding)); @@ -618,7 +550,7 @@ static noinline bool afs_update_server_record(struct afs_operation *op, atomic_read(&server->active), afs_server_trace_update); - alist = afs_vl_lookup_addrs(op->volume->cell, op->key, &server->uuid); + alist = afs_vl_lookup_addrs(server, op->key); if (IS_ERR(alist)) { rcu_read_lock(); estate = rcu_dereference(server->endpoint_state); diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c index 784236b9b2a9..20d5474837df 100644 --- a/fs/afs/server_list.c +++ b/fs/afs/server_list.c @@ -97,8 +97,8 @@ struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, break; if (j < slist->nr_servers) { if (slist->servers[j].server == server) { - afs_unuse_server(volume->cell->net, server, - afs_server_trace_unuse_slist_isort); + afs_unuse_server_notime(volume->cell->net, server, + afs_server_trace_unuse_slist_isort); continue; } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 4d798b9e43bf..02f8b2a6977c 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -127,7 +127,6 @@ enum yfs_cm_operation { E_(afs_call_trace_work, "QUEUE") #define afs_server_traces \ - EM(afs_server_trace_alloc, "ALLOC ") \ EM(afs_server_trace_callback, "CALLBACK ") \ EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ @@ -137,12 +136,14 @@ enum yfs_cm_operation { EM(afs_server_trace_purging, "PURGE ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ EM(afs_server_trace_put_probe, "PUT probe") \ + EM(afs_server_trace_see_destroyer, "SEE destr") \ EM(afs_server_trace_see_expired, "SEE expd ") \ + EM(afs_server_trace_see_purge, "SEE purge") \ + EM(afs_server_trace_see_timer, "SEE timer") \ EM(afs_server_trace_unuse_call, "UNU call ") \ EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ EM(afs_server_trace_unuse_slist, "UNU slist") \ EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ - EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ EM(afs_server_trace_update, "UPDATE ") \ EM(afs_server_trace_use_by_uuid, "USE uuid ") \ EM(afs_server_trace_use_cm_call, "USE cm-cl") \ @@ -229,7 +230,7 @@ enum yfs_cm_operation { EM(afs_alist_trace_put_getaddru, "PUT GtAdrU") \ EM(afs_alist_trace_put_parse_empty, "PUT p-empt") \ EM(afs_alist_trace_put_parse_error, "PUT p-err ") \ - EM(afs_alist_trace_put_server_dup, "PUT sv-dup") \ + EM(afs_alist_trace_put_server_create, "PUT sv-crt") \ EM(afs_alist_trace_put_server_oom, "PUT sv-oom") \ EM(afs_alist_trace_put_server_update, "PUT sv-upd") \ EM(afs_alist_trace_put_vlgetcaps, "PUT vgtcap") \ From patchwork Mon Feb 24 23:41:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13989098 X-Patchwork-Delegate: kuba@kernel.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63F99214218 for ; Mon, 24 Feb 2025 23:43:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440595; cv=none; b=lqrFnwq7gZCWKmqolqxc0543b8OIYIYfe/Ax9gH2nQvt7cAqgJnwYGCK/2txqwxqY6DZEiPhi7YmXVnX3hCEHz7/h2tVTJ/Ha2g2DRrCplFpPFih4JmbnGbfNeI6luRIIvUNm52wdE3/+6RZZJaAT9PvhUaek69Bmm0I0yCM7D8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740440595; c=relaxed/simple; bh=qgEjn67W7WMuVEff3XRK+6grIeTt/3kHkp8vqk3plpU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KTIvXK31OBpVWcw5kDrsxpuDxkarxucHy/ESSw/3OYHlv9AmoXjbkw7LpuNuDh3CE8H/Bi2cdhUSbAaDYrbHq16rXeO/HkE871ptMwEf/bxIn4jG/M8aNaZPD6iruOqBV/uFS4ql1RddKR7iU7she1+yFb7ATaU5fVDsHcpOtno= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=MXgSqi9P; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MXgSqi9P" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740440591; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NgNtnG+SJi6gXT3e1JgMl+mgJxCRwFe3YXk4wSqcBUM=; b=MXgSqi9P/AKZaR/pBuvhNM0PV7WuWRdePBRh2PXzDSSr5VJMFiuH7Ss/Ou2j7C7/zQX6oy qDoJFOyFYW0qQ5rXmcrH7WlHwdSnki3JcrcJY7jR7jhBp2kF/G/GYew+Ytg8GNjAX6iuf+ yOTw0FszQwpPE71tIQWLKrSQjQEt2po= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-621-YvCjwAi4NqawYlT_dmjPiQ-1; Mon, 24 Feb 2025 18:43:09 -0500 X-MC-Unique: YvCjwAi4NqawYlT_dmjPiQ-1 X-Mimecast-MFC-AGG-ID: YvCjwAi4NqawYlT_dmjPiQ_1740440588 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4ACEC19783B7; Mon, 24 Feb 2025 23:43:08 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.9]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C2BD41800359; Mon, 24 Feb 2025 23:43:04 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , Marc Dionne , Jakub Kicinski , "David S. Miller" , Eric Dumazet , Paolo Abeni , Christian Brauner , linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org, Simon Horman Subject: [PATCH net-next 15/15] afs: Simplify cell record handling Date: Mon, 24 Feb 2025 23:41:52 +0000 Message-ID: <20250224234154.2014840-16-dhowells@redhat.com> In-Reply-To: <20250224234154.2014840-1-dhowells@redhat.com> References: <20250224234154.2014840-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 X-Patchwork-Delegate: kuba@kernel.org Simplify afs_cell record handling to avoid very occasional races that cause module removal to hang (it waits for all cell records to be removed). There are two things that particularly contribute to the difficulty: firstly, the code tries to pass a ref on the cell to the cell's maintenance work item (which gets awkward if the work item is already queued); and, secondly, there's an overall cell manager that tries to use just one timer for the entire cell collection (to avoid having loads of timers). However, both of these are probably unnecessarily restrictive. To simplify this, the following changes are made: (1) The cell record collection manager is removed. Each cell record manages itself individually. (2) Each afs_cell is given a second work item (cell->destroyer) that is queued when its refcount reaches zero. This is not done in the context of the putting thread as it might be in an inconvenient place to sleep. (3) Each afs_cell is given its own timer. The timer is used to expire the cell record after a period of unuse if not otherwise pinned and can also be used for other maintenance tasks if necessary (of which there are currently none as DNS refresh is triggered by filesystem operations). (4) The afs_cell manager work item (cell->manager) is no longer given a ref on the cell when queued; rather, the manager must be deleted. This does away with the need to deal with the consequences of losing a race to queue cell->manager. Clean up of extra queuing is deferred to the destroyer. (5) The cell destroyer work item makes sure the cell timer is removed and that the normal cell work is cancelled before farming the actual destruction off to RCU. (6) When a network namespace is destroyed or the kafs module is unloaded, it's now a simple matter of marking the namespace as dead then just waking up all the cell work items. They will then remove and destroy themselves once all remaining activity counts and/or a ref counts are dropped. This makes sure that all server records are dropped first. (7) The cell record state set is reduced to just four states: SETTING_UP, ACTIVE, REMOVING and DEAD. The record persists in the active state even when it's not being used until the time comes to remove it rather than downgrading it to an inactive state from whence it can be restored. This means that the cell still appears in /proc and /afs when not in use until it switches to the REMOVING state - at which point it is removed. Note that the REMOVING state is included so that someone wanting to resurrect the cell record is forced to wait whilst the cell is torn down in that state. Once it's in the DEAD state, it has been removed from net->cells tree and is no longer findable and can be replaced. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org --- fs/afs/cell.c | 404 +++++++++++++++---------------------- fs/afs/dynroot.c | 4 +- fs/afs/internal.h | 16 +- fs/afs/main.c | 3 - fs/afs/server.c | 8 +- fs/afs/vl_rotate.c | 2 +- include/trace/events/afs.h | 23 +-- 7 files changed, 187 insertions(+), 273 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 67632205edfd..e62181920907 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -20,8 +20,9 @@ static unsigned __read_mostly afs_cell_min_ttl = 10 * 60; static unsigned __read_mostly afs_cell_max_ttl = 24 * 60 * 60; static atomic_t cell_debug_id; -static void afs_queue_cell_manager(struct afs_net *); -static void afs_manage_cell_work(struct work_struct *); +static void afs_cell_timer(struct timer_list *timer); +static void afs_destroy_cell_work(struct work_struct *work); +static void afs_manage_cell_work(struct work_struct *work); static void afs_dec_cells_outstanding(struct afs_net *net) { @@ -29,19 +30,11 @@ static void afs_dec_cells_outstanding(struct afs_net *net) wake_up_var(&net->cells_outstanding); } -/* - * Set the cell timer to fire after a given delay, assuming it's not already - * set for an earlier time. - */ -static void afs_set_cell_timer(struct afs_net *net, time64_t delay) +static void afs_set_cell_state(struct afs_cell *cell, enum afs_cell_state state) { - if (net->live) { - atomic_inc(&net->cells_outstanding); - if (timer_reduce(&net->cells_timer, jiffies + delay * HZ)) - afs_dec_cells_outstanding(net); - } else { - afs_queue_cell_manager(net); - } + smp_store_release(&cell->state, state); /* Commit cell changes before state */ + smp_wmb(); /* Set cell state before task state */ + wake_up_var(&cell->state); } /* @@ -115,7 +108,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, const char *name, unsigned int namelen, const char *addresses) { - struct afs_vlserver_list *vllist; + struct afs_vlserver_list *vllist = NULL; struct afs_cell *cell; int i, ret; @@ -162,7 +155,9 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, cell->net = net; refcount_set(&cell->ref, 1); atomic_set(&cell->active, 0); + INIT_WORK(&cell->destroyer, afs_destroy_cell_work); INIT_WORK(&cell->manager, afs_manage_cell_work); + timer_setup(&cell->management_timer, afs_cell_timer, 0); init_rwsem(&cell->vs_lock); cell->volumes = RB_ROOT; INIT_HLIST_HEAD(&cell->proc_volumes); @@ -218,6 +213,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, if (ret == -EINVAL) printk(KERN_ERR "kAFS: bad VL server IP address\n"); error: + afs_put_vlserverlist(cell->net, vllist); kfree(cell->name - 1); kfree(cell); _leave(" = %d", ret); @@ -294,26 +290,28 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, cell = candidate; candidate = NULL; - atomic_set(&cell->active, 2); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), 2, afs_cell_trace_insert); + afs_use_cell(cell, trace); rb_link_node_rcu(&cell->net_node, parent, pp); rb_insert_color(&cell->net_node, &net->cells); up_write(&net->cells_lock); - afs_queue_cell(cell, afs_cell_trace_get_queue_new); + afs_queue_cell(cell, afs_cell_trace_queue_new); wait_for_cell: - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), atomic_read(&cell->active), - afs_cell_trace_wait); _debug("wait_for_cell"); - wait_var_event(&cell->state, - ({ - state = smp_load_acquire(&cell->state); /* vs error */ - state == AFS_CELL_ACTIVE || state == AFS_CELL_REMOVED; - })); + state = smp_load_acquire(&cell->state); /* vs error */ + if (state != AFS_CELL_ACTIVE && + state != AFS_CELL_DEAD) { + afs_see_cell(cell, afs_cell_trace_wait); + wait_var_event(&cell->state, + ({ + state = smp_load_acquire(&cell->state); /* vs error */ + state == AFS_CELL_ACTIVE || state == AFS_CELL_DEAD; + })); + } /* Check the state obtained from the wait check. */ - if (state == AFS_CELL_REMOVED) { + if (state == AFS_CELL_DEAD) { ret = cell->error; goto error; } @@ -395,7 +393,6 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) /* install the new cell */ down_write(&net->cells_lock); - afs_see_cell(new_root, afs_cell_trace_see_ws); old_root = net->ws_cell; net->ws_cell = new_root; up_write(&net->cells_lock); @@ -528,30 +525,14 @@ static void afs_cell_destroy(struct rcu_head *rcu) _leave(" [destroyed]"); } -/* - * Queue the cell manager. - */ -static void afs_queue_cell_manager(struct afs_net *net) -{ - int outstanding = atomic_inc_return(&net->cells_outstanding); - - _enter("%d", outstanding); - - if (!queue_work(afs_wq, &net->cells_manager)) - afs_dec_cells_outstanding(net); -} - -/* - * Cell management timer. We have an increment on cells_outstanding that we - * need to pass along to the work item. - */ -void afs_cells_timer(struct timer_list *timer) +static void afs_destroy_cell_work(struct work_struct *work) { - struct afs_net *net = container_of(timer, struct afs_net, cells_timer); + struct afs_cell *cell = container_of(work, struct afs_cell, destroyer); - _enter(""); - if (!queue_work(afs_wq, &net->cells_manager)) - afs_dec_cells_outstanding(net); + afs_see_cell(cell, afs_cell_trace_destroy); + timer_delete_sync(&cell->management_timer); + cancel_work_sync(&cell->manager); + call_rcu(&cell->rcu, afs_cell_destroy); } /* @@ -583,7 +564,7 @@ void afs_put_cell(struct afs_cell *cell, enum afs_cell_trace reason) if (zero) { a = atomic_read(&cell->active); WARN(a != 0, "Cell active count %u > 0\n", a); - call_rcu(&cell->rcu, afs_cell_destroy); + WARN_ON(!queue_work(afs_wq, &cell->destroyer)); } } } @@ -595,10 +576,9 @@ struct afs_cell *afs_use_cell(struct afs_cell *cell, enum afs_cell_trace reason) { int r, a; - r = refcount_read(&cell->ref); - WARN_ON(r == 0); + __refcount_inc(&cell->ref, &r); a = atomic_inc_return(&cell->active); - trace_afs_cell(cell->debug_id, r, a, reason); + trace_afs_cell(cell->debug_id, r + 1, a, reason); return cell; } @@ -610,6 +590,7 @@ void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) { unsigned int debug_id; time64_t now, expire_delay; + bool zero; int r, a; if (!cell) @@ -624,13 +605,15 @@ void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) expire_delay = afs_cell_gc_delay; debug_id = cell->debug_id; - r = refcount_read(&cell->ref); a = atomic_dec_return(&cell->active); - trace_afs_cell(debug_id, r, a, reason); - WARN_ON(a == 0); - if (a == 1) + if (!a) /* 'cell' may now be garbage collected. */ - afs_set_cell_timer(cell->net, expire_delay); + afs_set_cell_timer(cell, expire_delay); + + zero = __refcount_dec_and_test(&cell->ref, &r); + trace_afs_cell(debug_id, r - 1, a, reason); + if (zero) + WARN_ON(!queue_work(afs_wq, &cell->destroyer)); } /* @@ -650,9 +633,27 @@ void afs_see_cell(struct afs_cell *cell, enum afs_cell_trace reason) */ void afs_queue_cell(struct afs_cell *cell, enum afs_cell_trace reason) { - afs_get_cell(cell, reason); - if (!queue_work(afs_wq, &cell->manager)) - afs_put_cell(cell, afs_cell_trace_put_queue_fail); + queue_work(afs_wq, &cell->manager); +} + +/* + * Cell-specific management timer. + */ +static void afs_cell_timer(struct timer_list *timer) +{ + struct afs_cell *cell = container_of(timer, struct afs_cell, management_timer); + + afs_see_cell(cell, afs_cell_trace_see_mgmt_timer); + if (refcount_read(&cell->ref) > 0 && cell->net->live) + queue_work(afs_wq, &cell->manager); +} + +/* + * Set/reduce the cell timer. + */ +void afs_set_cell_timer(struct afs_cell *cell, unsigned int delay_secs) +{ + timer_reduce(&cell->management_timer, jiffies + delay_secs * HZ); } /* @@ -735,212 +736,125 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell) _leave(""); } +static bool afs_has_cell_expired(struct afs_cell *cell, time64_t *_next_manage) +{ + const struct afs_vlserver_list *vllist; + time64_t expire_at = cell->last_inactive; + time64_t now = ktime_get_real_seconds(); + + if (atomic_read(&cell->active)) + return false; + if (!cell->net->live) + return true; + + vllist = rcu_dereference_protected(cell->vl_servers, true); + if (vllist && vllist->nr_servers > 0) + expire_at += afs_cell_gc_delay; + + if (expire_at <= now) + return true; + if (expire_at < *_next_manage) + *_next_manage = expire_at; + return false; +} + /* * Manage a cell record, initialising and destroying it, maintaining its DNS * records. */ -static void afs_manage_cell(struct afs_cell *cell) +static bool afs_manage_cell(struct afs_cell *cell) { struct afs_net *net = cell->net; - int ret, active; + time64_t next_manage = TIME64_MAX; + int ret; _enter("%s", cell->name); -again: _debug("state %u", cell->state); switch (cell->state) { - case AFS_CELL_INACTIVE: - case AFS_CELL_FAILED: - down_write(&net->cells_lock); - active = 1; - if (atomic_try_cmpxchg_relaxed(&cell->active, &active, 0)) { - rb_erase(&cell->net_node, &net->cells); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), 0, - afs_cell_trace_unuse_delete); - smp_store_release(&cell->state, AFS_CELL_REMOVED); - } - up_write(&net->cells_lock); - if (cell->state == AFS_CELL_REMOVED) { - wake_up_var(&cell->state); - goto final_destruction; - } - if (cell->state == AFS_CELL_FAILED) - goto done; - smp_store_release(&cell->state, AFS_CELL_UNSET); - wake_up_var(&cell->state); - goto again; - - case AFS_CELL_UNSET: - smp_store_release(&cell->state, AFS_CELL_ACTIVATING); - wake_up_var(&cell->state); - goto again; - - case AFS_CELL_ACTIVATING: - ret = afs_activate_cell(net, cell); - if (ret < 0) - goto activation_failed; + case AFS_CELL_SETTING_UP: + goto set_up_cell; + case AFS_CELL_ACTIVE: + goto cell_is_active; + case AFS_CELL_REMOVING: + WARN_ON_ONCE(1); + return false; + case AFS_CELL_DEAD: + return false; + default: + _debug("bad state %u", cell->state); + WARN_ON_ONCE(1); /* Unhandled state */ + return false; + } - smp_store_release(&cell->state, AFS_CELL_ACTIVE); - wake_up_var(&cell->state); - goto again; +set_up_cell: + ret = afs_activate_cell(net, cell); + if (ret < 0) { + cell->error = ret; + goto remove_cell; + } - case AFS_CELL_ACTIVE: - if (atomic_read(&cell->active) > 1) { - if (test_and_clear_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) { - ret = afs_update_cell(cell); - if (ret < 0) - cell->error = ret; - } - goto done; - } - smp_store_release(&cell->state, AFS_CELL_DEACTIVATING); - wake_up_var(&cell->state); - goto again; + afs_set_cell_state(cell, AFS_CELL_ACTIVE); - case AFS_CELL_DEACTIVATING: - if (atomic_read(&cell->active) > 1) - goto reverse_deactivation; - afs_deactivate_cell(net, cell); - smp_store_release(&cell->state, AFS_CELL_INACTIVE); - wake_up_var(&cell->state); - goto again; +cell_is_active: + if (afs_has_cell_expired(cell, &next_manage)) + goto remove_cell; - case AFS_CELL_REMOVED: - goto done; + if (test_and_clear_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) { + ret = afs_update_cell(cell); + if (ret < 0) + cell->error = ret; + } - default: - break; + if (next_manage < TIME64_MAX && cell->net->live) { + time64_t now = ktime_get_real_seconds(); + + if (next_manage - now <= 0) + afs_queue_cell(cell, afs_cell_trace_queue_again); + else + afs_set_cell_timer(cell, next_manage - now); } - _debug("bad state %u", cell->state); - BUG(); /* Unhandled state */ + _leave(" [done %u]", cell->state); + return false; -activation_failed: - cell->error = ret; - afs_deactivate_cell(net, cell); +remove_cell: + down_write(&net->cells_lock); - smp_store_release(&cell->state, AFS_CELL_FAILED); /* vs error */ - wake_up_var(&cell->state); - goto again; + if (atomic_read(&cell->active)) { + up_write(&net->cells_lock); + goto cell_is_active; + } -reverse_deactivation: - smp_store_release(&cell->state, AFS_CELL_ACTIVE); - wake_up_var(&cell->state); - _leave(" [deact->act]"); - return; + /* Make sure that the expiring server records are going to see the fact + * that the cell is caput. + */ + afs_set_cell_state(cell, AFS_CELL_REMOVING); -done: - _leave(" [done %u]", cell->state); - return; + afs_deactivate_cell(net, cell); + afs_purge_servers(cell); + + rb_erase(&cell->net_node, &net->cells); + afs_see_cell(cell, afs_cell_trace_unuse_delete); + up_write(&net->cells_lock); -final_destruction: /* The root volume is pinning the cell */ afs_put_volume(cell->root_volume, afs_volume_trace_put_cell_root); cell->root_volume = NULL; - afs_purge_servers(cell); - afs_put_cell(cell, afs_cell_trace_put_destroy); + + afs_set_cell_state(cell, AFS_CELL_DEAD); + return true; } static void afs_manage_cell_work(struct work_struct *work) { struct afs_cell *cell = container_of(work, struct afs_cell, manager); + bool final_put; - afs_manage_cell(cell); - afs_put_cell(cell, afs_cell_trace_put_queue_work); -} - -/* - * Manage the records of cells known to a network namespace. This includes - * updating the DNS records and garbage collecting unused cells that were - * automatically added. - * - * Note that constructed cell records may only be removed from net->cells by - * this work item, so it is safe for this work item to stash a cursor pointing - * into the tree and then return to caller (provided it skips cells that are - * still under construction). - * - * Note also that we were given an increment on net->cells_outstanding by - * whoever queued us that we need to deal with before returning. - */ -void afs_manage_cells(struct work_struct *work) -{ - struct afs_net *net = container_of(work, struct afs_net, cells_manager); - struct rb_node *cursor; - time64_t now = ktime_get_real_seconds(), next_manage = TIME64_MAX; - bool purging = !net->live; - - _enter(""); - - /* Trawl the cell database looking for cells that have expired from - * lack of use and cells whose DNS results have expired and dispatch - * their managers. - */ - down_read(&net->cells_lock); - - for (cursor = rb_first(&net->cells); cursor; cursor = rb_next(cursor)) { - struct afs_cell *cell = - rb_entry(cursor, struct afs_cell, net_node); - unsigned active; - bool sched_cell = false; - - active = atomic_read(&cell->active); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), - active, afs_cell_trace_manage); - - ASSERTCMP(active, >=, 1); - - if (purging) { - if (test_and_clear_bit(AFS_CELL_FL_NO_GC, &cell->flags)) { - active = atomic_dec_return(&cell->active); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), - active, afs_cell_trace_unuse_pin); - } - } - - if (active == 1) { - struct afs_vlserver_list *vllist; - time64_t expire_at = cell->last_inactive; - - read_lock(&cell->vl_servers_lock); - vllist = rcu_dereference_protected( - cell->vl_servers, - lockdep_is_held(&cell->vl_servers_lock)); - if (vllist->nr_servers > 0) - expire_at += afs_cell_gc_delay; - read_unlock(&cell->vl_servers_lock); - if (purging || expire_at <= now) - sched_cell = true; - else if (expire_at < next_manage) - next_manage = expire_at; - } - - if (!purging) { - if (test_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) - sched_cell = true; - } - - if (sched_cell) - afs_queue_cell(cell, afs_cell_trace_get_queue_manage); - } - - up_read(&net->cells_lock); - - /* Update the timer on the way out. We have to pass an increment on - * cells_outstanding in the namespace that we are in to the timer or - * the work scheduler. - */ - if (!purging && next_manage < TIME64_MAX) { - now = ktime_get_real_seconds(); - - if (next_manage - now <= 0) { - if (queue_work(afs_wq, &net->cells_manager)) - atomic_inc(&net->cells_outstanding); - } else { - afs_set_cell_timer(net, next_manage - now); - } - } - - afs_dec_cells_outstanding(net); - _leave(" [%d]", atomic_read(&net->cells_outstanding)); + afs_see_cell(cell, afs_cell_trace_manage); + final_put = afs_manage_cell(cell); + afs_see_cell(cell, afs_cell_trace_managed); + if (final_put) + afs_put_cell(cell, afs_cell_trace_put_final); } /* @@ -949,6 +863,7 @@ void afs_manage_cells(struct work_struct *work) void afs_cell_purge(struct afs_net *net) { struct afs_cell *ws; + struct rb_node *cursor; _enter(""); @@ -958,12 +873,19 @@ void afs_cell_purge(struct afs_net *net) up_write(&net->cells_lock); afs_unuse_cell(ws, afs_cell_trace_unuse_ws); - _debug("del timer"); - if (del_timer_sync(&net->cells_timer)) - atomic_dec(&net->cells_outstanding); + _debug("kick cells"); + down_read(&net->cells_lock); + for (cursor = rb_first(&net->cells); cursor; cursor = rb_next(cursor)) { + struct afs_cell *cell = rb_entry(cursor, struct afs_cell, net_node); + + afs_see_cell(cell, afs_cell_trace_purge); - _debug("kick mgr"); - afs_queue_cell_manager(net); + if (test_and_clear_bit(AFS_CELL_FL_NO_GC, &cell->flags)) + afs_unuse_cell(cell, afs_cell_trace_unuse_pin); + + afs_queue_cell(cell, afs_cell_trace_queue_purge); + } + up_read(&net->cells_lock); _debug("wait"); wait_var_event(&net->cells_outstanding, diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index a34b45f97d30..0b66865e3535 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -282,8 +282,8 @@ static int afs_dynroot_readdir_cells(struct afs_net *net, struct dir_context *ct cell = idr_get_next(&net->cells_dyn_ino, &ix); if (!cell) return 0; - if (READ_ONCE(cell->state) == AFS_CELL_FAILED || - READ_ONCE(cell->state) == AFS_CELL_REMOVED) { + if (READ_ONCE(cell->state) == AFS_CELL_REMOVING || + READ_ONCE(cell->state) == AFS_CELL_DEAD) { ctx->pos += 2; ctx->pos &= ~1; continue; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index bbc272caf31b..addce2f03562 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -289,8 +289,6 @@ struct afs_net { struct rb_root cells; struct idr cells_dyn_ino; /* cell->dynroot_ino mapping */ struct afs_cell *ws_cell; - struct work_struct cells_manager; - struct timer_list cells_timer; atomic_t cells_outstanding; struct rw_semaphore cells_lock; struct mutex cells_alias_lock; @@ -339,13 +337,10 @@ struct afs_net { extern const char afs_init_sysname[]; enum afs_cell_state { - AFS_CELL_UNSET, - AFS_CELL_ACTIVATING, + AFS_CELL_SETTING_UP, AFS_CELL_ACTIVE, - AFS_CELL_DEACTIVATING, - AFS_CELL_INACTIVE, - AFS_CELL_FAILED, - AFS_CELL_REMOVED, + AFS_CELL_REMOVING, + AFS_CELL_DEAD, }; /* @@ -376,7 +371,9 @@ struct afs_cell { struct afs_cell *alias_of; /* The cell this is an alias of */ struct afs_volume *root_volume; /* The root.cell volume if there is one */ struct key *anonymous_key; /* anonymous user key for this cell */ + struct work_struct destroyer; /* Destroyer for cell */ struct work_struct manager; /* Manager for init/deinit/dns */ + struct timer_list management_timer; /* General management timer */ struct hlist_node proc_link; /* /proc cell list link */ time64_t dns_expiry; /* Time AFSDB/SRV record expires */ time64_t last_inactive; /* Time of last drop of usage count */ @@ -1053,8 +1050,7 @@ extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_see_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_put_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_queue_cell(struct afs_cell *, enum afs_cell_trace); -extern void afs_manage_cells(struct work_struct *); -extern void afs_cells_timer(struct timer_list *); +void afs_set_cell_timer(struct afs_cell *cell, unsigned int delay_secs); extern void __net_exit afs_cell_purge(struct afs_net *); /* diff --git a/fs/afs/main.c b/fs/afs/main.c index bff0363286b0..c845c5daaeba 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -78,9 +78,6 @@ static int __net_init afs_net_init(struct net *net_ns) net->cells = RB_ROOT; idr_init(&net->cells_dyn_ino); init_rwsem(&net->cells_lock); - INIT_WORK(&net->cells_manager, afs_manage_cells); - timer_setup(&net->cells_timer, afs_cells_timer, 0); - mutex_init(&net->cells_alias_lock); mutex_init(&net->proc_cells_lock); INIT_HLIST_HEAD(&net->proc_cells); diff --git a/fs/afs/server.c b/fs/afs/server.c index 487e2134aea4..c530d1ca15df 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -103,7 +103,7 @@ static struct afs_server *afs_install_server(struct afs_cell *cell, afs_get_cell(cell, afs_cell_trace_get_server); exists: - afs_use_server(server, true, afs_server_trace_get_install); + afs_use_server(server, true, afs_server_trace_use_install); return server; } @@ -356,7 +356,7 @@ void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, if (atomic_dec_and_test(&server->active)) { if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) || - READ_ONCE(server->cell->state) >= AFS_CELL_FAILED) + READ_ONCE(server->cell->state) >= AFS_CELL_REMOVING) schedule_work(&server->destroyer); } @@ -374,7 +374,7 @@ void afs_unuse_server(struct afs_net *net, struct afs_server *server, if (atomic_dec_and_test(&server->active)) { if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) && - READ_ONCE(server->cell->state) < AFS_CELL_FAILED) { + READ_ONCE(server->cell->state) < AFS_CELL_REMOVING) { time64_t unuse_time = ktime_get_real_seconds(); server->unuse_time = unuse_time; @@ -424,7 +424,7 @@ static bool afs_has_server_expired(const struct afs_server *server) return false; if (server->cell->net->live || - server->cell->state >= AFS_CELL_FAILED) { + server->cell->state >= AFS_CELL_REMOVING) { trace_afs_server(server->debug_id, refcount_read(&server->ref), 0, afs_server_trace_purging); return true; diff --git a/fs/afs/vl_rotate.c b/fs/afs/vl_rotate.c index d8f79f6ada3d..6ad9688d8f4b 100644 --- a/fs/afs/vl_rotate.c +++ b/fs/afs/vl_rotate.c @@ -48,7 +48,7 @@ static bool afs_start_vl_iteration(struct afs_vl_cursor *vc) cell->dns_expiry <= ktime_get_real_seconds()) { dns_lookup_count = smp_load_acquire(&cell->dns_lookup_count); set_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags); - afs_queue_cell(cell, afs_cell_trace_get_queue_dns); + afs_queue_cell(cell, afs_cell_trace_queue_dns); if (cell->dns_source == DNS_RECORD_UNAVAILABLE) { if (wait_var_event_interruptible( diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 02f8b2a6977c..8857f5ea77d4 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -131,7 +131,6 @@ enum yfs_cm_operation { EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ EM(afs_server_trace_gc, "GC ") \ - EM(afs_server_trace_get_install, "GET inst ") \ EM(afs_server_trace_get_probe, "GET probe") \ EM(afs_server_trace_purging, "PURGE ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ @@ -149,6 +148,7 @@ enum yfs_cm_operation { EM(afs_server_trace_use_cm_call, "USE cm-cl") \ EM(afs_server_trace_use_get_caps, "USE gcaps") \ EM(afs_server_trace_use_give_up_cb, "USE gvupc") \ + EM(afs_server_trace_use_install, "USE inst ") \ E_(afs_server_trace_wait_create, "WAIT crt ") #define afs_volume_traces \ @@ -171,37 +171,36 @@ enum yfs_cm_operation { #define afs_cell_traces \ EM(afs_cell_trace_alloc, "ALLOC ") \ + EM(afs_cell_trace_destroy, "DESTROY ") \ EM(afs_cell_trace_free, "FREE ") \ EM(afs_cell_trace_get_atcell, "GET atcell") \ - EM(afs_cell_trace_get_queue_dns, "GET q-dns ") \ - EM(afs_cell_trace_get_queue_manage, "GET q-mng ") \ - EM(afs_cell_trace_get_queue_new, "GET q-new ") \ EM(afs_cell_trace_get_server, "GET server") \ EM(afs_cell_trace_get_vol, "GET vol ") \ - EM(afs_cell_trace_insert, "INSERT ") \ - EM(afs_cell_trace_manage, "MANAGE ") \ + EM(afs_cell_trace_purge, "PURGE ") \ EM(afs_cell_trace_put_atcell, "PUT atcell") \ EM(afs_cell_trace_put_candidate, "PUT candid") \ - EM(afs_cell_trace_put_destroy, "PUT destry") \ - EM(afs_cell_trace_put_queue_work, "PUT q-work") \ - EM(afs_cell_trace_put_queue_fail, "PUT q-fail") \ + EM(afs_cell_trace_put_final, "PUT final ") \ EM(afs_cell_trace_put_server, "PUT server") \ EM(afs_cell_trace_put_vol, "PUT vol ") \ + EM(afs_cell_trace_queue_again, "QUE again ") \ + EM(afs_cell_trace_queue_dns, "QUE dns ") \ + EM(afs_cell_trace_queue_new, "QUE new ") \ + EM(afs_cell_trace_queue_purge, "QUE purge ") \ + EM(afs_cell_trace_manage, "MANAGE ") \ + EM(afs_cell_trace_managed, "MANAGED ") \ EM(afs_cell_trace_see_source, "SEE source") \ - EM(afs_cell_trace_see_ws, "SEE ws ") \ + EM(afs_cell_trace_see_mgmt_timer, "SEE mtimer") \ EM(afs_cell_trace_unuse_alias, "UNU alias ") \ EM(afs_cell_trace_unuse_check_alias, "UNU chk-al") \ EM(afs_cell_trace_unuse_delete, "UNU delete") \ EM(afs_cell_trace_unuse_dynroot_mntpt, "UNU dyn-mp") \ EM(afs_cell_trace_unuse_fc, "UNU fc ") \ - EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ EM(afs_cell_trace_unuse_lookup_error, "UNU lu-err") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ EM(afs_cell_trace_unuse_pin, "UNU pin ") \ - EM(afs_cell_trace_unuse_probe, "UNU probe ") \ EM(afs_cell_trace_unuse_sbi, "UNU sbi ") \ EM(afs_cell_trace_unuse_ws, "UNU ws ") \ EM(afs_cell_trace_use_alias, "USE alias ") \