From patchwork Tue Apr 12 20:26:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12811243
X-Patchwork-Delegate: kuba@kernel.org
From: Jens Axboe
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/4] net: add sock 'sk_no_lock' member
Date: Tue, 12 Apr 2022 14:26:10 -0600
Message-Id: <20220412202613.234896-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220412202613.234896-1-axboe@kernel.dk>
References: <20220412202613.234896-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

In preparation for allowing lockless access to the socket for specialized
use cases, add a member denoting that the socket supports this.

No functional changes in this patch.

Signed-off-by: Jens Axboe
---
 include/net/sock.h | 3 +++
 net/core/sock.c | 1 +
 2 files changed, 4 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index c4b91fc19b9c..e8283a65b757 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -131,6 +131,7 @@ typedef __u64 __bitwise __addrpair;
  *	@skc_reuseport: %SO_REUSEPORT setting
  *	@skc_ipv6only: socket is IPV6 only
  *	@skc_net_refcnt: socket is using net ref counting
+ *	@skc_no_lock: socket is private, no locking needed
  *	@skc_bound_dev_if: bound device index if != 0
  *	@skc_bind_node: bind hash linkage for various protocol lookup tables
  *	@skc_portaddr_node: second hash linkage for UDP/UDP-Lite protocol
@@ -190,6 +191,7 @@ struct sock_common {
 	unsigned char skc_reuseport:1;
 	unsigned char skc_ipv6only:1;
 	unsigned char skc_net_refcnt:1;
+	unsigned char skc_no_lock:1;
 	int skc_bound_dev_if;
 	union {
 		struct hlist_node skc_bind_node;
@@ -382,6 +384,7 @@ struct sock {
 #define sk_reuseport __sk_common.skc_reuseport
 #define sk_ipv6only __sk_common.skc_ipv6only
 #define sk_net_refcnt __sk_common.skc_net_refcnt
+#define sk_no_lock __sk_common.skc_no_lock
 #define sk_bound_dev_if __sk_common.skc_bound_dev_if
 #define sk_bind_node __sk_common.skc_bind_node
 #define sk_prot __sk_common.skc_prot
diff --git a/net/core/sock.c b/net/core/sock.c
index 1180a0cb0110..fec892b384a4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2101,6 +2101,7 @@ EXPORT_SYMBOL(sk_free);
 
 static void sk_init_common(struct sock *sk)
 {
+	sk->sk_no_lock = false;
 	skb_queue_head_init(&sk->sk_receive_queue);
 	skb_queue_head_init(&sk->sk_write_queue);
 	skb_queue_head_init(&sk->sk_error_queue);

From patchwork Tue Apr 12 20:26:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12811288
X-Patchwork-Delegate: kuba@kernel.org
From: Jens Axboe
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/4] net: allow sk_prot->release_cb() without sock lock held
Date: Tue, 12 Apr 2022 14:26:11 -0600
Message-Id: <20220412202613.234896-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220412202613.234896-1-axboe@kernel.dk>
References: <20220412202613.234896-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Add helpers that allow ->release_cb() to acquire the socket bh lock when
needed. For normal sockets, ->release_cb() is always invoked with that
lock held. For nolock sockets it will not be held, so provide an easy
way to acquire it when necessary.
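
The helper pattern described above can be sketched in userspace roughly
as follows. This is an illustration only, not kernel code: struct
toy_sock, toy_spin_lock(), toy_lock_on_nolock(), toy_unlock_on_nolock(),
toy_release_cb() and toy_run() are hypothetical names, and a C11
atomic_flag spinlock stands in for sk_lock.slock. The callback takes the
lock only when the socket is marked no-lock, because in the normal case
the caller already holds it and taking it again would deadlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock; not a kernel structure. */
struct toy_sock {
	atomic_flag slock;	/* plays the role of sk_lock.slock */
	bool no_lock;		/* plays the role of sk_no_lock */
	int helper_locks;	/* acquisitions made by the nolock helper */
	int pending;		/* deferred work the callback drains */
};

static void toy_spin_lock(atomic_flag *lock)
{
	/* Spin until the previous value was clear, i.e. we took the lock. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void toy_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Take the lock only for no-lock sockets; other callers already hold it. */
static void toy_lock_on_nolock(struct toy_sock *sk)
{
	if (sk->no_lock) {
		toy_spin_lock(&sk->slock);
		sk->helper_locks++;
	}
}

static void toy_unlock_on_nolock(struct toy_sock *sk)
{
	if (sk->no_lock)
		toy_spin_unlock(&sk->slock);
}

/* Release callback: every conditional lock is paired with an unlock. */
static int toy_release_cb(struct toy_sock *sk)
{
	int drained;

	toy_lock_on_nolock(sk);
	drained = sk->pending;
	sk->pending = 0;
	toy_unlock_on_nolock(sk);
	return drained;
}

/* Run the callback as each kind of caller would; return helper lock count. */
static int toy_run(bool no_lock, int pending)
{
	struct toy_sock sk = { .no_lock = no_lock, .pending = pending };
	int drained;
	bool was_held;

	atomic_flag_clear(&sk.slock);
	if (!no_lock)
		toy_spin_lock(&sk.slock);	/* normal path: caller holds it */
	drained = toy_release_cb(&sk);
	assert(drained == pending);
	if (!no_lock)
		toy_spin_unlock(&sk.slock);
	/* The lock must be free again, whichever path was taken. */
	was_held = atomic_flag_test_and_set(&sk.slock);
	assert(!was_held);
	return sk.helper_locks;
}
```

In this sketch toy_run(true, 3) goes through the helper once, while
toy_run(false, 3) never does, since the caller's own lock already covers
the callback.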

Signed-off-by: Jens Axboe
---
 include/net/sock.h | 10 ++++++++++
 net/atm/common.c | 5 ++++-
 net/ipv4/tcp_output.c | 2 ++
 net/mptcp/protocol.c | 3 +++
 net/smc/af_smc.c | 2 ++
 5 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index e8283a65b757..99fcc4d7eed9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1696,6 +1696,16 @@ void release_sock(struct sock *sk);
 				SINGLE_DEPTH_NESTING)
 #define bh_unlock_sock(__sk) spin_unlock(&((__sk)->sk_lock.slock))
 
+/* nolock helpers */
+#define bh_lock_sock_on_nolock(__sk) do { \
+	if ((__sk)->sk_no_lock) \
+		spin_lock_bh(&(__sk)->sk_lock.slock); \
+} while (0)
+#define bh_unlock_sock_on_nolock(__sk) do { \
+	if ((__sk)->sk_no_lock) \
+		spin_unlock_bh(&(__sk)->sk_lock.slock); \
+} while (0)
+
 bool __lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock);
 
 /**
diff --git a/net/atm/common.c b/net/atm/common.c
index 1cfa9bf1d187..471363e929f6 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -126,8 +126,11 @@ static void vcc_release_cb(struct sock *sk)
 {
 	struct atm_vcc *vcc = atm_sk(sk);
 
-	if (vcc->release_cb)
+	if (vcc->release_cb) {
+		bh_lock_sock_on_nolock(sk);
 		vcc->release_cb(vcc);
+		bh_unlock_sock_on_nolock(sk);
+	}
 }
 
 static struct proto vcc_proto = {
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 9ede847f4199..9f86ea63cbac 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1100,6 +1100,7 @@ void tcp_release_cb(struct sock *sk)
 	 * But following code is meant to be called from BH handlers,
 	 * so we should keep BH disabled, but early release socket ownership
 	 */
+	bh_lock_sock_on_nolock(sk);
 	sock_release_ownership(sk);
 
 	if (flags & TCPF_WRITE_TIMER_DEFERRED) {
@@ -1114,6 +1115,7 @@ void tcp_release_cb(struct sock *sk)
 		inet_csk(sk)->icsk_af_ops->mtu_reduced(sk);
 		__sock_put(sk);
 	}
+	bh_unlock_sock_on_nolock(sk);
 }
 EXPORT_SYMBOL(tcp_release_cb);
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0cbea3b6d0a4..ae9078e8e137 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3065,6 +3065,8 @@ static void mptcp_release_cb(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
+	bh_lock_sock_on_nolock(sk);
+
 	for (;;) {
 		unsigned long flags = (msk->cb_flags & MPTCP_FLAGS_PROCESS_CTX_NEED) |
 				      msk->push_pending;
@@ -3103,6 +3105,7 @@ static void mptcp_release_cb(struct sock *sk)
 		__mptcp_error_report(sk);
 
 	__mptcp_update_rmem(sk);
+	bh_unlock_sock_on_nolock(sk);
 }
 
 /* MP_JOIN client subflow must wait for 4th ack before sending any data:
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index f0d118e9f155..3456dc6cd38b 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -201,10 +201,12 @@ static void smc_release_cb(struct sock *sk)
 {
 	struct smc_sock *smc = smc_sk(sk);
 
+	bh_lock_sock_on_nolock(sk);
 	if (smc->conn.tx_in_release_sock) {
 		smc_tx_pending(&smc->conn);
 		smc->conn.tx_in_release_sock = false;
 	}
+	bh_unlock_sock_on_nolock(sk);
 }
 
 struct proto smc_proto = {

From patchwork Tue Apr 12 20:26:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12811300
X-Patchwork-Delegate: kuba@kernel.org
From: Jens Axboe
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/4] net: add support for socket no-lock
Date: Tue, 12 Apr 2022 14:26:12 -0600
Message-Id: <20220412202613.234896-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220412202613.234896-1-axboe@kernel.dk>
References: <20220412202613.234896-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

If we have a guaranteed single user of a socket, then we can optimize
the lock/release of it.

Signed-off-by: Jens Axboe
---
 include/net/sock.h | 10 ++++++++--
 net/core/sock.c | 31 +++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 99fcc4d7eed9..aefc94677c94 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1674,7 +1674,7 @@ do { \
 
 static inline bool lockdep_sock_is_held(const struct sock *sk)
 {
-	return lockdep_is_held(&sk->sk_lock) ||
+	return sk->sk_no_lock || lockdep_is_held(&sk->sk_lock) ||
 	       lockdep_is_held(&sk->sk_lock.slock);
 }
 
@@ -1774,18 +1774,20 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
 static inline void sock_owned_by_me(const struct sock *sk)
 {
 #ifdef CONFIG_LOCKDEP
-	WARN_ON_ONCE(!lockdep_sock_is_held(sk) && debug_locks);
+	WARN_ON_ONCE(!sk->sk_no_lock && !lockdep_sock_is_held(sk) && debug_locks);
 #endif
 }
 
 static inline bool sock_owned_by_user(const struct sock *sk)
 {
 	sock_owned_by_me(sk);
+	smp_rmb();
 	return sk->sk_lock.owned;
 }
 
 static inline bool sock_owned_by_user_nocheck(const struct sock *sk)
 {
+	smp_rmb();
 	return sk->sk_lock.owned;
 }
 
@@ -1794,6 +1796,10 @@ static inline void sock_release_ownership(struct sock *sk)
 	if (sock_owned_by_user_nocheck(sk)) {
 		sk->sk_lock.owned = 0;
 
+		if (sk->sk_no_lock) {
+			smp_wmb();
+			return;
+		}
 		/* The sk_lock has mutex_unlock() semantics: */
 		mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
 	}
diff --git a/net/core/sock.c b/net/core/sock.c
index fec892b384a4..d7eea29c5699 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2764,6 +2764,9 @@ void __lock_sock(struct sock *sk)
 {
 	DEFINE_WAIT(wait);
 
+	if (WARN_ON_ONCE(sk->sk_no_lock))
+		return;
+
 	for (;;) {
 		prepare_to_wait_exclusive(&sk->sk_lock.wq, &wait,
 					TASK_UNINTERRUPTIBLE);
@@ -3307,8 +3310,21 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 }
 EXPORT_SYMBOL(sock_init_data);
 
+static inline bool lock_sock_nolock(struct sock *sk)
+{
+	if (sk->sk_no_lock) {
+		sk->sk_lock.owned = 1;
+		smp_wmb();
+		return true;
+	}
+	return false;
+}
+
 void lock_sock_nested(struct sock *sk, int subclass)
 {
+	if (lock_sock_nolock(sk))
+		return;
+
 	/* The sk_lock has mutex_lock() semantics here. */
 	mutex_acquire(&sk->sk_lock.dep_map, subclass, 0, _RET_IP_);
 
@@ -3321,8 +3337,23 @@ void lock_sock_nested(struct sock *sk, int subclass)
 }
 EXPORT_SYMBOL(lock_sock_nested);
 
+static inline bool release_sock_nolock(struct sock *sk)
+{
+	if (!sk->sk_no_lock)
+		return false;
+	if (READ_ONCE(sk->sk_backlog.tail))
+		return false;
+	if (sk->sk_prot->release_cb)
+		sk->sk_prot->release_cb(sk);
+	sock_release_ownership(sk);
+	return true;
+}
+
 void release_sock(struct sock *sk)
 {
+	if (release_sock_nolock(sk))
+		return;
+
 	spin_lock_bh(&sk->sk_lock.slock);
 	if (sk->sk_backlog.tail)
 		__release_sock(sk);

From patchwork Tue Apr 12 20:26:13 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12811272
X-Patchwork-Delegate: kuba@kernel.org
From: Jens Axboe
To: io-uring@vger.kernel.org, netdev@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/4] io_uring: mark accept direct socket as no-lock
Date: Tue, 12 Apr 2022 14:26:13 -0600
Message-Id: <20220412202613.234896-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220412202613.234896-1-axboe@kernel.dk>
References: <20220412202613.234896-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Mark a socket as nolock if we're accepting it directly, e.g. without
installing it into the process file table.

For direct issue or task_work issue, we already grab the uring_lock
for those, and hence they are serializing access to the socket for
send/recv already. The only case where we don't always grab the lock
is for async issue. Add a helper to ensure that it gets done if this
is a nolock socket.

Signed-off-by: Jens Axboe
---
 fs/io_uring.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0a6bcc077637..17b4dc9f130f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5918,6 +5918,19 @@ static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
+/*
+ * Mark the socket as not needing locking, io_uring will serialize access
+ * to it. Note there's no matching clear of this condition, as this is only
+ * applicable for a fixed/registered file, and those go away when we unregister
+ * anyway.
+ */
+static void io_sock_nolock_set(struct file *file)
+{
+	struct sock *sk = sock_from_file(file)->sk;
+
+	sk->sk_no_lock = true;
+}
+
 static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_accept *accept = &req->accept;
@@ -5947,6 +5960,7 @@ static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
 		fd_install(fd, file);
 		ret = fd;
 	} else {
+		io_sock_nolock_set(file);
 		ret = io_install_fixed_file(req, file, issue_flags,
 					    accept->file_slot - 1);
 	}
@@ -7604,11 +7618,31 @@ static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
 	return req ? &req->work : NULL;
 }
 
+/*
+ * This could be improved with an FFS flag, but since it's only done for
+ * the slower path of io-wq offload, no point in optimizing it further.
+ */
+static bool io_req_needs_lock(struct io_kiocb *req)
+{
+#if defined(CONFIG_NET)
+	struct socket *sock;
+
+	if (!req->file)
+		return false;
+
+	sock = sock_from_file(req->file);
+	if (sock && sock->sk->sk_no_lock)
+		return true;
+#endif
+	return false;
+}
+
 static void io_wq_submit_work(struct io_wq_work *work)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	const struct io_op_def *def = &io_op_defs[req->opcode];
 	unsigned int issue_flags = IO_URING_F_UNLOCKED;
+	struct io_ring_ctx *ctx = req->ctx;
 	bool needs_poll = false;
 	struct io_kiocb *timeout;
 	int ret = 0, err = -ECANCELED;
@@ -7645,6 +7679,11 @@ static void io_wq_submit_work(struct io_wq_work *work)
 		}
 	}
 
+	if (io_req_needs_lock(req)) {
+		mutex_lock(&ctx->uring_lock);
+		issue_flags &= ~IO_URING_F_UNLOCKED;
+	}
+
 	do {
 		ret = io_issue_sqe(req, issue_flags);
 		if (ret != -EAGAIN)
@@ -7659,8 +7698,10 @@ static void io_wq_submit_work(struct io_wq_work *work)
 				continue;
 			}
 
-			if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK)
-				return;
+			if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK) {
+				ret = 0;
+				break;
+			}
 			/* aborted or ready, in either case retry blocking */
 			needs_poll = false;
 			issue_flags &= ~IO_URING_F_NONBLOCK;
@@ -7669,6 +7710,9 @@ static void io_wq_submit_work(struct io_wq_work *work)
 	/* avoid locking problems by failing it from a clean context */
 	if (ret)
 		io_req_task_queue_fail(req, ret);
+
+	if (!(issue_flags & IO_URING_F_UNLOCKED))
+		mutex_unlock(&ctx->uring_lock);
 }
 
 static inline struct io_fixed_file *io_fixed_file_slot(struct io_file_table *table,
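
The locking scheme that patches 3 and 4 describe can be modelled in
userspace roughly as below. This is a sketch under stated assumptions,
not the kernel implementation: struct toy_lock, struct toy_sock, struct
toy_ring, toy_set_owned(), toy_worker_issue() and toy_count() are all
hypothetical, a counting C11 atomic_flag spinlock stands in for both
sk_lock.slock and uring_lock (the latter is a mutex in the kernel), and
the backlog check and memory barriers from the patches are left out. A
no-lock socket only flips its owned flag, so the async worker path has
to hold the ring lock to keep access serialized.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock that counts acquisitions; stands in for kernel locks. */
struct toy_lock {
	atomic_flag flag;
	int acquisitions;
};

static void toy_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
		;
	l->acquisitions++;
}

static void toy_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Hypothetical stand-ins for struct sock and struct io_ring_ctx. */
struct toy_sock {
	struct toy_lock slock;
	bool no_lock;
	bool owned;
};

struct toy_ring {
	struct toy_lock uring_lock;
};

/* Mark or clear ownership; no-lock sockets skip their own lock entirely. */
static void toy_set_owned(struct toy_sock *sk, bool owned)
{
	if (sk->no_lock) {
		sk->owned = owned;
		return;
	}
	toy_acquire(&sk->slock);
	sk->owned = owned;
	toy_release(&sk->slock);
}

/* Async worker issue: the ring lock serializes users of a no-lock socket. */
static void toy_worker_issue(struct toy_ring *ring, struct toy_sock *sk)
{
	bool ring_locked = sk->no_lock;

	if (ring_locked)
		toy_acquire(&ring->uring_lock);
	toy_set_owned(sk, true);	/* models lock_sock() */
	assert(sk->owned);		/* send/recv would run here */
	toy_set_owned(sk, false);	/* models release_sock() */
	if (ring_locked)
		toy_release(&ring->uring_lock);
}

/* Count lock acquisitions for one worker issue on either kind of socket. */
static int toy_count(bool no_lock, bool want_ring)
{
	struct toy_ring ring = { .uring_lock = { .acquisitions = 0 } };
	struct toy_sock sk = { .no_lock = no_lock };

	atomic_flag_clear(&ring.uring_lock.flag);
	atomic_flag_clear(&sk.slock.flag);
	toy_worker_issue(&ring, &sk);
	return want_ring ? ring.uring_lock.acquisitions : sk.slock.acquisitions;
}
```

In this model a no-lock socket costs one ring-lock acquisition and no
socket-lock acquisitions per worker issue, while a normal socket takes
its own lock twice (once to claim ownership, once to drop it) and never
touches the ring lock.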