From patchwork Thu Jul 6 20:46:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maciej Fijalkowski X-Patchwork-Id: 13304121 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14CDD10968; Thu, 6 Jul 2023 20:47:28 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E84F91992; Thu, 6 Jul 2023 13:47:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1688676446; x=1720212446; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b0Z1eG6V9O9yIRt1/MJBAoIEhz5a/0JTz49KoO+l28A=; b=ZV1qgCSR0OCSw0XVGBNa/42Lr0u3VvhkJQmuGBtINj2eYvBcuuqalOUv E9TzYLylPbqTooAbA6femkNMI6At854Zkhtf91nyjk5pCyIZBxW5s4+aW JN3+FVR2w9rU0dq9WQXXK5Jn5/BoTH8ZzHfmR7XlMZT/++QFaEBs1u4h1 MU0TVskSFl7e9eDIFttW4lin+BiukCeRRGGUwwZtNCYJaO+xRQQIuC7lt EoRwW+N63ofnS73QXl9Flo18GTrvEU3UsmE62UzPLPxHP7yGOegsEMIRR SemSCEtCMKlUWxr7oqSov7YW6+4ZGfMg3RyHpAHX7J7WcnMkvvs7FhtAu A==; X-IronPort-AV: E=McAfee;i="6600,9927,10763"; a="361195261" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="361195261" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2023 13:47:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10763"; a="785059516" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="785059516" Received: from boxer.igk.intel.com ([10.102.20.173]) by fmsmga008.fm.intel.com with ESMTP; 06 Jul 2023 13:47:24 -0700 From: Maciej Fijalkowski To: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org Cc: netdev@vger.kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, tirthendu.sarkar@intel.com, maciej.fijalkowski@intel.com, simon.horman@corigine.com, toke@kernel.org Subject: [PATCH v5 bpf-next 02/24] xsk: introduce XSK_USE_SG bind flag for xsk socket Date: Thu, 6 Jul 2023 22:46:28 +0200 Message-Id: <20230706204650.469087-3-maciej.fijalkowski@intel.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230706204650.469087-1-maciej.fijalkowski@intel.com> References: <20230706204650.469087-1-maciej.fijalkowski@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net From: Tirthendu Sarkar As of now xsk core drops any xdp_buff with data size greater than the xsk frame_size as set by the af_xdp application. With multi-buffer support introduced in the next patch xsk core can now split those buffers into multiple descriptors provided the af_xdp application can handle them. Such capability of the application needs to be independent of the xdp_prog's frag support capability since there are cases where even a single xdp_buffer may need to be split into multiple descriptors owing to a smaller xsk frame size. For e.g., with NIC rx_buffer size set to 4kB, a 3kB packet will constitute of a single buffer and so will be sent as such to AF_XDP layer irrespective of 'xdp.frags' capability of the XDP program. Now if the xsk frame size is set to 2kB by the AF_XDP application, then the packet will need to be split into 2 descriptors if AF_XDP application can handle multi-buffer, else it needs to be dropped. Applications can now advertise their frag handling capability to xsk core so that xsk core can decide if it should drop or split xdp_buffs that exceed xsk frame size. This is done using a new 'XSK_USE_SG' bind flag for the xdp socket. Signed-off-by: Tirthendu Sarkar --- include/net/xdp_sock.h | 1 + include/uapi/linux/if_xdp.h | 6 ++++++ net/xdp/xsk.c | 5 +++-- 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index e96a1151ec75..36b0411a0d1b 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -52,6 +52,7 @@ struct xdp_sock { struct xsk_buff_pool *pool; u16 queue_id; bool zc; + bool sg; enum { XSK_READY = 0, XSK_BOUND, diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 434f313dc26c..8d48863472b9 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -25,6 +25,12 @@ * application. */ #define XDP_USE_NEED_WAKEUP (1 << 3) +/* By setting this option, userspace application indicates that it can + * handle multiple descriptors per packet thus enabling AF_XDP to split + * multi-buffer XDP frames into multiple Rx descriptors. Without this set + * such frames will be dropped. + */ +#define XDP_USE_SG (1 << 4) /* Flags for xsk_umem_config flags */ #define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index b8f0af55793c..c8460927e5f3 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -896,7 +896,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) flags = sxdp->sxdp_flags; if (flags & ~(XDP_SHARED_UMEM | XDP_COPY | XDP_ZEROCOPY | - XDP_USE_NEED_WAKEUP)) + XDP_USE_NEED_WAKEUP | XDP_USE_SG)) return -EINVAL; rtnl_lock(); @@ -924,7 +924,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) struct socket *sock; if ((flags & XDP_COPY) || (flags & XDP_ZEROCOPY) || - (flags & XDP_USE_NEED_WAKEUP)) { + (flags & XDP_USE_NEED_WAKEUP) || (flags & XDP_USE_SG)) { /* Cannot specify flags for shared sockets. */ err = -EINVAL; goto out_unlock; @@ -1023,6 +1023,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) xs->dev = dev; xs->zc = xs->umem->zc; + xs->sg = !!(flags & XDP_USE_SG); xs->queue_id = qid; xp_add_xsk(xs->pool, xs);