From patchwork Fri Feb 3 13:27:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aurelien Aptel
X-Patchwork-Id: 13127530
X-Patchwork-Delegate: kuba@kernel.org
From: Aurelien Aptel
To: linux-nvme@lists.infradead.org, netdev@vger.kernel.org, sagi@grimberg.me, hch@lst.de, kbusch@kernel.org, axboe@fb.com, chaitanyak@nvidia.com, davem@davemloft.net, kuba@kernel.org
Cc: Aurelien Aptel , aurelien.aptel@gmail.com, smalin@nvidia.com, malin1024@gmail.com, ogerlitz@nvidia.com, yorayz@nvidia.com, borisp@nvidia.com
Subject: [PATCH v11 24/25] net/mlx5e: NVMEoTCP, data-path for DDP+DDGST offload
Date: Fri, 3 Feb 2023 15:27:04 +0200
Message-Id: <20230203132705.627232-25-aaptel@nvidia.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230203132705.627232-1-aaptel@nvidia.com>
References: <20230203132705.627232-1-aaptel@nvidia.com>
Precedence: bulk
List-ID: 
X-Mailing-List: netdev@vger.kernel.org

From: Ben Ben-Ishay

This patch implements the data-path for direct data placement (DDP) and
DDGST offloads. NVMEoTCP DDP constructs an SKB from each CQE, while
pointing at NVME destination buffers. In turn, this enables the offload,
as the NVMe-TCP layer will skip the copy when src == dst.

Additionally, this patch adds support for DDGST (CRC32) offload. HW will
report DDGST offload only if it has not encountered an error in the
received packet. We pass this indication in skb->ulp_crc up the stack to
NVMe-TCP to skip computing the DDGST if all corresponding SKBs were
verified by HW.

This patch also handles context resynchronization requests made by the
NIC HW. The resync request is passed to the NVMe-TCP layer to be handled
at a later point in time.

Finally, we also use the skb->ulp_ddp bit to avoid skb_condense. This is
critical as every SKB that uses DDP has a hole that fits perfectly with
skb_condense's policy, but filling this hole is counter-productive as the
data there already resides in its destination buffer.

This work has been done on a pre-silicon functional simulator, and hence
data-path performance numbers are not provided.
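Not part of the patch itself -- a minimal sketch of the consumer side
described above. skb->ulp_ddp, skb->ulp_crc and skb_copy_bits() come from
the series/kernel; the queue type, field and function names below are made
up purely for illustration.

#include <linux/skbuff.h>

struct example_ulp_queue {
	bool ddgst_valid;	/* false once any skb of the PDU lacks HW CRC verification */
};

static int example_ulp_recv_data(struct example_ulp_queue *q, struct sk_buff *skb,
				 unsigned int offset, void *dst, unsigned int len)
{
	/* DDGST offload: skip the software CRC only if every skb was verified by HW */
	q->ddgst_valid &= skb->ulp_crc;

	/* DDP offload: the payload already resides in the destination buffer
	 * (src == dst), so the copy can be skipped.
	 */
	if (skb->ulp_ddp)
		return 0;

	return skb_copy_bits(skb, offset, dst, len);
}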
Signed-off-by: Ben Ben-Ishay
Signed-off-by: Boris Pismenny
Signed-off-by: Or Gerlitz
Signed-off-by: Yoray Zack
Signed-off-by: Aurelien Aptel
Reviewed-by: Tariq Toukan
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../mlx5/core/en_accel/nvmeotcp_rxtx.c        | 316 ++++++++++++++++++
 .../mlx5/core/en_accel/nvmeotcp_rxtx.h        |  37 ++
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  40 ++-
 4 files changed, 384 insertions(+), 11 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 9df9999047d1..9804bd086bf4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -103,7 +103,7 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \
 				   en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \
 				   en_accel/ktls_tx.o en_accel/ktls_rx.o
 
-mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o
+mlx5_core-$(CONFIG_MLX5_EN_NVMEOTCP) += en_accel/fs_tcp.o en_accel/nvmeotcp.o en_accel/nvmeotcp_rxtx.o
 
 mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \
 					steering/dr_matcher.o steering/dr_rule.o \

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c
new file mode 100644
index 000000000000..4c7dab28ef56
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c
@@ -0,0 +1,316 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.
+
+#include "en_accel/nvmeotcp_rxtx.h"
+#include 
+
+#define MLX5E_TC_FLOW_ID_MASK 0x00ffffff
+static void nvmeotcp_update_resync(struct mlx5e_nvmeotcp_queue *queue,
+				   struct mlx5e_cqe128 *cqe128)
+{
+	const struct ulp_ddp_ulp_ops *ulp_ops;
+	u32 seq;
+
+	seq = be32_to_cpu(cqe128->resync_tcp_sn);
+	ulp_ops = inet_csk(queue->sk)->icsk_ulp_ddp_ops;
+	if (ulp_ops && ulp_ops->resync_request)
+		ulp_ops->resync_request(queue->sk, seq, ULP_DDP_RESYNC_PENDING);
+}
+
+static void mlx5e_nvmeotcp_advance_sgl_iter(struct mlx5e_nvmeotcp_queue *queue)
+{
+	struct mlx5e_nvmeotcp_queue_entry *nqe = &queue->ccid_table[queue->ccid];
+
+	queue->ccoff += nqe->sgl[queue->ccsglidx].length;
+	queue->ccoff_inner = 0;
+	queue->ccsglidx++;
+}
+
+static inline void
+mlx5e_nvmeotcp_add_skb_frag(struct net_device *netdev, struct sk_buff *skb,
+			    struct mlx5e_nvmeotcp_queue *queue,
+			    struct mlx5e_nvmeotcp_queue_entry *nqe, u32 fragsz)
+{
+	dma_sync_single_for_cpu(&netdev->dev,
+				nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner,
+				fragsz, DMA_FROM_DEVICE);
+	page_ref_inc(compound_head(sg_page(&nqe->sgl[queue->ccsglidx])));
+	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+			sg_page(&nqe->sgl[queue->ccsglidx]),
+			nqe->sgl[queue->ccsglidx].offset + queue->ccoff_inner,
+			fragsz,
+			fragsz);
+}
+
+static inline void
+mlx5_nvmeotcp_add_tail_nonlinear(struct mlx5e_nvmeotcp_queue *queue,
+				 struct sk_buff *skb, skb_frag_t *org_frags,
+				 int org_nr_frags, int frag_index)
+{
+	while (org_nr_frags != frag_index) {
+		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+				skb_frag_page(&org_frags[frag_index]),
+				skb_frag_off(&org_frags[frag_index]),
+				skb_frag_size(&org_frags[frag_index]),
+				skb_frag_size(&org_frags[frag_index]));
+		page_ref_inc(skb_frag_page(&org_frags[frag_index]));
+		frag_index++;
+	}
+}
+
+static void
+mlx5_nvmeotcp_add_tail(struct mlx5e_nvmeotcp_queue *queue, struct sk_buff *skb,
+		       int offset, int len)
+{
+	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, virt_to_page(skb->data), offset, len,
+			len);
+	page_ref_inc(virt_to_page(skb->data));
+}
+
+static void mlx5_nvmeotcp_trim_nonlinear(struct sk_buff *skb, skb_frag_t *org_frags,
+					 int *frag_index, int remaining)
+{
+	unsigned int frag_size;
+	int nr_frags;
+
+	/* skip @remaining bytes in frags */
+	*frag_index = 0;
+	while (remaining) {
+		frag_size = skb_frag_size(&skb_shinfo(skb)->frags[*frag_index]);
+		if (frag_size > remaining) {
+			skb_frag_off_add(&skb_shinfo(skb)->frags[*frag_index],
+					 remaining);
+			skb_frag_size_sub(&skb_shinfo(skb)->frags[*frag_index],
+					  remaining);
+			remaining = 0;
+		} else {
+			remaining -= frag_size;
+			skb_frag_unref(skb, *frag_index);
+			*frag_index += 1;
+		}
+	}
+
+	/* save original frags for the tail and unref */
+	nr_frags = skb_shinfo(skb)->nr_frags;
+	memcpy(&org_frags[*frag_index], &skb_shinfo(skb)->frags[*frag_index],
+	       (nr_frags - *frag_index) * sizeof(skb_frag_t));
+	while (--nr_frags >= *frag_index)
+		skb_frag_unref(skb, nr_frags);
+
+	/* remove frags from skb */
+	skb_shinfo(skb)->nr_frags = 0;
+	skb->len -= skb->data_len;
+	skb->truesize -= skb->data_len;
+	skb->data_len = 0;
+}
+
+static bool
+mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(struct mlx5e_rq *rq, struct sk_buff *skb,
+					struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{
+	int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0;
+	struct net_device *netdev = rq->netdev;
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_nvmeotcp_queue_entry *nqe;
+	skb_frag_t org_frags[MAX_SKB_FRAGS];
+	struct mlx5e_nvmeotcp_queue *queue;
+	int org_nr_frags, frag_index;
+	struct mlx5e_cqe128 *cqe128;
+	u32 queue_id;
+
+	queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK);
+	queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id);
+	if (unlikely(!queue)) {
+		dev_kfree_skb_any(skb);
+		return false;
+	}
+
+	cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64);
+	if (cqe_is_nvmeotcp_resync(cqe)) {
+		nvmeotcp_update_resync(queue, cqe128);
+		mlx5e_nvmeotcp_put_queue(queue);
+		return true;
+	}
+
+	/* If a resync occurred in the previous cqe,
+	 * the current cqe.crcvalid bit may not be valid,
+	 * so we will treat it as 0
+	 */
+	if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) {
+		skb->ulp_crc = 0;
+		queue->after_resync_cqe = 0;
+	} else {
+		if (queue->crc_rx)
+			skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe);
+	}
+
+	skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe);
+	if (!cqe_is_nvmeotcp_zc(cqe)) {
+		mlx5e_nvmeotcp_put_queue(queue);
+		return true;
+	}
+
+	/* cc ddp from cqe */
+	ccid = be16_to_cpu(cqe128->ccid);
+	ccoff = be32_to_cpu(cqe128->ccoff);
+	cclen = be16_to_cpu(cqe128->cclen);
+	hlen = be16_to_cpu(cqe128->hlen);
+
+	/* carve a hole in the skb for DDP data */
+	org_nr_frags = skb_shinfo(skb)->nr_frags;
+	mlx5_nvmeotcp_trim_nonlinear(skb, org_frags, &frag_index, cclen);
+	nqe = &queue->ccid_table[ccid];
+
+	/* packet starts new ccid? */
+	if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) {
+		queue->ccid = ccid;
+		queue->ccoff = 0;
+		queue->ccoff_inner = 0;
+		queue->ccsglidx = 0;
+		queue->ccid_gen = nqe->ccid_gen;
+	}
+
+	/* skip inside cc until the ccoff in the cqe */
+	while (queue->ccoff + queue->ccoff_inner < ccoff) {
+		remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner;
+		fragsz = min_t(off_t, remaining,
+			       ccoff - (queue->ccoff + queue->ccoff_inner));
+
+		if (fragsz == remaining)
+			mlx5e_nvmeotcp_advance_sgl_iter(queue);
+		else
+			queue->ccoff_inner += fragsz;
+	}
+
+	/* adjust the skb according to the cqe cc */
+	while (to_copy < cclen) {
+		remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner;
+		fragsz = min_t(int, remaining, cclen - to_copy);
+
+		mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz);
+		to_copy += fragsz;
+		if (fragsz == remaining)
+			mlx5e_nvmeotcp_advance_sgl_iter(queue);
+		else
+			queue->ccoff_inner += fragsz;
+	}
+
+	if (cqe_bcnt > hlen + cclen) {
+		remaining = cqe_bcnt - hlen - cclen;
+		mlx5_nvmeotcp_add_tail_nonlinear(queue, skb, org_frags,
+						 org_nr_frags,
+						 frag_index);
+	}
+
+	mlx5e_nvmeotcp_put_queue(queue);
+	return true;
+}
+
+static bool
+mlx5e_nvmeotcp_rebuild_rx_skb_linear(struct mlx5e_rq *rq, struct sk_buff *skb,
+				     struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{
+	int ccoff, cclen, hlen, ccid, remaining, fragsz, to_copy = 0;
+	struct net_device *netdev = rq->netdev;
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_nvmeotcp_queue_entry *nqe;
+	struct mlx5e_nvmeotcp_queue *queue;
+	struct mlx5e_cqe128 *cqe128;
+	u32 queue_id;
+
+	queue_id = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK);
+	queue = mlx5e_nvmeotcp_get_queue(priv->nvmeotcp, queue_id);
+	if (unlikely(!queue)) {
+		dev_kfree_skb_any(skb);
+		return false;
+	}
+
+	cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64);
+	if (cqe_is_nvmeotcp_resync(cqe)) {
+		nvmeotcp_update_resync(queue, cqe128);
+		mlx5e_nvmeotcp_put_queue(queue);
+		return true;
+	}
+
+	/* If a resync occurred in the previous cqe,
+	 * the current cqe.crcvalid bit may not be valid,
+	 * so we will treat it as 0
+	 */
+	if (unlikely(queue->after_resync_cqe) && cqe_is_nvmeotcp_crcvalid(cqe)) {
+		skb->ulp_crc = 0;
+		queue->after_resync_cqe = 0;
+	} else {
+		if (queue->crc_rx)
+			skb->ulp_crc = cqe_is_nvmeotcp_crcvalid(cqe);
+	}
+
+	skb->ulp_ddp = cqe_is_nvmeotcp_zc(cqe);
+	if (!cqe_is_nvmeotcp_zc(cqe)) {
+		mlx5e_nvmeotcp_put_queue(queue);
+		return true;
+	}
+
+	/* cc ddp from cqe */
+	ccid = be16_to_cpu(cqe128->ccid);
+	ccoff = be32_to_cpu(cqe128->ccoff);
+	cclen = be16_to_cpu(cqe128->cclen);
+	hlen = be16_to_cpu(cqe128->hlen);
+
+	/* carve a hole in the skb for DDP data */
+	skb_trim(skb, hlen);
+	nqe = &queue->ccid_table[ccid];
+
+	/* packet starts new ccid? */
+	if (queue->ccid != ccid || queue->ccid_gen != nqe->ccid_gen) {
+		queue->ccid = ccid;
+		queue->ccoff = 0;
+		queue->ccoff_inner = 0;
+		queue->ccsglidx = 0;
+		queue->ccid_gen = nqe->ccid_gen;
+	}
+
+	/* skip inside cc until the ccoff in the cqe */
+	while (queue->ccoff + queue->ccoff_inner < ccoff) {
+		remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner;
+		fragsz = min_t(off_t, remaining,
+			       ccoff - (queue->ccoff + queue->ccoff_inner));
+
+		if (fragsz == remaining)
+			mlx5e_nvmeotcp_advance_sgl_iter(queue);
+		else
+			queue->ccoff_inner += fragsz;
+	}
+
+	/* adjust the skb according to the cqe cc */
+	while (to_copy < cclen) {
+		remaining = nqe->sgl[queue->ccsglidx].length - queue->ccoff_inner;
+		fragsz = min_t(int, remaining, cclen - to_copy);
+
+		mlx5e_nvmeotcp_add_skb_frag(netdev, skb, queue, nqe, fragsz);
+		to_copy += fragsz;
+		if (fragsz == remaining)
+			mlx5e_nvmeotcp_advance_sgl_iter(queue);
+		else
+			queue->ccoff_inner += fragsz;
+	}
+
+	if (cqe_bcnt > hlen + cclen) {
+		remaining = cqe_bcnt - hlen - cclen;
+		mlx5_nvmeotcp_add_tail(queue, skb,
+				       offset_in_page(skb->data) +
+				       hlen + cclen, remaining);
+	}
+
+	mlx5e_nvmeotcp_put_queue(queue);
+	return true;
+}
+
+bool
+mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb,
+			      struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{
+	if (skb->data_len)
+		return mlx5e_nvmeotcp_rebuild_rx_skb_nonlinear(rq, skb, cqe, cqe_bcnt);
+	else
+		return mlx5e_nvmeotcp_rebuild_rx_skb_linear(rq, skb, cqe, cqe_bcnt);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h
new file mode 100644
index 000000000000..a8ca8a53bac6
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */
+#ifndef __MLX5E_NVMEOTCP_RXTX_H__
+#define __MLX5E_NVMEOTCP_RXTX_H__
+
+#ifdef CONFIG_MLX5_EN_NVMEOTCP
+
+#include 
+#include "en_accel/nvmeotcp.h"
+
+bool
+mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb,
+			      struct mlx5_cqe64 *cqe, u32 cqe_bcnt);
+
+static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{
+	struct mlx5e_cqe128 *cqe128;
+
+	if (!cqe_is_nvmeotcp_zc(cqe))
+		return cqe_bcnt;
+
+	cqe128 = container_of(cqe, struct mlx5e_cqe128, cqe64);
+	return be16_to_cpu(cqe128->hlen);
+}
+
+#else
+
+static inline bool
+mlx5e_nvmeotcp_rebuild_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb,
+			      struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{ return true; }
+
+static inline int mlx5_nvmeotcp_get_headlen(struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{ return cqe_bcnt; }
+
+#endif /* CONFIG_MLX5_EN_NVMEOTCP */
+#endif /* __MLX5E_NVMEOTCP_RXTX_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 71f73cd5d36e..1a6786448d75 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -53,7 +53,7 @@
 #include "en_accel/macsec.h"
 #include "en_accel/ipsec_rxtx.h"
 #include "en_accel/ktls_txrx.h"
-#include "en_accel/nvmeotcp.h"
+#include "en_accel/nvmeotcp_rxtx.h"
 #include "en/xdp.h"
 #include "en/xsk/rx.h"
 #include "en/health.h"
@@ -1481,7 +1481,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 
 #define MLX5E_CE_BIT_MASK 0x80
 
-static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
+static inline bool mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 				      u32 cqe_bcnt,
 				      struct mlx5e_rq *rq,
 				      struct sk_buff *skb)
@@ -1492,6 +1492,13 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 
 	skb->mac_len = ETH_HLEN;
 
+	if (IS_ENABLED(CONFIG_MLX5_EN_NVMEOTCP) && cqe_is_nvmeotcp(cqe)) {
+		bool ret = mlx5e_nvmeotcp_rebuild_rx_skb(rq, skb, cqe, cqe_bcnt);
+
+		if (unlikely(!ret))
+			return ret;
+	}
+
 	if (unlikely(get_cqe_tls_offload(cqe)))
 		mlx5e_ktls_handle_rx_skb(rq, skb, cqe, &cqe_bcnt);
 
@@ -1537,6 +1544,8 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 
 	if (unlikely(mlx5e_skb_is_multicast(skb)))
 		stats->mcast_packets++;
+
+	return true;
 }
 
 static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq,
@@ -1560,7 +1569,7 @@ static void mlx5e_shampo_complete_rx_cqe(struct mlx5e_rq *rq,
 	}
 }
 
-static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq,
+static inline bool mlx5e_complete_rx_cqe(struct mlx5e_rq *rq,
 					 struct mlx5_cqe64 *cqe,
 					 u32 cqe_bcnt,
 					 struct sk_buff *skb)
@@ -1569,7 +1578,7 @@ static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq,
 
 	stats->packets++;
 	stats->bytes += cqe_bcnt;
-	mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb);
+	return mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb);
 }
 
 static inline
@@ -1810,7 +1819,8 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 		goto free_wqe;
 	}
 
-	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb)))
+		goto free_wqe;
 
 	if (mlx5e_cqe_regb_chain(cqe))
 		if (!mlx5e_tc_update_skb(cqe, skb)) {
@@ -1863,7 +1873,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 		goto free_wqe;
 	}
 
-	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb)))
+		goto free_wqe;
 
 	if (rep->vlan && skb_vlan_tag_present(skb))
 		skb_vlan_pop(skb);
@@ -1914,7 +1925,8 @@ static void mlx5e_handle_rx_cqe_mpwrq_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 	if (!skb)
 		goto mpwrq_cqe_out;
 
-	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb)))
+		goto mpwrq_cqe_out;
 
 	mlx5e_rep_tc_receive(cqe, rq, skb);
 
@@ -1959,13 +1971,18 @@ mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq,
 	}
 }
 
+static inline u16 mlx5e_get_headlen_hint(struct mlx5_cqe64 *cqe, u32 cqe_bcnt)
+{
+	return min_t(u32, MLX5E_RX_MAX_HEAD, mlx5_nvmeotcp_get_headlen(cqe, cqe_bcnt));
+}
+
 static struct sk_buff *
 mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 				   struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset,
 				   u32 page_idx)
 {
 	union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx];
-	u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt);
+	u16 headlen = mlx5e_get_headlen_hint(cqe, cqe_bcnt);
 	u32 frag_offset = head_offset + headlen;
 	u32 byte_cnt = cqe_bcnt - headlen;
 	union mlx5e_alloc_unit *head_au = au;
@@ -2277,7 +2294,8 @@ static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 	if (!skb)
 		goto mpwrq_cqe_out;
 
-	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb)))
+		goto mpwrq_cqe_out;
 
 	if (mlx5e_cqe_regb_chain(cqe))
 		if (!mlx5e_tc_update_skb(cqe, skb)) {
@@ -2614,7 +2632,9 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 	if (!skb)
 		goto free_wqe;
 
-	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+	if (unlikely(!mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb)))
+		goto free_wqe;
+
 	skb_push(skb, ETH_HLEN);
 
 	dl_port = mlx5e_devlink_get_dl_port(priv);
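
Not part of the patch -- a standalone model of the ccoff/ccoff_inner/ccsglidx
bookkeeping used by mlx5e_nvmeotcp_rebuild_rx_skb_linear()/_nonlinear() above:
first skip inside the destination SGL until the CQE's ccoff, then emit one
fragment per SGL entry until cclen bytes are covered. All types and names in
it are simplified stand-ins rather than kernel APIs.

#include <stdio.h>

struct sg_entry { unsigned int length; };	/* stand-in for one SGL element */

static void walk_cc(const struct sg_entry *sgl, int nents,
		    unsigned int ccoff, unsigned int cclen)
{
	unsigned int off = 0, off_inner = 0, to_copy = 0;
	int idx = 0;

	/* skip inside the cc until ccoff, mirroring the first while loop */
	while (off + off_inner < ccoff && idx < nents) {
		unsigned int remaining = sgl[idx].length - off_inner;
		unsigned int fragsz = remaining < ccoff - (off + off_inner) ?
				      remaining : ccoff - (off + off_inner);

		if (fragsz == remaining) {	/* advance to the next SGL entry */
			off += sgl[idx].length;
			off_inner = 0;
			idx++;
		} else {
			off_inner += fragsz;
		}
	}

	/* attach fragments until cclen bytes are covered, mirroring the second loop */
	while (to_copy < cclen && idx < nents) {
		unsigned int remaining = sgl[idx].length - off_inner;
		unsigned int fragsz = remaining < cclen - to_copy ?
				      remaining : cclen - to_copy;

		printf("frag: sgl[%d] +%u, %u bytes\n", idx, off_inner, fragsz);
		to_copy += fragsz;
		if (fragsz == remaining) {
			off += sgl[idx].length;
			off_inner = 0;
			idx++;
		} else {
			off_inner += fragsz;
		}
	}
}

int main(void)
{
	const struct sg_entry sgl[] = { { 4096 }, { 4096 }, { 8192 } };

	/* e.g. a CQE reporting ccoff=6000, cclen=5000 lands in sgl[1] and sgl[2] */
	walk_cc(sgl, 3, 6000, 5000);
	return 0;
}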